wgtmac commented on code in PR #1275:
URL: https://github.com/apache/orc/pull/1275#discussion_r1004643214


##########
c++/src/io/OutputStream.cc:
##########
@@ -95,9 +91,14 @@ namespace orc {
 
   uint64_t BufferedOutputStream::flush() {
     uint64_t dataSize = dataBuffer->size();
+    // flush data buffer into outputStream
+    if (dataSize > 0)
     {
-      SCOPED_STOPWATCH(metrics, IOBlockingLatencyUs, IOCount);
-      outputStream->write(dataBuffer->data(), dataSize);
+      SCOPED_STOPWATCH(metrics, IOBlockingLatencyUs, nullptr);
+      uint64_t IOCount = dataBuffer->writeTo(outputStream);

Review Comment:
   Returning IOCount from writeTo is odd. It would be better to pass the 
metrics pointer into writeTo as a parameter and let it update the counter.
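
   A minimal sketch of that suggestion, not the actual ORC code: the
`WriterMetrics`, `OutputStream` and `BlockBuffer` types below are simplified
stand-ins whose fields and signatures are assumptions made for illustration.
`writeTo` takes the metrics pointer and bumps `IOCount` itself, so the caller
no longer needs a return value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the writer metrics struct.
struct WriterMetrics {
  uint64_t IOCount = 0;
};

// Hypothetical, simplified stand-in for the output stream: it just
// appends every write to an in-memory string.
struct OutputStream {
  std::string sink;
  void write(const char* data, size_t length) { sink.append(data, length); }
};

// Hypothetical, simplified stand-in for BlockBuffer.
class BlockBuffer {
 public:
  void append(const std::string& block) { blocks_.push_back(block); }

  // Receives the metrics pointer and counts each write itself,
  // instead of returning an IO count to the caller.
  void writeTo(OutputStream* output, WriterMetrics* metrics) const {
    assert(output != nullptr);
    for (const std::string& block : blocks_) {
      output->write(block.data(), block.size());
      if (metrics != nullptr) {  // metrics are optional
        ++metrics->IOCount;
      }
    }
  }

 private:
  std::vector<std::string> blocks_;
};
```

   With this shape, `flush()` would only forward its `metrics` pointer, and a
null pointer simply disables counting.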



##########
c++/src/BlockBuffer.cc:
##########
@@ -82,4 +83,42 @@ namespace orc {
       }
     }
   }
+
+  uint64_t BlockBuffer::writeTo(OutputStream* output) {
+    static uint64_t MAX_CHUNK_SIZE = 1024 * 1024 * 1024;
+    uint64_t chunkSize = std::min(output->getNaturalWriteSize(), MAX_CHUNK_SIZE);
+    if (chunkSize == 0) {
+      throw std::logic_error("Natural write size cannot be zero");
+    }
+    char* chunk = memoryPool.malloc(chunkSize);

Review Comment:
   Yes. However, you can still avoid the allocation when blockNumber == 1 
&& naturalWriteSize >= block.size. BTW, we can keep the current implementation 
for now.
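
   A rough sketch of that fast path, under stated assumptions: `OutputStream`
and `writeBlocks` below are hypothetical stand-ins, blocks are modelled as
`std::string`, and a `std::vector<char>` replaces the memory pool allocation.
A single block that fits within the natural write size is written directly;
otherwise the blocks are coalesced into a staging chunk.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the output stream.
struct OutputStream {
  size_t naturalWriteSize = 0;
  std::string sink;
  uint64_t writeCalls = 0;
  void write(const char* data, size_t length) {
    sink.append(data, length);
    ++writeCalls;
  }
};

// Writes all blocks to the output. Returns true if a staging chunk
// had to be allocated, false if the fast path was taken.
bool writeBlocks(const std::vector<std::string>& blocks, OutputStream* output) {
  assert(output != nullptr && output->naturalWriteSize > 0);

  // Fast path: one block that fits in a single natural write,
  // so it can be written straight from its own storage.
  if (blocks.size() == 1 && output->naturalWriteSize >= blocks[0].size()) {
    output->write(blocks[0].data(), blocks[0].size());
    return false;
  }

  // General path: copy blocks into a chunk and flush it when full.
  std::vector<char> chunk(output->naturalWriteSize);
  size_t used = 0;
  for (const std::string& block : blocks) {
    size_t offset = 0;
    while (offset < block.size()) {
      size_t n = std::min(block.size() - offset, chunk.size() - used);
      std::memcpy(chunk.data() + used, block.data() + offset, n);
      used += n;
      offset += n;
      if (used == chunk.size()) {
        output->write(chunk.data(), used);
        used = 0;
      }
    }
  }
  if (used > 0) {
    output->write(chunk.data(), used);  // flush the partial last chunk
  }
  return true;
}
```

   The fast path saves both the allocation and one copy of the data, at the
cost of an extra branch at the top of the function.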


