wgtmac commented on code in PR #1932:
URL: https://github.com/apache/orc/pull/1932#discussion_r1600989314
##########
c++/src/Writer.cc:
##########
@@ -45,11 +45,11 @@ namespace orc {
std::string timezone;
WriterMetrics* metrics;
bool useTightNumericVector;
- uint64_t outputBufferCapacity;
+ uint64_t memoryBlockSize;
WriterOptionsPrivate() : fileVersion(FileVersion::v_0_12()) { // default to Hive_0_12
stripeSize = 64 * 1024 * 1024; // 64M
- compressionBlockSize = 64 * 1024; // 64K
+ compressionBlockSize = 256 * 1024; // 256K
Review Comment:
We should not change the default behavior unless it is necessary.
##########
c++/src/io/OutputStream.cc:
##########
@@ -98,6 +98,12 @@ namespace orc {
dataBuffer_->resize(0);
}
+ uint64_t BufferedOutputStream::getRawInputBufferSize() const {
+ // we're unable to determine the size of the raw input buffer
+ // simply return 0
+ return 0;
Review Comment:
+1
##########
c++/include/orc/Writer.hh:
##########
@@ -266,17 +266,15 @@ namespace orc {
bool getUseTightNumericVector() const;
/**
- * Set the initial capacity of output buffer in the class BufferedOutputStream.
- * Each column contains one or more BufferOutputStream depending on its type,
- * and these buffers will automatically expand when more memory is required.
+ * Set the initial block size of input buffer in the class CompressionStream.
*/
- WriterOptions& setOutputBufferCapacity(uint64_t capacity);
+ WriterOptions& setMemoryBlockSize(uint64_t capacity);
Review Comment:
We cannot remove or rename any public API. Doing so would immediately break
downstream code as soon as users upgrade to this version. If we think these APIs
are no longer used, we should mark them with the `[[deprecated]]` attribute instead.
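
A minimal sketch of what keeping the old setter could look like, in case it helps. The class below is a simplified stand-in, not the real `orc::WriterOptions`; `getMemoryBlockSize()`, the `memoryBlockSize_` member, its default value, and the choice to forward the old setter to the new one are all assumptions made for illustration.

```cpp
#include <cstdint>

// Simplified stand-in for orc::WriterOptions (hypothetical, for illustration only).
class WriterOptions {
 public:
  // New setter, named as in this PR.
  WriterOptions& setMemoryBlockSize(std::uint64_t capacity) {
    memoryBlockSize_ = capacity;
    return *this;
  }

  // Hypothetical getter, used here only to observe the stored value.
  std::uint64_t getMemoryBlockSize() const {
    return memoryBlockSize_;
  }

  // Old setter kept so existing callers still compile; they get a
  // compile-time deprecation warning instead of a build failure.
  // Forwarding to the new setter is an assumption about the intended semantics.
  [[deprecated("use setMemoryBlockSize() instead")]]
  WriterOptions& setOutputBufferCapacity(std::uint64_t capacity) {
    return setMemoryBlockSize(capacity);
  }

 private:
  std::uint64_t memoryBlockSize_ = 64 * 1024;  // placeholder default for this sketch
};
```

With something along these lines, downstream code that still calls `setOutputBufferCapacity()` keeps building, and the old name can be removed in a later release once users have had time to migrate.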