cxzl25 commented on code in PR #2371:
URL: https://github.com/apache/orc/pull/2371#discussion_r2344221223


##########
java/core/src/java/org/apache/orc/OrcConf.java:
##########
@@ -182,6 +187,9 @@ public enum OrcConf {
       "added to all of the writers.  Valid range is [1,10000] and is primarily meant for" +
       "testing.  Setting this too low may negatively affect performance."
         + " Use orc.stripe.row.count instead if the value larger than orc.stripe.row.count."),
+  STRIPE_SIZE_CHECK("orc.stripe.size.check", "hive.exec.orc.default.stripe.size.check",
+      128L * 1024 * 1024,

Review Comment:
   Could you define this threshold relative to the stripe size, for example as a ratio of it?
   
   `orc.stripe.size` * ratio
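
A minimal sketch of what a ratio-based threshold could look like. The class name `StripeSizeCheck`, the method `thresholdBytes`, and the idea of a separate ratio setting are hypothetical and not part of the PR; the convention that a non-positive input disables the check is also an assumption.

```java
/** Hypothetical helper, not part of ORC: derives the size-check threshold from the stripe size. */
final class StripeSizeCheck {
  private StripeSizeCheck() {}

  /**
   * @param stripeSize configured orc.stripe.size, in bytes
   * @param ratio assumed fraction of the stripe size at which the check should trigger
   * @return threshold in bytes, or 0 when either input is non-positive (treated here as "disabled")
   */
  static long thresholdBytes(long stripeSize, double ratio) {
    if (stripeSize <= 0 || ratio <= 0.0) {
      return 0L;
    }
    // Truncates toward zero; precise enough for a coarse threshold.
    return (long) (stripeSize * ratio);
  }
}
```

With a 64 MiB stripe and a ratio of 0.5, this yields a 32 MiB threshold, so the check scales automatically when `orc.stripe.size` is changed.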



##########
java/core/src/java/org/apache/orc/impl/WriterImpl.java:
##########
@@ -325,9 +327,9 @@ public boolean checkMemory(double newScale) throws IOException {
   }
 
   private boolean checkMemory() throws IOException {
-    if (rowsSinceCheck >= ROWS_PER_CHECK) {
+    long size = treeWriter.estimateMemory();

Review Comment:
   I suggest skipping the `estimateMemory` call when neither check condition is met, so that it is called less often.
   ```suggestion
       long size = rowsSinceCheck < ROWS_PER_CHECK && STRIPE_SIZE_CHECK <= 0
           ? 0 : treeWriter.estimateMemory();
   ```
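
The gating in the suggestion above can be sketched in isolation as follows. This is a standalone illustration rather than the `WriterImpl` code: `MemoryCheckGate`, its fields, and the call counter are hypothetical, and a `LongSupplier` stands in for `treeWriter.estimateMemory()`. It assumes that a size-check value of zero or less means the size-based check is disabled.

```java
import java.util.function.LongSupplier;

/** Hypothetical stand-in for the writer's check logic; not the actual WriterImpl. */
final class MemoryCheckGate {
  private final long rowsPerCheck;
  private final long stripeSizeCheck; // assumed: <= 0 disables the size-based check
  private final LongSupplier estimator; // stands in for treeWriter.estimateMemory()
  int estimateCalls = 0; // counts how often the estimator actually ran

  MemoryCheckGate(long rowsPerCheck, long stripeSizeCheck, LongSupplier estimator) {
    this.rowsPerCheck = rowsPerCheck;
    this.stripeSizeCheck = stripeSizeCheck;
    this.estimator = estimator;
  }

  /** Returns 0 without estimating when neither the row-count nor the size-based check applies. */
  long sizeForCheck(long rowsSinceCheck) {
    if (rowsSinceCheck < rowsPerCheck && stripeSizeCheck <= 0) {
      return 0L;
    }
    estimateCalls++;
    return estimator.getAsLong();
  }
}
```

Note that once the size-based check is enabled (a positive value), the estimator still runs on every call in this sketch, so the saving only applies when that check is turned off.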



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@orc.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
