This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/orc.git


The following commit(s) were added to refs/heads/main by this push:
     new 21c43267e ORC-2031:Document orc.dictionary.max.size.bytes and 
orc.stripe.size.check.ratio
21c43267e is described below

commit 21c43267e9817eba9adc4b54c043417b7bfdcdec
Author: yongqian <[email protected]>
AuthorDate: Tue Oct 21 09:21:13 2025 -0700

    ORC-2031:Document orc.dictionary.max.size.bytes and 
orc.stripe.size.check.ratio
    
    ### What changes were proposed in this pull request?
    
    Add documentation for two ORC configuration options to core-java-config.md:
    - orc.dictionary.max.size.bytes (default: 16777216)
    - orc.stripe.size.check.ratio (default: 2.0)
    
    ### Why are the changes needed?
    
    These configuration options were defined in OrcConf.java but were missing
    from the official documentation. Users need official guidance on their
    purpose and usage.
    
    ### How was this patch tested?
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    No
    
    Closes #2450 from QianyongY/features/document_orc_conf_2031.
    
    Authored-by: yongqian <[email protected]>
    Signed-off-by: Dongjoon Hyun <[email protected]>
---
 site/_docs/core-java-config.md | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/site/_docs/core-java-config.md b/site/_docs/core-java-config.md
index 067852a47..42bbbd17f 100644
--- a/site/_docs/core-java-config.md
+++ b/site/_docs/core-java-config.md
@@ -165,6 +165,13 @@ permalink: /docs/core-java-config.html
     If the number of distinct keys in a dictionary is greater than this 
fraction of the total number of non-null rows, turn off  dictionary encoding.  
Use 1 to always use dictionary encoding.
   </td>
 </tr>
+<tr>
+  <td><code>orc.dictionary.max.size.bytes</code></td>
+  <td>16777216</td>
+  <td>
+    If the total size of the dictionary is greater than this, turn off 
dictionary encoding. Use 0 to disable this check.
+  </td>
+</tr>
 <tr>
   <td><code>orc.dictionary.early.check</code></td>
   <td>true</td>
@@ -284,6 +291,13 @@ permalink: /docs/core-java-config.html
     How often should MemoryManager check the memory sizes? Measured in rows 
added to all of the writers.  Valid range is [1,10000] and is primarily meant 
for testing.  Setting this too low may negatively affect performance. Use 
orc.stripe.row.count instead if the value is larger than orc.stripe.row.count.
   </td>
 </tr>
+<tr>
+  <td><code>orc.stripe.size.check.ratio</code></td>
+  <td>2.0</td>
+  <td>
+    Flush stripe if the tree writer size in bytes is larger than (this * 
orc.stripe.size). Use 0 to disable this check.
+  </td>
+</tr>
 <tr>
   <td><code>orc.overwrite.output.file</code></td>
   <td>false</td>
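
The two thresholds documented in this commit can be illustrated with a minimal Java sketch. It is not taken from the ORC writer: the class name `OrcThresholdSketch` and both method names are hypothetical, and the 64 MiB value used for `orc.stripe.size` in `main` is an assumption. Only the two defaults (16777216 and 2.0) and the "0 disables the check" rule come from the documentation text above.

```java
// Illustrative only: models the two documented checks, not the ORC implementation.
public class OrcThresholdSketch {
    // Defaults as documented in core-java-config.md by this commit.
    public static final long DEFAULT_DICTIONARY_MAX_SIZE_BYTES = 16777216L;
    public static final double DEFAULT_STRIPE_SIZE_CHECK_RATIO = 2.0;

    // Hypothetical helper: true when dictionary encoding would be turned off
    // because the dictionary exceeds orc.dictionary.max.size.bytes.
    // A limit of 0 disables the check.
    public static boolean dictionaryTooLarge(long dictionaryBytes, long maxSizeBytes) {
        return maxSizeBytes > 0 && dictionaryBytes > maxSizeBytes;
    }

    // Hypothetical helper: true when the stripe would be flushed because the
    // tree writer size exceeds (orc.stripe.size.check.ratio * orc.stripe.size).
    // A ratio of 0 disables the check.
    public static boolean shouldFlushStripe(long treeWriterBytes,
                                            long stripeSizeBytes,
                                            double ratio) {
        return ratio > 0 && treeWriterBytes > ratio * stripeSizeBytes;
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        long stripeSize = 64L * mib; // assumed orc.stripe.size for this example

        // A 20,000,000-byte dictionary against the default 16 MiB limit.
        System.out.println(dictionaryTooLarge(20_000_000L,
                DEFAULT_DICTIONARY_MAX_SIZE_BYTES));
        // The same dictionary with the check disabled.
        System.out.println(dictionaryTooLarge(20_000_000L, 0L));
        // 150 MiB buffered against a 2.0 * 64 MiB = 128 MiB threshold.
        System.out.println(shouldFlushStripe(150L * mib, stripeSize,
                DEFAULT_STRIPE_SIZE_CHECK_RATIO));
        // 100 MiB buffered stays under the same threshold.
        System.out.println(shouldFlushStripe(100L * mib, stripeSize,
                DEFAULT_STRIPE_SIZE_CHECK_RATIO));
    }
}
```

Both helpers treat 0 as "check disabled", matching the wording of the two new table rows. Whether the real writer compares with a strict or non-strict inequality is not stated in the documentation, so the strict `>` here is a guess.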
