Fokko commented on code in PR #3502:
URL: https://github.com/apache/parquet-java/pull/3502#discussion_r3256327637


##########
parquet-column/src/main/java/org/apache/parquet/column/values/dictionary/DictionaryValuesWriter.java:
##########
@@ -121,8 +125,20 @@ protected DictionaryPage dictPage(ValuesWriter dictPageWriter) {
 
   @Override
   public boolean shouldFallBack() {
-    // if the dictionary reaches the max byte size or the values can not be encoded on 4 bytes anymore.
-    return dictionaryByteSize > maxDictionaryByteSize || getDictionarySize() > MAX_DICTIONARY_ENTRIES;
+    return dictionarySizeExceeded;
+  }
+
+  /**
+   * Called by subclass write methods after adding a new dictionary entry to check if the dictionary
+   * has exceeded its size limits. This avoids the per-value virtual dispatch overhead of calling
+   * getDictionarySize() on every write -- the check only runs when a new entry is actually added.
+   *
+   * @param newDictionarySize the current dictionary size after adding the new entry
+   */
+  protected void checkDictionarySizeLimit(int newDictionarySize) {

Review Comment:
   Since `dictionarySizeExceeded` is private, shouldn't this method be private as well?
   
   ```suggestion
     private void checkDictionarySizeLimit(int newDictionarySize) {
   ```
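
For context, here is a minimal sketch of the pattern the Javadoc in the hunk describes: the limit check runs only when a new entry is added, and `shouldFallBack()` just returns a cached flag. Everything except the names visible in the diff (`dictionarySizeExceeded`, `checkDictionarySizeLimit`, `shouldFallBack`, `dictionaryByteSize`) is hypothetical, including the `Sketch*` classes, the constructor parameters, and the UTF-8 byte accounting. It is not the actual parquet-java code.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical base class; names and limits are illustrative, not parquet-java's.
abstract class SketchDictionaryWriter {
  private final int maxDictionaryEntries;   // assumed entry-count limit
  private final int maxDictionaryByteSize;  // assumed byte-size limit
  protected int dictionaryByteSize = 0;     // updated by subclasses on insert
  private boolean dictionarySizeExceeded = false;

  SketchDictionaryWriter(int maxDictionaryEntries, int maxDictionaryByteSize) {
    this.maxDictionaryEntries = maxDictionaryEntries;
    this.maxDictionaryByteSize = maxDictionaryByteSize;
  }

  // Returns the cached flag; no size computation happens here.
  public boolean shouldFallBack() {
    return dictionarySizeExceeded;
  }

  // Called by a subclass only after it has added a new dictionary entry.
  protected void checkDictionarySizeLimit(int newDictionarySize) {
    if (dictionaryByteSize > maxDictionaryByteSize
        || newDictionarySize > maxDictionaryEntries) {
      dictionarySizeExceeded = true;
    }
  }
}

// Hypothetical subclass with a single write path for strings.
class SketchStringDictionaryWriter extends SketchDictionaryWriter {
  private final Map<String, Integer> dictionary = new HashMap<>();

  SketchStringDictionaryWriter(int maxEntries, int maxBytes) {
    super(maxEntries, maxBytes);
  }

  void write(String value) {
    if (!dictionary.containsKey(value)) {
      // New entry: assign the next id and account for its bytes.
      dictionary.put(value, dictionary.size());
      dictionaryByteSize += value.getBytes(StandardCharsets.UTF_8).length;
      checkDictionarySizeLimit(dictionary.size());
    }
    // Repeated values skip the limit check entirely.
  }
}
```

In this sketch the flag stays private while the method is `protected`, because the call site lives in the subclass. Making the method private would only compile if the call moved into the base class.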



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

