sachouche opened a new pull request #1713: Fixed IllegalStateException while 
reading Parquet data
URL: https://github.com/apache/drill/pull/1713
 
 
   **Problem Description -**
   
   - The reader sampled a varchar column and concluded that its values were of fixed length
   - It processed a couple of internal batches (4k max) without problems
   - Somehow, in the last batch of the row-group, the column precision became lower
   - The size of the next record batch was computed from the amount of data loaded in the byte buffer and the expected precision: "loaded-data / expected-precision"
   - Hence the chunk-size became zero once the loaded data changed precision
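
   The arithmetic behind the zero chunk-size can be sketched as follows. This is only an illustration, not the Drill reader code: the class name, the `chunkSize` helper, and the byte counts are hypothetical, and it assumes the quotient is an integer division.

```java
// Hypothetical sketch, not the actual Drill reader code.
class ChunkSizeProblemSketch {

    // Mirrors "loaded-data / expected-precision" from the description.
    // Integer division truncates, so fewer loaded bytes than one
    // expected-length value yields a chunk size of zero.
    static int chunkSize(int loadedBytes, int expectedPrecision) {
        return loadedBytes / expectedPrecision;
    }

    public static void main(String[] args) {
        int expectedPrecision = 8; // assumed bytes per value, inferred from sampling

        // Earlier batches: plenty of data loaded, so the chunk size is positive.
        System.out.println(chunkSize(4096, expectedPrecision)); // prints 512

        // Last batch: only a few shorter values remain in the buffer.
        System.out.println(chunkSize(5, expectedPrecision)); // prints 0
    }
}
```

   With a chunk size of zero the reader has no values to process, which is consistent with the failure described above.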
   
   **Fix -**
   - The logic that computes the next record batch only needs to ensure that we do not read beyond the byte-buffer (to handle changes in precision for the last few values)
   - Modified the code to use: "buffer-capacity / expected-precision"
   - This way the code gets a chance to discover the change in precision (it already has logic to detect this)
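
   A minimal sketch of the revised computation, under the same caveats: the class and method names are hypothetical, and the length check is a simplified stand-in for the existing detection logic rather than the code in this pull request.

```java
// Hypothetical sketch, not the code changed in this pull request.
class ChunkSizeFixSketch {

    // Mirrors "buffer-capacity / expected-precision": an upper bound on the
    // number of values, chosen so the reader never goes past the buffer.
    static int maxChunkSize(int bufferCapacity, int expectedPrecision) {
        return bufferCapacity / expectedPrecision;
    }

    // Simplified stand-in for the existing precision check: counts the leading
    // values that still have the expected length, stopping at the first one
    // that does not (or when the limit is reached).
    static int countFixedLengthValues(int[] valueLengths, int expectedPrecision, int limit) {
        int count = 0;
        while (count < valueLengths.length
                && count < limit
                && valueLengths[count] == expectedPrecision) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        int expectedPrecision = 8; // assumed bytes per value
        int limit = maxChunkSize(4096, expectedPrecision);

        // The third value is shorter, so the scan stops after two values.
        int[] lengths = {8, 8, 5};
        System.out.println(countFixedLengthValues(lengths, expectedPrecision, limit));
    }
}
```

   Because the bound no longer depends on how much data happens to be loaded, it stays positive for the last batch, and the per-value check is what notices the shorter values.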
