InvisibleProgrammer commented on code in PR #4738:
URL: https://github.com/apache/hive/pull/4738#discussion_r1335442468


##########
ql/src/java/org/apache/hadoop/hive/ql/exec/tez/ColumnarSplitSizeEstimator.java:
##########
@@ -35,6 +35,9 @@ public class ColumnarSplitSizeEstimator implements SplitSizeEstimator {
   @Override
   public long getEstimatedSize(InputSplit inputSplit) throws IOException {
     long colProjSize = inputSplit.getLength();
+    if (colProjSize == 0) {

Review Comment:
   The original code has two paths that overwrite the value returned by 
`inputSplit.getLength()`, and the new check skips both of them. It also leaves 
open what happens when the columnar projection size or the inner split is 
0 bytes. The root cause of the issue is the check at the end of this method:
   ```java
       if (colProjSize <= 0) {
         /* columnar splits of unknown size - estimate worst-case */
         return Integer.MAX_VALUE;
       }
   ```
   
   What about changing `colProjSize <= 0` to `colProjSize < 0`? That would keep 
the original logic and still fix the `Integer.MAX_VALUE` issue.
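
   To illustrate the suggestion, here is a minimal, standalone sketch of how the 
final check would behave with `< 0`. The class `SplitSizeSketch` and the method 
`estimate` are hypothetical stand-ins, not Hive code, and the sketch assumes the 
method simply returns `colProjSize` after the check:

```java
public class SplitSizeSketch {

    /**
     * Hypothetical stand-in for the tail of getEstimatedSize:
     * only a negative size is treated as "unknown".
     */
    static long estimate(long colProjSize) {
        if (colProjSize < 0) {
            /* columnar splits of unknown size - estimate worst-case */
            return Integer.MAX_VALUE;
        }
        // Assumption: a size of 0 is a legitimate value (an empty split),
        // so it is returned as-is instead of being inflated.
        return colProjSize;
    }

    public static void main(String[] args) {
        System.out.println(estimate(-1));   // unknown size -> worst case
        System.out.println(estimate(0));    // empty split stays 0
        System.out.println(estimate(1024)); // known size is passed through
    }
}
```

   With this shape, an empty split is no longer reported as 
`Integer.MAX_VALUE`, and the two overwrite paths earlier in the method still run.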



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

