RussellSpitzer commented on PR #7688:
URL: https://github.com/apache/iceberg/pull/7688#issuecomment-1568940063

   I think this is mentioned above, but it does feel like we are targeting this 
at the wrong place. If we have a minimum parallelism, I think the controls should 
probably be centered on task coalescing. Currently, for files with offsets, 
we always break them into the maximum number of offset tasks before 
recombining. The only real issue is for files without offsets, correct? That's 
the only reason we may want to control the split size, since those files are cut up 
based on that property rather than on actual offsets?
   
   I wonder if it might be clearer to have an "offset" code path that just 
works during recombination, and a special code path for non-offset file types?
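
   To make the distinction concrete, here is a rough sketch of the two cases as I understand them. This is not Iceberg's actual planning code: `split_file` and `combine` are hypothetical names, and the greedy grouping is a simplified stand-in for the real recombination step.

```python
from typing import List, Optional, Tuple

Split = Tuple[int, int]  # (start offset, length in bytes)


def split_file(file_length: int,
               split_offsets: Optional[List[int]],
               split_size: int) -> List[Split]:
    """Hypothetical helper: cut one file into the smallest planned pieces."""
    if split_offsets:
        # Offset path: one piece per recorded offset, ignoring split_size.
        bounds = sorted(split_offsets) + [file_length]
        return [(s, e - s) for s, e in zip(bounds, bounds[1:]) if e > s]
    # Non-offset path: fixed-size pieces, so split_size decides the cut points.
    return [(start, min(split_size, file_length - start))
            for start in range(0, file_length, split_size)]


def combine(splits: List[Split], target_size: int) -> List[List[Split]]:
    """Hypothetical helper: greedily group pieces into tasks up to target_size."""
    tasks: List[List[Split]] = []
    current: List[Split] = []
    current_size = 0
    for split in splits:
        # Close the current task once adding this piece would exceed the target.
        if current and current_size + split[1] > target_size:
            tasks.append(current)
            current, current_size = [], 0
        current.append(split)
        current_size += split[1]
    if current:
        tasks.append(current)
    return tasks
```

   In this sketch the split size only affects the non-offset branch of `split_file`; for files with offsets, the task count is decided entirely by the target passed to `combine`.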


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

