TGooch44 commented on pull request #2399:
URL: https://github.com/apache/iceberg/pull/2399#issuecomment-812240754


   > A handful of minor comments @TGooch44 , I have also been concerned about 
Java and Python diverging. Been thinking of how to better keep the two in sync 
with so few people on the python impl. Something we need to figure out if we 
want to push this to pypi at some point. Curious to see what you think!
   > 
   > This change itself LGTM, it's hard to tell w/o all the context. What other 
changes are you thinking of after this and what is the broader plan for the 
other N-1 PRs?
   
   The next PR fleshes out the partition transformation/evaluation 
capability and updates some of the evaluators.  From there, I think a few 
months back we discussed a higher-level construct over the parquet reader 
that can take a Table or TableScan, turn it into a set of parquet reads, and 
then unify those into an arrow table, record batches, etc.  I know it can be 
difficult to review some of these broad PRs, so I was trying to break the 
work into more manageable chunks and make sure each one has good test 
coverage, especially where the java side already has the tests.
   
   I think it's going to be challenging to maintain a close coupling with the 
java implementation.  It may be worthwhile, sometime soon, to look at what's 
currently in the python library and how it can be simplified or redesigned so 
that it's less aligned with java and more aligned with a design that works 
well in python. I'd just like to get a functional read path (and maybe at 
least some of the simple append-type write path cases) in before iterating 
too much on the current design.

