emkornfield commented on issue #12618:
URL: https://github.com/apache/arrow/issues/12618#issuecomment-1070329458


   Looking through this at a high level (I think I might have already mentioned 
some of this on the mailing list), here are a few comments:
   0.  I think having easy conversion from map-based rows to a 
VectorSchemaRoot is valuable.  Would the intention be to have a mapping for all 
Arrow data types from a Java object?  I think some of the existing getObject 
calls don't return the optimal types; would the intention be to follow those 
mappings when possible?
   1.  I'm hesitant to create a class named Dataframe in the project just for 
easy conversion back and forth between tuples.  I think DataFrames come with a 
lot of expectations, and in particular the canonical memory representation here 
seems to be row-based on-heap objects.  I would expect an implementation to use 
a columnar representation (and at least use the concept of Vectors for columns 
even if VectorSchemaRoot isn't used).
   2.  I started a mailing list discussion on minimum Java version, but I 
believe we should be targeting at most JDK 11 for the time being.
   3.  For conversion from strings you need to pass the encoding explicitly 
(UTF-8) to avoid brittleness in conversion.
   4.  I think it is worth trying to implement this following the pattern of 
[Loader](https://arrow.apache.org/docs/java/reference/org/apache/arrow/vector/VectorLoader.html)
 and 
[Unloader](https://arrow.apache.org/docs/java/reference/org/apache/arrow/vector/VectorUnloader.html),
maybe with new interfaces like VectorRowLoader and VectorRowUnloader.  If the 
goal is to interface well with Flight, I think this might be the most ergonomic.
   5.  This probably belongs in a new contrib module, but I think this would 
lower the barrier to entry, so if you are willing to contribute something I'd 
be willing to help review.
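
To illustrate the point about string conversion: a minimal sketch, using only the JDK, of encoding and decoding with an explicit charset so the result does not depend on the platform default. The class and method names are made up for illustration and are not part of Arrow.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only (not an Arrow class).
final class Utf8RoundTrip {
    // Encode with an explicit charset; String.getBytes() without an argument
    // uses the JVM default charset, which can differ between machines.
    static byte[] encode(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Decode with the same explicit charset used for encoding.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file itself encoding-independent.
        String original = "na\u00efve caf\u00e9";
        byte[] utf8 = encode(original);
        System.out.println("bytes: " + utf8.length);
        System.out.println("round trip ok: " + decode(utf8).equals(original));
    }
}
```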
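
As a rough sketch of what a row-oriented Loader/Unloader pair could look like, here is a JDK-only toy version. Everything in it is an assumption: the interface names only echo the VectorRowLoader/VectorRowUnloader idea above, plain Java lists stand in for Arrow vectors, and no real Arrow API is used. It only shows the shape of the idea, which is that rows come in as maps, the data is held column by column, and rows are rebuilt on the way out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: java.util lists stand in for Arrow vectors.
final class RowLoaderSketch {
    /** Hypothetical row-based counterpart to a loader. */
    interface RowLoader {
        void load(List<Map<String, Object>> rows);
    }

    /** Hypothetical row-based counterpart to an unloader. */
    interface RowUnloader {
        List<Map<String, Object>> unload();
    }

    /** Holds the data column by column, keyed by field name. */
    static final class ColumnStore implements RowLoader, RowUnloader {
        private final Map<String, List<Object>> columns = new LinkedHashMap<>();
        private int rowCount = 0;

        ColumnStore(List<String> fieldNames) {
            for (String name : fieldNames) {
                columns.put(name, new ArrayList<>());
            }
        }

        @Override
        public void load(List<Map<String, Object>> rows) {
            for (Map<String, Object> row : rows) {
                for (Map.Entry<String, List<Object>> col : columns.entrySet()) {
                    // A key missing from the row is stored as null.
                    col.getValue().add(row.get(col.getKey()));
                }
                rowCount++;
            }
        }

        @Override
        public List<Map<String, Object>> unload() {
            List<Map<String, Object>> rows = new ArrayList<>(rowCount);
            for (int i = 0; i < rowCount; i++) {
                Map<String, Object> row = new LinkedHashMap<>();
                for (Map.Entry<String, List<Object>> col : columns.entrySet()) {
                    row.put(col.getKey(), col.getValue().get(i));
                }
                rows.add(row);
            }
            return rows;
        }

        List<Object> column(String name) {
            return columns.get(name);
        }

        int rowCount() {
            return rowCount;
        }
    }
}
```

In a real implementation the column lists would be replaced by typed vectors, and the per-type mapping from Java objects would be the main design question.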
   
   

