I don't understand the limitation to different types, so +1 for generalized unions. That said, I don't think it's high-priority either.
Regards

Antoine.


On 24/05/2019 at 04:17, Micah Kornfield wrote:
> I'd like to bump this thread, to see if anyone has any comments. If nobody
> objects I will try to start implementing the changes next week.
>
> Thanks,
> Micah
>
> On Mon, May 20, 2019 at 9:37 PM Micah Kornfield <emkornfi...@gmail.com>
> wrote:
>
>> In the past [1] there hasn't been agreement on the final requirements for
>> union types.
>>
>> Briefly, the two approaches that are currently advocated:
>> 1. Limit unions to only contain one field of each individual type (e.g.
>> you can't have two separate int32 fields). Java takes this approach.
>> 2. Generalized unions (unions can have any number of fields with the same
>> type). C++ takes this approach.
>>
>> There was a prior PR [2] that stalled in trying to take this approach with
>> Java. For writing vectors it seemed to be slower on a benchmark.
>>
>> My proposal: We should pursue option 2 (the general approach). There are
>> already data interchange formats that support it and it would be nice to
>> have a data model that lets us make the translation between Arrow schemas
>> easy:
>> 1. Avro seems to support it [3] (with the exception of complex types)
>> 2. Protobufs loosely support it [4] via one-of.
>>
>> In order to address issues in [2], I propose making the following
>> changes/additions to the Java implementation:
>> 1. Keep the default write-path untouched with the existing class.
>> 2. Add in a new sparse union class that implements the same interface
>> that can be used on the read path, and if a client opts in (via direct
>> construction).
>> 3. Add in a dense union class (I don't believe Java has one).
>>
>> I'm still ramping up on the Java code base, so I'd like other Java
>> contributors to chime in to see if this plan sounds feasible and acceptable.
>>
>> Any other thoughts on Unions?
>>
>> Thanks,
>> Micah
>>
>> [1]
>> https://lists.apache.org/thread.html/82ec2049fc3c29de232c9c6962aaee9ec022d581cecb6cf0eb6a8f36@%3Cdev.arrow.apache.org%3E
>> [2] https://github.com/apache/arrow/pull/987#issuecomment-493231493
>> [3] https://github.com/apache/arrow/pull/987#issuecomment-493231493
>> [4] https://developers.google.com/protocol-buffers/docs/proto#oneof
>>
>
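
To make option 2 concrete, here is a minimal pure-Python sketch of a dense union whose two child fields share the same type (both hold ints, standing in for int32). It is only an illustration of the idea discussed in the thread: the names `DenseUnion`, `build_dense_union`, `width`, and `height` are hypothetical and are not part of the Arrow C++ or Java APIs. Each slot records which child it belongs to (a type id) and where in that child its value lives (an offset), so two same-typed fields stay distinguishable.

```python
from dataclasses import dataclass
from typing import Any, List, Tuple


@dataclass
class DenseUnion:
    """Toy dense union: one type id and one offset per logical slot."""
    field_names: List[str]      # one name per child field
    children: List[List[Any]]   # one value list per child field
    type_ids: List[int]         # child index for each slot
    offsets: List[int]          # position inside that child for each slot

    def get(self, i: int) -> Tuple[str, Any]:
        """Return (field name, value) for logical slot i."""
        child = self.type_ids[i]
        return self.field_names[child], self.children[child][self.offsets[i]]


def build_dense_union(field_names: List[str],
                      rows: List[Tuple[str, Any]]) -> DenseUnion:
    """Build a dense union from (field name, value) pairs."""
    index = {name: k for k, name in enumerate(field_names)}
    children: List[List[Any]] = [[] for _ in field_names]
    type_ids: List[int] = []
    offsets: List[int] = []
    for name, value in rows:
        k = index[name]
        type_ids.append(k)
        offsets.append(len(children[k]))  # next free position in that child
        children[k].append(value)
    return DenseUnion(list(field_names), children, type_ids, offsets)


# Two fields of the same value type, which option 1 would not allow.
union = build_dense_union(
    ["width", "height"],
    [("width", 3), ("height", 7), ("width", 5)],
)
```

Under option 1 the two int fields could not coexist, because the child would be identified by its type alone; here the type id carries the field identity, which is what makes a translation from a schema with repeated types possible.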