Hi Jacques,

> Do we have a good definition of what is necessary to add a new data type?
> Adding a type but not pulling it through most of the code seems less than
> ideal since it means one part of Arrow doesn't work with another (providing
> a less optimal end-user experience).

I think what I proposed below is a minimum viable integration plan (and
covers previously discussed requirements for new types). It demonstrates
interop between two reference implementations and allows conversion to/from
idiomatic language analogues. So it covers the basic goal of "arrow interop".

> For example, would this work include making Gandiva and all the kernels
> support this new data type where appropriate?

Not initially. There needs to be a stepping stone to start supporting new
types. I don't think it is feasible to try to land all of this functionality
in one PR. I'll lend a hand at trying to get support for built-in compute
after we get the first part done.

Thanks,
Micah

On Fri, Aug 14, 2020 at 5:08 PM Jacques Nadeau <jacq...@apache.org> wrote:

> Do we have a good definition of what is necessary to add a new data type?
> Adding a type but not pulling it through most of the code seems less than
> ideal since it means one part of Arrow doesn't work with another (providing
> a less optimal end-user experience).
>
> For example, would this work include making Gandiva and all the kernels
> support this new data type where appropriate?
>
> On Wed, Aug 5, 2020 at 12:01 PM Wes McKinney <wesmck...@gmail.com> wrote:
>
> > Sounds fine to me. I guess one question is what needs to be formalized
> > in the Schema.fbs files or elsewhere in the columnar format
> > documentation (and we will need to hold an associated vote for that I
> > think)
> >
> > On Mon, Aug 3, 2020 at 11:30 PM Micah Kornfield <emkornfi...@gmail.com> wrote:
> > >
> > > Given no objections, we'll go ahead and start implementing support for
> > > 256-bit decimals.
> > >
> > > I'm considering setting up another branch to develop all the components
> > > so they can be merged to master atomically.
> > >
> > > Thanks,
> > > Micah
> > >
> > > On Tue, Jul 28, 2020 at 6:39 AM Wes McKinney <wesmck...@gmail.com> wrote:
> > >>
> > >> Generally this sounds fine to me.
> > >> At some point it would be good to add 32-bit and 64-bit decimal
> > >> support but this can be done in the future.
> > >>
> > >> On Tue, Jul 28, 2020 at 7:28 AM Fan Liya <liya.fa...@gmail.com> wrote:
> > >> >
> > >> > Hi Micah,
> > >> >
> > >> > Thanks for opening the discussion.
> > >> > I am aware of some scenarios where decimal requires more than 16 bytes, so
> > >> > I think it would be beneficial to support this in Arrow.
> > >> >
> > >> > Best,
> > >> > Liya Fan
> > >> >
> > >> > On Tue, Jul 28, 2020 at 11:12 AM Micah Kornfield <emkornfi...@gmail.com> wrote:
> > >> >
> > >> > > Hi Arrow Dev,
> > >> > > ZetaSQL (Google's open source standard SQL library) recently introduced a
> > >> > > BigNumeric [1] type which requires a 256 bit width to properly support it.
> > >> > > I'd like to add support (possibly in collaboration with some of my
> > >> > > colleagues) to add support for 256 bit width Decimals in Arrow to support a
> > >> > > type corresponding to BigNumeric.
> > >> > >
> > >> > > In past discussions on this, I don't think we established a minimum bar for
> > >> > > supporting additional bit-widths within Arrow.
> > >> > >
> > >> > > I'd like to propose the following requirements:
> > >> > > 1. A vote agreeing on adding support for a new bitwidth (we can discuss
> > >> > >    any objections here).
> > >> > > 2. Support in Java and C++ for integration tests verifying the ability to
> > >> > >    round-trip the value.
> > >> > > 3. Support in Java for conversion to/from BigDecimal [2]
> > >> > > 4. Support in Python converting to/from Decimal [3]
> > >> > >
> > >> > > Is there anything else that people feel like is a requirement for basic
> > >> > > support of an additional bit width for Decimal's?
> > >> > >
> > >> > > Thanks,
> > >> > > Micah
> > >> > >
> > >> > > [1] https://github.com/google/zetasql/blob/1aefaa7c62fc7a50def879bb7c4225ec6974b7ef/zetasql/public/numeric_value.h#L486
> > >> > > [2] https://docs.oracle.com/javase/7/docs/api/java/math/BigDecimal.html
> > >> > > [3] https://docs.python.org/3/library/decimal.html
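
Requirement 4 in the quoted proposal (converting to/from Python's Decimal) can be illustrated with a rough sketch. It assumes, without confirmation from this thread, that a 256-bit decimal keeps its unscaled value as a 32-byte little-endian two's complement integer with the scale stored separately. The names BYTE_WIDTH, decimal_to_bytes, and bytes_to_decimal are hypothetical and are not part of any Arrow API.

```python
from decimal import Decimal

# Assumption for this sketch: 256 bits = 32 bytes per value.
BYTE_WIDTH = 32


def decimal_to_bytes(value: Decimal, scale: int) -> bytes:
    """Encode `value` as a 32-byte unscaled integer for the given scale.

    Hypothetical helper: the byte layout (little-endian two's complement)
    is an assumption, not something specified in the thread.
    """
    sign, digits, exponent = value.as_tuple()
    if not isinstance(exponent, int):
        raise ValueError("NaN and infinity cannot be encoded")
    coefficient = int("".join(map(str, digits)))
    shift = exponent + scale
    if shift >= 0:
        unscaled = coefficient * 10 ** shift
    else:
        # Dropping digits is only allowed when they are all zero.
        unscaled, remainder = divmod(coefficient, 10 ** -shift)
        if remainder:
            raise ValueError("value has more fractional digits than the scale allows")
    if sign:
        unscaled = -unscaled
    # int.to_bytes raises OverflowError if the value does not fit in 256 bits.
    return unscaled.to_bytes(BYTE_WIDTH, "little", signed=True)


def bytes_to_decimal(raw: bytes, scale: int) -> Decimal:
    """Decode a 32-byte unscaled integer back into a Decimal (exact, no rounding)."""
    unscaled = int.from_bytes(raw, "little", signed=True)
    sign = 1 if unscaled < 0 else 0
    digits = tuple(int(d) for d in str(abs(unscaled)))
    return Decimal((sign, digits, -scale))
```

Integer arithmetic on the tuple form is used here instead of Decimal multiplication so that the result does not depend on the active decimal context's precision, which defaults to 28 digits and would be too small for values that need 256 bits.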