Hi Jacques,

Do we have a good definition of what is necessary to add a new data type?
> Adding a type but not pulling it through most of the code seems less than
> ideal since it means one part of Arrow doesn't work with another (providing
> a less optimal end-user experience).

I think what I proposed below is a minimum viable integration plan (and
covers previously discussed requirements for new types). It demonstrates
interop between two reference implementations and allows conversion to/from
idiomatic language analogues.  So it covers the basic goal of "arrow
interop".


For example, would this work include making Gandiva and all the kernels
> support this new data type where appropriate?

Not initially.  There needs to be a stepping stone to start supporting new
types. I don't think it is feasible to try to land all of this
functionality in one PR.  I'll lend a hand at trying get support for
built-in compute after we get the first part done.

Thanks,
Micah



On Fri, Aug 14, 2020 at 5:08 PM Jacques Nadeau <jacq...@apache.org> wrote:

> Do we have a good definition of what is necessary to add a new data type?
> Adding a type but not pulling it through most of the code seems less than
> ideal since it means one part of Arrow doesn't work with another (providing
> a less optimal end-user experience).
>
> For example, would this work include making Gandiva and all the kernels
> support this new data type where appropriate?
>
> On Wed, Aug 5, 2020 at 12:01 PM Wes McKinney <wesmck...@gmail.com> wrote:
>
> > Sounds fine to me. I guess one question is what needs to be formalized
> > in the Schema.fbs files or elsewhere in the columnar format
> > documentation (and we will need to hold an associated vote for that I
> > think)
> >
> > On Mon, Aug 3, 2020 at 11:30 PM Micah Kornfield <emkornfi...@gmail.com>
> > wrote:
> > >
> > > Given no objections, we'll go ahead and start implementing support for
> > 256-bit decimals.
> > >
> > > I'm considering setting up another branch to develop all the components
> > so they can be merged to master atomically.
> > >
> > > Thanks,
> > > Micah
> > >
> > > On Tue, Jul 28, 2020 at 6:39 AM Wes McKinney <wesmck...@gmail.com>
> > wrote:
> > >>
> > >> Generally this sounds fine to me. At some point it would be good to
> > >> add 32-bit and 64-bit decimal support but this can be done in the
> > >> future.
> > >>
> > >> On Tue, Jul 28, 2020 at 7:28 AM Fan Liya <liya.fa...@gmail.com>
> wrote:
> > >> >
> > >> > Hi Micah,
> > >> >
> > >> > Thanks for opening the discussion.
> > >> > I am aware of some scenarios where decimal requires more than 16
> > bytes, so
> > >> > I think it would be beneficial to support this in Arrow.
> > >> >
> > >> > Best,
> > >> > Liya Fan
> > >> >
> > >> >
> > >> > On Tue, Jul 28, 2020 at 11:12 AM Micah Kornfield <
> > emkornfi...@gmail.com>
> > >> > wrote:
> > >> >
> > >> > > Hi Arrow Dev,
> > >> > > ZetaSQL (Google's open source standard SQL library) recently
> > introduced a
> > >> > > BigNumeric [1] type which requires a 256 bit width to properly
> > support it.
> > >> > > I'd like to add support (possibly in collaboration with some of my
> > >> > > colleagues) to add support for 256 bit width Decimals in Arrow to
> > support a
> > >> > > type corresponding to BigNumeric.
> > >> > >
> > >> > > In past discussions on this, I don't think we established a
> minimum
> > bar for
> > >> > > supporting additional bit-widths within Arrow.
> > >> > >
> > >> > > I'd like to propose the following requirements:
> > >> > > 1.  A vote agreeing on adding support for a new bitwidth (we can
> > discuss
> > >> > > any objections here).
> > >> > > 2.  Support in Java and C++ for integration tests verifying the
> > ability to
> > >> > > round-trip the value.
> > >> > > 3.  Support in Java for conversion to/from BigDecimal [2]
> > >> > > 4.  Support in Python converting to/from Decimal [3]
> > >> > >
> > >> > > Is there anything else that people feel like is a requirement for
> > basic
> > >> > > support of an additional bit width for Decimal's?
> > >> > >
> > >> > > Thanks,
> > >> > > Micah
> > >> > >
> > >> > >
> > >> > > [1]
> > >> > >
> > >> > >
> >
> https://github.com/google/zetasql/blob/1aefaa7c62fc7a50def879bb7c4225ec6974b7ef/zetasql/public/numeric_value.h#L486
> > >> > > [2]
> > https://docs.oracle.com/javase/7/docs/api/java/math/BigDecimal.html
> > >> > > [3] https://docs.python.org/3/library/decimal.html
> > >> > >
> >
>

Reply via email to