Hi Dmitry, Thanks for volunteering to contribute. Note that we are in the process of implementing Decimal256 support already, which is currently on a separate branch [1] but I'm hoping to have PR sometime early next week. If we are proposing adding support for these lower bit-widths, I think we will likely need an implementation in Java (maybe someone else is interested in doing this on that side). For reference [2] is the discussion for adding Decimal 256-bit width
Thanks, Micah [1] https://github.com/apache/arrow/tree/decimal256 [2] https://mail-archives.apache.org/mod_mbox/arrow-dev/202007.mbox/%3CCAK7Z5T9VgsTmjD%3Dmdq1Sci5D0%2BVZPVKWzxCWfHErUdNjak1X9w%40mail.gmail.com%3E On Thu, Oct 8, 2020 at 2:12 PM Wes McKinney <wesmck...@gmail.com> wrote: > Based in what I've seen in other systems (like Apache Kudu) that > support at least 32/64-bit decimal, representing them with a single > integer value is probably the best thing (in terms of computing > performance, consistency with other implementations) > > I added you as a contributor on Jira so you can assign yourself issues now > > On Thu, Oct 8, 2020 at 2:33 PM Chigarev, Dmitry > <dmitry.chiga...@intel.com> wrote: > > > > Hi everyone, I would like to work on this JIRA ticket: > > https://issues.apache.org/jira/browse/ARROW-9404 > > ([C++] Add support for Decimal16, Decimal32 and Decimal64) > > > > This will be my first experience with contributing to Arrow, so I want > to ask advice what approach should I use. > > As far as I know, currently, Arrow supports only Decimal128 and its > basic and primary implementations located at > `cpp/src/arrow/util/basic_decimal.h` and `.../util/decimal.h`. In current > implementation 128-bit Decimal represented by two 64-bit integers. So there > are several approaches that can be applied: > > > > 1. From current BasicDecimal128 class make a template class > BasicDecimal<bit_width>, whose `low` and `high` variable types will be > dependent on `bit_width` template parameter, implementation of methods also > will be rewritten to depend on `bit_width`. As the result, we will have > classes that work with `bit_width / 2`-bit integers. > > The disadvantage of this approach is that even when we can fit our > decimal in a single int value, we will still be splitting it into two > variables and have to do that unnecessary extra logic of handling these two > variables. But all Decimals will be the instances of one template class and > will be consistent with each other. > > > > 2. Implement new template class BasicDecimal<bit_width> and > Decimal<bit_width> which will work only for bit_width <= 64 (where we can > represent our decimal with a single `int##bit_width##_t` variable), and > also reimplement all methods of Decimals in this new class. > > But that approach makes ambiguous what Decimal is, because technically > Decimal64 and Decimal128 will be completely different classes which can > create some inconsistency between them. > > > > 3. If we have some variable that indicates a maximum bit integer, then > we can try to apply the following approach. Define template > BasicDecimal<bit_width> class whose value will be represented not by ints > variables, but by array of ints: > > ``` > > // Pseudo-code, may be incorrect > > template<int width> > > class BasicDecimal{ > > using int_type = IntBitWidthTypes<max(width, MAX_INT_WIDTH)>::type; > > int_type values[max(width / MAX_INT_WIDTH, 1)]; > > // all of these can be computed at the compile time > > .... > > }; > > > > using BasicDecimal128 = BasicDecimal<128>; > > using BasicDecimal64 = BasicDecimal<64>; > > .... > > ``` > > In the result, Decimal128 will have an uint64_t array of 2 elements, > Decimal64 will have an uint64_t array of 1 element, Decimal32 - uint32_t > array of 1 element, and so on... > > This also allows us to define decimals of arbitrary bitness. For > example, Decimal256 will be represented as an array of uint64_t with 4 > elements. > > The bad side of this approach is its complexity - we need to rewrite the > whole BasicDecimal and Decimal class. > > > > Which one of these approaches will be the correct one? > > > > P.S. I've just noticed that I'm not being able to assign JIRA tasks for > myself, how can I do this? > > > > > > -------------------------------------------------------------------- > > Joint Stock Company Intel A/O > > Registered legal address: Krylatsky Hills Business Park, > > 17 Krylatskaya Str., Bldg 4, Moscow 121614, > > Russian Federation > > > > This e-mail and any attachments may contain confidential material for > > the sole use of the intended recipient(s). Any review or distribution > > by others is strictly prohibited. If you are not the intended > > recipient, please contact the sender and delete all copies. >