Based in what I've seen in other systems (like Apache Kudu) that
support at least 32/64-bit decimal, representing them with a single
integer value is probably the best thing (in terms of computing
performance, consistency with other implementations)

I added you as a contributor on Jira so you can assign yourself issues now

On Thu, Oct 8, 2020 at 2:33 PM Chigarev, Dmitry
<dmitry.chiga...@intel.com> wrote:
>
> Hi everyone, I would like to work on this JIRA ticket:
> https://issues.apache.org/jira/browse/ARROW-9404
> ([C++] Add support for Decimal16, Decimal32 and Decimal64)
>
> This will be my first experience with contributing to Arrow, so I want to ask 
> advice what approach should I use.
> As far as I know, currently, Arrow supports only Decimal128 and its basic and 
> primary implementations located at `cpp/src/arrow/util/basic_decimal.h` and 
> `.../util/decimal.h`. In current implementation 128-bit Decimal represented 
> by two 64-bit integers. So there are several approaches that can be applied:
>
> 1. From current BasicDecimal128 class make a template class 
> BasicDecimal<bit_width>, whose `low` and `high` variable types will be 
> dependent on `bit_width` template parameter, implementation of methods also 
> will be rewritten to depend on `bit_width`. As the result, we will have 
> classes that work with `bit_width / 2`-bit integers.
> The disadvantage of this approach is that even when we can fit our decimal in 
> a single int value, we will still be splitting it into two variables and have 
> to do that unnecessary extra logic of handling these two variables. But all 
> Decimals will be the instances of one template class and will be consistent 
> with each other.
>
> 2. Implement new template class BasicDecimal<bit_width> and 
> Decimal<bit_width> which will work only for bit_width <= 64 (where we can 
> represent our decimal with a single `int##bit_width##_t` variable), and also 
> reimplement all methods of Decimals in this new class.
> But that approach makes ambiguous what Decimal is, because technically 
> Decimal64 and Decimal128 will be completely different classes which can 
> create some inconsistency between them.
>
> 3. If we have some variable that indicates a maximum bit integer, then we can 
> try to apply the following approach. Define template BasicDecimal<bit_width> 
> class whose value will be represented not by ints variables, but by array of 
> ints:
> ```
> // Pseudo-code, may be incorrect
> template<int width>
> class BasicDecimal{
>     using int_type = IntBitWidthTypes<max(width, MAX_INT_WIDTH)>::type;
>     int_type values[max(width / MAX_INT_WIDTH, 1)];
> // all of these can be computed at the compile time
> ....
> };
>
> using BasicDecimal128 = BasicDecimal<128>;
> using BasicDecimal64 = BasicDecimal<64>;
> ....
> ```
> In the result, Decimal128 will have an uint64_t array of 2 elements, 
> Decimal64 will have an uint64_t array of 1 element, Decimal32 - uint32_t 
> array of 1 element, and so on...
> This also allows us to define decimals of arbitrary bitness. For example, 
> Decimal256 will be represented as an array of uint64_t with 4 elements.
> The bad side of this approach is its complexity - we need to rewrite the 
> whole BasicDecimal and Decimal class.
>
> Which one of these approaches will be the correct one?
>
> P.S. I've just noticed that I'm not being able to assign JIRA tasks for 
> myself, how can I do this?
>
>
> --------------------------------------------------------------------
> Joint Stock Company Intel A/O
> Registered legal address: Krylatsky Hills Business Park,
> 17 Krylatskaya Str., Bldg 4, Moscow 121614,
> Russian Federation
>
> This e-mail and any attachments may contain confidential material for
> the sole use of the intended recipient(s). Any review or distribution
> by others is strictly prohibited. If you are not the intended
> recipient, please contact the sender and delete all copies.

Reply via email to