Re: [Discuss] Expression Language Options to Add Decimal Support

Pierre Villard Mon, 25 Apr 2016 07:23:23 -0700

With explanations from Mark, option 3 is fine with me especially as it is
backward compatible.


Pierre

2016-04-25 16:14 GMT+02:00 Joe Percivall <[email protected]>:

> Pierre and Andy,
>
> How do you guys feel about option 3 as Mark has explained it? Keeping in
> mind that it would be backwards compatible. As it stands we have differing
> opinions, which is a good thing, but we need to reach an agreement in order
> to proceed.
>
> Joe
> - - - - - -
> Joseph Percivall
> linkedin.com/in/Percivall
> e: [email protected]
>
>
>
>
> On Thursday, April 21, 2016 11:12 AM, Joe Percivall
> <[email protected]> wrote:
> Mark,
>
>
> After thinking through this a bit more I believe you're right. We can
> assume this logic for every operation:
>
> Two whole numbers as input returns a whole number
> A whole number and a decimal as input returns a decimal
> Two decimals as input returns a Decimal
>
> I believe, one problem with your explanation as written is "...infer the
> return type and allow the user to explicitly change it via toNumber() and
> toDecimal() calls if need be". More specifically you would need to call
> toNumber/toDecimal on the inputs in order to change the return type. You
> can't change the return type after the expression runs (without the
> potential loss of information/precision of converting between the two)
> because once the expression evaluator exits it needs to have already
> interpreted it as a long or double which would occur before the user is
> able to call toNumber/toDecimal. So the user would have to control the
> return type would by calling toNumber/toDecimal on the inputs.
>
> That leads to the one confusing bit about this approach, when a user
> attempts to do an operation on two whole numbers that they would expect to
> return a decimal (like 8/5 = 1.6). In order to get their expected output
> the user would have to know to explicitly convert at least one of them to a
> double by using the toDecimal operation.
>
> That being said it's probably that's worth it to keep down the verbosity
> and still have backwards compatibility.
>
>
> Joe
> - - - - - - Joseph Percivall
> linkedin.com/in/Percivall
> e: [email protected]
>
>
>
>
> On Thursday, April 21, 2016 10:20 AM, Mark Payne <[email protected]>
> wrote:
>
>
>
> Joe,
>
> I am definitely in favor of #3. I think if we perform some sort of binary
> operation on two numbers,
> we should return a Decimal type if either of the operands is a Decimal and
> a Number (Long) type
> if both operands are Longs. We would also need a toDecimal() method that
> would convert a Number
> type ot a Decimal type and we would need to support calling toNumber() on
> a decimal as well.
>
> Given this, though, I think it makes sense to infer the return type and
> allow the user to explicitly
> change it via toNumber() and toDecimal() calls if need be. With the
> addition of these functions, I don't
> think it would be 'shady' to interpret the types at all. We already do
> interpret types in many cases, for
> instance when we have an expression such as ${num1:divide(${num2}) } -- in
> this case, num1 and num2
> are attributes, so they are strings. We already implicitly convert them to
> numbers. Going forward, we should
> convert them to either Number or Decimal type based on the regex that they
> match. I believe this also
> addresses the concern of not being able to override. I don't think
> backward compatibility is an issue here
> either, as we currently would just fail and throw an Exception if
> attempting to divide a string like "2.3"
>
> Number 3 seems like a slam-dunk to me in terms of pros vs cons.
>
> Thanks
> -Mark
>
>
>
> > On Apr 20, 2016, at 10:16 PM, Joe Percivall
> <[email protected]> wrote:
> >
> > Hello Dev list,
> > I've been working on a couple improvements that deal with utilizing
> expression language to do data analysis and this has exposed a couple
> issues with the typing of numbers in Expression Language (EL). The
> improvements are a continuation of the topic I presented on at the Apache
> NiFi MeetUp group in MD[1].
> > The primary issue I came across is that currently in EL all numbers
> interpreted as longs[2], which only store whole numbers. This leads to
> problems when trying to do things like dividing whole numbers or just
> trying to add/subtract decimals. I am actually surprised that the request
> for decimals this hasn't come up before. That being said, after some
> initial discussion with Tony and Joe, I believe that there are four
> potential ways forward.
> > 1: Create a new EL type "decimal" backed by a double[3] and new methods
> to support it ("add_decimal"): This allows the user to explicitly choose
> whether or not they want to use decimal or whole numbers. It retains the
> simple use-cases that use whole numbers while opening up the new use-cases
> using decimals. One down side is that it is more verbose. It means adding a
> new function for each math operation. This is backwards compatible.
> >
> > 2: Back all numbers by doubles instead of longs: The easy to implement
> and retains very concise methods ("add", "subtract", etc..). A few cons,
> doubles have a lower precision than longs[4],  can lead to rounding
> errors[5] and could be confusing for users who only work with whole numbers
> to suddenly find decimals.This is not backwards compatible.
> > 3: Create a new EL type "decimal" back by a double[3] and attempt to
> smartly automatically cast depending on method/input/output: This would
> allow for the positives of having decimals and whole numbers in addition to
> having concise methods. The main cons being a much longer implementation
> time to do it right and the "shadiness" of doing things automatically for
> the user. Also this would mean the user wouldn't have the option to
> explicitly override  This is not backwards compatible.
> > 4: Create a new EL type "decimal" backed by a double[3] and overload the
> existing methods with an additional parameter to specify return type to
> support it: This would allow for the positives of having decimals and whole
> numbers in addition to having concise method names but this may cause
> confusion with less technical users who aren't used to specifying return
> types. This is backwards compatible.
> >
> > The options that are not backwards compatible would need to wait to be
> implemented until 1.0.
> > The current option I am leaning towards is number 1 due to the
> explicitness and greater control it gives the user. While it is more
> verbose I think the decimal vs whole number syntax will be easy for even
> non-technical users to pick up. Also I currently have a PR for it up
> here[6].
> > Any other ideas or suggestions are welcome!
> > [1] http://www.meetup.com/ApacheNiFi/events/229158777/
> > [2] https://docs.oracle.com/javase/7/docs/api/java/lang/Long.html
> > [3] https://docs.oracle.com/javase/7/docs/api/java/lang/Double.html
> > [4]
> https://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html
> > [5] http://stackoverflow.com/questions/960072/rounding-errors
> > [6] https://issues.apache.org/jira/browse/NIFI-1662
> > Joe- - - - - - Joseph Percivalllinkedin.com/in/Percivalle:
> [email protected]
>

Re: [Discuss] Expression Language Options to Add Decimal Support

Reply via email to