Hello Dev list,
I've been working on a couple of improvements that use Expression Language
(EL) for data analysis, and this has exposed a couple of issues with how EL
types numbers. The improvements are a continuation of the topic I presented
at the Apache NiFi MeetUp group in MD[1].
The primary issue I came across is that EL currently interprets all numbers as
longs[2], which only store whole numbers. This leads to problems when dividing
whole numbers or even just adding/subtracting decimals. I am actually surprised
that a request for decimals hasn't come up before. That being said, after some
initial discussion with Tony and Joe, I believe that there are four potential
ways forward.
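
To make the problem concrete, here is a minimal Java sketch of what happens when
arithmetic is done purely on longs. The class and method names are made up for
illustration and are not taken from the NiFi code base.

```java
public class LongMathDemo {
    // Whole-number division: both operands and the result are longs,
    // so the fractional part of the quotient is discarded.
    static long divideAsLong(long a, long b) {
        return a / b;
    }

    // The same division on doubles keeps the fractional part.
    static double divideAsDouble(double a, double b) {
        return a / b;
    }

    public static void main(String[] args) {
        System.out.println(divideAsLong(5, 2));   // prints 2
        System.out.println(divideAsDouble(5, 2)); // prints 2.5
    }
}
```

A decimal input has the related problem that it cannot be parsed as a long at
all: Long.parseLong("1.5") throws a NumberFormatException.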
1: Create a new EL type "decimal" backed by a double[3] and new methods to
support it ("add_decimal"): This allows the user to explicitly choose whether
they want to use decimal or whole numbers. It retains the simple use-cases
that use whole numbers while opening up the new use-cases using decimals. One
downside is that it is more verbose, since it means adding a new function for
each math operation. This is backwards compatible.
2: Back all numbers by doubles instead of longs: This is easy to implement and
retains very concise methods ("add", "subtract", etc.). A few cons: doubles
have lower precision than longs for large whole numbers (a double represents
whole numbers exactly only up to 2^53)[4], they can lead to rounding
errors[5], and it could be confusing for users who only work with whole
numbers to suddenly find decimals. This is not backwards compatible.
3: Create a new EL type "decimal" backed by a double[3] and attempt to smartly
and automatically cast depending on method/input/output: This would allow for
the positives of having decimals and whole numbers in addition to having
concise methods. The main cons are a much longer implementation time to do it
right and the "shadiness" of doing things automatically for the user. Also,
this would mean the user wouldn't have the option to explicitly override the
automatic casting. This is not backwards compatible.
4: Create a new EL type "decimal" backed by a double[3] and overload the
existing methods with an additional parameter that specifies the return type:
This would allow for the positives of having decimals and whole numbers in
addition to having concise method names. However, it may cause confusion for
less technical users who aren't used to specifying return types. This is
backwards compatible.
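
The precision and rounding concerns listed under option 2 can be seen in a few
lines of plain Java. This is only a sketch to illustrate the behavior of
doubles; the class and helper names are hypothetical.

```java
public class DoublePrecisionDemo {
    // Converts the long to a double and back, and reports whether the
    // value came back unchanged.
    static boolean survivesRoundTrip(long value) {
        return (long) (double) value == value;
    }

    public static void main(String[] args) {
        long exact = 1L << 53;         // 9007199254740992
        long inexact = (1L << 53) + 1; // 9007199254740993

        System.out.println(survivesRoundTrip(exact));   // prints true
        System.out.println(survivesRoundTrip(inexact)); // prints false

        // Classic binary floating point rounding error.
        System.out.println(0.1 + 0.2 == 0.3); // prints false
        System.out.println(0.1 + 0.2);        // prints 0.30000000000000004
    }
}
```

So a whole number above 2^53 that a long stores exactly can silently change
once every number is backed by a double.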
The options that are not backwards compatible would need to wait to be
implemented until 1.0.
The current option I am leaning towards is number 1, due to the explicitness
and greater control it gives the user. While it is more verbose, I think the
decimal vs whole number syntax will be easy for even non-technical users to
pick up. Also, I currently have a PR for it up here[6].
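
As a rough illustration of what option 1 means for the user, here is a sketch
with one whole-number function and one explicit decimal counterpart. This is
not the code from the PR; the class, the method names, and the use of String
arguments are assumptions made purely for the example.

```java
public class ExplicitDecimalSketch {
    // Whole-number addition: operands are parsed as longs, so a decimal
    // argument such as "1.5" is rejected with a NumberFormatException.
    static long add(String a, String b) {
        return Long.parseLong(a) + Long.parseLong(b);
    }

    // Hypothetical decimal counterpart (think "add_decimal"), backed by double.
    static double addDecimal(String a, String b) {
        return Double.parseDouble(a) + Double.parseDouble(b);
    }

    public static void main(String[] args) {
        System.out.println(add("2", "3"));          // prints 5
        System.out.println(addDecimal("1.5", "2")); // prints 3.5
    }
}
```

The existing whole-number behavior is untouched, which is why this approach
stays backwards compatible; the cost is one extra function per math operation.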
Any other ideas or suggestions are welcome!
[1] http://www.meetup.com/ApacheNiFi/events/229158777/
[2] https://docs.oracle.com/javase/7/docs/api/java/lang/Long.html
[3] https://docs.oracle.com/javase/7/docs/api/java/lang/Double.html
[4] https://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html
[5] http://stackoverflow.com/questions/960072/rounding-errors
[6] https://issues.apache.org/jira/browse/NIFI-1662
Joe
- - - - - -
Joseph Percivall
linkedin.com/in/Percivall
e: [email protected]