from:"Eric Barnhill"

Re: [numbers] Fraction

2020-04-10 Thread Eric Barnhill

Great +1

On Thu, Apr 9, 2020 at 3:59 PM Gilles Sadowski  wrote:

> On Thu, 9 Apr 2020 at 23:20, Alex Herbert wrote:
> >
> >
> >
> > > On 9 Apr 2020, at 21:36, Gilles Sadowski  wrote:
> > >
> > > On Thu, 9 Apr 2020 at 22:20, Alex Herbert wrote:
> > >>
> > >>
> > >>
> > >>> On 9 Apr 2020, at 16:32, Gilles Sadowski 
> wrote:
> > >>>
> > >>>
> > >>>
> >  Given this I am thinking that using ZERO when possible is a better
> >  option and avoid 0 / -1.
> > >>>
> > >>> Hmm, then I'm both +0 and -0 (which is the same, right?)
> > >>> on this issue. ;-)
> > >>
> > >> Ironically the conversion to a double is a minor bug:
> > >>
> > >> Fraction.of(0, 1).doubleValue() == 0.0
> > >> Fraction.of(0, -1).doubleValue() == -0.0
> > >>
> > >> IEEE754 arithmetic for 0.0 / -1.0 creates a -0.0.
> > >>
> > >> Do we want to support -0.0?
> > >
> > > Why prevent it since it looks expected from the above call?
> >
> > Well, the argument against is that -0.0 is an artefact of the IEEE
> > floating point format. It is not a real number.
> >
> > If we allow 0 / -1 as a fraction to mean something then we should really
> > support it fully which means carrying the sign of the denominator through
> > arithmetic as would be done for -0.0 (from the top of my head):
> >
> > -0.0 + -0.0 = -0.0
> > -0.0 + 0.0 = 0.0
> > 0.0 - -0.0 = 0.0
> > 0.0 - 0.0 = 0.0
> > 0.0 * 42 = 0.0
> > -0.0 * 42 = -0.0
> >
> > And so on...
> >
> > It is easier to exclude this representation from ever existing by
> > changing the factory constructor to not allow it.
> >
> > Note that Fraction.of(-0.0) creates 0 / 1. So the support for 0 / -1 is
> > inconsistent with conversion to and from double:
> >
> > Fraction.of(-0.0).doubleValue() == 0.0
> > Fraction.of(0, -1).doubleValue() == -0.0
> >
> > I have checked and Fraction.of(0, 1).compareTo(Fraction.of(0, -1)) is 0.
> > They evaluate to equal and have the same hash code. So this behaviour is
> > different from Double.compareTo, Double.equals and Double.hashCode which
> > distinguish the two values.
> >
> > If fraction represented a signed number using the signed numerator and
> > an unsigned denominator, reduced to smallest form, then the representation
> > of zero is fixed. It would be 0 / 1 as you cannot have -0 as an integer.
>
> This seems to be the winning argument to transform all zero to
> canonical form.
>
> > This issue has been created by the support for the sign in either part
> > so that Integer.MIN_VALUE can be used as a denominator. This is a nice
> > change to allow support for fractions up to 2^-31. But it creates this
> > signed zero issue.
> >
> > It leads me to think we should have a canonical representation of zero
> > as 0 / 1 and prevent creation of 0 / -1 by careful management of class
> > creation.
>
> +1
>
> Best,
> Gilles
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>
>
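A minimal sketch of the canonical-zero idea discussed above. SimpleFraction is a hypothetical class made up for illustration; it is not the Commons Numbers Fraction API, and it skips reduction to lowest terms. It assumes the factory is the only way to create an instance, so 0 / -1 can never exist and the conversion to double cannot produce -0.0.

```java
/** Hypothetical minimal fraction type for illustration; not the Commons Numbers class. */
final class SimpleFraction {
    final int numerator;
    final int denominator;

    private SimpleFraction(int numerator, int denominator) {
        this.numerator = numerator;
        this.denominator = denominator;
    }

    /** Factory that maps every zero (0 / d with d != 0) to the single form 0 / 1. */
    static SimpleFraction of(int numerator, int denominator) {
        if (denominator == 0) {
            throw new ArithmeticException("denominator must be non-zero");
        }
        if (numerator == 0) {
            return new SimpleFraction(0, 1);
        }
        // Reduction to lowest terms is omitted to keep the sketch short.
        return new SimpleFraction(numerator, denominator);
    }

    /** IEEE 754 division: without the zero check above, 0 / -1 would give -0.0. */
    double doubleValue() {
        return (double) numerator / (double) denominator;
    }

    /** True only for the IEEE 754 negative zero bit pattern. */
    static boolean isNegativeZero(double x) {
        return Double.doubleToRawLongBits(x) == Long.MIN_VALUE;
    }
}
```

With this factory, SimpleFraction.of(0, -1) and SimpleFraction.of(0, 1) hold the same state, so equality, hash code and double conversion agree without any signed-zero bookkeeping in the arithmetic.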

Re: [numbers] Continued Fraction

2020-04-07 Thread Eric Barnhill

> I think that we then change it so it matches the source paper. It makes it
> a lot easier to follow the adaption from the paper (which is only about 8
> lines of pseudocode) if the variables have the same name.
>

+1 Thanks for catching this interesting issue.

Re: [numbers] Complex vs. reference current performance

2019-12-10 Thread Eric Barnhill

Thank you for this great work. To the extent I implemented some of the
missing trig functions in Complex, I followed the implementations in
Complex.js , which seemed well worked out and supported. However I am sure
Boost would be much better engineered still. So +1

On Tue, Dec 10, 2019 at 7:42 AM Alex Herbert 
wrote:

> The latest round of fixes on Complex have made it fully C99 compliant
> for a range of -5 to +5 in the real and imaginary components. I have yet
> to finish the testing of overflow/underflow conditions. Here I report on
> the standard range data.
>
> I have updated the test of Complex to load test data from file
> resources. These files have been produced using GNU gcc. But the data
> can be swapped for another reference source.
>
> Currently the test is using very strict tolerances specified in units of
> least precision (ULPs). Dropping in another file using low precision
> data (not the full 17 fractional digits of a double) would break the
> test. So this may have to be revised in the future if other references
> are to be used.
>
> Measuring the difference between Complex and the reference data with the
> units of least precision (ULP) delta between them shows:
>
> function  mean +/-    sd  (n=count)  [min, max]   Median=x
>
> acos      3.94 +/-  5.03  (n=484)  [  0,  36]   Median=  2
> acosh     3.94 +/-  5.03  (n=484)  [  0,  36]   Median=  2
> asinh     0.05 +/-  0.23  (n=242)  [  0,   1]   Median=  0
> atanh     0.84 +/-  2.14  (n=242)  [  0,  17]   Median=  0
> cosh      0.09 +/-  0.31  (n=242)  [  0,   2]   Median=  0
> sinh      0.08 +/-  0.30  (n=242)  [  0,   2]   Median=  0
> tanh      0.82 +/-  2.30  (n=242)  [  0,  34]   Median=  0
> exp       0.06 +/-  0.31  (n=484)  [  0,   2]   Median=  0
> log       0.10 +/-  0.37  (n=484)  [  0,   3]   Median=  0
> sqrt      0.02 +/-  0.16  (n=484)  [  0,   1]   Median=  0
> multiply  0.00 +/-  0.00  (n= 32)  [  0,   0]   Median=  0
> divide    1.03 +/-  1.75  (n= 32)  [  0,   7]   Median=  1
> pow       1.12 +/-  2.61  (n= 32)  [  0,   9]   Median=  0
>
> If we only include those with a delta above 1 ULP (i.e. so that there is
> at least one number between the two):
>
> acos      6.54 +/-  5.38  (n=274)  [  2,  36]   Median=  5
> acosh     6.54 +/-  5.38  (n=274)  [  2,  36]   Median=  5
> asinh      NaN +/-   NaN  (n=  0)  [NaN, NaN]   Median=NaN
> atanh     4.56 +/-  3.96  (n= 34)  [  2,  17]   Median=  3
> cosh      2.00 +/-  0.00  (n=  2)  [  2,   2]   Median=  2
> sinh      2.00 +/-  0.00  (n=  2)  [  2,   2]   Median=  2
> tanh      3.22 +/-  5.30  (n= 36)  [  2,  34]   Median=  2
> exp       2.00 +/-  0.00  (n=  9)  [  2,   2]   Median=  2
> log       3.00 +/-  0.00  (n=  4)  [  3,   3]   Median=  3
> sqrt       NaN +/-   NaN  (n=  0)  [NaN, NaN]   Median=NaN
> multiply   NaN +/-   NaN  (n=  0)  [NaN, NaN]   Median=NaN
> divide    5.00 +/-  2.31  (n=  4)  [  3,   7]   Median=  5
> pow       5.83 +/-  3.06  (n=  6)  [  2,   9]   Median=  7
>
> This mainly shows a count of real differences ignoring floating-point
> round-off.
>
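For readers unfamiliar with the metric, here is one way the ULP delta between a reference value and a computed value might be measured. This is a sketch under assumptions: the class and method names are made up, it is not the code used in the Complex tests, and it assumes finite inputs of similar magnitude (NaN is not handled, and the subtraction could overflow for values at opposite extremes of the double range).

```java
/** Hypothetical helper: distance between two doubles in units of least precision. */
final class UlpDelta {
    /**
     * Maps the raw bits to a long that increases monotonically with the double
     * value, so that -0.0 and 0.0 both map to 0.
     */
    static long orderedBits(double x) {
        long bits = Double.doubleToLongBits(x);
        // Negative doubles have the sign bit set; flip them onto the negative longs.
        return bits < 0 ? Long.MIN_VALUE - bits : bits;
    }

    /** Number of representable doubles separating the two values. */
    static long ulpDelta(double expected, double actual) {
        return Math.abs(orderedBits(expected) - orderedBits(actual));
    }
}
```

A delta of 1 means the two values are adjacent doubles; a delta above 1 means at least one representable number lies between them, which is the filter applied to the second table.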
> These show that Complex is doing quite well for all but:
>
> acos
> acosh
> atanh
> tanh
>
> acosh is implemented using a trigonometric identity with acos and the two
> are equally bad. Fixing acos would fix this too.
>
> I am not sure what can be done for tanh. This uses the formula:
>
> tanh(a + b i) = sinh(2a)/(cosh(2a)+cos(2b)) + i [sin(2b)/(cosh(2a)+cos(2b))]
>
> It is implemented entirely using the Math library. A histogram of the
> failures show that there is mainly one anomaly with a ULP delta of 34:
>
> tanh 2 23
> tanh 3 12
> tanh 34 1
>
> For now this can be left for further investigation. More data may show
> that this error is an outlier.
>
> The data is: (0.0,1.5).tanh(). Perhaps there is a trigonometric identity to
> use when the real (or imaginary) component is zero.
>
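One candidate identity, offered as a sketch and not as what Complex or Boost actually does: when the real part is zero, sinh(0) = 0 and cosh(0) = 1, so the formula reduces to tanh(i b) = i sin(2b)/(1 + cos(2b)) = i tan(b). The class below is hypothetical and returns the result as a {real, imaginary} pair. Whether the shortcut removes the 34 ULP outlier would still need to be checked against the reference data.

```java
/** Hypothetical sketch of tanh(a + b i); results are {real, imaginary}. */
final class TanhSketch {
    /** Direct evaluation of the quoted formula using only java.lang.Math. */
    static double[] tanhFormula(double a, double b) {
        double d = Math.cosh(2 * a) + Math.cos(2 * b);
        return new double[] {Math.sinh(2 * a) / d, Math.sin(2 * b) / d};
    }

    /** Same formula, with a shortcut when the real part is zero. */
    static double[] tanh(double a, double b) {
        if (a == 0) {
            // tanh(i b) = i tan(b); this avoids the cancellation in 1 + cos(2b).
            return new double[] {a, Math.tan(b)};
        }
        return tanhFormula(a, b);
    }
}
```

For (0.0, 1.5) the denominator 1 + cos(3) is close to 0.01, so the general formula loses a couple of digits to cancellation there, while Math.tan(1.5) does not go through that subtraction.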
> So how to fix acos and atanh?
>
> atanh uses divide and log on a Complex.
>
> acos uses multiply, sqrt and log on a Complex.
>
> Thus the differences between Complex and the standard are in those
> methods that are using the Complex object to perform part of the
> computation.
>
> I note that asin uses asinh which uses multiply, sqrt and log on a
> Complex and this does not perform badly. The method is very similar to
> acos. This may be due to the range of the test data or the
> implementation using only positive component parts to preserve the
> conjugate and odd function equalities.
>
> The C++ boost library has implementations for asin, acos and atanh.
> These have overflow and underflow protection and an efficient
> computation in a 'normal' value range. I will investigate using those
> methods to see if they make a difference to the error.
>
> I would like to get this error down to under 1 ULP on average with a
> lower maximum.
>
> Then I will look at the boundary cases for finite numbers which have
> overflow or underflow during parts of the

[general] Phishing emails mentioning Apache coming in after pull request submitted

2019-12-05 Thread Eric Barnhill

Some unsavory types are watching Apache activity. I submitted my first
Apache PR in a long time yesterday, and by morning I had two phishing emails
mentioning Apache in both subject and body at the relevant email address.

I hope the community is aware of this issue? I don't recall this happening
even this summer.

I've still got them if anyone at the foundation is keeping track of this.

Be careful out there,
Eric

Re: [Numbers] Towards a release?

2019-12-04 Thread Eric Barnhill

NUMBERS-136 was a pretty simple fix, I don't think it will interfere with
anything, so I submitted a PR anyway. Close it if you prefer but I think it
will be easy to integrate.

Gilles, I saw no way to request you as reviewer, so only requested Alex.

On Tue, Dec 3, 2019 at 3:37 PM Alex Herbert 
wrote:

> On Tue, 3 Dec 2019, 17:58 Gilles Sadowski,  wrote:
>
> > Hello.
> >
> > On Tue, 3 Dec 2019 at 18:33, Eric Barnhill wrote:
> > >
> > > It seems like we're pretty close.
> > >
> > > I can take a look at 136, 137 related to log. I have had trouble
> finding
> > > the space to launch the regression project. But I could work on some
> > > smaller things.
> >
> > Great. ;-)
> >
>
> Hi Eric,
>
> I have another bunch of changes to complex to push. I'll do this tomorrow
> and then I think the c99 standard is almost done. There are a few
> outstanding items that I was investigating today. I'll provide a status
> update with a bit more time tomorrow. I don't think any work on the log
> functions will clash with what I have done but it would be simpler if I
> push first as I have tweaked a lot of the c99 functions and it may affect
> the log function.
>
> Alex
>
>
>
> > >
> > > Regarding 70, the user guide, what do you think of submitting an
> > > application to Google Season of Docs?
> >
> > It would be nice, but probably low(er) priority; in effect, having split
> > off
> > several components out of Commons Math has the nice (IMO) side-effect
> > that any of them is easier to grasp, and figuring out what it does and
> how
> > to use it is in general relatively straightforward.
> > Moreover, most developers looking for such tools don't have to be told
> what
> > a complex number is, and (maven) modules make it easy to navigate the
> > code base.
> >
> > > I can initiate if you have had quite
> > > enough of that sort of thing.
> >
> > Indeed, delegating documentation tasks was often more work than it would
> > have been doing it! :-}
> >
> > However, a user guide for "Commons Geometry" is on the TODO list.[1]
> > You should then coordinate with Matt.
> >
> > Thanks,
> > Gilles
> >
> > [1] https://issues.apache.org/jira/browse/GEOMETRY-73
> >
> >
> > >
> > > Eric
> > >
> > >
> > > On Tue, Dec 3, 2019 at 7:41 AM Gilles Sadowski 
> > wrote:
> > >
> > > > Hello.
> > > >
> > > > What do you think of releasing "Commons Numbers"?
> > > > Please have a look at the list of pending issues.[1]
> > > >
> > > > Gilles
> > > >
> > > > [1]
> > > >
> >
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20NUMBERS%20AND%20fixVersion%20%3D%201.0%20AND%20statusCategory%20%3D%20new
> >
> >
> >
>

Re: [Statistics] New component for (standard) distributions?

2019-12-03 Thread Eric Barnhill

Sorry I misspoke. Anyway I quite agree, it would work well as
commons-distribution. +1

On Tue, Dec 3, 2019 at 9:29 AM Gilles Sadowski  wrote:

> Hi.
>
> On Tue, 3 Dec 2019 at 18:23, Eric Barnhill wrote:
> >
> > I agree, distributions seems stable and well supported.
> >
> > You are proposing releasing it outside of numbers?
>
> Code is currently in module "commons-statistics-distribution", within
> the [Statistics] component (actually: the sole non-empty module!).
> The proposal is to have a standalone "Commons Distribution" component
> (that will depend on "Commons Numbers" and "Commons RNG").
>
> Gilles
>
> >
> > I think it's a good idea. +1
> >
> > On Tue, Dec 3, 2019 at 8:00 AM Gilles Sadowski 
> wrote:
> >
> > > Hello.
> > >
> > > Most functionality of the "o.a.c.math4.distribution" package was
> migrated
> > > from Commons Math almost 2 years ago.
> > > The [Statistics] component should also host a refactoring of CM's
> "stat"
> > > package but development has stalled.  Obviously, it is unlikely that
> we can
> > > perform this task in the short term, while the design of the
> "distribution"
> > > module looks fairly stable (and it had already been refactored within
> CM).
> > > IMO, it should belong to its own maven project so it can be released
> > > without
> > > being encumbered for months (or years?) by the instability of the rest
> of
> > > the port...
> > >
> > > WDYT?
> > >
> > > Gilles
> > >
> > >
> > >
>
>
>

Re: [Numbers] Towards a release?

2019-12-03 Thread Eric Barnhill

It seems like we're pretty close.

I can take a look at 136, 137 related to log. I have had trouble finding
the space to launch the regression project. But I could work on some
smaller things.

Regarding 70, the user guide, what do you think of submitting an
application to Google Season of Docs? I can initiate if you have had quite
enough of that sort of thing.

Eric

On Tue, Dec 3, 2019 at 7:41 AM Gilles Sadowski  wrote:

> Hello.
>
> What do you think of releasing "Commons Numbers"?
> Please have a look at the list of pending issues.[1]
>
> Gilles
>
> [1]
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20NUMBERS%20AND%20fixVersion%20%3D%201.0%20AND%20statusCategory%20%3D%20new
>
>
>

Re: [Statistics] New component for (standard) distributions?

2019-12-03 Thread Eric Barnhill

I agree, distributions seems stable and well supported.

You are proposing releasing it outside of numbers?

I think it's a good idea. +1

On Tue, Dec 3, 2019 at 8:00 AM Gilles Sadowski  wrote:

> Hello.
>
> Most functionality of the "o.a.c.math4.distribution" package was migrated
> from Commons Math almost 2 years ago.
> The [Statistics] component should also host a refactoring of CM's "stat"
> package but development has stalled.  Obviously, it is unlikely that we can
> perform this task in the short term, while the design of the "distribution"
> module looks fairly stable (and it had already been refactored within CM).
> IMO, it should belong to its own maven project so it can be released
> without
> being encumbered for months (or years?) by the instability of the rest of
> the port...
>
> WDYT?
>
> Gilles
>
>
>

Re: [Numbers] Arrays of "Complex" objects and RAM

2019-11-07 Thread Eric Barnhill

On Thu, Nov 7, 2019 at 3:25 PM Gilles Sadowski  wrote:

> On Thu, 7 Nov 2019 at 18:36, Eric Barnhill wrote:
> >
> > I should also add on this note, my use case for developing ComplexUtils
> in
> > the first place was compatibility with JTransforms and JOCL. In both
> cases
> > I wanted to convert Complex[] arrays into interleaved double[] arrays to
> > feed into algorithms using those libraries.
>
> Implicit in my remark below is the question: Where does the "Complex[]"
> come from?  If it is never a good idea to create such an array, why provide
> code to convert from it?  Do we agree that we should rather create the
> "ComplexList" abstraction, including accessors that shape the data for
> use with e.g. JTransforms?
>
>
I completely agree this is a superior solution and look forward to its
development.

Re: [Numbers] Arrays of "Complex" objects and RAM

2019-11-07 Thread Eric Barnhill

I should also add on this note, my use case for developing ComplexUtils in
the first place was compatibility with JTransforms and JOCL. In both cases
I wanted to convert Complex[] arrays into interleaved double[] arrays to
feed into algorithms using those libraries.
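To make the conversion concrete, a minimal sketch of the interleaving. The Cplx class and the method names are stand-ins invented for this example; they are not the ComplexUtils API. The layout assumed is the usual one: real part at even index 2i, imaginary part at odd index 2i + 1.

```java
/** Hypothetical sketch of converting between complex objects and interleaved doubles. */
final class InterleaveSketch {
    /** Stand-in for a complex number type. */
    static final class Cplx {
        final double re;
        final double im;

        Cplx(double re, double im) {
            this.re = re;
            this.im = im;
        }
    }

    /** Packs {c0, c1, ...} as {re0, im0, re1, im1, ...}. */
    static double[] toInterleaved(Cplx[] values) {
        double[] out = new double[2 * values.length];
        for (int i = 0; i < values.length; i++) {
            out[2 * i] = values[i].re;
            out[2 * i + 1] = values[i].im;
        }
        return out;
    }

    /** Inverse of toInterleaved; the input length must be even. */
    static Cplx[] fromInterleaved(double[] data) {
        if ((data.length & 1) != 0) {
            throw new IllegalArgumentException("interleaved array must have even length");
        }
        Cplx[] out = new Cplx[data.length / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = new Cplx(data[2 * i], data[2 * i + 1]);
        }
        return out;
    }
}
```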

On Thu, Nov 7, 2019 at 9:34 AM Eric Barnhill  wrote:

>
>
> On Thu, Nov 7, 2019 at 6:09 AM Gilles Sadowski 
> wrote:
>
>>
>> This is also what started this thread: The user called the Commons Math's
>> FFT utilities using arrays of "Complex" objects and got the "OutOfMemory"
>> error.  Hence the question of whether storing many "Complex" objects is
>> ever useful, as compared to the "ComplexList", backed with an array of
>> primitives, and instantiating transient "Complex" instances on-demand.
>>
>
> I'm glad it has provoked such improvements. As you implicitly reference in
> your reply, probably we should just be shading JTransforms in the first
> place. I started using JTransforms because I had trouble using the
> commons-math FFT as well.
>

Re: [Numbers] Arrays of "Complex" objects and RAM

2019-11-07 Thread Eric Barnhill

On Thu, Nov 7, 2019 at 6:09 AM Gilles Sadowski  wrote:

>
> This is also what started this thread: The user called the Commons Math's
> FFT utilities using arrays of "Complex" objects and got the "OutOfMemory"
> error.  Hence the question of whether storing many "Complex" objects is
> ever useful, as compared to the "ComplexList", backed with an array of
> primitives, and instantiating transient "Complex" instances on-demand.
>

I'm glad it has provoked such improvements. As you implicitly reference in
your reply, probably we should just be shading JTransforms in the first
place. I started using JTransforms because I had trouble using the
commons-math FFT as well.

Re: [numbers] Bug in complex multiply + divide + isNaN

2019-11-07 Thread Eric Barnhill

On Thu, Nov 7, 2019 at 3:59 AM Alex Herbert 
wrote:

>
> There is a matrix for real/imaginary/complex all-vs-all additive and
> multiplicative operators in the standard (tables in G.5.1 and G.5.2).
> The question is do we want to support the entire matrix:
>
> Covered:
>
> Complex.multiplyReal(double x)  as  Complex.multiply(double x)
> Complex.divideReal(double x)    as  Complex.divide(double x)
> Complex.addReal(double x)       as  Complex.add(double x)
> Complex.subtractReal(double x)  as  Complex.subtract(double x)
>
> Not covered:
>
> Complex.multiplyImaginary(double x)
> Complex.divideImaginary(double x)
> Complex.addImaginary(double x)
> Complex.subtractImaginary(double x)
> Complex.subtractFromReal(double x)
> Complex.subtractFromImaginary(double x)
>
>
> I am going through Complex now to fix code coverage and bring it in line
> with the C99 standard. I will push all the config changes with the
> update to Complex. Should be done by end of today.

Well that's interesting, I did not see that the standard specified
all-vs-all methods in all those cases. There isn't a performance gain for
multiplying by an imaginary double like there is for a real double, as one
has to deal with imaginary*imaginary, so one might as well just pass a
Complex rather than an imaginary double. Consequently I would imagine
implementations of that corner of the standard's matrix are pretty rare.
But I see no reason not to have it for completeness and continuing the goal
I set, of being the only non-C library that I've ever seen that fulfills
the whole standard.

+1
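For reference, the arithmetic behind the imaginary case, as a sketch only. The class and method names are made up and work on bare doubles, not on the Complex class: (a + b i) * (y i) = -b y + (a y) i, shown next to the general product so the two routes can be compared.

```java
/** Hypothetical sketch; results are {real, imaginary}. */
final class ImaginaryMultiply {
    /** (a + b i) * (y i) = -b y + (a y) i. */
    static double[] multiplyImaginary(double a, double b, double y) {
        return new double[] {-b * y, a * y};
    }

    /** General product (a + b i) * (c + d i). */
    static double[] multiply(double a, double b, double c, double d) {
        return new double[] {a * c - b * d, a * d + b * c};
    }
}
```

Passing the imaginary factor as a full complex number (0 + y i) gives the same value for finite inputs, but the two routes can differ in the sign of a zero result: for a = b = 0 and y = 1 the dedicated method yields -0.0 in the real part while the general product yields 0.0.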

Re: [numbers] Bug in complex multiply + divide + isNaN

2019-11-06 Thread Eric Barnhill

+1 on all suggestions. Thanks, Alex.

On Wed, Nov 6, 2019 at 2:38 PM Alex Herbert 
wrote:

>
>
> > On 6 Nov 2019, at 18:17, Gilles Sadowski  wrote:
> >
> >> [...]
> >>
> >>
> >> Any objections to updating multiply/divide/isNaN to match the standard?
> >
> > Let me think... ;-)
>
> OK, I’ll fix it and double check the other tests against the c reference.
>
> >
> >>
> >> I'll add unit tests to hit the edge cases that should fail with the
> >> current implementation.
> >
> > Thanks,
> > Gilles
>
> Are changes to numbers going under Jira tickets?
>
> It looks like it needs an update to checkstyle, PMD, spotbugs, the
> commons-parent and travis.
>
> Checkstyle config from commons-rng finds:
>
> [INFO] There are 115 errors reported by Checkstyle 8.20 with
> /Users/ah403/git/commons-numbers/commons-numbers-core/../src/main/resources/checkstyle/checkstyle.xml
> ruleset.
> [INFO] There are 202 errors reported by Checkstyle 8.20 with
> /Users/ah403/git/commons-numbers/commons-numbers-complex/../src/main/resources/checkstyle/checkstyle.xml
> ruleset.
> [INFO] There are 102 errors reported by Checkstyle 8.20 with
> /Users/ah403/git/commons-numbers/commons-numbers-complex-streams/../src/main/resources/checkstyle/checkstyle.xml
> ruleset.
> [INFO] There are 276 errors reported by Checkstyle 8.20 with
> /Users/ah403/git/commons-numbers/commons-numbers-primes/../src/main/resources/checkstyle/checkstyle.xml
> ruleset.
> [INFO] There are 68 errors reported by Checkstyle 8.20 with
> /Users/ah403/git/commons-numbers/commons-numbers-quaternion/../src/main/resources/checkstyle/checkstyle.xml
> ruleset.
> [INFO] There are 289 errors reported by Checkstyle 8.20 with
> /Users/ah403/git/commons-numbers/commons-numbers-fraction/../src/main/resources/checkstyle/checkstyle.xml
> ruleset.
> [INFO] There are 10 errors reported by Checkstyle 8.20 with
> /Users/ah403/git/commons-numbers/commons-numbers-angle/../src/main/resources/checkstyle/checkstyle.xml
> ruleset.
> [INFO] There are 3503 errors reported by Checkstyle 8.20 with
> /Users/ah403/git/commons-numbers/commons-numbers-gamma/../src/main/resources/checkstyle/checkstyle.xml
> ruleset.
> [INFO] There are 56 errors reported by Checkstyle 8.20 with
> /Users/ah403/git/commons-numbers/commons-numbers-combinatorics/../src/main/resources/checkstyle/checkstyle.xml
> ruleset.
> [INFO] There are 50 errors reported by Checkstyle 8.20 with
> /Users/ah403/git/commons-numbers/commons-numbers-arrays/../src/main/resources/checkstyle/checkstyle.xml
> ruleset.
> [INFO] There are 10 errors reported by Checkstyle 8.20 with
> /Users/ah403/git/commons-numbers/commons-numbers-field/../src/main/resources/checkstyle/checkstyle.xml
> ruleset.
> [INFO] There are 4 errors reported by Checkstyle 8.20 with
> /Users/ah403/git/commons-numbers/commons-numbers-rootfinder/../src/main/resources/checkstyle/checkstyle.xml
> ruleset.
>
> The mass of errors is white space style in the test classes. Without the
> test classes the result is:
>
> [INFO] There are 12 errors reported by Checkstyle 8.20 with
> /Users/ah403/git/commons-numbers/commons-numbers-core/../src/main/resources/checkstyle/checkstyle.xml
> ruleset.
> [INFO] There are 54 errors reported by Checkstyle 8.20 with
> /Users/ah403/git/commons-numbers/commons-numbers-complex/../src/main/resources/checkstyle/checkstyle.xml
> ruleset.
> [INFO] There are 19 errors reported by Checkstyle 8.20 with
> /Users/ah403/git/commons-numbers/commons-numbers-complex-streams/../src/main/resources/checkstyle/checkstyle.xml
> ruleset.
> [INFO] There are 49 errors reported by Checkstyle 8.20 with
> /Users/ah403/git/commons-numbers/commons-numbers-primes/../src/main/resources/checkstyle/checkstyle.xml
> ruleset.
> [INFO] There are 5 errors reported by Checkstyle 8.20 with
> /Users/ah403/git/commons-numbers/commons-numbers-quaternion/../src/main/resources/checkstyle/checkstyle.xml
> ruleset.
> [INFO] There are 6 errors reported by Checkstyle 8.20 with
> /Users/ah403/git/commons-numbers/commons-numbers-fraction/../src/main/resources/checkstyle/checkstyle.xml
> ruleset.
> [INFO] There are 3 errors reported by Checkstyle 8.20 with
> /Users/ah403/git/commons-numbers/commons-numbers-angle/../src/main/resources/checkstyle/checkstyle.xml
> ruleset.
> [INFO] There are 20 errors reported by Checkstyle 8.20 with
> /Users/ah403/git/commons-numbers/commons-numbers-gamma/../src/main/resources/checkstyle/checkstyle.xml
> ruleset.
> [INFO] There are 19 errors reported by Checkstyle 8.20 with
> /Users/ah403/git/commons-numbers/commons-numbers-combinatorics/../src/main/resources/checkstyle/checkstyle.xml
> ruleset.
> [INFO] There are 4 errors reported by Checkstyle 8.20 with
> /Users/ah403/git/commons-numbers/commons-numbers-arrays/../src/main/resources/checkstyle/checkstyle.xml
> ruleset.
> [INFO] There are 6 errors reported by Checkstyle 8.20 with
> /Users/ah403/git/commons-numbers/commons-numbers-field/../src/main/resources/checkstyle/checkstyle.xml
> ruleset.
>
>
> Also looking at Complex it

Re: [Numbers] Arrays of "Complex" objects and RAM

2019-11-04 Thread Eric Barnhill

That's interesting. The JTransforms library performs Fourier transforms
that can take complex input, output, or both. They do this with interleaved
double[] arrays, which I suppose is much more space efficient, and the
status of a number as real or imaginary is implicit by its location being
odd or even.

The MultidimensionalCounter functionality you mention is for example known
in Matlab as ind2sub() and sub2ind() . It allows for 1d vectorizing of
certain functions which can improve performance. Some people swear by it. I
always found it an additional mental exercise that I didn't want to occupy
myself with unless I absolutely had to. So, maybe it makes sense to
"advertise" this approach like you say, but some users may just want the 3D
arrays for more rapid prototyping-ish applications.
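A small sketch of the index mapping, for anyone who has not met it. Note the assumptions: this hypothetical class uses 0-based, row-major ordering (last index varies fastest), whereas Matlab's ind2sub() and sub2ind() are 1-based and column-major; it is also not the MultidimensionalCounter API.

```java
/** Hypothetical 3D <-> 1D index mapping (0-based, row-major). */
final class IndexMapping {
    /** Flat index of element (i, j, k) in an array of shape dims = {n1, n2, n3}. */
    static int sub2ind(int[] dims, int i, int j, int k) {
        return (i * dims[1] + j) * dims[2] + k;
    }

    /** Inverse mapping: returns {i, j, k} for a flat index. */
    static int[] ind2sub(int[] dims, int index) {
        int k = index % dims[2];
        int rest = index / dims[2];
        int j = rest % dims[1];
        int i = rest / dims[1];
        return new int[] {i, j, k};
    }
}
```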

I wonder if there isn't some streaming solution for this -- the numbers are
stored as an interleaved 1D array, but are streamed through a Complex
constructor before any needed operations are performed.
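Roughly what I have in mind, as a sketch only. The view class and the Cplx stand-in are hypothetical names, not an existing Commons Numbers API: the data stays in one interleaved double[] and Cplx instances are created transiently as the stream is consumed.

```java
/** Hypothetical read-only view over interleaved {re0, im0, re1, im1, ...} data. */
final class InterleavedComplexView {
    /** Stand-in for a complex number type. */
    static final class Cplx {
        final double re;
        final double im;

        Cplx(double re, double im) {
            this.re = re;
            this.im = im;
        }

        double abs() {
            return Math.hypot(re, im);
        }
    }

    private final double[] data;

    InterleavedComplexView(double[] interleaved) {
        if ((interleaved.length & 1) != 0) {
            throw new IllegalArgumentException("interleaved array must have even length");
        }
        this.data = interleaved;
    }

    /** Number of complex values held. */
    int size() {
        return data.length / 2;
    }

    /** Creates a transient instance for element i; nothing is cached. */
    Cplx get(int i) {
        return new Cplx(data[2 * i], data[2 * i + 1]);
    }

    /** Streams the elements, instantiating each one on demand. */
    java.util.stream.Stream<Cplx> stream() {
        return java.util.stream.IntStream.range(0, size()).mapToObj(this::get);
    }
}
```

Usage would look like view.stream().mapToDouble(c -> c.abs()).sum(); only one short-lived object per element exists at a time, at the cost of an allocation per access.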

And I guess that leads to my last question -- suppose someone wants to call
a trig function on a series of Complex numbers. Now let's imagine the
primitives have been stored in some maximally efficient way. It seems like,
to use any of the functionality in Complex, these numbers would have to be
unpacked, cast as Complex, operated on, then cast back to how they are
being stored. I wonder if that would prove to be more efficient in the end.

On Sat, Nov 2, 2019 at 7:14 PM Gilles Sadowski  wrote:

> Hello.
>
> The class "ComplexUtils" deals with multi-dimensional arrays that hold
> instances of the "Complex" class.
> I've recently encountered a use-case where it was pointed out that storing
> many "Complex" instances (as seems the purpose of these utilities) is quite
> inefficient memory-wise as each instance will take 32 bytes[1] while the
> real and imaginary parts would only take 16 bytes as 2 primitive "double"s.
> This is compounded by multi-dimensional array where each sub-dimensional
> element is an array object (and thus takes another additional 16 bytes).
> For example, a
> double[10][5][4]
> would take
> 16 * (1 + 10 * 5) + 10 * 5 * 4 * 8
>   = 2416 bytes.
> Assuming that in the above array, the last dimension holds 2 complex
> numbers,
> the same data can be represented as
> Complex[10][5][2]
> that would take
> 16 * ((1 + 10 * 5) + (10 * 5 * 2)) + 10 * 5 * 2 * 2 * 8
>   = 4016 bytes.
> In both cases, the payload (10 * 5 * 2 complex numbers) is
> 10 * 5 * 2 * 2 * 8
>   = 1600 bytes.
> If stored in a one-dimensional array, the size in memory would be 1616
> bytes.
> Thus in the case of a data cube holding 100 complex numbers, a 3D array
> takes 1.5 (primitives) or 2.5 ("Complex" objects) times more memory than
> a 1D array.
> If this is correct, I wonder whether we should advertise such a
> "ComplexUtils"
> class.  It would perhaps be preferable to propose a data cube
> abstraction.[2]
>
> WDYT?
>
> Regards,
> Gilles
>
> [1] https://www.baeldung.com/java-size-of-object
> [2] Based on
> http://commons.apache.org/proper/commons-math/apidocs/org/apache/commons/math4/util/MultidimensionalCounter.html
>
>
>
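The arithmetic in the quoted message can be reproduced with a short sketch. The class name is made up, and the cost model is the one assumed in the quote: 16 bytes of overhead per object, 8 bytes per double, (1 + n1 * n2) array objects for a 3D array, and no accounting for reference slots or alignment. Real JVM layouts may differ.

```java
/** Hypothetical estimator that mirrors the quoted calculation. */
final class ArrayMemoryEstimate {
    static final int OVERHEAD = 16;    // assumed per-object overhead in bytes
    static final int DOUBLE_BYTES = 8; // size of one primitive double

    /** double[n1][n2][n3] as counted in the quoted message. */
    static long doubleArray3d(int n1, int n2, int n3) {
        long arrays = 1 + (long) n1 * n2;
        return OVERHEAD * arrays + (long) n1 * n2 * n3 * DOUBLE_BYTES;
    }

    /** Complex[n1][n2][n3], each element holding two doubles plus object overhead. */
    static long complexArray3d(int n1, int n2, int n3) {
        long arrays = 1 + (long) n1 * n2;
        long elements = (long) n1 * n2 * n3;
        return OVERHEAD * (arrays + elements) + elements * 2 * DOUBLE_BYTES;
    }

    /** One flat double[] holding count complex numbers as interleaved pairs. */
    static long flatArray(int count) {
        return OVERHEAD + (long) count * 2 * DOUBLE_BYTES;
    }
}
```

For the example sizes this gives the same 2416, 4016 and 1616 bytes as the message, so the ratios to the flat array are about 1.5 and 2.5.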

Re: [statistics-regression] Proposed Regression class/method structure

2019-10-28 Thread Eric Barnhill

On Mon, Oct 28, 2019 at 3:01 PM Gilles Sadowski 
wrote:

> Hi Eric.
>
> 2019-10-28 18:55 UTC+01:00, Eric Barnhill :
> > Here is a schematic for how the interface might be made more abstract.
> >
> > https://imgur.com/a/izx5Xkh
>
> This cannot be downloaded.
> Please attach the image to a JIRA issue.
>
> Regards,
> Gilles
>
It is attached to STATISTICS-8.

As for whether Vector is necessary. The idea was to sketch out an interface
that was more abstracted. Maybe Vector is a bit too abstract in a java
context, it's a pretty common container in many languages.

With more time to ponder, my vote is just to use EJML Matrix and double[]
as I proposed in the first scheme. Any use cases for which Matrix and
double[] will not suffice would be quite far off and I suspect this simple
approach will be sufficient for the commons mission.

Eric

Re: [statistics-regression] Proposed Regression class/method structure

2019-10-28 Thread Eric Barnhill

Here is a schematic for how the interface might be made more abstract.

https://imgur.com/a/izx5Xkh

In this case, we may want to just implement the simplest case, using Matrix
and double[], for now.

In principle the RegressionMetric class could extend a Metrics class later.

Do you feel this would set up the library better for the future?

Eric

Re: [statistics-regression] Proposed Regression class/method structure

2019-10-22 Thread Eric Barnhill

Here is a link to the picture

https://imgur.com/a/9jjoOGB

On Tue, Oct 22, 2019 at 4:13 PM Gilles Sadowski 
wrote:

> Hello.
>
> On Tue, 22 Oct 2019 at 21:50, Eric Barnhill wrote:
> >
> > I propose the following class structure for
> commons-statistics-regression.
>
> Which?
> [Attachment was probably stripped: such should go to a JIRA report.]
>
> > The interface carried over from commons-math is more of an academic
> approach to thinking about regression. For rebooting the library (and I
> hinted at this when I wrote the tickets for summer of code) I was hoping to
> emulate widespread tools like R and scikit-learn, and consider that
> "machine learning" is an increasingly popular use of regression. This
> proposed structure creates an interface that is not the same as, but will
> be very friendly to, anyone coming from R or scikit-learn, or similar tools
> in JavaScript.
> >
> > There are of course many ways I can see to elaborate this scheme, say
> using RegressionResult objects and so forth. But Matrices paired with a
> double[], returning a double[] of coefficients or predictions, are likely
> to be the most common use cases and should be plenty to get started.
>
> Commenting perhaps too early (not seeing the proposed design), but we
> broadly
> discussed that the linear algebra API is not easy to get right, and once
> we "get
> started", the trend is to be stuck with it for ages (related issues
> are among the
> oldest unresolved ones in CM).
>
> > Under the hood I would use the available implementations in commons-math
> to get up and running, and worry about improving them later.
>
> Do you mean port from, or depend on, CM?
>
> Regards,
> Gilles
>
> >
> > Feedback appreciated,
> > Eric
>
>
>

[statistics-regression] Proposed Regression class/method structure

2019-10-22 Thread Eric Barnhill

I propose the following class structure for commons-statistics-regression.

The interface carried over from commons-math is more of an academic
approach to thinking about regression. For rebooting the library (and I
hinted at this when I wrote the tickets for summer of code) I was hoping to
emulate widespread tools like R and scikit-learn, and consider that
"machine learning" is an increasingly popular use of regression. This
proposed structure creates an interface that is not the same as, but will
be very friendly to, anyone coming from R or scikit-learn, or similar tools
in JavaScript.

There are of course many ways I can see to elaborate this scheme, say using
RegressionResult objects and so forth. But Matrices paired with a double[],
returning a double[] of coefficients or predictions, are likely to be the
most common use cases and should be plenty to get started.

Under the hood I would use the available implementations in commons-math to
get up and running, and worry about improving them later.

Feedback appreciated,
Eric

[image: image.png]

Re: [GSoC][Commons][Statistics][Descriptive] Mean should be initiated with 0 or NaN ?

2019-07-19 Thread Eric Barnhill

Hi Virendra,

I think that's right in terms of initialization. If it is initialized to
NaN then accumulation will require an additional step getting rid of the
NaN. Just initialize to zero.

I just looked around and it's pretty clear that it is best practice to
return NaN in the edge case of an average of no values. That is what
happens in Python when calling numpy.mean([]) and in R when calling
mean(numeric(0)), and that is also mathematically right.

So a check for a count of zero, returning NaN in that case, should probably
be implemented somewhere, though I think that step could be saved until
after the milestone. Under the hood, though, initialize to zero.
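
A minimal sketch of that combination, assuming a hypothetical RunningMean
class (the name and methods are illustrative, not the GSoC code): the
accumulator starts at zero so that adding a value needs no NaN handling, and
the empty case is only checked when the mean is read.

```java
/** Hypothetical accumulator: zero-initialized state, NaN reported only when empty. */
class RunningMean {
    private long n = 0;          // number of values seen so far
    private double mean = 0.0;   // initialized to zero, not NaN

    /** Incremental update; no branch on n is needed here. */
    void accept(double x) {
        n++;
        mean += (x - mean) / n;
    }

    /** The only place where the empty case is checked. */
    double getMean() {
        return n == 0 ? Double.NaN : mean;
    }
}
```

The check runs once per read rather than once per added value, which is the
trade-off described in the quoted Slack discussion.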

On Thu, Jul 18, 2019 at 7:26 PM Virendra singh Rajpurohit <
virendrasing...@gmail.com> wrote:

> Hi all,
> Hope you all are doing well, I had a discussion  on Slack with my GSoC
> mentors regarding this variable initiation. I'm posting it on ML for more
> opinions.
>
> *Should the variables like mean be initiated with NaN or 0?*
> Because, definitional formula of mean is,
> mean = (sum of values)/n
> Hence for  n=0 it is 0/0 which is NaN
> But also Java's SummaryStatistics classes(Double, Long & Int) return
> average=0 for n=0.
> As discussed on slack, "The initialization should not set the initial value
> to NaN. This is a convenience to make getMean() faster. This is likely to
> cause fewer problems than NaN when used in downstream computations".
> Assigning '0' will make things faster because if condition to check n value
> will be removed in calculation and assigning 'NaN' will be more correct.
> *Alex Herbert* suggested NaN can be used in getMean() method with if
> condition to check 'n' value, that way we don't check condition everytime a
> value is added.
> What are your opinions about it?
>
> --
> *Virendra Singh Rajpurohit*
>

Re: [numbers-fraction] Double approximation constructor/factory method overhaul

2019-07-19 Thread Eric Barnhill

I'm looking forward to reviewing your code within my limited knowledge;
however, I can't guarantee a quick time frame, since Apache GSoC mentor
milestones are due next week and I think that could get time-consuming.

Thanks for the contribution,
Eric

On Thu, Jul 18, 2019 at 4:13 PM Heinrich Bohne 
wrote:

> So, I think the code I have so far is ripe for a pull request, so I
> submitted one. I changed the contracts of the epsilon and
> max-denominator factory methods, because the old specifications were not
> very useful, especially that of the max-denominator method – the only
> guarantee you could get from it was that /some/ fraction with a
> denominator not greater than the specified maximum would be returned,
> without any relation to the double value that should be approximated.
>
> The simple continued fraction is now created from an exact BigFraction
> representation of the double value rather than with floating-point
> arithmetic on the double value itself, to preserve maximum precision
> (the old epsilon algorithm produces a fraction of 1/3 when passed the
> double value obtained from the expression 1.0/3.0 and an epsilon of 5 *
> 2^(-58), which is incorrect because the distance between the rational
> number 1/3 and its closest double representative is larger than 5 *
> 2^(-58); I added a corresponding unit test).
>
> The methods setLastCoefficient(BigInteger) and removeLastCoefficient()
> in the new class SimpleContinuedFraction are unused. I wrote this class
> before I implemented the algorithms, and I thought these methods might
> be useful in the max-denominator method, but this turned out not to be
> the case. However, from a design-perspective, these two methods
> complement the functionality of addCoefficient(BigInteger), so I thought
> their presence is still tolerable.
>
> I solved the problem with the maxIterations argument in the epsilon
> method by simply not limiting the number of iterations if this argument
> is negative. Maybe this parameter was necessary in the old algorithm
> which calculated the simple continued fraction with floating-point
> arithmetic, to prevent an infinite loop or something.
>
> On 7/17/19 10:20 AM, Heinrich Bohne wrote:
> > It just occurred to me that you might have misunderstood my sentence:
> >
> >> I am even more confused by your suggestion seeing as it was
> >> you who banned BigInteger from Fraction.addSub(Fraction, boolean) in
> >> https://issues.apache.org/jira/browse/NUMBERS-79  , which, even though
> >> you were not aware of it at that time, did not limit the method's
> >> functionality in any way whatsoever
> >
> > The "which" was referring to your removal of BigInteger, not to the use
> > of BigInteger prior to your edits, so what I meant to say was, by
> > removing BigInteger, you did not limit the method's functionality.
> >
> >
> > On 7/17/19 10:10 AM, Heinrich Bohne wrote:
> >>> The reason it was done was because Knuth proved
> >>> (as in mathematical proof) that a long is insufficient for certain
> >>> fraction
> >>> multiplications where both numerator and denominator are large ints; 65
> >>> rather than 64 bits are necessary and a long will not suffice.
> >>
> >> You seem to have missed my comment in ticket
> >> https://issues.apache.org/jira/browse/NUMBERS-79 , which you created –
> I
> >> don't have the book by D. Knuth, but I can only assume that the section
> >> referenced by the code talks about unsigned integers, because by the
> >> logic in the comment I left in the JIRA ticket, long values are
> >> **always** sufficient in Fraction.addSub(Fraction, boolean).
> >>
> >> But this is beside the point, I only mentioned it because I didn't
> >> understand why you suggested to remove the BigFraction class, and
> >> actually, I still don't, as the class BigFraction provides functionality
> >> that Fraction cannot have, both with and without my suggested
> >> alterations.
> >>
> >>
> >> On 7/17/19 2:29 AM, Eric Barnhill wrote:
> >>> On Tue, Jul 16, 2019 at 2:41 PM Heinrich Bohne
> >>> wrote:
> >>>
> >>>>> Do you think we really even need a BigFraction class at all in the
> >>>> context
> >>>>> of these upgrades? Or should one of the Fraction factory methods just
> >>>> take
> >>>>> BigInteger arguments, and all fractions use the lazy dynamic
> >>>>> method of
> >>>>> calculation you are proposing?
> >>>> I don't quite understand what you mean by this. The BigInteger clas

[statistics] Proposed OLS grammar

2019-07-18 Thread Eric Barnhill

I suggested the following grammar to aim for in our meeting today with the
developing OLS module. If you see anything you'd prefer to change, let's
establish it now; if anyone doesn't like it later, it's on me.

RegressionData data = RegressionDataLoader.of(double[][] y, double[] x);
Regression ols = new OLSRegression();
RegressionResults results = ols.regress(data);
double[] betas = results.getBetas();

where:
RegressionData is an interface
RegressionDataLoader is a factory class and of() a (possibly overloaded)
static method
Regression is an interface, implemented by OLSRegression
RegressionResults is an interface, the specific class returned is
OLSResults which implements it.
betas are the intercept and slopes of the regression model

I think this preserves abstraction at the levels desired, since in future
we will want flexibility as to regression type, possible state parameters
set on the regression object, and results contents and format. But it also
doesn't take on any unnecessary abstractions.

Eric
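
To make the grammar above concrete, here is a rough, self-contained sketch.
It is only an illustration under simplifying assumptions: the type names
follow the message, but every method body is invented here, and the data is
restricted to a single predictor (double[] x, double[] y) so that the fit
needs no matrix library. It is not the module's actual code.

```java
/** Read-only view of the data to regress (single predictor in this sketch). */
interface RegressionData {
    double[] getX();
    double[] getY();
}

/** Factory class with a static of() method, as in the proposed grammar. */
final class RegressionDataLoader {
    private RegressionDataLoader() { }

    static RegressionData of(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two or more paired observations");
        }
        final double[] xs = x.clone();
        final double[] ys = y.clone();
        return new RegressionData() {
            @Override public double[] getX() { return xs.clone(); }
            @Override public double[] getY() { return ys.clone(); }
        };
    }
}

interface RegressionResults {
    /** Intercept first, then slope(s). */
    double[] getBetas();
}

interface Regression {
    RegressionResults regress(RegressionData data);
}

/** Concrete results type returned by OLSRegression. */
final class OLSResults implements RegressionResults {
    private final double[] betas;
    OLSResults(double[] betas) { this.betas = betas.clone(); }
    @Override public double[] getBetas() { return betas.clone(); }
}

/** Closed-form least squares for one predictor. */
final class OLSRegression implements Regression {
    @Override
    public RegressionResults regress(RegressionData data) {
        double[] x = data.getX();
        double[] y = data.getY();
        int n = x.length;
        double meanX = 0;
        double meanY = 0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;
        double sxy = 0;
        double sxx = 0;
        for (int i = 0; i < n; i++) {
            sxy += (x[i] - meanX) * (y[i] - meanY);
            sxx += (x[i] - meanX) * (x[i] - meanX);
        }
        double slope = sxy / sxx;
        double intercept = meanY - slope * meanX;
        return new OLSResults(new double[] {intercept, slope});
    }
}
```

Calling code then reads just like the four-line grammar: load the data,
construct an OLSRegression behind the Regression interface, call regress(),
and read getBetas().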

[GSoC] Required assignment for July 22nd milestone

2019-07-17 Thread Eric Barnhill

Phase 2 Evals are coming up starting July 22nd. For this eval we will be
emphasizing the necessity for the mentees to submit code on the same
quality level as typical commons contributors; let's get to a small amount
of code that is production-level.

We therefore present the following assignment for the GSoC mentees
(due July 22):
- create a gsoc-milestone-1 branch
- this branch contains ONE single functionality out of what you have
  been working on, to be clarified in discussion with the mentors at
  tomorrow's meeting
- this code *must* be at the level of regular contributors to Apache
  commons and worthy of being pulled into production. This means:
  - the code must be clean: no blocks of commented-out unfinished
    code, no modules of any kind that are not in use by the milestone's
    functionality
  - the code must be free of all code smells
    (https://en.wikipedia.org/wiki/Code_smell)
  - the code must be completely encapsulated: it can compile and run a
    unit test on its own that does not require reference to
    unaccounted-for dependencies
  - all variables must be of appropriate scope and follow best-practice
    naming conventions

We will also discuss some related issues regarding Google's requirements
and guidance at the meeting tomorrow, which will fill out your
understanding of this assignment.

Re: [numbers-fraction] Double approximation constructor/factory method overhaul

2019-07-16 Thread Eric Barnhill

On Tue, Jul 16, 2019 at 2:41 PM Heinrich Bohne 
wrote:

> > Do you think we really even need a BigFraction class at all in the
> context
> > of these upgrades? Or should one of the Fraction factory methods just
> take
> > BigInteger arguments, and all fractions use the lazy dynamic method of
> > calculation you are proposing?
>
> I don't quite understand what you mean by this. The BigInteger class
> provides flexibility and the ability to store and operate on
> (practically) unlimited values, which Fraction does not have. The
> Fraction class, on the other hand, is faster and more memory efficient,
> due to its use of primitive values, which is an advantage over
> BigFraction.

That's fine.

> I am even more confused by your suggestion seeing as it was
> you who banned BigInteger from Fraction.addSub(Fraction, boolean) in
> https://issues.apache.org/jira/browse/NUMBERS-79 , which, even though
> you were not aware of it at that time, did not limit the method's
> functionality in any way whatsoever (the use of int rather than long
> did, however, but this is now fixed).
>

I don't know what you mean by "functionality" but constructing a BigInteger
for every fraction multiplication uses up more memory and operations than
necessary and scales poorly. BigIntegers are not fast.

However, I understand why the previous coders incorporated a BigInteger and
I'm not sure that you do. The reason it was done was because Knuth proved
(as in mathematical proof) that a long is insufficient for certain fraction
multiplications where both numerator and denominator are large ints; 65
rather than 64 bits are necessary and a long will not suffice. For me,
these cases are so extreme and likely so rare that we might as well let
them fail, report to the user that these cases need to be handled with
BigFraction and leave it there. It could easily be handled in a try catch
block and such a block would be high performance.

That was the judgment I made and it is open to interpretation, provided
such interpretation agrees with Knuth's proof. We are entitled to our own
opinions but not our own facts.

Anyway, I think your approximation schemes sound good; implement them
however you see fit.

Eric
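
As a rough illustration of the "try the primitive path, fall back on
overflow" idea mentioned above, here is a sketch. The class and method names
are hypothetical, and it does not reproduce Fraction.addSub; it only shows
the unreduced numerator a*d + c*b of a/b + c/d being computed with checked
long arithmetic, with BigInteger used only if an ArithmeticException signals
overflow.

```java
import java.math.BigInteger;

/** Hypothetical helper, not part of commons-numbers. */
class CheckedFractionMath {

    /** Fast path: throws ArithmeticException if any step overflows a long. */
    static long addNumerator(long a, long b, long c, long d) {
        return Math.addExact(Math.multiplyExact(a, d), Math.multiplyExact(c, b));
    }

    /** Slow path: arbitrary precision, never overflows. */
    static BigInteger addNumeratorBig(long a, long b, long c, long d) {
        BigInteger ad = BigInteger.valueOf(a).multiply(BigInteger.valueOf(d));
        BigInteger cb = BigInteger.valueOf(c).multiply(BigInteger.valueOf(b));
        return ad.add(cb);
    }

    /** Tries the primitive computation first and falls back only on overflow. */
    static BigInteger addNumeratorWithFallback(long a, long b, long c, long d) {
        try {
            return BigInteger.valueOf(addNumerator(a, b, c, d));
        } catch (ArithmeticException overflow) {
            return addNumeratorBig(a, b, c, d);
        }
    }
}
```

Whether the fallback belongs inside the class or is left to the caller (for
example by switching to BigFraction) is exactly the design question debated
in this thread; the sketch takes no position on it.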

Re: [numbers-fraction] Double approximation constructor/factory method overhaul

2019-07-16 Thread Eric Barnhill

Sorry for the delay, I was on vacation.

On Fri, Jul 5, 2019 at 2:09 PM Heinrich Bohne  wrote:

> Hello!
>
> I think a re-design of the factory method BigFraction.from(double,
> double, int, int) is in order, because I see several problems with it:
>
> First, having a separate fraction class intended to overcome the
> limitations of the int range with a factory method (formerly a
> constructor) for approximating double values that can only produce
> denominators within the int range because it has been copy-pasted from
> Fraction (where this code is still a constructor) seems a bit like a
> joke. I think it would be more useful to have this method accept a
> BigInteger as an upper bound for the denominator instead of an int.
>

Quite right! I wanted to look this up before replying. It absolutely makes
sense to use a BigInteger there.


>
> Second, the method only calculates the convergents of the corresponding
> continued fraction, but never its semi-convergents, so it doesn't
> necessarily produce the best rational approximation of the double number
> within the given bounds. For example, the test method
> BigFractionTest.testDigitLimitConstructor() asserts that the method
> calculates 3/5 as an approximation of 0.6152 with the upper bound for
> the denominator set to 9, but 5/8 = 0.625 is closer to 0.6152 than 3/5 =
> 0.6. Since the method is already using continued fractions to
> approximate fractional numbers, I think it would be a pity if it didn't
> take advantage of them for all that they're worth.
>

Wow. That is indeed problematic, nice catch.


>
> Finally, the documentation of the method rightfully acknowledges the
> latter's confusing design, with the method's general behavior being
> dependent on some of its arguments and the validity of these arguments
> also being dependent on each other. However, a better way to solve this
> problem than to simply hide the design from the public would be to
> improve it, e.g. by extracting the functionality that is common to both
> the "maxDenominator mode" and the epsilon mode (which is the calculation
> of the continued fraction), and separating the differences in the
> functionality of the two modes into distinct methods that call the
> common functionality.
>

Yes absolutely.


>
> My suggestion for the third point above would be to create a separate
> class (not necessarily public) that provides an interface for
> calculating simple continued fractions and their convergents (I see that
> there's an abstract class ContinuedFraction, but I don't think it will
> be useful, because all the methods only return double values, and the
> class also requires that all coefficients can be explicitly calculated
> based on their index). The class would ideally be able to calculate the
> continued fraction dynamically/lazily, because only a limited number of
> coefficients are needed to approximate a fractional number within given
> bounds.


That would be awesome.


> What I think could be useful is if the class stores a list of
> the coefficients internally in addition to the current and previous
> convergent (two consecutive convergents are needed to calculate the next
> one recursively based on the next coefficient), and has methods like
> addCoefficient(BigInteger) and removeLastCoefficient() for building a
> continued fraction, and also a static method like
> coefficientsOf(BigFraction) that returns an Iterator that
> computes the coefficients only as they are queried through the iterator,
> so that they can then be passed to addCoefficient(BigInteger).
>

+1


>
> The maxDenominator factory method could then just iterate over the
> coefficients of the continued fraction representation of the passed
> double and build the continued fraction from them until the denominator
> of the current convergent exceeds the upper bound,


yes.



> and the epsilon
> method could iterate over the coefficients of both the lower and upper
> bound's continued fraction representation until the coefficients start
> to differ, at which point it can build the continued fraction of the
> close enough approximation from all coefficients at once (this would
> also prevent any loss of precision when repeatedly performing arithmetic
> operations with floating-point values).
>

right.


>
> Furthermore, this code could not only be used by the approximation
> factory methods in BigFraction, but also by those in Fraction, possibly
> adjusted so that not only the denominator must be within a given bound,
> but also the numerator needs to be within the int range.
>
> Any opinions or objections?


All expressed very clearly. This will be a major upgrade for the package.

Do you think we really even need a BigFraction class at all in the context
of these upgrades? Or should one of the Fraction factory methods just take
BigInteger arguments, and all fractions use the lazy dynamic method of
calculation you are proposing?
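
The semi-convergent point quoted above (5/8 beating 3/5 for 0.6152 with a
maximum denominator of 9) can be checked with a small sketch. Assumptions:
the class and method names are hypothetical, the input is an exact fraction
num/den with a positive denominator, and plain long arithmetic is used for
brevity where the proposal would use BigInteger, so it is only safe for small
values.

```java
/** Hypothetical sketch, not the proposed SimpleContinuedFraction class. */
class BestApproximation {

    /**
     * Returns {numerator, denominator} of the fraction closest to num/den
     * whose denominator does not exceed maxDen (den > 0, maxDen >= 1).
     */
    static long[] closest(long num, long den, long maxDen) {
        if (den <= 0 || maxDen < 1) {
            throw new IllegalArgumentException("den must be > 0 and maxDen >= 1");
        }
        // Two consecutive convergents: p0/q0 (previous) and p1/q1 (current).
        long p0 = 0;
        long q0 = 1;
        long p1 = 1;
        long q1 = 0;
        long n = num;
        long d = den;
        while (d != 0) {
            long a = Math.floorDiv(n, d);   // next coefficient, computed on demand
            long p2 = a * p1 + p0;
            long q2 = a * q1 + q0;
            if (q2 > maxDen) {
                // The next convergent is out of range: build the largest
                // semi-convergent that still fits under maxDen.
                long k = (maxDen - q0) / q1;
                long ps = k * p1 + p0;
                long qs = k * q1 + q0;
                // Compare |ps/qs - num/den| with |p1/q1 - num/den| by cross-multiplying.
                long errSemi = Math.abs(ps * den - num * qs) * q1;
                long errConv = Math.abs(p1 * den - num * q1) * qs;
                return errSemi < errConv ? new long[] {ps, qs} : new long[] {p1, q1};
            }
            p0 = p1;
            q0 = q1;
            p1 = p2;
            q1 = q2;
            long r = n - a * d;
            n = d;
            d = r;
        }
        return new long[] {p1, q1};   // num/den itself fits within maxDen
    }
}
```

For 0.6152 = 769/1250 and a maximum denominator of 9, the last convergent in
range is 3/5 and the candidate semi-convergent is 5/8; the comparison step
picks 5/8, in line with the observation in the quoted message.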

Re: [All] Actively seek contributor? [Was: External dependency for linear algebra?]

2019-06-26 Thread Eric Barnhill

+1

On Tue, Jun 25, 2019 at 6:24 PM Gilles Sadowski 
wrote:

> Hi.
>
> Thanks for the suggestion, Rob.
>
> Should we contact him?  [Perhaps he reads this ML...]
>https://www.linkedin.com/in/peter-abeles-59b2603
>https://github.com/lessthanoptimal/
>
> In addition to EJML, it seems that there could be a nice consolidation
> with [Geometry]:
>https://github.com/lessthanoptimal/GeoRegression
>
> Gilles
>
> Le lun. 24 juin 2019 à 02:24, Gilles Sadowski  a
> écrit :
> >
> > Hello.
> >
> > Le sam. 22 juin 2019 à 20:22, Rob Tompkins  a écrit
> :
> > >
> > > Have we tried asking if he wants to be a part of commons?
> >
> > AFAIK, no.
> >
> > > Seems like that library could be a good fit
> >
> >>> [...]
>

Re: [numbers-fraction] Code duplication between FractionTest and BigFractionTest

2019-06-20 Thread Eric Barnhill

>
> > If additional context is required it fails to meet the definition of
> > a unit test and is instead an integration test,  and the function being
> > tested may require rethinking.
>
> Depends what you define as a unit test. I'd say the unit was BigFraction
> or Fraction. An integration test is something that must be tested with
> coherent components working together to provide functionality. You are
> not doing that.
>

Well, I totally agree with both of you that this is the superior approach
for architecture and maintainability. Maybe I should think about adding a
bit more setup to my own unit tests.

I think it is not quite right to say that Fraction is the unit. I think
unit tests test atomic behaviors in the code that ideally can only fail one
way; those are the units. But this is just semantics.

So if you are both in agreement I can change to a +1.

Re: Re: [numbers-fraction] Code duplication between FractionTest and BigFractionTest

2019-06-20 Thread Eric Barnhill

Sorry for the slow reply, I thought I sent this yesterday.

I agree from a code architecture standpoint such a refactoring makes sense.
However from the perspective of unit tests it makes it no longer a unit
test.

IIUC it's best practice for a unit test that all context be within the
test. If additional context is required it fails to meet the definition of
a unit test and is instead an integration test, and the function being
tested may require rethinking.

This results in unit tests often being clunkily and awkwardly coded, but I
think it is the way they are typically written and it has its reasons.

So I am +0.

On Thu, Jun 20, 2019, 02:01 Heinrich Bohne  wrote:

> > A quick look shows that the BigFractionTest does have test cases for
> very large numbers. However the add, subtract, divide and multiply tests
> and a few others just use values that would work with Fraction. Possibly
> these can be moved to a shared common tests location too.
>
> That's what I was thinking too – the draft was by no means intended to be
> complete, I just created it to give a general idea of how I would go about
> implementing this. I'll work some more on it before I create an actual pull
> request.
>
>
> On 6/20/19 10:40 AM, Alex Herbert wrote:
> >
> >> On 20 Jun 2019, at 00:54, Heinrich Bohne  wrote:
> >>
> >> An awful lot of code is duplicated between FractionTest and
> >> BigFractionTest. Often, the test cases in the two classes only differ in
> >> the types they use (e.g. Fraction vs. BigFraction), but the actual
> >> values the tests use are the same.
> >>
> >> I think this could be mitigated by adding a new class that stores the
> >> values for these common test cases, and the classes FractionTest and
> >> BigFractionTest retrieve the values from this class and only implement
> >> the test patterns.
> >>
> >> I created a draft here:
> >>
> https://github.com/Schamschi/commons-numbers/commit/53906afd991cd190f1a05beb0952a40ae6c6ea3f
> >>
> >> Any opinions on this?
> > 1. BigFraction should work the same way as Fraction when the numbers are
> the same
> >
> > So collecting the common tests together makes sense. The change in the
> PR looks good.
> >
> > 2. BigFraction should work with numbers that cannot be handled by
> Fraction
> >
> > A quick look shows that the BigFractionTest does have test cases for
> very large numbers. However the add, subtract, divide and multiply tests
> and a few others just use values that would work with Fraction. Possibly
> these can be moved to a shared common tests location too.
> >
> > Then variants added using BigInteger arguments just to make sure the Big
> part of BigFraction is working.
> >

Re: [GSoC][Commons][STATISTICS][Regression][Matrix] Flexibility in Matrix Libraries in Regression Component?

2019-06-19 Thread Eric Barnhill

On Mon, Jun 17, 2019 at 7:13 AM Ben Nguyen  wrote:

> I don’t believe the plan is or that the use of EJML should be permanent….
>

There's no reason it couldn't be permanent. Obviously we want to give
credit where it is due in all the appropriate ways. But the code is
licensed so that others may incorporate it. It is hard to see any downside
for the EJML team to gaining greater exposure and use by being shaded by
Apache. That is probably what they want.

Efficient matrix implementations are serious business. If you ask me,
commons would be well within its mission by making EJML easy to find, use,
and combine with other libraries of useful code. We would not necessarily
be in the commons mission by developing our own sparse matrix factorization
libraries.

I feel exactly the same way about the JTransforms library, on the day that
we get to that.

[git] please avoid force pushes

2019-06-13 Thread Eric Barnhill

Apologies if everyone knows this but...

There has been some force pushing in the git repos lately. Unfortunately
there are a lot of Stack Overflow answers that will tell the user to solve
a complex commit situation by force pushing. These answers are just
*wrong*. By its nature, our code is being used by more than one person,
and therefore force pushing should never happen. There is always a
better way.

If your commit situation gets complex, consider starting a new branch. You
can then push it without conflicts, and you can probably PR over your
master or develop branch without rewriting any history. If that still
doesn't solve your problem, you can ask here or DM me on the ASF Slack.

I am only militant on this because I once created pandemonium early in my
new job by force pushing after Stack Overflow suggested it. Fortunately we
recovered and I was not fired. Just say no to push -f. :)

Re: [numbers] Code blocks in test methods

2019-06-13 Thread Eric Barnhill

I agree this increases readability and is nice. +1

The only thing that gives me the creeps is the force push in the PR. But
that is off topic, so another email for that.
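
For readers unfamiliar with the style under discussion, here is a small
sketch of brace-delimited blocks inside one test-like method. It is a
made-up example with hypothetical names and plain checks instead of JUnit,
not the code from the pull request; the point is only that each block gets
its own scope, so the cases visibly cannot depend on one another.

```java
/** Hypothetical illustration of scoped blocks in a test-style method. */
class BlockScopeDemo {

    static int gcd(int a, int b) {
        return b == 0 ? Math.abs(a) : gcd(b, a % b);
    }

    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    /** Runs three independent cases and returns how many were executed. */
    static int runChecks() {
        int executed = 0;
        {
            // case 1: ordinary positive values
            int a = 12;
            int b = 18;
            check(gcd(a, b) == 6, "gcd(12, 18)");
            executed++;
        }
        {
            // case 2: the same variable names, but a fresh scope
            int a = 7;
            int b = 0;
            check(gcd(a, b) == 7, "gcd(7, 0)");
            executed++;
        }
        {
            // case 3: a negative argument
            int a = -4;
            int b = 6;
            check(gcd(a, b) == 2, "gcd(-4, 6)");
            executed++;
        }
        return executed;
    }
}
```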

On Thu, Jun 13, 2019 at 5:42 AM Gilles Sadowski 
wrote:

> Hi.
>
> Le jeu. 13 juin 2019 à 01:34, Heinrich Bohne  a
> écrit :
> >
> >  > (2) Why not refactor and pull-out methods? This then forces you to
> _name_
> >  > the methods, instead of the above (anonymous blocks vs. commented
> > blocks.)
> >
> > I did not pull out the code sections into separate methods because I had
> no
> > intention of re-structuring the whole class. I only wanted to fix a bug
> in
> > the class Fraction and add a test case in FractionTest that would have
> > failed
> > due to this bug, and in the process organize that which was already
> > present in
> > FractionTest a bit better, because it has been pointed out to me that new
> > contributions should not only strive to improve functionality but also
> > readability.
> > Introducing those code blocks seemed like a straightforward way of
> > making the
> > mess in the bodies of some of the test methods in FractionTest more
> > comprehensible –
> > I would think that adding new methods could be more controversial than
> > adding
> > code blocks.
> >
> > Besides, should anyone in the future wish to extract these code sections
> > into
> > separate methods, I doubt that the code blocks would be a hindrance – if
> > anything,
> > I imagine that they would make it easier, because with the code blocks, it is
> > a lot
> > clearer WHAT can be extracted to a method in the first place than
> > without them.
> >
> >  > (1) It is helpful to add a // comment for each block, otherwise, it
> > feels
> >  > anonymous and weird to me.
> >
> > I am not sure how adding a comment to the code blocks would be helpful in
> > this case. The blocks only serve to separate the test cases, and most of
> > the time,
> > these test cases differ only in the values that they use, and not in
> > some defining
> > characteristic. For example, the first three blocks in testReciprocal()
> > only test
> > some different arbitrary fractions, but are otherwise completely
> > analogous, so I
> > couldn't think of any other comment than something like "test case 1",
> > "test case 2",
> > etc. Granted, the fourth block tests failure with a zero-denominator, so
> > in this
> > case, I can understand your point about adding comments. By the way,
> > some of these
> > test cases were already commented before I edited the class.
>
> Overall, readability is not worse; and the addition of blocks is in
> the spirit of "small steps".
> My first reaction was also that functions would be more readable,
> but some would be quite trivial, adding another layer for not much
> improvement.
> Anyway, this can be done in another pass (one is already foreseen
> in order to switch to Junit5).  So, any objection to merging the PR?
>
> Thanks,
> Gilles
>
> >
> > On 6/12/19 3:08 PM, Gary Gregory wrote:
> >
> > > I've used code blocks in this style in the past but...
> > >
> > > (1) It is helpful to add a // comment for each block, otherwise, it
> feels
> > > anonymous and weird to me.
> > > (2) Why not refactor and pull-out methods? This then forces you to
> _name_
> > > the methods, instead of the above (anonymous blocks vs. commented
> blocks.)
> > >
> > > Gary
> > >
> > > On Wed, Jun 12, 2019 at 9:00 AM Heinrich Bohne 
> > > wrote:
> > >
> > >> I have been asked to request some feedback on this pull request:
> > >> https://github.com/apache/commons-numbers/pull/36 – specifically,
> about
> > >> the introduction of code blocks in the commit "NUMBERS-100: Reduce
> scope
> > >> of local variables".
> > >>
> > >> I had the idea with the code blocks when I wanted to add a test to the
> > >> method testAdd() but was intimidated by the huge wall of code
> contained
> > >> in the method. When taking a closer look, this code wall is actually
> > >> composed of several test cases that are completely independent of each
> > >> other, but because the local variables live throughout the whole
> method
> > >> and are re-used in almost every test case, this is not obvious. The
> more
> > >> variables are involved, the closer you have to look to figure out
> which
> > >> sections are independent of the rest.
> > >>
> > >> I think that, with the code blocks, it is instantly obvious that a
> > >> specific section does not depend on anything that happened before it,
> or
> > >> that it does not affect anything that comes after it. So I think that
> > >> they are preferable to the previous version of the file.
> > >>
>

Re: [lang][rng] org.apache.commons.lang3.ArrayUtils.shuffle()

2019-06-13 Thread Eric Barnhill

An iterator that dynamically shuffles as you go along. That's really nice,
I had never even thought of that. Thanks.

On Thu, Jun 13, 2019 at 10:11 AM Alex Herbert 
wrote:

>
> On 13/06/2019 17:56, Eric Barnhill wrote:
> > On Thu, Jun 13, 2019 at 9:36 AM sebb  wrote:
> >
> >>
> >> Rather than shuffle etc in place, how about various
> >> iterators/selectors to return entries in randomised order?
> >> [Or does that already exist?]
> >>
> > I am pretty sure random draws, and shuffling, are implemented with
> > different algorithms. Though sampling without replacement the full length
> > of the set would yield a shuffled set, I think there are more efficient
> > ways to shuffle a set.
>
> Iterators to return a random draw *without* replacement over the full
> length of the array? The iterator would dynamically shuffle the array on
> each call to next() so could be stopped early or can be called
> infinitely as if a continuous stream. Is that your idea?
>
> UniformRandomProvider rng = ...;
> int[] big = new int[100];
> //
> // Fill big with lots of data
> //
> IntIterator iter = ShuffleIterators.create(rng, big);
> int x = iter.next();
> int y = iter.next();
> int z = iter.next();
>
> This doesn't exist but it is easy to do. Memory requirements would
> require a copy of the data, or it could be marked as destructive to the
> input array order and shuffle in place.
>
> If you want a random draw *with* replacement then you can just call
> nextInt(int) with the size of the array to pick something.
>

Re: [lang][rng] org.apache.commons.lang3.ArrayUtils.shuffle()

2019-06-13 Thread Eric Barnhill

On Thu, Jun 13, 2019 at 9:36 AM sebb  wrote:

>
>
> Rather than shuffle etc in place, how about various
> iterators/selectors to return entries in randomised order?
> [Or does that already exist?]
>

I am pretty sure random draws, and shuffling, are implemented with
different algorithms. Though sampling without replacement the full length
of the set would yield a shuffled set, I think there are more efficient
ways to shuffle a set.
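
One way to reconcile the two views is an incremental Fisher-Yates shuffle
exposed as an iterator: each call performs one swap, so iterating to the end
yields a full shuffle, while stopping early yields a sample without
replacement. The sketch below is only an illustration: LazyShuffle is a
hypothetical name, and java.util.Random stands in for a commons-rng
generator so that the example is self-contained.

```java
import java.util.NoSuchElementException;
import java.util.PrimitiveIterator;
import java.util.Random;

/** Hypothetical sketch of an iterator that shuffles lazily. */
class LazyShuffle {

    /** Iterates over a private copy of data in random order; the input array is not modified. */
    static PrimitiveIterator.OfInt iterator(int[] data, Random rng) {
        final int[] copy = data.clone();
        return new PrimitiveIterator.OfInt() {
            private int next = 0;   // elements before this index have already been returned

            @Override
            public boolean hasNext() {
                return next < copy.length;
            }

            @Override
            public int nextInt() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                // One Fisher-Yates step: pick uniformly among the remaining elements.
                int j = next + rng.nextInt(copy.length - next);
                int tmp = copy[j];
                copy[j] = copy[next];
                copy[next] = tmp;
                return copy[next++];
            }
        };
    }
}
```

Cloning the array costs O(n) memory up front; a variant that shuffles the
caller's array in place would avoid the copy at the price of being
destructive to the input order.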

Re: [numbers] Redundant methods in ArithmeticUtils

2019-06-11 Thread Eric Barnhill

On Tue, Jun 11, 2019 at 9:52 AM Heinrich Bohne 
wrote:

> The class ArithmeticUtils in the commons-numbers-core module contains
> several methods where, since Java 8, equivalent methods in
> java.lang.Math exist. These methods are the following:
>
> addAndCheck(int, int)
> addAndCheck(long, long)
> mulAndCheck(int, int)
> mulAndCheck(long, long)
> subAndCheck(int, int)
> subAndCheck(long, long)
>
> The corresponding methods from java.lang.Math are:
>
> addExact(int, int)
> addExact(long, long)
> multiplyExact(int, int)
> multiplyExact(long, long)
> subtractExact(int, int)
> subtractExact(long, long)
>
> The former methods are probably relics from pre-Java-8 times, when the
> latter methods did not exist.


Often true with commons, and no shame in that.


> But now, they are redundant. I suggest
> they be removed from ArithmeticUtils in commons-numbers-core, and their
> invocations replaced by invocations of the java.lang.Math equivalents.
>

+1


> Both groups of methods specify the same type of exception to be thrown
> in case of an overflow (a java.lang.ArithmeticException), so the
> replacement should be straightforward.
>

Utils classes are of course frowned upon in general, so IMO the less in
there the better.
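
A small sketch of the behaviour being relied on: the java.lang.Math "exact" methods throw ArithmeticException on overflow. The helper class and method names below are made up for illustration only.

```java
/** Illustrates overflow detection with the java.lang.Math exact methods (Java 8+). */
final class ExactArithmeticDemo {
    private ExactArithmeticDemo() {}

    /** Returns true if a + b does not fit in an int. */
    static boolean addOverflows(int a, int b) {
        try {
            Math.addExact(a, b);
            return false;
        } catch (ArithmeticException e) {
            // Thrown by Math.addExact when the int result overflows
            return true;
        }
    }

    /** Returns true if a * b does not fit in a long. */
    static boolean multiplyOverflows(long a, long b) {
        try {
            Math.multiplyExact(a, b);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }
}
```

Callers of addAndCheck(int, int) and the other listed methods could in principle switch to the Math equivalents one for one, provided nothing depends on the exact exception message.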

Re: [numbers][fraction] pulling fraction-dev into master

2019-06-06 Thread Eric Barnhill

Changes are merged; in particular the Travis updates were kept; if you are
working on Fraction kindly rebase.

On Wed, Jun 5, 2019 at 3:40 PM Eric Barnhill  wrote:

> For some months I worked on the Fraction class on a fraction-dev branch,
> now others are furthering it, but IIUC working off of master, plus it
> sounds like my edits are out of date in other ways.
>
> So within the next day, I will pull fraction-dev into master. I would
> request any other contributors contributing to Fraction, to merge these
> changes into their own work and rebase.
>
> I was at the final checkstyle edits of what I was working on, so hopefully
> it will not cause anyone more than minor inconveniences. If it will cause you
> a larger inconvenience and you would rather work together to merge it,
> please post here. If you find after the fact it causes you a headache I can
> roll it back, I will keep the branch around for a while.
>
> Eric
>

[numbers][fraction] pulling fraction-dev into master

2019-06-05 Thread Eric Barnhill

For some months I worked on the Fraction class on a fraction-dev branch,
now others are furthering it, but IIUC working off of master, plus it
sounds like my edits are out of date in other ways.

So within the next day, I will pull fraction-dev into master. I would
request any other contributors contributing to Fraction, to merge these
changes into their own work and rebase.

I was at the final checkstyle edits of what I was working on, so hopefully
it will not cause anyone more than minor inconveniences. If it will cause you
a larger inconvenience and you would rather work together to merge it,
please post here. If you find after the fact it causes you a headache I can
roll it back, I will keep the branch around for a while.

Eric

Re: [gsoc] Weekly meeting tomorrow

2019-06-05 Thread Eric Barnhill

That looked like a list of times. How is this one for you all?

https://www.timeanddate.com/worldclock/meetingdetails.html?year=2019=6=6=16=0=0=136=224=265=1249=1860=1800


On Wed, Jun 5, 2019 at 11:59 AM Alex Herbert 
wrote:

> Time for another meeting to discuss progress.
>
> Shall we change to UTC +4 this time? Here is the meeting time clock for
> everyone:
>
>
> https://www.timeanddate.com/worldclock/meetingtime.html?iso=20190606=136=224=265=1249=1860
> <
> https://www.timeanddate.com/worldclock/meetingtime.html?iso=20190606=136=224=265=1249=1860
> >
>
> Alex
>
>

Re: [Commons][Descriptive][STATISTICS-7][GSoC] SummaryStatistics class design & Whether to use DoubleSummaryStatistics class from java.util package?

2019-06-02 Thread Eric Barnhill

As discussed on prior threads you should have both. There will need to be
static convenience methods for a user who wants to make a very simple call,
say Stats.mean() . But, as Alex said, this convenience class will just be a
front end for the statistics functionality itself. That needs to be in its
own classes (Mean(), Variance()) which can produce instances that give the
user more flexibility. For example, storeless statistics like Mean() or
Variance(), or StandardDeviation(), should be updatable, as Gilles said, or
handle different kinds of streams, as Alex said. Yet these classes need to
be designed so that they perform as well as simple implementations when
desired.
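
One possible shape for this, as a sketch only: a storeless Mean that can be updated and merged, plus a static front end that delegates to it. The class and method names follow the ones floated on this list and are not an agreed API.

```java
import java.util.Arrays;

/** Hypothetical storeless mean: keeps only a count and a running mean. */
final class Mean {
    private long n;
    private double mean;

    /** Adds one value (incremental update of the running mean). */
    void add(double x) {
        n++;
        mean += (x - mean) / n;
    }

    /** Merges another instance, e.g. when combining partial results of a stream. */
    void add(Mean other) {
        if (other.n == 0) {
            return;
        }
        long total = n + other.n;
        mean += (other.mean - mean) * other.n / total;
        n = total;
    }

    /** Returns the mean of all values added so far, or NaN if there are none. */
    double getMean() {
        return n == 0 ? Double.NaN : mean;
    }
}

/** Hypothetical static front end for the simple one-line call. */
final class Stats {
    private Stats() {}

    static double mean(double[] data) {
        // DoubleStream.collect(supplier, accumulator, combiner)
        return Arrays.stream(data).collect(Mean::new, Mean::add, Mean::add).getMean();
    }
}
```

With this layout Stats.mean(new double[] {1, 2, 3, 4}) gives 2.5, while a caller who needs to keep updating holds on to a Mean instance instead.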

On Sun, Jun 2, 2019 at 5:45 AM Virendra singh Rajpurohit <
virendrasing...@gmail.com> wrote:

> I've been trying to write the summary statistics class and have some questions.
> There is a class DoubleSummaryStatistics in the java.util package (there are two
> more for Int and Long). I'll attach this file here.
> Do I have to design SummaryStatistics in this way only? I mean, the
> description of DoubleSummaryStatistics is "This class is designed to work
> with (though does not require) streams.
> For example, you can compute summary statistics on a stream of doubles with:
>
>  DoubleSummaryStatistics stats =
>      doubleStream.collect(DoubleSummaryStatistics::new,
>                           DoubleSummaryStatistics::accept,
>                           DoubleSummaryStatistics::combine);"
> Earlier my understanding of the project was that the user just has to
> call the function "getSummary()" and all the calculations will be done
> automatically in streams. But as we can see in DoubleSummaryStatistics, we
> have to call the collect() method.
> There are some functions like max, min, sum, count, average which are
> already defined in this class. So should I extend this class in my class or
> not? Also, I'll have to add more statistics other than max, min, sum; for that
> I have to override the accept() function, which will be used for streams.
>
> Warm Regards,
> --
> *Virendra Singh Rajpurohit*
>
> *University of Petroleum and Energy Studies,Dehradun*
> Linkedin:https://www.linkedin.com/in/virendra-singh-rajpurohit
>
>
>
>
>
>

Re: [numbers] - Contributions to Commons Numbers

2019-05-31 Thread Eric Barnhill

This is well worth discussing.

The protocol here could be improved. Where I work, we all write a lot of
code and we all have write access. We also *always* submit PRs rather than
push directly, and *always* request review from at least one other person.
This is because it is always risky to push code that doesn't have other
eyes on it.

So whether you get/have write access or not, I think the protocol should
always be PRs. That is common practice in industry. We could all make more
use of the "request review" portion of the PR interface. For numbers, this
might entail requesting review from Gilles and one peer. To clarify, this
is only my suggestion and others may disagree.

Speaking to Fraction specifically where you have been contributing. First
of all thank you for your contributions there. I just about finished my
contributions to that module, but have been using my "Apache time" to
mentor the GSoC coders, and have not had time to consider the recent
suggestions. Please feel free to finish it and add your name as a
contributor. If you do I would prefer that you submit a PR and request
Gilles and myself for review.

On Fri, May 31, 2019 at 10:12 AM Karl Heinz Marbaise 
wrote:

> Hi to all,
>
> I have contributed some PR#s (via GitHub) to the commons-numbers
> project...(They have been accepted and merged ;-))
>
> I have some questions:
>
> 1. The documentation[1] states that every Apache committer has write
> access to the commons projects.
>
> So I could change to use gitbox directly via branch instead of GitHub PR's.
>
> The question is: What is the prefered way to contribute to the projects?
>
>   - via GitHub PR
>   - via Branch GitBox ?
>
>
> 2. I have already access to JIRA but unfortunately I can't assign JIRA
> issue to myself ?
>
> Is this intentionally or is this an issue?
>
>
> Kind regards
> Karl Heinz Marbaise
>
>
> [1]: https://commons.apache.org/
>
>
>
>

[gsoc] Weekly meeting tomorrow

2019-05-29 Thread Eric Barnhill

Let's have another weekly gathering tomorrow for GSoC mentees at the usual
time.

Everyone should have written at least some code, and a unit test that goes
with that code, and submitted it for review via a PR.

If you have difficulties doing this, please raise questions on the Slack.

Thanks to the contributors who are helping out so much at these meetings.

Re: [statistics][descriptive] Classes or static methods for common descriptive statistics?

2019-05-29 Thread Eric Barnhill

At the end of the day, like we just saw on the user list today, users are
going to come around with arrays and want to get the mean, median,
variance, or quantiles of that array. The easiest way to do this is to have
some sort of static method that delivers these:

double mean = Stats.mean(double[] data)

and the user doesn't have to think more than that. Yes, this should be
implemented functionally, although in this simple case we probably just
need to call Java's DoubleSummaryStatistics under the hood. If we overcomplicate
this, again like we just saw on the user list, users will simply not use
the code.
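
A sketch of what that simple case could look like, assuming the JDK's DoubleSummaryStatistics does the work for the mean. The class name SimpleStats is a placeholder, and the median is computed by sorting a copy because the JDK class does not provide one.

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;

/** Hypothetical static helpers for the "I have an array, give me the mean" case. */
final class SimpleStats {
    private SimpleStats() {}

    /** Arithmetic mean; note that getAverage() returns 0.0 for an empty array. */
    static double mean(double[] data) {
        DoubleSummaryStatistics summary = Arrays.stream(data).summaryStatistics();
        return summary.getAverage();
    }

    /** Median via a sorted copy; returns NaN for an empty array. */
    static double median(double[] data) {
        if (data.length == 0) {
            return Double.NaN;
        }
        double[] sorted = data.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        // Even length: average the two central values
        return sorted.length % 2 == 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }
}
```

The empty-array behaviour (0.0 from the JDK versus NaN here for the median) is the kind of detail the module would have to settle.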

Then yes, I agree Alex's argument for updateable instances containing state
is compelling. How to relate these more complicated instances with the
simple cases is a great design question.

But first, let's nail the Matlab/Numpy case of just having an array of
doubles and wanting the mean / median. I am just speaking of my own use
cases here but I used exactly this functionality all the time:

Mean m = new Mean();
double mean = m.evaluate(data);

and I think this should be the central use case for the new module.


On Wed, May 29, 2019 at 4:51 AM Gilles Sadowski 
wrote:

> Hello.
>
> On Tue, May 28, 2019 at 20:36, Alex Herbert wrote:
> >
> >
> >
> > > On 28 May 2019, at 18:09, Eric Barnhill 
> wrote:
> > >
> > > The previous commons-math interface for descriptive statistics used a
> > > paradigm of constructing classes for various statistical functions and
> > > calling evaluate(). Example
> > >
> > > Mean mean = new Mean();
> > > double mn = mean.evaluate(double[])
> > >
> > > I wrote this type of code all through grad school and always found it
> > > unnecessarily bulky.  To me these summary statistics are classic use
> cases
> > > for static methods:
> > >
> > > double mean = Mean.evaluate(double[])
> > >
> > > I don't have any particular problem with the evaluate() syntax.
> > >
> > > I looked over the old Math 4 API to see if there were any benefits to
> the
> > > previous class-oriented approach that we might not want to lose. But I
> > > don't think there were, the functionality outside of evaluate() is
> minimal.
> >
> > A quick check shows that evaluate comes from UnivariateStatistic. This
> has some more methods that add little to an instance view of the
> computation:
> >
> > double evaluate(double[] values) throws MathIllegalArgumentException;
> > double evaluate(double[] values, int begin, int length) throws
> MathIllegalArgumentException;
> > UnivariateStatistic copy();
> >
> > However it is extended by StorelessUnivariateStatistic which adds
> methods to update the statistic:
> >
> > void increment(double d);
> > void incrementAll(double[] values) throws MathIllegalArgumentException;
> > void incrementAll(double[] values, int start, int length) throws
> MathIllegalArgumentException;
> > double getResult();
> > long getN();
> > void clear();
> > StorelessUnivariateStatistic copy();
> >
> > This type of functionality would be lost by static methods.
> >
> > If you are moving to a functional interface type pattern for each
> statistic then you will lose the other functionality possible with an
> instance state, namely updating with more values or combining instances.
> >
> > So this is a question of whether updating a statistic is required after
> the first computation.
> >
> > Will there be an alternative in the library for a map-reduce type
> operation using instances that can be combined using Stream.collect:
> >
> >  <R> R collect(Supplier<R> supplier,
> >   ObjDoubleConsumer<R> accumulator,
> >   BiConsumer<R, R> combiner);
> >
> > Here <R> would be Mean:
> >
> > double mean = Arrays.stream(new double[1000]).collect(Mean::new,
> Mean::add, Mean::add).getMean() with:
> >
> > void add(double);
> > void add(Mean);
> > double getMean();
> >
> > (Untested code)
> >
> > >
> > > Finally we should consider whether we really need a separate class for
> each
> > > statistic at all. Do we want to call:
> > >
> > > Mean.evaluate()
> > >
> > > or
> > >
> > > SummaryStats.mean()
> > >
> > > or maybe
> > >
> > > Stats.mean() ?
> > >
> > > The last being nice and compact.
> > >
> > > Let's make a decision so our esteemed mentee Virendra knows in what
> > > direction to take his work this summer. :)
> >
>
> I'm not sure I understand the imp

[statistics][descriptive] Classes or static methods for common descriptive statistics?

2019-05-28 Thread Eric Barnhill

The previous commons-math interface for descriptive statistics used a
paradigm of constructing classes for various statistical functions and
calling evaluate(). Example

Mean mean = new Mean();
double mn = mean.evaluate(double[])

I wrote this type of code all through grad school and always found it
unnecessarily bulky.  To me these summary statistics are classic use cases
for static methods:

double mean = Mean.evaluate(double[])

I don't have any particular problem with the evaluate() syntax.

I looked over the old Math 4 API to see if there were any benefits to the
previous class-oriented approach that we might not want to lose. But I
don't think there were, the functionality outside of evaluate() is minimal.

Finally we should consider whether we really need a separate class for each
statistic at all. Do we want to call:

Mean.evaluate()

or

SummaryStats.mean()

or maybe

Stats.mean() ?

The last being nice and compact.

Let's make a decision so our esteemed mentee Virendra knows in what
direction to take his work this summer. :)

Re: [Commons-Statistics][GSoC][Descriptive] Class Diagram & development flow

2019-05-28 Thread Eric Barnhill

Thanks for this great work.

This chart will serve you well and you are now in a great place to proceed
further. Are you able to now create a UML for the components you are going
to create? Is there a set of core functionalities that you will target
first? Can you maybe divide your proposed summer's work into core goals and
stretch goals?

For example, FourthMoment is probably not a super-important statistical
function, but Median certainly is. So your core goals should definitely
include mean, median, variance, etc...and you have more freedom to decide
what interests you to do after that.

On Tue, May 28, 2019 at 2:21 AM Virendra singh Rajpurohit <
virendrasing...@gmail.com> wrote:

> Hi All,
> As my GSoC project is to refactor "commons.math4.stat.descriptive.*", and
> upgrade it using Java 8 features like Stream API, Functional Interface etc.
> I've created a class diagram of "commons.math4.stat.descriptive.*" so as
> to understand the old code, its flow and how it works.
> I've attached the class-diagram and flow of the development of the classes
> on JIRA:  https://issues.apache.org/jira/browse/STATISTICS-15
> I'll start coding from this week only. Any kind of guidance & help is most
> welcome.
>
> --
> *Virendra Singh Rajpurohit*
>
> *University of Petroleum and Energy Studies,Dehradun*
> Linkedin:https://www.linkedin.com/in/virendra-singh-rajpurohit
>

Re: [statitsics] .gitattributes

2019-05-24 Thread Eric Barnhill

+1

Users should also beware that working on a repo in Windows in an IDE can
cause the file to take on a pile of Windows line endings which git then
pushes. This has happened to me elsewhere. Maybe this fix takes care of it.

On Fri, May 24, 2019, 00:28 Alex Herbert  wrote:

> The recent PR to add a new module to statistics may have suffered from
> problems with converting line endings.
>
> This can be solved by having Windows users run this (optionally with
> --global):
>
> > git config core.autocrlf true
>
> But a better fix [1] is to add a .gitattributes file [2] containing:
>
> * text=auto
>
> The fix then applies to anyone using the repo irrespective of their own
> git global config.
>
> [1] https://www.edwardthomson.com/blog/git_for_windows_line_endings.html <
> https://www.edwardthomson.com/blog/git_for_windows_line_endings.html>
> [2] https://git-scm.com/docs/gitattributes <
> https://git-scm.com/docs/gitattributes>
>
> Any objections to modifying the repo to have this configuration file?
>
> Alex
>
>

Re: [statistics] Pull request for GLSMultipleLinearRegression

2019-05-23 Thread Eric Barnhill

Hi Elena,

Thanks for this intriguing idea. As far as I ever knew IRLS requires a
matrix. Can you provide me with a citation where I can read about this
vector-based approach?

Thanks,
Eric


On Thu, May 23, 2019, 06:44 Елена Картышева  wrote:

> Hello.
>
> I would like to propose a pull request implementing an option to use
> variance vector instead of covariance matrix. It allows users to avoid
> unnecessary memory usage and excessive computation in case of uncorrelated
> but heteroscedastic errors thus making it possible to work with huge input
> matrices. Using variance vector in such cases allows to reduce time
> complexity from O(N^2) to just O(N) (where N is a number of observations)
> and dramatically reduce memory usage. For example, in my practice arose a
> need to train generalized linear model. Usage of Iteratively reweighted
> least squares algorithm requires weighted regression with more than a
> million observations. Current implementation would require approximately 12
> terabytes of memory while patched version needs only 8 megabytes. Since
> IRLS is iterative algorithm a million-times complexity reduction is also
> pretty handy.
>
>
> --
> Sincerely yours, Elena Kartysheva.
>
>
>
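
To make the complexity argument in the quoted proposal concrete, here is a minimal sketch (not the code from the pull request) of a straight-line fit where each observation carries its own variance. With uncorrelated errors the covariance matrix is diagonal, so generalized least squares reduces to weighting each observation by 1 / variance, and the sums can be accumulated in a single pass without any N x N matrix. The single-predictor form and all names are assumptions made for brevity.

```java
/** Hypothetical sketch: fit y = a + b*x with one variance per observation. */
final class VarianceWeightedLineFit {
    private VarianceWeightedLineFit() {}

    /** Returns {intercept, slope}; observation i is weighted by 1 / variance[i]. */
    static double[] fit(double[] x, double[] y, double[] variance) {
        double sw = 0;
        double swx = 0;
        double swy = 0;
        double swxx = 0;
        double swxy = 0;
        // Single pass over the data: O(N) time, O(1) extra memory
        for (int i = 0; i < x.length; i++) {
            double w = 1.0 / variance[i];
            sw += w;
            swx += w * x[i];
            swy += w * y[i];
            swxx += w * x[i] * x[i];
            swxy += w * x[i] * y[i];
        }
        // Closed-form solution of the weighted normal equations
        double denom = sw * swxx - swx * swx;
        double slope = (sw * swxy - swx * swy) / denom;
        double intercept = (swy - slope * swx) / sw;
        return new double[] {intercept, slope};
    }
}
```

The same idea extends to several predictors by scaling each row of the design matrix by 1 / sqrt(variance[i]) before an ordinary least-squares solve.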

Re: Proposal to introduce JUnit 5 in commons-numbers

2019-05-22 Thread Eric Barnhill

+1

On Wed, May 22, 2019 at 3:15 PM Gilles Sadowski 
wrote:

> Hi.
>
> On Wed, May 22, 2019 at 18:43, Heinrich Bohne wrote:
> >
> > Right now, commons-numbers is using JUnit 4.12, the last stable version
> > of JUnit 4. As far as I am aware, there is no explicit syntax in JUnit
> > 4.12 for testing whether an exception is thrown apart from either using
> > the deprecated class ExpectedException or adding the "expected"
> > parameter to the Test annotation. The problem with the latter approach
> > is that it is impossible to ascertain where exactly in the annotated
> > method the exception is thrown – it could be thrown somewhere unexpected
> > and the test will still pass. Besides, when testing the same exception
> > trigger with multiple different inputs, it is impractical to create a
> > separate method for each test case, which would be necessary with both
> > aforementioned approaches.
> >
> > This has led to the creation of constructs where the expected exception
> > is swallowed, which has been deemed undesirable
> > <
> https://issues.apache.org/jira/browse/NUMBERS-99?focusedCommentId=16843419=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16843419
> >.
> > Because of this, I propose to add JUnit 5 as a dependency in
> > commons-numbers. JUnit 5 has several "assertThrows" methods that would
> > solve the described dilemma.
>
> +1
>
> Gilles
>
>
>
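
To show why an assertThrows-style method solves the problem described above, here is a plain-Java stand-in for JUnit 5's Assertions.assertThrows(Class, Executable). It is a simplified imitation written for this note, not JUnit's implementation: only the wrapped statement may throw, and the caught exception is returned for further checks rather than swallowed.

```java
/** Hypothetical plain-Java imitation of JUnit 5's assertThrows, for illustration only. */
final class ThrowsCheck {
    private ThrowsCheck() {}

    /** Minimal functional interface for a statement that may throw. */
    interface Executable {
        void execute() throws Throwable;
    }

    /** Runs the code and returns the thrown exception if it has the expected type. */
    static <T extends Throwable> T assertThrows(Class<T> expected, Executable code) {
        try {
            code.execute();
        } catch (Throwable t) {
            if (expected.isInstance(t)) {
                return expected.cast(t);
            }
            throw new AssertionError("Unexpected exception type: " + t.getClass().getName(), t);
        }
        throw new AssertionError("Expected " + expected.getName() + " but nothing was thrown");
    }
}
```

Several inputs can then be checked in one test method, one call per input, for example ThrowsCheck.assertThrows(ArithmeticException.class, () -> Math.addExact(Integer.MAX_VALUE, 1)).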

Re: [commons-numbers] branch fraction-dev updated (3b21325 -> 92de0b4)

2019-05-22 Thread Eric Barnhill

Yes I regret that I did not finish up the last mile on Fraction before this
ticket was submitted. It would have saved time as maybe I have fixed
some of those already. But, I will integrate these suggestions after I
finish my edits, all that is left to look at in my branch is the
checkstyle.

On Wed, May 22, 2019 at 3:04 PM Gilles Sadowski 
wrote:

> Hi Eric.
>
> Will you have a look at NUMBERS-100:
>https://issues.apache.org/jira/browse/NUMBERS-100
>
> I've just thought that it might interfere with your changes in the
> "fraction-dev" branch.
>
> Regards,
> Gilles
>
>
> On Wed, May 22, 2019 at 21:23,  wrote:
> >
> > This is an automated email from the ASF dual-hosted git repository.
> >
> > ericbarnhill pushed a change to branch fraction-dev
> > in repository https://gitbox.apache.org/repos/asf/commons-numbers.git.
> >
> >
> > from 3b21325  NUMBERS-97: restoring pow() method, lost in rebase
> >  new 092e816  NUMBERS-97: replacing pow method
> >  new 97683d5  NUMBERS-97: test for Fraction parse method
> >  new 3460841  NUMBERS-97: Added test of parse method in
> BigFractionTest, and updated outdated use of RoundingMode
> >  new 92de0b4  minor: login credentials test
> >
> > The 4 revisions listed above as "new" are entirely new to this
> > repository and will be described in separate emails.  The revisions
> > listed as "add" were already present in the repository and have only
> > been added to this reference.
> >
> >
> > Summary of changes:
> >  .../commons/numbers/fraction/BigFraction.java  | 27
> +++
> >  .../commons/numbers/fraction/BigFractionTest.java  | 30
> --
> >  .../commons/numbers/fraction/FractionTest.java |  4 +--
> >  3 files changed, 57 insertions(+), 4 deletions(-)
> >
>
>
>

[GSoC] Thursday mentee meeting

2019-05-22 Thread Eric Barnhill

Let's have another mentee meeting Thursday morning, same time as the
previous two. (Sorry about the miscommunication Abhishek).

As preparation for this meeting please have prepared a detailed flow
diagram for your proposed components, ideally with sufficient detail that
it includes some unit tests. The more you mark up in advance the easier
coding will be when it starts next week.

[statistics] develop branch created

2019-05-22 Thread Eric Barnhill

As I mentioned previously, there is now a "develop" branch in
commons-statistics. Recommended standard procedure from now on, create
feature branches off the develop branch, then PR into the develop branch.
Then when stability is confirmed, someone can merge develop into master.

Re: [Lang] BigDecimalStatistics proposition

2019-05-14 Thread Eric Barnhill

Yes. This sounds great for commons-statistics. Other work in a similar vein
will be happening this summer by one of our GSOC mentees.

On Tue, May 14, 2019, 15:04 Gary Gregory  wrote:

> We have a Commons Statistics component that might be a fit.
>
> Gary
>
> On Tue, May 14, 2019, 17:34 Aleksander Ściborek <
> aleksanderscibo...@gmail.com> wrote:
>
> > Hi, I've come up with an idea for making it easier to use Stream with the
> > BigDecimal class.
> > The idea is to create a BigDecimalStatistics class which provides a
> > convenient way of calculating max, min, average and sum of BigDecimals
> > from a Stream.
> > I think that it's very suitable for commons library.
> > Should it be implemented in commons lang or commons math? I believe that
> > it's more suitable for commons lang
> > This is a link to Jira Ticket : LANG-1459
> > 
> > Aleksander
> >
>

[GSoC] commons-gsoc Thursday meeting?

2019-05-14 Thread Eric Barnhill

Should we have another Slack meeting at the same time this Thursday, 5pm
UTC (9am California time)?

The first focus of this meeting will be blockers and other questions the
mentees have, trying to get up to speed on command line git, maven and
POMs, and IDEs. Everyone should bring at least one thing to ask about. We
will otherwise assume the mentees are ready to go with these topics.

After that we'll move on to goals for the next week. I propose this goal is
a software flowchart of your commons component that you will be developing.
This can be in as much detail as you like, including method names, doc,
etc. The more you mock out, the easier your later work will be, and mine
too.

In particular, you have probably noticed that maven creates src/main and
src/test folders. Even if you have already done some flowcharting for your
component, see if you can start to flowchart the unit tests. As I said on
the Slack you may even find it interesting to sketch the tests first as it
will give your coding work a clear endpoint to focus on. Don't
underestimate how much time testing and doc will take you -- it can take
half of your time with new projects.

This is all flexible but we will need to see a detailed spec to sign off on
clearing the project out of community bonding.

So, I'll be available to answer questions on that next step as well.

Re: [statistics] Mode function for Cauchy distribution

2019-05-09 Thread Eric Barnhill

Awesome!

On Thu, May 9, 2019 at 10:44 AM Udit Arora  wrote:

> I will see what I can do. It will take some time, but I will get to know
> more about the other distributions.
>
>
> On Thu, 9 May 2019, 10:58 pm Eric Barnhill, 
> wrote:
>
> > Udit, is it clear what to do here? Gilles recommends you propose some
> edits
> > to ContinuousDistribution instead, to return Mode and Median.
> >
> > But then, if an interface is altered, all the classes that implement that
> > interface need to have these functions added, so we hope you are up for
> all
> > that additional work. We can help you.
> >
> > Last is the idea of accessor methods. If the method starts with get_()
> then
> > in principle this is just returning a field already present. But with
> that
> > in mind, I don't know why we already have a method name like getMean() in
> > this interface. We don't really know whether for a given distribution,
> that
> > would be a true accessor or need to be calculated. So I think all these
> > method names should just be mean(), mode(), median(), etc.
> >
> > So sorry if this is blowing up into more work than you expected. It often
> > works that way! I certainly think these changes are worthwhile however.
> >
> >
> >
> > On Thu, May 9, 2019 at 7:17 AM Gilles Sadowski 
> > wrote:
> >
> > > Hi Udit.
> > >
> > > On Thu, May 9, 2019 at 12:52, Udit Arora wrote:
> > > >
> > > > I intend to add a mode function for the Cauchy Distribution. It is a
> > > small
> > > > addition which i thought might be helpful.
> > >
> > > How will it be helpful?  I.e. what would an application developer
> > > be able to do, that he can't with the current code?
> > >
> > > You've surely noted that the class you want to modify is but
> > > one of the implementations of the interface "ContinuousDistribution".
> > > So if you propose to change the API, the change should be done
> > > at the interface level, and the appropriate computation performed, or
> > > method overloads defined, for all implementations.
> > >
> > > The "accessor" methods refer to fields that were set by the constructor;
> > > e.g. for "CauchyDistribution", "median" and "scale".
> > > In this case, it happens that "mode" has the same value as "median",
> > > but does this warrant an additional method?
> > >
> > > Regards,
> > > Gilles
> > >
> > > > Thanks
> > >
> > >
> > >
> >
>

Re: [statistics] Mode function for Cauchy distribution

2019-05-09 Thread Eric Barnhill

Udit, is it clear what to do here? Gilles recommends you propose some edits
to ContinuousDistribution instead, to return Mode and Median.

But then, if an interface is altered, all the classes that implement that
interface need to have these functions added, so we hope you are up for all
that additional work. We can help you.

Last is the idea of accessor methods. If the method starts with get_() then
in principle this is just returning a field already present. But with that
in mind, I don't know why we already have a method name like getMean() in
this interface. We don't really know whether for a given distribution, that
would be a true accessor or need to be calculated. So I think all these
method names should just be mean(), mode(), median(), etc.

So sorry if this is blowing up into more work than you expected. It often
works that way! I certainly think these changes are worthwhile however.
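
To illustrate the naming suggestion, a reduced sketch of what property-style methods might look like. The interface and class names are invented for this note and are not the existing ContinuousDistribution API. The Cauchy case shows why "get" can mislead: median and mode are the stored location, but the mean is undefined and has no field behind it.

```java
/** Hypothetical, reduced interface using mean()/median()/mode() instead of getters. */
interface SummaryProperties {
    double mean();

    double median();

    double mode();
}

/** Cauchy distribution described by its location and scale parameters. */
final class CauchySummary implements SummaryProperties {
    private final double location;
    private final double scale;

    CauchySummary(double location, double scale) {
        if (!(scale > 0)) {
            throw new IllegalArgumentException("scale must be positive: " + scale);
        }
        this.location = location;
        this.scale = scale;
    }

    /** The mean of a Cauchy distribution is undefined. */
    @Override
    public double mean() {
        return Double.NaN;
    }

    /** The median equals the location parameter. */
    @Override
    public double median() {
        return location;
    }

    /** The mode also equals the location parameter. */
    @Override
    public double mode() {
        return location;
    }

    double scale() {
        return scale;
    }
}
```

Every other implementation of the interface would need the same three methods, which is the extra work mentioned above.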

On Thu, May 9, 2019 at 7:17 AM Gilles Sadowski  wrote:

> Hi Udit.
>
> On Thu, May 9, 2019 at 12:52, Udit Arora wrote:
> >
> > I intend to add a mode function for the Cauchy Distribution. It is a
> small
> > addition which i thought might be helpful.
>
> How will it be helpful?  I.e. what would an application developer
> be able to do, that he can't with the current code?
>
> You've surely noted that the class you want to modify is but
> one of the implementations of the interface "ContinuousDistribution".
> So if you propose to change the API, the change should be done
> at the interface level, and the appropriate computation performed, or
> method overloads defined, for all implementations.
>
> The "accessor" methods refer to fields that were set by the constructor;
> e.g. for "CauchyDistribution", "median" and "scale".
> In this case, it happens that "mode" has the same value as "median",
> but does this warrant an additional method?
>
> Regards,
> Gilles
>
> > Thanks
>
>
>

Re: [STATISTICS][Regression][Linear Math] Is there any plan/anyone working on a new Linear Math module currently?

2019-05-08 Thread Eric Barnhill

It looks to me like the EJML library is the best choice for linear algebra
right now, is well supported, and we should not reinvent the wheel unless
we have the motivation and expertise to do so.

EJML is under the Apache 2.0 license which I read to mean we can use it in
any derivative way we please so long as (and this would be true regardless
of whether the license requires it, IMO) we attribute the source.

So as a default plan I would shade these libraries within the regression
module, with thanks and attribution to the EJML site and org.


On Wed, May 8, 2019 at 2:49 PM Rob Tompkins  wrote:

>
>
> > On May 8, 2019, at 4:37 PM, Ben Nguyen  wrote:
> >
> > Hello,
> >
> > The regression module will require a lot of linear math, specifically
> matrix operations which I’ve heard is outdated. Are there any updates on
> its development? Is this someone’s GSoC project? If not I could try to
> help by attempting to start porting regression essential operations. But
> the dependencies for the current library are vast so this would end up being
> a large endeavor and I know I am not one to properly design a linear math
> library, I only know the basics, it would probably become a mess. So if
> there is no current development plan I fear I might have to start by using
> the old library for now until linear’s development kicks in…. Is this okay?
> >
>
> I suppose the question is: what is commons-numbers, and if a matrix is a
> “number” or it is sufficiently different to warrant a separate component.
>
> It is worth noting that there have been past arguments over additional
> math components before we get 1.0 releases for the current ones in flight
> (but I feel like the fastest route to any component’s 1.0 should take
> priority).
>
> What are other folks’ thoughts here? I would think that linear algebra
> would likely be a widely used library as it’s fairly fundamental to a
> collection of machine learning algorithms as they are based in least
> squares.
>
> -Rob
>
> > Thank you,
> > Ben
>
>
>

[statistics][numbers] set up develop branches?

2019-05-08 Thread Eric Barnhill

Since it looks like we will have some development in these libraries this
summer (whee!) I propose starting 'develop' branches for these libraries.
The mentees and others can then create feature branches off of develop, and
submit pull requests for feature branches into develop. Then develop is
merged into master periodically when all is clear. That is the typical
GitHub cadence as I know it anyway. I am very used to this pattern and will
be happy to be the person making sure it happens.

So, perhaps interested parties could vote; if it goes ahead I will write
the ticket, then create the develop branches.

Re: [numbers][GSoC] Slack for GSoC mentees

2019-05-06 Thread Eric Barnhill

On Mon, May 6, 2019 at 11:51 AM Virendra singh Rajpurohit <
virendrasing...@gmail.com> wrote:

> Hey Eric, My name is there on projects list, but I haven't yet received any
> official mail from Apache or Google.
> Does that mean I'm selected?
>

Yes, congratulations you were selected. Check the spam filter maybe?

We've had less interaction with you, so we will need to work together
during community bonding to make sure you are contributing on the same
level as the others.

I will add you to the slack. Is this your preferred email?

Re: [numbers][GSoC] Slack for GSoC mentees

2019-05-06 Thread Eric Barnhill

Udit you are welcome to join us to chat on the channel. Do you want an
invitation?

On Mon, May 6, 2019 at 11:13 AM Udit Arora  wrote:

> Congrats Ben.
>
> On Mon, 6 May 2019, 11:42 pm Ben Nguyen,  wrote:
>
> > Hello,
> >
> > Any update on which communication tool will be used? Slack, Zulip? I’m
> > excited to get started!
> >
> > Ben
> >
> > From: Mark Thomas
> > Sent: Wednesday, May 1, 2019 4:21 PM
> > To: Commons Developers List
> > Subject: Re: [numbers][GSoC] Slack for GSoC mentees
> >
> > On 01/05/2019 22:09, Eric Barnhill wrote:
> > > Thanks Mark,
> > >
> > > It looks like an apache.org domain email is required to register, and
> I
> > > don't think my mentees are going to have one of those, so I may still
> > open
> > > a Zulip on the side. I am happy to have joined the commons slack there
> > > though!
> >
> > You should be able to invite folks without @apache.org addresses
> >
> > Mark
> >
> >
> > >
> > > Eric
> > >
> > > On Wed, May 1, 2019 at 1:58 PM Mark Thomas  wrote:
> > >
> > >> On 01/05/2019 21:54, Eric Barnhill wrote:
> > >>> On Wed, May 1, 2019 at 1:49 PM Mark Thomas  wrote:
> > >>>
> > >>>> On 01/05/2019 21:38, Eric Barnhill wrote:
> > >>>>> Actually some objections have been raised to using Slack because it
> > is
> > >>>> not
> > >>>>> open source. So the options will be either zulipchat if a group of
> > >> people
> > >>>>> want to use it, or Riot if it is just me.
> > >>>>
> > >>>> Better stop using GitHub as well then.
> > >>>>
> > >>>> There is no ASF policy that requires the tools we use to be open
> > source.
> > >>>>
> > >>>>
> > >>> Thanks for clarifying.
> > >>>
> > >>>
> > >>>> There is an ASF slack instance - you could request (create?) a
> > >>>> commons-gsoc channel there.
> > >>>>
> > >>>
> > >>> Knock me over with a feather. This does not appear to be mentioned at
> > >>> community.apache.org .  Is it asf.slack.com?
> > >>
> > >> the-asf.slack.com
> > >> There is already a commons channel. I think you can create
> commons-gsoc
> > >> if you need it. If not, I look to be able to create channels. Just
> ping
> > me.
> > >>
> > >> Mark
> > >>
> > >>
> > >>
> > >
> >
> >
> >
> >
> >
>

Re: [numbers][GSoC] Slack for GSoC mentees

2019-05-06 Thread Eric Barnhill

I didn't realize that and perhaps I should have wrapped up that discussion
more clearly. Apologies.

I thought Mark's suggestion was very good. All would take place at the
official Apache slack channel. I set up a commons-gsoc channel and invited
Ben and Abhishek. Certainly, that solves all the issues I had. I just
wanted a communication channel.

Welcome Ben and Abhishek, please find the commons-gsoc and let's
communicate there. I will check in today, by end of California workday.




On Mon, May 6, 2019 at 11:18 AM Rob Tompkins  wrote:

>
>
> > On May 6, 2019, at 2:13 PM, Udit Arora  wrote:
> >
> > Congrats Ben.
>
> +1 Congrats,  Ben.
>
> I think we’re trying to sort out what messaging system we’re going to use.
>
> -Rob
>
> >
> > On Mon, 6 May 2019, 11:42 pm Ben Nguyen,  wrote:
> >
> >> Hello,
> >>
> >> Any update on which communication tool will be used? Slack, Zulip? I’m
> >> excited to get started!
> >>
> >> Ben
> >>
> >> From: Mark Thomas
> >> Sent: Wednesday, May 1, 2019 4:21 PM
> >> To: Commons Developers List
> >> Subject: Re: [numbers][GSoC] Slack for GSoC mentees
> >>
> >> On 01/05/2019 22:09, Eric Barnhill wrote:
> >>> Thanks Mark,
> >>>
> >>> It looks like an apache.org domain email is required to register, and
> I
> >>> don't think my mentees are going to have one of those, so I may still
> >> open
> >>> a Zulip on the side. I am happy to have joined the commons slack there
> >>> though!
> >>
> >> You should be able to invite folks without @apache.org addresses
> >>
> >> Mark
> >>
> >>
> >>>
> >>> Eric
> >>>
> >>> On Wed, May 1, 2019 at 1:58 PM Mark Thomas  wrote:
> >>>
> >>>> On 01/05/2019 21:54, Eric Barnhill wrote:
> >>>>> On Wed, May 1, 2019 at 1:49 PM Mark Thomas  wrote:
> >>>>>
> >>>>>> On 01/05/2019 21:38, Eric Barnhill wrote:
> >>>>>>> Actually some objections have been raised to using Slack because it
> >> is
> >>>>>> not
> >>>>>>> open source. So the options will be either zulipchat if a group of
> >>>> people
> >>>>>>> want to use it, or Riot if it is just me.
> >>>>>>
> >>>>>> Better stop using GitHub as well then.
> >>>>>>
> >>>>>> There is no ASF policy that requires the tools we use to be open
> >> source.
> >>>>>>
> >>>>>>
> >>>>> Thanks for clarifying.
> >>>>>
> >>>>>
> >>>>>> There is an ASF slack instance - you could request (create?) a
> >>>>>> commons-gsoc channel there.
> >>>>>>
> >>>>>
> >>>>> Knock me over with a feather. This does not appear to be mentioned at
> >>>>> community.apache.org .  Is it asf.slack.com?
> >>>>
> >>>> the-asf.slack.com
> >>>> There is already a commons channel. I think you can create
> commons-gsoc
> >>>> if you need it. If not, I look to be able to create channels. Just
> ping
> >> me.
> >>>>
> >>>> Mark
> >>>>
> >>>>
> >>>>
> >>>
> >>
> >>
> >>
> >>
> >>
>
>
>
>

Re: [numbers][GSoC] Slack for GSoC mentees

2019-05-06 Thread Eric Barnhill

Sounds like the word is out. Ben is this your preferred email for a slack
communication? I will invite you to our dedicated ASF commons-gsoc slack
channel.

On Mon, May 6, 2019 at 11:12 AM Ben Nguyen  wrote:

> Hello,
>
> Any update on which communication tool will be used? Slack, Zulip? I’m
> excited to get started!
>
> Ben
>
> From: Mark Thomas
> Sent: Wednesday, May 1, 2019 4:21 PM
> To: Commons Developers List
> Subject: Re: [numbers][GSoC] Slack for GSoC mentees
>
> On 01/05/2019 22:09, Eric Barnhill wrote:
> > Thanks Mark,
> >
> > It looks like an apache.org domain email is required to register, and I
> > don't think my mentees are going to have one of those, so I may still
> open
> > a Zulip on the side. I am happy to have joined the commons slack there
> > though!
>
> You should be able to invite folks without @apache.org addresses
>
> Mark
>
>
> >
> > Eric
> >
> > On Wed, May 1, 2019 at 1:58 PM Mark Thomas  wrote:
> >
> >> On 01/05/2019 21:54, Eric Barnhill wrote:
> >>> On Wed, May 1, 2019 at 1:49 PM Mark Thomas  wrote:
> >>>
> >>>> On 01/05/2019 21:38, Eric Barnhill wrote:
> >>>>> Actually some objections have been raised to using Slack because it
> is
> >>>> not
> >>>>> open source. So the options will be either zulipchat if a group of
> >> people
> >>>>> want to use it, or Riot if it is just me.
> >>>>
> >>>> Better stop using GitHub as well then.
> >>>>
> >>>> There is no ASF policy that requires the tools we use to be open
> source.
> >>>>
> >>>>
> >>> Thanks for clarifying.
> >>>
> >>>
> >>>> There is an ASF slack instance - you could request (create?) a
> >>>> commons-gsoc channel there.
> >>>>
> >>>
> >>> Knock me over with a feather. This does not appear to be mentioned at
> >>> community.apache.org .  Is it asf.slack.com?
> >>
> >> the-asf.slack.com
> >> There is already a commons channel. I think you can create commons-gsoc
> >> if you need it. If not, I look to be able to create channels. Just ping
> me.
> >>
> >> Mark
> >>
> >>
> >>
> >
>
>
>
>
>

Re: [numbers][rng][GSoC] Slack for GSoC mentees

2019-05-06 Thread Eric Barnhill

Hi Abhishek, is this your preferred email?

On Mon, May 6, 2019 at 11:14 AM Abhishek Dhadwal 
wrote:

> +1
> I'd like to know if the zulip chat would be available for RNG members also
> !
> Look forward to working alongside all the members of the organization.
> Regards,
> Abhishek
>
>
> On Mon, May 6, 2019, 23:42 Ben Nguyen  wrote:
>
> > Hello,
> >
> > Any update on which communication tool will be used? Slack, Zulip? I’m
> > excited to get started!
> >
> > Ben
> >
> > From: Mark Thomas
> > Sent: Wednesday, May 1, 2019 4:21 PM
> > To: Commons Developers List
> > Subject: Re: [numbers][GSoC] Slack for GSoC mentees
> >
> > On 01/05/2019 22:09, Eric Barnhill wrote:
> > > Thanks Mark,
> > >
> > > It looks like an apache.org domain email is required to register, and
> I
> > > don't think my mentees are going to have one of those, so I may still
> > open
> > > a Zulip on the side. I am happy to have joined the commons slack there
> > > though!
> >
> > You should be able to invite folks without @apache.org addresses
> >
> > Mark
> >
> >
> > >
> > > Eric
> > >
> > > On Wed, May 1, 2019 at 1:58 PM Mark Thomas  wrote:
> > >
> > >> On 01/05/2019 21:54, Eric Barnhill wrote:
> > >>> On Wed, May 1, 2019 at 1:49 PM Mark Thomas  wrote:
> > >>>
> > >>>> On 01/05/2019 21:38, Eric Barnhill wrote:
> > >>>>> Actually some objections have been raised to using Slack because it
> > is
> > >>>> not
> > >>>>> open source. So the options will be either zulipchat if a group of
> > >> people
> > >>>>> want to use it, or Riot if it is just me.
> > >>>>
> > >>>> Better stop using GitHub as well then.
> > >>>>
> > >>>> There is no ASF policy that requires the tools we use to be open
> > source.
> > >>>>
> > >>>>
> > >>> Thanks for clarifying.
> > >>>
> > >>>
> > >>>> There is an ASF slack instance - you could request (create?) a
> > >>>> commons-gsoc channel there.
> > >>>>
> > >>>
> > >>> Knock me over with a feather. This does not appear to be mentioned at
> > >>> community.apache.org .  Is it asf.slack.com?
> > >>
> > >> the-asf.slack.com
> > >> There is already a commons channel. I think you can create
> commons-gsoc
> > >> if you need it. If not, I look to be able to create channels. Just
> ping
> > me.
> > >>
> > >> Mark
> > >>
> > >>
> > >>
> > >
> >
> >
> >
> >
> >
>

Re: [All] Help with GitHub "support"

2019-05-02 Thread Eric Barnhill

I am happy to review PRs and approve merges as well, it's become part of my
daily routine at work to use GitHub in this way, so it's no trouble.

On Thu, May 2, 2019 at 3:56 AM Gilles Sadowski  wrote:

> Hi.
>
> Some people are providing PRs[1] on GitHub without engaging with
> us, here, or on JIRA.
> When this happens for codes[2] for which I'm the assumed reviewer,[3]
> I'd need help from someone, with a GitHub account, who would post
> a comment there, in order to let the "outside" contributors know that
> we won't apply PRs without tracking information (JIRA ticket and/or
> post on "dev"), as per the "contributions guidelines".[4]
>
> Thanks,
> Gilles
>
> [1] Last examples:
> https://github.com/apache/commons-math/pull/105
> https://github.com/apache/commons-statistics/pull/4
> [2] "RNG", "Numbers", "Statistics"
> [3] Unless someone else is willing to engage in reviewing the
> proposal on GitHub, and perform the merge.
> [4] http://commons.apache.org/patches.html
>
>
>

Re: [numbers][GSoC] Slack for GSoC mentees

2019-05-01 Thread Eric Barnhill

Thanks Mark,

It looks like an apache.org domain email is required to register, and I
don't think my mentees are going to have one of those, so I may still open
a Zulip on the side. I am happy to have joined the commons slack there
though!

Eric

On Wed, May 1, 2019 at 1:58 PM Mark Thomas  wrote:

> On 01/05/2019 21:54, Eric Barnhill wrote:
> > On Wed, May 1, 2019 at 1:49 PM Mark Thomas  wrote:
> >
> >> On 01/05/2019 21:38, Eric Barnhill wrote:
> >>> Actually some objections have been raised to using Slack because it is
> >> not
> >>> open source. So the options will be either zulipchat if a group of
> people
> >>> want to use it, or Riot if it is just me.
> >>
> >> Better stop using GitHub as well then.
> >>
> >> There is no ASF policy that requires the tools we use to be open source.
> >>
> >>
> > Thanks for clarifying.
> >
> >
> >> There is an ASF slack instance - you could request (create?) a
> >> commons-gsoc channel there.
> >>
> >
> > Knock me over with a feather. This does not appear to be mentioned at
> > community.apache.org .  Is it asf.slack.com?
>
> the-asf.slack.com
> There is already a commons channel. I think you can create commons-gsoc
> if you need it. If not, I look to be able to create channels. Just ping me.
>
> Mark
>
>
>

Re: [numbers][GSoC] Slack for GSoC mentees

2019-05-01 Thread Eric Barnhill

On Wed, May 1, 2019 at 1:49 PM Mark Thomas  wrote:

> On 01/05/2019 21:38, Eric Barnhill wrote:
> > Actually some objections have been raised to using Slack because it is
> not
> > open source. So the options will be either zulipchat if a group of people
> > want to use it, or Riot if it is just me.
>
> Better stop using GitHub as well then.
>
> There is no ASF policy that requires the tools we use to be open source.
>
>
Thanks for clarifying.


> There is an ASF slack instance - you could request (create?) a
> commons-gsoc channel there.
>

Knock me over with a feather. This does not appear to be mentioned at
community.apache.org .  Is it asf.slack.com?

[numbers][GSoC] Slack for GSoC mentees

2019-05-01 Thread Eric Barnhill

I am going to set up a Slack to communicate with my GSoC mentees.

I know official policy is to communicate on this list, but especially with
small setup questions the mentees might have, or gaps in their knowledge,
that will create unnecessary spam for everyone. Larger-scale decisions will
be posted here so they are on the record.

I didn't know what scope to make this Slack. So I called it
apachecommonsnumbers.slack.com . If more people in commons are interested
in this idea, I could change its name to apachecommons.slack.com, and we
could all set up our own channels on it. It's free if that concerns anyone.

Give me a +1 if you want to use it and I will invite you. Otherwise it will
be lonely, just me and my mentees. :)

Re: [numbers][GSoC] Slack for GSoC mentees

2019-05-01 Thread Eric Barnhill

Actually some objections have been raised to using Slack because it is not
open source. So the options will be either zulipchat if a group of people
want to use it, or Riot if it is just me.

Thanks, Eric

On Wed, May 1, 2019 at 1:20 PM Eric Barnhill  wrote:

> I am going to set up a Slack to communicate with my GSoC mentees.
>
> I know official policy is to communicate on this list, but especially with
> small setup questions the mentees might have, or gaps in their knowledge,
> that will create unnecessary spam for everyone. Larger-scale decisions will
> be posted here so they are on the record.
>
> I didn't know what scope to make this Slack. So I called it
> apachecommonsnumbers.slack.com . If more people in commons are interested
> in this idea, I could change its name to apachecommons.slack.com, and we
> could all set up our own channels on it. It's free if that concerns anyone.
>
> Give me a +1 if you want to use it and I will invite you. Otherwise it
> will be lonely, just me and my mentees. :)
>

[statistics] [gsoc] New ticket for regression proposals

2019-04-02 Thread Eric Barnhill

The STATISTICS-7 ticket is not relevant for the exciting regression
proposals we have received. Would the two authors of these regression
proposals please reference ticket
https://issues.apache.org/jira/browse/STATISTICS-8 . Sorry it is a bit of a
rush job, I can iterate it a bit when I have more time.

Since I was asked how many will be accepted: my understanding is that
how many GSoC slots Apache gets is out of our hands, as Google pays the
students. However I think there are a lot of other benefits to contributing
to a high-visibility library and project like this and everyone who applies
is more than welcome to stick around and build your scientific and
engineering track record. We will help you get started contributing whether
or not you are in GSoC.

Re: [commons-statistics] STATISTICS-7 discussion

2019-04-02 Thread Eric Barnhill

Sorry, you are right, I am reading Salman's. Looking forward to reading yours
as well.

On Tue, Apr 2, 2019 at 1:27 PM Ben Nguyen  wrote:

> Hello Mr. Eric Barnhill
> I have not submitted my draft proposal yet, you must’ve read someone
> else’s but I will submit mine later today or tomorrow with some more
> details about this approach idea.
> Thanks,
> -Ben
>
> From: Eric Barnhill
> Sent: Tuesday, April 2, 2019 3:18 PM
> To: Commons Developers List
> Subject: Re: [commons-statistics] STATISTICS-7 discussion
>
> Estimators and Residuals interfaces. I'd never thought of that. I like it!
>
> I have read your draft proposal and I will make some comments over there,
> shortly.
>
>
>
> On Mon, Apr 1, 2019 at 5:33 PM Ben Nguyen  wrote:
>
> > Hello,
> > With the regression library restructuring, am I correct to assume that a
> > priority is to structure it such that appendage of new tools after the
> port
> > of current linear regression (OLS, GLS, SimpleRegression) is as painless
> as
> > possible?
> >
> > I’ve seen this approach elsewhere and want to know what you think:
> > an approach which separates key regression features by implementing for
> > e.g an Estimators and Residuals parent abstract/interface (others as
> > needed) which is extended by for ex: OLSEstimators and OLSResiduals….
> Then
> > have a central handler ex: OLSRegression…. All of which are in the
> package
> > regression-linear-ols? What do you think of this preliminary idea?
> > I would think that appending say the LogisticRegression (and other types)
> > would be more straightforward as a result, having different regression
> > types each having defined behavior and in separate packages with minimal
> > dependencies as well of course.
> >
> > Thank you
> > -Ben
> >
> > From: Eric Barnhill
> > Sent: Monday, April 1, 2019 11:02 AM
> > To: Commons Developers List
> > Subject: [commons-statistics] STATISTICS-7 discussion
> >
> > Our ongoing discussion with potential mentees is being moved here as
> > suggested by Gilles.
> >
> > Gilles commented on STATISTICS-7:
> > -
> >
> > current "math-linear" will be ported to "Commons Linear" in the future?
> >
> >
> > Perhaps; we'd need expert advice on how to design a modern implementation
> > of matrix algebra (?).
> >
> > In the meantime, it may be worth exploring the implications of having a
> > very focused {{commons-numbers-matrix}} module in "Commons Numbers".
> >
> > I also recommend checking out the EJML, which appears to be well
> > maintained, and probably has more expertise behind it than we would be
> able
> > to bring here. Like JTransforms its performance appears to be best in
> class
> > and it is appealingly encapsulated with no mission creep.
> >
> >
> >
> > > just use the current library temporarily for now
> >
> >
> > I'd rather not, as it will perpetuate the impression that "Commons Math"
> is
> > still supported.  A new major version of CM should be released (with
> > "legacy" codes) that will depend on "Commons Statistics".
> >
> > I agree, we do not want these libraries depending on commons-math.
> >
> >
> >  "math-util"
> >
> >
> > Anything in there that is still useful is a candidate for "Commons
> > Numbers".  Did you have a look at what's there already?
> >
> >
> > It is worth continuing the discussion about these Utils and utils-type
> > classes. They are often antipatterns that are falling between the stools
> of
> > object encapsulation and functional programming. MathUtils in particular
> > does nothing to describe the random functionalities in that class, all of
> > which probably have a better home somewhere else.
> >
> > Someone else in our discussion mentioned MathArrays; most of this
> > functionality should be handled by streams now for example, and the
> current
> > algorithmic approach of most of MathArrays should be discouraged.
> >
> >
>
>

Re: [commons-statistics] STATISTICS-7 discussion

2019-04-02 Thread Eric Barnhill

Estimators and Residuals interfaces. I'd never thought of that. I like it!

I have read your draft proposal and I will make some comments over there,
shortly.
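
For concreteness, the structure Ben describes (quoted below) might be sketched
roughly as follows. All names here (Estimators, Residuals, OlsSketch) are
placeholders borrowed from or invented for this discussion, not existing
Commons Statistics API, and the fit is restricted to a single predictor so
that no matrix library is needed.

```java
/** Hypothetical: the fitted parameters of a regression. */
interface Estimators {
    double[] coefficients();
}

/** Hypothetical: observed minus fitted values. */
interface Residuals {
    double[] values();
}

/** Hypothetical central handler for ordinary least squares, y = b0 + b1 * x. */
final class OlsSketch {
    private OlsSketch() {}

    /** Closed-form simple linear regression; returns {intercept, slope}. */
    static Estimators fit(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need at least two (x, y) pairs of equal length");
        }
        double meanX = 0;
        double meanY = 0;
        for (int i = 0; i < x.length; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= x.length;
        meanY /= y.length;

        double sxy = 0;
        double sxx = 0;
        for (int i = 0; i < x.length; i++) {
            final double dx = x[i] - meanX;
            sxy += dx * (y[i] - meanY);
            sxx += dx * dx;
        }
        if (sxx == 0) {
            throw new IllegalArgumentException("x values must not all be equal");
        }
        final double slope = sxy / sxx;
        final double intercept = meanY - slope * meanX;
        return () -> new double[] {intercept, slope};
    }

    /** Residuals of the given fit on the given data. */
    static Residuals residuals(Estimators fit, double[] x, double[] y) {
        final double[] b = fit.coefficients();
        final double[] r = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            r[i] = y[i] - (b[0] + b[1] * x[i]);
        }
        return () -> r.clone();
    }
}
```

A logistic regression would then supply its own Estimators and Residuals
implementations behind a separate handler, which is the extensibility the
proposal is after.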



On Mon, Apr 1, 2019 at 5:33 PM Ben Nguyen  wrote:

> Hello,
> With the regression library restructuring, am I correct to assume that a
> priority is to structure it such that appendage of new tools after the port
> of current linear regression (OLS, GLS, SimpleRegression) is as painless as
> possible?
>
> I’ve seen this approach elsewhere and want to know what you think:
> an approach which separates key regression features by implementing for
> e.g an Estimators and Residuals parent abstract/interface (others as
> needed) which is extended by for ex: OLSEstimators and OLSResiduals…. Then
> have a central handler ex: OLSRegression…. All of which are in the package
> regression-linear-ols? What do you think of this preliminary idea?
> I would think that appending say the LogisticRegression (and other types)
> would be more straightforward as a result, having different regression
> types each having defined behavior and in separate packages with minimal
> dependencies as well of course.
>
> Thank you
> -Ben
>
> From: Eric Barnhill
> Sent: Monday, April 1, 2019 11:02 AM
> To: Commons Developers List
> Subject: [commons-statistics] STATISTICS-7 discussion
>
> Our ongoing discussion with potential mentees is being moved here as
> suggested by Gilles.
>
> Gilles commented on STATISTICS-7:
> -
>
> current "math-linear" will be ported to "Commons Linear" in the future?
>
>
> Perhaps; we'd need expert advice on how to design a modern implementation
> of matrix algebra (?).
>
> In the meantime, it may be worth exploring the implications of having a
> very focused {{commons-numbers-matrix}} module in "Commons Numbers".
>
> I also recommend checking out the EJML, which appears to be well
> maintained, and probably has more expertise behind it than we would be able
> to bring here. Like JTransforms its performance appears to be best in class
> and it is appealingly encapsulated with no mission creep.
>
>
>
> > just use the current library temporarily for now
>
>
> I'd rather not, as it will perpetuate the impression that "Commons Math" is
> still supported.  A new major version of CM should be released (with
> "legacy" codes) that will depend on "Commons Statistics".
>
> I agree, we do not want these libraries depending on commons-math.
>
>
>  "math-util"
>
>
> Anything in there that is still useful is a candidate for "Commons
> Numbers".  Did you have a look at what's there already?
>
>
> It is worth continuing the discussion about these Utils and utils-type
> classes. They are often antipatterns that are falling between the stools of
> object encapsulation and functional programming. MathUtils in particular
> does nothing to describe the random functionalities in that class, all of
> which probably have a better home somewhere else.
>
> Someone else in our discussion mentioned MathArrays; most of this
> functionality should be handled by streams now for example, and the current
> algorithmic approach of most of MathArrays should be discouraged.
>
>

Re: Lam Gia Thuan - GSoC19 - Numbers 96: A few questions about the topic!

2019-04-02 Thread Eric Barnhill

Sorry, a bad keystroke combination sent that early, one reply is not
finished. Never alternate between vim in one window and gmail in the other.

>>  2. *Is it okay if I am not familiar with Interpolation 'now'?* The
>> truth is, Interpolation is an area I have not known about, which is
>> why I find this task more exciting since I can definitely learn
>> something completely new.
>>
>
> Ideally you have:
>
>
1. Sufficient Java background, demonstrable through a GitHub repository, and
perhaps a track record of other collaborations.
2. Sufficient mathematical background. You need to know something about
polynomials and related function families, how functions are approximated,
particularly least squares and weighted least squares, some familiarity with
calculus and numerical methods, some work with matrices as many
interpolations are found through solving a linear system, and ideally some
knowledge of Fourier and other frequency-domain methods.

Re: Lam Gia Thuan - GSoC19 - Numbers 96: A few questions about the topic!

2019-04-02 Thread Eric Barnhill

Lam,

A warm welcome to you.  I have replied within your message below.

On Mon, Apr 1, 2019 at 4:49 PM thuan  wrote:

>
>
>  1. *What is the complete scope of this project?* Is it only NUMBERS-96
> or all NUMBERS-related JIRAs? I want to know it to ensure where to
> focus.
>

It is only NUMBERS-96 and would focus on interpolation.

>  2. *Is it okay if I am not familiar with Interpolation 'now'?* The
> truth is, Interpolation is an area I have not known about, which is
> why I find this task more exciting since I can definitely learn
> something completely new.
>

Ideally you have:

>  3. Are there any specific requirements in terms of skills? *How do you
> assure that I have the skills (now or in the near future) that you
> need for this task?*
>

Your proposal should point to some examples of previous Java coding that
show you have the necessary knowledge of Java and sufficient mathematical
background.

>  4. By porting and redeveloping, is it only about translating to Java 8?
> I mean: *Will there be any research areas for improving the
> performance of these algorithms in time complexity and memory*? I
> really want to know, since I am thinking of a thesis topic after
> this and it would be great if I can make use of this project.
>

I would be more concerned you grasped the basics of the interpolation
first. As far as this code library goes, there is almost certainly room to
improve its performance. As far as a thesis goes interpolation is a pretty
well worked out mathematical problem, but there are a lot of good CS
applications, for example in video games, that you could apply it to.

>  5. *Do we implement any algorithms other than those available in
> commons-math?*
>

The current framework in commons-math is in my opinion good and the first
priority is to turn this into a freestanding library.

>  6. By documentation, what kind of documentation would you expect? *A
> javadoc like the old commons-math or a user guide with examples like
> that of **JUnit 5
> **?*
>

Both. You will need to follow Javadoc best practices, and you should
conclude the work with some kind of user friendly document.

>  7. By Java 8+, *do you expect Java 11 also*?
>

Target Java 8.

>  8. In the JIRA, there is no requirement for tests?*Will we implement
> tests? If we do, will we implement both accuracy tests and
> performance tests?* I would like to know to put them in the
> deliverables.
>

There are already tests. Porting the tests will be step 1. Then we can
evaluate how good the testing coverage is and what changes to the test
library need to be made. That would be good experience for you I think.

>  By the way, the link to the package summary in [ NUMBERS-96 ] has
> one wrong character ')' at the end, so it is basically inaccessible.
>

I think Gilles just fixed that.

[commons-statistics] STATISTICS-7 discussion

2019-04-01 Thread Eric Barnhill

Our ongoing discussion with potential mentees is being moved here as
suggested by Gilles.

Gilles commented on STATISTICS-7:
-

current "math-linear" will be ported to "Commons Linear" in the future?


Perhaps; we'd need expert advice on how to design a modern implementation
of matrix algebra (?).

In the meantime, it may be worth exploring the implications of having a
very focused {{commons-numbers-matrix}} module in "Commons Numbers".

I also recommend checking out the EJML, which appears to be well
maintained, and probably has more expertise behind it than we would be able
to bring here. Like JTransforms its performance appears to be best in class
and it is appealingly encapsulated with no mission creep.



> just use the current library temporarily for now


I'd rather not, as it will perpetuate the impression that "Commons Math" is
still supported.  A new major version of CM should be released (with
"legacy" codes) that will depend on "Commons Statistics".

I agree, we do not want these libraries depending on commons-math.


 "math-util"


Anything in there that is still useful is a candidate for "Commons
Numbers".  Did you have a look at what's there already?


It is worth continuing the discussion about these Utils and utils-type
classes. They are often antipatterns that are falling between the stools of
object encapsulation and functional programming. MathUtils in particular
does nothing to describe the random functionalities in that class, all of
which probably have a better home somewhere else.

Someone else in our discussion mentioned MathArrays; most of this
functionality should be handled by streams now for example, and the current
algorithmic approach of most of MathArrays should be discouraged.
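
As a rough illustration of what I mean, here are stream-based versions of a
few array helpers of the kind found in such utility classes. The class and
method names are invented for this example, and which helpers are worth
keeping is a separate question.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

/** Hypothetical examples of array helpers written with Java 8 streams. */
final class ArrayStreamsSketch {
    private ArrayStreamsSketch() {}

    /** Returns a new array with every element multiplied by factor. */
    static double[] scale(double factor, double[] values) {
        return Arrays.stream(values).map(v -> factor * v).toArray();
    }

    /** Element-by-element sum of two arrays of equal length. */
    static double[] add(double[] a, double[] b) {
        checkSameLength(a, b);
        return IntStream.range(0, a.length).mapToDouble(i -> a[i] + b[i]).toArray();
    }

    /** Euclidean (L2) distance between two points of equal dimension. */
    static double distance(double[] a, double[] b) {
        checkSameLength(a, b);
        final double sumOfSquares = IntStream.range(0, a.length)
            .mapToDouble(i -> (a[i] - b[i]) * (a[i] - b[i]))
            .sum();
        return Math.sqrt(sumOfSquares);
    }

    private static void checkSameLength(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
    }
}
```

Each helper reduces to a line or two, which is the argument for not keeping a
large hand-rolled loop library around.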

[numbers-fraction] of() methods for BigFraction - take BigInteger only, or include long and int?

2019-03-29 Thread Eric Barnhill

Almost done with Fraction here.

Fraction() operates with int inputs only, due to the mathematical
limitations of the fast algorithm, so of() methods only need to handle int
inputs.

But what about BigFraction()? Right now the of() methods handle BigIntegers
only. Do we want to expand this so a BigFraction of() method can handle any
combination of BigInteger, long, and int? That would be quite a few
constructors but that's life with hard typing. Or should we document that
the user should wrap arguments to a BigFraction of() call in the BigInteger
class? I suppose I lean toward the latter.
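
To make the two options concrete, here is a minimal sketch. RationalSketch is
a made-up stand-in, not the actual BigFraction code; it only shows a
BigInteger factory plus one primitive overload, against which the
"wrap it yourself" style can be compared.

```java
import java.math.BigInteger;

/** Hypothetical stand-in for BigFraction; illustration only. */
final class RationalSketch {
    private final BigInteger numerator;
    private final BigInteger denominator;

    private RationalSketch(BigInteger num, BigInteger den) {
        if (den.signum() == 0) {
            throw new ArithmeticException("denominator must not be zero");
        }
        // Keep the sign in the numerator and reduce to lowest terms.
        if (den.signum() < 0) {
            num = num.negate();
            den = den.negate();
        }
        final BigInteger gcd = num.gcd(den);
        this.numerator = num.divide(gcd);
        this.denominator = den.divide(gcd);
    }

    /** The only factory needed if callers wrap their own arguments. */
    static RationalSketch of(BigInteger num, BigInteger den) {
        return new RationalSketch(num, den);
    }

    /** Convenience overload; int arguments widen to long automatically. */
    static RationalSketch of(long num, long den) {
        return of(BigInteger.valueOf(num), BigInteger.valueOf(den));
    }

    @Override
    public String toString() {
        return numerator + " / " + denominator;
    }
}
```

With these two factories, of(3, -6) and
of(BigInteger.valueOf(3), BigInteger.valueOf(-6)) give the same value, but a
mixed call such as of(BigInteger.ONE, 2L) does not compile. Covering the
mixed cases is what multiplies the overloads.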

Once I resolve this issue, I think I'll be able to run a style check and
submit the code for anyone interested to review.

[numbers-fraction] merging changes

2019-03-26 Thread Eric Barnhill

I'm rebasing fraction to master and the next merge is looking tricky.
Gilles, am I correct that you have added an interface and some methods to
Fraction, and pushed this to master? But master does not yet have the of()
and from() method name changes that we discussed while my branch does, also
my branch has parse().

If you like I can merge these two branches together. What I see remaining
to do with Fraction is:
- Write some unit tests for parse()
- Figure out what we are doing with the Format classes that we are no
longer using. Just delete them, knowing they are in previous releases if
someone wants to implement numbers-formats some day?

Those are all the remaining changes I see.

Eric

Re: Google Summer of Code 2019 Mentor Registration

2019-03-21 Thread Eric Barnhill

>
> P.S. Do you know that a potential GSoC candidate is waiting for your
> feedback?
> https://issues.apache.org/jira/browse/STATISTICS-5
>
>
I did not see that! Thank you, I replied. I think STATISTICS-5 is
superseded by STATISTICS-7 which is also a bit more specific.

Re: [LANG]DurationUtils pull request reminder

2019-03-18 Thread Eric Barnhill

I don't have time to unravel this for you, and I am not saying my proposed
idea was the best, but generally it is an antipattern to have a bunch of
similarly named methods performing approximately the same task.

It is a nice idea for a common utility but surely there is a way to
implement it where someone sets the rounding method and precision somehow,
and then calls round(), rather than having lots of method names like
"roundUpDays()".

On Sun, Mar 17, 2019 at 2:49 PM Aleksander Ściborek <
aleksanderscibo...@gmail.com> wrote:

> I was thinking about this, but the implementation of the Duration class
> makes it quite hard. If the method "public long get(TemporalUnit unit)" from
> the Duration class supported all ChronoUnit values it would be pretty easy to
> do, but with the current implementation I would have to write a lot of switch
> or if statements, so I don't see much benefit in making this more
> object-oriented.
>
>
> On Wed, 13 Mar 2019 at 00:20, Eric Barnhill 
> wrote:
>
> > I think this class is on its way; however, I agree with Sebb's comments
> > that there has to be more flexibility about the rounding approach.
> >
> > I am not sure a Utils class is the way to handle this flexibility. What
> > about a DurationRounder class or similar? Then an Enum for the rounding
> > method: RoundingMethod.ROUND_UP, RoundingMethod.ROUND_DOWN, etc. You will
> > want to include what is known in Matlab as the fix() or "round toward
> > zero" method, which is a common method when there are positive and
> > negative numbers.
> >
> > The user can then set the rounding method of the object during
> > construction (or with a setter, I guess). Then when the object is passed a
> > Duration, the state of the object will dictate how it rounds. In Python or
> > Matlab the enum would reference a function handle; I am not sure what
> > would be the most elegant Java solution for such a situation -- perhaps
> > using MethodHandle, or using a lambda expression?
> >
> > And then I would say ditto for the unit of rounding. So rather than
> > repetitive methods like roundUpDays(), you construct a
> > DurationRounder(RoundingMethod.FIX, RoundingUnit.SECONDS) and then just
> > call round().
> >
> > Please jump in if anyone finds this approach objectionable.
> >
> > Eric
> >
> > On Tue, Mar 12, 2019 at 4:05 PM Aleksander Ściborek <
> > aleksanderscibo...@gmail.com> wrote:
> >
> > > Hi,
> > > I would like to remind you about my pull request:
> > > https://github.com/apache/commons-lang/pull/406
> > > I know that you have a lot of work, but please take a look at it - this
> > > PR was created almost a month ago.
> > > Aleksander
> > >
> >
>

Re: [LANG]DurationUtils pull request reminder

2019-03-12 Thread Eric Barnhill

I think this class is on its way; however, I agree with Sebb's comments that
there has to be more flexibility about the rounding approach.

I am not sure a Utils class is the way to handle this flexibility. What
about a DurationRounder class or similar? Then an Enum for the rounding method:
RoundingMethod.ROUND_UP, RoundingMethod.ROUND_DOWN, etc. You will want to
include what is known in Matlab as the fix() or "round toward zero" method,
which is a common method when there are positive and negative numbers.

The user can then set the rounding method of the object during construction
(or with a setter, I guess). Then when the object is passed a Duration, the
state of the object will dictate how it rounds. In Python or Matlab the enum
would reference a function handle; I am not sure what would be the most
elegant Java solution for such a situation -- perhaps using MethodHandle,
or using a lambda expression?

And then I would say ditto for the unit of rounding. So rather than
repetitive methods like roundUpDays(), you construct a
DurationRounder(RoundingMethod.FIX, RoundingUnit.SECONDS) and then just
call round().
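
A minimal sketch of the idea, under some assumptions: instead of new RoundingMethod and RoundingUnit enums it reuses the JDK's java.math.RoundingMode (whose DOWN mode is "round toward zero", i.e. Matlab's fix()) and java.time.temporal.TemporalUnit. The DurationRounder class itself is hypothetical and is not the code from the pull request.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.time.Duration;
import java.time.temporal.TemporalUnit;

/** Hypothetical rounder: mode and unit are fixed at construction. */
final class DurationRounder {
    private static final BigDecimal NANOS_PER_SECOND = BigDecimal.valueOf(1_000_000_000L);

    private final RoundingMode mode;
    private final TemporalUnit unit;

    DurationRounder(RoundingMode mode, TemporalUnit unit) {
        this.mode = mode;
        this.unit = unit;
    }

    /** Rounds the duration to a whole number of units using the configured mode. */
    Duration round(Duration duration) {
        BigDecimal wholeUnits = toNanos(duration).divide(toNanos(unit.getDuration()), 0, mode);
        return unit.getDuration().multipliedBy(wholeUnits.longValueExact());
    }

    /** Total length in nanoseconds, computed in BigDecimal to avoid long overflow. */
    private static BigDecimal toNanos(Duration d) {
        return BigDecimal.valueOf(d.getSeconds())
                .multiply(NANOS_PER_SECOND)
                .add(BigDecimal.valueOf(d.getNano()));
    }
}
```

With this shape, roundUpDays(d) becomes new DurationRounder(RoundingMode.CEILING, ChronoUnit.DAYS).round(d), and no switch over units is needed because every TemporalUnit reports its own length through getDuration().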

Please jump in if anyone finds this approach objectionable.

Eric

On Tue, Mar 12, 2019 at 4:05 PM Aleksander Ściborek <
aleksanderscibo...@gmail.com> wrote:

> Hi,
> I would like to remind you about my pull request:
> https://github.com/apache/commons-lang/pull/406
> I know that you have a lot of work, but please take a look at it - this PR
> was created almost a month ago.
> Aleksander
>

Re: Google Summer of Code 2019 Mentor Registration

2019-03-12 Thread Eric Barnhill

On Tue, Mar 12, 2019 at 11:48 AM Gilles Sadowski 
wrote:

>
> There are also a couple of CM packages that would be worth porting
> to [Numbers] or their own component:
>   * o.a.c.math4.analysis.integration
>   * o.a.c.math4.analysis.interpolation
>   * o.a.c.math4.analysis.solvers
> (with adaptation to the interfaces of Java 8 "function" package).
>
> As for the "o.a.c.math4.ml" package, it should be fairly easy to
> port it to its own component, as there are no dependencies towards
> other CM packages.
> It could be worth having a small component focused on classification.
>
> WDYT?
>
>
I am well familiar with interpolation and have used the Commons library for
it before, and would be happy to mentor.

The other analysis libraries are pretty far outside of my expertise, and I
am not qualified to mentor alone anyway, but would be happy to be involved
and learn how they work.

I guess I am not too interested in putting time into ML components if Weka
does it better.

Re: Google Summer of Code 2019 Mentor Registration

2019-03-12 Thread Eric Barnhill

What I have now found, doing a bit of background research for this, is that
there is a well-developed pure Java machine learning library called WEKA (
https://www.cs.waikato.ac.nz/~ml/weka/) . It seems to have good
institutional support and be well maintained. As I had in mind, the
syntax is pretty intuitive and similar in style to Scikit-Learn. There is a
nice tutorial using it that can be found at
https://tech.io/playgrounds/3771/machine-learning-with-java---part-1-linear-regression
which illustrates this. I don't know what I would want to do differently,
that Weka hasn't already done, other than its targeting of Java 8. So I
think it would probably be re-inventing the wheel to try to get something
similar started here.

I will re-focus my mind on trying to get some momentum for the stats
functions, which is what I had in mind last summer. I do think if healthy
momentum can build for stats functions, there is a natural fit for a fair
amount of machine learning to be incorporated including our own mothballed
clustering and neural net libraries.

Eric

On Mon, Mar 11, 2019 at 5:28 PM Bruno P. Kinoshita  wrote:

>  Sounds like an interesting idea Eric. I wonder if we would get some
> dogfooding through projects like Apache OpenNLP (one that I know uses ML in
> Java).
>
> Cheers
> Bruno
>
> On Tuesday, 12 March 2019, 1:24:24 pm NZDT, Eric Barnhill <
> ericbarnh...@gmail.com> wrote:
>
>  On Sat, Mar 9, 2019 at 4:56 PM Gilles Sadowski 
> wrote:
>
> > Hi Eric.
> >
> > Le ven. 8 mars 2019 à 22:22, Eric Barnhill  a
> > écrit :
> > >
> > > I am definitely willing to mentor development of the stats libraries
> as I
> > > was last year. Now that I work more in data science I am happy to also
> > > mentor the ML library
> >
> > What are you referring to?
> >
>
> Commons-math had a machine learning library. Now that I look it over it is
> really a bit emaciated. Still, I think there is an opportunity here to get
> some components up to date that could be pretty widely used, rethinking the
> structure and grammar of the library to echo Python's highly successful
> scikit-learn and Keras libraries.
>
> There are a lot of young people who are interested in getting into data
> science, we might get a good candidate or two looking to distinguish
> themselves. Also Java is such an important language in data science and
> engineering, even if a lot of the ML model building to date is in R and
> Python, so it is a great language for someone entering ML to know.
>
>
> > You have to register as a mentor. :-)
> >
>
> Sent.
>
>
> >
> > Then, read and follow the guidelines:
> >  http://community.apache.org/guide-to-being-a-mentor.html
> >
> > What should be done ASAP is tag existing, or new issues,
> > with the appropriate label so that tasks will appear here:
> >  http://s.apache.org/gsoc2019ideas
>
>
> Will do tomorrow, hopefully it is not too late.

Re: Google Summer of Code 2019 Mentor Registration

2019-03-11 Thread Eric Barnhill

On Sat, Mar 9, 2019 at 4:56 PM Gilles Sadowski  wrote:

> Hi Eric.
>
> Le ven. 8 mars 2019 à 22:22, Eric Barnhill  a
> écrit :
> >
> > I am definitely willing to mentor development of the stats libraries as I
> > was last year. Now that I work more in data science I am happy to also
> > mentor the ML library
>
> What are you referring to?
>

Commons-math had a machine learning library. Now that I look it over it is
really a bit emaciated. Still, I think there is an opportunity here to get
some components up to date that could be pretty widely used, rethinking the
structure and grammar of the library to echo Python's highly successful
scikit-learn and Keras libraries.

There are a lot of young people who are interested in getting into data
science, we might get a good candidate or two looking to distinguish
themselves. Also Java is such an important language in data science and
engineering, even if a lot of the ML model building to date is in R and
Python, so it is a great language for someone entering ML to know.

> You have to register as a mentor. :-)
>

Sent.

>
> Then, read and follow the guidelines:
>   http://community.apache.org/guide-to-being-a-mentor.html
>
> What should be done ASAP is tag existing, or new issues,
> with the appropriate label so that tasks will appear here:
> http://s.apache.org/gsoc2019ideas

Will do tomorrow, hopefully it is not too late.

Re: Google Summer of Code 2019 Mentor Registration

2019-03-08 Thread Eric Barnhill

I am definitely willing to mentor development of the stats libraries as I
was last year. Now that I work more in data science I am happy to also
mentor the ML library -- in today's world this is NOT too distant a subject
for commons to cover and I am using those models every day, also it
integrates tightly with stats.

However Gilles and I recruited someone to work on stats last year and due
to some sort of communications disaster, they were rejected despite our
approvals on this list, they couldn't get credit for working on the
project, and an enormous amount of everyone's time was wasted.

Have safeguards been put in place to make sure it won't happen again this
time? What should we have done differently?

And if we can go forward, I would like to contact that kid and give him
first crack at the stats library -- he went to all the trouble to make a
nice proposal for it and everything last year...

Eric



On Fri, Mar 8, 2019 at 12:14 PM Gilles Sadowski 
wrote:

> Hi.
>
> Anyone willing to apply (cf. message below)?
>
> Regards,
> Gilles
>
> -- Forwarded message -
> From: Ulrich Stärk 
> Date: ven. 8 mars 2019 à 20:49
> Subject: Google Summer of Code 2019 Mentor Registration
> To: 
> Cc: d...@community.apache.org 
>
>
> Dear PMCs,
>
> I'm happy to announce that the ASF has made it onto the list of
> accepted organizations for
> Google Summer of Code 2019! [1,2]
>
> It is now time for mentors to sign up, so please pass this email on to
> your community and
> podlings. If you aren’t already subscribed to
> ment...@community.apache.org you should do so now else
> you might miss important information.
>
> Mentor signup requires two steps: mentor signup in Google's system [3]
> and PMC acknowledgement.
>
> If you want to mentor a project in this year's SoC you will have to
>
> 1. Be an Apache committer.
> 2. Request an acknowledgement from the PMC for which you want to
> mentor projects. Use the below
> template and *do not forget to copy ment...@community.apache.org*. We
> will use the email address you
> indicate to send the invite to be a mentor for Apache.
>
> PMCs, read carefully please.
>
> We request that each mentor is acknowledged by a PMC member. This is
> to ensure the mentor is in good
> standing with the community. When you receive a request for
> acknowledgement, please ACK it and cc
> ment...@community.apache.org
>
> Lastly, it is not yet too late to record your ideas in Jira (see
> previous emails for details).
> Students will now begin to explore ideas so if you haven’t already
> done so, record your ideas
> immediately!
>
> Cheers,
>
> The Apache GSoC Team
>
> mentor request email template:
> 
> to: private@.apache.org
> cc: ment...@community.apache.org
> subject: GSoC 2019 mentor request for 
>
>  PMC,
>
> please acknowledge my request to become a mentor for Google Summer of
> Code 2018 projects for Apache
> .
>
> I would like to receive the mentor invite to 
>
> 
>
> 
>
> [1] https://summerofcode.withgoogle.com/organizations/
> [2] https://summerofcode.withgoogle.com/organizations/6614885824200704/
> [3] https://summerofcode.withgoogle.com/
>
> -
> To unsubscribe, e-mail: private-unsubscr...@commons.apache.org
> For additional commands, e-mail: private-h...@commons.apache.org
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>
>

Re: [numbers-fraction] Maven surefire plugin error

2019-03-04 Thread Eric Barnhill

Rebasing on master fixed the problem, thank you.

On Mon, Mar 4, 2019 at 3:39 PM Gilles Sadowski  wrote:

>
>
> I'd recommend that you regularly rebase from "master".
>
> Regards,
> Gilles
>
> >
> > > Here is an excerpt of the exception trace (from running "mvn -e test"):
> > > ---CUT---
> > > Caused by: java.lang.NullPointerException
> > >at
> org.apache.maven.surefire.shade.org.apache.commons.lang3.SystemUtils.isJavaVersionAtLeast
> > > (SystemUtils.java:1626)
> > >at
> org.apache.maven.plugin.surefire.AbstractSurefireMojo.getEffectiveJvm
> > > (AbstractSurefireMojo.java:2107)
> > >at
> org.apache.maven.plugin.surefire.AbstractSurefireMojo.getForkConfiguration
> > > (AbstractSurefireMojo.java:1976)
> > >at
> org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeProvider
> > > (AbstractSurefireMojo.java:)
> > >at
> org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeAfterPreconditionsChecked
> > > (AbstractSurefireMojo.java:954)
> > > ---CUT---
> > >
> > > Regards,
> > > Gilles
> > >
>

[numbers-fraction] Maven surefire plugin error

2019-03-04 Thread Eric Barnhill

I am getting a maven error for the surefire plugin, but don't see it listed
as a dependency in numbers-fraction or numbers-core. I see it listed in
numbers-parent, but with no version number. Any ideas what I need to do to
get the tests running again?

I also get this error with other numbers modules so I suspect it is down to
my setup, but just can't quite see what to modify. "It was all working last
week..."

Error msg:
---
Failed to execute goal
org.apache.maven.plugins:maven-surefire-plugin:2.20.1:test (default-test)
on project commons-numbers-fraction: Execution default-test of goal
org.apache.maven.plugins:maven-surefire-plugin:2.20.1:test failed.
NullPointerException -> [Help 1]

Re: [VOTE] Redirect github notifications to issues@

2019-02-19 Thread Eric Barnhill

+1

On Tue, Feb 19, 2019 at 1:35 PM Marcelo Vanzin 
wrote:

> I'm opening a vote based on recent discussions about the extra noise
> generated by github updates going to dev@. So please vote:
>
> - +1 to redirect github updates of all commons repos to the issues@ list
> - -1 to keep things as is
>
> If the vote passes, I'll take care of opening an infra ticket
> referencing the result.
>
> --
> Marcelo
>

Re: [STATISTICS] Possible imprecision or bias of BinomialDistribution.inverseCumulativeProbability() or NormalDistribution.inverseCumulativeProbability()

2019-02-13 Thread Eric Barnhill

I have read the Stack Exchange thread and will take a look at your minimal
working example.

On Wed, Feb 13, 2019 at 10:45 AM Gilles Sadowski 
wrote:

> Hi.
>
> Le mer. 13 févr. 2019 à 13:12, Roman Leventov  a
> écrit :
> >
> > I try to approximate inverse CDF of BinomialDistribution with inverse CDF
> > of NormalDistribution. Works pretty well, but there is a noticeable bias,
> > that may be a sign of some imprecision or bias in either
> > BinomialDistribution.inverseCumulativeProbability(), or
> > NormalDistribution.inverseCumulativeProbability(), or both.
>
> Thanks for your interest.
> Review, documentation, unit tests and patches are welcome.  :-)
>
> Regards,
> Gilles
>
> >
> > See
> >
> https://stats.stackexchange.com/questions/392281/approximation-of-inverse-cdf-of-binomial-distribution-with-inverse-cdf-of-normal
> > .
>

Multiplicity of GitBox messages

2019-02-08 Thread Eric Barnhill

Is it the consensus outcome for the dev list, that we all receive a large
amount of GitBox postings? Wouldn't it be better to leave it to individuals
to track the projects they want to track? I know I do this.

Apologies if I missed any prior discussions.

Eric

Re: [Numbers] Formatting classes

2019-01-31 Thread Eric Barnhill

Please ignore previous post which was sent by accident.



>
> Le mar. 29 janv. 2019 à 00:14, Eric Barnhill  a
> écrit :
> >
> > Fraction already has a toString() method which should cover VALJO
> > concerns by representing the instance in one specific way.
>
> It has 2 different outputs (suppress the fraction bar and denominator
> when it is 1).
> Not sure that's very robust: Expecting a "/" as part of the representation
> will make parsing easier (noting that class is still missing the
> parse/valueOf method).
>

Good point but I think it is not too much to ask to have a logic block:
-- if contains slash: parse numerator and denominator and construct using
of()
-- if doesn't contain slash: parse as double and construct using from()
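
The two branches above could be sketched as follows. Since parse() does not exist yet, the of() and from() factories are passed in as function parameters; FractionParsing and its signature are assumptions made for this illustration only.

```java
import java.util.function.BiFunction;
import java.util.function.DoubleFunction;

/** Hypothetical parsing helper; the real factories are supplied by the caller. */
final class FractionParsing {
    private FractionParsing() {}

    /**
     * Parses "num/den" with the integer factory, anything else with the double factory.
     *
     * @param text input such as "3/5", "3 / 5" or "0.6"
     * @param of   factory for the exact case, e.g. a method like of(int, int)
     * @param from factory for the decimal case, e.g. a method like from(double)
     */
    static <T> T parse(String text, BiFunction<Integer, Integer, T> of, DoubleFunction<T> from) {
        String s = text.trim();
        int slash = s.indexOf('/');
        if (slash >= 0) {
            // Contains a slash: numerator and denominator are parsed as ints.
            int numerator = Integer.parseInt(s.substring(0, slash).trim());
            int denominator = Integer.parseInt(s.substring(slash + 1).trim());
            return of.apply(numerator, denominator);
        }
        // No slash: treat the whole input as a decimal value.
        return from.apply(Double.parseDouble(s));
    }
}
```

Malformed input surfaces as the NumberFormatException thrown by Integer.parseInt or Double.parseDouble; whether that is the right exception for a parse() method is left open here.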



> > The FractionFormat classes allow for options beyond this such as proper
> > fractions or region-specific versions.
>
> IMO it's out of scope for a low-level component, and at least until we
> have an actual use-case.
> Locale-specific input/output is a can of worms that should be handled
> by text-oriented libraries.
> Having output (e.g. error messages) differ from locale to locale is a
> very bad idea (in a low-level component[1]), and so is the capacity (of
> a low-level component) to truncate data that might be needed by the
> caller.  [Those two things are the purpose of the "NumberFormat"
> family of classes.]
>

The AbstractFractionFormat is an extension of NumberFormat. I agree that
these classes would serve better in the NumberFormat family. However, I
would rather do that than throw away good work. I would be surprised if
these formatting classes had _not_ been designed around some reasonable use
cases.

So my proposed workflow is:
 -- keep the toString() method
 -- move parse() out of the formatting classes and into the Fraction
classes (and follow above logic)
 -- move the formatting classes into the NumberFormat family


> The Javadoc of the "...FractionFormat" classes is also badly out-of-sync
> since most methods refer to "complex" (?), witnessing the extremely low
> usage.
>

I will write a ticket for this.

Eric

Re: [Numbers] Formatting classes

2019-01-31 Thread Eric Barnhill

>
> > Fraction already has a toString() method which should cover VALJO
> concerns
> > by representing the instance in one specific way.
>
> It has 2 different outputs (suppress the fraction bar and denominator
> when it is 1).
> Not sure that's very robust: Expecting a "/" as part of the representation
> will make parsing easier (noting that class is still missing the
> parse/valueOf
> method).
>

Good point, although it doesn't seem to me too much to ask to have a logic
block:



>
> > The FractionFormat classes allow for options beyond this such as proper
> > fractions or region-specific versions.
>
> IMO it's out of scope for a low-level component, and at least until we
> have an actual use-case.
> Locale-specific input/output is a can of worms that should be handled
> by text-oriented libraries.
> Having output (e.g. error messages) differ from locale to locale is a
> very bad idea (in a low-level component[1]), and so is the capacity (of
> a low-level component) to truncate data that might be needed by the
> caller.  [Those two things are the purpose of the "NumberFormat"
> family of classes.]
>
> The Javadoc of the "...FractionFormat" classes is also badly out-of-sync
> since most methods refer to "complex" (?), witnessing the extremely low
> usage.
>
> Best regards,
> Gilles
>
> >
> > It doesn't seem to me like it violates VALJO principles to have an
> > auxiliary class that takes care of these alternate cases. On the contrary
> > from a VALJO perspective it seems like a nice idea to have encapsulated
> > them in an auxiliary class structure.
> >
> > Eric
> >
>
> [1] Application developers can customize at will from the return values of
> "getNumerator()" and "getDenominator()" methods.
>
> > On Sat, Jan 26, 2019 at 12:11 PM Gilles Sadowski 
> > wrote:
> >
> > > Le sam. 26 janv. 2019 à 17:24, Gary Gregory  a
> > > écrit :
> > > >
> > > > On Sat, Jan 26, 2019 at 10:19 AM Gilles Sadowski <
> gillese...@gmail.com>
> > > > wrote:
> > > >
> > > > > Le sam. 26 janv. 2019 à 14:01, Gary Gregory <
> garydgreg...@gmail.com> a
> > > > > écrit :
> > > > > >
> > > > > > Are we talking about formatting [numbers] specific classes or JRE
> > > > > classes?
> > > > >
> > > > > They are classes that aim to customize the output from classes in
> > > > > [Numbers],
> > > > > (specifically, the way to display a {{Fraction}} or {{BigFraction}}
> > > > > object).
> > > > >
> > > >
> > > > Well, then that code does not belong in [text] since it requires
> [number]
> > > > classes.
> > >
> > > Not what I meant.
> > > People wanting custom formatting of a fraction can get the parts that
> > > define it (i.e. numerator and denominator) and call whatever they want
> > > (e.g. something which might be in [Text]) in order to craft a string
> > > representation of those numbers.
> > > Following ValJO (even if "BigFraction" is not one, strictly speaking),
> > > we want *one* way to represent the contents of the instance.
> > >
> > > Regards,
> > > Gilles
> > >
> > > >
> > > > Gary
> > > >
> > > >
> > > > > The classes provide accessors to the numerator and denominator
> which
> > > can be
> > > > > used by outside code to display the fraction as it wishes.
> > > > >
> > > > > Side note: Similar formatting classes in Commons Math are more of a
> > > > > nuisance
> > > > > than anything, e.g. displaying small numbers as "0" (in exception
> > > > > messages),
> > > > > because the default is to provide 6 decimal digits, thus discarding
> > > > > all significant
> > > > > information.
> > > > >
> > > > > Gilles
> > > > >
> > > > > >
> > > > > > Gary
> > > > > >
> > > > > > On Sat, Jan 26, 2019 at 7:14 AM Gilles Sadowski <
> > > gillese...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi.
> > > > > > >
> > > > > > > In reference to the current changes in module
> > > > > "commons-numbers-fraction"
> > > > > > > (on the "fraction-dev"), my opinion is that the formatting
> classes
> > > > > should
> > > > > > > be
> > > > > > > removed.[1]
> > > > > > > At the level of a math component, it's safer to stick to a
> single,
> > > > > > > locale-independent format.[2]
> > > > > > >
> > > > > > > Regards,
> > > > > > > Gilles
> > > > > > >
> > > > > > > [1] Rationale is that pretty-printing is not the purpose of the
> > > library
> > > > > > > (better
> > > > > > >  leave that to [Text], or a dedicated module).
> > > > > > > [2] There is a pending issue (NUMBERS-88) that suggests
> > > hard-coding the
> > > > > > > format.
> > > > > > >
> > > > > > >

Re: [Numbers] Formatting classes

2019-01-28 Thread Eric Barnhill

Fraction already has a toString() method which should cover VALJO concerns
by representing the instance in one specific way.

The FractionFormat classes allow for options beyond this such as proper
fractions or region-specific versions.

It doesn't seem to me like it violates VALJO principles to have an
auxiliary class that takes care of these alternate cases. On the contrary
from a VALJO perspective it seems like a nice idea to have encapsulated
them in an auxiliary class structure.

Eric


On Sat, Jan 26, 2019 at 12:11 PM Gilles Sadowski 
wrote:

> Le sam. 26 janv. 2019 à 17:24, Gary Gregory  a
> écrit :
> >
> > On Sat, Jan 26, 2019 at 10:19 AM Gilles Sadowski 
> > wrote:
> >
> > > Le sam. 26 janv. 2019 à 14:01, Gary Gregory  a
> > > écrit :
> > > >
> > > > Are we talking about formatting [numbers] specific classes or JRE
> > > classes?
> > >
> > > They are classes that aim to customize the output from classes in
> > > [Numbers],
> > > (specifically, the way to display a {{Fraction}} or {{BigFraction}}
> > > object).
> > >
> >
> > Well, then that code does not belong in [text] since it requires [number]
> > classes.
>
> Not what I meant.
> People wanting custom formatting of a fraction can get the parts that
> define it (i.e. numerator and denominator) and call whatever they want
> (e.g. something which might be in [Text]) in order to craft a string
> representation of those numbers.
> Following ValJO (even if "BigFraction" is not one, strictly speaking),
> we want *one* way to represent the contents of the instance.
>
> Regards,
> Gilles
>
> >
> > Gary
> >
> >
> > > The classes provide accessors to the numerator and denominator which
> can be
> > > used by outside code to display the fraction as it wishes.
> > >
> > > Side note: Similar formatting classes in Commons Math are more of a
> > > nuisance
> > > than anything, e.g. displaying small numbers as "0" (in exception
> > > messages),
> > > because the default is to provide 6 decimal digits, thus discarding
> > > all significant
> > > information.
> > >
> > > Gilles
> > >
> > > >
> > > > Gary
> > > >
> > > > On Sat, Jan 26, 2019 at 7:14 AM Gilles Sadowski <
> gillese...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi.
> > > > >
> > > > > In reference to the current changes in module
> > > "commons-numbers-fraction"
> > > > > (on the "fraction-dev"), my opinion is that the formatting classes
> > > should
> > > > > be
> > > > > removed.[1]
> > > > > At the level of a math component, it's safer to stick to a single,
> > > > > locale-independent format.[2]
> > > > >
> > > > > Regards,
> > > > > Gilles
> > > > >
> > > > > [1] Rationale is that pretty-printing is not the purpose of the
> library
> > > > > (better
> > > > >  leave that to [Text], or a dedicated module).
> > > > > [2] There is a pending issue (NUMBERS-88) that suggests
> hard-coding the
> > > > > format.
> > > > >
> > > > >

Re: Math Sparse Linear Programing -- Math Commons

2019-01-28 Thread Eric Barnhill

It sounds like this is a worthwhile upgrade to the performance of the
Simplex solvers. I agree with Gilles that from a design perspective, the
class accomplishes the same task and differs only in an internal detail,
so if possible the matrix type should be set with an argument rather than
by adding a whole new class.
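
As a rough, self-contained illustration of "an argument rather than a new class": the sketch below uses invented stand-in types (CellStore, DenseStore, SparseStore, Tableau) rather than the actual Commons Math classes, and only shows how one tableau class could receive its storage strategy through the constructor.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

/** Minimal storage abstraction; a stand-in for a real matrix interface. */
interface CellStore {
    double get(int row, int col);
    void set(int row, int col, double value);
}

/** Dense storage: allocates rows * cols doubles up front. */
final class DenseStore implements CellStore {
    private final double[][] data;

    DenseStore(int rows, int cols) { data = new double[rows][cols]; }

    public double get(int row, int col) { return data[row][col]; }

    public void set(int row, int col, double value) { data[row][col] = value; }
}

/** Sparse storage: keeps only the non-zero entries in a map. */
final class SparseStore implements CellStore {
    private final Map<Long, Double> entries = new HashMap<>();
    private final int cols;

    SparseStore(int rows, int cols) { this.cols = cols; }

    public double get(int row, int col) { return entries.getOrDefault(key(row, col), 0.0); }

    public void set(int row, int col, double value) {
        if (value == 0.0) {
            entries.remove(key(row, col));
        } else {
            entries.put(key(row, col), value);
        }
    }

    private long key(int row, int col) { return (long) row * cols + col; }
}

/** One tableau class; the storage strategy is a constructor argument. */
final class Tableau {
    private final CellStore cells;

    Tableau(int rows, int cols, BiFunction<Integer, Integer, CellStore> storage) {
        this.cells = storage.apply(rows, cols);
    }

    double getEntry(int row, int col) { return cells.get(row, col); }

    void setEntry(int row, int col, double value) { cells.set(row, col, value); }
}
```

A caller would then pick the layout at construction time, for example new Tableau(rows, cols, SparseStore::new) for a large sparse problem and new Tableau(rows, cols, DenseStore::new) otherwise, while the solver code stays the same.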

Eric



On Fri, Jan 25, 2019 at 7:24 PM Bill Igoe  wrote:

> Hi Gang,
>
> I recently altered the code for both SimplexSolver and SimplexTableau to use
> the OpenMapRealMatrix object rather than the Array2DRowRealMatrix.  Most
> large linear programming problems in fact have a very sparse
> Simplex tableau -- lots of zeros -- perhaps 90 percent of the matrix is
> zeros!  When using Array2DRowRealMatrix, one invariably runs into a heap
> space problem quite quickly.  By using the OpenMapRealMatrix instead, the
> code does not physically allocate space for a complete K by N matrix as does
> the Array2DRowRealMatrix.  The modifications to the existing code were quite
> modest.  I think this approach is quite valuable for practitioners of large
> scale linear programming problems.  There is a reduction in speed as the
> System.arraycopy procedure is no longer employed.  The cost however
> provides users with a vastly larger sandbox of memory for problem solving.
>
>
> My code is thus:
> LargeSimplexSolver.java
> LargeSimplexTableau.java
> LargeSolutionCallback.java
> and
> SimplexMapMatrix, a minor extension of OpenMapRealMatrix.
>
> I am willing to share the modifications if the Commons Math group thinks
> such an effort is worthwhile.
>
> Cheers to you all and keep up the good work.
>
> Bill Igoe
>

Re: [commons-numbers] [...] NUMBERS-91: Added ofInt() factory methods [...]

2019-01-14 Thread Eric Barnhill

I think you make some good points here. Why call the "straightforward"
factory method ofInt when there is no reason not to use a long, or
BigInteger (or I suppose byte)? These could all be handled with the
overloaded factory method of(). For Fraction, it is probably reasonable to
have only int-based constructors, but BigFraction should IMO take ints,
longs, or BigIntegers.

Similarly, the decimal case requires conversion, iteration and estimation.
I don't think there are scenarios under which it fails, but if the rounding
criterion were too strict, it would be unlikely to deliver what the user
wants (e.g. submitting .142857 but rounding to the first decimal place). So
Fractions with decimal input could be covered by the overloaded factory
method from().
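
For illustration, a sketch of the kind of "conversion, iteration and estimation" a from(double) factory involves, using a continued-fraction expansion with a tolerance and a maximum denominator. FractionApprox and its signature are assumptions for this example and are not the Commons Numbers implementation; the result is returned as a plain {numerator, denominator} pair.

```java
/** Hypothetical approximation helper based on continued fractions. */
final class FractionApprox {
    private FractionApprox() {}

    /**
     * Approximates value by a fraction, stopping once the error is within
     * epsilon or the next convergent would exceed maxDenominator.
     *
     * @return {numerator, denominator} of the last acceptable convergent
     */
    static int[] from(double value, double epsilon, int maxDenominator) {
        if (Double.isNaN(value) || Double.isInfinite(value)) {
            throw new IllegalArgumentException("not a finite value: " + value);
        }
        double a = Math.floor(value);
        if (Math.abs(a) > Integer.MAX_VALUE) {
            throw new ArithmeticException("value out of int range: " + value);
        }
        long p0 = 1;        // numerator of the convergent before the first one
        long q0 = 0;        // denominator of the convergent before the first one
        long p1 = (long) a; // first convergent is floor(value) / 1
        long q1 = 1;
        double r = value;
        while (Math.abs(value - (double) p1 / q1) > epsilon) {
            double fractionalPart = r - a;
            if (fractionalPart == 0) {
                break; // expansion terminated exactly
            }
            r = 1.0 / fractionalPart;
            a = Math.floor(r);
            if (a > Integer.MAX_VALUE) {
                break; // next term too large to be useful
            }
            long p2 = (long) a * p1 + p0;
            long q2 = (long) a * q1 + q0;
            if (q2 > maxDenominator || Math.abs(p2) > Integer.MAX_VALUE) {
                break; // keep the previous convergent
            }
            p0 = p1;
            q0 = q1;
            p1 = p2;
            q1 = q2;
        }
        return new int[] {(int) p1, (int) q1};
    }
}
```

In this sketch from(0.6, 1e-9, 100) yields 3/5, while from(0.142857, 1e-9, 10) stops at 1/7 because the next convergent's denominator would exceed the limit. That is the sense in which the result depends on how strict the caller's criteria are.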

So just of() and from(), I think we could get a three person consensus on
this! :)

Eric

On Fri, Dec 28, 2018 at 10:34 AM Gilles 
wrote:

> On Fri, 28 Dec 2018 09:17:08 -0800, Eric Barnhill wrote:
> > Fractions are constructed using either ints or doubles. In the case of
> > ints, the numerator and denominator are passed (or the denominator is
> > assumed to be one). Constructing fractions from doubles is more
> > algorithmic work: if I pass a known fixed quantity such as 0.6 of course
> > it will not be hard for the constructor to determine that is the
> > equivalent of 3 / 5. However if doubles are being passed of unknown
> > precision, then I may want to request a max value on the denominator, or
> > a precision within which the simplest fraction should be returned, or
> > even the maximum iterations in the computation.
> >
> > I think of those as qualitatively very different activities
>
> I agree.
>
> > so I called
> > them ofInt and ofDouble.
>
> But we could consider:
> Cat.1
>   * of(long, long)
>   * of(int, long)
>   * of(BigInteger, BigInteger)
>   * ...
> and
> Cat.2
>   * ofDouble(double)
>   * ofDouble(int, double)
>   * ...
> where "Cat.1" and "Cat.2" delineates the very different handling
> which you referred to; and in the case of "Cat.1", an exact (?)
> representation is constructed, while "Cat.2" could be lossy.
> The former can also be construed as closer to the convention for
> "ValJO" ("BigFraction" not being "ValJO" does not preclude choosing
> the simplest name for its factory methods).
>
> > The example I had in mind was probably Complex,
> > where we have ofPolar and ofCartesian. I suppose you are right, in
> > this
> > case the hard typing of the passed variables alone could invoke
> > either an
> > int or double based method while with Complex, both constructors are
> > taking
> > doubles.
>
> Quite right, there is some inconsistency; we may consider using
> "of" if the "ValJO" aspect is more important than the equivalence
> between polar and Cartesian input (cf. also the suggestion that
> conversion methods should be named "from...", of which I'm not really
> a fan yet).
> If there is no strong argument yet for either, we could open a JIRA
> report asking for opinions.  And leave that open as long as we
> release "beta" versions.
>
> > You do then have some very similar methods, for example of(int a, int
> > b)
> > will be an integer fraction with a on top and b on bottom; while
> > calling
> > of(double a, int b) will produce a fraction that approximates double
> > a with
> > max denominator b.
> >
> > Those two processes are so different that it might be more clarifying
> > to
> > distinguish them as ofInt(int a, int b) and ofDouble(double a, int b)
>
> IMHO, it is not sufficiently self-documenting anyway: one has to go
> to the docs in order to understand the difference; hence my proposal
> to have "of" for the "obvious thing" (a/b) and "ofDouble" for the more
> elaborate "transform".
> Not sure if I'm clear in why the "non-symmetric" makes sense. :-}
>
> Best regards,
> Gilles
>
> >
> > Eric
> >
> >
> > On Fri, Dec 28, 2018 at 4:33 AM Gilles 
> > wrote:
> >
> >> Hello Eric.
> >>
> >> On Thu, 27 Dec 2018 17:00:15 -0800, Eric Barnhill wrote:
> >> > I am overloading:
> >> >
> >> > public static BigFraction ofInt(final BigInteger num) {
> >> > return new BigFraction(num, BigInteger.ONE);
> >> > }
> >> >
> >> > public static BigFraction ofInt(BigInteger num, BigInteger
> >> den) {
> >> > return new BigFraction(num, den);
> >> > }
> >> >
> >> > private BigFraction(BigInteger num, BigInteger den) {
> >> >
> >> > Did my comment not give that impression?
> >>
> >> I was in fact wondering why "ofInt" rather than just "of".
> >>
> >> Best,
> >> Gilles
> >>
> >> >> [...]
>
>

Re: [Numbers] Outdated branches

2019-01-08 Thread Eric Barnhill

I use fraction-dev for Fraction and complex-dev for Complex. I believe
cis-method, complex-constructors and eb-test were all side branches that I
used for testing but did not delete.

Eric

On Tue, Jan 8, 2019 at 5:39 AM Gilles Sadowski  wrote:

> Hi.
>
> Command
>   $ git branch -a
> shows several stale (?) branches:
> ---CUT---
>   remotes/origin/cis-method
>   remotes/origin/complex-constructors
>   remotes/origin/complex-dev
>   remotes/origin/eb-test
>   remotes/origin/feature__NUMBERS-69__autodiff
>   remotes/origin/fraction-dev
>   remotes/origin/hypot-change
>   remotes/origin/master
>   remotes/origin/multimodule
>   remotes/origin/null-removal
>   remotes/origin/numbers-56-bugfix
>   remotes/origin/remove-nan-returns
>   remotes/origin/task_NUMBERS-33__Gamma
> ---CUT---
>
> Let's have a list of which are being used for development and
> which could be deleted.
> [I mean: Is there anyone who uses one of those as "upstream"?]
>
> Thanks,
> Gilles
>

Re: [commons-numbers] [...] NUMBERS-91: Added ofInt() factory methods [...]

2018-12-28 Thread Eric Barnhill

Fractions are constructed using either ints or doubles. In the case of
ints, the numerator and denominator are passed (or the denominator is
assumed to be one). Constructing fractions from doubles is more algorithmic
work: if I pass a known fixed quantity such as 0.6 of course it will not be
hard for the constructor to determine that is the equivalent of 3 / 5 .
However if doubles are being passed of unknown precision, then I may want
to request a max value on the denominator, or a precision within which the
simplest fraction should be returned, or even the maximum iterations in the
computation.

I think of those as qualitatively very different activities so I called
them ofInt and ofDouble. The example I had in mind was probably Complex,
where we have ofPolar and ofCartesian. I suppose you are right, in this
case the hard typing of the passed variables alone could invoke either an
int or double based method while with Complex, both constructors are taking
doubles.

You do then have some very similar methods, for example of(int a, int b)
will be an integer fraction with a on top and b on bottom; while calling
of(double a, int b) will produce a fraction that approximates double a with
max denominator b.

Those two processes are so different that it might be more clarifying to
distinguish them as ofInt(int a, int b) and ofDouble(double a, int b)
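
To illustrate why the double case is "more algorithmic work", here is a sketch of one standard way to do it, a continued-fraction expansion that stops at a maximum denominator or a maximum number of iterations. The class and method names are hypothetical, and this is not claimed to be the code on the fraction-dev branch. It returns the last convergent that fits, which is a good approximation but not always the closest one for that denominator bound.

```java
/** Hypothetical helper; a sketch, not the Commons Numbers implementation. */
final class DoubleToFraction {

    /**
     * Returns {numerator, denominator} of the last continued-fraction
     * convergent of value whose denominator does not exceed maxDenominator.
     */
    static long[] approximate(double value, long maxDenominator, int maxIterations) {
        if (Double.isNaN(value) || Math.abs(value) > Integer.MAX_VALUE || maxDenominator < 1) {
            throw new IllegalArgumentException("unsupported input");
        }
        long a = (long) Math.floor(value);
        long p0 = 1;   // previous convergent numerator
        long q0 = 0;   // previous convergent denominator
        long p1 = a;   // current convergent numerator
        long q1 = 1;   // current convergent denominator
        double x = value;
        for (int i = 0; i < maxIterations; i++) {
            double frac = x - a;
            if (frac == 0.0) {
                break;             // value is represented exactly
            }
            x = 1.0 / frac;
            if (x > maxDenominator) {
                break;             // next denominator would exceed the bound
            }
            a = (long) Math.floor(x);
            // multiplyExact/addExact throw ArithmeticException instead of wrapping.
            long p2 = Math.addExact(Math.multiplyExact(a, p1), p0);
            long q2 = Math.addExact(Math.multiplyExact(a, q1), q0);
            if (q2 > maxDenominator) {
                break;
            }
            p0 = p1;
            q0 = q1;
            p1 = p2;
            q1 = q2;
        }
        return new long[] {p1, q1};
    }
}
```

The three knobs mentioned above (maximum denominator, precision, maximum iterations) show up as extra parameters, which is part of why this looks so different from the plain numerator/denominator case.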

Eric

On Fri, Dec 28, 2018 at 4:33 AM Gilles  wrote:

> Hello Eric.
>
> On Thu, 27 Dec 2018 17:00:15 -0800, Eric Barnhill wrote:
> > I am overloading:
> >
> > public static BigFraction ofInt(final BigInteger num) {
> > return new BigFraction(num, BigInteger.ONE);
> > }
> >
> > public static BigFraction ofInt(BigInteger num, BigInteger den) {
> > return new BigFraction(num, den);
> > }
> >
> > private BigFraction(BigInteger num, BigInteger den) {
> >
> > Did my comment not give that impression?
>
> I was in fact wondering why "ofInt" rather than just "of".
>
> Best,
> Gilles
>
> >> [...]
>
>

Re: [commons-numbers] branch fraction-dev updated: NUMBERS-91: Added ofInt() factory methods and made BigInteger-based constructor private

2018-12-27 Thread Eric Barnhill

I am overloading:

public static BigFraction ofInt(final BigInteger num) {
return new BigFraction(num, BigInteger.ONE);
}

public static BigFraction ofInt(BigInteger num, BigInteger den) {
return new BigFraction(num, den);
}

private BigFraction(BigInteger num, BigInteger den) {

Did my comment not give that impression?

On Thu, Dec 27, 2018 at 4:52 PM Gilles  wrote:

> On Thu, 27 Dec 2018 23:54:57 +, ericbarnh...@apache.org wrote:
> > This is an automated email from the ASF dual-hosted git repository.
> >
> > ericbarnhill pushed a commit to branch fraction-dev
> > in repository https://gitbox.apache.org/repos/asf/commons-numbers.git
> >
> >
> > The following commit(s) were added to refs/heads/fraction-dev by this
> > push:
> >  new ebb8e03  NUMBERS-91: Added ofInt() factory methods
>
> Why not rely on method overload?  There is no need to duplicate
> part of the method's signature in its name.
>
> Gilles
>
> > and made
> > BigInteger-based constructor private
> > ebb8e03 is described below
> >
> > commit ebb8e03f139b8cec84564b3e558fea39b71d2f24
> > Author: Eric Barnhill 
> > AuthorDate: Thu Dec 27 15:54:51 2018 -0800
> >
> > NUMBERS-91: Added ofInt() factory methods and made
> > BigInteger-based
> > constructor private
> > ---
> >  .../commons/numbers/fraction/BigFraction.java  | 68
> > +++---
> >  1 file changed, 20 insertions(+), 48 deletions(-)
> >
> > [...]
>

Re: [numbers] propose making BigFraction an extension of Fraction

2018-12-27 Thread Eric Barnhill

Thanks for this response; it took me some time to think your various
points through.

On Thu, Dec 13, 2018 at 4:59 PM Gilles  wrote:

>
> On Thu, 13 Dec 2018 11:20:12 -0800, Eric Barnhill wrote:
>
> > Among the elegancies afforded by this change, if a Fraction operation
> > causes overflow as previously discussed, a BigFraction could be
> > returned
> > and should be able to handle all further calls to Fraction unaltered.
> > (This
> > might not always be desired behavior, so Fraction may need to contain
> > a
> > setting to either throw an exception, or convert to BigFraction in
> > case of
> > overflow.)
>
> Doesn't this setting achieve at runtime what the application
> developer should decide at compile time (by instantiating the
> class that has the desired behaviour)?
>

Yes. Perhaps I have been spending too much time writing Python lately.

>
> >
> > So I propose writing a ticket for this change. As sub-points on the
> > ticket
> > the BigFraction class could be conformed to Fraction class in terms
> > of
> > reduction of constants and producing a VALJO.
>
> Inheritance and ValJO turn out being contradictory (see thread
> with subject "Inheritance and ValJO ?").
> And (IIUC) the workaround/alternative hinted at by Stephen
> in that same thread might not be directly applicable because,
> here, the instance fields are different in "Fraction" and
> "BigFraction" ("long" vs "BigInteger").
>
> I've just noticed that "BigInteger" is not final; hence
> "BigFraction" cannot be a ValJO either.[1]
>

It sounds like this is sufficient to disqualify this proposal.
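
For readers wondering why a non-final BigInteger matters, a small sketch: ShiftyInteger and Holder are hypothetical names invented for this illustration. A class that stores a caller-supplied BigInteger cannot be sure it is holding an immutable value, because the caller may pass a subclass that changes its answers later.

```java
import java.math.BigInteger;

/** Hypothetical subclass whose reported value can change after construction. */
class ShiftyInteger extends BigInteger {
    private static final long serialVersionUID = 1L;
    private long reported;

    ShiftyInteger(long initial) {
        super(Long.toString(initial));
        this.reported = initial;
    }

    void set(long value) {
        this.reported = value;
    }

    // BigInteger.longValue() is not final, so it can be overridden.
    @Override
    public long longValue() {
        return reported;
    }
}

/** Hypothetical "value" class that trusts its BigInteger argument. */
final class Holder {
    private final BigInteger numerator;

    Holder(BigInteger numerator) {
        this.numerator = numerator;
    }

    long numeratorAsLong() {
        return numerator.longValue();
    }
}
```

Holder has only final fields and no setters, yet its observable state changes when the ShiftyInteger it holds is modified. A common defence is to copy untrusted instances into a plain BigInteger before storing them.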

> I don't think that we should rule out a "Fraction" interface.
>

Since BigFraction and Fraction have the use cases covered for now
(improved, I would argue, by only the former requiring Big* classes) I
propose wrapping up this work and leaving this until after a release.

> [1] So this issue:
>https://issues.apache.org/jira/browse/NUMBERS-75
>  should probably be resolved as "Invalid".
>

Done. But, there were some "peripheral" improvements that came out of
making Fraction a ValJO that should still be applied to BigFraction, for
example conforming both classes to use the same factory methods, and
reducing the absurd number of BigFraction constants. Shall I reopen and
rename the ticket to focus on these changes, or is it better to start a new
one?

Eric

[numbers/general] unlikely argument type warning

2018-12-13 Thread Eric Barnhill

For the line:

Assert.assertFalse(zero.equals(Double.valueOf(0)));

Eclipse is producing a warning:

"Unlikely argument type for equals(): Double seems to be unrelated to
Fraction"

Does anyone have a suggestion for how to handle this warning? Thank you.
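
Two possible ways to handle it, offered as suggestions rather than project policy. Frac is a hypothetical stand-in for Fraction so that the example compiles on its own. The first option uses Eclipse's "unlikely-arg-type" suppression token, which other compilers ignore. The second gives the argument the static type Object, so the check has nothing to flag.

```java
/** Hypothetical stand-in for Fraction, only to make the example compile. */
final class Frac {
    private final int num;
    private final int den;

    Frac(int num, int den) {
        this.num = num;
        this.den = den;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof Frac)) {
            return false;
        }
        Frac f = (Frac) other;
        return num == f.num && den == f.den;
    }

    @Override
    public int hashCode() {
        return 31 * num + den;
    }
}

final class EqualsWarningDemo {

    /** Option 1: keep the call as written and suppress the warning locally. */
    @SuppressWarnings("unlikely-arg-type")
    static boolean equalsDoubleSuppressed(Frac zero) {
        return zero.equals(Double.valueOf(0));
    }

    /** Option 2: pass the argument with static type Object. */
    static boolean equalsDoubleViaObject(Frac zero) {
        Object other = Double.valueOf(0);
        return zero.equals(other);
    }
}
```

Both methods return false for a zero fraction, which is what the original assertFalse line is testing.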

Eric

[numbers] propose making BigFraction an extension of Fraction

2018-12-13 Thread Eric Barnhill

Right now BigFraction and Fraction are separate parallel classes.

I propose altering this so that BigFraction extends Fraction, overrides its
methods, but also keeps its own unique methods.

I think it would be an improvement to the API to have both classes share
the same interface (and indeed an interface-based solution would be
possible, but strikes me as overkill, since I don't see any additional
classes beyond Fraction and BigFraction). BigFraction would in addition
have its current methods to convert BigIntegers to ints and longs.

Among the elegancies afforded by this change, if a Fraction operation
causes overflow as previously discussed, a BigFraction could be returned
and should be able to handle all further calls to Fraction unaltered. (This
might not always be desired behavior, so Fraction may need to contain a
setting to either throw an exception, or convert to BigFraction in case of
overflow.)

So I propose writing a ticket for this change. As sub-points on the ticket
the BigFraction class could be conformed to Fraction class in terms of
reduction of constants and producing a VALJO.

Eric

Re: [VOTE][LAZY] move commons git-wip repos to gitbox

2018-12-10 Thread Eric Barnhill

+1

On Sun, Dec 9, 2018 at 9:04 AM Oliver Heger 
wrote:

> +1
>
> Oliver
>
> Am 08.12.2018 um 21:09 schrieb Rob Tompkins:
> > Infra stated that we need documented consensus on this. So, let’s have
> at it.
> >
> > I propose that we move the following repos over to gitbox:
> >
> > commons-build-plugin.git   11 weeks ago
> > commons-cli.git30 weeks ago
> > commons-collections.git15 days ago
> > commons-compress.git   19 days ago
> > commons-crypto.git  9 weeks ago
> > commons-csv.git 7 weeks ago
> > commons-dbcp.git   24 days ago
> > commons-dbutils.git30 weeks ago
> > commons-fileupload.git 29 weeks ago
> > commons-imaging.git 6 weeks ago
> > commons-io.git 18 days ago
> > commons-lang.git6 days ago
> > commons-math.git   15 weeks ago
> > commons-numbers.git 8 days ago
> > commons-pool.git   16 days ago
> > commons-rdf.git25 weeks ago
> > commons-release-plugin.git 25 days ago
> > commons-rng.git   < 2 days ago
> > commons-scxml.git   7 weeks ago
> > commons-statistics.git 30 weeks ago
> > commons-testing.git30 weeks ago
> > commons-text.git   14 days ago
> >
> > This vote will close in 72 hours.
> >
> > Cheers,
> > -Rob

Re: [numbers] Fraction() and Knuth 4.5.1 -- overflow, BigInteger, long, and rounding

2018-12-03 Thread Eric Barnhill

>
>
> Does this mean that computations can "unpredictably" overflow
> (or throw an exception)?
>

The ArithmeticUtils methods mulAndCheck and addAndCheck throw exceptions
if there is overflow during primitive operations. That is the "check" part
of the method name.
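
For illustration, a sketch of what such a check amounts to for ints: widen to long, do the operation, and throw if the result no longer fits. CheckedIntMath is a hypothetical class written for this note and is not the ArithmeticUtils source; the JDK methods Math.addExact and Math.multiplyExact behave in a similar way.

```java
/** Hypothetical re-implementation of the idea; not the ArithmeticUtils source. */
final class CheckedIntMath {

    static int mulAndCheck(int x, int y) {
        long m = (long) x * (long) y;   // the product of two ints always fits in a long
        if (m < Integer.MIN_VALUE || m > Integer.MAX_VALUE) {
            throw new ArithmeticException("overflow: multiply");
        }
        return (int) m;
    }

    static int addAndCheck(int x, int y) {
        long s = (long) x + (long) y;   // the sum of two ints always fits in a long
        if (s < Integer.MIN_VALUE || s > Integer.MAX_VALUE) {
            throw new ArithmeticException("overflow: add");
        }
        return (int) s;
    }
}
```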


> Is it acceptable, or should we enclose the problematic code in
> a "try" block and redo the computation with "BigInteger" when
> necessary?
>
> What is the performance hit of using "BigFraction" rather than
> "Fraction"?
>

I once used BigDecimal for a project; it is great code, but the performance
is nothing close to using primitives.


> Are there use-cases that would need the ultimate performance from
> "Fraction" while not worry about overflow?
>

You would need a greatest common factor between the two fractions that was
larger than 64 bits.

Again, BigFraction is there for anyone worried about such a case and there
is no significant performance hit to switching over to BigFraction compared
to a Fraction class that was using BigInteger under the hood. But I
suspect there would be a substantial performance gain if longs were being
used under the hood for the Fraction class for the more common use case of
smaller fractions. If it would be best practice, a bit of microbenchmarking
could be done to check.

A FractionOverflowException could be specifically tailored to this use case
and the error message can suggest using BigFraction. Or as you suggest, the
catch block could silently or with warning return a BigFraction. If we have
class inheritance straight, and both Fraction and BigFraction have the
exact same interface, this could be an elegant solution.
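
A sketch of the "catch block" idea, under the assumption that the fast path uses long arithmetic: try the primitive computation first and redo it with BigInteger only when it overflows. OverflowFallback and addNumerator are hypothetical names. The method computes a*d + b*c, the numerator of a/b + c/d before reduction.

```java
import java.math.BigInteger;

/** Hypothetical helper showing a long fast path with a BigInteger fallback. */
final class OverflowFallback {

    /** Returns a*d + b*c, i.e. the unreduced numerator of a/b + c/d. */
    static BigInteger addNumerator(long a, long b, long c, long d) {
        try {
            // Fast path: the exact methods throw ArithmeticException on overflow.
            long ad = Math.multiplyExact(a, d);
            long bc = Math.multiplyExact(b, c);
            return BigInteger.valueOf(Math.addExact(ad, bc));
        } catch (ArithmeticException overflow) {
            // Slow path: redo the same computation without a size limit.
            BigInteger ad = BigInteger.valueOf(a).multiply(BigInteger.valueOf(d));
            BigInteger bc = BigInteger.valueOf(b).multiply(BigInteger.valueOf(c));
            return ad.add(bc);
        }
    }
}
```

Whether the fallback should be silent, logged, or replaced by a dedicated exception that points the user to BigFraction is the design question raised above; the sketch only shows that the mechanics are simple.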

Eric

Re: [numbers] Fraction() and Knuth 4.5.1 -- overflow, BigInteger, long, and rounding

2018-11-30 Thread Eric Barnhill

Here is what I propose for the Fraction doc text regarding this issue:

 * Implement add and subtract. This algorithm is similar to that
 * described in Knuth 4.5.1, while making some concessions to
 * performance. Note Knuth 4.5.1 Exercise 7, which observes that
 * adding two fractions with 32-bit numerators and denominators
 * requires 65 bits in extreme cases. Here calculations are performed
 * with 64-bit longs, and the BigFraction class is recommended for
 * numbers that may grow large enough to be in danger of overflow.
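
A quick worked check of the 65-bit figure, using a hypothetical helper class. For a/b + c/d the unreduced numerator is a*d + b*c; with every input equal to Integer.MIN_VALUE each product is 2^62 and their sum is 2^63, which is one more than Long.MAX_VALUE.

```java
import java.math.BigInteger;

/** Hypothetical helper that measures how wide a*d + b*c really is. */
final class SixtyFiveBits {

    /** Signed (two's-complement) width in bits needed to hold a*d + b*c. */
    static int bitsNeeded(int a, int b, int c, int d) {
        BigInteger ad = BigInteger.valueOf(a).multiply(BigInteger.valueOf(d));
        BigInteger bc = BigInteger.valueOf(b).multiply(BigInteger.valueOf(c));
        // bitLength() excludes the sign bit, so add one for it.
        return ad.add(bc).bitLength() + 1;
    }
}
```

Calling bitsNeeded with Integer.MIN_VALUE for all four arguments gives 65, so a long is one bit short in that extreme case.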


On Fri, Nov 9, 2018 at 4:33 PM Eric Barnhill  wrote:

> Addendum to the above. In an exercise in the Knuth book Knuth does indeed
> state that "If the inputs are n-bit binary numbers, 2n+1 bits may be
> necessary to represent t." where t is a derived quantity that would take
> some time to explain.
>
> So that means in extreme cases, the needed precision to represent a
> fraction operation with 32-bit ints is 65 bits, one more than a long has.
>
> The present code solves this by using BigInteger briefly in the code,
> which strikes me as an awfully big performance hit for what must surely be
> very occasional and very extreme cases.
>
> I think the most sensible strategy would be to restrict the precision of
> Fraction to longs, with user guidance to use BigFraction if there is
> concern of overflow.
>
> Eric
>
>
>
>
>
>
>
> On Thu, Nov 8, 2018 at 11:11 AM Gary Gregory 
> wrote:
>
>> I'm all for the Javadoc made to reflect the reality of the code. It is
>> fine
>> to have an additional section that points out Knuth and how we may want to
>> change things as a hint or request to contributors.
>>
>> Gary
>>
>> On Wed, Nov 7, 2018 at 10:52 AM Eric Barnhill 
>> wrote:
>>
>> > I read Knuth's "Art of Computer Programming 4.5.1" that is referenced
>> many
>> > times in the doc as the guidance for the commons-math/commons-numbers
>> > Fraction class. It is an interesting read. Also, for all the times it is
>> > cited in the doc, it is interesting that Fraction doesn't really use it
>> as
>> > implemented. Here is one example.
>> >
>> > Knuth is concerned about overflow in multiplication and division,
>> because
>> > numerator of f1 is multiplied by denominator of f2 and so forth, so he
>> > suggests a technique called "mediant rounding" that allows for
>> intermediate
>> > quantities in fraction multiplication to be rounded.
>> >
>> > It is a clever technique and probably works well, however the current
>> > Fraction class cites this chapter, then implements multiplication with
>> > BigInteger instead, ignoring this suggestion.
>> >
>> > First of all, the doc should be clear that the code is NOT following
>> 4.5.1,
>> > while it gives the opposite impression. And that's ok but the use of
>> > BigInteger creates additional inconsistency: Multiply and divide are
>> > accomplished using ArithmeticUtils.addAndCheck and
>> > ArithmeticUtils.mulAndCheck . These convert the relevant ints to longs,
>> > then perform the operation, then if the resulting long is greater than
>> the
>> > range of an int, throw an OverflowException. So some parts of Fraction
>> > check for overflow using longs and others use BigInteger.
>> >
>> > It seems to me that BigInteger is overkill here for the vast majority of
>> > practical uses of Fraction in a way that could be damaging for
>> performance.
>> > And furthermore, we already have a BigFraction class to handle cases
>> that
>> > require BigInteger.
>> >
>> > So, I propose rewriting the doc to say the opposite of what it currently
>> > says when appropriate, and get usages of BigInteger out of Fraction, use
>> > them only in BigFraction, and use the long-based ArithmeticUtils
>> methods to
>> > check for overflow and underflow in fraction addition and subtraction.
>> >
>> > Eric
>> >
>>
>

Re: [all] Amazon Corretto

2018-11-14 Thread Eric Barnhill

It reminds me uncomfortably of Microsoft's old "embrace, extend,
extinguish" philosophy in the 1990s.

On Wed, Nov 14, 2018 at 10:03 AM Pascal Schumacher 
wrote:

> Isn't this basically the same as Adopt Open JDK:
>
> https://adoptopenjdk.net
>
> or am I missing something?
>
> -Pascal
>
> Am 14.11.2018 um 15:14 schrieb Rob Tompkins:
> > Curious to see what people’s thoughts are to this:
> >
> > https://aws.amazon.com/corretto/
> >
> > -Rob

Re: [numbers] Fraction() and Knuth 4.5.1 -- overflow, BigInteger, long, and rounding

2018-11-09 Thread Eric Barnhill

Addendum to the above. In an exercise in the Knuth book Knuth does indeed
state that "If the inputs are n-bit binary numbers, 2n+1 bits may be
necessary to represent t." where t is a derived quantity that would take
some time to explain.

So that means in extreme cases, the needed precision to represent a
fraction operation with 32-bit ints is 65 bits, one more than a long has.

The present code solves this by using BigInteger briefly in the code, which
strikes me as an awfully big performance hit for what must surely be very
occasional and very extreme cases.

I think the most sensible strategy would be to restrict the precision of
Fraction to longs, with user guidance to use BigFraction if there is
concern of overflow.

Eric







On Thu, Nov 8, 2018 at 11:11 AM Gary Gregory  wrote:

> I'm all for the Javadoc made to reflect the reality of the code. It is fine
> to have an additional section that points out Knuth and how we may want to
> change things as a hint or request to contributors.
>
> Gary
>
> On Wed, Nov 7, 2018 at 10:52 AM Eric Barnhill 
> wrote:
>
> > I read Knuth's "Art of Computer Programming 4.5.1" that is referenced
> many
> > times in the doc as the guidance for the commons-math/commons-numbers
> > Fraction class. It is an interesting read. Also, for all the times it is
> > cited in the doc, it is interesting that Fraction doesn't really use it
> as
> > implemented. Here is one example.
> >
> > Knuth is concerned about overflow in multiplication and division, because
> > numerator of f1 is multiplied by denominator of f2 and so forth, so he
> > suggests a technique called "mediant rounding" that allows for
> intermediate
> > quantities in fraction multiplication to be rounded.
> >
> > It is a clever technique and probably works well, however the current
> > Fraction class cites this chapter, then implements multiplication with
> > BigInteger instead, ignoring this suggestion.
> >
> > First of all, the doc should be clear that the code is NOT following
> 4.5.1,
> > while it gives the opposite impression. And that's ok but the use of
> > BigInteger creates additional inconsistency: Multiply and divide are
> > accomplished using ArithmeticUtils.addAndCheck and
> > ArithmeticUtils.mulAndCheck . These convert the relevant ints to longs,
> > then perform the operation, then if the resulting long is greater than
> the
> > range of an int, throw an OverflowException. So some parts of Fraction
> > check for overflow using longs and others use BigInteger.
> >
> > It seems to me that BigInteger is overkill here for the vast majority of
> > practical uses of Fraction in a way that could be damaging for
> performance.
> > And furthermore, we already have a BigFraction class to handle cases that
> > require BigInteger.
> >
> > So, I propose rewriting the doc to say the opposite of what it currently
> > says when appropriate, and get usages of BigInteger out of Fraction, use
> > them only in BigFraction, and use the long-based ArithmeticUtils methods
> to
> > check for overflow and underflow in fraction addition and subtraction.
> >
> > Eric
> >
>
