[jira] [Comment Edited] (MATH-1426) Add constructor with Double[] argument to DescriptiveStatistics

2017-08-04 Thread Karl Richter (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-1426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114581#comment-16114581
 ] 

Karl Richter edited comment on MATH-1426 at 8/4/17 4:25 PM:


bq. So, in this issue's case, using a randomly populated array is just a 
convenience: the code could be checked with a "manually" chosen set of values; 
hence, we can choose any seed (and keep that one forever).

Interesting, thanks for the detailed explanation. I imitated the code of other 
tests as suggested.

bq. For "Commons Math" main code (in the "main" directory), only the API part 
would become a dependency.
bq. I also think that "slf4j-api" is a safe choice (but this must be decided on 
the ML).

One discussion is http://markmail.org/message/pf46s7775bt37swk for reference. 
That's tricky because of the wide range of interests. SLF4J is the most often 
declared dependency in github.com projects 
(https://en.wikipedia.org/wiki/SLF4J).

bq. I'm no maven expert; please raise this issue on the "dev" ML (use different 
posts for different subjects).

Done. The title is `[MATH] Enforce run of checkstyle-maven-plugin in validate 
instead of site phase`

bq. I didn't figure out what you mean by wrong tabulation
bq. For example, I'd think that in most of the code, the alignment would be (to 
be viewed with a fixed-width font...):

I adopted your example; the rest will be covered by checkstyle enforcement.


was (Author: krichter):
bq. So, in this issue's case, using a randomly populated array is just a 
convenience: the code could be checked with a "manually" chosen set of values; 
hence, we can choose any seed (and keep that one forever).

Interesting, thanks for the detailed explanation. I imitated the code of other 
tests as suggested.

bq. For "Commons Math" main code (in the "main" directory), only the API part 
would become a dependency.
I also think that "slf4j-api" is a safe choice (but this must be decided on the 
ML).

One discussion is http://markmail.org/message/pf46s7775bt37swk for reference. 
That's tricky because of the wide range of interests. SLF4J is the most often 
declared dependency in github.com projects 
(https://en.wikipedia.org/wiki/SLF4J).

bq. I'm no maven expert; please raise this issue on the "dev" ML (use different 
posts for different subjects).

Done. The title is `[MATH] Enforce run of checkstyle-maven-plugin in validate 
instead of site phase`

bq. I didn't figure out what you mean by wrong tabulation
bq. For example, I'd think that in most of the code, the alignment would be (to 
be viewed with a fixed-width font...):

I adopted your example; the rest will be covered by checkstyle enforcement.

> Add constructor with Double[] argument to DescriptiveStatistics
> ---
>
> Key: MATH-1426
> URL: https://issues.apache.org/jira/browse/MATH-1426
> Project: Commons Math
> Issue Type: Improvement
> Affects Versions: 4.0
> Reporter: Karl Richter
> Fix For: 4.0
>
> Attachments: 
> 0001-fixed-javadoc-of-constructors-in-DescriptiveStatisti.patch, 
> 0002-added-constructor-with-Double-argument-to-Descriptiv.patch
>
>
> It'd be nice to have a `Double[]` constructor in `DescriptiveStatistics`.
> The patch is available at https://github.com/apache/commons-math/pull/54 in 
> form of a PR as well.
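
A minimal sketch of what a `Double[]` constructor would presumably have to do 
before handing the data to code that works on primitives: unbox the array. The 
class `BoxedArrays` and the method `unbox` are hypothetical names used for 
illustration, and rejecting null elements is an assumption of this sketch, not 
necessarily what the attached patch does.

```java
import java.util.Objects;

/** Hypothetical helper; not the code from the attached patch. */
final class BoxedArrays {
    private BoxedArrays() {}

    /**
     * Converts a Double[] to a double[] so it can be handed to code that
     * expects primitives. Rejecting null elements is an assumption of this sketch.
     */
    static double[] unbox(Double[] boxed) {
        Objects.requireNonNull(boxed, "boxed");
        double[] result = new double[boxed.length];
        for (int i = 0; i < boxed.length; i++) {
            // requireNonNull returns the Double, which is then auto-unboxed.
            result[i] = Objects.requireNonNull(boxed[i], "null element at index " + i);
        }
        return result;
    }

    public static void main(String[] args) {
        double[] values = unbox(new Double[] {1.0, 2.5, 4.0});
        System.out.println(java.util.Arrays.toString(values));
    }
}
```

Failing fast on a null element keeps the error close to its cause; silently 
skipping nulls would be another possible design, but it would change the 
number of values.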



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (MATH-1426) Add constructor with Double[] argument to DescriptiveStatistics

2017-08-04 Thread Karl Richter (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-1426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114457#comment-16114457
 ] 

Karl Richter edited comment on MATH-1426 at 8/4/17 2:47 PM:


Thanks for your review. You have impressive coding discipline!

> Prefer several small test methods (one for each tested functionality) even if 
> some boiler-plate code repetition do occur. Having "testInit1()", 
> "testInit2()", ... is fine (but more meaningful names are preferred if 
> possible).

Done.

> If using random data, use a fixed seed, unless the behaviour under test has 
> intrinsic variability (which is not the case here).

I don't see the downside of using a different seed for each run of the unit 
tests ("fixed" for the duration of the run, not "fixed" as you mean it), e.g. 
one based on the test start time in milliseconds, and logging that seed. Since 
pseudo-random generators can be re-initialized with the same seed to produce 
the same results, a failing run can still be reproduced. If you agree, how do I 
initialize a Commons RNG with such a variable seed? If you don't, I'll comply 
with your explanation.
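
The seed-per-run idea above can be sketched as follows. This is only an 
illustration: `java.util.Random` stands in for a Commons RNG generator so that 
the snippet has no dependencies, and `SeededTestData`, `newRunSeed` and 
`randomArray` are hypothetical names, not code from the patch.

```java
import java.util.Arrays;
import java.util.Random;

/**
 * Sketch only: java.util.Random stands in for a Commons RNG generator,
 * and the class and method names here are hypothetical.
 */
final class SeededTestData {
    private SeededTestData() {}

    /** Picks a seed that differs between test runs (wall-clock millis). */
    static long newRunSeed() {
        return System.currentTimeMillis();
    }

    /** Fills an array of the given length from a generator built on the seed. */
    static double[] randomArray(long seed, int length) {
        Random rng = new Random(seed);
        double[] values = new double[length];
        for (int i = 0; i < length; i++) {
            values[i] = rng.nextDouble();
        }
        return values;
    }

    public static void main(String[] args) {
        long seed = newRunSeed();
        // Report the seed so that a failing run can be replayed with the same data.
        System.out.println("Test seed: " + seed);
        double[] first = randomArray(seed, 1024);
        double[] replay = randomArray(seed, 1024);
        System.out.println("Replay identical: " + Arrays.equals(first, replay));
    }
}
```

A failing run would then be replayed by passing the reported seed to 
`randomArray` instead of calling `newRunSeed`.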

> Make test sets small (as long as they can reasonably check the functionality) 
> to avoid long-running "mvn test"; here I don't think that arrays of length 
> 1048576 were needed.

Agreed. I chose 1024.

> Comment out debugging output ("System.out.println")

I'm a huge fan of configurable logging frameworks, which on the one hand 
require the evaluation of one very cheap statement, but on the other minimize 
controversy between devs and avoid re-adding previously deleted code (since you 
can turn logging statements off in your `logback.xml` or whatever is used). I 
wouldn't speak of "debugging" statements since it's either debugging or 
logging. The logging of the generator seed is necessary unless the above is 
wrong; I guess you'll shed some light on this issue.

Do you have interest in adding a logging framework (I suggest slf4j-api + 
logback-classic)? There are about 100 System.out.print statements in the code, 
some commented out. I'd provide a patch if you want.
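
To illustrate the "one very cheap statement" point, here is a minimal sketch 
that uses `java.util.logging` from the JDK as a stand-in for slf4j-api + 
logback-classic, so that it runs without dependencies. `SeedLogging`, 
`logSeed` and the `messagesBuilt` counter are hypothetical names that exist 
only to make the behaviour visible.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Sketch only: JDK logging stands in for SLF4J + Logback. */
final class SeedLogging {
    private static final Logger LOG = Logger.getLogger("SeedLogging");

    /** Counts how often the log message was actually built (for demonstration). */
    static int messagesBuilt = 0;

    private SeedLogging() {}

    /** Logs the seed at FINE; the message is only built if FINE is enabled. */
    static void logSeed(long seed) {
        LOG.log(Level.FINE, () -> {
            messagesBuilt++;
            return "Test seed: " + seed;
        });
    }

    public static void main(String[] args) {
        LOG.setLevel(Level.OFF);
        logSeed(1L); // disabled: costs one level check, message is not built
        LOG.setLevel(Level.FINE);
        logSeed(2L); // enabled: message is built and passed to the handlers
        System.out.println("Messages built: " + messagesBuilt);
    }
}
```

The statement stays in the code either way; whether it produces output is a 
configuration decision rather than a matter of commenting code in and out.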

> Apply a uniform coding style (e.g. there must be a space around operators, 
> and the tabulation is wrong).

I suggest you move `maven-checkstyle-plugin` from `reporting` to `build` with 
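{code:java}
<executions>
  <execution>
    <id>checkstyle</id>
    <goals>
      <goal>check</goal>
    </goals>
    <phase>validate</phase>
  </execution>
</executions>
{code}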
That reveals some hundred issues which should either be silenced or fixed 
(almost all of them are Javadoc issues, which you might deactivate for the 
check and put on your schedule to fix later). It minimizes communication 
overhead before reviewing contributions like in this situation. The issues 
you're describing are not revealed by checkstyle and I didn't figure out what 
you mean by wrong tabulation - no need to explain if you change the checkstyle 
configuration, since then I can rebase the patch.


was (Author: krichter):
Thanks for your review. You have impressive coding discipline!

> Prefer several small test methods (one for each tested functionality) even if 
> some boiler-plate code repetition do occur. Having "testInit1()", 
> "testInit2()", ... is fine (but more meaningful names are preferred if 
> possible).

Done.

> If using random data, use a fixed seed, unless the behaviour under test has 
> intrinsic variability (which is not the case here).

I don't see the downside of using a different seed for each run of the unit 
tests ("fixed" for the duration of the run, not "fixed" as you mean it), e.g. 
one based on the test start time in milliseconds, and logging that seed. Since 
pseudo-random generators can be re-initialized with the same seed to produce 
the same results, a failing run can still be reproduced. If you agree, how do I 
initialize a Commons RNG with such a variable seed?

> Make test sets small (as long as they can reasonably check the functionality) 
> to avoid long-running "mvn test"; here I don't think that arrays of length 
> 1048576 were needed.

Agreed. I chose 1024.

> Comment out debugging output ("System.out.println")

I'm a huge fan of configurable logging frameworks, which on the one hand 
require the evaluation of one very cheap statement, but on the other minimize 
controversy between devs and avoid re-adding previously deleted code (since you 
can turn logging statements off in your `logback.xml` or whatever is used). I 
wouldn't speak of "debugging" statements since it's either debugging or 
logging. The logging of the generator seed is necessary unless the above is 
wrong; I guess you'll shed some light on this issue.

Do you have interest in adding a logging framework (I suggest slf4j-api + 
logback-classic)? There are about 100 System.out.print statements in the code, 
some commented out. I'd 

[jira] [Comment Edited] (MATH-1426) Add constructor with Double[] argument to DescriptiveStatistics

2017-08-04 Thread Karl Richter (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-1426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114457#comment-16114457
 ] 

Karl Richter edited comment on MATH-1426 at 8/4/17 2:46 PM:


Thanks for your review. You have impressive coding discipline!

> Prefer several small test methods (one for each tested functionality) even if 
> some boiler-plate code repetition do occur. Having "testInit1()", 
> "testInit2()", ... is fine (but more meaningful names are preferred if 
> possible).

Done.

> If using random data, use a fixed seed, unless the behaviour under test has 
> intrinsic variability (which is not the case here).

I don't see the downside of using a different seed for each run of the unit 
tests ("fixed" for the duration of the run, not "fixed" as you mean it), e.g. 
one based on the test start time in milliseconds, and logging that seed. Since 
pseudo-random generators can be re-initialized with the same seed to produce 
the same results, a failing run can still be reproduced. If you agree, how do I 
initialize a Commons RNG with such a variable seed?

> Make test sets small (as long as they can reasonably check the functionality) 
> to avoid long-running "mvn test"; here I don't think that arrays of length 
> 1048576 were needed.

Agreed. I chose 1024.

> Comment out debugging output ("System.out.println")

I'm a huge fan of configurable logging frameworks, which on the one hand 
require the evaluation of one very cheap statement, but on the other minimize 
controversy between devs and avoid re-adding previously deleted code (since you 
can turn logging statements off in your `logback.xml` or whatever is used). I 
wouldn't speak of "debugging" statements since it's either debugging or 
logging. The logging of the generator seed is necessary unless the above is 
wrong; I guess you'll shed some light on this issue.

Do you have interest in adding a logging framework (I suggest slf4j-api + 
logback-classic)? There are about 100 System.out.print statements in the code, 
some commented out. I'd provide a patch if you want.

> Apply a uniform coding style (e.g. there must be a space around operators, 
> and the tabulation is wrong).

I suggest you move `maven-checkstyle-plugin` from `reporting` to `build` with 
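{code:java}
<executions>
  <execution>
    <id>checkstyle</id>
    <goals>
      <goal>check</goal>
    </goals>
    <phase>validate</phase>
  </execution>
</executions>
{code}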
That reveals some hundred issues which should either be silenced or fixed 
(almost all of them are Javadoc issues, which you might deactivate for the 
check and put on your schedule to fix later). It minimizes communication 
overhead before reviewing contributions like in this situation. The issues 
you're describing are not revealed by checkstyle and I didn't figure out what 
you mean by wrong tabulation - no need to explain if you change the checkstyle 
configuration, since then I can rebase the patch.


was (Author: krichter):
Thanks for your review. You have impressive coding discipline!

> Prefer several small test methods (one for each tested functionality) even if 
> some boiler-plate code repetition do occur. Having "testInit1()", 
> "testInit2()", ... is fine (but more meaningful names are preferred if 
> possible).
> For random generation, please use the classes from Commons RNG (cf. examples 
> in other test classes).

Done.

> If using random data, use a fixed seed, unless the behaviour under test has 
> intrinsic variability (which is not the case here).

I don't see the downside of using a different seed for each run of the unit 
tests ("fixed" for the duration of the run, not "fixed" as you mean it), e.g. 
one based on the test start time in milliseconds, and logging that seed. Since 
pseudo-random generators can be re-initialized with the same seed to produce 
the same results, a failing run can still be reproduced. If you agree, how do I 
initialize a Commons RNG with such a variable seed?

> Make test sets small (as long as they can reasonably check the functionality) 
> to avoid long-running "mvn test"; here I don't think that arrays of length 
> 1048576 were needed.

Agreed. I chose 1024.

> Comment out debugging output ("System.out.println")

I'm a huge fan of configurable logging frameworks, which on the one hand 
require the evaluation of one very cheap statement, but on the other minimize 
controversy between devs and avoid re-adding previously deleted code (since you 
can turn logging statements off in your `logback.xml` or whatever is used). I 
wouldn't speak of "debugging" statements since it's either debugging or 
logging. The logging of the generator seed is necessary unless the above is 
wrong; I guess you'll shed some light on this issue.

Do you have interest in adding a logging framework (I suggest slf4j-api + 
logback-classic)? There are about 100 

[jira] [Comment Edited] (MATH-1426) Add constructor with Double[] argument to DescriptiveStatistics

2017-08-04 Thread Karl Richter (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-1426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114221#comment-16114221
 ] 

Karl Richter edited comment on MATH-1426 at 8/4/17 10:29 AM:

> A unit test would be welcome.

Done. Coverage increased. Travis CI failed because of a JVM crash; please 
restart it.


was (Author: krichter):
> A unit test would be welcome.

Done. Travis CI failed because of a JVM crash; please restart it.



