[ 
https://issues.apache.org/jira/browse/STATISTICS-7?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829243#comment-16829243
 ] 

Gilles commented on STATISTICS-7:
---------------------------------

Hi [~Udit Arora],
{quote}this is kinda fun...
{quote}
Glad to hear. :)

Such small changes can indeed be a good opportunity to learn the process:
 # File an appropriate JIRA report: Almost every modification must be tracked 
(notable exceptions are Javadoc improvements, e.g. correcting typos, and 
additional unit tests, e.g. to improve code coverage).
 # The commit message should be prefixed with the key of the JIRA ticket (e.g. 
"STATISTICS-123: ..."). See the output of "git log" for examples of how 
detailed the message should be.
 Long-time committers sometimes neglect to open a JIRA ticket, but that should 
not be emulated. ;)
 # Describe the changes in the commit message: It's obvious that the commit 
contains a "change", but the reviewer should be able to tell, by reading the 
commit message, what the purpose of the change was.
 # It's always good to state that you ran the unit test suite, that the 
change is covered by it, and that the suite still produces the expected 
results. Side-note: We should 
ask INFRA to activate [Travis|https://travis-ci.org/apache/commons-statistics] 
for "Commons Statistics".

> Stream-based Java statistical processing
> ----------------------------------------
>
>                 Key: STATISTICS-7
>                 URL: https://issues.apache.org/jira/browse/STATISTICS-7
>             Project: Apache Commons Statistics
>          Issue Type: New Feature
>            Reporter: Eric Barnhill
>            Priority: Major
>              Labels: GSoC2019, gsoc2019, statistics, streams
>
> The new component aims to be a library of commons statistics functions 
> synchronized with the latest developments in the Java language, in particular 
> Java's functional programming syntax.
> The library will make commonly used statistical functions available to an end 
> user through a simple grammar comparable to commons-math-statistics or 
> scikit-learn, while under the hood it will implement Java's mapping, streaming, 
> and other producer and consumer functions to ensure the statistical methods 
> run optimally in new Java implementations.
> As functional programming grows increasingly central to big data applications 
> we believe these libraries will play an important function in the data 
> engineering ecosystem. In particular, data engineering is widely done with 
> Java, then passed to other languages for data-scientific analyses; however, 
> the common availability of functionally implemented statistical mapping and 
> reductions in Java could prove very useful at the interface of data science 
> and engineering, by enabling teams to more easily perform reductions on the 
> engineering side before handing off to the analysis side.
> Developers working on the project will have the opportunity to demonstrate 
> Java programming, functional programming, algorithm design, and data science 
> skills and receive authorship on a commons project that is likely to be 
> widely used.
> The ideal contributor will also be able to help with important architectural 
> decision making. The old source of these libraries, commons-math, grew too 
> large, hierarchically complex and interdependent for the commons mission. The 
> developers on this project need to make architectural choices that will 
> enable the statistical code to be lightweight and reusable, with a minimum of 
> outside dependencies while avoiding redundancy.
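
For illustration only, here is a minimal sketch of what a "stream-based" statistical reduction, as described in the issue, could look like. It is not the project's design: the class name RunningStats and all of its methods are hypothetical, and the choice of Welford's update with a pairwise merge step is an assumption made for this example. It only relies on the standard java.util.stream API.

```java
import java.util.stream.Collector;
import java.util.stream.DoubleStream;

/** Hypothetical accumulator for illustration; not part of any Commons API. */
final class RunningStats {
    private long n;
    private double mean;
    private double m2; // running sum of squared deviations from the mean

    /** Adds one value using Welford's single-pass update. */
    void accept(double x) {
        n++;
        double delta = x - mean;
        mean += delta / n;
        m2 += delta * (x - mean);
    }

    /** Merges another partial result into this one (needed for parallel streams). */
    RunningStats combine(RunningStats other) {
        if (other.n == 0) {
            return this;
        }
        if (n == 0) {
            n = other.n;
            mean = other.mean;
            m2 = other.m2;
            return this;
        }
        long total = n + other.n;
        double delta = other.mean - mean;
        m2 += other.m2 + delta * delta * ((double) n * other.n / total);
        mean += delta * other.n / total;
        n = total;
        return this;
    }

    long count() { return n; }

    double mean() { return mean; }

    /** Unbiased sample variance; NaN when fewer than two values were seen. */
    double sampleVariance() { return n > 1 ? m2 / (n - 1) : Double.NaN; }

    /** Collector for boxed streams, e.g. Stream<Double>. */
    static Collector<Double, RunningStats, RunningStats> collector() {
        return Collector.of(RunningStats::new, RunningStats::accept, RunningStats::combine);
    }

    /** Convenience reduction over a primitive stream. */
    static RunningStats of(DoubleStream values) {
        return values.collect(RunningStats::new, RunningStats::accept, RunningStats::combine);
    }
}
```

With such an accumulator, a caller could write something like RunningStats.of(DoubleStream.of(2, 4, 4, 4, 5, 5, 7, 9)).sampleVariance(), or pass RunningStats.collector() to Stream.collect. Because partial results can be merged, the same reduction would also work on a parallel stream.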



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
