[ 
https://issues.apache.org/jira/browse/STATISTICS-7?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16803070#comment-16803070
 ] 

Eric Barnhill commented on STATISTICS-7:
----------------------------------------

[~Salman] First of all, I'll be delighted to read over the proposal; this is 
happening right now in some other projects.

It seems to me the time frame depends entirely on the skill set of the 
applicant. But I would set aside some time for each of the following in the 
proposal:
 * Project design – choose which stat functions you are interested in developing
 * Dependency analysis – what other classes do the stat functions currently 
depend on in commons-math and what is the purpose of those functions? What 
changes need to be made in order to create a standalone statistical component? 
For example, is the method depending on abstract classes, exception classes, 
formatting classes? Are these dependencies necessary or can they be 
restructured to be more standalone? Whichever decision creates the best 
architecture is best, of course.
 * New component architecture - what will be the class hierarchy of the 
redesigned component?
 * Class design - do I want the user to create an object instance, or call a 
static method, to use this functionality? Do I want some enums to handle 
parameters of various kinds? What kinds of inputs should it take? How do I 
handle different data types? What are the outputs?
 * Unit test design - creating a property-driven unit testing scheme
 * Algorithm validation – is there code in other languages that I can use to 
validate test inputs?
 * Documentation - what doc will go with each class and method? What additional 
doc should be in the user guides? Ideally the project could conclude with a 
couple of tutorials implementing working examples.
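
To make the class design questions above concrete, here is a minimal sketch of 
one possible answer for a single statistic. Everything in it is hypothetical: 
the names VarianceSketch, Bias, of, evaluate and variance are made up for 
illustration and are not part of commons-math or Commons Statistics. It shows 
an enum handling a parameter, an object instance that is configured once, and 
a static convenience method that delegates to it.

```java
import java.util.stream.DoubleStream;

/** Hypothetical sketch only; these names do not exist in any Commons component. */
final class VarianceSketch {

    /** Enum parameter: which divisor the variance uses. */
    enum Bias {
        SAMPLE,     // divide by n - 1
        POPULATION  // divide by n
    }

    private final Bias bias;

    private VarianceSketch(Bias bias) {
        this.bias = bias;
    }

    /** Instance-style entry point: configure once, then reuse. */
    static VarianceSketch of(Bias bias) {
        return new VarianceSketch(bias);
    }

    /** Static-style entry point: sample variance with no configuration. */
    static double variance(double... values) {
        return of(Bias.SAMPLE).evaluate(values);
    }

    /** Two-pass variance; returns NaN when there are too few values. */
    double evaluate(double... values) {
        int n = values.length;
        if (n == 0 || (bias == Bias.SAMPLE && n == 1)) {
            return Double.NaN;
        }
        double mean = DoubleStream.of(values).average().orElse(Double.NaN);
        double sumSq = DoubleStream.of(values)
                .map(v -> (v - mean) * (v - mean))
                .sum();
        return sumSq / (bias == Bias.SAMPLE ? n - 1 : n);
    }
}
```

A caller could then write either VarianceSketch.variance(data) or 
VarianceSketch.of(VarianceSketch.Bias.POPULATION).evaluate(data). A 
property-driven test for this class might check, for example, that adding a 
constant to every input leaves the result unchanged up to rounding.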

> Stream-based Java statistical processing
> ----------------------------------------
>
>                 Key: STATISTICS-7
>                 URL: https://issues.apache.org/jira/browse/STATISTICS-7
>             Project: Apache Commons Statistics
>          Issue Type: New Feature
>            Reporter: Eric Barnhill
>            Priority: Major
>              Labels: GSoC2019, gsoc2019, statistics, streams
>
> The new component aims to be a library of commons statistics functions 
> synchronized with the latest developments in the Java language, in particular 
> Java's functional programming syntax.
> The library will make commonly used statistical functions available to an end 
> user through a simple grammar comparable to commons-math-statistics or 
> scikit-learn, while under the hood will implement Java's mapping, streaming, 
> and other producer and consumer functions to ensure the statistical methods 
> run optimally in new Java implementations.
> Developers working on the project will have the opportunity to demonstrate 
> Java programming, functional programming, algorithm design, and data science 
> skills and receive authorship on a commons project that is likely to be 
> widely used.
> The ideal contributor will also be able to help with important architectural 
> decision making. The old source of these libraries, commons-math, grew too 
> large, hierarchically complex and interdependent for the commons mission. The 
> developers on this project need to make architectural choices that will 
> enable the statistical code to be lightweight and reusable, with a minimum of 
> outside dependencies while avoiding redundancy.
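
As a rough illustration of what "stream-based" could mean for the description 
quoted above, here is a minimal sketch of a statistic exposed as a 
java.util.stream.Collector. The class MomentsSketch and its methods are 
hypothetical names chosen for this example, not a proposed API. The sketch 
assumes Welford's online update for the per-element step and the standard 
pairwise merge for the combiner, so that the same collector can be used with 
sequential and parallel streams.

```java
import java.util.stream.Collector;

/** Hypothetical sketch only; not an existing Commons Statistics class. */
final class MomentsSketch {
    private long n;       // number of values seen
    private double mean;  // running mean
    private double m2;    // running sum of squared deviations from the mean

    /** Consumer step: Welford's online update for one value. */
    void accept(double x) {
        n++;
        double delta = x - mean;
        mean += delta / n;
        m2 += delta * (x - mean);
    }

    /** Combiner step: merge another partial result into this one. */
    MomentsSketch combine(MomentsSketch other) {
        if (other.n == 0) {
            return this;
        }
        if (n == 0) {
            n = other.n;
            mean = other.mean;
            m2 = other.m2;
            return this;
        }
        long total = n + other.n;
        double delta = other.mean - mean;
        // Update m2 first: it needs the counts and means from before the merge.
        m2 += other.m2 + delta * delta * ((double) n * other.n / total);
        mean += delta * other.n / total;
        n = total;
        return this;
    }

    long count() {
        return n;
    }

    double mean() {
        return n == 0 ? Double.NaN : mean;
    }

    double sampleVariance() {
        return n < 2 ? Double.NaN : m2 / (n - 1);
    }

    /** Exposes the accumulator as a Collector over boxed doubles. */
    static Collector<Double, MomentsSketch, MomentsSketch> collector() {
        return Collector.of(MomentsSketch::new, MomentsSketch::accept, MomentsSketch::combine);
    }
}
```

With this in place, a call such as 
Stream.of(1.0, 2.0, 3.0, 4.0).collect(MomentsSketch.collector()) yields an 
object whose mean() and sampleVariance() can be queried afterwards. Whether a 
boxed Collector or a primitive DoubleStream-based design is preferable is 
exactly the kind of architectural choice the proposal would need to argue.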



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
