[ 
https://issues.apache.org/jira/browse/STATISTICS-7?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16804625#comment-16804625
 ] 

Ben Nguyen commented on STATISTICS-7:
-------------------------------------

Hello,

I am a second-year computer science and economics student who has taken some 
higher-level econometrics courses, and I'm very interested in working with the 
stat.regression library (I can see myself using it a lot in the near future). 
Looking through it briefly, I do see the problem: the implementation could be 
more intuitive to use. I am in the process of drafting a proposal, which led 
me to wonder how to approach the dependency issue: to what extent should the 
new library be 'standalone'? For example, should it still depend on 
math.linear (especially needed in OLS and GLS), since linear won't be getting 
an upgrade anytime soon? (I have not looked into math.linear yet.) Or should 
matrices be implemented internally within the new stats library? Repeating 
code is generally bad, though there may be the benefit of 'zero' dependencies 
and therefore better maintainability. I guess the same question applies to the 
other dependencies too (math.exceptions, util). In short: to what extent 
should this new library be standalone?
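
To make the 'zero dependencies' end of that trade-off concrete, here is a 
minimal sketch of single-regressor OLS that uses nothing beyond the JDK. The 
class name SimpleOls and its fit method are hypothetical and are not part of 
commons-math or commons-statistics; this is only an illustration under that 
assumption, not a proposed API.

```java
import java.util.Arrays;

/** Hypothetical sketch: single-regressor OLS with no dependency beyond the JDK. */
public final class SimpleOls {

    private SimpleOls() { }

    /**
     * Fits y = intercept + slope * x by ordinary least squares.
     *
     * @return a two-element array {intercept, slope}
     */
    public static double[] fit(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need at least two paired observations");
        }
        // Sample means computed with the JDK stream API.
        final double meanX = Arrays.stream(x).average().getAsDouble();
        final double meanY = Arrays.stream(y).average().getAsDouble();

        // Centered cross-product and sum of squares.
        double sxy = 0;
        double sxx = 0;
        for (int i = 0; i < x.length; i++) {
            final double dx = x[i] - meanX;
            sxy += dx * (y[i] - meanY);
            sxx += dx * dx;
        }
        if (sxx == 0) {
            throw new IllegalArgumentException("x has zero variance");
        }
        final double slope = sxy / sxx;
        final double intercept = meanY - slope * meanX;
        return new double[] {intercept, slope};
    }
}
```

The single-regressor case needs no matrices at all. The dependency question 
only really bites for multiple regressors and GLS, where the normal equations 
(or a QR decomposition) have to be solved and either math.linear or an 
internal matrix implementation would be required.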

 

> Stream-based Java statistical processing
> ----------------------------------------
>
>                 Key: STATISTICS-7
>                 URL: https://issues.apache.org/jira/browse/STATISTICS-7
>             Project: Apache Commons Statistics
>          Issue Type: New Feature
>            Reporter: Eric Barnhill
>            Priority: Major
>              Labels: GSoC2019, gsoc2019, statistics, streams
>
> The new component aims to be a library of commons statistics functions 
> synchronized with the latest developments in the Java language, in particular 
> Java's functional programming syntax.
> The library will make commonly used statistical functions available to an end 
> user through a simple grammar comparable to commons-math-statistics or 
> scikit-learn, while under the hood will implement Java's mapping, streaming, 
> and other producer and consumer functions to ensure the statistical methods 
> run optimally in new Java implementations.
> Developers working on the project will have the opportunity to demonstrate 
> Java programming, functional programming, algorithm design, and data science 
> skills and receive authorship on a commons project that is likely to be 
> widely used.
> The ideal contributor will also be able to help with important architectural 
> decision making. The old source of these libraries, commons-math, grew too 
> large, hierarchically complex and interdependent for the commons mission. The 
> developers on this project need to make architectural choices that will 
> enable the statistical code to be lightweight and reusable, with a minimum of 
> outside dependencies while avoiding redundancy.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
