Hi,

Thanks for your interest in Apache Commons.

The GSoC project for Statistics is part of the ongoing project to
refactor the large Commons Math (CM) component into smaller modular
components (see [1-5]).

I have CC'd the Commons developers' list on this e-mail. If you
subscribe, you will be able to follow all the GSoC discussion by
searching message subjects for the GSoC tag.

The suggested project, STATISTICS-54 ([6]), is to develop the various
univariate statistics in CM for use in Java 8 streams. You can see the
statistics in the latest javadoc for CM ([7]); the relevant packages
are under 'descriptive'. A good starting point would be to look at the
storeless statistics such as mean, variance and moments, as well as
the summary statistics classes, which group together more than one
statistic. The project would be to develop an API that complements the
JDK's SummaryStatistics classes (see [8]) for double, long and int. In
general, a collector for a stream has to be able both to accept a
single value and to be combined with another collector to create an
aggregate, e.g.:

Mean.add(double)
Mean.add(Mean)

This is to allow parallel stream support.
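
As a rough illustration of that accept-and-combine pattern, here is a
minimal sketch. It is not the existing CM code: the class body, the
getMean() accessor and the static of() helper are assumptions made up
for this example.

```java
import java.util.stream.DoubleStream;

/** Hypothetical running mean; a sketch, not the CM implementation. */
final class Mean {
    private long n;       // number of values seen so far
    private double mean;  // running mean of those values

    /** Accept a single value (incremental update of the mean). */
    void add(double x) {
        n++;
        mean += (x - mean) / n;
    }

    /** Combine with another partial result, e.g. from a parallel stream. */
    void add(Mean other) {
        if (other.n == 0) {
            return; // nothing to merge
        }
        long total = n + other.n;
        // Weighted update: shift this mean towards the other mean.
        mean += (other.mean - mean) * other.n / total;
        n = total;
    }

    /** Returns NaN when no values have been added (an assumption). */
    double getMean() {
        return n == 0 ? Double.NaN : mean;
    }

    /** Assumed helper showing use as a mutable reduction over a stream. */
    static double of(double... values) {
        return DoubleStream.of(values)
                .parallel()
                .collect(Mean::new, (m, x) -> m.add(x), (a, b) -> a.add(b))
                .getMean();
    }
}
```

The two add methods map directly onto the accumulator and combiner
arguments of DoubleStream.collect, which is what lets the stream
split the work across threads and merge the partial results.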

Currently the JDK only offers a summary containing min, max, count,
average and sum. Extending this would mean developing aggregator
classes for individual statistics, plus some type of generic
aggregator class that can be constructed to summarise the statistics
of interest, e.g. mean and standard deviation; the choice of
statistics could be user-configurable.
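
One possible shape for such a configurable aggregator is sketched
below. Everything here is hypothetical (the Statistic enum, the
StatisticsSummary class and its methods); it uses the standard
incremental update for the mean and the sum of squared deviations,
and is only meant to show how selection, accumulation and combination
could fit together.

```java
import java.util.EnumSet;
import java.util.Set;
import java.util.stream.DoubleStream;

/** Hypothetical names for the statistics a caller can request. */
enum Statistic { MEAN, VARIANCE, STANDARD_DEVIATION }

/** Hypothetical aggregator; not an existing CM or JDK class. */
final class StatisticsSummary {
    private final Set<Statistic> selected;
    private long n;
    private double mean;
    private double m2; // sum of squared deviations from the mean

    StatisticsSummary(Statistic first, Statistic... rest) {
        this.selected = EnumSet.of(first, rest);
    }

    /** Accept a single value. */
    void accept(double x) {
        n++;
        double delta = x - mean;
        mean += delta / n;
        m2 += delta * (x - mean);
    }

    /** Merge another partial summary into this one. */
    void combine(StatisticsSummary other) {
        if (other.n == 0) {
            return;
        }
        long total = n + other.n;
        double delta = other.mean - mean;
        m2 += other.m2 + delta * delta * ((double) n * other.n / total);
        mean += delta * other.n / total;
        n = total;
    }

    /** Returns a selected statistic; rejects ones that were not requested. */
    double get(Statistic s) {
        if (!selected.contains(s)) {
            throw new IllegalArgumentException("Not configured: " + s);
        }
        double variance = n < 2 ? Double.NaN : m2 / (n - 1); // sample variance
        switch (s) {
            case MEAN:
                return n == 0 ? Double.NaN : mean;
            case VARIANCE:
                return variance;
            default:
                return Math.sqrt(variance);
        }
    }

    /** Assumed helper: summarise an array using a parallel stream. */
    static StatisticsSummary of(double[] values, Statistic first, Statistic... rest) {
        return DoubleStream.of(values).parallel().collect(
                () -> new StatisticsSummary(first, rest),
                StatisticsSummary::accept,
                StatisticsSummary::combine);
    }
}
```

For simplicity this sketch always tracks the same internal moments and
only restricts what can be read back; a real design might instead
compose separate per-statistic aggregators so that unrequested
statistics cost nothing.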

Please take a look at the current code in CM and then ask any
questions, either on the dev mailing list or on the Jira ticket. If
you would like a Jira account so that you can track the GSoC issue,
see [9, 10]. Send your preferred username, alternate username and
display name to priv...@commons.apache.org and we will create an
account for you.

Regards,

Alex

[1] https://commons.apache.org/proper/commons-rng/
[2] https://commons.apache.org/proper/commons-geometry/
[3] https://commons.apache.org/proper/commons-statistics/
[4] https://commons.apache.org/proper/commons-numbers/
[5] https://commons.apache.org/proper/commons-math/
[6] https://issues.apache.org/jira/browse/STATISTICS-54
[7] https://commons.apache.org/proper/commons-math/javadocs/api-4.0-beta1/index.html
[8] https://docs.oracle.com/javase/8/docs/api/java/util/DoubleSummaryStatistics.html
[9] https://infra.apache.org/jira-guidelines.html
[10] https://issues.apache.org/jira/secure/Dashboard.jspa

On Fri, 24 Feb 2023 at 21:03, Md Tanvir Alfesani
<tanviralfesani3...@gmail.com> wrote:
>
> I hope this email finds you well. My name is Md Tanvir Alfesani and I'm a 
> student who is interested in contributing to Apache Foundation's project 
> 'Summary Statistics API for JAVA 8 Streams', for Google Summer of Code 2023.
>
> As I was going through the project idea, I realized that I need to learn more 
> about the project. I'm particularly interested in the functionalities of the 
> Common Statistics Library and how to access them to get a good idea about the 
> aforesaid project. I would appreciate any advice or resources you could 
> provide to help me prepare for the project.
>
> Thank you for taking the time to read my email. I am looking forward to 
> hearing from you and hopefully working together on the project.
>
> Best regards,
> Md Tanvir Alfesani
