[ https://issues.apache.org/jira/browse/MAHOUT-1000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sebastian Schelter resolved MAHOUT-1000.
----------------------------------------
Resolution: Fixed
Closing this as it hasn't been picked up for several months.
> Implementation of Single Sample T-Test using Map Reduce/Mahout
> --------------------------------------------------------------
>
> Key: MAHOUT-1000
> URL: https://issues.apache.org/jira/browse/MAHOUT-1000
> Project: Mahout
> Issue Type: New Feature
> Components: Math
> Affects Versions: Backlog
> Environment: Linux, Mac OS, Hadoop 0.20.2, Mahout 0.x
> Reporter: Dev Lakhani
> Labels: newbie
> Fix For: Backlog
>
> Original Estimate: 672h
> Remaining Estimate: 672h
>
> Implement a map/reduce version of the single sample t-test, which tests
> whether a sample of n subjects comes from a population whose mean equals a
> particular value.
> For a large dataset, say millions of rows, one can test whether the sample
> (large as it is) is consistent with the specified population mean.
> Input:
> 1) specified population mean to be tested against
> 2) hypothesis direction, i.e. "two.sided", "less" or "greater"
> 3) confidence level or alpha
> 4) flag to indicate paired or not paired
> The procedure is as follows:
> 1. Use Map/Reduce to calculate the mean of the sample.
> 2. Use Map/Reduce to calculate the standard error of the sample mean.
> 3. Use Map/Reduce to calculate the t statistic
> 4. Determine the degrees of freedom (n - 1 for a single sample; the question
> of equal sample variances only arises when two samples are compared).
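
The procedure above can be sketched in plain Python, with no Hadoop or Mahout involved. This is only an illustration under stated assumptions: the names Partial, map_partition, merge and t_statistic are hypothetical and not part of Mahout, and the sketch folds steps 1 to 3 into a single pass by having each mapper emit a mergeable summary (count, mean, sum of squared deviations) instead of running one job per step.

```python
from dataclasses import dataclass
from functools import reduce
import math


@dataclass
class Partial:
    """Mergeable summary of one input split (hypothetical helper type)."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the running mean


def map_partition(values):
    """Mapper/combiner side: summarise one split with Welford's update."""
    p = Partial()
    for x in values:
        p.n += 1
        delta = x - p.mean
        p.mean += delta / p.n
        p.m2 += delta * (x - p.mean)
    return p


def merge(a, b):
    """Reducer side: combine two summaries without revisiting the data."""
    if a.n == 0:
        return b
    if b.n == 0:
        return a
    n = a.n + b.n
    delta = b.mean - a.mean
    mean = a.mean + delta * b.n / n
    m2 = a.m2 + b.m2 + delta * delta * a.n * b.n / n
    return Partial(n, mean, m2)


def t_statistic(total, mu0):
    """Return (t, degrees of freedom) for H0: population mean == mu0."""
    if total.n < 2:
        raise ValueError("need at least two observations")
    df = total.n - 1
    variance = total.m2 / df                 # unbiased sample variance
    std_err = math.sqrt(variance / total.n)  # standard error of the mean
    return (total.mean - mu0) / std_err, df


# Three made-up "input splits" standing in for HDFS blocks.
splits = [[5.1, 4.9, 5.6], [5.8, 6.0], [5.4, 5.2, 5.7]]
total = reduce(merge, (map_partition(s) for s in splits), Partial())
t, df = t_statistic(total, mu0=5.0)
print(f"t = {t:.4f}, df = {df}")
```

Because merge is associative, the same function could serve as both combiner and reducer; whether that maps cleanly onto the Hadoop 0.20.2 API named in the issue is not something this sketch verifies.
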
> Output:
> 1) The value of the t-statistic.
> 2) The p-value for the test.
> 3) Flag that is true if the null hypothesis can be rejected with confidence
> 1 - alpha; false otherwise.
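
The second and third outputs could be derived from the t statistic and degrees of freedom roughly as follows. This is a sketch using only the Python standard library; t_pdf, t_cdf, p_value and reject_null are hypothetical names, and the CDF is approximated by Simpson's rule rather than by the incomplete beta function a production library would normally use, so accuracy in the far tails is limited.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution, computed in log space."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(t, df, steps=2000):
    """Approximate P(T <= t) by composite Simpson's rule on [0, |t|]."""
    a = abs(t)
    if a == 0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    upper = 0.5 + total * h / 3  # symmetry: P(T <= |t|)
    return upper if t >= 0 else 1 - upper


def p_value(t, df, alternative="two.sided"):
    """p-value for the hypothesis directions listed under Input."""
    cdf = t_cdf(t, df)
    if alternative == "less":
        return cdf
    if alternative == "greater":
        return 1 - cdf
    if alternative == "two.sided":
        return 2 * min(cdf, 1 - cdf)
    raise ValueError(f"unknown alternative: {alternative}")


def reject_null(t, df, alpha=0.05, alternative="two.sided"):
    """True if the null hypothesis is rejected at significance level alpha."""
    return p_value(t, df, alternative) < alpha
```

For example, reject_null(t, df, alpha=0.05, alternative="greater") would produce the flag for a one-sided test, given t and df from whatever job computed them.
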
> References
> http://www.basic.nwu.edu/statguidefiles/ttest_unpaired_ass_viol.html