BBenNguyenn opened a new pull request #23: GSoC Milestone 1: OLS Regression 
Complete
URL: https://github.com/apache/commons-statistics/pull/23
 
 
   **OLS functionality complete:**
   - Converted all math4.linear dependencies to EJML (more complicated and 
time-consuming than expected).
   -> GLS and logistic regression should be easier now, given the experience 
gained while porting OLS.
   - Ensured full unit test coverage by porting all the old OLS tests and 
adding some new ones.
   - Created a preliminary RegressionResults interface, which holds the 
calculated results of a regression (calculated once, accessed multiple 
times).
   -> Note: preliminary usage is in testSwissFertilityInterfaceFormat() in 
OLSRegressionTest; more to be added.
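
   To illustrate the "calculated once, accessed multiple times" idea, here is a minimal sketch. It is not the RegressionResults interface from this PR: `FitResults`, `SimpleFit` and the getter names are hypothetical, and it uses closed-form least squares for a single predictor rather than the EJML-based matrix code.

```java
/** Hypothetical read-only view of a fit; not the interface from this PR. */
interface FitResults {
    double getIntercept();
    double getSlope();
    double getResidualSumOfSquares();
}

/** Immutable holder: every value is computed once in fit() and only read afterwards. */
final class SimpleFit implements FitResults {
    private final double intercept;
    private final double slope;
    private final double rss;

    private SimpleFit(double intercept, double slope, double rss) {
        this.intercept = intercept;
        this.slope = slope;
        this.rss = rss;
    }

    /** Closed-form least squares for y = intercept + slope * x; assumes x is not constant. */
    static SimpleFit fit(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need at least 2 paired observations");
        }
        final int n = x.length;
        double meanX = 0;
        double meanY = 0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;
        double sxy = 0;
        double sxx = 0;
        for (int i = 0; i < n; i++) {
            final double dx = x[i] - meanX;
            sxy += dx * (y[i] - meanY);
            sxx += dx * dx;
        }
        final double slope = sxy / sxx;
        final double intercept = meanY - slope * meanX;
        double rss = 0;
        for (int i = 0; i < n; i++) {
            final double r = y[i] - (intercept + slope * x[i]);
            rss += r * r;
        }
        return new SimpleFit(intercept, slope, rss);
    }

    @Override public double getIntercept() { return intercept; }
    @Override public double getSlope() { return slope; }
    @Override public double getResidualSumOfSquares() { return rss; }
}
```

   Because the holder is immutable, callers can read the getters as often as they like without triggering any recomputation.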
   
   **Known Code Smell:** math4.stat dependency
   Dependency usage: StatUtil, SumOfSquares, Variance, SecondMoment
   **Explanation:**
   This dependency is temporary until Statistics Descriptive provides 
array-input methods for the functionality of the classes above, which is 
said to be coming soon.
   I considered helping Virendra with it so that all old dependencies could 
be removed in this milestone, but I don't think I should interfere while my 
own component is incomplete: I would also have to learn how to use streams 
properly, and it sounds like Virendra will be done soon anyway.
   Once Virendra is done, the switch will be quick, since only about 3 
methods in total use that functionality.
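
   For reference, the array-input functionality in question is small. The sketch below is plain Java and is not the Statistics Descriptive API, which does not exist yet in this form. `ArrayStats` and its method names are hypothetical, and the definitions are assumptions: sum of squares of the raw values, second moment as the sum of squared deviations from the mean, and bias-corrected (n - 1) variance.

```java
/** Hypothetical stand-in for the array-input statistics used by the regression code. */
final class ArrayStats {
    private ArrayStats() {}

    /** Sum of the squared values (not deviations). */
    static double sumOfSquares(double[] values) {
        double sum = 0;
        for (double v : values) {
            sum += v * v;
        }
        return sum;
    }

    /** Sum of squared deviations from the mean, computed in two passes. */
    static double secondMoment(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("empty input");
        }
        double mean = 0;
        for (double v : values) {
            mean += v;
        }
        mean /= values.length;
        double m2 = 0;
        for (double v : values) {
            final double d = v - mean;
            m2 += d * d;
        }
        return m2;
    }

    /** Bias-corrected sample variance: second moment divided by (n - 1). */
    static double variance(double[] values) {
        if (values.length < 2) {
            throw new IllegalArgumentException("need at least 2 values");
        }
        return secondMoment(values) / (values.length - 1);
    }
}
```

   Keeping these calls behind one small helper like this would mean the later switch only touches a single class.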
   
   **Known Code Smell:** Data loading is perhaps not ideal
   **Explanation:**
   The current RegressionDataLoader stores the input data within a 
RegressionRawData object and passes on an interface with a getter.
   This should be improved by using one of the strategies suggested on the 
mailing list.
   I will get to this later this week, or possibly after the GLS port.
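
   One possible shape for this, as a rough sketch only: a read-only interface with getters, backed by an immutable holder that is created through a static factory. `RegressionInput` and `ArrayRegressionInput` are hypothetical names, not the classes in this PR, and this is not necessarily one of the strategies suggested on the mailing list.

```java
/** Hypothetical read-only view of regression input; names are illustrative only. */
interface RegressionInput {
    double[] getY();
    double[][] getX();
}

/** Immutable holder created through a static factory that validates dimensions. */
final class ArrayRegressionInput implements RegressionInput {
    private final double[] y;
    private final double[][] x;

    private ArrayRegressionInput(double[] y, double[][] x) {
        this.y = y;
        this.x = x;
    }

    /** Copies the arrays so later changes by the caller are not visible here. */
    static RegressionInput of(double[] y, double[][] x) {
        if (y.length == 0 || y.length != x.length) {
            throw new IllegalArgumentException("y and x must have the same non-zero row count");
        }
        final int columns = x[0].length;
        final double[][] xCopy = new double[x.length][];
        for (int i = 0; i < x.length; i++) {
            if (x[i].length != columns) {
                throw new IllegalArgumentException("ragged design matrix at row " + i);
            }
            xCopy[i] = x[i].clone();
        }
        return new ArrayRegressionInput(y.clone(), xCopy);
    }

    @Override
    public double[] getY() {
        return y.clone();
    }

    @Override
    public double[][] getX() {
        // Return a deep copy so callers cannot modify the stored rows.
        final double[][] copy = new double[x.length][];
        for (int i = 0; i < x.length; i++) {
            copy[i] = x[i].clone();
        }
        return copy;
    }
}
```

   The defensive copies trade some memory for safety; a loader for large data sets might prefer to document that the arrays must not be modified instead.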
   
   **Next Objectives:**
   - Improve data loading strategy
   -> as suggested, a proper Factory pattern model
   - Finalize the RegressionResults interface that OLS and the other 
regressions will output
   -> Summary statistics printout method?
   - Port GLS (expected to take less time than OLS)
   - Start LogisticRegression implementation design
   
   **PLEASE NOTE:**
   - I've created a UML diagram, "UML_current.png", in the README directory 
if anyone thinks a visual would be helpful.
   - Full commit history (before squashing) is in the 
STATISTICS-8_Regression_Module branch.
   
   Thank you for your review,
   -Ben Nguyen
