It means the aim is to remove the following things *from Mahout Math*:
- Lanczos (use SSVD instead)
- Hadoop entropy stuff in org.apache.mahout.math.stats.entropy
2013/7/25 Dmitriy Lyubimov <[email protected]>

> On Thu, Jul 25, 2013 at 6:44 AM, Suneel Marthi <[email protected]> wrote:
>
> > With Isabel's help, updated the 0.8 Release notes on the Wiki and below is
> > the text version of the Release notes.
> >
> > Checkout the Wiki version at
> >
> > https://cwiki.apache.org/confluence/display/MAHOUT/Release+0.8
> >
> > -------------------------------------------
> >
> > The Apache Mahout PMC is pleased to announce the release of Mahout 0.8.
> > Mahout's goal is to build scalable machine learning libraries focused
> > primarily in the areas of collaborative filtering (recommenders),
> > clustering and classification (known as the "3Cs"), as well as the
> > necessary infrastructure to support those implementations including, but
> > not limited to, math packages for statistics, linear algebra and others
> > as well as Java primitive collections, local and distributed vector and
> > matrix classes and a variety of integrative code to work with popular
> > packages like Apache Hadoop, Apache Lucene, Apache HBase, Apache
> > Cassandra and much more. The 0.8 release is mainly a clean up release in
> > preparation for an upcoming 1.0 release, but there are several
> > significant new features, which are highlighted below.
> >
> > To get started with Apache Mahout 0.8, download the release artifacts and
> > signatures at http://www.apache.org/dyn/closer.cgi/mahout. The examples
> > directory contains several working examples of the core
> > functionality available in Mahout. These can be run via scripts in the
> > examples/bin directory. Most examples do not need a Hadoop cluster in
> > order to run.
> >
> > Please pay attention to the section labelled FUTURE PLANS below for more
> > information about upcoming releases of Mahout.
> >
> > As with any release, we wish to thank all of the users and contributors
> > to Mahout. Please see the CHANGELOG [1] and JIRA Release Notes [2] for
> > individual credits, as there are too many to list here.
> >
> > RELEASE HIGHLIGHTS
> >
> > The highlights of the Apache Mahout 0.8 release include, but are not
> > limited to the list below. For further information, see the included
> > CHANGELOG file.
> >
> > - Numerous performance improvements to Vector and Matrix
> >   implementations, API's and their iterators (see also MAHOUT-1192,
> >   MAHOUT-1202)
> > - Numerous performance improvements to the recommender implementations
> >   (see also MAHOUT-1272, MAHOUT-1035, MAHOUT-1042, MAHOUT-1151,
> >   MAHOUT-1166, MAHOUT-1167, MAHOUT-1169, MAHOUT-1205, MAHOUT-1264)
> > - MAHOUT-1088: Support for biased item-based recommender
> > - MAHOUT-1089: SGD matrix factorization for rating prediction with user
> >   and item biases
> > - MAHOUT-1106: Support for SVD++
> > - MAHOUT-944: Support for converting one or more Lucene storage indexes
> >   to SequenceFiles as well as an upgrade of the supported Lucene version
> >   to Lucene 4.3.1.
> > - MAHOUT-1154 and friends: New streaming k-means implementation that
> >   offers on-line (and fast) clustering
> > - MAHOUT-833: Make conversion to SequenceFiles Map-Reduce, 'seqdirectory'
> >   can now be run as a MapReduce job.
> > - MAHOUT-1052: Add an option to MinHashDriver that specifies the dimension
> >   of vector to hash (indexes or values).
> > - MAHOUT-884: Matrix Concat utility, presently only concatenates two
> >   matrices.
> > - MAHOUT-1244: Upgraded to use Lucene 4.3
> > - MAHOUT-1187: Upgraded to CommonsLang3
> > - MAHOUT-916: Speedup the Mahout build by making tests run in parallel.
> > - The usual bug fixes. See JIRA [2] for more
> >   information on the 0.8 release.
> >
> > A total of 218 separate JIRA issues are addressed in this release.
> >
> > CONTRIBUTING
> >
> > Mahout is always looking for contributions focused on the 3Cs. If you are
> > interested in contributing, please see our
> > https://cwiki.apache.org/MAHOUT/how-to-contribute.html on the Mahout wiki
> > or contact us via email at [email protected].
> >
> > FUTURE PLANS
> >
> > 0.9
> >
> > As the project moves towards a 1.0 release, the community is working to
> > clean up and/or remove parts of the code base that are under-supported
> > or that underperform as well as to better focus the energy and
> > contributions on key algorithms that are proven to scale in production
> > and have seen wide-spread adoption. To this end, in the next release,
> > the project is planning on removing support for the following algorithms
> > unless there is sustained support and improvement of them before the
> > next release.
> >
> > The algorithms to be removed are:
> > - From Clustering:
> >     Dirichlet
> >     MeanShift
> >     MinHash
> >     Eigencuts
> > - From Classification (both are sequential implementations)
> >     Winnow
> >     Perceptron
> > - Frequent Pattern Mining
> > - Collaborative Filtering
> >     All recommenders in org.apache.mahout.cf.taste.impl.recommender.knn
> >     SlopeOne implementations in org.apache.mahout.cf.taste.hadoop.slopeone and
> >     org.apache.mahout.cf.taste.impl.recommender.slopeone
> >     Distributed pseudo recommender in org.apache.mahout.cf.taste.hadoop.pseudo
> >     TreeClusteringRecommender in org.apache.mahout.cf.taste.impl.recommender
> > - Mahout Math
>
> What does it mean -- remove "Mahout Math"?
>
> >     Lanczos in favour of SSVD
> >     Hadoop entropy stuff in org.apache.mahout.math.stats.entropy
> >
> > If you are interested in supporting 1 or more of these algorithms, please
> > make it known on [email protected] and via JIRA issues that fix
> > and/or improve them. Please also provide
> > supporting evidence as to their effectiveness for you in production.
> >
> > 1.0 PLANS
> >
> > Our plans as a community are to focus 0.9 on cleanup of bugs and the
> > removal of the code mentioned above and then to follow with a 1.0
> > release soon thereafter, at which point the community is committing to
> > the support of the algorithms packaged in the 1.0 for at least two minor
> > versions after their release. In the case of removal, we will deprecate
> > the functionality in the 1.(x+1) minor release and remove it in the
> > 1.(x+2) release. For instance, if feature X is to be removed after the
> > 1.2 release, it will be deprecated in 1.3 and removed in 1.4.
> >
> > [1] http://svn.apache.org/viewvc/mahout/trunk/CHANGELOG?revision=1501110&view=markup
> > [2] https://issues.apache.org/jira/issues/?jql=project%20%3D%20MAHOUT%20AND%20fixVersion%20%3D%20%220.8%22
> >
> > ________________________________
> > From: Grant Ingersoll <[email protected]>
> > To: "[email protected]" <[email protected]>
> > Sent: Wednesday, July 24, 2013 7:51 AM
> > Subject: 0.8
> >
> > 0.8 artifacts are pushed to the mirror location. I will send an official
> > announcement tomorrow.
> >
> > In the meantime, please review the release notes at:
> > https://cwiki.apache.org/confluence/display/MAHOUT/Release+0.8
> >
> > The new features/fixes section is pretty weak.
> >
> > -Grant
