[ANNOUNCE] Apache Mahout 0.1 Released

2009-04-07 Thread Grant Ingersoll
The Apache Lucene project is pleased to announce the release of Apache
Mahout 0.1.

Apache Mahout is a subproject of Apache Lucene with the goal of
delivering scalable machine learning algorithm implementations under the
Apache license. The first public release includes implementations for
clustering, classification, collaborative filtering and evolutionary
programming.

Highlights include:
1. Taste Collaborative Filtering
2. Several distributed clustering implementations: k-Means, Fuzzy k-Means,
Dirichlet, Mean-Shift and Canopy
3. Distributed Naive Bayes and Complementary Naive Bayes classification
implementations
4. Distributed fitness function implementation for the Watchmaker
evolutionary programming library
5. Most implementations are built on top of Apache Hadoop
(http://hadoop.apache.org) for scalability


The release contents have been pushed out to the main Apache release
site and the m2 ibiblio sync repository.

Apache Mahout 0.1 is the project's first release and is focused on
establishing a baseline release while attracting more contributors.
Details can be found in JIRA:

https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310751&styleName=Html&version=12312976

Apache Mahout is available in source form from the following download page:

http://www.apache.org/dyn/closer.cgi/lucene/mahout/0.1/mahout-0.1-project.tar.gz

Apache Mahout is also available for Maven 2 users via
the Central Maven Repositories:
http://repo1.maven.org/maven2/org/apache/mahout/
http://mirrors.ibiblio.org/pub/mirrors/maven2/org/apache/mahout/

When downloading from a mirror site, please remember to verify the
downloads using signatures found on the Apache site:
http://www.apache.org/dist/lucene/mahout/KEYS
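
As a rough illustration of the verification step, the Python sketch below
builds the two GnuPG command lines typically used for this. It rests on
assumptions not stated in this announcement: that GnuPG is installed, that
the KEYS file has been saved locally, and that the detached signature is
named after the artifact with an ".asc" suffix. The helper name
verification_commands is hypothetical.

```python
def verification_commands(artifact, keys_file="KEYS"):
    """Return the gpg command lines for checking a downloaded artifact.

    Assumption: the detached signature sits next to the artifact and is
    named "<artifact>.asc". Adjust if the release uses another name.
    """
    signature = artifact + ".asc"
    return [
        # Import the release managers' public keys from the KEYS file.
        ["gpg", "--import", keys_file],
        # Check the detached signature against the downloaded artifact.
        ["gpg", "--verify", signature, artifact],
    ]


commands = verification_commands("mahout-0.1-project.tar.gz")
for command in commands:
    print(" ".join(command))
```

Each returned list can be passed to subprocess.run(command, check=True) to
run the check; the sketch only prints the commands so that it has no
side effects.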

For more information on Apache Mahout, visit the project home page:
http://lucene.apache.org/mahout

Apache Solr 3.1.0

2011-03-31 Thread Grant Ingersoll
March 2011, Apache Solr 3.1 available

The Lucene PMC is pleased to announce the release of Apache Solr 3.1.

This release contains numerous bug fixes, optimizations, and
improvements, some of which are highlighted below.  The release is
available for immediate download at 
http://www.apache.org/dyn/closer.cgi/lucene/solr (see note below).
See the CHANGES.txt file included with the release for a full list of
details as well as instructions on upgrading.

What's in a Version? 

The version number for Solr 3.1 was chosen to reflect the merge of
development with Lucene, which is currently also on 3.1.  Going
forward, we expect the Solr version to be the same as the Lucene
version.  Solr 3.1 contains Lucene 3.1 and is the release after Solr 1.4.1.

Solr 3.1 Release Highlights

* Numeric range facets (similar to date faceting).

* New spatial search, including spatial filtering, boosting and sorting 
capabilities.

* Example Velocity driven search UI at http://localhost:8983/solr/browse

* A new termvector-based highlighter

* Extended dismax (edismax) query parser, which addresses some
 missing features in the dismax query parser and adds some
 extensions.

* Several more components now support distributed mode:
 TermsComponent, SpellCheckComponent.

* A new Auto Suggest component.

* Ability to sort by functions.

* JSON document indexing

* CSV response format

* Apache UIMA integration for metadata extraction

* Leverages Lucene 3.1 and its inherent optimizations and bug fixes
 as well as new analysis capabilities.

* Numerous improvements, bug fixes, and optimizations.
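
To illustrate two of the highlights above, numeric range facets and
sorting by function, the following Python sketch builds a query URL. It is
a minimal sketch under assumptions: the /solr/select handler path on the
example server and the field names "price" and "popularity" are
hypothetical and do not come from this announcement. The CHANGES.txt file
and the Solr documentation remain the authoritative parameter reference.

```python
from urllib.parse import urlencode

# Assumed handler path on the example server; adjust for your deployment.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_query_url(field, start, end, gap, sort_function):
    """Build a select URL with a numeric range facet and a function sort."""
    params = [
        ("q", "*:*"),
        ("facet", "true"),
        # Numeric range facet on `field`, bucketed from start to end by gap.
        ("facet.range", field),
        ("f.%s.facet.range.start" % field, str(start)),
        ("f.%s.facet.range.end" % field, str(end)),
        ("f.%s.facet.range.gap" % field, str(gap)),
        # Sort by the value of a function query, highest first.
        ("sort", "%s desc" % sort_function),
    ]
    return SOLR_SELECT_URL + "?" + urlencode(params)


url = build_query_url("price", 0, 100, 10, "sum(popularity,price)")
print(url)
```

The URL is only constructed, not requested, so the sketch runs without a
Solr server.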

Note: The Apache Software Foundation uses an extensive mirroring network for 
distributing releases.  It is possible that the mirror you are using may not 
have replicated the release yet.  If that is the case, please try another 
mirror.  This also goes for Maven access.

Apache Mahout 0.8 Released

2013-07-25 Thread Grant Ingersoll
The Apache Mahout PMC is pleased to announce the release of Mahout 0.8.
Mahout's goal is to build scalable machine learning libraries focused
primarily on collaborative filtering (recommenders), clustering and
classification (known collectively as the 3Cs). The project also provides
the infrastructure needed to support those implementations, including, but
not limited to, math packages for statistics and linear algebra, Java
primitive collections, local and distributed vector and matrix classes,
and a variety of integration code for working with popular packages such
as Apache Hadoop, Apache Lucene, Apache HBase and Apache Cassandra, among
others. The 0.8 release is mainly a cleanup release in preparation for an
upcoming 1.0 release, but it includes several significant new features,
which are highlighted below.

To get started with Apache Mahout 0.8, download the release artifacts and 
signatures at http://www.apache.org/dyn/closer.cgi/mahout or visit the central 
Maven repository. 

In addition to the release highlights and artifacts, please pay attention to 
the section labelled FUTURE PLANS below for more information about upcoming 
releases of Mahout.

As with any release, we wish to thank all of the users and contributors 
to Mahout. Please see the CHANGELOG [1] and JIRA Release Notes [2] for 
individual credits, as there are too many to list here.

GETTING STARTED

In the release package, the examples directory contains several working
examples of the core functionality available in Mahout. These can be run
via scripts in the examples/bin directory and will prompt you for more
information to help you try things out. Most examples do not need a
Hadoop cluster in order to run.

RELEASE HIGHLIGHTS

The highlights of the Apache Mahout 0.8 release include, but are not
limited to, the items below. For further information, see the included
CHANGELOG file.

- Numerous performance improvements to Vector and Matrix
implementations, APIs and their iterators (see also MAHOUT-1192,
MAHOUT-1202)
- Numerous performance improvements to the recommender implementations 
(see also MAHOUT-1272, MAHOUT-1035, MAHOUT-1042, MAHOUT-1151, 
MAHOUT-1166, MAHOUT-1167, MAHOUT-1169, MAHOUT-1205, MAHOUT-1264)
- MAHOUT-1088: Support for biased item-based recommender
- MAHOUT-1089: SGD matrix factorization for rating prediction with user and 
item biases
- MAHOUT-1106: Support for SVD++
- MAHOUT-944: Support for converting one or more Lucene storage indexes 
to SequenceFiles as well as an upgrade of the supported Lucene version 
to Lucene 4.3.1.
- MAHOUT-1154 and friends: New streaming k-means implementation that offers 
on-line (and fast) clustering
- MAHOUT-833: Conversion to SequenceFiles via MapReduce; 'seqdirectory' can
now be run as a MapReduce job.
- MAHOUT-1052: Add an option to MinHashDriver that specifies the dimension of 
vector to hash (indexes or values).
- MAHOUT-884: Matrix Concat utility, presently only concatenates two matrices.
- MAHOUT-1244: Upgraded to use Lucene 4.3
- MAHOUT-1187: Upgraded to CommonsLang3
- MAHOUT-916: Speed up the Mahout build by making tests run in parallel.
- The usual bug fixes. See JIRA [2] for more
information on the 0.8 release.

A total of 218 separate JIRA issues are addressed in this release.
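
To give a feel for what "SGD matrix factorization for rating prediction
with user and item biases" (MAHOUT-1089) refers to, here is a small
pure-Python sketch of the general technique. It is not Mahout's
implementation: the function names, hyperparameters and toy ratings are
all assumptions made for illustration. A rating is predicted as the
global mean plus a user bias, an item bias and the dot product of the
user and item factor vectors, and each observed rating triggers one
stochastic gradient step.

```python
import math
import random


def train_biased_mf(ratings, num_factors=2, epochs=200, lr=0.01, reg=0.02, seed=42):
    """Fit r_ui ~ mu + b_u + b_i + p_u . q_i by SGD; return predict(u, i)."""
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    mu = sum(r for _, _, r in ratings) / len(ratings)  # global mean rating
    b_u = {u: 0.0 for u in users}  # user biases
    b_i = {i: 0.0 for i in items}  # item biases
    p = {u: [rng.gauss(0.0, 0.1) for _ in range(num_factors)] for u in users}
    q = {i: [rng.gauss(0.0, 0.1) for _ in range(num_factors)] for i in items}

    def predict(u, i):
        return mu + b_u[u] + b_i[i] + sum(a * b for a, b in zip(p[u], q[i]))

    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(u, i)
            # Gradient steps with L2 regularization on biases and factors.
            b_u[u] += lr * (err - reg * b_u[u])
            b_i[i] += lr * (err - reg * b_i[i])
            for f in range(num_factors):
                pu, qi = p[u][f], q[i][f]
                p[u][f] += lr * (err * qi - reg * pu)
                q[i][f] += lr * (err * pu - reg * qi)
    return predict


def rmse(predict, ratings):
    """Root mean squared error of predict over (user, item, rating) triples."""
    total = sum((r - predict(u, i)) ** 2 for u, i, r in ratings)
    return math.sqrt(total / len(ratings))


# Hypothetical toy data: (user, item, rating).
RATINGS = [
    ("alice", "x", 5.0), ("alice", "y", 4.0),
    ("bob", "x", 2.0), ("bob", "y", 1.0),
    ("carol", "x", 4.0), ("carol", "y", 3.0),
]

model = train_biased_mf(RATINGS)
print("training RMSE:", rmse(model, RATINGS))
```

The sketch trains and evaluates on the same handful of ratings, so the
printed error only shows that the updates reduce the training error; it
says nothing about the accuracy of the Mahout implementation.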

CONTRIBUTING

Mahout is always looking for contributions focused on the 3Cs. If you are 
interested in contributing, please see our contribution page, 
https://cwiki.apache.org/MAHOUT/how-to-contribute.html, on the Mahout wiki or 
contact us via email at d...@mahout.apache.org.

FUTURE PLANS

0.9

As the project moves towards a 1.0 release, the community is working to
clean up or remove parts of the code base that are under-supported or
that underperform, and to better focus its energy and contributions on
key algorithms that are proven to scale in production and have seen
widespread adoption. To this end, the project plans to remove support
for the following algorithms in the next release unless they receive
sustained support and improvement before then.

The algorithms to be removed are:
- From Clustering:
Dirichlet
MeanShift
MinHash
Eigencuts

- From Classification (both are sequential implementations)
Winnow
Perceptron

- Frequent Pattern Mining

- Collaborative Filtering
All recommenders in org.apache.mahout.cf.taste.impl.recommender.knn
SlopeOne implementations in org.apache.mahout.cf.taste.hadoop.slopeone and 
org.apache.mahout.cf.taste.impl.recommender.slopeone
Distributed pseudo recommender in org.apache.mahout.cf.taste.hadoop.pseudo
TreeClusteringRecommender in org.apache.mahout.cf.taste.impl.recommender

- Mahout Math
Lanczos in favour of SSVD
Hadoop entropy stuff in org.apache.mahout.math.stats.entropy

If you are interested in supporting one or more of these algorithms, please make
it known on d...@mahout.apache.org and via JIRA issues that fix and/or improve 
them. Please also provide 
supporting evidence as to their