Random Projection using sampled values
--------------------------------------
Key: MAHOUT-771
URL: https://issues.apache.org/jira/browse/MAHOUT-771
Project: Mahout
Issue Type: New Feature
Components: Math
Reporter: Lance Norskog
Priority: Minor
A Random Projection implementation that provides two determinism guarantees:
# The same data projected multiple times produces the same output
# Dense and sparse data with the same contents produce the same output
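The issue does not say how these two guarantees are achieved. One possible
approach, sketched below under that assumption, is to derive each multiplier
from a hash of (seed, row, column) instead of from generator state, so the
value used for a column does not depend on which other columns were visited.
The class, the method names, and the use of plain arrays and a sorted map are
hypothetical and are not Mahout's API.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

final class DeterministicProjectionSketch {

    // Hypothetical helper: hashes (seed, row, column) to 64 mixed bits with
    // the SplitMix64 finalizer, so no generator state is carried around.
    static long mix(long seed, int row, int col) {
        long key = ((long) row << 32) | (col & 0xFFFFFFFFL);
        long z = (seed ^ key) + 0x9E3779B97F4A7C15L;
        z = (z ^ (z >>> 30)) * 0xBF58476D1CE4E5B9L;
        z = (z ^ (z >>> 27)) * 0x94D049BB133111EBL;
        return z ^ (z >>> 31);
    }

    // +1 or -1, fixed for a given (seed, row, column).
    static int sign(long seed, int row, int col) {
        return (mix(seed, row, col) & 1L) == 0L ? 1 : -1;
    }

    static double[] projectDense(double[] input, int outDim, long seed) {
        double[] out = new double[outDim];
        for (int row = 0; row < outDim; row++) {
            double sum = 0.0;
            for (int col = 0; col < input.length; col++) {
                if (input[col] != 0.0) { // zeros contribute nothing
                    sum += sign(seed, row, col) * input[col];
                }
            }
            out[row] = sum;
        }
        return out;
    }

    // Sparse input as column index -> value. The sorted map visits columns in
    // ascending order, so sums accumulate in the same order as the dense path.
    static double[] projectSparse(SortedMap<Integer, Double> input, int outDim, long seed) {
        double[] out = new double[outDim];
        for (int row = 0; row < outDim; row++) {
            double sum = 0.0;
            for (Map.Entry<Integer, Double> e : input.entrySet()) {
                sum += sign(seed, row, e.getKey()) * e.getValue();
            }
            out[row] = sum;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] dense = {1.5, 0.0, -2.0, 0.0, 4.0};
        SortedMap<Integer, Double> sparse = new TreeMap<>();
        sparse.put(0, 1.5);
        sparse.put(2, -2.0);
        sparse.put(4, 4.0);
        double[] fromDense = projectDense(dense, 3, 42L);
        double[] fromSparse = projectSparse(sparse, 3, 42L);
        System.out.println(Arrays.toString(fromDense));
        System.out.println(Arrays.equals(fromDense, fromSparse));
    }
}
```

Because both paths add the same terms in the same column order, the results
match exactly, not just within a floating-point tolerance.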
A custom class that performs Random Projection based on the
Johnson-Lindenstrauss lemma. This implementation uses Achlioptas's results,
which allow methods other than a full-range random multiplier per sample:
* use 1 random bit to decide whether a sample is added to or subtracted from a
row sum
* use a random value from 1 to 6 to decide whether a sample is added to (1 in
6), subtracted from (1 in 6), or left out of (4 in 6) a row sum
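The two sampling schemes above can be sketched as follows. This is a minimal
illustration, not the class attached to this issue: the names are hypothetical,
input is a plain double[], and java.util.Random stands in for whatever
generator the implementation uses. The scale factors sqrt(1/k) and sqrt(3/k)
for output dimension k come from the Achlioptas paper, not from this
description.

```java
import java.util.Arrays;
import java.util.Random;

final class AchlioptasProjectionSketch {

    // One random bit: +1 or -1 with equal probability.
    static int coinSign(Random rng) {
        return rng.nextBoolean() ? 1 : -1;
    }

    // One draw from six equally likely values:
    // +1 (1 in 6), -1 (1 in 6), 0 (4 in 6).
    static int dieSign(Random rng) {
        int roll = rng.nextInt(6); // 0..5
        if (roll == 0) return 1;
        if (roll == 1) return -1;
        return 0;
    }

    // Projects input down to outDim values. No multiplication happens inside
    // the inner loop: each sample is added, subtracted, or skipped.
    static double[] project(double[] input, int outDim, long seed, boolean useDie) {
        Random rng = new Random(seed);
        double scale = Math.sqrt((useDie ? 3.0 : 1.0) / outDim);
        double[] out = new double[outDim];
        for (int row = 0; row < outDim; row++) {
            double sum = 0.0;
            for (double sample : input) {
                int s = useDie ? dieSign(rng) : coinSign(rng);
                if (s > 0) {
                    sum += sample;
                } else if (s < 0) {
                    sum -= sample;
                }
            }
            out[row] = scale * sum;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0};
        System.out.println(Arrays.toString(project(x, 3, 1L, false)));
        System.out.println(Arrays.toString(project(x, 3, 1L, true)));
    }
}
```

The second scheme skips two thirds of the samples on average, which is where
its speed advantage comes from. Note that this sketch draws from a sequential
generator, so it is repeatable for a fixed seed but would not by itself give
matching dense and sparse results.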
Custom implementations for both dense and sparse vectors are included. The
sparse vector implementation assumes the active values will fit in memory.
An implementation using full-range random multipliers generated by
java.util.Random is included for reference and research.
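For comparison, a full-range variant might look like the sketch below. The
issue does not say which distribution the reference implementation draws from.
This example assumes Gaussian multipliers from Random.nextGaussian(), the
classic Johnson-Lindenstrauss construction, and the class and method names are
hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

final class GaussianProjectionSketch {

    // Each sample is multiplied by its own Gaussian draw, and each row sum is
    // scaled by 1/sqrt(outDim).
    static double[] project(double[] input, int outDim, long seed) {
        Random rng = new Random(seed);
        double scale = 1.0 / Math.sqrt(outDim);
        double[] out = new double[outDim];
        for (int row = 0; row < outDim; row++) {
            double sum = 0.0;
            for (double sample : input) {
                sum += rng.nextGaussian() * sample;
            }
            out[row] = scale * sum;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0};
        System.out.println(Arrays.toString(project(x, 3, 1L)));
    }
}
```

This version costs one multiplication per sample per output row, which the
add/subtract/ignore schemes avoid.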
*Database-friendly random projections: Johnson-Lindenstrauss with binary coins*
_Dimitris Achlioptas_
[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.84.4546&rep=rep1&type=pdf]