[
https://issues.apache.org/jira/browse/MAHOUT-673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dmitriy Lyubimov updated MAHOUT-673:
------------------------------------
Description:
So, per the earlier discussion on the list: for the random matrix Omega in
stochastic projection, let's use murmur hash to generate elements uniformly
distributed on the closed interval [-1,+1] instead of using Random.nextGaussian().
I am not sure there is a really compelling mathematical reason to do this, but
it may simply be faster and more in line with accepted practice in Mahout.
The 64-bit murmur value is already in the code. I just need to figure out the
optimal way to convert it into a uniform distribution.
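
One possible conversion, as a minimal sketch rather than the code in the issue tree: keep the top 53 bits of the 64-bit hash (the most a double mantissa can hold exactly), divide by 2^53 - 1 to land on [0,1], then rescale to [-1,+1]. The class name OmegaSketch, the methods toClosedUnit and element, and the way seed, row and column are mixed into a key are all hypothetical. fmix64 is the MurmurHash3 64-bit finalizer, used here only as a stand-in for the murmur value already in the code.

```java
/** Hypothetical helper for illustration; not part of Mahout. */
final class OmegaSketch {
    /** Largest 53-bit integer, 2^53 - 1, exactly representable as a double. */
    private static final double MAX_53 = (double) ((1L << 53) - 1);

    private OmegaSketch() {}

    /** MurmurHash3 64-bit finalizer; a stand-in for the existing murmur hash. */
    static long fmix64(long h) {
        h ^= h >>> 33;
        h *= 0xff51afd7ed558ccdL;
        h ^= h >>> 33;
        h *= 0xc4ceb9fe1a85ec53L;
        h ^= h >>> 33;
        return h;
    }

    /** Maps a 64-bit hash to a double in the closed interval [-1, +1]. */
    static double toClosedUnit(long hash) {
        long top53 = hash >>> 11;      // unsigned shift: value in [0, 2^53 - 1]
        double u = top53 / MAX_53;     // u in [0, 1], both endpoints reachable
        return 2.0 * u - 1.0;          // rescale to [-1, +1]
    }

    /** Assumed scheme: derive Omega(row, col) deterministically from a seed. */
    static double element(long seed, int row, int col) {
        long key = ((long) row << 32) | (col & 0xFFFFFFFFL);
        long h = fmix64(key + 0x9E3779B97F4A7C15L); // offset so key 0 is not a fixed point
        return toClosedUnit(fmix64(h ^ seed));
    }
}
```

Because each element depends only on (seed, row, col), Omega never has to be stored or shipped between tasks; any task can recompute OmegaSketch.element(seed, i, j) on demand. Note that a uniform distribution on [-1,+1] has variance 1/3 rather than the 1 of Random.nextGaussian(), which may matter if anything downstream assumes unit variance.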
GitHub URL for this issue tree:
https://github.com/dlyubimov/mahout-commits/branches/MAHOUT-673
Pull requests are welcome.
was:
So, per the earlier discussion on the list: for the random matrix Omega in
stochastic projection, let's use murmur hash to generate elements uniformly
distributed on the closed interval [-1,+1] instead of using Random.nextGaussian().
I am not sure there is a really compelling mathematical reason to do this, but
it may simply be faster and more in line with accepted practice in Mahout.
The 64-bit murmur value is already in the code. I just need to figure out the
optimal way to convert it into a uniform distribution.
> Stochastic projection (SSVD) to use 64bit murmur hash to produce uniform
> distribution matrix elements
> -----------------------------------------------------------------------------------------------------
>
> Key: MAHOUT-673
> URL: https://issues.apache.org/jira/browse/MAHOUT-673
> Project: Mahout
> Issue Type: Improvement
> Affects Versions: 0.4
> Reporter: Dmitriy Lyubimov
> Assignee: Dmitriy Lyubimov
> Priority: Minor
> Fix For: 0.6