Hi Mario,
this is indeed a bug. The problem is that the CF code (taste) uses long
IDs, while our math library internally uses int keys.
I'll open a JIRA issue and post a patch that will hopefully help you.
--sebastian
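One possible way to bridge long IDs and int keys is to assign each distinct long ID a small, dense int index and use that index as the vector key. The sketch below only illustrates the idea; it is not the patch mentioned above, and the class name `DenseIndexMapper` and its methods are hypothetical, not part of Mahout.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not a Mahout class): maps arbitrary long IDs to
// dense int indices 0, 1, 2, ... in order of first appearance.
final class DenseIndexMapper {
    private final Map<Long, Integer> indexByID = new HashMap<>();

    // Returns the index already assigned to this ID, or assigns the next free one.
    public int indexOf(long id) {
        Integer existing = indexByID.get(id);
        if (existing != null) {
            return existing;
        }
        int next = indexByID.size();
        indexByID.put(id, next);
        return next;
    }

    // Number of distinct IDs seen so far.
    public int size() {
        return indexByID.size();
    }

    public static void main(String[] args) {
        DenseIndexMapper mapper = new DenseIndexMapper();
        // Large IDs of the kind a string-to-long migrator could produce (made-up values).
        System.out.println(mapper.indexOf(8_765_432_109_876L));
        System.out.println(mapper.indexOf(-4_000_000_000L));
        System.out.println(mapper.indexOf(8_765_432_109_876L));
    }
}
```

With such a mapping the indices always stay within [0, number of distinct IDs), so they fit into an int as long as there are fewer than 2^31 distinct IDs.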
On 04/18/2014 11:03 PM, Mario Levitin wrote:
In my dataset the IDs are strings, so I use MemoryIDMigrator. This migrator
produces large longs.
I'm not doing any translation.
I could not understand why there is a cast to int in the Mahout code. This
will produce errors for large long values.
On Fri, Apr 18, 2014 at 8:06 PM, Ted Dunning <[email protected]> wrote:
Are you translating the IDs down into a range that will fit into ints?
On Thu, Apr 17, 2014 at 3:02 PM, Mario Levitin <[email protected]> wrote:
Hi,
I'm trying to run the ALS algorithm. However, I get the following error:
Exception in thread "pool-1-thread-3" org.apache.mahout.math.IndexException: Index -691877539 is outside allowable range of [0,2147483647)
    at org.apache.mahout.math.AbstractVector.set(AbstractVector.java:395)
    at org.apache.mahout.cf.taste.impl.recommender.svd.ALSWRFactorizer.sparseUserRatingVector(ALSWRFactorizer.java:305)
At line 305 in ALSWRFactorizer.java, there is the following code:
ratings.set((int) preference.getItemID(), preference.getValue());
My suspicion is that the error results from the cast to int in the above
line. Item IDs in Mahout are longs, so if you cast a long that does not
fit into an int, you can get a negative number and hence the error.
However, this explanation also seems implausible to me, since I don't
think such an error would exist in the Mahout code.
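The effect of such a cast can be reproduced in isolation. The snippet below is a minimal sketch that does not use Mahout at all; the class name `NarrowingCastDemo` and the ID values are made up for illustration. It shows that a narrowing cast keeps only the low 32 bits of a long, so a value above Integer.MAX_VALUE can come out negative.

```java
// Stand-alone illustration; class name and ID values are hypothetical.
final class NarrowingCastDemo {
    // Same kind of cast as in the line quoted above: keeps the low 32 bits only.
    public static int castToIndex(long itemID) {
        return (int) itemID;
    }

    public static void main(String[] args) {
        long small = 42L;             // fits into an int, survives the cast unchanged
        long large = 3_000_000_000L;  // larger than Integer.MAX_VALUE (2147483647)

        System.out.println(castToIndex(small)); // prints 42
        System.out.println(castToIndex(large)); // prints -1294967296

        // Math.toIntExact refuses to wrap silently and throws instead.
        try {
            Math.toIntExact(large);
        } catch (ArithmeticException e) {
            System.out.println("overflow detected");
        }
    }
}
```

A negative result like this is the kind of index that a range check of [0,2147483647) would reject.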
Any help will be appreciated.
Thanks