Thanks again. I will set some JVM options and try it as you said.

2009-11-09
cumtyjh

From: Sean Owen
Sent: 2009-11-09 12:36:51
To: mahout-user
Cc:
Subject: Re: Re: got Error: GC overhead limit exceeded when generateproductsimilariy

OK, I do suggest you use -server for an application computing
recommendations. In fact I recommend other flags for the best
performance:
http://lucene.apache.org/mahout/taste.html#performance

If, by "offline", you mean computing item-item similarity before any
recommendations are computed, then you are already doing so. The one
line of code which creates a GenericItemSimilarity from a
PearsonCorrelationSimilarity does exactly that.

You could also manually compare every item-item pair with
PearsonCorrelationSimilarity, write out the result in a file, then
read it back in later, and manually create a GenericItemSimilarity
from that info. It would involve writing a bit more code but would
work fine.

This approach will certainly take more memory, and your memory
requirements will grow as the square of the number of items. But it
will make things faster.

Sean

On Mon, Nov 9, 2009 at 4:31 AM, cumtyjh <[email protected]> wrote:
> Thanks, Sean Owen.
>
> I have set JVM options like -Xmx2048m, but not -server.
>
> I want to generate item-item similarity offline, so that I can then
> use it for recommendation.
>
> Do you have any suggestions on generating item-item similarity offline?
>
> 2009-11-09
> cumtyjh
>
> From: Sean Owen
> Sent: 2009-11-09 12:05:08
> To: mahout-user
> Cc:
> Subject: Re: got Error: GC overhead limit exceeded when generate productsimilariy
>
> This basically means "out of heap space". It didn't technically run
> out of memory, but it is triggering garbage collection so much that
> the JVM figures it is critically low on free heap space.
>
> You need to allocate more heap space. Try just 256MB with -Xmx256m.
>
> Have you looked at the documentation? There are some other settings
> you should be using with the JVM, like -server.
>
> Also you don't need to create a GenericItemSimilarity. That is going
> to consume a fair bit of memory as it is pre-computing and storing in
> memory all item-item similarities from a Pearson correlation.
>
> On Mon, Nov 9, 2009 at 2:28 AM, cumtyjh <[email protected]> wrote:
>> Hi all,
>>
>> I got an error when generating product similarity from a rating file
>> that contains about 250,000 records.
>>
>> It works when there are only 10,000 records in the rating file.
>>
>> Do you have any suggestions? Any help is appreciated.
>>
>> Thanks in advance.
>>
>> The code and log follow:
>>
>> File file = new File(ratingFile);
>> logger.log(Level.INFO, "begin to load rating file...");
>> FileDataModel model = new FileDataModel(file);
>> logger.log(Level.INFO, "load rating file OK.");
>> ItemSimilarity pearson = new LogLikelihoodSimilarity(model);
>> GenericItemSimilarity gif = new GenericItemSimilarity(pearson,model);
>>
>> INFO: load rating file OK.
>> - Reading file info...
>> - Processed 100000 lines
>> - Processed 200000 lines
>> Exception in thread "Thread-9" java.lang.OutOfMemoryError: GC overhead limit exceeded
>>   at org.apache.mahout.cf.taste.impl.common.FastSet.<init>(FastSet.java:74)
>>   at org.apache.mahout.cf.taste.impl.model.GenericDataModel.getNumUsersWithPreferenceFor(GenericDataModel.java:195)
>>   at org.apache.mahout.cf.taste.impl.model.file.FileDataModel.getNumUsersWithPreferenceFor(FileDataModel.java:314)
>>   at org.apache.mahout.cf.taste.impl.similarity.LogLikelihoodSimilarity.itemSimilarity(LogLikelihoodSimilarity.java:48)
>>   at org.apache.mahout.cf.taste.impl.similarity.GenericItemSimilarity$DataModelSimilaritiesIterator.next(GenericItemSimilarity.java:291)
>>   at org.apache.mahout.cf.taste.impl.similarity.GenericItemSimilarity$DataModelSimilaritiesIterator.next(GenericItemSimilarity.java:260)
>>   at org.apache.mahout.cf.taste.impl.similarity.GenericItemSimilarity.initSimilarityMaps(GenericItemSimilarity.java:128)
>>   at org.apache.mahout.cf.taste.impl.similarity.GenericItemSimilarity.<init>(GenericItemSimilarity.java:103)
>>
>> 2009-11-09
>> cumtyjh
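
The "compare every item-item pair, write the result to a file, read it back later" approach suggested in this thread could look roughly like the sketch below. It uses only the JDK and is not Mahout code: the class OfflineItemSimilarity, the Pair type, the sample ratings and the "itemA,itemB,value" file layout are all hypothetical, and the pearson method is a plain stand-in for PearsonCorrelationSimilarity. Handing the loaded pairs to a GenericItemSimilarity is left out because that step depends on the Mahout version in use.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical stand-in, not a Mahout class: a JDK-only sketch of
// "compute every pair offline, save to a file, load it back later".
class OfflineItemSimilarity {

    /** One stored item-item similarity (hypothetical record layout). */
    static final class Pair {
        final long itemA;
        final long itemB;
        final double value;
        Pair(long itemA, long itemB, double value) {
            this.itemA = itemA; this.itemB = itemB; this.value = value;
        }
    }

    /** Pearson correlation over users who rated both items; NaN when undefined. */
    static double pearson(Map<Long, Double> a, Map<Long, Double> b) {
        int n = 0;
        double sumA = 0, sumB = 0, sumAA = 0, sumBB = 0, sumAB = 0;
        for (Map.Entry<Long, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other == null) continue; // this user did not rate both items
            double x = e.getValue(), y = other;
            n++;
            sumA += x; sumB += y; sumAA += x * x; sumBB += y * y; sumAB += x * y;
        }
        if (n < 2) return Double.NaN;
        double varA = sumAA - sumA * sumA / n;
        double varB = sumBB - sumB * sumB / n;
        if (varA <= 0 || varB <= 0) return Double.NaN; // constant ratings
        return (sumAB - sumA * sumB / n) / Math.sqrt(varA * varB);
    }

    /** Compares every unordered item pair once: n*(n-1)/2 comparisons for n items. */
    static List<Pair> computeAll(Map<Long, Map<Long, Double>> ratingsByItem) {
        List<Long> items = new ArrayList<>(new TreeSet<>(ratingsByItem.keySet()));
        List<Pair> out = new ArrayList<>();
        for (int i = 0; i < items.size(); i++) {
            for (int j = i + 1; j < items.size(); j++) {
                double s = pearson(ratingsByItem.get(items.get(i)), ratingsByItem.get(items.get(j)));
                if (!Double.isNaN(s)) out.add(new Pair(items.get(i), items.get(j), s));
            }
        }
        return out;
    }

    /** Writes one "itemA,itemB,value" line per pair. */
    static void write(Path file, List<Pair> pairs) {
        List<String> lines = new ArrayList<>();
        for (Pair p : pairs) lines.add(p.itemA + "," + p.itemB + "," + p.value);
        try { Files.write(file, lines); } catch (IOException e) { throw new UncheckedIOException(e); }
    }

    /** Reads back the lines produced by write(). */
    static List<Pair> read(Path file) {
        try {
            List<Pair> pairs = new ArrayList<>();
            for (String line : Files.readAllLines(file)) {
                if (line.isEmpty()) continue;
                String[] f = line.split(",");
                pairs.add(new Pair(Long.parseLong(f[0]), Long.parseLong(f[1]), Double.parseDouble(f[2])));
            }
            return pairs;
        } catch (IOException e) { throw new UncheckedIOException(e); }
    }

    /** Writes the pairs to a temporary file and loads them again. */
    static List<Pair> roundTrip(List<Pair> pairs) {
        try {
            Path tmp = Files.createTempFile("item-similarity", ".csv");
            try { write(tmp, pairs); return read(tmp); } finally { Files.deleteIfExists(tmp); }
        } catch (IOException e) { throw new UncheckedIOException(e); }
    }

    /** Made-up ratings: three items, each rated by users 1..3. */
    static Map<Long, Map<Long, Double>> sampleRatings() {
        double[][] byItem = { {1, 2, 3}, {2, 4, 6}, {3, 2, 1} };
        Map<Long, Map<Long, Double>> ratings = new HashMap<>();
        for (int item = 0; item < byItem.length; item++) {
            Map<Long, Double> users = new HashMap<>();
            for (int user = 0; user < byItem[item].length; user++) {
                users.put((long) user + 1, byItem[item][user]);
            }
            ratings.put((long) item + 1, users);
        }
        return ratings;
    }

    public static void main(String[] args) {
        for (Pair p : roundTrip(computeAll(sampleRatings()))) {
            System.out.println(p.itemA + " " + p.itemB + " " + p.value);
        }
    }
}
```

The nested loop in computeAll is where the quadratic growth mentioned above comes from: n items give n*(n-1)/2 pairs, so both the file and any in-memory copy of it grow with the square of the item count.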
