Hundreds of users are going to generate a really, really tiny amount of data (relative to the normal amounts that recommenders get to see).
The problem is that hundreds of hyper-active users who issue thousands of queries are only going to generate a tiny amount of data per document. You will need roughly 20 positive interactions per document to get decent performance. If you have a thousand documents, that means you will need an absolute (and implausible) minimum of 20 thousand engagements. Because the distribution will be very lopsided, you probably need 10-100x more than that.

The net result is that your hundreds of users would likely need to issue thousands of queries. Each.

That seems like a lot.

You should get good results for a small minority of documents at smaller data volumes.

On Sat, Apr 1, 2017 at 11:37 PM, arun abraham <arunabraham...@gmail.com> wrote:

> Hi Ted,
>
> Each document to be indexed by Solr has fairly large content in it and
> 100+ users searching within it (once the Solr search tool goes live).
> Kindly guide me on the integration steps for Mahout with Solr (with respect
> to all the stats mentioned).
>
> Thanks and Regards,
> Arun
>
> On 2 April 2017 at 11:59, Ted Dunning <ted.dunn...@gmail.com> wrote:
>
> > Arun,
> >
> > That's good news.
> >
> > The second limitation will be how much data you have for each document and
> > whether you have a good measure of how engaged users are with documents.
> >
> > On Sat, Apr 1, 2017 at 6:48 PM, arun abraham <arunabraham...@gmail.com>
> > wrote:
> >
> > > Hi Ted,
> > >
> > > Thanks for the reply.
> > >
> > > I understood, Ted, that to have good, effective results a larger set of
> > > documents/index is required.
> > >
> > > For all the Solr related functionalities and search, I used ~100 docs (path
> > > pointing to my local system) to index and set things up. This is only for
> > > testing and implementing.
> > >
> > > Once the configuration and high level testing is done the configuration
> > > will be changed in such a way that the document path will be pointing to
> > > the LAN location where we have a large collection of documents for
> > > indexing and high level testing is done.
> > >
> > > It won't be a problem for me to use the LAN path for configurations and
> > > index. I can use the larger document base.
> > >
> > > Thanks and Regards,
> > > Arun
> > >
> > > On 2 April 2017 at 07:00, Ted Dunning <ted.dunn...@gmail.com> wrote:
> > >
> > > > On Sat, Apr 1, 2017 at 6:21 PM, arun abraham <arunabraham...@gmail.com>
> > > > wrote:
> > > >
> > > > > As a first step I am trying to recommend min of two documents (as my
> > > > > Solr document index is ~100 docs).
> > > >
> > > > This is kind of weird.
> > > >
> > > > Can you say why you have so very few documents?
> > > >
> > > > There may be something special going on that will make this work better
> > > > or worse.
> > > >
> > > > I have seen people use indicator-based recommendations for ad targeting
> > > > where they had several thousand options, but haven't seen anything with
> > > > only 100 options.
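
As a rough illustration of the sizing argument in the reply at the top of this thread, the arithmetic can be sketched in Python. The 20 positive interactions per document, the thousand documents and the 10-100x skew multiplier come from that reply. The figure of 500 users and the assumption that each query yields exactly one positive interaction are illustrative guesses, not measured values, and the function names are hypothetical.

```python
# Back-of-the-envelope sizing for an indicator-based recommender.
# Assumed, not from the thread: NUM_USERS = 500 and one positive
# interaction per query.


def required_engagements(num_docs, per_doc=20, skew_factor=1):
    """Total positive interactions needed across all documents."""
    return num_docs * per_doc * skew_factor


def queries_per_user(total_engagements, num_users, engagements_per_query=1):
    """Queries each user must issue to produce the needed engagements."""
    return total_engagements / (num_users * engagements_per_query)


NUM_DOCS = 1000   # "a thousand documents" in the reply
NUM_USERS = 500   # hypothetical stand-in for "hundreds of users"

# Minimum if engagements were spread evenly over documents.
baseline = required_engagements(NUM_DOCS)
print("uniform minimum:", baseline)

# Lopsided distributions push the requirement up by 10-100x.
for skew in (10, 100):
    total = required_engagements(NUM_DOCS, skew_factor=skew)
    per_user = queries_per_user(total, NUM_USERS)
    print(f"skew x{skew}: {total} engagements, {per_user:.0f} queries per user")
```

Changing NUM_USERS or engagements_per_query shows how sensitive the per-user figure is to those guesses. With fewer users or fewer engagements per query, the required number of queries per user grows proportionally.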