Lucene allows you to build a kind of inverted index, "content to document identifier". Solr or ElasticSearch allows you to scale the process.
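To make the "content to document identifier" idea concrete, here is a minimal sketch in plain Java using standard collections. It is not the Lucene API (Lucene adds analysis, scoring, compression, and on-disk segments); the class and method names are just illustrative:

```java
import java.util.*;

// Minimal sketch of an inverted index: a map from a term to the
// identifiers of the documents that contain it. Illustrative only,
// not the actual Lucene API.
public class InvertedIndexSketch {
    private final Map<String, Set<Integer>> index = new HashMap<>();

    // Naive whitespace tokenization; record each term -> docId.
    public void addDocument(int docId, String content) {
        for (String term : content.toLowerCase().split("\\s+")) {
            index.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
        }
    }

    // Return the identifiers of documents containing the term.
    public Set<Integer> search(String term) {
        return index.getOrDefault(term.toLowerCase(), Collections.emptySet());
    }

    public static void main(String[] args) {
        InvertedIndexSketch idx = new InvertedIndexSketch();
        idx.addDocument(1, "real time search with Hadoop");
        idx.addDocument(2, "Lucene builds an inverted index");
        idx.addDocument(3, "search the index in real time");
        System.out.println(idx.search("search")); // prints [1, 3]
        System.out.println(idx.search("index"));  // prints [2, 3]
    }
}
```

The point is that the expensive work (scanning GBs of content) happens once, at indexing time; each search is then a cheap map lookup, which is what makes low-latency querying possible.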
However, if I am reading it correctly, you are saying that you cannot precompute a structure (such as an index) before the search? If that is true and you need to process GBs of data, then you have to allow for latency, since you cannot have everything in memory before the search itself. I can't say anything more precise; it will depend on your context. One may ask: why can't you index the content of your database and your files?

Bertrand

On Sun, Aug 19, 2012 at 9:06 PM, mahout user <[email protected]> wrote:
> Thanks Mohit and Bertrand,
>
> I am looking into Hadoop for a search engine, as many others do. In the
> case of a search engine, I know Lucene is there. But in my case I have
> implemented Java classes that search from databases as well as from
> csv files. What I can't understand is: if there are GBs of data, how
> can I get a real-time search service with Hadoop?
>
>
> On Sun, Aug 19, 2012 at 10:06 PM, Mohit Anchlia <[email protected]> wrote:
>
>> On Sun, Aug 19, 2012 at 8:44 AM, mahout user <[email protected]> wrote:
>>
>>> Hello folks,
>>>
>>> I am new to Hadoop. I just want to understand how the Hadoop
>>> framework is useful for real-time services. Can anyone explain?
>>>
>>> Thanks.
>>>
>>
>> Can you specify your use case? Each use case calls for different design
>> considerations.
>>

--
Bertrand Dechoux
