Has anyone ever successfully processed 150M records using the
Suggester Component? Would the makers of the component please comment?
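
For anyone comparing what each suggester returns, the response quoted below can be flattened with a short Python sketch. The function name flatten_suggestions is hypothetical, and treating the payload as a pipe-delimited string is an assumption based only on the sample values in this thread.

```python
import json
import re

# Matches the <b>...</b> highlight markers seen in the quoted response.
_HIGHLIGHT_TAG = re.compile(r"</?b>")


def flatten_suggestions(response):
    """Return (suggester, query, plain_term, payload_fields) tuples.

    `response` is the parsed JSON body of a /suggest request.
    """
    rows = []
    for suggester, queries in response.get("suggest", {}).items():
        for query, result in queries.items():
            for item in result.get("suggestions", []):
                plain_term = _HIGHLIGHT_TAG.sub("", item["term"])
                payload = item.get("payload", "")
                # Assumption: payload fields are separated by "|".
                payload_fields = payload.split("|") if payload else []
                rows.append((suggester, query, plain_term, payload_fields))
    return rows


if __name__ == "__main__":
    # Trimmed copy of the response quoted in this thread.
    sample = json.loads("""
    {"suggest": {
       "mySuggester2": {"1054 club": {"numFound": 1, "suggestions": [
          {"term": "<b>1054</b> null N COUNTRY <b>CLUB</b> null BLVD null STOCKTON CA 95204 5008",
           "weight": 0,
           "payload": "0023865882|06077|37.970769,-121.310433"}]}},
       "mySuggester1": {"1054 club": {"numFound": 0, "suggestions": []}}}}
    """)
    for row in flatten_suggestions(sample):
        print(row)
```

Suggesters that return nothing (mySuggester1 here) simply contribute no rows, which makes it easy to see which dictionary actually answered a given query.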

On Tue, Jun 26, 2018 at 1:37 AM, Ratnadeep Rakshit <ratnad...@qedrix.com>
wrote:

> The site_address field holds all the addresses in the United States. The
> idea is to build something similar to Google Places autosuggest.
>
> Here's an example query:
>
> curl "http://localhost/solr/addressbook/suggest?suggest.q=1054%20club&wt=json"
>
> Response:
>
> {
>   "responseHeader": {
>     "status": 0,
>     "QTime": 3125,
>     "params": {
>       "suggest.q": "1054 club",
>       "wt": "json"
>     }
>   },
>   "suggest": {
>     "mySuggester2": {
>       "1054 club": {
>         "numFound": 3,
>         "suggestions": [{
>           "term": "<b>1054</b> null N COUNTRY <b>CLUB</b> null BLVD null STOCKTON CA 95204 5008",
>           "weight": 0,
>           "payload": "0023865882|06077|37.970769,-121.310433"
>         }, {
>           "term": "<b>1054</b> null E HERITAGE <b>CLUB</b> null CIR null DELRAY BEACH FL 33483 3482",
>           "weight": 0,
>           "payload": "0117190535|12099|26.445485,-80.069336"
>         }, {
>           "term": "<b>1054</b> null null CORAL <b>CLUB</b> null DR <b>1054</b> CORAL SPRINGS FL 33071 5657",
>           "weight": 0,
>           "payload": "0111342342|12011|26.243918,-80.267577"
>         }]
>       }
>     },
>     "mySuggester1": {
>       "1054 club": {
>         "numFound": 0,
>         "suggestions": []
>       }
>     }
>   }
> }
>
> Now when I start building with 25M address records in the addressbook
> core, the process runs smoothly. Heap utilization peaks at about 56%
> of the 20GB allotted to Solr.
> I am not very experienced in measuring Solr performance, but it looks like
> the build process fails when I increase the number of records in the core
> beyond 25M. Querying the suggester still works.
>
> Did that answer your questions correctly?
>
> On Tue, Jun 12, 2018 at 3:17 PM, Alessandro Benedetti <
> a.benede...@sease.io> wrote:
>
>> Hi,
>> first of all, the two suggesters you are using are based on different
>> data structures (with different memory utilisation):
>>
>> - FuzzyLookupFactory -> FST (in memory and stored binary on disk)
>> - AnalyzingInfixLookupFactory -> Auxiliary Lucene Index
>>
>> Both data structures should be very memory efficient (both in building
>> and storage).
>> What is the cardinality of the fields you are building suggestions from
>> (site_address and site_address_other)?
>> What is the memory situation in Solr when you start the suggester build?
>> You are allocating much more memory to the Solr JVM process than to the OS
>> (which, in your situation, cannot hold the entire index in memory; holding
>> it all would be the ideal scenario).
>>
>> I would recommend putting some monitoring in place (there are plenty of
>> open source tools to do that).
>>
>> Regards
>>
>>
>>
>> -----
>> ---------------
>> Alessandro Benedetti
>> Search Consultant, R&D Software Engineer, Director
>> Sease Ltd. - www.sease.io
>> --
>> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>>
>
>
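
The monitoring suggestion quoted above could start with something as small as a heap-utilisation check around the suggester build. This is a minimal sketch, not a replacement for a real monitoring tool. It assumes the JVM metrics have already been fetched as a dictionary (for example from Solr's Metrics API at /solr/admin/metrics?group=jvm, if the Solr version in use provides it) and that the dictionary contains the keys memory.heap.used and memory.heap.max. The function name heap_utilization_percent is hypothetical.

```python
def heap_utilization_percent(jvm_metrics):
    """Return used heap as a percentage of the maximum heap.

    Assumption: `jvm_metrics` maps "memory.heap.used" and
    "memory.heap.max" to byte counts.
    """
    used = jvm_metrics["memory.heap.used"]
    maximum = jvm_metrics["memory.heap.max"]
    if maximum <= 0:
        # The JVM reports -1 when no maximum heap size is defined.
        raise ValueError("maximum heap size is not defined")
    return 100.0 * used / maximum


if __name__ == "__main__":
    # Illustrative numbers only: 20 GiB max heap, 11 GiB in use.
    gib = 1024 ** 3
    sample = {"memory.heap.used": 11 * gib, "memory.heap.max": 20 * gib}
    print(f"heap utilisation: {heap_utilization_percent(sample):.1f}%")
```

Logging this value at intervals during a build at 25M records and again at a larger record count would show whether the failing build is actually running out of heap.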
