Re: Seeing very low ingestion performance for a single non-cloud Solr core

2017-09-25 Thread saiks
Hi All,

Thanks for the response.

Increasing the hard/soft commit intervals did not help.
But changing the "text" field in the ingestion input from the same message to
random messages of similar length improved performance by about 60%.

I'm now able to ingest 40k - 45k messages per second; earlier I got 26k.
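
For anyone repeating the test, here is a minimal Python sketch of one way to vary the "text" field per document. The helper names (random_text, make_batch), the 160-character length and the use of random lowercase words are assumptions for illustration; the thread does not say how the random messages were generated. The other field values are copied from the sample message.

```python
import json
import random
import string


def random_text(rng, length=160):
    """Return space-separated random lowercase words, about `length` characters long."""
    words = []
    size = 0
    while size < length:
        word_len = rng.randint(3, 10)
        word = "".join(rng.choice(string.ascii_lowercase) for _ in range(word_len))
        words.append(word)
        size += len(word) + 1  # +1 accounts for the separating space
    return " ".join(words)[:length]


def make_batch(rng, batch_size=1000, start_id=0):
    """Build one batch of documents with unique ids and varied text (hypothetical helper)."""
    return [
        {
            "text": random_text(rng),
            "id": f"id{start_id + i}",
            "hostname": "xx.com",
            "app": "",
            "epoch": 1483667347941,
        }
        for i in range(batch_size)
    ]


rng = random.Random(42)       # fixed seed so a run can be repeated
batch = make_batch(rng)
payload = json.dumps(batch)   # a single JSON array, as in the original test
```

Purely random words sit at the opposite extreme from a single repeated message (a very large term dictionary instead of a few very long posting lists), so sampling from real log lines would probably be closer to production behaviour.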

Thanks a lot.





--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Seeing very low ingestion performance for a single non-cloud Solr core

2017-09-21 Thread Walter Underwood
5 seconds and 10 seconds are very short intervals for auto commit.

20 GB is probably too much heap.

Sending the exact same message for every update will create a few very long 
posting lists. Not sure if that is slow, but it is not realistic.

Finally, 26,000 per second is not that slow. That is over 1.5 million/minute. 
We are indexing bigger documents, but seeing 1 million/minute to a cluster with 
four shards.
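
As a quick check of that arithmetic, here is a small Python sketch using the figures from the JMeter summary quoted below (5000 requests in 00:03:07, 1000 documents per request). The function name is made up for illustration.

```python
def docs_per_second(requests, elapsed_seconds, docs_per_request):
    """Convert a request count and elapsed time into documents per second."""
    return requests / elapsed_seconds * docs_per_request


elapsed = 3 * 60 + 7                         # 00:03:07 from the JMeter summary line
rate = docs_per_second(5000, elapsed, 1000)  # 1000 documents in each request
per_minute = rate * 60

print(f"{rate:,.0f} docs/s, {per_minute:,.0f} docs/min")
```

That works out to roughly 26,700 documents per second, or about 1.6 million per minute.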

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Sep 21, 2017, at 1:18 AM, Emir Arnautović wrote:
> 
> Hi,
> What are your commit configs? Maybe you are committing too frequently. 
> 
> Thanks,
> Emir
> 
>> On 21 Sep 2017, at 06:19, saiks  wrote:
>> 
>> Hi,
>> 
>> Environment:
>> - Solr 6.4.2 is running in non-cloud mode on Sun Java 8, Linux
>> 4.4.0-31-generic x86_64
>> - Ingesting into a single core
>> - SoftCommit = 5 seconds, HardCommit = 10 seconds
>> - System has 16 CPUs and 32 GB of memory (Solr is given 20 GB of JVM heap)
>> - text = StandardTokenizer, id = solr.StrField/docValues, hostname =
>> solr.StrField/docValues, app = solr.StrField/docValues, epoch =
>> solr.TrieLongField/docValues
>> 
>> I am using JMeter to ingest into the Solr core through the UpdateRequestHandler
>> ("/update/json"), sending a batch of 1000 messages (the same message each time) in a
>> single JSON array.
>> 
>> Sample message
>> [{"text":"May 11 10:18:22 scrooge Web-Requests: May 11 10:18:22
>> @IunAIir17k-- EVENT_WR-Y-attack-600 SG_child[823]: [event.error]
>> Possible attack - 5 blocked requests within 120 seconds",
>> "id":"id1",
>> "hostname": "xx.com",
>> "app": "",
>> "epoch": 1483667347941
>> },
>> ]
>> 
>> JMeter is configured to run 10 threads in parallel, each repeating the request
>> 1000 times, which should ingest 10,000,000 messages in total.
>> JMeter post URL:
>> "/solr/mycore/update/json?overwrite=false&wt=json&commit=false"
>> 
>> JMeter summary:
>> summary =   5000 in 00:03:07 =   26.7/s Avg:   370 Min:27 Max:  1734 Err: 0 (0.00%)
>> 
>> I am only able to ingest about 26,000 messages per second. Looking at system
>> resources, only one or two CPUs are at 25-30% and the rest are sitting idle.
>> Solr heap usage is flat at 3 GB, and there is no iowait on the devices.
>> Increasing parallelism in JMeter to 20 threads did not increase
>> ingested messages per second, but doubled the latency of each
>> request.
>> 
>> I don't understand why Solr is not able to use all the CPUs on the host when I
>> increase JMeter parallelism from 10 -> 20 -> 40. What can I do to improve
>> performance and make Solr use the system resources fully?
>> 
>> Please help.
>> 
>> Thank you
> 
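
For anyone who wants to reproduce the test outside JMeter, here is a rough Python sketch of a client that posts JSON batches to the same update URL from several threads. The base URL (http://localhost:8983), the function names and the make_docs callback are assumptions for illustration; only the path and query parameters come from the quoted message. It is not a substitute for JMeter's timing and reporting.

```python
import json
import urllib.parse
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def build_update_url(base_url, core):
    """Build the update URL with the query parameters used in the quoted test."""
    params = {"overwrite": "false", "wt": "json", "commit": "false"}
    return f"{base_url}/solr/{core}/update/json?{urllib.parse.urlencode(params)}"


def encode_batch(docs):
    """Serialize a list of documents as one JSON array, UTF-8 encoded."""
    return json.dumps(docs).encode("utf-8")


def post_batch(url, docs, timeout=30):
    """POST one batch and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=encode_batch(docs),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


def run_load(url, make_docs, threads=10, requests_per_thread=1000):
    """Send requests_per_thread batches from each thread; return the count of 200 responses."""

    def worker(thread_index):
        ok = 0
        for request_index in range(requests_per_thread):
            docs = make_docs(thread_index, request_index)  # caller supplies the batch
            if post_batch(url, docs) == 200:
                ok += 1
        return ok

    with ThreadPoolExecutor(max_workers=threads) as pool:
        return sum(pool.map(worker, range(threads)))
```

A call might look like run_load(build_update_url("http://localhost:8983", "mycore"), make_docs), where make_docs(thread_index, request_index) returns a list of 1000 documents. If the Solr host's CPUs stay idle under this client as well, it may be worth checking whether the load generator or the network is the limit rather than Solr.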



Re: Seeing very low ingestion performance for a single non-cloud Solr core

2017-09-21 Thread Emir Arnautović
Hi,
What are your commit configs? Maybe you are committing too frequently. 

Thanks,
Emir
