The way we make Solr ingest faster (single-document ingest) is by turning off 
the engine's soft commit and hard commit and using a client to commit the 
changes every 10 seconds. 
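A minimal sketch of that client-side commit loop, assuming autoCommit/autoSoftCommit are disabled in solrconfig.xml (the host, port, and core name below are hypothetical, not from the original message):

```python
import threading
import urllib.request

# Hypothetical Solr endpoint; adjust host/core to your deployment.
SOLR_COMMIT_URL = "http://localhost:8983/solr/collection1/update?commit=true"

def solr_commit(url=SOLR_COMMIT_URL):
    """Issue an explicit commit to Solr (server-side auto commits disabled)."""
    with urllib.request.urlopen(url) as resp:
        return resp.status

class PeriodicCommitter:
    """Call `commit_fn` every `interval` seconds until stopped."""

    def __init__(self, commit_fn, interval=10.0):
        self.commit_fn = commit_fn
        self.interval = interval
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        # Event.wait returns False on timeout (time to commit)
        # and True once stop() sets the event (time to exit).
        while not self._stop.wait(self.interval):
            self.commit_fn()

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()
```

Usage would be `PeriodicCommitter(solr_commit, interval=10.0).start()` alongside the ingest client; the commit callable is pluggable so the loop can be exercised without a live Solr instance.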

Solr ingest speed holds steady at 800 docs per second, whereas ES ingest speed 
drops by half when we increase the number of fields (i.e., from 1,000 to 10,000).
I have asked Maco to send you the requested script so you can do more 
analysis.

If you can help solve the first level of ES performance degradation (i.e., 
1,000 to 10,000 fields) as a starting point, that would be best.

We have a real customer scenario that requires a large number of metadata 
fields, which is why this is a blocking issue for our stack evaluation 
between Solr and Elasticsearch.

Thanks!
Cindy

On Thursday, June 12, 2014 10:57:23 PM UTC-7, Maco Ma wrote:
>
> I am trying to measure the performance of ingesting documents that have lots 
> of fields.
>
>
> The latest elasticsearch 1.2.1:
> Total docs count: 10k (a small set definitely)
> ES_HEAP_SIZE: 48G
> settings:
>
> {"doc": {"settings": {"index": {
>     "uuid": "LiWHzE5uQrinYW1wW4E3nA",
>     "number_of_replicas": "0",
>     "number_of_shards": "5",
>     "refresh_interval": "-1",
>     "translog": {"disable_flush": "true"},
>     "version": {"created": "1020199"}
> }}}}
>
> mappings:
>
> {"doc": {"mappings": {"type": {
>     "dynamic_templates": [
>         {"t1": {"match": "*_ss", "mapping": {"type": "string", "store": false, "norms": {"enabled": false}}}},
>         {"t2": {"match": "*_dt", "mapping": {"type": "date", "store": false}}},
>         {"t3": {"match": "*_i", "mapping": {"type": "integer", "store": false}}}
>     ],
>     "_source": {"enabled": false},
>     "properties": {}
> }}}}
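The dynamic templates above route each field to a type by its name suffix. A small illustrative Python sketch of that routing logic (not ES code, just a model of what the templates express):

```python
import fnmatch

# Suffix-to-type routing mirroring the t1/t2/t3 dynamic templates above.
DYNAMIC_TEMPLATES = [
    ("*_ss", "string"),   # t1: string fields, norms disabled, not stored
    ("*_dt", "date"),     # t2: date fields, not stored
    ("*_i",  "integer"),  # t3: integer fields, not stored
]

def resolve_field_type(field_name):
    """Return the mapped type for a field name, or None if no template matches."""
    for pattern, es_type in DYNAMIC_TEMPLATES:
        if fnmatch.fnmatch(field_name, pattern):
            return es_type
    return None
```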
>
> All fields in the documents match the templates in the mappings.
>
> Since I disabled flush & refresh, I submitted a flush command (followed by an 
> optimize command) from the client program every 10 seconds. (I also tried an 
> interval of 10 minutes and got similar results.)
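A sketch of that 10-second flush/optimize cycle against the ES 1.2.x REST API (`_optimize` is the 1.x endpoint name; the host and index name here are assumptions, not from the original message):

```python
import urllib.request

ES_BASE = "http://localhost:9200"  # hypothetical host
INDEX = "doc"                      # index name taken from the settings above

def maintenance_urls(base=ES_BASE, index=INDEX):
    """Build the flush and optimize endpoints hit every 10 seconds.
    (_optimize was renamed _forcemerge in later ES versions.)"""
    return [f"{base}/{index}/_flush", f"{base}/{index}/_optimize"]

def run_maintenance(base=ES_BASE, index=INDEX):
    """POST flush then optimize, mirroring the client program's cycle."""
    for url in maintenance_urls(base, index):
        req = urllib.request.Request(url, data=b"", method="POST")
        with urllib.request.urlopen(req) as resp:
            resp.read()
```

`run_maintenance` would be driven by the same kind of 10-second timer the client program uses.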
>
> Scenario 0 - 10k docs with 1,000 distinct fields:
> Ingestion took 12 secs. Only 1.08 GB of heap memory is used (this counts 
> used heap only).
>
>
> Scenario 1 - 10k docs with 10,000 distinct fields (10x the fields of 
> scenario 0):
> This time ingestion took 29 secs. Only 5.74 GB of heap memory is used.
>
> Not sure why the performance degrades sharply.
>
> If I try to ingest docs with 100k distinct fields, it takes 17 mins 44 secs. 
> We only have 10k docs in total, and I am not sure why ES performs so 
> badly. 
>
> Can anyone give suggestions to improve the performance?
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/79911a7f-4118-4421-bc2d-2284eccebd3f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
