Thanks, Shay, for the update on the perf improvements.
Apart from using the default parameters, should we also follow the guidelines 
listed in this thread?

http://elasticsearch-users.115913.n3.nabble.com/Is-ES-es-index-store-type-memory-equivalent-to-Lucene-s-RAMDirectory-td4057417.html
 

Lucene supports using MMapDirectory during the (batch) data indexing phase 
and then switching to an in-memory directory for queries to optimize search 
latency.

Should we use the JVM system parameter -Des.index.store.type=memory? Isn't 
this equivalent to using RAMDirectory in Lucene for in-memory search 
queries?
Thanks
--Mahesh

On Saturday, July 5, 2014 8:46:59 AM UTC-7, kimchy wrote:
>
> Heya, I worked a bit on it, and 1.x (upcoming 1.3) now has some significant 
> perf improvements for this case (including Lucene-level improvements that 
> are in ES for now, but will be in the next Lucene version). Those include:
>
> 6648: https://github.com/elasticsearch/elasticsearch/pull/6648
> 6714: https://github.com/elasticsearch/elasticsearch/pull/6714
> 6707: https://github.com/elasticsearch/elasticsearch/pull/6707
>
> It would be interesting if you could run the tests again with the 1.x 
> branch. Also, please use the default features in ES for now: no disabling 
> of flushing and such.
>
> On Friday, June 13, 2014 7:57:23 AM UTC+2, Maco Ma wrote:
>>
>> I am trying to measure the performance of ingesting documents that have 
>> lots of fields.
>>
>>
>> The latest elasticsearch 1.2.1:
>> Total docs count: 10k (a small set definitely)
>> ES_HEAP_SIZE: 48G
>> settings:
>>
>> {"doc":{"settings":{"index":{"uuid":"LiWHzE5uQrinYW1wW4E3nA","number_of_replicas":"0","translog":{"disable_flush":"true"},"number_of_shards":"5","refresh_interval":"-1","version":{"created":"1020199"}}}}}
>>
>> mappings:
>>
>> {"doc":{"mappings":{"type":{"dynamic_templates":[{"t1":{"mapping":{"store":false,"norms":{"enabled":false},"type":"string"},"match":"*_ss"}},{"t2":{"mapping":{"store":false,"type":"date"},"match":"*_dt"}},{"t3":{"mapping":{"store":false,"type":"integer"},"match":"*_i"}}],"_source":{"enabled":false},"properties":{}}}}}
>>
>> All fields in the documents match the templates in the mappings.
>>
>> Since I disabled flush & refresh, I submitted the flush command 
>> (followed by the optimize command) from the client program every 10 
>> seconds. (I also tried a 10-minute interval and got similar results.)
>>
>> Scenario 0 - 10k docs with 1000 different fields:
>> Ingestion took 12 secs.  Only 1.08G of heap memory was used (this figure 
>> covers used heap memory only).
>>
>>
>> Scenario 1 - 10k docs with 10k different fields (10 times as many fields 
>> as scenario 0):
>> This time ingestion took 29 secs.  Only 5.74G of heap memory was used.
>>
>> I am not sure why the performance degrades so sharply.
>>
>> If I try to ingest docs with 100k different fields, it takes 17 
>> mins 44 secs.  We only have 10k docs in total, and I am not sure why ES 
>> performs so badly. 
>>
>> Can anyone suggest how to improve the performance?
>>
>>
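For anyone re-running the benchmark quoted above, here is a rough sketch of how documents with a fixed number of distinct fields could be generated and packed into a bulk request body. The index name "doc" and type name "type" come from the quoted settings and mappings; everything else (the helper names, the f<N> field naming scheme, the sample values, and the way fields are spread across documents) is an assumption made for illustration, since the original post does not say how the test data was built.

```python
import json

# Suffixes matching the dynamic templates in the quoted mappings.
SUFFIXES = ("_ss", "_dt", "_i")


def field_name(index):
    """Hypothetical naming scheme: f0_ss, f1_dt, f2_i, f3_ss, ..."""
    return "f{}{}".format(index, SUFFIXES[index % len(SUFFIXES)])


def field_value(index, doc_id):
    """Return a sample value whose type fits the field's suffix."""
    suffix = SUFFIXES[index % len(SUFFIXES)]
    if suffix == "_ss":
        return "value-{}".format(doc_id)
    if suffix == "_dt":
        return "2014-06-13T00:00:00Z"  # ISO 8601 date string
    return doc_id  # integer field


def make_doc(doc_id, total_fields, fields_per_doc):
    """Build one document from a sliding window over the field pool.

    Assumption: each document carries fields_per_doc fields, and
    consecutive documents take consecutive windows (wrapping around),
    so the whole pool of total_fields names is covered once
    num_docs * fields_per_doc >= total_fields.
    """
    start = (doc_id * fields_per_doc) % total_fields
    doc = {}
    for k in range(fields_per_doc):
        idx = (start + k) % total_fields
        doc[field_name(idx)] = field_value(idx, doc_id)
    return doc


def bulk_body(docs, index="doc", doc_type="type"):
    """Serialize docs as newline-delimited bulk actions (action, then source)."""
    lines = []
    for doc_id, doc in enumerate(docs):
        action = {"index": {"_index": index, "_type": doc_type, "_id": str(doc_id)}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # the bulk format ends with a newline
```

Varying total_fields (1000, 10k, 100k) while keeping the document count fixed reproduces the shape of the three scenarios, although the absolute timings will depend on how the fields are actually distributed.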
