Thanks, Shay, for the update on the perf improvements. Apart from using the default parameters, should we follow the guideline listed in
http://elasticsearch-users.115913.n3.nabble.com/Is-ES-es-index-store-type-memory-equivalent-to-Lucene-s-RAMDirectory-td4057417.html ?

Lucene supports using MMapDirectory during the (batch) indexing phase and then switching to an in-memory directory for queries to optimize search latency. Should we use the JVM system parameter -Des.index.store.type=memory? Isn't this equivalent to using RAMDirectory in Lucene for in-memory search queries?

Thanks
--Mahesh

On Saturday, July 5, 2014 8:46:59 AM UTC-7, kimchy wrote:
>
> Heya, I worked a bit on it, and 1.x (upcoming 1.3) now has some significant
> perf improvements for this case (including Lucene-level improvements
> that are in ES for now, but will be in the next Lucene version). Those include:
>
> 6648: https://github.com/elasticsearch/elasticsearch/pull/6648
> 6714: https://github.com/elasticsearch/elasticsearch/pull/6714
> 6707: https://github.com/elasticsearch/elasticsearch/pull/6707
>
> It would be interesting if you could run the tests again with the 1.x branch.
> Also, please use the default features in ES for now; do not disable flushing
> and such.
>
> On Friday, June 13, 2014 7:57:23 AM UTC+2, Maco Ma wrote:
>>
>> I am trying to measure the performance of ingesting documents that have lots
>> of fields.
>>
>> The latest elasticsearch 1.2.1:
>> Total docs count: 10k (a small set definitely)
>> ES_HEAP_SIZE: 48G
>> settings:
>>
>> {"doc":{"settings":{"index":{"uuid":"LiWHzE5uQrinYW1wW4E3nA","number_of_replicas":"0","translog":{"disable_flush":"true"},"number_of_shards":"5","refresh_interval":"-1","version":{"created":"1020199"}}}}}
>>
>> mappings:
>>
>> {"doc":{"mappings":{"type":{"dynamic_templates":[{"t1":{"mapping":{"store":false,"norms":{"enabled":false},"type":"string"},"match":"*_ss"}},{"t2":{"mapping":{"store":false,"type":"date"},"match":"*_dt"}},{"t3":{"mapping":{"store":false,"type":"integer"},"match":"*_i"}}],"_source":{"enabled":false},"properties":{}}}}}
>>
>> All fields in the documents match the templates in the mappings.
>>
>> Since I disabled flush and refresh, I submitted the flush command
>> (followed by the optimize command) from the client program every 10
>> seconds. (I also tried a 10-minute interval and got similar results.)
>>
>> Scenario 0 - 10k docs have 1000 different fields:
>> Ingestion took 12 secs. Only 1.08G of heap memory was used (this only states
>> the used heap memory).
>>
>> Scenario 1 - 10k docs have 10k different fields (10 times the fields of
>> scenario 0):
>> This time ingestion took 29 secs. Only 5.74G of heap memory was used.
>>
>> Not sure why the performance degrades so sharply.
>>
>> If I try to ingest docs having 100k different fields, it takes 17
>> mins 44 secs. We only have 10k docs in total, and I am not sure why ES
>> performs so badly.
>>
>> Can anyone suggest how to improve the performance?
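
To put the timings reported in the quoted message side by side, here is a rough sketch that converts them to seconds and compares each tenfold increase in the number of distinct fields. The three data points are taken from the message; the dictionary layout and the helper name slowdown_factors are only illustrative, and this says nothing about where the time is actually spent.

```python
# Timings reported in the quoted message:
# number of distinct fields -> ingestion time in seconds (10k docs each).
# 17 mins 44 secs = 17 * 60 + 44 = 1064 seconds.
timings = {
    1_000: 12,
    10_000: 29,
    100_000: 17 * 60 + 44,
}


def slowdown_factors(timings):
    """Return (fields_from, fields_to, time_ratio) for consecutive measurements."""
    points = sorted(timings.items())
    return [
        (f0, f1, t1 / t0)
        for (f0, t0), (f1, t1) in zip(points, points[1:])
    ]


for f0, f1, ratio in slowdown_factors(timings):
    print(f"{f0} -> {f1} fields: {ratio:.1f}x slower")
```

Each step multiplies the field count by ten, yet ingestion time grows by roughly 2.4x in the first step and roughly 36.7x in the second, which is the sharp degradation the message describes.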
