Wow, awesome. I'll try that, thanks!

On Friday, April 24, 2015 at 2:17:45 PM UTC+3, christian...@elasticsearch.com wrote:
>
> Hi Eran,
>
> If you are assigning your own ID, Elasticsearch needs to search and check
> if the document already exists before writing it. This could explain why
> the bulk insert performance goes down as the size of the index grows. If
> you are not going to update the documents, I would therefore recommend
> allowing Elasticsearch to assign the document ID automatically.
>
> Best regards,
>
> Christian
>
> On Friday, April 24, 2015 at 7:49:56 AM UTC+1, Eran wrote:
>>
>> Hello,
>>
>> I've created an index I use for logging.
>>
>> This means there are mostly writes, and some searches once in a while.
>> During the initial load, I'm using several clients to
>> concurrently index documents using the bulk API.
>>
>> At first, indexing takes 200 ms for a bulk of 5000 documents.
>> As time goes by, the indexing time increases, and gets to 1000-4500 ms.
>>
>> I am using an EC2 c3.8xl machine with 32 cores and 60 GB of memory, with
>> a provisioned IOPS volume set to 7000 IOPS.
>>
>> Looking at the metrics, I see that the CPU and memory are fine, the write
>> IOPS are at 300, but the read IOPS have slowly gone up and reached 7000.
>>
>> How come I'm only indexing, but most of the IOPS are reads?
>>
>> I am attaching some screen captures from the BigDesk plugin that show
>> the two states of the index. About 20% of the way into the graphs is the
>> point in time where I stopped the clients, so you can see the load drop off.
>>
>> My settings are:
>>
>> threadpool.bulk.type: fixed
>> threadpool.bulk.size: 32 # availableProcessors
>> threadpool.bulk.queue_size: 1000
>>
>> # Indices settings
>> indices.memory.index_buffer_size: 50%
>> indices.cache.filter.expire: 6h
>>
>> bootstrap.mlockall: true
>>
>> and I've changed the index settings to:
>>
>> {"index":{"refresh_interval":"60m","translog":{"flush_threshold_size":"1gb","flush_threshold_ops":"50000"}}}
>>
>> I also tried "refresh_interval":"-1"
>>
>> Please let me know what else I need to provide (settings, logs,
>> metrics).
>>
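For anyone finding this thread later: Christian's suggestion comes down to leaving "_id" out of each bulk action line, so that Elasticsearch generates the ID itself. Below is a rough sketch of how the two kinds of bulk body differ. It only builds the newline-delimited JSON text that would be sent to the _bulk endpoint; it does not talk to a cluster. The index name "logs", the type "event", the field "event_id", and the helper name bulk_body are all made up for illustration and are not taken from Eran's setup.

```python
import json


def bulk_body(docs, index="logs", doc_type="event", id_field=None):
    """Build a newline-delimited JSON body for the _bulk endpoint.

    index, doc_type and id_field are hypothetical names used only for
    this illustration. When id_field is None, the action line carries
    no "_id", which leaves ID assignment to Elasticsearch.
    """
    lines = []
    for doc in docs:
        action = {"_index": index, "_type": doc_type}
        if id_field is not None:
            # Client-assigned ID: the case described above as requiring
            # a check for an existing document before the write.
            action["_id"] = doc[id_field]
        lines.append(json.dumps({"index": action}))  # action line
        lines.append(json.dumps(doc))                # document source line
    # The bulk body has to end with a newline.
    return "\n".join(lines) + "\n"


docs = [
    {"event_id": "a1", "msg": "started"},
    {"event_id": "a2", "msg": "stopped"},
]

auto_ids = bulk_body(docs)                      # no "_id" in the action lines
own_ids = bulk_body(docs, id_field="event_id")  # "_id" set by the client

print(auto_ids)
print(own_ids)
```

If the documents are never updated, as with append-only logs, the first form should be enough. The second form is only needed when the client has to address a document by a known ID later on.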