I have set the replica count to 0 and the queue size to 50, and it can now index about 7-8 million documents per hour. That's acceptable, though I don't know which change made the difference.
Thank you all.

On Monday, January 13, 2014 at 9:04:35 PM UTC+8, Eric Lu wrote:
>
> I observed that GC occurred once every 15 seconds when heap usage was at
> 75% of the heap size. Is that too frequent? There are no OOMs.
>
> I set the refresh interval to 30s.
>
> I'll try a smaller queue and set replicas to 0.
>
> Thank you.
>
> On Monday, January 13, 2014 at 8:42:56 PM UTC+8, Jörg Prante wrote:
>>
>> 12 hours is an absurdly long time for indexing 10 million docs.
>>
>> queue:1000 is much too high for production. For testing it may be OK (it
>> effectively disables queue rejections), but in production you risk
>> starving your cluster of resources.
>>
>> Do you monitor the resource usage of ES, especially the heap? Is GC
>> starving your cluster? Do you see OOMs?
>>
>> Do you evaluate the bulk responses for errors? Do you throttle bulk
>> request concurrency?
>>
>> Do you set the refresh interval to -1?
>>
>> Hint: if 5 nodes is your maximum, you can also bulk index with 5 shards
>> and replica level 0; after the bulk load, you can increase the replica
>> level to 1.
>>
>> Jörg
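
For anyone finding this thread later, here is a minimal sketch of the hint above: drop replicas to 0 and disable refresh before the bulk load, then restore them afterwards. It only builds the request bodies for the index settings endpoint (`PUT /<index>/_settings`) and sends them with the Python standard library. The function names, the base URL, the index name `myindex`, and the post-load values (1 replica, `1s` refresh) are assumptions for illustration, not something taken from the setup discussed in this thread.

```python
import json
import urllib.request


def bulk_load_settings():
    """Settings to apply before a large bulk load (hypothetical helper)."""
    # No replicas and no periodic refresh while indexing.
    return {"index": {"number_of_replicas": 0, "refresh_interval": "-1"}}


def post_load_settings(replicas=1, refresh_interval="1s"):
    """Settings to restore once the bulk load has finished (hypothetical helper)."""
    # The defaults here are assumptions; use whatever the index needs in production.
    return {
        "index": {
            "number_of_replicas": replicas,
            "refresh_interval": refresh_interval,
        }
    }


def put_index_settings(base_url, index, settings):
    """Send the settings to PUT <base_url>/<index>/_settings and return the parsed reply."""
    request = urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=json.dumps(settings).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, assuming a node reachable at http://localhost:9200 and an index "myindex":
#   put_index_settings("http://localhost:9200", "myindex", bulk_load_settings())
#   ... run the bulk indexing ...
#   put_index_settings("http://localhost:9200", "myindex", post_load_settings())
```

Restoring the replica count after the load means the replicas are built by copying finished shards rather than by indexing every document twice, which is the point of the hint.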
