[ https://issues.apache.org/jira/browse/CONNECTORS-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16651942#comment-16651942 ]
Steph van Schalkwyk commented on CONNECTORS-1546:
-------------------------------------------------

Hans is correct. I would remove it. It can interfere with later merging if not used correctly, and it may also take a long time to complete.
I'm going to upload a patch or two soon and will remove it if you concur.

BTW, from the ES 6.4 doc:
"Force merge should only be called against *read-only indices*. Running force merge against a read-write index can cause very large segments to be produced (>5Gb per segment), and the merge policy +*will never consider it for merging again until it mostly consists of deleted docs*+. This can cause very large segments to remain in the shards."

But I agree. It isn't up to MCF to decide what to do, as it does impact ingesting.

Hans may want to try this before ingesting:
PUT /_cluster/settings
{"transient" : {"indices.store.throttle.type" : "none" }}
and after ingesting:
PUT /_cluster/settings
{"transient" : {"indices.store.throttle.type" : "merge" }}

> Optimize Elasticsearch performance by removing 'forcemerge'
> -----------------------------------------------------------
>
>                 Key: CONNECTORS-1546
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1546
>             Project: ManifoldCF
>          Issue Type: Improvement
>          Components: Elastic Search connector
>            Reporter: Hans Van Goethem
>            Assignee: Steph van Schalkwyk
>            Priority: Major
>
> After crawling with ManifoldCF, forcemerge is applied to optimize the
> Elasticsearch index. This optimization makes Elasticsearch faster for
> read operations but not for write operations. On the contrary, performance on
> write operations becomes worse after every forcemerge.
> Can you remove this forcemerge in ManifoldCF to optimize performance for
> recurrent crawling to Elasticsearch?
> If someone needs this forcemerge, it can be applied manually against
> Elasticsearch directly.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
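
For anyone who wants to script the two settings calls suggested in the comment above, here is a minimal Python sketch. It only builds the requests and does not send them. The host `http://localhost:9200` and the helper name `throttle_request` are assumptions made for illustration, not part of ManifoldCF or of the original comment. Whether a given Elasticsearch version still accepts `indices.store.throttle.type` is also an assumption, so check it against the target cluster first.

```python
import json
import urllib.request

# Assumption: a local Elasticsearch node; change this for a real cluster.
ES_URL = "http://localhost:9200"


def throttle_request(throttle_type, base_url=ES_URL):
    """Build (but do not send) the PUT /_cluster/settings request.

    throttle_type is "none" (before ingesting) or "merge" (after ingesting),
    matching the two values quoted in the comment.
    """
    if throttle_type not in ("none", "merge"):
        raise ValueError("throttle_type must be 'none' or 'merge'")
    body = {"transient": {"indices.store.throttle.type": throttle_type}}
    return urllib.request.Request(
        url=base_url + "/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


before_ingest = throttle_request("none")
after_ingest = throttle_request("merge")

# To actually apply a setting against a reachable cluster:
#     urllib.request.urlopen(before_ingest)
```

Keeping the request construction separate from the network call makes it easy to inspect the body before anything touches the cluster.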