[ https://issues.apache.org/jira/browse/SOLR-3954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13477462#comment-13477462 ]
Shawn Heisey commented on SOLR-3954:
------------------------------------
bq. In any case, I don't think we would add an option to skip the update log -
you can remove it if the performance is unacceptable.
When I revamp my SolrJ application, I plan to use soft commit on a very short
interval (maybe 10 seconds) but only do a hard commit every five minutes,
possibly even less often.
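A minimal sketch of that commit cadence, assuming the client drives both kinds
of commit itself. CommitPlanner, Action and next() are hypothetical names used
only for illustration; they are not part of SolrJ, and the mapping of each
action onto an actual SolrJ call is deliberately left out.

```java
// Hypothetical helper: decides which kind of commit is due at a given time.
final class CommitPlanner {
    enum Action { NONE, SOFT, HARD }

    private final long softIntervalMs;
    private final long hardIntervalMs;
    private long lastSoftMs;
    private long lastHardMs;

    CommitPlanner(long softIntervalMs, long hardIntervalMs, long startMs) {
        this.softIntervalMs = softIntervalMs;
        this.hardIntervalMs = hardIntervalMs;
        this.lastSoftMs = startMs;
        this.lastHardMs = startMs;
    }

    // Returns the commit that is due at nowMs and records that it was issued.
    Action next(long nowMs) {
        if (nowMs - lastHardMs >= hardIntervalMs) {
            // Assumption: a hard commit also restarts the soft commit timer.
            lastHardMs = nowMs;
            lastSoftMs = nowMs;
            return Action.HARD;
        }
        if (nowMs - lastSoftMs >= softIntervalMs) {
            lastSoftMs = nowMs;
            return Action.SOFT;
        }
        return Action.NONE;
    }
}
```

With new CommitPlanner(10000, 300000, 0), polling once per second would yield
SOFT every 10 seconds and HARD every five minutes. Letting a hard commit reset
the soft timer is a design choice of this sketch, not something Solr requires.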
If I understand the updateLog functionality correctly, and I don't claim that I
do, my SolrJ code would not need to keep separate track of which updates have
only been soft committed and which have been hard committed.
If the server went down four minutes and 55 seconds after the last hard commit,
I would have a reasonable expectation that when it came back up, all of the
updates covered only by soft commits would be properly applied to my index.
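The expectation above can be illustrated with a toy model. This is a sketch of
the behavior as described in this comment, not Solr's implementation:
UpdateLogModel and all of its methods are hypothetical, and plain in-memory
lists stand in for the index and the log.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: plain lists stand in for the index and the update log.
final class UpdateLogModel {
    private final boolean logEnabled;
    // Documents made durable by a hard commit; these survive a crash.
    private final List<String> durableIndex = new ArrayList<>();
    // Documents added since the last hard commit; lost on a crash.
    private final List<String> pending = new ArrayList<>();
    // Log of updates since the last hard commit; assumed to survive a crash.
    private final List<String> updateLog = new ArrayList<>();

    UpdateLogModel(boolean logEnabled) {
        this.logEnabled = logEnabled;
    }

    void add(String doc) {
        pending.add(doc);
        if (logEnabled) {
            updateLog.add(doc);
        }
    }

    // In this model a soft commit affects visibility only, not durability.
    void softCommit() {
    }

    void hardCommit() {
        durableIndex.addAll(pending);
        pending.clear();
        updateLog.clear();
    }

    // Simulates a crash followed by a restart that replays the log.
    List<String> crashAndRestart() {
        pending.clear();
        durableIndex.addAll(updateLog);
        updateLog.clear();
        return new ArrayList<>(durableIndex);
    }
}
```

In this model, documents added after the last hard commit come back after a
restart only when the log is enabled; without it, only the hard-committed
documents remain.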
Assuming my understanding above is correct, I want the updateLog for my
incremental updates. However, it makes the bulk import take at least twice as
long, and I do not need it there, because if the import fails I will just start
it over. To benefit from the updateLog, I need to be able to turn it off for
bulk indexing.
Is there a way to create a second updateHandler that does not have updateLog
enabled and tell DIH to use that handler?
> Option to have updateHandler and DIH skip updateLog
> ---------------------------------------------------
>
> Key: SOLR-3954
> URL: https://issues.apache.org/jira/browse/SOLR-3954
> Project: Solr
> Issue Type: Improvement
> Components: update
> Affects Versions: 4.0
> Reporter: Shawn Heisey
> Fix For: 4.1
>
>
> The updateLog feature makes updates take longer, likely because of the I/O
> time required to write the additional information to disk. It may take as
> much as three times as long for the indexing portion of the process. I'm not
> sure whether it affects the time to commit, but I would imagine that the
> difference there is small or zero. When doing incremental updates/deletes on
> an existing index, the time lag is probably very small and unimportant.
> When doing a full reindex (which may happen via DIH), especially if this is
> done in a build core that is then swapped with a live core, this performance
> hit is unacceptable. It seems to make the import take about three times as
> long.
> An option to have an update skip the updateLog would be very useful for these
> situations. It should have a method in SolrJ and be exposed in DIH as well.