[ https://issues.apache.org/jira/browse/SOLR-3366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255827#comment-13255827 ]
James Dyer commented on SOLR-3366:
----------------------------------
I don't see how this would be related to DIH. Even with "clean=true", DIH
doesn't commit the deletes until the entire import is complete. So, as you
say, we should expect to lose only the changes from the current import, not
the entire index.
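
For reference, the request under discussion can be sketched as below. This is
only an illustration: the host, port, and "/dataimport" handler path are
assumptions, and the "command" and "clean" parameters are the ones quoted in
the issue description.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class DihRequest {
    // Builds the URL for a DIH full-import request.
    // handlerUrl is a hypothetical endpoint; adjust it to the actual deployment.
    public static String fullImportUrl(String handlerUrl, boolean clean)
            throws UnsupportedEncodingException {
        return handlerUrl
                + "?command=" + URLEncoder.encode("full-import", "UTF-8")
                + "&clean=" + clean;
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Assumed host, port, and handler path, for illustration only.
        System.out.println(
                fullImportUrl("http://localhost:8983/solr/dataimport", false));
    }
}
```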

I wonder if this is a side effect of using replication. Sometimes replication
copies an entire new index to the slaves in a new directory, then records this
new directory in "index.properties". On restart, Solr reads "index.properties"
to find the appropriate index directory. If this file had been changed or
removed, Solr may have restarted, failed to find the correct directory, and
then created a new, empty index. Of course, this would have affected only the
slaves.
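
The lookup described above can be approximated with a small standalone check.
This is a rough sketch, not Solr's actual code: it assumes that
"index.properties" sits in the data directory, that the directory name is
stored under the key "index", and that the fallback directory is named "index".
The class and method names are made up for illustration. Running it against the
master's data directory would show which directory a restart would pick up.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

public class IndexDirResolver {
    // Returns the index directory that a restart would be expected to open.
    // Assumption: index.properties holds the directory name under the key
    // "index"; if the file or key is missing, fall back to "<dataDir>/index".
    public static File resolveIndexDir(File dataDir) throws IOException {
        File propsFile = new File(dataDir, "index.properties");
        if (propsFile.isFile()) {
            Properties props = new Properties();
            try (InputStream in = new FileInputStream(propsFile)) {
                props.load(in);
            }
            String name = props.getProperty("index");
            if (name != null && !name.trim().isEmpty()) {
                return new File(dataDir, name.trim());
            }
        }
        // File missing or key empty: use the default directory name.
        return new File(dataDir, "index");
    }

    public static void main(String[] args) throws IOException {
        // The data directory path is taken from the command line, else "data".
        File dataDir = new File(args.length > 0 ? args[0] : "data");
        System.out.println("Index directory: " + resolveIndexDir(dataDir));
    }
}
```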

I vaguely remember a bug some releases back where index corruption could occur
if the system was shut down ungracefully, and I see you're on 3.4. But then
again, my memory may be failing me, because I didn't see this in the release
notes.
> Restart of Solr during data import causes an empty index to be generated on
> restart
> -----------------------------------------------------------------------------------
>
> Key: SOLR-3366
> URL: https://issues.apache.org/jira/browse/SOLR-3366
> Project: Solr
> Issue Type: Bug
> Components: contrib - DataImportHandler, replication (java)
> Affects Versions: 3.4
> Reporter: Kevin Osborn
>
> We use the DataImportHandler and Java replication in a fairly simple setup of
> a single master and 4 slaves. We had an operating index of about 16,000
> documents. The DataImportHandler is pulled periodically by an external
> service using the "command=full-import&clean=false" command for a delta
> import.
> While processing one of these commands, we did a deployment which required us
> to restart the application server (Tomcat 7). So, the import was interrupted.
> Prior to this deployment, the full index of 16,000 documents had been
> replicated to all slaves and was working correctly.
> Upon restart, the master restarted with an empty index and then this empty
> index was replicated across all slaves. So, our search index was now empty.
> My expected behavior was to lose any changes in the delta import (basically
> prior to the commit). However, I was not expecting to lose all data. Perhaps
> this is due to the fact that I am using the full-import method, even though
> it is really a delta, for performance reasons? Or does the data import just
> put the index in some sort of invalid state?