[
https://issues.apache.org/jira/browse/SOLR-11287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16141571#comment-16141571
]
Vannia Rajan commented on SOLR-11287:
-------------------------------------
I figured out when this issue happens by observing the pattern with a small
set of data.
SPLITSHARD issues only a soft commit, so some of the index files are still not
fully written to disk. If I restart SOLR without first issuing an explicit
<commit />, the process is killed before the index directory is fully written.
On the next restart, the incomplete index is reset to 0 records and cleaned up.
I think we should update the documentation to let users know that they need to
issue a hard <commit /> immediately after a SPLITSHARD operation.
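The workaround above could be scripted roughly as follows. This is only a
sketch under assumptions: the base URL, collection name, and shard name are
placeholders, the helper function names are made up for illustration, and
whether the commit reaches every sub-shard core on a given version should be
verified rather than taken for granted.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Placeholder base URL (assumption); point this at one of your Solr nodes.
SOLR_BASE = "http://localhost:8983/solr"


def splitshard_url(collection, shard, base=SOLR_BASE):
    """Build the Collections API URL that splits one shard."""
    params = {"action": "SPLITSHARD", "collection": collection, "shard": shard}
    return f"{base}/admin/collections?{urlencode(params)}"


def hard_commit_url(collection, base=SOLR_BASE):
    """Build the update URL that requests an explicit (hard) commit."""
    return f"{base}/{collection}/update?{urlencode({'commit': 'true'})}"


def split_then_commit(collection, shard, base=SOLR_BASE, timeout=3600):
    """Run SPLITSHARD, then issue a hard commit before any restart.

    urlopen raises an HTTPError on a non-2xx response, so a failed split
    stops the sequence before the commit is attempted.
    """
    urls = (splitshard_url(collection, shard, base),
            hard_commit_url(collection, base))
    for url in urls:
        with urlopen(url, timeout=timeout) as response:
            response.read()
```

For example, split_then_commit("mycollection", "shard1") would split shard1 and
then commit; both names are hypothetical. The commit is sent to the collection
rather than to a single core, on the assumption that it gets distributed to
the newly active sub-shards.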
> Sub-shards by SPLITSHARD loses data on restarting SOLR
> ------------------------------------------------------
>
> Key: SOLR-11287
> URL: https://issues.apache.org/jira/browse/SOLR-11287
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: SolrCloud
> Affects Versions: 5.5.1
> Environment: Ubuntu 64-bit 32-core server, 240GB RAM
> Reporter: Vannia Rajan
>
> We are running SOLR 5.5.1 with 4 nodes (1 shard per node). We are in the
> process of splitting the 4 shards into 8 shards.
> The SPLITSHARD collections API works great - it creates and activates the
> sub-shards, and inactivates the parent shard upon completion. The row counts
> of the parent shard and the sub-shards match. However, the data in the
> sub-shards doesn't seem to be persistent in our case.
> A restart of SOLR leaves the sub-shards with 0 documents, and their data
> directory sizes drop from 40+ GB to 71KB.
> If I'm missing any other steps to be followed after SPLITSHARD to make the
> data in sub-shards persistent, please let me know. Otherwise, I feel this may
> be a bug in v5.5.1.
> Note: I was able to get back the lost data by manually setting the state of
> the parent to "active" and of the children with 0 documents to "inactive" in
> /collections/{collection}/state.json in ZooKeeper.