Re: Solrcloud Index corruption

Martin de Vries Tue, 10 Mar 2015 04:36:01 -0700

Hi,

this _sounds_ like you somehow don't have indexed="true" set for the
field in question.

We investigated a lot more. The CheckIndex tool didn't find any errors.
We now think the following happened:

- We changed the schema two months ago: we changed a field to
  indexed="true". We reloaded the cores, but two of them don't seem to
  have been reloaded (maybe we forgot).
- We reindexed all content. The new field worked fine.
- We think the leader changed to a server that didn't reload the core.
- After that the field stopped working for newly indexed documents.

Thanks for your help.


Martin




Erick Erickson wrote on 06.03.2015 17:02:

bq: You say in our case some docs didn't make it to the node, but
that's not really true: the docs can be found on the corrupted nodes
when I search on ID. The docs are also complete. The problem is that
the docs do not appear when I filter on certain fields

this _sounds_ like you somehow don't have indexed="true" set for the
field in question. But it also sounds like you're saying that search
on that field works on some nodes but not on others, I'm assuming
you're adding "&distrib=false" to verify this. It shouldn't be
possible to have different schema.xml files on the different nodes,
but you might try checking through the admin UI.
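
A minimal sketch of that per-node check, assuming a standard /select
handler. The host, core name, field, and value used below are
hypothetical and do not come from this thread. The function only builds
the URL; fetching it from each replica and comparing numFound is left
out.

```python
from urllib.parse import urlencode


def local_query_url(base_url, core, field, value):
    """Build a /select URL that asks one core only (distrib=false).

    base_url, core, field and value are illustrative placeholders.
    The value is passed through as-is, so it must not contain
    characters that need escaping in Solr query syntax.
    """
    params = {
        "q": "*:*",
        "fq": f"{field}:{value}",   # the filter that misbehaves on some nodes
        "rows": 0,                  # only the hit count matters here
        "distrib": "false",         # do not fan out to other shards/replicas
        "wt": "json",
    }
    return f"{base_url.rstrip('/')}/{core}/select?{urlencode(params)}"
```

Running the same URL against every replica of a shard and comparing the
reported hit counts would show which cores do not have the field
indexed.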

Network burps shouldn't be related here. If the content is stored,
then the info made it to Solr intact, so this issue shouldn't be
related to that.

Sounds like it may just be the bugs Mark is referencing, sorry I don't
have the JIRA numbers right off.

Best,
Erick

On Thu, Mar 5, 2015 at 4:46 PM, Shawn Heisey <apa...@elyograg.org> wrote:

On 3/5/2015 3:13 PM, Martin de Vries wrote:

I understand there is not a "master" in SolrCloud. In our case we use
haproxy as a load balancer for every request. So when indexing every
document will be sent to a different solr server, immediately after
each other. Maybe SolrCloud is not able to handle that correctly?

SolrCloud can handle that correctly, but currently sending index
updates to a core that is not the leader of the shard will incur a
significant performance hit, compared to always sending updates to the
correct core. A small performance penalty would be understandable,
because the request must be redirected, but what actually happens is a
much larger penalty than anyone expected. We have an issue in Jira to
investigate that performance issue and make it work as efficiently as
possible.

Indexing batches of documents is recommended, not sending one
document per update request.

General performance problems with Solr itself can lead to extremely odd
and unpredictable behavior from SolrCloud. Most often these kinds of
performance problems are related in some way to memory, either the java
heap or available memory in the system.

http://wiki.apache.org/solr/SolrPerformanceProblems [1]

Thanks,
Shawn
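
The batching advice above can be sketched as a small helper that groups
documents before they are sent. The batch size of 500 is an arbitrary
assumption, not a figure from this thread, and the actual update
request is deliberately left out.

```python
def batches(docs, size=500):
    """Yield lists of at most `size` documents from any iterable.

    `size` is an illustrative default; a suitable value depends on
    document size and available memory.
    """
    batch = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly shorter, batch.
        yield batch
```

Each yielded list would then go out as a single update request instead
of one request per document.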




Links:
------
[1] http://wiki.apache.org/solr/SolrPerformanceProblems
