RE: .skip.autorecovery=Y + restart solr after crash + losing many documents

2013-05-23 Thread Gilles Comeau
Hi Otis, 

Thank you for your reply.  I'm in the middle of that upgrade and will report 
back when testing is complete.   I'd like to get some nice set of reproducible 
steps so I'm not just ranting on. :)   

Regards,

Gilles

-----Original Message-----
From: Otis Gospodnetic [mailto:otis.gospodne...@gmail.com] 
Sent: 20 May 2013 04:29
To: solr-user@lucene.apache.org
Subject: Re: .skip.autorecovery=Y + restart solr after crash + losing many 
documents

Hi Gilles,

Could you upgrade to 4.3.0 and see if you can reproduce?

Otis
--
Solr & ElasticSearch Support
http://sematext.com/





On Mon, May 13, 2013 at 5:26 PM, Gilles Comeau gilles.com...@polecat.co wrote:
 Hi all,

 We write to two same-named cores in the same collection for redundancy, and 
 are not taking advantage of the full benefits of SolrCloud replication.

 We use solrcloud.skip.autorecovery=true so that Solr doesn't try to sync the 
 indexes when it starts up.
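 (For reference, the property is passed as a JVM system property at startup, 
 roughly like the sketch below; the start command and paths are illustrative, 
 not our actual production script:)

```shell
# Illustrative only: pass solrcloud.skip.autorecovery as a JVM system
# property so Solr skips index sync/recovery on startup.
java -Dsolrcloud.skip.autorecovery=true -jar start.jar
```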

 However, we find that if the core was not optimized before Solr went down 
 (as in a crash situation), we can lose all of the data after starting up.   The 
 files are written to disk, but we can lose a full 24 hours' worth of data, as 
 it is all removed when we start Solr.   (I don't think it is a commit issue.)
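 (To rule out uncommitted documents, we issue an explicit hard commit before 
 shutdown, along these lines; the host, port, and core name "collection1" are 
 placeholders for our setup:)

```shell
# Force a hard commit so all buffered documents are flushed to the index
# files on disk before Solr is stopped.
curl 'http://localhost:8983/solr/collection1/update?commit=true'
```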

 If we optimize before shutting down, we never lose any data.   Sadly, 
 sometimes Solr is in a state where optimizing is not an option.
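 (When we can, we optimize through the update handler before shutting down, 
 something like the following; again the URL and core name are placeholders:)

```shell
# Merge the index down to a single segment before shutdown.
curl 'http://localhost:8983/solr/collection1/update?optimize=true'
```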

 Can anyone think of why that might be?   Is there any special configuration 
 you need if you want to write directly to two cores rather than use 
 replication?   We are on version 4.0: this used to work in our 4.0 nightly 
 build, but broke when we migrated to the 4.0 production release.   (This is 
 only until we test and migrate to the replication setup - it won't be too 
 long, and I'm a bit embarrassed to be asking this question!)

 Regards,

 Gilles


