Re: Cloud Solr 5.3.1 + 6.0.1 cannot delete documents

2016-06-07 Thread Erick Erickson
OK, let's see the code you're using, including how you open your solrClient,
how you commit, and how you show that the deleted doc is still there. This
should be translatable into a test case. Like I said, this is tested in the unit
tests, so it would be good to see the difference between the test case
and what you're doing.

Best,
Erick

On Sun, Jun 5, 2016 at 1:47 PM, Moritz Becker  wrote:
> I just checked the shards again (with =false) and it seems that
> I was mistaken, the document does *not* reside in _different_ shards -
> everything good in this respect.
>
> However, I still have the issue that deleteById does not work whereas
> deleteByQuery works. Specifically, the following line does *not* work:
>
> UpdateResponse response = solrClient.deleteById(collection, );
>
> And the following line works:
>
> UpdateResponse response = solrClient.deleteByQuery(collection, "id:" +
> );
>
> I do not touch/change any other code when switching between these two
> modes and in both scenarios I use CloudSolrClient.

Re: Cloud Solr 5.3.1 + 6.0.1 cannot delete documents

2016-06-05 Thread Moritz Becker
I just checked the shards again (with =false) and it seems that
I was mistaken, the document does *not* reside in _different_ shards -
everything good in this respect.

However, I still have the issue that deleteById does not work whereas
deleteByQuery works. Specifically, the following line does *not* work:

UpdateResponse response = solrClient.deleteById(collection, );

And the following line works:

UpdateResponse response = solrClient.deleteByQuery(collection, "id:" +
);

I do not touch/change any other code when switching between these two
modes and in both scenarios I use CloudSolrClient.
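
A side note on the deleteByQuery workaround above: building the query by
concatenating "id:" with a raw id only behaves as intended while the id
contains no query-syntax characters. SolrJ provides
ClientUtils.escapeQueryChars for this purpose. The sketch below is a
standalone imitation of that idea, not the SolrJ implementation; the class
name, the helper names and the exact character set are assumptions made for
illustration.

```java
/** Standalone sketch of escaping an id before using it in an "id:" query. */
final class IdQuerySketch {

    // Characters treated as query syntax here (an assumed set, modeled on
    // common Lucene/Solr query-parser special characters).
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&;/";

    /** Prefix every special or whitespace character with a backslash. */
    static String escape(String raw) {
        StringBuilder sb = new StringBuilder(raw.length() + 8);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (SPECIAL.indexOf(c) >= 0 || Character.isWhitespace(c)) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    /** Build the delete-by-query string for a single document id. */
    static String idQuery(String id) {
        return "id:" + escape(id);
    }
}
```

For a purely numeric id such as the 12535 seen in the log, escaping changes
nothing, so this does not explain the deleteById behaviour; it only makes
the workaround safe for ids that contain characters like ":" or "-".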

On 31.05.2016 at 05:32, Erick Erickson wrote:
> bq: I checked in the Solr Admin and noticed that the same document
> resided in both shards on the same node
>
> If this means two _different_ shards (as opposed to two replicas in
> the _same_ shard) showed the document, then that's the proverbial
> "smoking gun": somehow your setup isn't what you think it is. Perhaps
> you are somehow using implicit routing and routing the doc with the
> same ID to two different shards?
>
> Try querying each of your replicas with =false to see if the doc is
> somehow on two different shards. If so, I suspect that's the root of
> your problems, and figuring out _how_ that happened is the next step
> I'd recommend.
>
> As to why the raw URL deletes should work and CloudSolrClient doesn't:
> CloudSolrClient tries to send updates only to the shard that they
> should end up on. So if your routing is odd or you somehow have the
> same doc on two shards, the "wrong" shard wouldn't see the delete.
> There's some speculation here BTW, I didn't trace through the code...
>
> But this functionality is tested in the unit tests
> (CloudSolrClientTest.java), so I suspect it's something odd in your
> setup.
>
> Best,
> Erick

Re: Cloud Solr 5.3.1 + 6.0.1 cannot delete documents

2016-05-30 Thread Erick Erickson
bq: I checked in the Solr Admin and noticed that the same document
resided in both shards on the same node

If this means two _different_ shards (as opposed to two replicas in
the _same_ shard) showed the document, then that's the proverbial
"smoking gun": somehow your setup isn't what you think it is. Perhaps
you are somehow using implicit routing and routing the doc with the
same ID to two different shards?

Try querying each of your replicas with =false to see if the doc is
somehow on two different shards. If so, I suspect that's the root of
your problems, and figuring out _how_ that happened is the next step
I'd recommend.

As to why the raw URL deletes should work and CloudSolrClient doesn't:
CloudSolrClient tries to send updates only to the shard that they
should end up on. So if your routing is odd or you somehow have the
same doc on two shards, the "wrong" shard wouldn't see the delete.
There's some speculation here BTW, I didn't trace through the code...

But this functionality is tested in the unit tests
(CloudSolrClientTest.java), so I suspect it's something odd in your
setup.

Best,
Erick
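
The routing argument above can be pictured with a small toy model. This is
only a sketch of the idea that a delete sent to the one shard the router
picks will miss a copy that sits on another shard. It is not Solr code: the
class, its methods and the hash-modulo router are all hypothetical, and
Solr's real router hashes ids differently.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of routed vs. broadcast deletes; not Solr code. */
final class RoutedDeleteSketch {

    /** One set of document ids per shard. */
    private final List<Set<String>> shards = new ArrayList<>();

    RoutedDeleteSketch(int numShards) {
        for (int i = 0; i < numShards; i++) {
            shards.add(new HashSet<>());
        }
    }

    /** Hypothetical router: picks a shard from the id's hash. */
    int shardFor(String id) {
        return Math.floorMod(id.hashCode(), shards.size());
    }

    /** Index a document on an explicitly chosen shard, bypassing the router. */
    void addToShard(int shard, String id) {
        shards.get(shard).add(id);
    }

    /** Delete sent only to the shard the router picks for this id. */
    void deleteByIdRouted(String id) {
        shards.get(shardFor(id)).remove(id);
    }

    /** Delete sent to every shard. */
    void deleteEverywhere(String id) {
        for (Set<String> shard : shards) {
            shard.remove(id);
        }
    }

    /** True if any shard still holds the id. */
    boolean exists(String id) {
        for (Set<String> shard : shards) {
            if (shard.contains(id)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        RoutedDeleteSketch cluster = new RoutedDeleteSketch(2);
        String id = "12535";
        // Put the doc on the shard the router would NOT choose.
        cluster.addToShard(1 - cluster.shardFor(id), id);
        cluster.deleteByIdRouted(id);
        System.out.println("after routed delete, still exists: " + cluster.exists(id));
        cluster.deleteEverywhere(id);
        System.out.println("after broadcast delete, still exists: " + cluster.exists(id));
    }
}
```

In this model the routed delete succeeds silently, which matches the "no
error occurred" symptom, yet the misplaced copy survives until a delete
reaches every shard.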

On Mon, May 30, 2016 at 12:33 PM, Moritz Becker  wrote:
> Hi,
>
> I have the following issue:
> I initially started with a Solr 5.3.1 + ZooKeeper 3.4.6 cloud setup with 2 
> Solr nodes and one collection consisting of 2 shards and 2 replicas.
>
> I am accessing the cluster using the CloudSolrClient. When I tried to delete 
> a document, no error occurred, but after deletion and a subsequent commit the 
> document was still available via index queries.
> I checked in the Solr Admin and noticed that the same document resided in 
> both shards on the same node, which I thought was odd.
> The issue remained even after deleting the collection and recreating it.
>
> Then I tried upgrading to the latest Solr (6.0.1) with the same setup. Again, 
> I recreated the collection, but I still could not delete the documents. Here 
> is a log snippet of the deletion attempt of a single document:
>
> 
>
> 126023 INFO  (qtp12209492-16) [c:cc5363_dm_documentversion s:shard1 
> r:core_node4 x:cc5363_dm_documentversion_shard1_replica1] 
> o.a.s.u.p.LogUpdateProcessorFactory 
> [cc5363_dm_documentversion_shard1_replica1]  webapp=/solr path=/update 
> params={update.distrib=FROMLEADER=http://localhost:8983/solr/cc5363_dm_documentversion_shard1_replica2/=javabin=2}{delete=[12535
>  (-1535773473331216384)]} 0 16
> 126024 INFO  (commitScheduler-15-thread-1) [c:cc5363_dm_documentversion 
> s:shard1 r:core_node4 x:cc5363_dm_documentversion_shard1_replica1] 
> o.a.s.u.DirectUpdateHandler2 start 
> commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=true,prepareCommit=false}
> 126036 INFO  (commitScheduler-15-thread-1) [c:cc5363_dm_documentversion 
> s:shard1 r:core_node4 x:cc5363_dm_documentversion_shard1_replica1] 
> o.a.s.c.SolrCore SolrIndexSearcher has not changed - not re-opening: 
> org.apache.solr.search.SolrIndexSearcher
> 126038 INFO  (commitScheduler-15-thread-1) [c:cc5363_dm_documentversion 
> s:shard1 r:core_node4 x:cc5363_dm_documentversion_shard1_replica1] 
> o.a.s.u.DirectUpdateHandler2 end_commit_flush
> 126049 INFO  (qtp12209492-20) [c:cc5363_dm_documentversion s:shard2 
> r:core_node1 x:cc5363_dm_documentversion_shard2_replica1] 
> o.a.s.u.DirectUpdateHandler2 start 
> commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
> 126050 INFO  (qtp12209492-20) [c:cc5363_dm_documentversion s:shard2 
> r:core_node1 x:cc5363_dm_documentversion_shard2_replica1] 
> o.a.s.u.DirectUpdateHandler2 No uncommitted changes. Skipping IW.commit.
> 126051 INFO  (qtp12209492-19) [c:cc5363_dm_documentversion s:shard1 
> r:core_node4 x:cc5363_dm_documentversion_shard1_replica1] 
> o.a.s.u.DirectUpdateHandler2 start 
> commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
> 126054 INFO  (qtp12209492-20) [c:cc5363_dm_documentversion s:shard2 
> r:core_node1 x:cc5363_dm_documentversion_shard2_replica1] o.a.s.c.SolrCore 
> SolrIndexSearcher has not changed - not re-opening: 
> org.apache.solr.search.SolrIndexSearcher
> 126056 INFO  (qtp12209492-20) [c:cc5363_dm_documentversion s:shard2 
> r:core_node1 x:cc5363_dm_documentversion_shard2_replica1] 
> o.a.s.u.DirectUpdateHandler2 end_commit_flush
> 126055 INFO  (qtp12209492-19) [c:cc5363_dm_documentversion s:shard1 
> r:core_node4 x:cc5363_dm_documentversion_shard1_replica1] 
> o.a.s.u.DirectUpdateHandler2 No uncommitted changes. Skipping IW.commit.
> 126057 INFO  (qtp12209492-20) [c:cc5363_dm_documentversion s:shard2 
> r:core_node1 x:cc5363_dm_documentversion_shard2_replica1] 
> o.a.s.u.p.LogUpdateProcessorFactory 
> [cc5363_dm_documentversion_shard2_replica1]  webapp=/solr path=/update 
> 

Cloud Solr 5.3.1 + 6.0.1 cannot delete documents

2016-05-30 Thread Moritz Becker
Hi,
 
I have the following issue:
I initially started with a Solr 5.3.1 + ZooKeeper 3.4.6 cloud setup with 2 Solr 
nodes and one collection consisting of 2 shards and 2 replicas.

I am accessing the cluster using the CloudSolrClient. When I tried to delete a 
document, no error occurred, but after deletion and a subsequent commit the 
document was still available via index queries.
I checked in the Solr Admin and noticed that the same document resided in both 
shards on the same node, which I thought was odd.
The issue remained even after deleting the collection and recreating it.
 
Then I tried upgrading to the latest Solr (6.0.1) with the same setup. Again, I 
recreated the collection, but I still could not delete the documents. Here is a 
log snippet of the deletion attempt of a single document:
 


126023 INFO  (qtp12209492-16) [c:cc5363_dm_documentversion s:shard1 
r:core_node4 x:cc5363_dm_documentversion_shard1_replica1] 
o.a.s.u.p.LogUpdateProcessorFactory [cc5363_dm_documentversion_shard1_replica1] 
 webapp=/solr path=/update 
params={update.distrib=FROMLEADER=http://localhost:8983/solr/cc5363_dm_documentversion_shard1_replica2/=javabin=2}{delete=[12535
 (-1535773473331216384)]} 0 16
126024 INFO  (commitScheduler-15-thread-1) [c:cc5363_dm_documentversion 
s:shard1 r:core_node4 x:cc5363_dm_documentversion_shard1_replica1] 
o.a.s.u.DirectUpdateHandler2 start 
commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=true,prepareCommit=false}
126036 INFO  (commitScheduler-15-thread-1) [c:cc5363_dm_documentversion 
s:shard1 r:core_node4 x:cc5363_dm_documentversion_shard1_replica1] 
o.a.s.c.SolrCore SolrIndexSearcher has not changed - not re-opening: 
org.apache.solr.search.SolrIndexSearcher
126038 INFO  (commitScheduler-15-thread-1) [c:cc5363_dm_documentversion 
s:shard1 r:core_node4 x:cc5363_dm_documentversion_shard1_replica1] 
o.a.s.u.DirectUpdateHandler2 end_commit_flush
126049 INFO  (qtp12209492-20) [c:cc5363_dm_documentversion s:shard2 
r:core_node1 x:cc5363_dm_documentversion_shard2_replica1] 
o.a.s.u.DirectUpdateHandler2 start 
commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
126050 INFO  (qtp12209492-20) [c:cc5363_dm_documentversion s:shard2 
r:core_node1 x:cc5363_dm_documentversion_shard2_replica1] 
o.a.s.u.DirectUpdateHandler2 No uncommitted changes. Skipping IW.commit.
126051 INFO  (qtp12209492-19) [c:cc5363_dm_documentversion s:shard1 
r:core_node4 x:cc5363_dm_documentversion_shard1_replica1] 
o.a.s.u.DirectUpdateHandler2 start 
commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
126054 INFO  (qtp12209492-20) [c:cc5363_dm_documentversion s:shard2 
r:core_node1 x:cc5363_dm_documentversion_shard2_replica1] o.a.s.c.SolrCore 
SolrIndexSearcher has not changed - not re-opening: 
org.apache.solr.search.SolrIndexSearcher
126056 INFO  (qtp12209492-20) [c:cc5363_dm_documentversion s:shard2 
r:core_node1 x:cc5363_dm_documentversion_shard2_replica1] 
o.a.s.u.DirectUpdateHandler2 end_commit_flush
126055 INFO  (qtp12209492-19) [c:cc5363_dm_documentversion s:shard1 
r:core_node4 x:cc5363_dm_documentversion_shard1_replica1] 
o.a.s.u.DirectUpdateHandler2 No uncommitted changes. Skipping IW.commit.
126057 INFO  (qtp12209492-20) [c:cc5363_dm_documentversion s:shard2 
r:core_node1 x:cc5363_dm_documentversion_shard2_replica1] 
o.a.s.u.p.LogUpdateProcessorFactory [cc5363_dm_documentversion_shard2_replica1] 
 webapp=/solr path=/update 
params={update.distrib=FROMLEADER=true=true=true=false=http://localhost:8983/solr/cc5363_dm_documentversion_shard2_replica2/_end_point=true=javabin=2=false}{commit=}
 0 10
126059 INFO  (qtp12209492-19) [c:cc5363_dm_documentversion s:shard1 
r:core_node4 x:cc5363_dm_documentversion_shard1_replica1] o.a.s.c.SolrCore 
SolrIndexSearcher has not changed - not re-opening: 
org.apache.solr.search.SolrIndexSearcher
126063 INFO  (qtp12209492-19) [c:cc5363_dm_documentversion s:shard1 
r:core_node4 x:cc5363_dm_documentversion_shard1_replica1] 
o.a.s.u.DirectUpdateHandler2 end_commit_flush
126064 INFO  (qtp12209492-19) [c:cc5363_dm_documentversion s:shard1 
r:core_node4 x:cc5363_dm_documentversion_shard1_replica1] 
o.a.s.u.p.LogUpdateProcessorFactory [cc5363_dm_documentversion_shard1_replica1] 
 webapp=/solr path=/update 
params={update.distrib=FROMLEADER=true=true=true=false=http://localhost:8983/solr/cc5363_dm_documentversion_shard2_replica2/_end_point=true=javabin=2=false}{commit=}
 0 13

 
I used CloudSolrClient.deleteById(collection, id) to delete the document.
 
According to the logs, Solr thinks that nothing has changed and does not 
recreate the searcher, so I tried restarting the instances, but the document 
was still there.
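
Reading that conclusion off a wrapped log by eye is error-prone, so here is
a small sketch that counts, per core, the log records containing a given
marker such as "No uncommitted changes". It assumes records start with a
number followed by "INFO (" and that the core name appears as "x:<core>]",
as in the snippet above; the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Counts log records per core that contain a marker string. */
final class LogMarkerScan {

    // A record starts where a run of digits is followed by " INFO (".
    private static final Pattern RECORD_START =
            Pattern.compile("(?<!\\d)(?=\\d+ INFO \\()");

    // Core name as printed in the logging context, e.g. "x:some_core]".
    private static final Pattern CORE = Pattern.compile("x:([^\\]\\s]+)\\]");

    static Map<String, Integer> countByCore(String logText, String marker) {
        // Undo the mail client's line wrapping by collapsing all whitespace.
        String flat = logText.replaceAll("\\s+", " ").trim();
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String record : RECORD_START.split(flat)) {
            if (!record.contains(marker)) {
                continue;
            }
            Matcher m = CORE.matcher(record);
            if (m.find()) {
                counts.merge(m.group(1), 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

Calling countByCore(log, "No uncommitted changes") or
countByCore(log, "SolrIndexSearcher has not changed") on the pasted snippet
shows which cores treated the commit as a no-op. A marker that was itself
broken across lines with a hyphen or inside a word would not be found.
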
Finally, I was able to manually delete the document via the following request:
 
POST