[ 
https://issues.apache.org/jira/browse/SOLR-16108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marc Brette updated SOLR-16108:
-------------------------------
    Summary: Incorrect distribution of records in shards after a split with 
splitByKeyprefix, when using the CompositeId router with a router field defined 
 (was: Incorrect distribution of records in shards after a split with 
splitByKeyprefix,when using the CompositeId router with a router field defined)

> Incorrect distribution of records in shards after a split with 
> splitByKeyprefix, when using the CompositeId router with a router field 
> defined
> ----------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-16108
>                 URL: https://issues.apache.org/jira/browse/SOLR-16108
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: SolrCloud
>    Affects Versions: 8.4
>            Reporter: Marc Brette
>            Priority: Major
>
> When a collection is created using the CompositeId router with a router field 
> defined, one of its shards contains records that all share the same routing key, 
> and that shard is split with the splitByPrefix parameter (splitByPrefix=true), we 
> expect the records to be distributed roughly evenly between the two resulting shards.
> Instead, one shard contains no records and the other contains all of them.
> Steps to reproduce:
> {code:java}
> docker network create solr-network
> # run in one terminal
> docker run -it -h solr1 --name solr1 --net solr-network -p 18983:8983 solr:8.4 /opt/solr/bin/solr -c -f
> # run in another terminal
> docker run -it -h solr2 --name solr2 --net solr-network -p 28983:8983 solr:8.4 /opt/solr/bin/solr -c -f -z solr1:9983
> #-----------------------------------------------------------------------------------------------
> # Works, documents are split between the 2 shards
> # Create collection with default compositeId router, routing key in the id, only one shard
> curl --request GET \
>   --url 'http://localhost:18983/solr/admin/collections?action=CREATE&name=routing_by_id&numShards=1'
> # Create enough documents, they all have the same routing key (france!)
> for i in {0..100}
> do
>   curl --request POST \
>   --url http://localhost:18983/solr/routing_by_id/update/json/docs?commit=true \
>   --header 'Content-Type: application/json' \
>   --data "[{
>     \"id\": \"france\!${i}0\",
>     \"title_t\": \"hi\"
> }]"
> done
> # Check it is indexed correctly
> curl --request GET \
>   --url 'http://localhost:18983/solr/routing_by_id/select?q=*%3A*'
> # Split the shard
> curl --request GET \
>   --url 'http://localhost:18983/solr/admin/collections?action=SPLITSHARD&collection=routing_by_id&shard=shard1&splitByPrefix=true'
> # Check records in shard1_0 (~half of the documents there)
> curl --request GET \
>   --url 'http://localhost:18983/solr/routing_by_id/select?q=*%3A*&shards=shard1_0'
> # Check records in shard1_1 (~half of the documents there)
> curl --request GET \
>   --url 'http://localhost:18983/solr/routing_by_id/select?q=*%3A*&shards=shard1_1'
> #-----------------------------------------------------------------------------------------------
> # Fails, does not split documents in both shards
> # Create collection with default compositeId router, routing key in the field "route_t", only one shard
> curl --request GET \
>   --url 'http://localhost:18983/solr/admin/collections?action=CREATE&name=routing_by_field&numShards=1&router.field=route_t'
> # Create enough documents, they all have the same routing key (france, in the route_t field)
> for i in {0..100}
> do
>   curl --request POST \
>   --url http://localhost:18983/solr/routing_by_field/update/json/docs?commit=true \
>   --header 'Content-Type: application/json' \
>   --data "[{
>     \"id\": \"${i}0\",
>     \"title_t\": \"hi\",
>     \"route_t\": \"france\"
> }]"
> done
> # Check it is indexed correctly
> curl --request GET \
>   --url 'http://localhost:18983/solr/routing_by_field/select?q=*%3A*'
> # Split the shard
> curl --request GET \
>   --url 'http://localhost:18983/solr/admin/collections?action=SPLITSHARD&collection=routing_by_field&shard=shard1&splitByPrefix=true'
> # Check records in shard1_0: no document!
> curl --request GET \
>   --url 'http://localhost:18983/solr/routing_by_field/select?q=*%3A*&shards=shard1_0'
> # Check records in shard1_1: all documents!
> curl --request GET \
>   --url 'http://localhost:18983/solr/routing_by_field/select?q=*%3A*&shards=shard1_1'
>    {code}
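
The per-shard checks in the steps above return full result pages, so the comparison has to be done by eye. A minimal sketch of a count-only check follows. The helper names num_found and shard_count are hypothetical and not part of the original report; the sketch assumes the default JSON response format, which carries a "numFound" field, and reuses the host, port and shards parameter from the commands above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original report): read a Solr JSON
# select response on stdin and print the value of its first "numFound".
num_found() {
  grep -o '"numFound":[[:space:]]*[0-9][0-9]*' | head -n 1 | grep -o '[0-9][0-9]*$'
}

# Hypothetical helper: print the number of documents held by one shard.
# $1 = collection name, $2 = shard name. rows=0 asks Solr for the count only.
# Host and port are the ones used in the reproduction steps (assumption).
shard_count() {
  curl --silent \
    "http://localhost:18983/solr/$1/select?q=*%3A*&rows=0&shards=$2" | num_found
}

# Usage against the running cluster from the steps above:
#   shard_count routing_by_field shard1_0
#   shard_count routing_by_field shard1_1
```

With the behavior described in the report, the two counts for routing_by_id should be roughly equal, while for routing_by_field one of them should be 0.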



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
