Re: indexing data to solrcloud with "implicit" is not distributing across cluster.
As I understand it, adding distrib=false to a select query restricts the search to a particular shard. What I still need to know is how to route documents to one particular shard at index time.

Thanks,
Hemanth

--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
Re: indexing data to solrcloud with "implicit" is not distributing across cluster.
I suspect that when you create your collections, you're somehow not doing it the way you expect. The red flag is:

> I tried creating a collection with compositeId routing which created
> shard1,shard2,shard3 , but when I indexed , all the documents went to
> one shard only

This simply shouldn't be happening. What is your evidence that all the docs went to one shard? You can tell by adding &distrib=false to your query and sending it to a particular core, something like:

solr_server/solr/collection1_shard1_replica1/query?q=*:*&distrib=false

Best,
Erick

On Mon, Dec 25, 2017 at 4:15 AM, hemanth wrote:
> So , Please let me know, how to route the documents to a particular shard
> based on composite id or implicit mechanism, by using one of the existing
> field value or extracting the content of the field before ! parameter.
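A minimal sketch of the per-shard check Erick describes, in Python. It only builds the URLs; the host, port, and core names are hypothetical placeholders (the real core names are listed in the Solr admin UI), and it assumes the default /select handler rather than /query.

```python
from urllib.parse import urlencode


def shard_count_url(base_url: str, core: str) -> str:
    """Build a query URL that counts the documents held by one core only."""
    params = {
        "q": "*:*",          # match every document
        "rows": 0,           # only numFound is needed, not the documents
        "distrib": "false",  # do not fan the query out to the other shards
    }
    return f"{base_url}/{core}/select?{urlencode(params)}"


# Hypothetical core names for a three-shard collection.
cores = [
    "collection1_shard1_replica1",
    "collection1_shard2_replica1",
    "collection1_shard3_replica1",
]
urls = [shard_count_url("http://localhost:8983/solr", core) for core in cores]
for url in urls:
    print(url)
```

Fetching each URL and comparing the numFound values shows whether the documents really sit on a single shard.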
Re: indexing data to solrcloud with "implicit" is not distributing across cluster.
Hi Erick,

Thanks for your reply. I have no problem using either implicit or composite routing, but I want to insert documents into a particular shard, so that a query can hit just that shard and return results in less time.

For example, I am creating a collection with a status field whose values are Active, Inactive and Terminated. Assume my data is currently roughly evenly distributed: 400 Active records, 300 Inactive records and 300 Terminated records. I tried creating a collection with compositeId routing, which created shard1, shard2 and shard3, but when I indexed, all the documents went to one shard only. I also created a collection with the implicit routing mechanism, with shards named Active, Inactive and Terminated and status as the routing key. When I indexed the documents, again they all went to a single shard.

I want to route the documents based on an input value rather than on the hash of the field I specified, because two different values may hash to the same shard. So please let me know how to route documents to a particular shard, with either compositeId or implicit routing, by using the value of an existing field or by extracting the part of the field before the "!". For example, a field value of "Active!otherfieldvalue" should go to the Active shard, and a field value of "Inactive!othercontent" should go to the Inactive shard.

Thanks,
Hemanth

-Happy Christmas
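For reference, a minimal Python sketch of how the "!" prefix is normally used with the compositeId router. With compositeId, Solr hashes the part before the "!" to pick a shard, so documents sharing a prefix land together, but the prefix is not a shard name and two prefixes can still hash to the same shard. The helper names, host, and collection name below are illustrative assumptions.

```python
from urllib.parse import urlencode


def composite_id(route_key: str, doc_id: str) -> str:
    """Prefix a document ID with a routing key, separated by '!'."""
    if "!" in route_key:
        raise ValueError("routing key must not contain '!'")
    return f"{route_key}!{doc_id}"


def routed_query_url(base_url: str, collection: str, route_key: str, query: str = "*:*") -> str:
    """Build a query URL restricted to the shard that holds route_key."""
    params = {"q": query, "_route_": f"{route_key}!"}
    return f"{base_url}/{collection}/select?{urlencode(params)}"


doc_id = composite_id("Active", "1001")
url = routed_query_url("http://localhost:8983/solr", "myCollection", "Active")
print(doc_id)
print(url)
```

The _route_ value on the query is only a hint about which shard to search; adding a filter on the status field is still needed if other prefixes share that shard.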
Re: indexing data to solrcloud with "implicit" is not distributing across cluster.
You're misinterpreting the docs. _route_ is used to tell _queries_ where to go, or to route a document as part of the parameters when you send the doc; it is not a field in the doc. So when you added the _route_ field to the doc, it failed because you didn't have it in the schema in the first place.

You could add a _route_ field to your schema and work that way, but then you also have to define router.field=_route_ when you create the collection. I'd advise instead just specifying router.field=Status to avoid confusion.

Now, that said, I really question whether this is a good way to set up your collection. I'd just use compositeId, and when you want to restrict searches to one type or the other, add &fq=Status:Active or &fq=Status:Terminated. That way you can't forget to delete the doc from one shard or the other when the status changes, and you won't have lopsided doc counts on your shards because you have 10,000,000 active docs and 10 terminated docs. And whatever ratio you start with, it'll change as the collection ages.

FWIW,
Erick

On Fri, Dec 15, 2017 at 11:17 AM, hemanth wrote:
> I tried to insert _route_ field with the value as "Terminated" and when I
> try to insert the document , I am getting
>
> *unknown field '_route_' Error from server*. Am I trying in correct way?
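A minimal Python sketch of the two options Erick mentions: creating an implicit collection whose shard is chosen by router.field=Status, and the simpler alternative of filtering by status at query time. It only builds URLs; the host, collection name, and config name are assumptions for illustration.

```python
from urllib.parse import urlencode


def create_implicit_collection_url(base_url, name, shards, router_field, config_name):
    """Build a Collections API CREATE URL for an implicit-router collection."""
    params = {
        "action": "CREATE",
        "name": name,
        "router.name": "implicit",
        "shards": ",".join(shards),    # shard names are given explicitly
        "router.field": router_field,  # field whose value names the target shard
        "collection.configName": config_name,
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


def status_filter_url(base_url, collection, status):
    """Build a query URL that restricts results with a filter query on Status."""
    params = {"q": "*:*", "fq": f"Status:{status}"}
    return f"{base_url}/{collection}/select?{urlencode(params)}"


create_url = create_implicit_collection_url(
    "http://localhost:8983/solr", "myCollection",
    ["Active", "Terminated"], "Status", "myConfig",  # hypothetical names
)
filter_url = status_filter_url("http://localhost:8983/solr", "myCollection", "Active")
print(create_url)
print(filter_url)
```

With the first option, every document must carry a Status value that matches one of the shard names; with the second, the collection can stay on compositeId and no document has to move when its status changes.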
Re: indexing data to solrcloud with "implicit" is not distributing across cluster.
I created a collection with the implicit routing mechanism, and my shard names are Active and Disabled; these are values of one of my collection's fields, Status. But when I upload a document through the Documents section of the Solr UI, in JSON format with all the fields including Status set to either Terminated or Active, it goes to only one default shard.

I tried to insert a _route_ field with the value "Terminated", and when I try to insert the document I get:

*unknown field '_route_' Error from server*

Am I doing this the correct way? Does implicit routing work on the hash of the routing field, rather than sending the document to the shard named by the field's value?

I want a document with Status: Active to be stored in the myCollection_Active shard and a document with Status: Terminated in the myCollection_Terminated shard, automatically, based on the value of the Status field in the document. I used implicit routing while creating the collection and gave the shard names as Active,Terminated. Please help. I am using Solr 6.6.
Re: indexing data to solrcloud with "implicit" is not distributing across cluster.
Did you try setting the "magic" field _route_ in your docs to the shard? Something like doc.addField("_route_", "shard1")?

Best,
Erick

On Wed, Jun 15, 2016 at 10:31 AM, nikosmarinos wrote:
> Is it possible to give an example? I want doc1 to be explicitly routed to
> "shard1" of my "implicit" collection and doc2 to "shard4". How can I do
> that?
Re: indexing data to solrcloud with "implicit" is not distributing across cluster.
Is it possible to give an example? I want doc1 to be explicitly routed to "shard1" of my "implicit" collection and doc2 to "shard4". How can I do that?

Creating an implicit collection with one of the example configurations of the solr package, defining the "id" field as the router.field (not sure if necessary) and indexing id:shard1 id:shard2 id:shard3 takes all documents to the same (random) shard.

--
View this message in context: http://lucene.472066.n3.nabble.com/indexing-data-to-solrcloud-with-implicit-is-not-distributing-across-cluster-tp4232956p4282428.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: indexing data to solrcloud with "implicit" is not distributing across cluster.
On 10/6/2015 7:58 AM, Steve wrote:
> I’ve been unable to get solrcloud to distribute data across 4 solr nodes
> with the “route.name=implicit” feature of the collections API.
>
> The nodes are live, and the graphs are green. All the data (the “Films”
> example data) shows up on one node, the node that received the CREATE
> command.

A better name for the implicit router is "manual." The implicit router doesn't actually route. It assumes that you know what you are doing and have sent the request to the shard where you want it to be indexed.

You want the compositeId router.

Even though the name "implicit" makes sense in the context of Solr *code*, it is a confusing name when it comes to user expectations. You're not the first one to be confused by this, which is why I opened this issue:

https://issues.apache.org/jira/browse/SOLR-6630

Thanks,
Shawn
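A minimal Python sketch of a CREATE request that uses the compositeId router, as Shawn suggests. This is not a reconstruction of Steve's exact command: the host and the values for replicationFactor and maxShardsPerNode are assumptions chosen for a four-node cluster, and maxShardsPerNode is a parameter accepted by the 5.x and 6.x releases discussed in this thread.

```python
from urllib.parse import urlencode


def create_composite_collection_url(base_url, name, num_shards, replication_factor,
                                    max_shards_per_node, config_name):
    """Build a Collections API CREATE URL for a hash-routed collection."""
    params = {
        "action": "CREATE",
        "name": name,
        "router.name": "compositeId",  # documents are spread by hash of the ID
        "numShards": num_shards,       # Solr names the shards itself
        "replicationFactor": replication_factor,
        "maxShardsPerNode": max_shards_per_node,
        "collection.configName": config_name,
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Assumed values: 4 shards x 2 replicas spread over 4 nodes.
url = create_composite_collection_url(
    "http://localhost:8983/solr", "CollectionFilms", 4, 2, 2, "configAlpha"
)
print(url)
```

The main difference from an implicit collection is that numShards replaces the explicit shards list, so no per-document routing information is required at index time.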
Re: indexing data to solrcloud with "implicit" is not distributing across cluster.
On 10/6/2015 10:02 AM, Steve wrote:
> Thanks Shawn, that fixed it !
>
> The documentation in the Collections API says "The value can be ...
> *implicit*, which uses an internal default hash".

Thank you for pointing out this error in the documentation. I did not know it was there. I have updated the online Reference Guide so it is correct. Hopefully this will help clear up any confusion!

https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-CreateaCollection

Thanks,
Shawn
Re: indexing data to solrcloud with "implicit" is not distributing across cluster.
: The documentation in the Collections API says "The value can be ...
: *implicit*, which uses an internal default hash".
: I think most people would assume the "hash" would be used to route the
: data.
: Meanwhile the description of compositeId in the "Document Routing" section
: only discusses how to modify your document IDs, which I did not want to do.

Hmmm... I'm guessing you are looking at a PDF copy of the ref guide? Pretty sure that was a mistake that's already been fixed.

At the moment the Collections API CREATE command says...

https://cwiki.apache.org/confluence/display/solr/Collections+API

"The 'implicit' router does not automatically route documents to different shards. Whichever shard you indicate on the indexing request (or within each document) will be used as the destination for those documents"

And the details on document routing say...

https://cwiki.apache.org/confluence/display/solr/Shards+and+Indexing+Data+in+SolrCloud#ShardsandIndexingDatainSolrCloud-DocumentRouting

"If you created the collection and defined the "implicit" router at the time of creation, you can additionally define a router.field parameter to use a field from each document to identify a shard where the document belongs. If the field specified is missing in the document, however, the document will be rejected. You could also use the _route_ parameter to name a specific shard."

...which i believe is all accurate.

-Hoss
http://www.lucidworks.com/
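A minimal Python sketch of the _route_ request parameter mentioned in the quoted documentation, for an implicit collection. It groups documents by their intended shard and builds one update request per shard; nothing is sent. The host, collection name, and helper name are hypothetical, and the sketch assumes the JSON update handler at /update.

```python
import json
from collections import defaultdict
from urllib.parse import urlencode


def routed_update_requests(base_url, collection, docs_with_shard):
    """Group (shard, doc) pairs by shard and build one (url, body) per shard."""
    by_shard = defaultdict(list)
    for shard, doc in docs_with_shard:
        by_shard[shard].append(doc)

    requests = []
    for shard, docs in by_shard.items():
        # _route_ is a request parameter here, not a field in the documents.
        params = {"_route_": shard, "commit": "true"}
        url = f"{base_url}/{collection}/update?{urlencode(params)}"
        requests.append((url, json.dumps(docs)))
    return requests


requests = routed_update_requests(
    "http://localhost:8983/solr", "myImplicitCollection",
    [("shard1", {"id": "doc1"}), ("shard4", {"id": "doc2"})],
)
for url, body in requests:
    print(url, body)
```

Each body would be POSTed with Content-Type: application/json. Because the shard is named on the request rather than inside the document, no _route_ field has to exist in the schema.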
indexing data to solrcloud with "implicit" is not distributing across cluster.
I’ve been unable to get solrcloud to distribute data across 4 solr nodes with the “route.name=implicit” feature of the collections API.

The nodes are live, and the graphs are green. All the data (the “Films” example data) shows up on one node, the node that received the CREATE command.

My CREATE command is:

curl http://host-192-168-0-60.openstacklocal:8081/solr/admin/collections?action=CREATE=CollectionFilms=2=implicit=shard-1,shard-2,shard-3,shard-4=2=configAlpha

solr version 5.3.1
zookeeper version 3.4.6

indexing with:

cd /opt/solr/example/films; /opt/solr/bin/post -c CollectionFilms -port 8081 films.json

Thanks,
strick
Re: indexing data to solrcloud with "implicit" is not distributing across cluster.
Thanks Shawn, that fixed it!

The documentation in the Collections API says "The value can be ... *implicit*, which uses an internal default hash". I think most people would assume the "hash" would be used to route the data.

Meanwhile, the description of compositeId in the "Document Routing" section only discusses how to modify your document IDs, which I did not want to do.

thanks again,
.strick

On Tue, Oct 6, 2015 at 8:15 AM, Shawn Heisey wrote:
> A better name for the implicit router is "manual." The implicit router
> doesn't actually route. It assumes that you know what you are doing and
> have sent the request to the shard where you want it to be indexed.
>
> You want the compositeId router.