So, here is the problem that I am trying to solve. I am moving from a Solr
master-slave architecture to a SolrCloud architecture. I have one custom Solr
plugin that does the following:

1. When a document (say, a document with unique id doc1) is being indexed to
a core, say core A, this plugin adds one more field to the indexing
request. It fetches this new field from core B. Core B in our case
maintains a popularity score field for each document, which is calculated in
a different project. The plugin fetches the popularity score for doc1 from
core B and adds it to the indexing request.
2. In the following code, dataInfo.dataSource is the name of core B.

I can use the name of core B, like collection_shard1_replica_n21, and it
works. But it is not a good solution. What if I had multiple shards for
core B? In that case the doc1 that I am trying to find might not be
present in collection_shard1_replica_n21.

So is there something like this?

SolrCollection dataCollection = getCollection(dataInfo.dataSource);

@Override
public void processAdd(AddUpdateCommand cmd) throws IOException {
   SolrInputDocument doc = cmd.getSolrInputDocument();
   String uniqueId = getUniqueId(doc);

   // getCore() increments the core's reference count, so the core has to
   // be closed again once we are done with it.
   SolrCore dataCore =
       req.getCore().getCoreContainer().getCore(dataInfo.dataSource);

   if (dataCore == null) {
       LOG.error("Solr core '{}' to use as data source could not be found!  "
               + "Please check if it is loaded.", dataInfo.dataSource);
   } else {
       try {
           // look up the same document in the data-source core
           Document sourceDoc = getSourceDocument(dataCore, uniqueId);

           if (sourceDoc != null) {
               populateDocToBeAddedFromSourceDoc(doc, sourceDoc);
           }
       } finally {
           dataCore.close();
       }
   }

   // pass it up the chain
   super.processAdd(cmd);
}
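
As a rough illustration of the "put the ID through the hashing mechanism"
advice quoted below, here is a minimal, self-contained Java sketch. It
assumes the default compositeId router, a plain ID without the "!" routing
prefix, and a hypothetical two-shard collection whose hash ranges are
80000000-ffffffff and 0-7fffffff. The names ShardLookupSketch, murmur3 and
shardFor are made up for this example and are not Solr APIs. In a real
plugin the router and ranges would come from the cluster state rather than
being hard-coded.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustration only: plain-ID routing for a hypothetical two-shard collection. */
class ShardLookupSketch {

    // Assumed hash ranges (inclusive, compared as signed ints).
    static final Map<String, int[]> RANGES = new LinkedHashMap<>();
    static {
        RANGES.put("shard1", new int[] {0x80000000, 0xffffffff});
        RANGES.put("shard2", new int[] {0x00000000, 0x7fffffff});
    }

    /** MurmurHash3 x86 32-bit over the UTF-8 bytes of the id, seed 0. */
    static int murmur3(String id) {
        byte[] data = id.getBytes(StandardCharsets.UTF_8);
        final int c1 = 0xcc9e2d51;
        final int c2 = 0x1b873593;
        int h1 = 0;                          // seed
        int roundedEnd = data.length & ~3;   // length of the whole 4-byte blocks

        for (int i = 0; i < roundedEnd; i += 4) {
            // read one little-endian 4-byte block
            int k1 = (data[i] & 0xff) | ((data[i + 1] & 0xff) << 8)
                   | ((data[i + 2] & 0xff) << 16) | (data[i + 3] << 24);
            k1 *= c1;
            k1 = Integer.rotateLeft(k1, 15);
            k1 *= c2;
            h1 ^= k1;
            h1 = Integer.rotateLeft(h1, 13);
            h1 = h1 * 5 + 0xe6546b64;
        }

        // mix in the 1 to 3 trailing bytes, if any (cases fall through on purpose)
        int k1 = 0;
        switch (data.length & 3) {
            case 3: k1 ^= (data[roundedEnd + 2] & 0xff) << 16;
            case 2: k1 ^= (data[roundedEnd + 1] & 0xff) << 8;
            case 1: k1 ^= data[roundedEnd] & 0xff;
                    k1 *= c1;
                    k1 = Integer.rotateLeft(k1, 15);
                    k1 *= c2;
                    h1 ^= k1;
        }

        // finalization
        h1 ^= data.length;
        h1 ^= h1 >>> 16;
        h1 *= 0x85ebca6b;
        h1 ^= h1 >>> 13;
        h1 *= 0xc2b2ae35;
        h1 ^= h1 >>> 16;
        return h1;
    }

    /** Returns the name of the assumed shard whose range contains the id's hash. */
    static String shardFor(String id) {
        int hash = murmur3(id);
        for (Map.Entry<String, int[]> e : RANGES.entrySet()) {
            int[] range = e.getValue();
            if (range[0] <= hash && hash <= range[1]) {
                return e.getKey();
            }
        }
        throw new IllegalStateException("no shard range covers hash " + hash);
    }

    public static void main(String[] args) {
        System.out.println("doc1 -> " + shardFor("doc1"));
    }
}
```

Knowing the shard is only half of the answer: a replica of that shard may
not be hosted on the local node at all, in which case the local
CoreContainer cannot return it and the lookup has to go over the network
(for example through SolrJ) instead.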


On Wed, Aug 28, 2019 at 6:15 PM Erick Erickson <erickerick...@gmail.com>
wrote:

> No, you cannot just use the collection name. Replicas are just cores.
> You can host many replicas of a single collection on a single Solr node
> in a single CoreContainer (there’s only one per Solr JVM). If you just
> specified a collection name how would the code have any clue which
> of the possibilities to return?
>
> The name is in the form collection_shard1_replica_n21
>
> How do you know where the doc you’re working on lives? Put the ID through
> the hashing mechanism.
>
> This isn’t the same at all if you’re running stand-alone; then there’s only
> one name.
>
> But as I indicated above, your ask for just using the collection name isn’t
> going to work by definition.
>
> So perhaps this is an XY problem. You’re asking about getCore, which is
> a very specific, low-level concept. What are you trying to do at a higher
> level? Why do you think you need to get a core? What do you want to _do_
> with the doc that you need the core it resides in?
>
> Best,
> Erick
>
> > On Aug 28, 2019, at 5:28 PM, Arnold Bronley <arnoldbron...@gmail.com>
> wrote:
> >
> > Wait, would I need to use a core name like collection1_shard1_replica_n4,
> > etc.? Can't I use the collection name? If I have multiple shards, how
> > would I know which one the document I am working with currently lives
> > in?
> > I would prefer to use the collection name and have the core
> > information abstracted away.
> >
> > On Wed, Aug 28, 2019 at 5:13 PM Erick Erickson <erickerick...@gmail.com>
> > wrote:
> >
> >> Hmmm, should work. What is your core_name? There are strings like
> >> collection1_shard1_replica_n4 and core_node6. Are you sure you’re using
> >> the right one?
> >>
> >>> On Aug 28, 2019, at 3:56 PM, Arnold Bronley <arnoldbron...@gmail.com>
> >> wrote:
> >>>
> >>> Hi,
> >>>
> >>> In custom Solr plugin code,
> >>> req.getCore().getCoreContainer().getCore(core_name) returns null even
> >>> though a core named core_name is loaded and up in Solr. req is an object
> >>> of the SolrQueryRequest class. I am using Solr 8.2.0 in SolrCloud mode.
> >>>
> >>> Any ideas on why this might be the case?
> >>
> >>
>
>
