Hi Srikanth,

You are correct: in a NiFi cluster the intent is to schedule GetSolr on
the primary node only (set on the Scheduling tab), so that only one node
in the cluster extracts data.

GetSolr determines which SolrJ client to use based on the "Solr Type"
property, so if you select "Cloud" it will use CloudSolrClient. That client
sends the query to one Solr node, chosen from the cluster state in
ZooKeeper, and that node then performs the distributed query across the
shards.

Did you have a specific use case where you wanted to query each shard
individually?

I think it would be straightforward to expose a property on GetSolr that
sets "distrib=false" on the query, so that Solr does not execute a
distributed query. You would then most likely create separate instances of
GetSolr, configure each as the Standard type, and point each one at its
respective shard. Let us know if that is something you are interested in.
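
To make the idea concrete, here is a rough sketch in Python of what
per-shard querying looks like at the HTTP level. It is not how GetSolr is
implemented, and it only builds the request URLs without sending them. The
host names, core names, and the shard_query_url helper are all made up for
illustration; the one real piece is Solr's "distrib=false" request
parameter, which tells the receiving core to answer from its own index
instead of fanning out to the other shards.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def shard_query_url(core_url, query="*:*", rows=10):
    """Build a /select URL that asks a single Solr core for local results only.

    shard_query_url is a hypothetical helper, not part of NiFi or SolrJ.
    """
    params = {
        "q": query,
        "rows": rows,
        "wt": "json",
        # distrib=false: do not run a distributed query across shards.
        "distrib": "false",
    }
    return core_url.rstrip("/") + "/select?" + urlencode(params)


# Hypothetical core URLs, one per shard of a 4-shard collection.
SHARD_CORES = [
    "http://solr1:8983/solr/collection1_shard1_replica1",
    "http://solr2:8983/solr/collection1_shard2_replica1",
    "http://solr3:8983/solr/collection1_shard3_replica1",
    "http://solr4:8983/solr/collection1_shard4_replica1",
]

# One URL per shard, analogous to one Standard-type GetSolr per shard.
shard_urls = [shard_query_url(core) for core in SHARD_CORES]

for url in shard_urls:
    print(url)
```

Each URL plays the role of one Standard-type GetSolr instance pointed at a
single shard; taken together, the four result sets cover the whole
collection without any node doing a distributed query.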

Thanks,

Bryan


On Sun, Aug 30, 2015 at 7:32 PM, Srikanth <srikanth...@gmail.com> wrote:

> Hello,
>
> I started to explore the NiFi project a few days back. I'm still trying
> it out.
>
> I have a few basic questions on GetSolr.
>
> Should GetSolr be run as an Isolated Processor?
>
> If I have SolrCloud with 4 shards/nodes and NiFi cluster with 4 nodes,
> will GetSolr be able to query each shard from one specific NiFi node? I'm
> guessing it doesn't work that way.
>
>
> Thanks,
> Srikanth
>
>
