[jira] [Commented] (SOLR-6266) Couchbase plug-in for Solr

Andrzej Bialecki (JIRA) Tue, 23 Sep 2014 02:19:49 -0700

    [ https://issues.apache.org/jira/browse/SOLR-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14144590#comment-14144590 ]


Andrzej Bialecki  commented on SOLR-6266:
-----------------------------------------

Hi Joel, Karol and I work together, so I thought I'd chime in.

bq. We'll need to also figure out where to place the CAPIServer so there is 
only one per node.
I don't think Solr has such a place for "global" components yet; the only 
special component that is global is the CoreAdminHandler. It would be a nice 
feature, but it's outside the scope of this issue.

So, if you can't have what you like you have to like what you have ;) This 
means that for now the only option is to run an instance of CAPIServer per 
collection.

bq. From my understanding the CAPIServer is listening on an ip/port. Couchbase 
can be configured to replicate a bucket to a specific host and port.

Karol is now working on using the Couchbase REST API to configure Couchbase 
automatically to send docs to whichever instance of CAPIServer is active. This 
will eliminate the need for manual configuration on the Couchbase end, and 
will allow us to re-target the replication to any other instance that becomes 
active, should the current instance of CAPIServer disappear.

Regarding running CAPIServers on all replicas: with the auto-configuration 
mechanism described above this isn't needed. It's enough to activate a single 
instance per collection, e.g. always on the first shard's leader. If this 
node goes down, another leader will be elected and the CAPIServer instance will 
activate there and register itself with Couchbase.
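
To make the fail-over idea concrete, here is a minimal sketch in plain Python 
(it is not Solr code). The function name active_capi_node, the node names, the 
choice of "first shard" as the alphabetically first shard name, and the rule 
"leader = first live replica in the list" are all assumptions made for 
illustration; real leader election in SolrCloud goes through ZooKeeper.

```python
from typing import Dict, List, Optional, Set


def active_capi_node(shard_replicas: Dict[str, List[str]],
                     live_nodes: Set[str]) -> Optional[str]:
    """Pick the node that should run the single active CAPIServer.

    Assumption: the "first shard" is the alphabetically first shard name,
    and its leader is modeled as the first live replica in its list.
    """
    if not shard_replicas:
        return None
    first_shard = sorted(shard_replicas)[0]
    for node in shard_replicas[first_shard]:
        if node in live_nodes:
            return node
    # No live replica left for the first shard: nobody can be active.
    return None


# Hypothetical two-shard collection with two replicas per shard.
replicas = {"shard1": ["nodeA", "nodeB"], "shard2": ["nodeC", "nodeD"]}
live = {"nodeA", "nodeB", "nodeC", "nodeD"}

before_failure = active_capi_node(replicas, live)
live.discard("nodeA")  # the current leader of shard1 goes down
after_failure = active_capi_node(replicas, live)
```

In this toy model the active instance moves from nodeA to nodeB once nodeA 
disappears; in the real plug-in that move would be followed by re-registering 
the new instance with Couchbase.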

Couchbase always sends all changes for a bucket to a replica, so if you had in 
mind an optimization where each shard would get only its own documents, it 
wouldn't work: every CAPIServer would receive all documents anyway and would 
have to discard (N-1)/N of them, where N is the number of shards. This would 
only create heavier load on Couchbase and Solr.
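
The (N-1)/N figure can be checked with a small, self-contained sketch. 
Everything here is hypothetical: shard_for uses zlib.crc32 as a stand-in for 
Solr's actual document router, and kept_and_discarded models a per-shard 
CAPIServer that receives the whole bucket and keeps only its own documents.

```python
import zlib
from typing import List, Tuple


def shard_for(doc_id: str, num_shards: int) -> int:
    """Stand-in router (assumption): stable hash of the id modulo shard count."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def kept_and_discarded(doc_ids: List[str], my_shard: int,
                       num_shards: int) -> Tuple[int, int]:
    """Count the docs a per-shard listener keeps vs. throws away
    when it is sent every document in the bucket."""
    kept = sum(1 for d in doc_ids if shard_for(d, num_shards) == my_shard)
    return kept, len(doc_ids) - kept


def total_discarded(doc_ids: List[str], num_shards: int) -> int:
    """Sum of discarded docs over all per-shard listeners."""
    return sum(kept_and_discarded(doc_ids, s, num_shards)[1]
               for s in range(num_shards))
```

Because every document belongs to exactly one shard, N listeners together 
receive N copies of the bucket and discard N-1 of them, regardless of how 
evenly the hash spreads the ids.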

If we ran multiple active CAPIServer-s on replicas it wouldn't work right 
either - copies of the same documents would be received multiple times, and 
while they would be correctly re-routed to the right shards, each shard would 
receive multiple copies and the ordering would be non-deterministic - not so 
important for adds but crucial for a mix of adds / deletes.
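
A tiny sketch of why ordering matters for adds / deletes but not for repeated 
adds. The dict-based "index" and the apply_updates helper are assumptions for 
illustration only, and the sketch does not model how the reordering would 
actually arise between multiple CAPIServer-s.

```python
from typing import Dict, List, Optional, Tuple

Update = Tuple[str, str, Optional[str]]  # (operation, doc id, body)


def apply_updates(updates: List[Update]) -> Dict[str, str]:
    """Apply adds and deletes in the given order to a toy in-memory index."""
    index: Dict[str, str] = {}
    for op, doc_id, body in updates:
        if op == "add":
            index[doc_id] = body or ""
        elif op == "delete":
            index.pop(doc_id, None)  # deleting a missing doc is a no-op
    return index


add: Update = ("add", "doc1", "v1")
delete: Update = ("delete", "doc1", None)

duplicate_adds = apply_updates([add, add])        # same result as one add
add_then_delete = apply_updates([add, delete])    # doc ends up removed
delete_then_add = apply_updates([delete, add])    # doc ends up present
```

Receiving the same add twice leaves the index unchanged, while swapping an add 
and a delete for the same id flips whether the document exists at the end.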

> Couchbase plug-in for Solr
> --------------------------
>
>                 Key: SOLR-6266
>                 URL: https://issues.apache.org/jira/browse/SOLR-6266
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Varun
>            Assignee: Joel Bernstein
>         Attachments: solr-couchbase-plugin.tar.gz, 
> solr-couchbase-plugin.tar.gz
>
>
> It would be great if users could connect Couchbase and Solr so that updates 
> to Couchbase can automatically flow to Solr. Couchbase provides some very 
> nice API's which allow applications to mimic the behavior of a Couchbase 
> server so that it can receive updates via Couchbase's normal cross data 
> center replication (XDCR).
> One possible design for this is to create a CouchbaseLoader that extends 
> ContentStreamLoader. This new loader would embed the couchbase api's that 
> listen for incoming updates from couchbase, then marshal the couchbase 
> updates into the normal Solr update process. 
> Instead of marshaling couchbase updates into the normal Solr update process, 
> we could also embed a SolrJ client to relay the request through the http 
> interfaces. This may be necessary if we have to handle mapping couchbase 
> "buckets" to Solr collections on the Solr side. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

