[jira] [Commented] (SOLR-6266) Couchbase plug-in for Solr

2014-10-21 Thread Karol Abramczyk (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14178553#comment-14178553
 ] 

Karol Abramczyk commented on SOLR-6266:
---

I made this plugin's repository public. It's here: 
https://github.com/LucidWorks/solr-couchbase-plugin. I think it would be much 
more convenient to contribute new features there through pull requests.

 Couchbase plug-in for Solr
 --

 Key: SOLR-6266
 URL: https://issues.apache.org/jira/browse/SOLR-6266
 Project: Solr
  Issue Type: New Feature
Reporter: Varun
Assignee: Joel Bernstein
 Attachments: solr-couchbase-plugin-0.0.3-SNAPSHOT.tar.gz, 
 solr-couchbase-plugin-0.0.5-SNAPSHOT.tar.gz, 
 solr-couchbase-plugin-0.0.5.1-SNAPSHOT.tar.gz, solr-couchbase-plugin.tar.gz, 
 solr-couchbase-plugin.tar.gz


 It would be great if users could connect Couchbase and Solr so that updates 
 to Couchbase can automatically flow to Solr. Couchbase provides some very 
 nice API's which allow applications to mimic the behavior of a Couchbase 
 server so that it can receive updates via Couchbase's normal cross data 
 center replication (XDCR).
 One possible design for this is to create a CouchbaseLoader that extends 
 ContentStreamLoader. This new loader would embed the couchbase api's that 
 listen for incoming updates from couchbase, then marshal the couchbase 
 updates into the normal Solr update process. 
 Instead of marshaling couchbase updates into the normal Solr update process, 
 we could also embed a SolrJ client to relay the request through the http 
 interfaces. This may be necessary if we have to handle mapping couchbase 
 buckets to Solr collections on the Solr side. 
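
 As an illustration of the bucket-to-collection mapping mentioned above, here 
 is a minimal Java sketch. The class BucketCollectionMapper and its methods are 
 hypothetical; they are not part of Solr, SolrJ, or the Couchbase CAPI library. 
 Falling back to the bucket name when no mapping is configured is an assumption 
 made only for this example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: resolves which Solr collection receives a bucket's updates. */
final class BucketCollectionMapper {
    private final Map<String, String> bucketToCollection = new HashMap<>();

    /** Registers an explicit bucket -> collection mapping. */
    void map(String bucket, String collection) {
        bucketToCollection.put(bucket, collection);
    }

    /** Returns the mapped collection, or the bucket name itself if none is configured (assumed default). */
    String collectionFor(String bucket) {
        return bucketToCollection.getOrDefault(bucket, bucket);
    }
}
```

 Keeping this lookup on the Solr side is what would let either design (a 
 loader or an embedded SolrJ client) route an update by bucket name alone.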



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6266) Couchbase plug-in for Solr

2014-10-17 Thread Karol Abramczyk (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14175174#comment-14175174
 ] 

Karol Abramczyk commented on SOLR-6266:
---

Kwan-I Lee,

Handling replication failure hasn't been implemented yet in this plugin. I 
think the most important improvements now are:
* automatic XDCR configuration in Couchbase Server at plugin startup
* tests for current functionality

I have had difficulties with both of these improvements so far and haven't 
succeeded. However, once they are done, all following work should be much 
easier.

In the meantime I fixed some important bugs, so I am attaching the latest 
source code.






[jira] [Commented] (SOLR-6266) Couchbase plug-in for Solr

2014-10-17 Thread Kwan-I Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14175485#comment-14175485
 ] 

Kwan-I Lee commented on SOLR-6266:
--

Karol,

Thanks for the reply. 
I also tried automatic XDCR configuration in Couchbase Server by using the APIs 
they provide. The problem I encountered is that they don't have an API to 
switch the XDCR protocol (at least I didn't find one on their website). By 
default the protocol is set to version 2 upon creation, while we need version 1 
to make the plugin work. I'll try to talk with the Couchbase people to see if 
there's any way to make it work. 

Kwan




[jira] [Commented] (SOLR-6266) Couchbase plug-in for Solr

2014-10-16 Thread Kwan-I Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14173588#comment-14173588
 ] 

Kwan-I Lee commented on SOLR-6266:
--

Karol, Andrzej,

I'm interested in how this plugin handles different replication failure 
scenarios. Here is one of my tests:
1. Add some documents in Couchbase.
2. Activate this plugin, creating a remote cluster and a replication in 
Couchbase. 
- Data is successfully pushed to Solr through XDCR. The Couchbase documents are 
now visible in Solr. 
3. Stop the Solr instance. Add a document, doc1, in Couchbase.
4. Restart the Solr instance and activate the plugin.

With the Elasticsearch-Couchbase plugin, doc1 is pushed to the Elasticsearch 
node once the machine is back. With this plugin, however, the replication of 
doc1 fails and the document never reaches the Solr instance. 

I spent some time debugging and tracing both the Elasticsearch and the Solr 
plugin code. In the Elasticsearch plugin, 
couchbase.capi.servlet.ClusterMapServlet.doGet() eventually gets the correct 
pool from req.getPathInfo(). In the Solr plugin, however, req.getPathInfo() 
keeps returning null for the pool no matter how many times Couchbase sends the 
doc1 update request to the plugin. 
I'm testing on a Mac, so I'm not sure whether this happens on other systems.
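
For context, HttpServletRequest.getPathInfo() returns null whenever the request 
URL carries no extra path after the servlet path, so the result can depend on 
how the servlet is mapped in the container. The sketch below is not the actual 
ClusterMapServlet code; PoolPath and poolFromPathInfo are hypothetical names 
showing one null-safe way to pull a pool name out of such a value.

```java
import java.util.Optional;

/** Hypothetical helper: extracts the first path segment (the pool name) from servlet path info. */
final class PoolPath {
    static Optional<String> poolFromPathInfo(String pathInfo) {
        // getPathInfo() yields null when there is no extra path information.
        if (pathInfo == null) {
            return Optional.empty();
        }
        String trimmed = pathInfo.startsWith("/") ? pathInfo.substring(1) : pathInfo;
        int slash = trimmed.indexOf('/');
        String first = slash >= 0 ? trimmed.substring(0, slash) : trimmed;
        return first.isEmpty() ? Optional.empty() : Optional.of(first);
    }
}
```

Logging the raw path info, servlet path, and request URI for the failing 
request would show whether the pool segment is missing from the URL or is being 
consumed by the servlet mapping.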




[jira] [Commented] (SOLR-6266) Couchbase plug-in for Solr

2014-10-02 Thread Karol Abramczyk (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14156376#comment-14156376
 ] 

Karol Abramczyk commented on SOLR-6266:
---

commons-io-2.4 is required by the couchbase-capi-server project used in this 
plugin.




[jira] [Commented] (SOLR-6266) Couchbase plug-in for Solr

2014-09-26 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14149105#comment-14149105
 ] 

Joel Bernstein commented on SOLR-6266:
--

Karol,

Thanks for confirming the couchbase behavior. Let's see if we can reach 
consensus on the SolrCloud design. Here is what I propose:

1) I agree that having a CAPIServer per-collection makes the most sense for the 
reasons that Andrzej laid out. 

2) Based on your findings, let's go with the simplest plan of having 
CAPIServers listening on all replicas. This will also be the most robust 
scenario and allow Couchbase to replicate in multiple threads to multiple nodes 
simultaneously.

3) In CouchbaseBehaviour.getNodesServingPool(), let's auto-discover the running 
CAPIServers from the existing ZooKeeper state information. To keep things 
simple we could return all active replicas in the collection. Or we could ping 
each active node on the CAPIServer port to make sure the CAPIServer is running.

Let's also not keep any extra book-keeping in Zookeeper unless we absolutely 
have to for this ticket.
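
A rough sketch of the simple variant in point 3, returning every active replica 
as a CAPIServer address, could look like this. ReplicaInfo and 
CapiNodeDiscovery are hypothetical stand-ins for whatever the plugin reads from 
the cluster state, not SolrCloud classes, and the port number used in any call 
is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical view of one replica taken from the cluster state. */
final class ReplicaInfo {
    final String host;
    final boolean active;

    ReplicaInfo(String host, boolean active) {
        this.host = host;
        this.active = active;
    }
}

/** Hypothetical discovery helper for the nodes that serve a pool. */
final class CapiNodeDiscovery {
    /** Returns "host:port" for each active replica, listing every host once. */
    static List<String> nodesServingPool(List<ReplicaInfo> replicas, int capiPort) {
        List<String> nodes = new ArrayList<>();
        for (ReplicaInfo replica : replicas) {
            String address = replica.host + ":" + capiPort;
            if (replica.active && !nodes.contains(address)) {
                nodes.add(address);
            }
        }
        return nodes;
    }
}
```

The optional ping on the CAPIServer port would be an extra filter applied to 
the returned list; it is left out here to keep the example free of network 
calls.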

Joel





[jira] [Commented] (SOLR-6266) Couchbase plug-in for Solr

2014-09-26 Thread Karol Abramczyk (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14149583#comment-14149583
 ] 

Karol Abramczyk commented on SOLR-6266:
---

Joel, Varun,

I fixed the code today to properly handle revsDiff and bulkDocs requests. It 
still runs only one CAPIServer per collection. I'm trying to implement 
automatic XDCR configuration via the XDCR REST API, but for unknown reasons I 
cannot create a remote cluster reference there. Usually I get the response 
"Cluster uuid does not match the requested". I can, however, list all remote 
clusters. 
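
For readers following along, here is a sketch of how the form body for such a 
remote cluster request might be assembled. It assumes the parameter names 
(uuid, name, hostname, username, password) shown in Couchbase's documentation 
for POST /pools/default/remoteClusters; please verify them against your server 
version. RemoteClusterRequest is a hypothetical helper, and no request is 
actually sent.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper that builds the form body for a remote cluster reference. */
final class RemoteClusterRequest {
    /** Collects the parameters in a fixed order (names assumed from Couchbase's REST documentation). */
    static Map<String, String> params(String uuid, String name, String hostPort,
                                      String username, String password) {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("uuid", uuid);
        p.put("name", name);
        p.put("hostname", hostPort);
        p.put("username", username);
        p.put("password", password);
        return p;
    }

    /** Encodes parameters as application/x-www-form-urlencoded, preserving insertion order. */
    static String formBody(Map<String, String> params) {
        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            body.add(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return body.toString();
    }
}
```

One thing that may be worth checking is whether the uuid sent in the request is 
the same one the CAPIServer reports for its pool. That is only a guess at the 
cause of the error message, not a confirmed diagnosis.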




[jira] [Commented] (SOLR-6266) Couchbase plug-in for Solr

2014-09-25 Thread Karol Abramczyk (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147850#comment-14147850
 ] 

Karol Abramczyk commented on SOLR-6266:
---

Joel,

I confirmed today that Couchbase replicates each document to only one of the 
running CAPIServers. My test configuration was Solr 1 and 2 running CAPIServers 
1 and 2. The Couchbase test bucket had 2 documents, A and B. Only CAPIServer 1 
was configured for replication with Couchbase, but its 
CouchbaseBehaviour.getNodesServingPool() method reported info about 
itself and CAPIServer 2 as well. So you were right about the Couchbase 
replication. But it also seems that configuring only one CAPIServer with 
Couchbase is sufficient as long as it knows about the other CAPIServers in the 
cluster. We could register operating CAPIServers in ZooKeeper to make this 
info available to every node. I didn't check what happens if more CAPIServers 
are configured with Couchbase XDCR.
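
One way to picture why each document lands on exactly one CAPIServer: Couchbase 
hashes every key to a vBucket, and each vBucket belongs to one node of the 
target cluster. The sketch below is a deliberate simplification. It uses a 
plain CRC32 modulo and a round-robin vBucket-to-node assignment, both of which 
are assumptions for illustration rather than Couchbase's exact algorithm or its 
real vBucket map. VBucketRouting is a hypothetical name.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.zip.CRC32;

/** Simplified, hypothetical model of key -> vBucket -> node routing. */
final class VBucketRouting {
    /** Maps a document key to a vBucket number in [0, numVBuckets). */
    static int vbucketFor(String key, int numVBuckets) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % numVBuckets);
    }

    /** Picks the node owning the key's vBucket (round-robin ownership is an assumption). */
    static String nodeFor(String key, int numVBuckets, List<String> nodes) {
        return nodes.get(vbucketFor(key, numVBuckets) % nodes.size());
    }
}
```

Under this model a given key always maps to the same node, which matches the 
observation that documents A and B were each delivered to a single CAPIServer.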




[jira] [Commented] (SOLR-6266) Couchbase plug-in for Solr

2014-09-25 Thread Varun (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148183#comment-14148183
 ] 

Varun commented on SOLR-6266:
-

Andrzej,
In your setup, have you tried a partial update? If you have 10 docs in 
Couchbase and change one of them, does Solr get all 10 again, or just the 
changed one?
It seems that your bulkDocs method always returns an empty result list. 
Shouldn't Solr index the documents and return their latest revision numbers, to 
let Couchbase know which revisions Solr already has?




[jira] [Commented] (SOLR-6266) Couchbase plug-in for Solr

2014-09-25 Thread Karol Abramczyk (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148329#comment-14148329
 ] 

Karol Abramczyk commented on SOLR-6266:
---

Varun,
I found out today that I had missed these results in bulkDocs and in revsDiff. 
I already fixed it, so now in bulkDocs Solr gets only the documents that are 
not in the index or whose revision is different.
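
The comparison described here can be sketched as follows. This is not the 
plugin's code; RevsDiffSketch and missingRevisions are hypothetical names, and 
the revision strings are made up. The idea is only that a document is reported 
as missing when the index has no entry for it or holds a different revision.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical revsDiff-style comparison between incoming and indexed revisions. */
final class RevsDiffSketch {
    /**
     * Returns the incoming (id -> revision) entries that the index lacks or
     * stores under a different revision. Only these need to be sent in bulkDocs.
     */
    static Map<String, String> missingRevisions(Map<String, String> incoming,
                                                Map<String, String> indexed) {
        Map<String, String> missing = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : incoming.entrySet()) {
            String stored = indexed.get(e.getKey());
            if (stored == null || !stored.equals(e.getValue())) {
                missing.put(e.getKey(), e.getValue());
            }
        }
        return missing;
    }
}
```

With 10 unchanged documents and one modified document, only the modified one 
would appear in the result, which addresses the partial-update question raised 
earlier in the thread.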




[jira] [Commented] (SOLR-6266) Couchbase plug-in for Solr

2014-09-25 Thread Varun (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148654#comment-14148654
 ] 

Varun commented on SOLR-6266:
-

Thanks Karol and Andrzej. Please upload the latest patch so that we can improve 
it further. We are trying to set up an instance and try out various scenarios.




[jira] [Commented] (SOLR-6266) Couchbase plug-in for Solr

2014-09-23 Thread Andrzej Bialecki (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14144590#comment-14144590
 ] 

Andrzej Bialecki  commented on SOLR-6266:
-

Hi Joel, Karol and I work together, so I thought I'd chime in.

bq. We'll need to also figure out where to place the CAPIServer so there is 
only one per node.
I think there is no such place for global components in Solr yet, the only 
special component that is global being the CoreAdminHandler. It would be a nice 
feature, but it's outside the scope of this issue.

So, if you can't have what you like you have to like what you have ;) This 
means that for now the only option is to run an instance of CAPIServer per 
collection.

bq. From my understanding the CAPIServer is listening on an ip/port. Couchbase 
can be configured to replicate a bucket to a specific host and port.

Karol is working now on using the Couchbase REST API to configure Couchbase 
automatically to send docs to a particular instance of CAPIServer that is 
active. This will eliminate the need for manual configuration on the Couchbase 
end, and will allow us to re-target the replication to any other instance that 
becomes active, should the current instance of CAPIServer disappear.

Regarding running of CAPIServers on all replicas: with the auto-configuration 
mechanism as described above it's not needed, it's enough to activate a single 
instance per collection, using e.g. always the first shard's leader. If this 
node goes down, another leader will be elected and the CAPIServer instance will 
activate there and register itself with Couchbase.
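
A minimal sketch of that activation rule, under the assumption that "first 
shard" means the shard whose name sorts first: CapiActivation and its methods 
are hypothetical, and the shard-to-leader map stands in for whatever the node 
reads from the cluster state.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical rule: only the leader of the first shard runs the CAPIServer. */
final class CapiActivation {
    /** Returns the leader of the first shard by name order, or null if there are no shards. */
    static String designatedNode(Map<String, String> shardLeaders) {
        if (shardLeaders.isEmpty()) {
            return null;
        }
        return new TreeMap<>(shardLeaders).firstEntry().getValue();
    }

    /** True if this node should start (or keep running) the CAPIServer. */
    static boolean shouldRun(String thisNode, Map<String, String> shardLeaders) {
        String designated = designatedNode(shardLeaders);
        return designated != null && designated.equals(thisNode);
    }
}
```

Each node would re-evaluate shouldRun() whenever leadership changes, so after 
a failover the new leader starts its CAPIServer and registers with Couchbase.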

Couchbase always sends all changes for a bucket to a replica, so if you had in 
mind an optimization where each shard would get only its own documents then it 
wouldn't work - CAPIServer-s would get all documents anyway and they would have 
to discard (N-1)/N docs - so this would only create heavier load on Couchbase 
and Solr.

If we ran multiple active CAPIServer-s on replicas it wouldn't work right 
either - copies of the same documents would be received multiple times, and 
while they would be correctly re-routed to the right shards, each shard would 
receive multiple copies and the ordering would be non-deterministic - not so 
important for adds but crucial for a mix of adds / deletes.
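
To see why ordering matters for a mix of adds and deletes, consider this small 
model of an index as a set of document ids. UpdateOrdering is a hypothetical 
name and the "add:id" / "delete:id" strings are invented for the example; the 
point is only that the same two operations produce different end states 
depending on arrival order.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical model: an index reduced to the set of ids it contains. */
final class UpdateOrdering {
    /** Applies "add:<id>" and "delete:<id>" operations in order and returns the resulting ids. */
    static Set<String> apply(List<String> operations) {
        Set<String> index = new LinkedHashSet<>();
        for (String op : operations) {
            String[] parts = op.split(":", 2);
            if (parts[0].equals("add")) {
                index.add(parts[1]);
            } else if (parts[0].equals("delete")) {
                index.remove(parts[1]);
            }
        }
        return index;
    }
}
```

If two CAPIServers forward the same add and delete for doc1 and a shard 
receives them interleaved, it can end up with doc1 present even though the 
delete was the later change in Couchbase.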




[jira] [Commented] (SOLR-6266) Couchbase plug-in for Solr

2014-09-23 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14144805#comment-14144805
 ] 

Joel Bernstein commented on SOLR-6266:
--

So, it appears that I am misunderstanding something about how Couchbase XDCR 
works. What it seems like you're saying is that Couchbase will replicate each 
document to each available CAPIServer at an XDCR endpoint. 

I had assumed that this would not be the case. I had assumed that Couchbase 
would auto-discover the nodes for a specific replication end point. But, that 
it would only send each document to one node for the cross-datacenter 
replication. The receiving node would then be responsible for the intra-cluster 
replication. 







 




[jira] [Commented] (SOLR-6266) Couchbase plug-in for Solr

2014-09-23 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14144858#comment-14144858
 ] 

Noble Paul commented on SOLR-6266:
--

It makes sense to have per-cluster/per-collection/per-shard components in 
Solr. 

The component should be available on all nodes, and SolrCloud should ensure 
that it runs on only one node (depending on scope) at a given time.

For this use case:

* I need a per-collection component
* SolrCloud should ensure that one and only one instance of this component runs 
per collection
* The component lifecycle should be managed by SolrCloud, i.e. callbacks for 
component init() when it is started up on a node. If possible, give an unload() 
callback if the system decides to switch the node
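
The lifecycle in the list above could be sketched like this. Nothing here 
exists in SolrCloud; CollectionComponent and SingleInstanceRunner are 
hypothetical names, and the runner stands in for whatever part of SolrCloud 
would decide which node hosts the component.

```java
/** Hypothetical per-collection component with the two lifecycle callbacks. */
interface CollectionComponent {
    void init(String node);

    void unload();
}

/** Hypothetical manager that keeps at most one active instance per collection. */
final class SingleInstanceRunner {
    private final CollectionComponent component;
    private String activeNode;

    SingleInstanceRunner(CollectionComponent component) {
        this.component = component;
    }

    /** Moves the component to the given node, unloading it from the previous node first. */
    void switchTo(String node) {
        if (node.equals(activeNode)) {
            return; // already running there, nothing to do
        }
        if (activeNode != null) {
            component.unload();
        }
        component.init(node);
        activeNode = node;
    }

    /** Returns the node currently hosting the component, or null if none. */
    String activeNode() {
        return activeNode;
    }
}
```

Calling unload() before init() on the new node is what keeps the "one and only 
one instance" guarantee, at the cost of a short gap during the switch.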






[jira] [Commented] (SOLR-6266) Couchbase plug-in for Solr

2014-09-23 Thread Andrzej Bialecki (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14144863#comment-14144863
 ] 

Andrzej Bialecki  commented on SOLR-6266:
-

I'm no Couchbase expert by any means, but my reading of the docs indicates that 
you can only define a single destination host:port for all documents from a 
bucket. Couchbase then sends all documents to that host:port, and it is the 
task of the target cluster to distribute them across its nodes. In a sense it's 
similar to how Solr's distributed indexing works.

So if we had multiple active CAPIServers and each were registered as an XDCR 
destination, then each of them would receive all documents.
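
To make the concern concrete, here is a small simulation of that reading of the docs. It assumes, as the comment does, that every registered destination gets every document; the replicate function and the host:port strings are made up for illustration and are not Couchbase APIs.

```python
def replicate(docs, destinations):
    """Deliver every document to every registered destination (assumed behaviour)."""
    received = {dest: [] for dest in destinations}
    for doc in docs:
        for dest in destinations:
            received[dest].append(doc["id"])
    return received


docs = [{"id": "a"}, {"id": "b"}, {"id": "c"}]
received = replicate(docs, ["solr1:9876", "solr2:9876"])

# Under this assumption the number of deliveries grows with the number of
# destinations, not just with the number of documents.
total_deliveries = sum(len(ids) for ids in received.values())
```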

[jira] [Commented] (SOLR-6266) Couchbase plug-in for Solr

2014-09-23 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14144944#comment-14144944
 ] 

Joel Bernstein commented on SOLR-6266:
--

Andrzej, you may be right, but let's confirm. It seems like the right design 
for XDCR would be to replicate each document only once across datacenters, then 
let the receiving node handle the intra-cluster replication. I'll try to get 
confirmation on which approach is used.

[jira] [Commented] (SOLR-6266) Couchbase plug-in for Solr

2014-09-23 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14144967#comment-14144967
 ] 

Joel Bernstein commented on SOLR-6266:
--

Andrzej, I just reread your last comment more closely. It seems that you're 
thinking that the CAPIServer in the remote datacenter will automatically 
forward each document to all available CAPIServers. I was thinking that the 
CAPIServer would just pass the documents to the local bulkDocs implementation 
and be done with it. But I could be mistaken here. I'll review some code and 
see if I can confirm.

[jira] [Commented] (SOLR-6266) Couchbase plug-in for Solr

2014-09-23 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14144992#comment-14144992
 ] 

Joel Bernstein commented on SOLR-6266:
--

Just did a quick review of:

 
https://github.com/couchbaselabs/couchbase-capi-server/blob/master/src/main/java/com/couchbase/capi/servlet/CAPIServlet.java

It appears the documents are not forwarded to other CAPIServers automatically. 
They are just processed locally, so it is the responsibility of the bulkDocs 
implementation to handle intra-cluster replication/routing. 

So this appears to be the basic flow:

1) Couchbase sets up the initial replication and discovers the available 
CAPIServers through the Couchbase behavior API.
2) Couchbase replicates each document once across datacenters to one of the 
available CAPIServers. (Still needs to be confirmed.)
3) The local CAPIServer only forwards the docs to the bulkDocs implementation.

If this scenario is correct, then we control the intra-cluster replication 
ourselves, and we can run a CAPIServer on each replica without any issues.
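
A rough sketch of what "the bulkDocs implementation handles routing" could look like, assuming the flow above is correct. The function names are hypothetical, and the MD5-based hash is only a stand-in to get a stable shard choice; it is not the hash SolrCloud uses for document routing.

```python
import hashlib


def shard_for(doc_id, num_shards):
    """Pick a shard from a stable hash of the document ID (illustrative only)."""
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_shards


def bulk_docs(docs, num_shards):
    """Group a batch received by the local CAPIServer into per-shard batches."""
    batches = {shard: [] for shard in range(num_shards)}
    for doc in docs:
        batches[shard_for(doc["id"], num_shards)].append(doc)
    return batches
```

Because the shard is a pure function of the document ID, it does not matter which CAPIServer receives a given document: every node would compute the same target.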



[jira] [Commented] (SOLR-6266) Couchbase plug-in for Solr

2014-09-22 Thread Karol Abramczyk (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14143073#comment-14143073
 ] 

Karol Abramczyk commented on SOLR-6266:
---

Joel,

My plan was to collect Couchbase data with a single Solr shard leader (to start 
with) and let Solr replicate the data to the other replicas. I think that 
running CAPIServer on all replicas is pointless, as in the end they will send 
all received documents to the shard leader to index them. This would also 
increase network load because of the communication between Couchbase servers 
and all the CAPIServers on replicas and also between the Solr replicas and Solr 
shards. I also wasn't sure if running CAPIServer on multiple replicas would 
result in indexing one document multiple times with different IDs. However, it 
could be useful to run CAPIServer on all shard leaders, but it would require 
each shard leader to calculate whether a received document should be indexed in it. 

[jira] [Commented] (SOLR-6266) Couchbase plug-in for Solr

2014-09-22 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14143257#comment-14143257
 ] 

Joel Bernstein commented on SOLR-6266:
--

From my understanding, the CAPIServer listens on a port, and Couchbase can be 
configured to replicate a bucket to a specific host and port. So running the 
CAPIServer on all replicas just means that there will be many CAPIServers 
running. The actual replication session will be between Couchbase and a single 
CAPIServer. So in a single replication session documents will flow to one 
CAPIServer, and that CAPIServer and its Solr instance move the documents into 
the distributed indexing flow.

In this scenario, running a CAPIServer on all replicas really has no 
downside. 

But running the CAPIServer from just the leader has a couple of major downsides:

1) Leaders and replicas will change. Couchbase is pointing directly to an 
ip:port. If all of a sudden that node is no longer the leader, then replication 
stops. If the CAPIServer is running on all replicas then this is not an 
issue. 

2) If we run the CAPIServer everywhere we don't have to manage bringing 
CAPIServers up and down as the leader changes. So this removes quite a bit of 
complexity from the design.

We don't have to worry about duplicate indexing on shards by running 
CAPIServers on the replicas. If we inject the documents properly into the 
SolrCloud indexing flow, then SolrCloud will ensure that documents get to the 
right place.

What we do have to consider very carefully though is whether we need a 
CAPIServer running per collection or per Solr node, because this affects the 
entire design.

My thinking is that we should have a single CAPIServer per Solr node to 
service all collections. I'm assuming that the CAPIServer has thread overhead 
that we don't want for each collection. 

But if we decide to go this route then we will need to route documents to the 
correct collection based on the bucket name. We'll also need to figure out how 
to place the CAPIServer so there is only one per node. 
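
The bucket-to-collection routing mentioned above could be as simple as a lookup table held by the one CAPIServer on each node. The sketch below is a guess at the shape of it; BucketRouter and the bucket and collection names are hypothetical, and nothing here reflects the plugin's actual configuration format.

```python
class BucketRouter:
    """Maps Couchbase bucket names to Solr collection names (illustrative)."""

    def __init__(self, bucket_to_collection):
        self.bucket_to_collection = dict(bucket_to_collection)

    def route(self, bucket, docs):
        """Return the target collection and the documents to index into it."""
        try:
            collection = self.bucket_to_collection[bucket]
        except KeyError:
            # Failing loudly avoids silently indexing into the wrong collection.
            raise ValueError(
                "no Solr collection configured for bucket %r" % bucket
            ) from None
        return collection, list(docs)


router = BucketRouter({"beer-sample": "beers", "users": "accounts"})
collection, batch = router.route("users", [{"id": "u1"}])
```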






[jira] [Commented] (SOLR-6266) Couchbase plug-in for Solr

2014-09-19 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14140907#comment-14140907
 ] 

Joel Bernstein commented on SOLR-6266:
--

I reviewed Karol's contribution today and it looks great. Let's use this as our 
base implementation.

It looks like Karol has worked out a lot of details of how to embed the 
Couchbase API's and handle documents. This is excellent.

I think we need to take a step back and do some planning around two areas 
before iterating on what's here.

1) SolrCloud architecture. Some questions to think about:

How does the plugin work in the context of a single collection? Should it run 
in all replicas or just leaders?

How does the plugin work in the context of multiple collections sharing the 
same Solr nodes? Should there be a different CAPIServer running for each 
collection? Or should there be a CAPIServer per Solr node?

2) Error handling. We'll need to understand the different failure scenarios and 
have strategies for handling them. And we'll need to fully understand how the 
Couchbase APIs account for failure scenarios.

I'll need to catch up on the Couchbase APIs before I can weigh in on these 
issues. I should have time to review the APIs next week. In the meantime, if 
anyone has any thoughts, fire away.

















[jira] [Commented] (SOLR-6266) Couchbase plug-in for Solr

2014-09-19 Thread Karol Abramczyk (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14141185#comment-14141185
 ] 

Karol Abramczyk commented on SOLR-6266:
---

[~joel.bernstein] In the meantime I finished my basic implementation of 
CAPIServer failover. The Solr plugin runs only one CAPIServer, on the leader of 
shard1, and the replicas put a watch on it to start a new CAPIServer when the 
first one goes down. I will update the source and remove unnecessary dependencies.
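
The watch-based failover described above can be sketched in memory as follows. This is not the plugin's code: Registry is a hypothetical stand-in for the coordination service (presumably ZooKeeper in a SolrCloud setup), and the one-shot watch behaviour is modelled on how ZooKeeper watches fire once and must be re-registered.

```python
class Registry:
    """In-memory stand-in for a coordination service with one-shot watches."""

    def __init__(self):
        self.owner = None     # node currently running the CAPIServer
        self.watchers = []

    def try_acquire(self, node):
        if self.owner is None:
            self.owner = node
            return True
        return False

    def watch(self, callback):
        self.watchers.append(callback)

    def release(self, node):
        if self.owner != node:
            return
        self.owner = None
        # Watches fire once; callbacks must re-register if they lose the race.
        watchers, self.watchers = self.watchers, []
        for callback in watchers:
            callback()


class Node:
    def __init__(self, name, registry):
        self.name = name
        self.registry = registry
        self.capi_running = False

    def start(self):
        # Run the CAPIServer if nobody else does, otherwise wait for a change.
        if self.registry.try_acquire(self.name):
            self.capi_running = True
        else:
            self.registry.watch(self.start)

    def fail(self):
        self.capi_running = False
        self.registry.release(self.name)
```

When the owning node fails, the first watcher to react takes over and the others go back to watching, so at most one CAPIServer runs at any time in this model.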

[jira] [Commented] (SOLR-6266) Couchbase plug-in for Solr

2014-09-19 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14141222#comment-14141222
 ] 

Joel Bernstein commented on SOLR-6266:
--

Karol,

Can you explain your thinking with the SolrCloud design? Why only run the 
CAPIServer on the shard leader, why not run it on all replicas?

It seems like it would be a simpler design to run it on all replicas. 

[jira] [Commented] (SOLR-6266) Couchbase plug-in for Solr

2014-09-16 Thread Karol Abramczyk (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14136148#comment-14136148
 ] 

Karol Abramczyk commented on SOLR-6266:
---

I have been working for a couple of days on a Solr-Couchbase plugin, which is 
based on the Elasticsearch Couchbase plugin. Its main features are as follows:
 * Designed as a RequestHandler which mimics the behaviour of a Couchbase server
 * Built for SolrCloud cluster configuration - under development. Currently 
supports locking to have only one RequestHandler running in the Solr cluster. 
Recovery from network failures and cluster reconfiguration are not supported yet.
 * Real-time indexing
 * Support for different data types
 * Support for nested documents
 * Currently no support for multiple collections - all documents are indexed to 
one collection.
I attach the source code of this plugin.

[jira] [Commented] (SOLR-6266) Couchbase plug-in for Solr

2014-09-16 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14136357#comment-14136357
 ] 

Joel Bernstein commented on SOLR-6266:
--

Karol,

Thanks for contributing your work! I should have time this week to review the 
patch and provide some feedback.

Joel
