Hi,

This appears to happen in trunk too.

It appears that the add command's request parameters get forwarded to
the other nodes. If I comment them out like so for add and commit:

core/src/java/org/apache/solr/update/processor/DistributedUpdateProcessor.java

-      params = new ModifiableSolrParams(req.getParams());
+      //params = new ModifiableSolrParams(req.getParams());
+      params = new ModifiableSolrParams();

Then things work as expected.

Otherwise, params like stream.url get sent to the replica nodes,
which causes a failure if the file is missing, or worse, repeated
imports of the same file if it does exist on a replica.

This is probably not the right fix, though? ... what should be sent
to the replicas for a streaming CSV import?
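
Perhaps only the stream-related params should be stripped before
forwarding, rather than dropping everything? A rough, untested sketch
(against the 4.x ModifiableSolrParams API):

      params = new ModifiableSolrParams(req.getParams());
      // Strip stream.* params so the replicas don't try to re-fetch
      // the original CSV source themselves.
      java.util.Iterator<String> names = req.getParams().getParameterNamesIterator();
      while (names.hasNext()) {
        String name = names.next();
        if (name.startsWith("stream.")) {
          params.remove(name);
        }
      }

In the meantime, POSTing the file body directly (curl --data-binary
@file.csv with Content-type text/csv) should sidestep the problem,
since there is then no stream.url param to forward.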

Dan


On Thu, Sep 20, 2012 at 4:32 PM, dan sutton <danbsut...@gmail.com> wrote:
> Hi,
>
> I'm using Solr 4.0-BETA and trying to import a CSV file as follows:
>
> curl http://localhost:8080/solr/<core>/update -d overwrite=false -d
> commit=true -d stream.contentType='text/csv;charset=utf-8' -d
> stream.url=file:///dir/file.csv
>
> I have 2 Tomcat servers running on different machines and a separate
> ZooKeeper quorum (3 ZooKeeper servers, 2 on the same machine). This
> is a single-shard core, replicated to the other machine.
>
> It seems that for a 255K-line file I end up with 170 docs on the
> server that issued the command, but on the other, the index seems to
> grow without bound?
>
> Has anyone seen this, or been successful in using the CSV import
> with SolrCloud?
>
> Cheers,
> Dan
