Hmm, one broken document in a batch should not break the entire batch,
right (whatever approach is used)?
Or are you saying that you want to programmatically re-index
the broken docs?

It would be interesting to return the ids of the broken docs along with the
Solr update response!
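
For what it's worth, the batch-then-retry strategy Bill describes below could
be sketched roughly like this, collecting the ids of the broken docs along the
way. Python is used only for brevity. `send_batch` is a hypothetical stand-in
for whatever actually posts documents to /update (it is assumed to raise on
any error), and the `id` field name is an assumption about the schema, so
treat this as an illustration rather than a real client:

```python
from typing import Callable, Dict, List, Sequence, Tuple

Doc = Dict[str, object]


def index_with_fallback(
    docs: Sequence[Doc],
    send_batch: Callable[[List[Doc]], None],  # hypothetical sender, raises on error
    batch_size: int = 100,
) -> List[Tuple[object, str]]:
    """Send docs in batches; when a batch fails, resend its docs one by one.

    Returns (id, error message) pairs for the docs that could not be indexed.
    """
    failed: List[Tuple[object, str]] = []
    for start in range(0, len(docs), batch_size):
        batch = list(docs[start:start + batch_size])
        try:
            send_batch(batch)
            continue  # whole batch accepted, move on to the next one
        except Exception:
            pass  # fall through and isolate the bad doc(s)
        for doc in batch:
            try:
                send_batch([doc])
            except Exception as exc:
                # "id" is an assumed unique key field name
                failed.append((doc.get("id"), str(exc)))
    return failed
```

Resending a whole batch one doc at a time may re-add docs that were already
accepted before the error; that should be harmless as long as docs carry a
unique key and are overwritten on re-add. The default batch size of 100 only
echoes Erick's rough numbers and is not a recommendation.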

Cheers


On 6 October 2015 at 15:30, Bill Dueber <b...@dueber.com> wrote:

> Just to add... my informal tests show that batching has waaaaay more effect
> than SolrJ vs. JSON.
>
> I haven't looked at CUSC (ConcurrentUpdateSolrClient) in a while; last time
> I looked, it was impossible to do anything smart about error handling, so
> check that out before you get too deeply into it. We use a strategy of
> sending a batch of JSON documents and, if it returns an error, sending each
> record one at a time until we find the bad one and can log something useful.
>
>
>
> On Mon, Oct 5, 2015 at 12:07 PM, Alessandro Benedetti <
> benedetti.ale...@gmail.com> wrote:
>
> > Thanks Erick,
> > you confirmed my impressions!
> > Thank you very much for the insights, any other opinion is welcome :)
> >
> > Cheers
> >
> > 2015-10-05 14:55 GMT+01:00 Erick Erickson <erickerick...@gmail.com>:
> >
> > > SolrJ tends to be faster for several reasons, not the least of which
> > > is that it sends packets to Solr in a more efficient binary format.
> > >
> > > Batching is critical. I did some rough tests using SolrJ and sending
> > > docs one at a time gave a throughput of < 400 docs/second.
> > > Sending 10 at a time gave 2,300 or so. Sending 100 at a time gave
> > > over 5,300 docs/second. Curiously, 1,000 at a time gave only
> > > marginal improvement over 100. This was with a single thread.
> > > YMMV of course.
> > >
> > > CloudSolrClient is definitely the better way to go with SolrCloud,
> > > it routes the docs to the correct leader instead of having the
> > > node you send the docs to do the routing.
> > >
> > > Best,
> > > Erick
> > >
> > > On Mon, Oct 5, 2015 at 4:57 AM, Alessandro Benedetti
> > > <abenede...@apache.org> wrote:
> > > > I was doing some studies and analysis, and was wondering which approach,
> > > > in your opinion, is the best one to use to index in Solr to reach the
> > > > best throughput possible.
> > > > I know that a lot of factors affect indexing time, so let's focus only
> > > > on the feeding approach.
> > > > Let's isolate different scenarios:
> > > >
> > > > *Single Solr Infrastructure*
> > > >
> > > > 1) Xml/Json batch request to /update IndexHandler (xml/json)
> > > >
> > > > 2) SolrJ ConcurrentUpdateSolrClient (javabin)
> > > > I was thinking this would be the fastest approach for a multi-threaded
> > > > indexing application, posting a batch of docs per request if possible.
> > > >
> > > > *Solr Cloud*
> > > >
> > > > 1) Xml/Json batch request to /update IndexHandler(xml/json)
> > > >
> > > > 2) SolrJ ConcurrentUpdateSolrClient (javabin)
> > > >
> > > > 3) CloudSolrClient (javabin)
> > > > It seems the best approach, according to these improvements [1]
> > > >
> > > > What are your opinions ?
> > > >
> > > > A bonus observation would be about using some Map/Reduce big data
> > > > indexer, but let's assume we don't have a big cluster of CPUs, just the
> > > > average indexer server.
> > > >
> > > >
> > > > [1] https://lucidworks.com/blog/indexing-performance-solr-5-2-now-twice-fast/
> > > >
> > > >
> > > > Cheers
> > > >
> > > >
> > > > --
> > > > --------------------------
> > > >
> > > > Benedetti Alessandro
> > > > Visiting card : http://about.me/alessandro_benedetti
> > > >
> > > > "Tyger, tyger burning bright
> > > > In the forests of the night,
> > > > What immortal hand or eye
> > > > Could frame thy fearful symmetry?"
> > > >
> > > > William Blake - Songs of Experience -1794 England
> > >
> >
> >
> >
> >
>
>
>
> --
> Bill Dueber
> Library Systems Programmer
> University of Michigan Library
>



-- 
--------------------------

Benedetti Alessandro
Visiting card - http://about.me/alessandro_benedetti
Blog - http://alexbenedetti.blogspot.co.uk

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England
