That is exactly what we do. The entire set of loaded documents is saved as 
JSONL in S3. Very handy for loading up a prod index in test for diagnosis or 
benchmarking.
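
For illustration, a minimal Python sketch of that pattern (boto3 is
assumed; the bucket and key names are hypothetical, not ours):

import json
import boto3

def snapshot_to_s3(docs, bucket="index-snapshots", key="prod/docs.jsonl"):
    # One JSON object per line (JSONL), uploaded as a single S3 object.
    body = "\n".join(json.dumps(doc) for doc in docs)
    boto3.client("s3").put_object(Bucket=bucket, Key=key,
                                  Body=body.encode("utf-8"))

def load_snapshot(bucket="index-snapshots", key="prod/docs.jsonl"):
    # Pull the snapshot back, e.g. to rebuild a prod index in test.
    obj = boto3.client("s3").get_object(Bucket=bucket, Key=key)
    for line in obj["Body"].read().decode("utf-8").splitlines():
        yield json.loads(line)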

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Mar 1, 2017, at 8:14 AM, Rick Leir <rl...@leirtech.com> wrote:
> 
> And perhaps put the crawl results in JSONL, so when you get a 404 you can use
> yesterday's document in a pinch. Cheers -- Rick
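> 
> A rough sketch of that fallback (Python; requests assumed, and the
> file name and fetch helper here are made up for illustration):
> 
> import json
> import requests
> 
> def load_yesterday(path="crawl-2017-02-28.jsonl"):
>     # Index yesterday's crawl snapshot by document id.
>     with open(path) as f:
>         return {doc["id"]: doc for doc in map(json.loads, f)}
> 
> def fetch_or_fallback(url, doc_id, yesterday):
>     # Re-crawl the page; on a 404, reuse yesterday's copy in a pinch.
>     resp = requests.get(url)
>     if resp.status_code == 404:
>         return yesterday.get(doc_id)  # None if we never had it
>     resp.raise_for_status()
>     return {"id": doc_id, "content": resp.text}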
> 
> On March 1, 2017 10:20:21 AM EST, Walter Underwood <wun...@wunderwood.org> 
> wrote:
>> Since I always need to know which document was bad, I back off to
>> batches of one document when there is a failure.
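>> 
>> Roughly like this sketch against Solr's JSON update endpoint (the
>> URL and core name are placeholders; commit handling is left out):
>> 
>> import requests
>> 
>> UPDATE_URL = "http://localhost:8983/solr/mycore/update"
>> 
>> def post_docs(docs):
>>     resp = requests.post(UPDATE_URL, json=docs)
>>     resp.raise_for_status()
>> 
>> def index_batch(docs):
>>     # Send the whole batch; if it fails, resend one document at a
>>     # time so the bad one is identified and the rest still index.
>>     try:
>>         post_docs(docs)
>>     except requests.HTTPError:
>>         for doc in docs:
>>             try:
>>                 post_docs([doc])
>>             except requests.HTTPError as err:
>>                 print("bad document %s: %s" % (doc.get("id"), err))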
>> 
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/  (my blog)
>> 
>> 
>>> On Mar 1, 2017, at 6:25 AM, Erick Erickson <erickerick...@gmail.com> wrote:
>>> 
>>> What version of Solr? This was a pretty long-standing issue that was
>>> fixed in Solr 6.1; see https://issues.apache.org/jira/browse/SOLR-445.
>>> Otherwise you really have to write your code to re-transmit
>>> sub-packets, perhaps even one at a time, when a packet fails.
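>>> 
>>> As a sketch, re-transmitting sub-packets can be a recursive split
>>> (Python; post_docs() is the same two-line helper as in the
>>> batch-of-one sketch above, repeated so this stands alone):
>>> 
>>> import requests
>>> 
>>> UPDATE_URL = "http://localhost:8983/solr/mycore/update"  # placeholder
>>> 
>>> def post_docs(docs):
>>>     resp = requests.post(UPDATE_URL, json=docs)
>>>     resp.raise_for_status()
>>> 
>>> def index_with_splits(docs):
>>>     # On failure, halve the packet and retry each half; a bad
>>>     # document is isolated in about log2(n) extra requests.
>>>     try:
>>>         post_docs(docs)
>>>     except requests.HTTPError:
>>>         if len(docs) == 1:
>>>             print("bad document: %s" % docs[0].get("id"))
>>>             return
>>>         mid = len(docs) // 2
>>>         index_with_splits(docs[:mid])
>>>         index_with_splits(docs[mid:])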
>>> 
>>> Best,
>>> Erick
>>> 
>>> On Wed, Mar 1, 2017 at 3:46 AM, kshitij tyagi
>>> <kshitij.shopcl...@gmail.com> wrote:
>>>> Hi Team,
>>>> 
>>>> I am facing an issue when I am updating more than one document on Solr.
>>>> 
>>>> 1. If any one document gives a 400 error, then my other documents are
>>>> also not updated.
>>>> 
>>>> How can I approach solving this? I need the other documents, the ones
>>>> that are not giving a 400 error, to be indexed.
>>>> 
>>>> Help appreciated!
>>>> 
>>>> Regards,
>>>> Kshitij
> 
