That is exactly what we do. The entire set of loaded documents is saved as JSONL in S3. Very handy for loading up a prod index in test for diagnosis or benchmarking.
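The save/replay idea above can be sketched in a few lines. This is a minimal illustration, not Walter's actual pipeline: it assumes documents are plain dicts and writes them one JSON object per line (JSONL), so a prod batch can be reloaded into a test index later. The function names are made up for the example.

```python
import json

def save_docs_as_jsonl(docs, path):
    """Write one JSON object per line so the batch can be replayed later."""
    with open(path, "w", encoding="utf-8") as f:
        for doc in docs:
            f.write(json.dumps(doc) + "\n")

def load_docs_from_jsonl(path):
    """Read the saved batch back, one document per line."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f]
```

Because each document is a self-contained line, you can also grep out a single document by id, or splice in yesterday's copy of a page that now 404s, without parsing the whole file.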
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Mar 1, 2017, at 8:14 AM, Rick Leir <rl...@leirtech.com> wrote:
>
> And perhaps put the crawl results in JSONL, so when you get a 404 you can
> use yesterday's document in a pinch.
>
> Cheers -- Rick
>
> On March 1, 2017 10:20:21 AM EST, Walter Underwood <wun...@wunderwood.org> wrote:
>> Since I always need to know which document was bad, I back off to
>> batches of one document when there is a failure.
>>
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/  (my blog)
>>
>>> On Mar 1, 2017, at 6:25 AM, Erick Erickson <erickerick...@gmail.com> wrote:
>>>
>>> What version of Solr? This was a pretty long-standing issue that was
>>> fixed in Solr 6.1; see https://issues.apache.org/jira/browse/SOLR-445
>>> Otherwise you really have to write your code to re-transmit sub-packets,
>>> perhaps even one document at a time when a packet fails.
>>>
>>> Best,
>>> Erick
>>>
>>> On Wed, Mar 1, 2017 at 3:46 AM, kshitij tyagi
>>> <kshitij.shopcl...@gmail.com> wrote:
>>>> Hi Team,
>>>>
>>>> I am facing an issue when updating more than one document on Solr.
>>>>
>>>> 1. If any one document gives a 400 error, then my other documents are
>>>> also not updated.
>>>>
>>>> How can I approach solving this? I need my other documents, the ones
>>>> not giving a 400 error, to be indexed.
>>>>
>>>> Help appreciated!
>>>>
>>>> Regards,
>>>> Kshitij
>
> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.
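Walter's back-off strategy (retry a failed batch one document at a time so the bad document is identified and the rest still get indexed) can be sketched like this. `send_batch` is a hypothetical stand-in for whatever posts a list of documents to Solr's /update handler and raises on an HTTP error; it is not a real Solr client API.

```python
def index_with_backoff(docs, send_batch):
    """Try the whole batch first; on failure, fall back to batches of one
    so we learn exactly which document was bad.

    send_batch: callable taking a list of docs, raising on HTTP error
    (hypothetical -- substitute your own Solr client call).
    Returns (indexed, failed) lists of documents.
    """
    try:
        send_batch(docs)           # fast path: whole batch accepted
        return list(docs), []
    except Exception:
        indexed, failed = [], []
        for doc in docs:           # slow path: isolate the bad document(s)
            try:
                send_batch([doc])
                indexed.append(doc)
            except Exception:
                failed.append(doc)
        return indexed, failed
```

Note that on Solr 6.1+ the server-side fix Erick mentions (SOLR-445) added TolerantUpdateProcessorFactory, which can skip bad documents within a batch and report them back, making this client-side back-off unnecessary there.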