Please let me know what the actual exception trace is.

Thanks!
Karl

On Thu, Oct 17, 2013 at 8:47 AM, Roland Everaert <[email protected]> wrote:

> We already do that. But solr is still raising exceptions for some file
> types. I have to wait for the customer to provide me the corresponding log
> from solr and the message received by the mcf job.
>
> Regards,
>
> Roland.
>
> On Thu, Oct 17, 2013 at 2:41 PM, Karl Wright <[email protected]> wrote:
>
>> Ah, here it is:
>>
>> http://lucene.472066.n3.nabble.com/ignoreTikaException-value-td3645906.html
>>
>> Karl
>>
>> On Thu, Oct 17, 2013 at 8:39 AM, Karl Wright <[email protected]> wrote:
>>
>>> Hi Roland,
>>>
>>> Usually 500 errors are from Tika (aka Solr Cell). If that's what you
>>> are seeing, there is a way to disable them. I don't remember precisely
>>> what you do, but it has been posted to this list (and others) so a google
>>> search should find that for you.
>>>
>>> Thanks!
>>> Karl
>>>
>>> On Thu, Oct 17, 2013 at 8:37 AM, Roland Everaert
>>> <[email protected]> wrote:
>>>
>>>> So far we have only had to deal with HTTP code 500, because solr was not
>>>> able to process some file types. We managed to tell solr to ignore tika
>>>> exceptions. This helps us quite a lot, but solr still has problems
>>>> processing some file types, and I have not yet found a way to tell solr
>>>> to basically skip errors, while still logging them.
>>>>
>>>> I will check with the customer to get the error, but it showed up
>>>> yesterday and they have continued with the indexing (we are still at
>>>> the initial indexing of the repository) and the logs with errors have
>>>> disappeared.
>>>>
>>>> Thanks for your support,
>>>>
>>>> Roland.
>>>>
>>>> On Thu, Oct 17, 2013 at 2:22 PM, Karl Wright <[email protected]> wrote:
>>>>
>>>>> Hi Roland,
>>>>>
>>>>> It depends on what the error code is. There is quite a bit of logic
>>>>> in the Solr connector (and in ManifoldCF itself) for handling errors of
>>>>> different kinds. Fundamentally there are two main kinds of error
>>>>> condition - one which causes a retry (and can, if so specified, cause
>>>>> either the offending document to be skipped or the job aborted) and
>>>>> another which always causes a job to abort. The Solr connector has to
>>>>> decide based on limited information exactly what to do. General HTTP
>>>>> error codes such as "500" errors, for example, contain little
>>>>> information and look just the same whether the error represents a
>>>>> document Tika is unhappy with, or something more fundamental, like a
>>>>> complete misconfiguration of Solr.
>>>>>
>>>>> If you can provide more detailed information as to the kind of
>>>>> error(s) you are seeing then we can advise you further.
>>>>>
>>>>> Karl
>>>>>
>>>>> On Thu, Oct 17, 2013 at 8:17 AM, Roland Everaert <[email protected]> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I helped a customer to deploy solr+manifoldcf to index files from a
>>>>>> windows share drive. But every time solr sends back an error message,
>>>>>> the manifoldcf jobs abort, which is not really convenient for
>>>>>> hour-long indexing.
>>>>>>
>>>>>> So is there a possibility to configure manifold so it doesn't stop
>>>>>> every time solr returns an http code different from 200?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Roland.
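
For reference, the two kinds of error condition described in the quoted reply (one that is retried and may end in a skipped document or an aborted job, and one that always aborts the job) can be sketched roughly as follows. This is only an illustration under stated assumptions and not the ManifoldCF Solr connector's actual code: the `Outcome` names, the `decide` function, the `RETRYABLE` status set, and the retry limit are all hypothetical.

```python
from enum import Enum


class Outcome(Enum):
    INDEXED = "indexed"
    RETRY = "retry"
    SKIP_DOCUMENT = "skip_document"
    ABORT_JOB = "abort_job"


# Hypothetical policy: which status codes are treated as "retry" conditions.
# The real connector's mapping is not given in this thread.
RETRYABLE = {500, 502, 503, 504}


def decide(status: int, attempts: int, max_retries: int = 3,
           skip_when_exhausted: bool = True) -> Outcome:
    """Pick an action for one document from the HTTP status alone."""
    if 200 <= status < 300:
        return Outcome.INDEXED
    if status in RETRYABLE:
        # First kind of error: retry, then skip the document or abort the
        # job once the (assumed) retry limit has been reached.
        if attempts < max_retries:
            return Outcome.RETRY
        return Outcome.SKIP_DOCUMENT if skip_when_exhausted else Outcome.ABORT_JOB
    # Second kind of error: always abort the job.
    return Outcome.ABORT_JOB
```

Note that a bare 500 carries no detail, so a policy like this cannot tell a document Tika rejects from a misconfigured Solr; that is why the actual exception trace is needed.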
