Re: Tika HTTP 400 Errors with DIH

2014-12-08 Thread Dan Davis
I would say that you could determine a row that gives a bad URL, and then run it in the DIH admin interface (or from the command line) with debug enabled. The url parameter going into Tika should be present in its transformed form before the next entity gets going. This works in a similar scenario for me.
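Once the debug output shows the transformed url value for the failing row, it can help to inspect that exact string outside Solr. The sketch below is only an illustration, and it assumes (the thread does not confirm this) that the 400 comes from characters a browser silently repairs, such as stray whitespace or unencoded spaces. The function names find_url_problems and percent_encode are hypothetical helpers, not part of Solr, DIH, or Tika.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def find_url_problems(raw):
    """Return a list of likely problems in a URL value copied from DIH debug output."""
    problems = []
    if raw != raw.strip():
        problems.append("leading or trailing whitespace")
    url = raw.strip()
    if any(ch.isspace() for ch in url):
        problems.append("unencoded whitespace inside the URL")
    if any(ord(ch) < 32 or ord(ch) > 126 for ch in url):
        problems.append("control or non-ASCII character")
    if urlsplit(url).scheme not in ("http", "https"):
        problems.append("missing or unexpected scheme")
    return problems


def percent_encode(raw):
    """Trim the value and percent-encode unsafe characters in the path and query.

    Existing %XX escapes are left alone because '%' is listed as safe.
    """
    parts = urlsplit(raw.strip())
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%")
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


if __name__ == "__main__":
    # Hypothetical value standing in for one row's DownloadURL.
    value = " http://example.com/my file.pdf\n"
    print(find_url_problems(value))
    print(percent_encode(value))
```

If the cleaned URL fetches correctly where the raw value did not, the fix probably belongs in the SQL query or in a transformer on the parent entity rather than in Tika itself.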

RE: Tika HTTP 400 Errors with DIH

2014-12-05 Thread Teague James
? Is there some other setting I've missed? I appreciate the suggestions! -Teague -Original Message- From: Alexandre Rafalovitch [mailto:arafa...@gmail.com] Sent: Thursday, December 04, 2014 12:22 PM To: solr-user Subject: Re: Tika HTTP 400 Errors with DIH Right. Resource not found (on server

RE: Tika HTTP 400 Errors with DIH

2014-12-05 Thread steve
have not tried this with standalone server or with any SOLR type project. Cheers! Steve From: teag...@insystechinc.com To: solr-user@lucene.apache.org Subject: RE: Tika HTTP 400 Errors with DIH Date: Fri, 5 Dec 2014 12:03:23 -0500 Alex, Your suggestion might be a solution, but the issue

RE: Tika HTTP 400 Errors with DIH

2014-12-04 Thread Teague James
Rafalovitch [mailto:arafa...@gmail.com] Sent: Tuesday, December 02, 2014 1:45 PM To: solr-user Subject: Re: Tika HTTP 400 Errors with DIH On 2 December 2014 at 13:19, Teague James teag...@insystechinc.com wrote: clob=true What is ClobTransformer doing on the DownloadURL field? Is it possible

Re: Tika HTTP 400 Errors with DIH

2014-12-04 Thread Alexandre Rafalovitch
[mailto:arafa...@gmail.com] Sent: Tuesday, December 02, 2014 1:45 PM To: solr-user Subject: Re: Tika HTTP 400 Errors with DIH On 2 December 2014 at 13:19, Teague James teag...@insystechinc.com wrote: clob=true What is ClobTransformer doing on the DownloadURL field? Is it possible

Re: Tika HTTP 400 Errors with DIH

2014-12-04 Thread Walter Underwood
, December 02, 2014 1:45 PM To: solr-user Subject: Re: Tika HTTP 400 Errors with DIH On 2 December 2014 at 13:19, Teague James teag...@insystechinc.com wrote: clob=true What is ClobTransformer doing on the DownloadURL field? Is it possible it is corrupting the value somehow? Regards

Re: Tika HTTP 400 Errors with DIH

2014-12-04 Thread Alexandre Rafalovitch
To: solr-user Subject: Re: Tika HTTP 400 Errors with DIH On 2 December 2014 at 13:19, Teague James teag...@insystechinc.com wrote: clob=true What is ClobTransformer doing on the DownloadURL field? Is it possible it is corrupting the value somehow? Regards, Alex. Personal: http

Tika HTTP 400 Errors with DIH

2014-12-02 Thread Teague James
Hi all, I am using Solr 4.9.0 to index a DB with DIH. In the DB there is a URL field. In the DIH Tika uses that field to fetch and parse the documents. The URL from the field is valid and will download the document in the browser just fine. But Tika is getting HTTP response code 400. Any ideas

Re: Tika HTTP 400 Errors with DIH

2014-12-02 Thread Alexandre Rafalovitch
On 2 December 2014 at 13:19, Teague James teag...@insystechinc.com wrote: clob=true What is ClobTransformer doing on the DownloadURL field? Is it possible it is corrupting the value somehow? Regards, Alex. Personal: http://www.outerthoughts.com/ and @arafalov Solr resources and