A good HTTP debugger would likely help (Wireshark or Fiddler2, for example):
http://www.telerik.com/fiddler
https://www.wireshark.org/download.html
For example, it could show the HTTP headers that the "client" sends when it
requests info from an API, and then show the results of that query. One small
caveat: I have not tried this with a "standalone" server or with any Solr-type
project.
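
If a network monitor is hard to set up, a small script might approximate the
same check from the machine that runs the import. The sketch below is only an
illustration, not something taken from DIH or Tika: the function names
suspicious_chars and probe are made up for this example, the "Java/1.7.0"
User-Agent is a guess at what a Java client might send, and the URL is the
placeholder quoted further down in this thread. It first looks for stray
whitespace or non-ASCII characters in the stored URL (one common cause of a
400), then can request the URL without any browser headers and report the
status along with the headers that were sent.

```python
import urllib.error
import urllib.request


def suspicious_chars(raw_url):
    """Return (index, repr) pairs for whitespace, control or non-ASCII characters."""
    return [
        (i, repr(ch))
        for i, ch in enumerate(raw_url)
        if ch.isspace() or ord(ch) < 0x20 or ord(ch) > 0x7E
    ]


def probe(url, user_agent="Java/1.7.0"):
    """Request url without browser headers; return (status, headers sent).

    The default User-Agent is an assumption, not what DIH/Tika is known to send.
    """
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            status = response.status
    except urllib.error.HTTPError as err:
        # urllib reports 4xx/5xx responses as exceptions; keep the code.
        status = err.code
    return status, dict(request.header_items())


if __name__ == "__main__":
    # Placeholder URL from the thread, with a trailing newline added on
    # purpose to show what a damaged field value would look like.
    stored = "http://www.someaddress.com/documents/document1.docx\n"
    print(repr(stored))
    print(suspicious_chars(stored))
    # Uncomment to send a real request to a URL you control:
    # print(probe(stored.strip()))
```

If suspicious_chars returns anything for the value copied out of a Solr query,
the browser may be quietly cleaning up a URL that the importer sends as-is.
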
Cheers!
Steve

> From: teag...@insystechinc.com
> To: solr-user@lucene.apache.org
> Subject: RE: Tika HTTP 400 Errors with DIH
> Date: Fri, 5 Dec 2014 12:03:23 -0500
> 
> Alex,
> 
> Your suggestion might be a solution, but the issue isn't that the resource
> isn't found. Like Walter said, 400 is a "bad request", which makes me wonder:
> what is the DIH/Tika doing when trying to access the documents? What is the
> "request" that is bad? Is there any other way to suss this out? Placing a
> network monitor in this case would be on the extreme end of difficult.
> 
> I know that the URL stored is good and that the resource exists by copying it 
> out of a Solr query and pasting it into the browser, so that eliminates 404 
> and 500 errors. Is the format of the URL correct? Is there some other setting 
> I've missed?
> 
> I appreciate the suggestions!
> 
> -Teague
> 
> 
> -----Original Message-----
> From: Alexandre Rafalovitch [mailto:arafa...@gmail.com] 
> Sent: Thursday, December 04, 2014 12:22 PM
> To: solr-user
> Subject: Re: Tika HTTP 400 Errors with DIH
> 
> Right. Resource not found (on server).
> 
> The end result is the same. If it works in the browser but not from the
> application, then either a different URL is being requested or, somehow,
> not even the same server is being contacted.
> 
> The solution (watching network traffic) is still the same, right?
> 
> Regards,
>    Alex.
> Personal: http://www.outerthoughts.com/ and @arafalov
> Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
> Solr popularizers community: https://www.linkedin.com/groups?gid=6713853
> 
> 
> On 4 December 2014 at 11:51, Walter Underwood <wun...@wunderwood.org> wrote:
> > No, 400 should mean that the request was bad. When the server fails, that 
> > is a 500.
> >
> > wunder
> > Walter Underwood
> > wun...@wunderwood.org
> > http://observer.wunderwood.org/
> >
> >
> > On Dec 4, 2014, at 8:43 AM, Alexandre Rafalovitch <arafa...@gmail.com> 
> > wrote:
> >
> >> 400 error means something wrong on the server (resource not found).
> >> So, it would be useful to see what URL is actually being requested.
> >>
> >> Can you run some sort of network tracer to see the actual network
> >> request (dtrace, Wireshark, etc.)? That will cut the problem in
> >> half for you.
> >>
> >> Regards,
> >>   Alex.
> >> Personal: http://www.outerthoughts.com/ and @arafalov
> >> Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
> >> Solr popularizers community: https://www.linkedin.com/groups?gid=6713853
> >>
> >>
> >> On 4 December 2014 at 09:42, Teague James <teag...@insystechinc.com> wrote:
> >>> The database stores the URL as a CLOB. Querying Solr shows that the field 
> >>> value is "http://www.someaddress.com/documents/document1.docx".
> >>> The URL works if I copy and paste it to the browser, but Tika gets a 400 
> >>> error.
> >>>
> >>> Any ideas?
> >>>
> >>> Thanks!
> >>> -Teague
> >>> -----Original Message-----
> >>> From: Alexandre Rafalovitch [mailto:arafa...@gmail.com]
> >>> Sent: Tuesday, December 02, 2014 1:45 PM
> >>> To: solr-user
> >>> Subject: Re: Tika HTTP 400 Errors with DIH
> >>>
> >>> On 2 December 2014 at 13:19, Teague James <teag...@insystechinc.com> 
> >>> wrote:
> >>>> clob="true"
> >>>
> >>> What is ClobTransformer doing on the DownloadURL field? Is it
> >>> possible it is corrupting the value somehow?
> >>>
> >>> Regards,
> >>>   Alex.
> >>>
> >>> Personal: http://www.outerthoughts.com/ and @arafalov
> >>> Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
> >>> Solr popularizers community: https://www.linkedin.com/groups?gid=6713853
> >>>
> >
> 
                                          
