[ 
https://issues.apache.org/jira/browse/TIKA-593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240178#comment-13240178
 ] 

Chris A. Mattmann commented on TIKA-593:
----------------------------------------

OK, I give up for now. I disabled the 415 test that isn't passing. After 
researching this for hours, and working with Paul Ramirez (thanks for the help 
Paul), we basically found the following things to be true:

* Jersey automatically sets Accept to something like */*, which IMHO is more 
sensible than CXF, which sets it to an XML accept type (and that causes the 
resource to not even find the path in test415).
* For whatever reason, if you set accept to "xxx/xxx", CXF does not check it 
up front the way Jersey seemed to. Instead it lets the call get all the way to 
the UnpackerResource#unpack method, where the Tika AutoDetectParser then fails. 
Jersey seemed to catch this; I have no clue why. We mucked around with 
different accept and type calls and got it to send 200 OK back and parse fine 
(e.g., if you set the accept to */* and type to APPLICATION_MSWORD), but that 
defeats the purpose of the test. If you send in xxx/xxx, it seems like the 
JAX-RS service should send back a 415.

I need some massive help from anyone who knows CXF to figure this out. I have 
to step away from this for now. As it stands, all tests pass, they have been 
cleaned up to use the CXF client (with HttpClient removed), and test415 is 
disabled. Any help getting 415 working with CXF is welcome, even if we have to 
modify UnpackerResource to do the check. I know that Sergey (from CXF-ville) 
is watching this one, so I would love some help here!
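
To make "modify UnpackerResource to do the check" concrete, here is a rough, 
JDK-only sketch of an explicit up-front media type check. It is not the actual 
UnpackerResource code and does not use any CXF or JAX-RS API: the class name 
MediaTypeGuard, the statusFor method, and the SUPPORTED list are all 
hypothetical, and rejecting a missing type is an assumption made for the 
sketch. The idea is only that an unknown type such as xxx/xxx is rejected with 
415 before any parser is invoked.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class MediaTypeGuard {
    // Hypothetical list of accepted types; a real resource would derive
    // this from whatever the parser actually supports.
    static final Set<String> SUPPORTED = new HashSet<>(Arrays.asList(
            "application/msword", "application/pdf", "text/plain"));

    static final int OK = 200;
    static final int UNSUPPORTED_MEDIA_TYPE = 415;

    // Returns the HTTP status the resource would answer with for the
    // given Content-Type header value, before any parsing is attempted.
    static int statusFor(String contentType) {
        // Assumption for this sketch: a missing type is rejected too.
        if (contentType == null || contentType.trim().isEmpty()) {
            return UNSUPPORTED_MEDIA_TYPE;
        }
        // Drop parameters such as "; charset=UTF-8" and normalize case.
        int semi = contentType.indexOf(';');
        String base = (semi >= 0 ? contentType.substring(0, semi) : contentType)
                .trim().toLowerCase(Locale.ROOT);
        return SUPPORTED.contains(base) ? OK : UNSUPPORTED_MEDIA_TYPE;
    }

    public static void main(String[] args) {
        System.out.println(statusFor("application/msword"));
        System.out.println(statusFor("xxx/xxx"));
    }
}
```

In a real resource the 415 branch would be turned into the framework's own 
"unsupported media type" response instead of a bare integer.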
                
> Tika network server
> -------------------
>
>                 Key: TIKA-593
>                 URL: https://issues.apache.org/jira/browse/TIKA-593
>             Project: Tika
>          Issue Type: New Feature
>          Components: general
>    Affects Versions: 0.10
>            Reporter: Jukka Zitting
>            Assignee: Chris A. Mattmann
>             Fix For: 1.2
>
>         Attachments: TIKA-593.Mattmann.032612.patch.2.txt, 
> TIKA-593.Mattmann.032612.patch.txt, TIKA-593.Mattmann.032712.patch.2.txt, 
> TIKA-593.Mattmann.032712.patch.txt, TIKA-593_pom.diff
>
>
> It would be cool to be able to run Tika as a network service that accepts a 
> binary document as input and produces the extracted content (as XHTML, text, 
> or just metadata) as output. A bit like TIKA-169, but without the dependency 
> to a servlet container.
> I'd like to be able to set up and run such a server like this:
>     $ java -jar tika-app.jar --port 1234
> We should also add a NetworkParser class that acts as a local client for such 
> a service. This way a lightweight client could use the full set of Tika 
> parsing functionality even with just the tika-core jar within its classpath.

