[ https://issues.apache.org/jira/browse/TIKA-593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240178#comment-13240178 ]
Chris A. Mattmann commented on TIKA-593: ---------------------------------------- OK, I give up for now. I disabled the 415 test that isn't passing. After researching this for hours, and working with Paul Ramirez (thanks for the help Paul), we basically found the following things to be true: * Jersey automatically sets Accept to something like */* which IMHO is more sensible than CXF which sets it to an XML accept type (which causes the resource to not even find the path in test415) * For whatever reason, if you set accept to "xxx/xxx" instead of checks up front like it seems Jersey did, CXF will let the call get all the way to the UnpackerResource#unpack method and then cause the Tika AutoDetectParser to fail. Jersey seemed to have caught this. I have no clue why. We mucked around with different accept and type calls and got it to send 200 OK back and parse fine (e.g., if you set the accept to */* and type to APPLICATION_MSWORD -- but that defeats the purpose of the test. If you send in xxx/xxx, it seems like the JAX RS service should send back a 415. I need some massive help from anyone that knows CXF to figure this out. I have to step away from this for now. For now all tests pass, they are cleaned up using CXF client (with HttpClient removed), and I disabled test415. Any help to get 415 working with CXF is welcomed. Even if we have to modify UnpackerResource to do the check. I know that Sergey is watching this one (from CXF ville so would love some help here!) > Tika network server > ------------------- > > Key: TIKA-593 > URL: https://issues.apache.org/jira/browse/TIKA-593 > Project: Tika > Issue Type: New Feature > Components: general > Affects Versions: 0.10 > Reporter: Jukka Zitting > Assignee: Chris A. Mattmann > Fix For: 1.2 > > Attachments: TIKA-593.Mattmann.032612.patch.2.txt, > TIKA-593.Mattmann.032612.patch.txt, TIKA-593.Mattmann.032712.patch.2.txt, > TIKA-593.Mattmann.032712.patch.txt, TIKA-593_pom.diff > > > It would be cool to be able to run Tika as a network service that accepts a > binary document as input and produces the extracted content (as XHTML, text, > or just metadata) as output. A bit like TIKA-169, but without the dependency > to a servlet container. > I'd like to be able to set up and run such a server like this: > $ java -jar tika-app.jar --port 1234 > We should also add a NetworkParser class that acts as a local client for such > a service. This way a lightweight client could use the full set of Tika > parsing functionality even with just the tika-core jar within its classpath. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira