[
https://issues.apache.org/jira/browse/CXF-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14037199#comment-14037199
]
Sergey Beryozkin commented on CXF-5549:
---------------------------------------
Hi Andriy, thanks for the latest update, it let me confirm that we could move
tika-parsers into a test scope.
I guess we'd create a Lucene-specific helper a bit later on that will deal
with checking that a query matches a given document, which I can contribute
easily.
Can you do me a favour and play a bit with adding a couple more tests for
some other file types, say a binary file which has no text content and only
metadata? Our code should work OK: from what I understand, a Tika parser will
return an empty value if it has no content, but let's double-check. The other
possible area to explore: we have a single Document keeping both the text and
the metadata, so can we hit a problem where a user looking for some metadata
gets a wrong document due to the content getting a match? If yes, then we'd
need to have 2 documents created instead...
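
To make the two checks above concrete, here is a rough sketch in plain Java.
It is a toy model, not the Lucene or Tika API: a document is just a map of
field names to values, and the field names ("contents", "author") and the
helper methods are made up for illustration. It assumes a metadata-only binary
file ends up with an empty content value. The point it tries to show is that
an unscoped match can return a document because of its content, while a match
scoped to the metadata field does not.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class SingleDocumentSketch {

    // Hypothetical name of the field holding the extracted text.
    public static final String CONTENT_FIELD = "contents";

    /** Field-scoped match: only the named field is inspected. */
    public static boolean matchesField(Map<String, String> doc, String field, String term) {
        String value = doc.get(field);
        return value != null
            && value.toLowerCase(Locale.ROOT).contains(term.toLowerCase(Locale.ROOT));
    }

    /** Unscoped match: any field, content or metadata, may produce the hit. */
    public static boolean matchesAnyField(Map<String, String> doc, String term) {
        for (String field : doc.keySet()) {
            if (matchesField(doc, field, term)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // A text file whose body mentions a name that is not its author.
        Map<String, String> report = new LinkedHashMap<>();
        report.put(CONTENT_FIELD, "Reviewed by Sergey last week");
        report.put("author", "Andriy");

        // A binary file with metadata only; its content is assumed to be empty.
        Map<String, String> image = new LinkedHashMap<>();
        image.put(CONTENT_FIELD, "");
        image.put("author", "Sergey");

        // Unscoped search for author "sergey" also hits the report via its content.
        System.out.println(matchesAnyField(report, "sergey"));
        // Scoping the search to the metadata field avoids that hit.
        System.out.println(matchesField(report, "author", "sergey"));
        // The metadata-only file is still found through its metadata.
        System.out.println(matchesField(image, "author", "sergey"));
        // Its empty content matches nothing.
        System.out.println(matchesField(image, CONTENT_FIELD, "sergey"));
    }
}
```

If the real queries always name the field they target, a single Document may
well be enough; the wrong-document case would only show up when a query is not
scoped to a field, which is what the extra tests would need to confirm.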
Thanks, Sergey
> Introduce Tika Search Visitor
> -----------------------------
>
> Key: CXF-5549
> URL: https://issues.apache.org/jira/browse/CXF-5549
> Project: CXF
> Issue Type: New Feature
> Components: JAX-RS
> Reporter: Sergey Beryozkin
> Assignee: Andriy Redko
> Priority: Minor
>
> Introduce TikaSearchVisitor which will convert a FIQL/etc. search expression
> into an Apache Tika component that can be used to search binary data; for
> example, the service can support something like "find all PDF files matching
> a given expression"