[ 
https://issues.apache.org/jira/browse/CXF-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14037199#comment-14037199
 ] 

Sergey Beryozkin commented on CXF-5549:
---------------------------------------

Hi Andriy, thanks for the latest update; it let me confirm that we can move 
tika-parsers into test scope.

I guess a bit later on we'd create a Lucene-specific helper for checking that 
a query matches a given document, which I can contribute easily.

Could you do me a favour and play a bit with adding a couple more tests for 
some other file types, say a binary file which has no text content and only 
metadata? Our code should work OK: from what I understand a Tika parser will 
return an empty value if it has no content, but let's double check. The other 
possible area to explore: we have a single Document keeping both the text and 
the metadata, so can we hit a problem where a user searching on some metadata 
gets the wrong document because the content produced a match? If yes, then 
we'd need to create 2 documents instead...
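
To make the two questions above concrete, here is a rough sketch in plain 
Java. It deliberately does not use the Lucene or Tika APIs: the class name, 
the "contents" field name and the sample values are all made up for 
illustration, and a Map stands in for a Document. It only models the idea that 
a field-scoped lookup on metadata should ignore the extracted text, while an 
unscoped lookup over the same single document may not, and that a 
metadata-only file (empty content) should still be findable by its metadata.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Illustrative model only: a Map plays the role of one indexed document.
public class SingleDocumentSketch {

    // Hypothetical field name for the extracted text (an assumption, not a CXF constant).
    public static final String CONTENTS = "contents";

    // One document holding both the extracted text and a metadata field.
    public static Map<String, String> sampleDocument() {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put(CONTENTS, "This report was reviewed by Alice");
        doc.put("author", "Bob");
        return doc;
    }

    // A binary file with metadata only: the extracted text is assumed to be empty.
    public static Map<String, String> metadataOnlyDocument() {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put(CONTENTS, "");
        doc.put("author", "Carol");
        return doc;
    }

    // Field-scoped match: only the named field is consulted.
    public static boolean matchesField(Map<String, String> doc, String field, String term) {
        String value = doc.get(field);
        return value != null && containsIgnoreCase(value, term);
    }

    // Unscoped match: any field, text or metadata, may satisfy the term.
    public static boolean matchesAnyField(Map<String, String> doc, String term) {
        for (String value : doc.values()) {
            if (containsIgnoreCase(value, term)) {
                return true;
            }
        }
        return false;
    }

    private static boolean containsIgnoreCase(String value, String term) {
        return value.toLowerCase(Locale.ROOT).contains(term.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        Map<String, String> doc = sampleDocument();
        // Scoped to "author": the mention of Alice in the text is ignored.
        System.out.println(matchesField(doc, "author", "alice"));
        // Unscoped: the text produces a hit for a user who meant the author.
        System.out.println(matchesAnyField(doc, "alice"));
        // Metadata-only file: still found through its metadata field.
        System.out.println(matchesField(metadataOnlyDocument(), "author", "carol"));
    }
}
```

If the real visitor always scopes metadata criteria to their own fields, a 
single Document may be enough; the wrong-document case in this model only 
shows up when the lookup is not tied to a field.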

Thanks, Sergey

> Introduce Tika Search Visitor
> -----------------------------
>
>                 Key: CXF-5549
>                 URL: https://issues.apache.org/jira/browse/CXF-5549
>             Project: CXF
>          Issue Type: New Feature
>          Components: JAX-RS
>            Reporter: Sergey Beryozkin
>            Assignee: Andriy Redko
>            Priority: Minor
>
> Introduce TikaSearchVisitor which will convert a FIQL/etc search expression 
> into an Apache Tika component that can be used to search binary data; for 
> example, the service can support something like "find all PDF files matching 
> a given expression"



--
This message was sent by Atlassian JIRA
(v6.2#6252)
