[jira] [Commented] (CXF-5549) Introduce Tika Search Visitor

Andriy Redko (JIRA) Sat, 14 Jun 2014 11:14:24 -0700

    [ 
https://issues.apache.org/jira/browse/CXF-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14031641#comment-14031641
 ]


Andriy Redko commented on CXF-5549:
-----------------------------------

As agreed, the first step is to create an extractor and a test that will:
 - extract metadata and raw content from a provided PDF document using Apache 
Tika
 - use Apache Lucene (RAM index / StandardAnalyzer) to index the metadata and 
content
 - use LuceneQueryVisitor on top of the filter expression and the index to match 
the documents
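
To make the three steps above concrete, here is a rough, dependency-free sketch
of the extract -> index -> match flow. It does not call Tika, Lucene or
LuceneQueryVisitor: ExtractedDocument stands in for the output of a Tika parse,
InMemoryIndex stands in for a Lucene RAM index, and match() stands in for
applying the visitor to a filter expression. All class, method and field names
(including the "contents" field) are assumptions made for illustration, and
only "field==value" clauses joined by ";" (AND) are handled.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical stand-in for the planned flow; not the CXF implementation. */
class TikaSearchSketch {

    /** What an extractor would hand over: metadata plus raw text content. */
    static final class ExtractedDocument {
        final String id;
        final Map<String, String> metadata;
        final String content;

        ExtractedDocument(String id, Map<String, String> metadata, String content) {
            this.id = id;
            this.metadata = metadata;
            this.content = content;
        }
    }

    /** Tiny inverted index: field -> term -> ids of documents containing it. */
    static final class InMemoryIndex {
        private final Map<String, Map<String, Set<String>>> postings = new HashMap<>();

        void add(ExtractedDocument doc) {
            for (Map.Entry<String, String> entry : doc.metadata.entrySet()) {
                indexField(doc.id, entry.getKey(), entry.getValue());
            }
            // "contents" is an assumed field name for the raw text.
            indexField(doc.id, "contents", doc.content);
        }

        private void indexField(String id, String field, String text) {
            for (String term : tokenize(text)) {
                postings.computeIfAbsent(field, f -> new HashMap<>())
                        .computeIfAbsent(term, t -> new TreeSet<>())
                        .add(id);
            }
        }

        Set<String> lookup(String field, String term) {
            return postings.getOrDefault(field, Collections.emptyMap())
                           .getOrDefault(term, Collections.emptySet());
        }
    }

    /** Lower-cases the text and splits it on anything that is not a letter or digit. */
    static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return terms;
    }

    /** Returns ids of documents satisfying every "field==value" clause. */
    static Set<String> match(InMemoryIndex index, String expression) {
        Set<String> result = null;
        for (String clause : expression.split(";")) {
            String[] parts = clause.split("==", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException("Unsupported clause: " + clause);
            }
            String field = parts[0].trim();
            for (String term : tokenize(parts[1])) {
                Set<String> hits = index.lookup(field, term);
                if (result == null) {
                    result = new TreeSet<>(hits);
                } else {
                    result.retainAll(hits);
                }
            }
        }
        return result == null ? new TreeSet<>() : result;
    }

    public static void main(String[] args) {
        InMemoryIndex index = new InMemoryIndex();
        Map<String, String> metadata = new HashMap<>();
        metadata.put("author", "Jane Doe"); // made-up sample metadata
        index.add(new ExtractedDocument("report.pdf", metadata, "Tika extracts raw content"));
        System.out.println(match(index, "contents==tika;author==doe"));
    }
}
```

In the real test the extraction would come from a Tika parser, the index from
Lucene with StandardAnalyzer, and the query from LuceneQueryVisitor; the sketch
only shows how the three pieces are expected to fit together.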

Thanks.
Andriy

> Introduce Tika Search Visitor
> -----------------------------
>
>                 Key: CXF-5549
>                 URL: https://issues.apache.org/jira/browse/CXF-5549
>             Project: CXF
>          Issue Type: New Feature
>          Components: JAX-RS
>            Reporter: Sergey Beryozkin
>            Assignee: Andriy Redko
>            Priority: Minor
>
> Introduce TikaSearchVisitor, which will convert a FIQL (or similar) search 
> expression into an Apache Tika component that can be used to search binary 
> data; for example, the service could support something like "find all PDF 
> files matching a given expression".



--
This message was sent by Atlassian JIRA
(v6.2#6252)
