Re: JackRabbit Search Engine Questions

Marcel Reutegger Thu, 03 May 2007 05:51:10 -0700

Hi Belinda,

Belinda Randolph wrote:

1.  Can I replace the JackRabbit search engine with my own?

Yes, you can. There are several interfaces you have to implement. See the QueryHandler interface for a starting point:

http://svn.apache.org/repos/asf/jackrabbit/tags/1.3/jackrabbit-core/src/main/java/org/apache/jackrabbit/core/query/QueryHandler.java

2. Does your search engine look through actual document contents - as a background process or at the time of the actual user search?

Text is extracted from documents and indexed either when the document is saved or later by a number of background threads. Which of the two happens is configurable.

3. What FORMATs of actual documents does your search engine look at? (ASCII, Microsoft, PDF, etc.)


The currently supported formats are:
- Microsoft Word, Excel, PowerPoint
- PDF
- Open Office Documents (text, spreadsheet, presentation, etc.)
- RTF
- HTML
- XML

Text extraction in Jackrabbit is extensible. See:
http://svn.apache.org/repos/asf/jackrabbit/tags/1.3/jackrabbit-text-extractors/src/main/java/org/apache/jackrabbit/extractor/TextExtractor.java
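
As a rough illustration of what such an extension could look like, here is a minimal sketch. It does not compile against Jackrabbit itself: SimpleTextExtractor is a hypothetical stand-in whose two methods are assumed to resemble the TextExtractor interface linked above, and CsvTextExtractor and the text/csv content type are made up for the example. Please check the linked source for the real signatures.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical stand-in for Jackrabbit's TextExtractor interface, declared
// here only so that the sketch compiles without Jackrabbit on the classpath.
interface SimpleTextExtractor {
    // Content types this extractor claims to handle.
    String[] getContentTypes();

    // Returns the plain text found in the given binary stream.
    Reader extractText(InputStream stream, String type, String encoding)
            throws IOException;
}

// Illustrative extractor (not part of Jackrabbit): turns CSV into plain text
// by replacing the comma separators with spaces.
class CsvTextExtractor implements SimpleTextExtractor {

    public String[] getContentTypes() {
        return new String[] {"text/csv"};
    }

    public Reader extractText(InputStream stream, String type, String encoding)
            throws IOException {
        // Assumption: fall back to UTF-8 when no encoding is supplied.
        String charset = (encoding != null) ? encoding : "UTF-8";
        StringBuilder text = new StringBuilder();
        try (BufferedReader in =
                new BufferedReader(new InputStreamReader(stream, charset))) {
            String line;
            while ((line = in.readLine()) != null) {
                text.append(line.replace(',', ' ')).append('\n');
            }
        }
        return new StringReader(text.toString());
    }
}
```

An OCR-based extractor would follow the same pattern: read the stream, run the recognition step, and hand the recognized text back as a Reader.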

4. When searching the contents of a PDF file, does the background process, using OCR, create an additional file in another format? What format?

The text extractor in Jackrabbit does not use OCR technology, but if you have an existing Java solution you may easily integrate it into Jackrabbit.

5. Does your OCR routine search FORMATS other than PDF? If yes, what formats can the OCR search?

n/a

6.  What are the resolution requirements for your OCR routines?

n/a

7. Can I change the GUI to a) add functionality or error checking and b) to look personalized with CSS?

Jackrabbit is a content repository infrastructure and does not come with a user interface. You may use any existing JCR-compliant application on top of Jackrabbit.

8. Can the search engine search using both requested metadata element values and keywords from the document contents?


Yes, this is possible.
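
To give an idea of what such a combined query might look like, here is a small sketch that builds a JCR XPath statement with both a full-text constraint (jcr:contains) and a property constraint. The class name CombinedQuery, the property my:author and the choice of nt:base as node type are assumptions for illustration only; use the node types and property names from your own content model.

```java
// Illustrative helper (not part of Jackrabbit): builds a JCR XPath statement
// that combines a full-text constraint with one metadata property constraint.
class CombinedQuery {

    // Wraps a value in single quotes, doubling any embedded single quote
    // as required inside an XPath string literal.
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    // keywords: full-text search terms; property/value: metadata constraint.
    static String build(String keywords, String property, String value) {
        return "//element(*, nt:base)[jcr:contains(., " + quote(keywords)
                + ") and @" + property + " = " + quote(value) + "]";
    }

    // With a javax.jcr.Session at hand, the statement would be executed via
    // the standard JCR API, roughly like this:
    //   QueryManager qm = session.getWorkspace().getQueryManager();
    //   QueryResult result = qm.createQuery(statement, Query.XPATH).execute();
}
```

For example, CombinedQuery.build("budget report", "my:author", "Smith") yields //element(*, nt:base)[jcr:contains(., 'budget report') and @my:author = 'Smith'].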

9. Can I start with keywords from the document contents and then later filter the results using user inputted metadata element values?


Yes, you would simply execute a second query that includes metadata values.

10. Can I start with user input metadata element values and then later filter down the results with document contents?

Yes, you would execute the first query with just the metadata values and then a second one with additional keywords entered by the user.

11. After an initial search, can I refine my search by only looking at the results of the previous search?


Yes, you would simply execute the initial query again with additional search 
terms.
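
One possible way to organize this on the application side is sketched below: keep the constraints of the previous search and append a new one for each refinement, then run the resulting statement as a fresh query. RefinableSearch is a hypothetical helper, not a Jackrabbit class, and my:year is a made-up property name.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative helper (not part of Jackrabbit): remembers the predicates of
// a search so that a refinement re-runs the query with one more constraint.
final class RefinableSearch {

    private final List<String> predicates;

    private RefinableSearch(List<String> predicates) {
        this.predicates = predicates;
    }

    // Starts a search with a single XPath predicate.
    static RefinableSearch start(String predicate) {
        return new RefinableSearch(Collections.singletonList(predicate));
    }

    // Returns a new search that narrows this one; the original is unchanged.
    RefinableSearch refine(String predicate) {
        List<String> next = new ArrayList<String>(predicates);
        next.add(predicate);
        return new RefinableSearch(next);
    }

    // Renders all predicates, joined with "and", as a JCR XPath statement.
    String toXPath() {
        return "//element(*, nt:base)[" + String.join(" and ", predicates) + "]";
    }
}
```

Starting with RefinableSearch.start("jcr:contains(., 'invoice')") and then calling refine("@my:year = '2007'") gives a statement that matches only nodes satisfying both constraints, which has the same effect as searching within the previous result.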

regards
 marcel
