Problem getting full textual search to work with textextractors

Kurz Wolfgang Thu, 26 Mar 2009 09:31:41 -0700

Hello everyone,

I am trying to get full-text search to work with text extractors.



I uploaded a PDF file as a resource into Jackrabbit, which works fine: I can 
download it and I get the file back.

But now I want to implement full-text search inside the documents I uploaded, 
and somehow it doesn't find the document even though it contains the strings 
that I am searching for.

What I did is this:

I added these jar files to my Tomcat server lib folder, since I am using JNDI 
to connect:

-jackrabbit-text-extractors-1.5.0.jar
-fontbox-0.1.0.jar
-junit-3.8.1.jar
-nekohtml-1.9.7.jar
-pdfbox-0.7.3.jar
-poi-3.0.2-FINAL.jar
-poi-scratchpad-3.0.2-FINAL.jar
-tm-extractors-0.4.jar

Then my XPath query looks like this:

//*[((jcr:contains(.,'consetetur')) or (jcr:contains(.,'sadipscing')))]

Both of those words are inside the PDF, but the search result is empty.
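
To make the shape of the query above easier to check, here is a minimal 
sketch of how such a query string could be assembled from a list of search 
terms. The class ContainsQueryBuilder and its build method are hypothetical 
names used only for illustration; they are not part of Jackrabbit or of my 
actual code. The sketch assumes that an apostrophe inside an XPath string 
literal is escaped by doubling it, and it does not handle the special 
characters of the jcr:contains search expression itself.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper, not part of Jackrabbit or of the original code. */
class ContainsQueryBuilder {

    /** Builds an XPath query matching any node whose text contains at least one term. */
    static String build(List<String> terms) {
        if (terms == null || terms.isEmpty()) {
            throw new IllegalArgumentException("at least one search term is required");
        }
        String clauses = terms.stream()
                // Assumption: an apostrophe in an XPath string literal is escaped by doubling it.
                .map(term -> "(jcr:contains(.,'" + term.replace("'", "''") + "'))")
                .collect(Collectors.joining(" or "));
        return "//*[(" + clauses + ")]";
    }

    public static void main(String[] args) {
        // Prints the same query string as the one shown above.
        System.out.println(build(Arrays.asList("consetetur", "sadipscing")));
    }
}
```

The resulting string would then be passed as the query argument to 
createQuery, with the language argument set to XPath.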

Here is the code I use to run the search:

    javax.jcr.query.Query jcrQuery;
    try {
        jcrQuery = session.getWorkspace().getQueryManager().createQuery(query, language);
        QueryResult queryResult = jcrQuery.execute();
        NodeIterator nodeIterator = queryResult.getNodes();
        return nodeIterator;
    }
    catch (InvalidQueryException iqe) {
        throw new org.apache.jackrabbit.ocm.exception.InvalidQueryException(iqe);
    }
    catch (RepositoryException re) {
        throw new ObjectContentManagerException(re.getMessage(), re);
    }


It would be really awesome if anyone had an idea why this doesn't work.

Thanks a lot in advance,
Wolfgang
