Re: Proper Workspace/Indexing Configuration

Sean Callan Fri, 29 Feb 2008 05:55:00 -0800

Hi Marcel,

You've answered all my questions in one morning, what a great thing!  I did
realize my issue with the dependencies for text extraction the other day by
stumbling upon the pom.xml.  In addition I found there are some other
dependencies necessary for PDF extraction listed on the PDFBox website.
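
For anyone who lands on this thread later, here is a rough sketch of how the text extractors might be declared in a Maven build so that their dependencies (POI and friends) come along transitively. The coordinates and the 1.4 version are an assumption based on the tag in the pom.xml URL Marcel posted; nothing in this thread confirms them, and the extra PDFBox jars may still need to be added by hand.

```xml
<!-- Sketch only: assumes a Maven build and the Jackrabbit 1.4 release. -->
<dependency>
  <groupId>org.apache.jackrabbit</groupId>
  <artifactId>jackrabbit-text-extractors</artifactId>
  <version>1.4</version>
</dependency>
```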


Thanks so much for all your help, I owe you a beer, let me know when you're
in Washington, DC!

Thanks,
Sean

On Fri, Feb 29, 2008 at 5:04 AM, Marcel Reutegger <[EMAIL PROTECTED]>
wrote:

> Hi Sean,
>
> did you include all required jar files into your classpath? e.g. text
> extraction from MS office documents requires apache poi. see dependencies here:
>
> http://svn.apache.org/repos/asf/jackrabbit/tags/1.4/jackrabbit-text-extractors/pom.xml
>
> regards
>   marcel
>
> Sean Callan wrote:
> > Hi guys,
> >
> > Would anyone be so kind as to send me a functional repository configuration
> > that indexes a variety of nt:file types?  I'm using the following
> > configuration and I am unable to search for a term within any of my binary
> > content (nt:file > jcr:content).
> >
> > At this point I'm out of ideas, the correct jars are in place, searching
> > works on all my plain text nodes, I can even see that the index is updated
> > when I add in new nt:file nodes.  But a search returns nothing.  At this
> > point search is the only thing holding back my development and client's
> > acceptance of Jackrabbit as our repository.
> >
> > Any help would be greatly appreciated!
> >
> > <?xml version="1.0" encoding="ISO-8859-1"?>
> > <Repository>
> >     <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
> >         <param name="path" value="${rep.home}/repository"/>
> >     </FileSystem>
> >     <Security appName="Jackrabbit">
> >         <AccessManager
> >             class="org.apache.jackrabbit.core.security.SimpleAccessManager"/>
> >         <LoginModule
> >             class="org.apache.jackrabbit.core.security.SimpleLoginModule">
> >             <param name="userid" value="" />
> >         </LoginModule>
> >     </Security>
> >     <Workspaces rootPath="${rep.home}/workspaces" defaultWorkspace="default"/>
> >     <Workspace name="${wsp.name}">
> >         <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
> >             <param name="path" value="${wsp.home}"/>
> >         </FileSystem>
> >         <PersistenceManager
> >             class="org.apache.jackrabbit.core.state.xml.XMLPersistenceManager"/>
> >         <SearchIndex
> >             class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
> >             <param name="path" value="${wsp.home}/index"/>
> >             <param name="textFilterClasses" value="
> >                 org.apache.jackrabbit.extractor.PlainTextExtractor,
> >                 org.apache.jackrabbit.extractor.MsWordTextExtractor,
> >                 org.apache.jackrabbit.extractor.MsExcelTextExtractor,
> >                 org.apache.jackrabbit.extractor.MsPowerPointTextExtractor,
> >                 org.apache.jackrabbit.extractor.PdfTextExtractor,
> >                 org.apache.jackrabbit.extractor.OpenOfficeTextExtractor,
> >                 org.apache.jackrabbit.extractor.RTFTextExtractor,
> >                 org.apache.jackrabbit.extractor.HTMLTextExtractor,
> >                 org.apache.jackrabbit.extractor.XMLTextExtractor"/>
> >         </SearchIndex>
> >     </Workspace>
> >     <Versioning rootPath="${rep.home}/versions">
> >         <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
> >             <param name="path" value="${rep.home}/versions"/>
> >         </FileSystem>
> >         <PersistenceManager
> >             class="org.apache.jackrabbit.core.state.xml.XMLPersistenceManager"/>
> >     </Versioning>
> > </Repository>
> >
> > Thanks,
> > Sean
> >
>
>
