Re: [Ecm] Howto configure JCR repository in nuxeo 5.2 to support blobs indexing

Stefan Dimov Sun, 26 Apr 2009 10:49:00 -0700

Thank you very much for this explanation and the wiki post. You saved me a lot of nights of debugging.
I just tested and everything is working fine.
I had to add some text extractors to the configuration, and now indexing and searching work perfectly for attached documents.
My working configuration for workspace.xml is:
        <SearchIndex class="org.nuxeo.ecm.core.repository.jcr.jackrabbit.SearchIndex">
            <param name="path" value="${wsp.name}/index"/>
            <param name="indexingConfiguration" value="${wsp.home}/indexing_configuration.xml"/>
            <param name="textFilterClasses" value="org.apache.jackrabbit.extractor.MsWordTextExtractor,
                                org.apache.jackrabbit.extractor.MsExcelTextExtractor,
                                org.apache.jackrabbit.extractor.MsPowerPointTextExtractor,
                                org.apache.jackrabbit.extractor.PdfTextExtractor,
                                org.apache.jackrabbit.extractor.OpenOfficeTextExtractor,
                                org.apache.jackrabbit.extractor.RTFTextExtractor,
                                org.apache.jackrabbit.extractor.XMLTextExtractor"/>
            <param name="extractorPoolSize" value="2"/>
        </SearchIndex>

Where can I find documentation about how schemas are mapped to JCR? That is, when I contribute a schema and a new document type to Nuxeo, how can I find out how they are mapped to JCR? I would also like to know the mappings for the built-in core types.

Thank you very much again.
Regards,
Stefan

[email protected] wrote:

If you use the JCR backend, then fulltext configuration is left entirely to Jackrabbit.
You may want to read the documentation at http://wiki.apache.org/jackrabbit/IndexingConfiguration


Read especially the parts about index aggregates. Nuxeo stores complex properties as JCR subnodes, so an index aggregate is needed for the fulltext held in those subnodes to be indexed together with the main document.

You should get something like:

  <aggregate primaryType="ecmdt:File">
    <include primaryType="ecmft:content">*</include>
    <include primaryType="ecmft:content">*/*/*</include>
  </aggregate>

Note that Jackrabbit doesn't use subtyping when matching the aggregate primaryType, so you'll have to repeat the rule for all the document types you have.
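
As a rough sketch of what repeating the rule looks like, a complete indexing_configuration.xml might resemble the following. The DOCTYPE line follows the Jackrabbit wiki page linked above. The namespace URIs and the ecmdt:MyCustomDoc type are assumptions for illustration only; check them against the namespaces and document types actually registered in your repository.

```xml
<?xml version="1.0"?>
<!DOCTYPE configuration SYSTEM "http://jackrabbit.apache.org/dtd/indexing-configuration-1.0.dtd">
<!-- Namespace URIs are assumed; replace them with the ones registered in your repository. -->
<configuration xmlns:ecmdt="http://nuxeo.org/ecm/jcr/docs"
               xmlns:ecmft="http://nuxeo.org/ecm/jcr/fields">

  <!-- Rule for the built-in File document type (same as the snippet above). -->
  <aggregate primaryType="ecmdt:File">
    <include primaryType="ecmft:content">*</include>
    <include primaryType="ecmft:content">*/*/*</include>
  </aggregate>

  <!-- Same rule repeated for a hypothetical custom type; Jackrabbit will not
       apply the ecmdt:File rule to it even if it extends File. -->
  <aggregate primaryType="ecmdt:MyCustomDoc">
    <include primaryType="ecmft:content">*</include>
    <include primaryType="ecmft:content">*/*/*</include>
  </aggregate>

</configuration>
```

This file is the one that the indexingConfiguration parameter of the SearchIndex element in workspace.xml should point to. Content that was stored before the rules were added will probably need to be reindexed before it shows up in fulltext search.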
--
Posted by "fguillaume" at Nuxeo Discussions <http://nuxeo.org/discussions>
View the complete thread: <http://www.nuxeo.org/discussions/thread.jspa?threadID=2397#6510>
_______________________________________________
ECM mailing list
[email protected]
http://lists.nuxeo.com/mailman/listinfo/ecm
To unsubscribe, go to http://lists.nuxeo.com/mailman/options/ecm

Stefan Dimov
tel/fax : +359 2 987 08 15
mobile: +359 888 52 54 88
web : http://www.setelis.com
