On Wed, May 12, 2010 at 14:12, Jenni Pothu <[email protected]> wrote:
> Hi Alex,
> Thanks for the reply and information. It is very useful. Using
> Jcr:contains I am able to search on the node content. But I need to search
> the file content also. It's not working with Jcr:contains. Thanks again for
> the needful.

Binary properties of nt:file nodes are full-text extracted with the help of Apache Tika (since 2.0 [1]; before that, Jackrabbit had its own text extractors [2] [3]).

Whether a file is supported depends on its format and on whether an open source library is available that can handle that format. Some formats, such as PDF, come in so many varieties that extraction issues come up every now and then.

Also note that large text extractions are queued, so the result might not be immediately visible after the save.

[1] http://lucene.apache.org/tika/
[2] http://jackrabbit.apache.org/jackrabbit-text-extractors.html
[3] http://wiki.apache.org/jackrabbit/Search

Regards,
Alex

--
Alexander Klimetschek
[email protected]
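
For illustration, here is a minimal sketch of a query that targets the extracted file content instead of the nt:file node itself. It assumes the binary sits in the usual jcr:content child node of each nt:file node and that the XPath query language is used. The class and method names (FileContentQuery, build, escape) are hypothetical and not part of Jackrabbit; the sketch only builds the statement string, so it runs without a repository.

```java
public class FileContentQuery {

    /** Escapes a value for a single-quoted XPath string literal by doubling quotes. */
    static String escape(String text) {
        return text.replace("'", "''");
    }

    /**
     * Builds an XPath statement matching nt:file nodes whose jcr:content
     * child contains the given text in its full-text index.
     * Assumption: the binary is stored under the standard jcr:content child.
     */
    static String build(String text) {
        return "//element(*, nt:file)[jcr:contains(jcr:content, '" + escape(text) + "')]";
    }

    public static void main(String[] args) {
        // Against a live repository the statement would be handed to the JCR API:
        //   session.getWorkspace().getQueryManager()
        //          .createQuery(statement, Query.XPATH).execute();
        System.out.println(build("needful"));
        // prints: //element(*, nt:file)[jcr:contains(jcr:content, 'needful')]
    }
}
```

If such a query returns nothing right after uploading a file, it may simply be that the text extraction is still queued, so retrying after a short wait is worth a try before assuming the format is unsupported.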
