Alex, thanks for your quick reply. I will keep all of this in mind.
We fixed this problem by adding a new property of type String to the node that holds the extracted text content, and we now run the query against that property. We use a custom text extractor (Tika text extractor), and we added the code needed to populate the new property whenever a file is uploaded.
Thanks again
Greetings
Victor

On 7/19/2012 4:09 PM, Alexander Klimetschek wrote:
On 16.07.2012, at 20:51, Victor Giordano wrote:

Hi friends, I have a question about writing an XPath expression that filters
resources by a property of type InputStream called data.
How can I do a text search? For example, this is working:

String xpath1 = "<my app path>//element(*, nt:resource) [jcr:contains(@jcr:mimeType,'*plain*')]";
String xpath2 = "<my app path>//element(*, nt:resource) [jcr:contains(@jcr:encoding,'*utf*')]";

FYI: jcr:contains() runs full text searches (with terms split up, word stemming 
etc.), so you don't need wildcards. Just use

jcr:contains(@jcr:mimeType, 'plain')

If you want real pattern-like matching (and highly-structured mime type or 
encoding values are probably better served by that), use jcr:like, which uses % 
as wildcard:

jcr:like(@jcr:mimeType, '%plain')

This should only match a value "text/plain" or "plain", but not "plain with a 
suffix".
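
To make the difference between the two functions concrete, here is a minimal sketch that only builds the predicate strings. The class and method names (MimeTypePredicates, quote, contains, endsWith) are hypothetical and not part of JCR or Jackrabbit, and the quote-doubling rule is my assumption about how XPath string literals are escaped in JCR queries:

```java
/** Hypothetical helper (not a JCR or Jackrabbit API): builds XPath predicates as plain strings. */
final class MimeTypePredicates {
    private MimeTypePredicates() {}

    /** Wraps a value in single quotes, doubling embedded quotes (assumed XPath literal escaping). */
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    /** Full-text match on one property; the term is passed without wildcards. */
    static String contains(String property, String term) {
        return "jcr:contains(@" + property + ", " + quote(term) + ")";
    }

    /**
     * Pattern match on one property for values ending in the given suffix.
     * '%' is the jcr:like wildcard; a '%' or '_' inside the suffix is NOT escaped here.
     */
    static String endsWith(String property, String suffix) {
        return "jcr:like(@" + property + ", " + quote("%" + suffix) + ")";
    }
}
```

Calling MimeTypePredicates.endsWith("jcr:mimeType", "plain") yields the jcr:like predicate shown above, which can then be placed inside the [...] of the query.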

But this is not working:

String xpath3 = "<my app path>//element(*, nt:resource) [jcr:contains(@jcr:data,'*plain*')]";

The full text index for binary content is by default aggregated on the node itself, which 
you address with ".":

//element(*, nt:resource) [jcr:contains(.,'plain')]

The index configuration is documented here: 
http://wiki.apache.org/jackrabbit/IndexingConfiguration
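
As a rough sketch of the node-scoped query in Java, the statement can be assembled like this. ResourceTextQuery and its build method are hypothetical names, the base path in the usage note is only an example, and doubling single quotes in the term is my assumption about XPath literal escaping:

```java
/** Hypothetical helper (not a JCR API): builds a node-scoped full-text XPath statement. */
final class ResourceTextQuery {
    private ResourceTextQuery() {}

    /**
     * Returns an XPath statement that searches the full-text index of every
     * nt:resource node below basePath. The "." scope addresses the node itself,
     * which is where extracted binary text is aggregated by default.
     */
    static String build(String basePath, String term) {
        // Double embedded single quotes so the term stays one string literal (assumption).
        String literal = "'" + term.replace("'", "''") + "'";
        return basePath + "//element(*, nt:resource)[jcr:contains(., " + literal + ")]";
    }
}
```

For example, ResourceTextQuery.build("/jcr:root/myapp", "plain") gives a statement that can be passed to QueryManager.createQuery(statement, Query.XPATH).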

Cheers,
Alex

