I am adding BinaryValue properties to my nodes.  It appears that Jackrabbit 
does not index the contents of a BinaryValue, even when those contents 
represent a string.  If I add the same text as a StringValue, the value is 
indexed and picked up by a contains search.

I have 2 issues with this:

1) String property values have a limit of around 16000 characters because the 
SimpleDBPersistence adapter stores the value in a BLOB field.  I get MySQL 
data truncation errors unless I chop the data down to 16000 characters.  In 
addition, I am doubling my space requirements.  Not only do I have to store my 
binary content, but also its string representation in the node.

2) I use a byte[] array throughout my application as a means to store PDF 
files, image files, text files, etc.  It is a "common denominator for all 
content": PDF files, image files, wiki entries, etc. can all be stored, 
passed around, and retrieved as a byte[] array.  I would like to figure out 
how to get Jackrabbit to index the byte[] array properly.

3) Not an issue, but a question.  How does Jackrabbit know that a node is a PDF 
document?  It must figure it out somehow, because I see that there is support 
in the SearchIndex for configuring PDF extraction.  Do I add a "jcr:mimeType" 
property of application/pdf to my PDF node, and will that do it?  Will this 
solve the first 2 issues?

I appreciate your thoughts on this!


My Code:

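String contentText = "this is a unique piece of text";
byte[] bytes = contentText.getBytes();
// Binary copy of the content: this is the property that does not get indexed
node.setProperty("content", new BinaryValue(bytes));
// Truncate to avoid the data truncation errors on long String values
if (contentText.length() > 16000) {
        contentText = contentText.substring(0, 16000);
}
// String copy of the same content: this one is indexed and found by the search
node.setProperty("worksproperty", new StringValue(contentText));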



This is my XPath query:
//*[jcr:contains(.,'unique')]

