Hi,

Did you make sure that the workspace configuration file (workspace.xml) is the
same as the one you used with Jackrabbit 1.4.5? That's where the text extractor
classes are configured, and those are responsible for extracting the text from
your PDF and ODT files.
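
For reference, the relevant part of workspace.xml looks roughly like the
fragment below. This is only a sketch, assuming the stock Lucene SearchIndex
and the extractor classes shipped in the jackrabbit-text-extractors module;
the index path and the exact list of extractors are assumptions and may differ
in your setup. Note that an existing workspace.xml is not rewritten on
upgrade: it is created from the Workspace template in repository.xml when the
workspace is first created, so it has to be checked separately.

```xml
<!-- Sketch of the SearchIndex section in workspace.xml (values are assumptions) -->
<SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
  <!-- Location of the Lucene index for this workspace -->
  <param name="path" value="${wsp.home}/index"/>
  <!-- Comma-separated list of text extractor classes used for binary content -->
  <param name="textFilterClasses"
         value="org.apache.jackrabbit.extractor.PlainTextExtractor,
                org.apache.jackrabbit.extractor.PdfTextExtractor,
                org.apache.jackrabbit.extractor.OpenOfficeTextExtractor"/>
</SearchIndex>
```

If the PDF or OpenOffice extractor is missing from that list, or the
extractor jars are not on the classpath, newly added files of that type would
be indexed by their properties only.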

regards
 marcel

Jop Zinkweg - Initworks B.V. wrote:
> Hello everyone,
> 
> After upgrading from Jackrabbit 1.4(.5) to 1.5(.0), the behaviour of our
> search box changed.
> 
> Previously we were able to search the 'binary' contents and additional
> properties of a jcr:content node using the following query:
> 
> SELECT * FROM bos:correspondentie WHERE CONTAINS(.,'abc')
> 
> After upgrading this query only returns 'old' files (uploaded while
> running 1.4) which have 'abc' in them (pdf / odt files).
> 
> When searching for a value known to be in a 'meta' property both 'old'
> and 'new' (1.5) files are returned.
> 
> After removing the index both the 'old' and 'new' files can only be
> found using their properties.
> 
> This leads me to believe the indexing behaviour (and not the query
> behaviour) has changed between 1.4 and 1.5.
> 
> We're running a vanilla 1.4 configuration, and looking at the 1.5
> vanilla config nothing has changed in respect to the searching/indexing
> default setup.
> 
> Our node structure is as follows:
> 
> * 'folder' = nt:folder
>  * 'file.odt' = nt:file
>    * 'jcr:content' = nt:unstructured (+ bos:correspondentie mixin
> defining some properties)
> 
> 
> Has the indexing behaviour changed in 1.5, or am I looking at another
> problem entirely?
> 
> Thanks in advance,
> 
> Jop Zinkweg
> 
> 
