Sorted it by removing the Apache Tika extraction from:

WEB-INF/classes/fgsconfigFinal/index/FgsIndex/foxmlToSolrGenerated.xslt

It seems that it extracts the content and tries to index it. I'm not sure why
it would want to extract the content of an image, but when it does, it causes
Solr to fail to index the resource:

SEVERE: org.apache.solr.common.SolrException: Illegal character (NULL, unicode 0) encountered: not valid in any content

It only seems to think that some jpg files are not jpg files.
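
A possible alternative to removing the Tika step, sketched here as an assumption and not tested against gsearch: strip the characters that XML 1.0 forbids (such as the NULL in the error above) from any extracted text before it is sent to Solr. The function name strip_invalid_xml_chars is hypothetical and Python is used only for illustration; it is not part of gsearch, Tika or Solr.

```python
import re

# Characters that XML 1.0 does not allow in content: the C0 control
# characters other than tab (\x09), LF (\x0a) and CR (\x0d), plus the
# noncharacters U+FFFE and U+FFFF. NULL (\x00) is the one in the Solr error.
_INVALID_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\ufffe\uffff]")


def strip_invalid_xml_chars(text: str) -> str:
    """Return text with characters that are illegal in XML 1.0 removed."""
    return _INVALID_XML_CHARS.sub("", text)


if __name__ == "__main__":
    # Made-up sample: text "extracted" from a binary file with stray bytes.
    extracted = "JFIF\x00\x10 some text\n"
    print(repr(strip_invalid_xml_chars(extracted)))
```

This only hides the symptom: for an image/jpeg datastream there is usually no useful text to index, so skipping extraction for such datastreams, as done above, is probably the cleaner fix.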

Alistair

--
mov eax,1
mov ebx,0
int 80h

From: Alistair Young <alistair.yo...@uhi.ac.uk>
Reply-To: "Support and info exchange list for Fedora users." <fedora-commons-users@lists.sourceforge.net>
Date: Wednesday, 24 July 2013 11:03
To: "Support and info exchange list for Fedora users." <fedora-commons-users@lists.sourceforge.net>
Subject: Re: [fcrepo-user] Does gsearch index content with solr?

Sorry, I should have mentioned: it's the content datastream, i.e. image/jpeg

Alistair

From: Alistair Young <alistair.yo...@uhi.ac.uk>
Reply-To: "Support and info exchange list for Fedora users." <fedora-commons-users@lists.sourceforge.net>
Date: Wednesday, 24 July 2013 10:59
To: "Support and info exchange list for Fedora users." <fedora-commons-users@lists.sourceforge.net>
Subject: [fcrepo-user] Does gsearch index content with solr?

I have a weird problem. I dropped a foxml file into
FgsConfig/indexingXsltGenerator/foxml and configured it, etc., but certain
files, when uploaded, cause Solr to crash:

SEVERE: org.apache.solr.common.SolrException: Illegal character (NULL, unicode 0) encountered: not valid in any content

If I don't include the datastream in the foxml, it doesn't cause the crash,
i.e. if I remove this:

<foxml:datastream ID="AUDIT" STATE="A" CONTROL_GROUP="X" VERSIONABLE="false">

Should the foxml used to configure gsearch only contain 'metadata', i.e. DC, 
RDF etc and not datastreams?

thanks,

Alistair
