I can see how it's useful, but with it in place I have a jpeg file that can't be indexed. What sort of technical assertions does it extract or infer? I could check whether there's something strange in the image file.
Alternately, what's the source file and I'll have a look...

Alistair

--
mov eax,1
mov ebx,0
int 80h

On 24/07/2013 13:42, "aj...@virginia.edu" <aj...@virginia.edu> wrote:

>-----BEGIN PGP SIGNED MESSAGE-----
>Hash: SHA1
>
>I was one of the people who instigated Gert to add that functionality. The
>motivation is to be able to extract technical assertions about binary
>datastreams and use them in indexing. It's not extracting content from
>images, although it could extract content from PDF files or other
>text-containing formats.
>
>On perhaps a more useful note, you should definitely expect to alter the
>default indexing stylesheets, or even better, to create your own that are
>suited to your particular purposes.
>
>- ---
>A. Soroka
>The University of Virginia Library
>
>On Jul 24, 2013, at 8:32 AM, Alistair Young wrote:
>
>> sorted it by removing the Apache Tika extraction from:
>>
>> WEB-INF/classes/fgsconfigFinal/index/FgsIndex/foxmlToSolrGenerated.xslt
>>
>> it seems it extracts the content and tries to index it. Not sure why it
>> would want to extract the content of an image but when it does it causes
>> Solr to fail to index the resource:
>>
>> SEVERE: org.apache.solr.common.SolrException: Illegal character (NULL, unicode 0) encountered: not valid in any content
>>
>> Seems to only think some jpg files are not jpg files.
>>
>> Alistair
>>
>> --
>> mov eax,1
>> mov ebx,0
>> int 80h
>>
>> From: Alistair Young <alistair.yo...@uhi.ac.uk>
>> Reply-To: "Support and info exchange list for Fedora users." <fedora-commons-users@lists.sourceforge.net>
>> Date: Wednesday, 24 July 2013 11:03
>> To: "Support and info exchange list for Fedora users." <fedora-commons-users@lists.sourceforge.net>
>> Subject: Re: [fcrepo-user] Does gsearch index content with solr?
>>
>> sorry should have mentioned, it's the content datastream, i.e. image/jpeg
>>
>> Alistair
>>
>> --
>> mov eax,1
>> mov ebx,0
>> int 80h
>>
>> From: Alistair Young <alistair.yo...@uhi.ac.uk>
>> Reply-To: "Support and info exchange list for Fedora users." <fedora-commons-users@lists.sourceforge.net>
>> Date: Wednesday, 24 July 2013 10:59
>> To: "Support and info exchange list for Fedora users." <fedora-commons-users@lists.sourceforge.net>
>> Subject: [fcrepo-user] Does gsearch index content with solr?
>>
>> I have a weird problem. I dropped a foxml file into
>> FgsConfig/indexingXsltGenerator/foxml and configured etc but certain
>> files, when uploaded cause solr to crash:
>>
>> SEVERE: org.apache.solr.common.SolrException: Illegal character (NULL, unicode 0) encountered: not valid in any content
>>
>> If I don't include datastream in the foxml it doesn't cause the crash,
>> i.e. remove this:
>>
>> <foxml:datastream ID="AUDIT" STATE="A" CONTROL_GROUP="X" VERSIONABLE="false">
>>
>> Should the foxml used to configure gsearch only contain 'metadata',
>> i.e. DC, RDF etc and not datastreams?
>>
>> thanks,
>>
>> Alistair
>>
>> ------------------------------------------------------------------------------
>> See everything from the browser to the database with AppDynamics
>> Get end-to-end visibility with application monitoring from AppDynamics
>> Isolate bottlenecks and diagnose root cause in seconds.
>> Start your free trial of AppDynamics Pro today!
>> http://pubads.g.doubleclick.net/gampad/clk?id=48808831&iu=/4140/ostg.clktrk
>> _______________________________________________
>> Fedora-commons-users mailing list
>> Fedora-commons-users@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/fedora-commons-users
>
>-----BEGIN PGP SIGNATURE-----
>Version: GnuPG/MacGPG2 v2.0.19 (Darwin)
>Comment: GPGTools - http://gpgtools.org
>
>iQEcBAEBAgAGBQJR78vHAAoJEATpPYSyaoIk8dsIALihgJB0b4OABcOcOnk2qthk
>79JqHouayvOFwTNMHsHZMIPXQ9KlD7h/zrHVYPPOqXV8fvNb3+EeQEal5WJxs4Z3
>mMevFpEpBlOWUOBAiEqayNNfnxNCGQ3ARCRXNzeiaheM43ouFCluOGkX9p3fjqSV
>qq6QG862vDFvYF69rMH1NiFIUIA/QP8w/K/QzyI8qoblrzWCX2LmQ8NaH5b0oN1j
>Nb0NXIQv+XOVJZeHFvbHNEzGMGMEWHKs2QsZ1auirOKaO3ccV74+gVTuvDkmmuXL
>VjQQoxNBTqbkhSpoDsWPCkHE+fVGuWyFS/ffJQ/0heX1rWOkiOFgJhhGuwJOl2Y=
>=s4aM
>-----END PGP SIGNATURE-----
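
The fix reported in this thread was to remove the Tika extraction from the stylesheet. For anyone who would rather keep extracted text and feeds it to Solr from their own code, a rough alternative is to drop the characters XML 1.0 does not allow (NUL among them) before building the update document. The sketch below is only an illustration of that idea in Python: the function name is made up, it is not part of GSearch, Tika or Solr, and it assumes you have a hook where the extracted text passes through as a string.

```python
import re

# Everything outside the XML 1.0 "Char" production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
# NUL (unicode 0), the character in the Solr error above, is outside it.
_INVALID_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text: str) -> str:
    """Hypothetical helper: remove characters that are not legal in XML 1.0."""
    return _INVALID_XML_CHARS.sub("", text)


if __name__ == "__main__":
    # Made-up sample value with a stray NUL in it.
    extracted = "Example\x00 text"
    print(repr(strip_invalid_xml_chars(extracted)))
```

Tabs, newlines and ordinary text pass through untouched; only the disallowed code points are removed, so the field value stays otherwise the same.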