Hi Dimitris,

Did this error occur after making the encoding change? It may be a good idea to stop your servlet container, drop/truncate the tables from the ProAI database, delete the ProAI temporary files directory (by default /tmp/proai), and then restart your servlet container. This will rebuild the ProAI database completely and ensure that you're not seeing cached errors.
Regards,
Graeme

On 25 Nov 2010, at 11:00, Dimitris Gavrilis wrote:

Dear Steve,

Thanks for your help. I did change the header (UTF-8) at the top of the file as you suggested, but I still get the same error. The file seems OK when accessed through Fedora (http://localhost:8080/fedora/objects/iid:1/datastreams/mods/content). Below is the error from Fedora's console:

proai.error.ServerException: Error parsing record xml
        at proai.cache.ParsedRecord.<init>(ParsedRecord.java:70)
        at proai.cache.Worker.attempt(Worker.java:111)
        at proai.cache.Worker.run(Worker.java:51)
Caused by: java.io.UTFDataFormatException: Invalid byte 2 of 2-byte UTF-8 sequence.
        at org.apache.xerces.impl.io.UTF8Reader.invalidByte(Unknown Source)
        at org.apache.xerces.impl.io.UTF8Reader.read(Unknown Source)
        at org.apache.xerces.impl.XMLEntityScanner.load(Unknown Source)
        at org.apache.xerces.impl.XMLEntityScanner.scanContent(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanContent(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
        at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
        at javax.xml.parsers.SAXParser.parse(SAXParser.java:395)
        at javax.xml.parsers.SAXParser.parse(SAXParser.java:198)
        at proai.cache.ParsedRecord.<init>(ParsedRecord.java:62)
        ... 2 more

On Thu, Nov 25, 2010 at 12:21 PM, Steve Bayliss <stephen.bayl...@acuityunlimited.net> wrote:

Hi Dimitris

It would certainly be worthwhile trying Graeme's suggestion, although I suspect that if Fedora didn't determine the correct encoding on ingest then this would cause problems elsewhere.
(In any case you should correct this incorrect encoding declaration to UTF-8.)

I've taken a look at the proai oaiprovider source, and there is some "unsafe" code in there where the default platform encoding will be used (e.g. FedoraOAIDriver.java line 275).

1) Could you provide a full log of the exception (i.e. the full stack trace)?
2) Could you try setting the JVM default encoding by using -Dfile.encoding=utf-8 (e.g. add this to CATALINA_OPTS)?

Thanks
Steve

-----Original Message-----
From: West, Graeme [mailto:graeme.w...@gcu.ac.uk]
Sent: 25 November 2010 09:44
To: Support and info exchange list for Fedora users.
Subject: Re: [fcrepo-user] Proia multilingual- java.io.UTFDataFormatException

Hi Dimitris,

I notice that on the first line, your XML declaration states:

<?xml version="1.0" encoding="UTF8"?>

This should be:

<?xml version="1.0" encoding="UTF-8"?>

ProAI is probably rejecting the documents because of this 'unknown' encoding.

Hope this helps.

Regards,

Graeme West
Digital Repository Developer
Information Services
Glasgow Caledonian University
graeme.w...@gcu.ac.uk

On 24 Nov 2010, at 08:31, Dimitris Gavrilis wrote:

Hi Steve,

I'm attaching an xml sample of a record that produces this error.

Thanks,
Dimitris.

On Wed, Nov 24, 2010 at 9:55 AM, Steve Bayliss <stephen.bayl...@acuityunlimited.net> wrote:

Hi Dimitris

Do you have an example object FOXML file that could be used to reproduce this?
Thanks
Steve

-----Original Message-----
From: Dimitris Gavrilis [mailto:gavri...@gmail.com]
Sent: 23 November 2010 15:17
To: fedora-commons-users@lists.sourceforge.net
Subject: [fcrepo-user] Proia multilingual - java.io.UTFDataFormatException

Hi,

I've set up Fedora with ProAI, and whenever ProAI tries to parse non-English (Greek) records I get a java.io.UTFDataFormatException. Although I've seen that this problem exists, I haven't managed to find a solution. When I exclude non-English text, ProAI works fine.

Thanks in advance,
Dimitris.

------------------------------------------------------------------------------
Increase Visibility of Your 3D Game App & Earn a Chance To Win $500!
Tap into the largest installed PC base & get more eyes on your game by
optimizing for Intel(R) Graphics Technology. Get started today with the
Intel(R) Software Partner Program. Five $500 cash prizes are up for grabs.
http://p.sf.net/sfu/intelisp-dev2dev
_______________________________________________
Fedora-commons-users mailing list
Fedora-commons-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/fedora-commons-users

Email has been scanned for viruses by Altman Technologies' email management service - www.altman.co.uk/emailsystems

<iid_1_mods.xml>

Glasgow Caledonian University is a registered Scottish charity, number SC021474
Winner: Times Higher Education's Widening Participation Initiative of the Year 2009 and Herald Society's Education Initiative of the Year 2009
http://www.gcu.ac.uk/newsevents/news/bycategory/theuniversity/1/name,6219,en.html
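
For anyone who finds this thread later: the kind of failure discussed above can be reproduced outside ProAI. The following is only a minimal sketch, not ProAI or Fedora code. It assumes a made-up record (the <title> element and the class and method names are hypothetical) and assumes the ISO-8859-7 charset is available in your JVM. It encodes the same Greek text twice, once as UTF-8 and once as ISO-8859-7, while the XML declaration says UTF-8 in both cases, and then hands the bytes to the JDK's SAX parser.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class EncodingMismatchDemo {

    // Hypothetical sample record; the element name is made up for illustration.
    // The escaped text is the Greek word for Athens.
    static final String RECORD =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
            + "<title>\u0391\u03B8\u03AE\u03BD\u03B1</title>";

    /** Encodes the sample record with an explicitly named charset. */
    static byte[] encode(Charset charset) {
        // Passing the charset explicitly avoids relying on the platform default,
        // which is what a bare getBytes() call would use.
        return RECORD.getBytes(charset);
    }

    /** Returns true if the bytes parse as XML, false if the parser rejects them. */
    static boolean parses(byte[] xml) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new ByteArrayInputStream(xml), new DefaultHandler());
            return true;
        } catch (IOException | SAXException | ParserConfigurationException e) {
            // Malformed byte sequences surface here as an exception.
            return false;
        }
    }

    public static void main(String[] args) {
        // Bytes match the declared encoding.
        System.out.println("UTF-8 bytes parse:      "
                + parses(encode(StandardCharsets.UTF_8)));
        // Bytes do NOT match the declared encoding.
        System.out.println("ISO-8859-7 bytes parse: "
                + parses(encode(Charset.forName("ISO-8859-7"))));
    }
}
```

If the second case is rejected while the first one parses, the declaration is not the problem on its own; the bytes were written in a different encoding from the one declared, which is the situation Steve's -Dfile.encoding suggestion is meant to rule out.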