Dear Steve,
It finally worked. I think it was the -Dfile.encoding=utf-8 in the
JAVA_OPTS.
Thank you very much for your assistance,
Dimitris.
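
To check whether the option actually reached the JVM, something like the following minimal sketch may help. It assumes nothing about Fedora or ProAI; the class `DefaultCharsetCheck` and its methods are hypothetical names made up for illustration. It reports the `file.encoding` system property and the charset the JVM falls back to when code does not name one.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of Fedora or ProAI.
class DefaultCharsetCheck {

    // True when the JVM's default charset is UTF-8.
    static boolean defaultIsUtf8() {
        return Charset.defaultCharset().equals(StandardCharsets.UTF_8);
    }

    // One-line summary of the encoding settings seen by this JVM.
    static String summary() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name()
                + ", utf8=" + defaultIsUtf8();
    }

    public static void main(String[] args) {
        System.out.println(summary());
    }
}
```

Running it once with and once without `-Dfile.encoding=utf-8` on the command line should show whether the setting changes the default. On JDK 18 and later the default is UTF-8 anyway, so the difference mainly shows on older JVMs.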
On Fri, Nov 26, 2010 at 12:02 PM, Steve Bayliss <
stephen.bayl...@acuityunlimited.net> wrote:
> Hi Dimitris
>
> Just to confirm,
>
> - were you specifying the encoding to the Tomcat JVM (ie using
> -Dfile.encoding=utf-8)?
> - which SQL database (and version) are you using?
>
> Thanks
> Steve
>
>
> -----Original Message-----
> *From:* Dimitris Gavrilis [mailto:gavri...@gmail.com]
> *Sent:* 26 November 2010 09:41
> *To:* Support and info exchange list for Fedora users.
> *Subject:* Re: [fcrepo-user] Proia multilingual-
> java.io.UTFDataFormatException
>
> Hi Steve,
>
> I did delete the tmp/proai folder and truncated the proai database but I
> still get the same error (see the log below).
>
>
>
> proai.error.ServerException: Error parsing record xml
> at proai.cache.ParsedRecord.<init>(ParsedRecord.java:70)
> at proai.cache.Worker.attempt(Worker.java:111)
> at proai.cache.Worker.run(Worker.java:51)
> Caused by: java.io.UTFDataFormatException: Invalid byte 2 of 2-byte UTF-8 sequence.
> at org.apache.xerces.impl.io.UTF8Reader.invalidByte(Unknown Source)
> at org.apache.xerces.impl.io.UTF8Reader.read(Unknown Source)
> at org.apache.xerces.impl.XMLEntityScanner.load(Unknown Source)
> at org.apache.xerces.impl.XMLEntityScanner.scanContent(Unknown Source)
> at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanContent(Unknown Source)
> at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
> at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
> at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
> at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
> at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
> at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
> at javax.xml.parsers.SAXParser.parse(SAXParser.java:395)
> at javax.xml.parsers.SAXParser.parse(SAXParser.java:198)
> at proai.cache.ParsedRecord.<init>(ParsedRecord.java:62)
> ... 2 more
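
One common way to get a message like the one in this trace is to hand a UTF-8 parser bytes that were written in a different encoding. Whether that is what happened here is not established by the thread; the sketch below only shows the mechanism. `Utf8ParseDemo` and its methods are hypothetical names, and the two raw bytes are the letters "Ελ" as ISO-8859-7 would encode them, chosen purely as an example of non-UTF-8 Greek.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration; this is not ProAI code.
class Utf8ParseDemo {

    // Wraps the given text bytes in a tiny XML document that declares UTF-8.
    static byte[] document(byte[] textBytes) {
        byte[] head = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><title>"
                .getBytes(StandardCharsets.US_ASCII);
        byte[] tail = "</title>".getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(head, 0, head.length);
        out.write(textBytes, 0, textBytes.length);
        out.write(tail, 0, tail.length);
        return out.toByteArray();
    }

    // Returns true if the SAX parser accepts the bytes, false if parsing fails.
    static boolean parses(byte[] xml) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new ByteArrayInputStream(xml), new DefaultHandler());
            return true;
        } catch (IOException | SAXException | ParserConfigurationException e) {
            System.out.println("parse failed: " + e);
            return false;
        }
    }

    public static void main(String[] args) {
        // The same two Greek letters, once as real UTF-8 and once as ISO-8859-7 bytes.
        byte[] utf8 = "\u0395\u03bb".getBytes(StandardCharsets.UTF_8);
        byte[] iso88597 = { (byte) 0xC5, (byte) 0xEB };

        System.out.println("UTF-8 bytes parse: " + parses(document(utf8)));
        System.out.println("ISO-8859-7 bytes parse: " + parses(document(iso88597)));
    }
}
```

The second document is rejected because 0xC5 announces a two-byte UTF-8 sequence and 0xEB is not a valid continuation byte. The exact exception class and wording depend on the parser in use.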
>
>
>
> On Thu, Nov 25, 2010 at 1:11 PM, West, Graeme <graeme.w...@gcu.ac.uk> wrote:
>
>> Hi Dimitris,
>> Did this error occur after making the encoding change?
>>
>> It may be a good idea to stop your servlet container, drop/truncate the
>> tables from the ProAI database, delete the ProAI temporary files directory
>> (by default /tmp/proai ), and then restart your servlet container. This will
>> rebuild the ProAI database completely and ensure that you're not seeing
>> cached errors.
>>
>> Regards,
>>
>> Graeme
>>
>> On 25 Nov 2010, at 11:00, Dimitris Gavrilis wrote:
>>
>> Dear Steve,
>>
>> Thanks for your help. I did change the header (UTF-8) at the top of the
>> file as you suggested, but I still get the same error. The file seems OK when
>> accessed through Fedora
>> (http://localhost:8080/fedora/objects/iid:1/datastreams/mods/content).
>>
>> Below is the error from Fedora's console:
>>
>>
>> proai.error.ServerException: Error parsing record xml
>> at proai.cache.ParsedRecord.<init>(ParsedRecord.java:70)
>> at proai.cache.Worker.attempt(Worker.java:111)
>> at proai.cache.Worker.run(Worker.java:51)
>> Caused by: java.io.UTFDataFormatException: Invalid byte 2 of 2-byte UTF-8 sequence.
>> at org.apache.xerces.impl.io.UTF8Reader.invalidByte(Unknown Source)
>> at org.apache.xerces.impl.io.UTF8Reader.read(Unknown Source)
>> at org.apache.xerces.impl.XMLEntityScanner.load(Unknown Source)
>> at org.apache.xerces.impl.XMLEntityScanner.scanContent(Unknown Source)
>> at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanContent(Unknown Source)
>> at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
>> at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
>> at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
>> at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
>> at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
>> at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
>> at javax.xml.parsers.SAXParser.parse(SAXParser.java:395)
>> at javax.xml.parsers.SAXParser.parse(SAXParser.java:198)
>> at proai.cache.ParsedRecord.<init>(ParsedRecord.java:62)
>> ... 2 more
>>
>>
>>
>> On Thu, Nov 25, 2010 at 12:21 PM, Steve Bayliss <stephen.bayl...@acuityunlimited.net> wrote:
>> Hi Dimitris
>>
>> It would certainly be worthwhile trying Graeme's suggestion, although I
>> suspect that if Fedora didn't determine the correct encoding on ingest then
>> this would cause problems elsewhere. (In any case you should correct this
>> incorrect encoding declaration to UTF-8.)
>>
>> I've taken a look at the proai oaiprovider source, and there is some
>> "unsafe" code in there where the default platform encoding will be used
>> (e.g. FedoraOAIDriver.java line 275).
>>
>> 1) could you provide a full log of the exception (ie the full stack trace)
>> 2) could you try setting the JVM default encoding by using
>> -Dfile.encoding=utf-8 (eg add this to CATALINA_OPTS)
>>
>> Thanks
>> Steve
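
The kind of platform-dependent decoding described above can be sketched roughly as follows. This is not the code from `FedoraOAIDriver.java`, which is not reproduced in this thread; `DecodeDemo` and its methods are hypothetical names used only to contrast decoding with the platform default against decoding with an explicit charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; this is not code from proai or FedoraOAIDriver.java.
class DecodeDemo {

    // Platform-dependent: same behaviour as new String(bytes), so the
    // result changes with the JVM's default charset.
    static String decodeWithDefault(byte[] bytes) {
        return new String(bytes, Charset.defaultCharset());
    }

    // Explicit: always decodes the bytes as UTF-8, whatever the platform default is.
    static String decodeAsUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String greek = "\u0395\u03bb\u03bb\u03b7\u03bd\u03b9\u03ba\u03ac"; // "Ελληνικά"
        byte[] utf8 = greek.getBytes(StandardCharsets.UTF_8);

        System.out.println("explicit UTF-8 round-trips: "
                + greek.equals(decodeAsUtf8(utf8)));
        System.out.println("default charset (" + Charset.defaultCharset()
                + ") round-trips: " + greek.equals(decodeWithDefault(utf8)));
    }
}
```

Setting `-Dfile.encoding=utf-8` makes the two methods agree on JVMs that honour the property for the default charset; naming the charset in code removes the dependence on that setting altogether.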
>>
>> -----Original Message-----
>> From: West, Graeme [mailto:graeme.w...@gcu.ac.uk]
>> Sent: 25 November 2010 09:44
>> To: Support and info exchange list for Fedora users.
>> Subject: Re: [fcrepo-user] Proia multilingual-
>> java.io.UTFDataFormatException
>>
>>
>> Hi Dimitris,
>> I notice that on the first line, your XML declaration states:
>>
>> <?xml version="1.0" encoding="UTF8"?>
>>
>> This should be:
>> <?xml version="1.0" encoding="UTF-8"?>
>>
>> ProAI is probably rejecting the documents because of this 'unknown'
>> encoding.
>>
>> Hope this helps.
>>
>> Regards,
>>
>> Graeme West
>> Digital Repository Developer
>> Information Services
>> Glasgow Caledonian University
>> graeme.w...@gcu.ac.uk
>>
>>
>>
>> On 24 Nov 2010, at 08:31, Dimitris Gavrilis wrote:
>>
>> Hi Steve,
>>
>> I'm attaching an xml sample of a record that produces this error.
>>
>> Thanks,
>> Dimitris.
>>
>> On Wed, Nov 24, 2010 at 9:55 AM, Steve Bayliss <stephen.bayl...@acuityunlimited.net> wrote:
>> Hi Dimitris
>>
>> Do you have an example object FOXML file that could be used to reproduce
>> this?
>>
>> Thanks
>> Steve
>>
>>
>> -----Original Message-----
>> From: Dimitris Gavrilis [mailto:gavri...@gmail.com]
>> Sent: 23 November 2010 15:17
>> To: fedora-commons-users@lists.sourceforge.net
>> Subject: [fcrepo-user] Proia multilingual - java.io.UTFDataFormatException
>>
>> Hi,
>>
>> I've set up Fedora with ProAI, and whenever ProAI tries to parse non-English
>> records (Greek) I get a java.io.UTFDataFormatException. Although I've seen
>> that this problem exists, I haven't managed to find a solution. When I
>> exclude non-English text, ProAI works fine.
>>
>> Thanks in advance,
>> Dimitris.
>>
>>
>> ------------------------------------------------------------------------------
>> Increase Visibility of Your 3D Game App & Earn a Chance To Win $500!
>> Tap into the largest installed PC base & get more eyes on your game by
>> optimizing for Intel(R) Graphics Technology. Get started today with the
>> Intel(R) Software Partner Program. Five $500 cash prizes are up for grabs.
>> http://p.sf.net/sfu/intelisp-dev2dev
>> _______________________________________________
>> Fedora-commons-users mailing list
>> Fedora-commons-users@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/fedora-commons-users
>>
>>
>>
>> Email has been scanned for viruses by Altman Technologies' email management
>> service <http://www.altman.co.uk/emailsystems>
>>
>>
>> <iid_1_mods.xml>
>>
>>
>> Glasgow Caledonian University is a registered Scottish charity, number
>> SC021474
>>
>> Winner: Times Higher Education's Widening Participation Initiative of the
>> Year 2009 and Herald Society's Education Initiative of the Year 2009
>>
>> http://www.gcu.ac.uk/newsevents/news/bycategory/theuniversity/1/name,6219,en.html