Re: [Dspace-tech] utf8 and dspace
On Sat, Mar 08, 2008 at 11:11:02AM +0800, Jayan Chirayath Kurian wrote:
> In a DSpace batch import, the importer stops at special characters (e.g. &).
> This can be resolved by converting & into its equivalent entity, represented
> as &amp;. Is there any other solution rather than changing this manually?

Oh, that. That's not a charset encoding (UTF-8) issue; it's an XML encoding issue. Well-formed XML can't have naked ampersands or left angle brackets; they must be written as character entities (&amp; and &lt;). You'd have the same problem no matter what charset encoding you used.

There *are* charset encoding issues, often when building a batch by cut'n'pasting from Windows editors or office tools. I was advised to add an XML PI to the head of the dublin_core.xml to specify the likely encoding:

    <?xml version='1.0' encoding='windows-1252' ?>

and that took care of all the section signs, em-dashes, accents, and silly smartquotes.

--
Mark H. Wood, Lead System Programmer [EMAIL PROTECTED]
Typically when a software vendor says that a product is intuitive he means the exact opposite.

DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech
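For anyone generating the batch with a script rather than by hand, the two points in the reply above (escape naked ampersands and angle brackets, and declare the charset at the head of the file) can be sketched roughly as follows. This is only an illustration, not part of DSpace: the helper names `dc_value`, `dublin_core` and `with_declaration` are made up for the example, and the `<dcvalue element=... qualifier=...>` layout is assumed from the usual dublin_core.xml batch format.

```python
from xml.sax.saxutils import escape


def dc_value(element, qualifier, text):
    """Build one <dcvalue> line; escape() turns &, < and > into entities.

    Hypothetical helper; element and qualifier are assumed to be plain names.
    """
    return '  <dcvalue element="%s" qualifier="%s">%s</dcvalue>' % (
        element, qualifier, escape(text))


def dublin_core(values, encoding="UTF-8"):
    """Assemble a dublin_core.xml document from (element, qualifier, text) tuples."""
    lines = ['<?xml version="1.0" encoding="%s"?>' % encoding, "<dublin_core>"]
    lines += [dc_value(e, q, t) for e, q, t in values]
    lines.append("</dublin_core>")
    return "\n".join(lines)


def with_declaration(raw, encoding="windows-1252"):
    """Prepend an XML declaration to existing file bytes that lack one.

    The default encoding is a guess for text pasted from Windows tools;
    it must match how the bytes were really saved.
    """
    if raw.startswith(b"<?xml"):
        return raw
    header = '<?xml version="1.0" encoding="%s"?>\n' % encoding
    return header.encode("ascii") + raw


if __name__ == "__main__":
    print(dublin_core([("title", "none", "Research & Development <draft>")]))
```

The escaping only matters for text that is not already escaped; running it twice over the same value would turn `&amp;` into `&amp;amp;`.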
Re: [Dspace-tech] utf8 and dspace
Hello,

On 07.03.2008 at 16:49, LARC/J.L.Shipman/jshipman wrote:
> We are running dspace 1.4.2, postgresql 8, solaris 10. Is UTF8 a necessity
> for dspace? My understanding is that Sun's en_US.ISO8859-1 includes most of
> UTF8 except for the far east languages. Any help is appreciated.

@nasa.gov: Working at an international site?

I don't know whether UTF-8 is strictly required everywhere, but ISO 8859-1 is not even sufficient for European languages, since it is missing the Euro sign, for example (which requires at least ISO 8859-15).

What I do know is that everything has been getting better every day since we set out to be strictly UTF-8 and nothing else. We leapt into live service from a test installation and had old ISO 8859 stuff in our site for over a year. When I switched to a new server in January I made sure that everything is UTF-8 from now on. I mean everything, including file names at the system level. This is the default in a current Debian, BTW.

I still have some crap inside (I found a glitch in the search index last week, and in the email templates too), but everywhere I have got rid of the old stuff I am really happy about flawless functionality and display.

Come on, ISO 8859 is from the last century. Its time is gone. UTF-8 is the way to go, no way to argue. You are running a system made for long term preservation.

Bye, Christian
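The point about the Euro sign is easy to check. Here is a small sketch using only Python's built-in codecs; `can_encode` is just a throwaway helper name for this example.

```python
def can_encode(text, encoding):
    """Return True if every character of text exists in the given charset."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    for enc in ("iso-8859-1", "iso-8859-15", "utf-8"):
        # iso-8859-1 has no Euro sign; iso-8859-15 and utf-8 do
        print(enc, can_encode("\u20ac", enc))
```

The same check can be run over real metadata values before deciding whether a legacy charset is enough for a given collection.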
Re: [Dspace-tech] utf8 and dspace
Let's turn the question around. Is UTF-8 a problem for you? Why? What would you need to make it no longer a problem?

--
Mark H. Wood, Lead System Programmer [EMAIL PROTECTED]