Hi Rafa,

We've run into pretty much the same issue ourselves - we're hosting for institutes across Europe, and so are starting to see requirements for all sorts of characters to be stored. This is whilst using an Oracle 10 database that is set for ISO-8859-1. Currently, we can't do anything about the database codepage, as DSpace is sharing an instance with all of our other sites - BioMed Central, Faculty of 1000, etc. - and that's a lot of data and code that would be impacted by a codepage change!

Our current 'solution' is to convert some of the columns to the national character variants (NVARCHAR / NCLOB). That allows you to define a different codepage for the national character storage from the main codepage of the database.

But this method has some serious drawbacks. You need to extend the DatabaseManager to cope with the additional types returned - which would be fine, if the driver didn't just classify it all under Types.OTHER. That means you have to make assumptions as to what you are dealing with. Really, it's a very nasty hack. I thought about creating a patch to accommodate the national character set that could be submitted for the core - but at the moment I see it as unsupportable. It also only helps for the columns that you actually change to the national variants.

As I said, we have a lot of different sites using our Oracle database, and because the servers are quite old, we're very limited on resources. These servers are due to be upgraded in the next month, and with the additional resources that will provide, I'm currently pushing for us to have a separate database instance (running on the same hardware / license). That will allow us to have a UTF-8 codepage without affecting the instance that is used for our other sites.

Even if you went the national character set route, you would still end up with UTF-8 in the database. If that is unpalatable to you, then you are going to have to insert a character encoding / decoding step wherever Strings get inserted into or retrieved from the database - and it looks like there are a lot of places in the code that you would have to change if you can't find (or create) a proxy JDBC driver to do it.

If you can, I would strongly suggest that you run with a UTF-8 codepage - even if that means creating a separate instance (on the same hardware) so that everything non-DSpace can still operate with an ISO-8859-1 database.
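
To make the codepage limitation concrete, here is a minimal standalone Java sketch. It is not DSpace code and does not touch the database; the class name CodepageCheck and its two methods are made up for illustration. It only assumes the standard java.nio.charset classes, and shows two things: text outside ISO-8859-1 is lost when it is encoded to that codepage, and text written as UTF-8 but read back as ISO-8859-1 comes out corrupted.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustrative only: CodepageCheck and its methods are hypothetical names,
// not part of DSpace or of any JDBC driver.
final class CodepageCheck {

    // True if every character in text can be represented in the given charset.
    static boolean fitsIn(String text, Charset charset) {
        return charset.newEncoder().canEncode(text);
    }

    // Encodes text to bytes in one charset and decodes those bytes in another,
    // mimicking a write/read cycle in which the two sides may disagree.
    static String recode(String text, Charset writtenAs, Charset readAs) {
        return new String(text.getBytes(writtenAs), readAs);
    }

    public static void main(String[] args) {
        Charset latin1 = StandardCharsets.ISO_8859_1;
        Charset utf8 = StandardCharsets.UTF_8;

        // Unicode escapes keep this source file pure ASCII.
        String spanish = "diacr\u00edticos";    // i-acute exists in ISO-8859-1
        String polish = "\u0141\u00f3d\u017a";  // L-stroke and z-acute do not

        System.out.println(fitsIn(spanish, latin1)); // true
        System.out.println(fitsIn(polish, latin1));  // false
        System.out.println(fitsIn(polish, utf8));    // true

        // Unmappable characters are replaced by '?' when encoding to ISO-8859-1.
        System.out.println(
            recode(polish, latin1, latin1).equals("?\u00f3d?")); // true

        // UTF-8 bytes read as ISO-8859-1: one e-acute becomes two characters.
        System.out.println(
            recode("\u00e9", utf8, latin1).equals("\u00c3\u00a9")); // true
    }
}
```

Running fitsIn over the metadata you expect to store is a cheap way to judge whether an ISO-8859-1 database is workable at all for your content, before deciding between national character columns and a separate UTF-8 instance.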

G

----- Original Message -----
From: "Rafa" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Monday, February 12, 2007 8:18 AM
Subject: [Dspace-tech] Can´t see diacritics in WebUI

Hi,

with this configuration:

DSpace 1.4.1
Debian 3.1
Tomcat 5.5
Oracle 10
Oracle JDBC Thin driver

our DSpace doesn't work well with diacritics. As soon as I submit a new item to the appropriate collection, all the diacritics appear corrupted in the WebUI. Looking at the tables in the database, they are correctly recorded. So it seems to me that DSpace is storing the data correctly, but when it shows it in the user interface (download) it makes (or doesn't make) some kind of transformation and then the diacritics are corrupted.

Our database is working with ISO-8859-1, not UTF-8. In fact, if we put the db in UTF-8 it works quite well, but due to administrative considerations, we prefer to make it work with the ISO-8859-1 codepage.

We have already configured Tomcat so that it works with UTF-8, and it doesn't make any difference at all. (The browser automatically works with UTF-8 as well.)

We have the database on a different computer than the DSpace server. Analyzing the traffic between the database, the server and the browser, we have found that the data is encoded as:

Upload:
1- From the browser to the dspace server: UTF-8
2- From the dspace server to the database: ISO-8859

Download:
3- From the database to the dspace server: ISO-8859
4- From the dspace server to the browser: ????

Any idea?

Thanks in advance

Rafael Carreres
Computing Service
University of Alicante - Spain

_______________________________________________
DSpace-tech mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-tech

