Hi Rafa,

We run into pretty much the same issue ourselves - we're hosting for 
institutes across Europe, and so are starting to see requirements for all 
sorts of characters to be stored. This is whilst using an Oracle 10 database 
that is set for ISO-8859-1. Currently, we can't do anything about the 
database codepage, as DSpace is sharing an instance with all of our other 
sites - BioMed Central, Faculty of 1000, etc. - and that's a lot of data and 
code that is impacted by a codepage change!!

Our current 'solution' is to convert some of the columns to the national 
character variants (NVARCHAR / NCLOB). That allows you to define a different 
codepage for the national character storage from the main codepage of the 
database.

But this method has some serious drawbacks. You need to extend the 
DatabaseManager to cope with the additional types returned - which would be 
fine, if the driver didn't just classify it all under Types.OTHER. That 
means you have to make assumptions as to what you are dealing with.
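
A minimal sketch of what that guesswork can look like, for illustration 
only. NationalColumnReader and asText are hypothetical names, not part of 
DSpace or the Oracle driver, and the two assumptions about what the driver 
hands back (a String for NVARCHAR-style columns, a java.sql.Clob for 
NCLOB-style columns) would need checking against the driver in use.

```java
import java.io.IOException;
import java.io.Reader;
import java.sql.Clob;
import java.sql.SQLException;
import java.sql.Types;

// Hypothetical helper, not part of DSpace: turns a value obtained from
// ResultSet.getObject() into text when the driver reports Types.OTHER.
final class NationalColumnReader {

    static String asText(int sqlType, Object value) throws SQLException {
        // SQL NULL stays null whatever the reported type.
        if (value == null) {
            return null;
        }
        if (sqlType != Types.OTHER) {
            throw new IllegalArgumentException("Not Types.OTHER: " + sqlType);
        }
        // Assumption: an NVARCHAR-style column arrives as a plain String.
        if (value instanceof String) {
            return (String) value;
        }
        // Assumption: an NCLOB-style column arrives as a java.sql.Clob.
        if (value instanceof Clob) {
            return readAll((Clob) value);
        }
        // Anything else is a guess this sketch refuses to make.
        throw new SQLException(
                "Unhandled Types.OTHER value: " + value.getClass().getName());
    }

    // Drains the CLOB's character stream into a String.
    private static String readAll(Clob clob) throws SQLException {
        StringBuilder out = new StringBuilder();
        char[] buffer = new char[4096];
        try (Reader reader = clob.getCharacterStream()) {
            int n;
            while ((n = reader.read(buffer)) != -1) {
                out.append(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new SQLException("Could not read CLOB", e);
        }
        return out.toString();
    }
}
```

The point of failing loudly on an unknown class is that Types.OTHER tells 
you nothing, so any value the sketch does not recognise is better rejected 
than silently converted.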

Really, it's a very nasty hack. I thought about creating a patch to 
accommodate the national character set that could be submitted for the core - 
but at the moment I'm seeing it as just being unsupportable. It also only 
works to the extent of how many columns you change to the national variants.

As I said, we have a lot of different sites using our Oracle database, and 
because the servers are quite old, we're very limited on resources. These 
servers are due to be upgraded in the next month, and with the additional 
resources that will provide, I'm currently pushing for us to have a separate 
database instance (running on the same hardware / license). That will allow 
us to have a UTF-8 codepage without affecting the instance that is used for 
our other sites.

Even if you went the national character set route, you still end up with 
UTF-8 in the database. If that is unpalatable to you, then you are going to 
have to insert a character encoding / decoding step wherever Strings are 
inserted into or retrieved from the database - and it looks like there are 
a lot of places in the code that you would have to change if you can't find 
(or create) a proxy JDBC driver to do it.
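
To make the encoding / decoding idea concrete, here is one possible sketch. 
It is an assumption, not something DSpace or any JDBC driver provides: 
Latin1Escaper is a hypothetical class that rewrites any character outside 
ISO-8859-1 as a backslash-u escape before the String goes to the database, 
and reverses that after it is read back. Backslashes are doubled so that 
text which already looks like an escape survives the round trip.

```java
// Hypothetical codec, not part of DSpace. It keeps stored text inside
// ISO-8859-1 by escaping everything else as backslash, 'u', four hex digits.
final class Latin1Escaper {

    // Apply before the String is bound to an INSERT or UPDATE.
    static String encode(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\\') {
                // Double literal backslashes so decode() can tell them apart.
                out.append("\\\\");
            } else if (c <= 0xFF) {
                // ISO-8859-1 covers exactly U+0000 to U+00FF.
                out.append(c);
            } else {
                // One escape per UTF-16 code unit; a surrogate pair
                // therefore becomes two escapes.
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    // Apply after the String is read from a ResultSet. Only meant for
    // text produced by encode(); malformed escapes are not handled.
    static String decode(String stored) {
        StringBuilder out = new StringBuilder(stored.length());
        int i = 0;
        while (i < stored.length()) {
            char c = stored.charAt(i);
            boolean hasNext = i + 1 < stored.length();
            if (c == '\\' && hasNext && stored.charAt(i + 1) == '\\') {
                out.append('\\');
                i += 2;
            } else if (c == '\\' && hasNext && stored.charAt(i + 1) == 'u'
                    && i + 6 <= stored.length()) {
                String hex = stored.substring(i + 2, i + 6);
                out.append((char) Integer.parseInt(hex, 16));
                i += 6;
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }
}
```

Note that escaped text takes more room in the column and no longer sorts or 
searches the way the original would, which is part of why a UTF-8 database 
is the less painful option.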

If you can, I would strongly suggest that you run with a UTF-8 codepage - 
even if that means creating a separate instance (on the same hardware) so 
that everything non-DSpace can still operate with an ISO-8859-1 database.

G

----- Original Message ----- 
From: "Rafa" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Monday, February 12, 2007 8:18 AM
Subject: [Dspace-tech] Can´t see diacritics in WebUI


Hi, with this configuration:

DSpace 1.4.1
Debian 3.1
Tomcat 5.5
Oracle 10
Oracle JDBC Thin driver

our DSpace doesn´t work well with diacritics.

When a new submission is done, just when I submit a new item to the
appropriate collection, all the diacritics appear corrupted in the
WebUI. Looking at the tables in the database, they are correctly
recorded. So, it seems to me that dspace is uploading well the data, but
when it shows it in the user interface (download) it makes (or doesn´t
make) some kind of transformation and then the diacritics are corrupted.

Our database is working with ISO-8859-1, not UTF-8. In fact, if we put
the db in UTF-8 it works quite well, but due to administrative
considerations, we prefer to make it work with the ISO-8859-1 codepage. We
have already configured Tomcat so that it works with UTF-8, and it
doesn´t make any difference at all. (The browser automatically works
with UTF-8 as well)

We have the database in a different computer than the dspace server.
Analyzing the traffic between the database, the server and the browser,
we have found that the data is codified as:

Upload:
1- From the browser to the dspace server: UTF-8
2- From the dspace server to the database: ISO-8859

Download:
3-From the database to the dspace server: ISO-8859
4-From the dspace server to the browser: ????

Any idea? Thanks in advance

Rafael Carreres
Computing Service
University of Alicante - Spain



_______________________________________________
DSpace-tech mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-tech

