On 09.03.2004 23:54, Rafael Alvarado wrote:

Here is my situation. I run an etext server with documents written in
several languages. In creating a search interface for a collection of
Hebrew documents, for example, I want to pull a distinct list of words from
a db and create a set of lists for users to search with. The values have to
be in unicode, since they will be sent back to the database as a query
string. I don't want to have to translate entities back and forth into UTF8
-- I would rather work in UTF8 and forget entities forever.

It seems we have to go into detail to find a possible solution to the problem. I thought it was only a question of convenience when viewing the HTML output source. Are the HTML pages containing the UTF-8 characters displayed correctly? If so, it should be possible to get them back as UTF-8 in Cocoon and store them in the database.


By the way, I had a similar problem with the HTML generator that uses Jtidy
-- is this, too, the fault of Xalan?

Hmm, difficult to say from here. What exactly is the problem? As I said, I thought your problem was only about serializing UTF-8 characters, so maybe ö is escaped to something like &#246;, but in the end (i.e. in the browser) they are the same. But you seem to have something different in mind.
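
To illustrate what "the same in the end" means, here is a minimal sketch in Python (not Cocoon or Xalan code; the helper name decode_refs is made up for this example). It only shows that a character reference and the raw character decode to the identical Unicode string, and that the result can then be encoded as UTF-8 for a query string:

```python
import html


def decode_refs(text: str) -> str:
    """Hypothetical helper: resolve HTML character references to plain Unicode."""
    return html.unescape(text)


# The raw character and its escaped forms yield the same string once decoded.
raw = "\u00f6"                      # o with diaeresis, written as an escape
numeric = decode_refs("&#246;")     # decimal character reference
named = decode_refs("&ouml;")       # named entity
same = raw == numeric == named

# The same holds for Hebrew: four decimal references decode to four letters.
hebrew = decode_refs("&#1513;&#1500;&#1493;&#1501;")

# Once decoded, the value can be sent on as UTF-8 bytes.
utf8_bytes = raw.encode("utf-8")
```

So if the browser displays the page correctly, the difference is only in how the serializer wrote the source, not in the character data itself.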


Joerg
