The patch did not make any difference - it still throws the same exception!

What I meant by converting from UTF-8 to Unicode is that the database
driver can handle Unicode. In the filestore, UTF-8 is converted to the local
character set in order to create the files, and this is (I think) why the
filestore has a problem. If the database could store the data in Unicode,
there would be no problem. Since Java uses Unicode in its strings, the task
would simply be to decode the UTF-8 strings before they are stored in the
database and then make sure that all text fields in the database are Unicode
(or widechar or nchar).
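
To illustrate the codepage point, here is a minimal Java sketch. It is not
Slide's actual filestore code. It assumes ISO-8859-1 as a stand-in for the
local character set (the real one depends on the platform), and the class and
method names are made up for the example. It checks whether a filename can be
represented in that character set at all:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Slide.
class LocalCharsetCheck {
    // Returns true if every character of name can be encoded in charset.
    static boolean representable(String name, Charset charset) {
        return charset.newEncoder().canEncode(name);
    }

    public static void main(String[] args) {
        // Assumption: ISO-8859-1 stands in for the local codepage.
        Charset local = StandardCharsets.ISO_8859_1;
        // U+00F8 is the Danish letter o with stroke.
        System.out.println(representable("\u00F8.txt", local)); // true
        // U+0416 is a Cyrillic letter, which ISO-8859-1 cannot represent.
        System.out.println(representable("\u0416.txt", local)); // false
    }
}
```

With this stand-in, a Danish letter passes the check while a Cyrillic letter
does not, which would match the filestore working for Danish filenames and
failing for characters outside the codepage.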

Please tell me if I am way off here!

/Jacob

-----Original Message-----
From: Oliver Zeigermann [mailto:[EMAIL PROTECTED] 
Sent: 29. januar 2004 10:02
To: Slide Users Mailing List
Subject: Re: TXFileStore and local filesystem

Jacob Lund wrote:
> No, the filestore works correctly.

OK, shall I check in the patch? Did it work for you?

> From what I can see the filestore converts from UTF-8 to the local
> character set before it stores data. This is why UTF-8 works fine for me
> when I upload files with Danish letters in the filename, and also why it
> fails when it stores files with characters not supported by the codepage.
> 
> Windows XP uses Unicode, but in "DOS mode" it will use the old codepage
> types. The only thing that I can imagine is that Java will use this
> codepage when it is doing IO operations towards the filesystem. This might
> be a problem that only appears on Windows systems.
> 
> I do not think that the problem lies in how the data is filled into the
> database. Somewhere in Slide the data (in this case the URI) is converted
> to UTF-8 before it is sent to the client. The data stored in the database
> is UTF-8, and I believe that Java is using Unicode. So the solution might
> be to convert data fetched from the database back to Unicode as soon as it
> arrives in the store class.
> 
> The correct solution might be to convert from UTF-8 to Unicode before
> storing the data and then change the database schema to Unicode char in
> all fields containing strings.

Hmmmm. You might be confusing certain things here. On one side there is
Unicode, which assigns a number to each character. On the other side there
is the representation in bytes. Now, UTF-8 *is* Unicode, but it belongs to
the other side, i.e. it is a representation in bytes. Thus it does not make
much sense to contrast Unicode with UTF-8. Do you agree?
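
A small Java sketch may make the distinction concrete. It is only an
illustration under assumptions, not code from Slide: the class and helper
names are hypothetical, and ISO-8859-1 stands in for whatever single-byte
local character set is involved. It shows the code point of a Danish letter,
its UTF-8 bytes, and what happens when those bytes are misread as single-byte
characters and encoded to UTF-8 a second time:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Slide.
class Utf8Demo {
    // Percent-escapes every byte, e.g. {0xC3, 0xB8} becomes "%C3%B8".
    static String percentEscape(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%%%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Encodes s as UTF-8, misreads the bytes as ISO-8859-1 (assumed stand-in
    // for the local character set), then encodes the result as UTF-8 again.
    static String doubleEncoded(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        String misread = new String(utf8, StandardCharsets.ISO_8859_1);
        return percentEscape(misread.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        String s = "\u00F8"; // Danish letter o with stroke
        // The Unicode side: one character, one number.
        System.out.println(String.format("U+%04X", s.codePointAt(0))); // U+00F8
        // The byte side: the same character as UTF-8.
        System.out.println(percentEscape(s.getBytes(StandardCharsets.UTF_8))); // %C3%B8
        // Encoded twice by mistake.
        System.out.println(doubleEncoded(s)); // %C3%83%C2%B8
    }
}
```

The last value is the same sequence as in the /files/%C3%83%C2%B8 response
quoted further down, which suggests the URI may have been encoded to UTF-8
twice somewhere along the way.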

> I am guessing here since I do not have any idea of how the stores are
> structured in Slide. If you want I would be happy to do some debugging,
> but I will need a short introduction to how the datastores are designed in
> Slide.

I know, proper documentation is a major problem. I will try to prepare 
something like a short introduction and will post it to the list as soon 
as it is done. This may take a while though :(

Oliver

> /Jacob
> 
> -----Original Message-----
> From: Oliver Zeigermann [mailto:[EMAIL PROTECTED] 
> Sent: 28. januar 2004 16:40
> To: Slide Users Mailing List
> Subject: Re: TXFileStore and local filesystem
> 
> Jacob Lund wrote:
> 
> 
>>Sorry about that - yes I am talking about the URI!
>>
>>If I look in a record in the database, each Danish character is stored as
>>two "funny looking" characters corresponding to the unescaped UTF-8
>>encoded version - so this looks correct! However when I do a propfind on
>>the collection in which I placed this file, then I get something like this
>>/files/%C3%83%C2%B8 - and this should have been representing one Danish
>>character. If I take the above and convert from UTF-8 to my local
>>character set, then I get what is stored in the database - if I then
>>convert from UTF-8 to local again then I get the correct Danish letter.
> 
> 
> I could not find anything that might have converted the URI strings. 
> They are just plainly filled into the SQL like in
> 
> 
>>                        "select 1 from OBJECT o, URI u where o.URI_ID=u.URI_ID and u.URI_STRING=?");
>>                statement.setString(1, uri.toString());
> 
> 
> So, maybe this is a more general problem...
> 
> 
>>It seems that Slide converts the URIs from the DB to UTF-8, but they are
>>already stored in unescaped UTF-8!
> 
> 
> Does this happen with the file store as well?
> 
> Oliver
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
