Hi all, Robert has asked me to start a discussion on this subject.

The problem: Windows does not support the UTF-8 codepage.
OpenSceneGraph uses narrow-character strings for filenames, so some
filenames cannot be stored at all. This problem became apparent to
me when an artist used the filename "image-1.tga". Spot the problem? The
dash is a dash, not a minus, and therefore not in the default codepage.

The solution prescribed by Microsoft is to use wide-character strings.
The problem with that in OSG is that the same structures are used to
store ordinary strings (e.g. GLSL shaders, which must be ASCII) and
filename strings.

My solution:
- Add an OSG_USE_UTF8_FILENAMES option to CMake.
- Let the user store all filenames in UTF-8, using
osg::ConvertUTF16toUTF8() to convert from wide strings.
- Where a file is loaded, if OSG_USE_UTF8_FILENAMES is true, convert
the filename to UTF-16 using osg::ConvertUTF8toUTF16() and open the
file with a wide-character function, e.g. _wfopen instead of fopen.
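
To make the last step concrete, here is a rough sketch of the idea. It is
not the submitted patch: utf8ToUtf16() and openUtf8File() are hypothetical
names made up for illustration, the converter assumes well-formed UTF-8
and does no validation, and I am assuming the CMake option reaches the
code as a preprocessor define of the same name.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <cwchar>
#include <string>

// Hypothetical helper: decode UTF-8 into UTF-16 code units.
// Assumes well-formed input; malformed sequences are not detected.
std::u16string utf8ToUtf16(const std::string& in)
{
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size())
    {
        unsigned char c = static_cast<unsigned char>(in[i]);
        char32_t cp;
        std::size_t extra; // number of continuation bytes that follow
        if (c < 0x80)      { cp = c;        extra = 0; }
        else if (c < 0xE0) { cp = c & 0x1F; extra = 1; }
        else if (c < 0xF0) { cp = c & 0x0F; extra = 2; }
        else               { cp = c & 0x07; extra = 3; }
        ++i;
        for (; extra > 0 && i < in.size(); --extra, ++i)
            cp = (cp << 6) | (static_cast<unsigned char>(in[i]) & 0x3F);

        if (cp < 0x10000)
        {
            out.push_back(static_cast<char16_t>(cp));
        }
        else
        {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}

// Hypothetical wrapper: the one place that decides between fopen and _wfopen.
std::FILE* openUtf8File(const std::string& utf8Name, const char* mode)
{
#if defined(_WIN32) && defined(OSG_USE_UTF8_FILENAMES)
    std::u16string wide = utf8ToUtf16(utf8Name);
    std::wstring wname(wide.begin(), wide.end()); // wchar_t is 16 bits on Windows
    std::wstring wmode(mode, mode + std::strlen(mode));
    return _wfopen(wname.c_str(), wmode.c_str());
#else
    // Elsewhere the narrow string is passed through unchanged.
    return std::fopen(utf8Name.c_str(), mode);
#endif
}
```

A caller would then write openUtf8File(filename, "rb") wherever it
currently writes fopen(filename.c_str(), "rb").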

I have implemented and submitted this solution. However, Robert is not
keen on using this solution as it is not the most readable/maintainable.
I agree that it would be very easy for someone to be unaware of it and
put in an fopen() to subtly break it.
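
The same concern applies to the C++ streams. As a sketch only, one
hypothetical helper could funnel every ifstream open through a single
function. openUtf8Stream() and widenUtf8() are names invented here; the
Windows branch relies on Visual C++'s non-standard wide-filename overload
of open() and, for brevity, on the Win32 MultiByteToWideChar call rather
than OSG's own conversion functions. The define is again an assumption.

```cpp
#include <fstream>
#include <string>

#if defined(_MSC_VER) && defined(OSG_USE_UTF8_FILENAMES)
#include <windows.h>

// Hypothetical helper: UTF-8 to UTF-16 via the Win32 API.
static std::wstring widenUtf8(const std::string& utf8)
{
    if (utf8.empty()) return std::wstring();
    // First call asks for the required length, terminator included.
    int len = MultiByteToWideChar(CP_UTF8, 0, utf8.c_str(), -1, NULL, 0);
    if (len <= 0) return std::wstring();
    std::wstring wide(len, L'\0');
    MultiByteToWideChar(CP_UTF8, 0, utf8.c_str(), -1, &wide[0], len);
    wide.resize(len - 1); // drop the terminator written by the API
    return wide;
}
#endif

// Hypothetical single entry point for opening input streams.
void openUtf8Stream(std::ifstream& stream, const std::string& utf8Name,
                    std::ios_base::openmode mode = std::ios_base::in)
{
#if defined(_MSC_VER) && defined(OSG_USE_UTF8_FILENAMES)
    // Visual C++ extension: open() accepts a wide-character filename.
    stream.open(widenUtf8(utf8Name).c_str(), mode);
#else
    stream.open(utf8Name.c_str(), mode);
#endif
}
```

This does not stop anyone from calling fopen() or ifstream::open()
directly, but it does give a reviewer one function name to look for.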

So, suggestions and general discussion please...




-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of
Robert Osfield
Sent: 07 October 2008 13:24
To: OpenSceneGraph Submissions
Subject: Re: [osg-submissions] Unicode support for Windows

Hi Michael,

I have just done a review of all the changes, and am rather
overwhelmed by the extent of the changes.  While I'm impressed at how
thorough you've been - it's clearly taken quite a bit of work - I am
rather concerned about the nature of the changes and in particular the
ongoing readability and maintainability of them.  For this reason I
won't be merging this submission as is.

I do understand the need to be able to cope with this issue, so would
suggest a discussion about the different ways we could tackle it.  Since
the way we use ifstream, ostream and fopen will all have to be altered
in some way, I believe that this type of discussion should be done on
osg-users.  Hopefully we'll be able to come up with an approach that is
straightforward to implement and maintain, and we can then go do a
purge on the code base to move it across to using the new scheme.

Could you introduce the topic on osg-users?

Robert.

On Mon, Oct 6, 2008 at 2:28 PM, Michael Platings
<[EMAIL PROTECTED]> wrote:
> Hi Robert,
> Windows doesn't support the UTF-8 code-page. Therefore any filename
> containing a character outside the current code-page cannot be loaded
> by OpenSceneGraph.
> To fix this I have added an OSG_USE_UTF8_FILENAME advanced option for
> CMake (off by default). When enabled, all file access functions use
> the wide-character alternative (e.g. _wfopen instead of fopen). The
> filename argument is assumed to be a UTF-8 string which is converted
> to a UTF-16 string.
>
> Thanks,
> Michael Platings
>
> BTW, here's a good discussion of the problem previously posted by Ben 
> Discoe. His statement that C++ stream IO doesn't support wide 
> filenames seems to be true in the standard, however Visual C++ does 
> allow wide-string filenames.
>
> OSG on Windows passes your strings directly to fopen, or the C++
> stream equivalent. File paths are assumed to be in the local OS's
> filesystem character set. This means that e.g. Chinese filenames can
> be opened on computers with a Chinese version of Windows, and Western
> filenames can be opened on computers with a Western version of
> Windows. So OSG already does exactly what you describe below.
>
> However, it is true that if a Chinese user sent you a file with a
> Chinese filename, you could not open it with OSG on your non-Chinese
> OS. To do that, OSG would indeed have to add Unicode filename support.
> Unicode means that any file can be opened on any machine.
>
> To support Unicode with the C standard lib on Windows, it is quite
> easy to replace usage of fopen with _wfopen. However, there are many
> places in OSG's code base where the C++ stream IO is used instead of
> fopen. AFAIK, there is no _w version of those methods, so OSG is
> stuck.
>
> Just as a note, this whole thing is delightfully a non-issue on Mac OS
> X, and (some?) flavors of Linux, in which UTF-8 is the filesystem
> charset, so plain old fopen() handles everything.
>
> -Ben
>
>> ----------
>> From: Reed McKenna
>> Sent: Tuesday, April 22, 2008 4:00 PM
>>
>> We build an application for Windows XP using OpenSceneGraph. We have
>> more and more users from Asian countries who want to read in files
>> that have names with Asian characters. Windows' NTFS file system
>> stores file names in Unicode. How can I have osgDB::readNodeFile (and
>> writeNodeFile, etc.) read from and write to these files, using the
>> full NTFS Unicode file name?
>>
>> If it is not currently possible, are there any plans in the works to
>> make it possible?
>>
>> Reed
>
> _______________________________________________
> osg-users mailing list
> [email protected]
> http://lists.openscenegraph.org/listinfo.cgi/osg-users-openscenegraph.org
> ______________________________________________________________________
> This email and any files transmitted with it are confidential and 
> intended solely for the use of the individual or entity to whom they 
> are addressed. If you have received this email in error please notify 
> the system manager.
>
> This email has been scanned by the MessageLabs Email Security System.
> For more information please visit http://www.messagelabs.com/email 
> ______________________________________________________________________
>
> _______________________________________________
> osg-submissions mailing list
> [EMAIL PROTECTED]
> http://lists.openscenegraph.org/listinfo.cgi/osg-submissions-openscenegraph.org