Even,

I think I've figured this out. In my particular example the filename contained the character 'é' (U+00E9), which in UTF-8 is the two-byte sequence 0xC3 0xA9. However, this character is also in the ANSI character set (Windows-1252 on a UK English system, where it is the single byte 233 decimal, 0xE9), which explains why passing a "normal" ANSI-encoded C string to GDALOpen opens the file. If we instead try a filename with a character that is not in the ANSI character set, for example 'ə' (U+0259), the function fails even with a normal C string, because that character has no ANSI representation at all.
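To make the byte-level difference concrete, here is a minimal C++17 sketch of the conversion involved. The helper name utf8_to_ansi_subset is hypothetical and not part of GDAL, and it deliberately covers only ASCII plus U+00A0..U+00FF, the range where Windows-1252 byte values equal the Unicode code points. It ignores the 0x80..0x9F block of Windows-1252, so treat it as an illustration under those assumptions rather than a complete transcoder.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical helper (not a GDAL function): transcode UTF-8 to the
// Latin-1 subset of Windows-1252. Returns std::nullopt when the input is
// malformed or contains a character outside that subset.
std::optional<std::string> utf8_to_ansi_subset(const std::string& utf8) {
    std::string out;
    for (std::size_t i = 0; i < utf8.size();) {
        const unsigned char b0 = static_cast<unsigned char>(utf8[i]);
        if (b0 < 0x80) {
            // ASCII: the same single byte in both encodings.
            out.push_back(static_cast<char>(b0));
            i += 1;
        } else if ((b0 & 0xE0) == 0xC0) {
            // Two-byte UTF-8 sequence, covering U+0080..U+07FF.
            if (i + 1 >= utf8.size()) return std::nullopt;
            const unsigned char b1 = static_cast<unsigned char>(utf8[i + 1]);
            if ((b1 & 0xC0) != 0x80) return std::nullopt;
            const std::uint32_t cp = ((b0 & 0x1Fu) << 6) | (b1 & 0x3Fu);
            // Only U+00A0..U+00FF are handled by this sketch.
            if (cp < 0xA0 || cp > 0xFF) return std::nullopt;
            out.push_back(static_cast<char>(cp));
            i += 2;
        } else {
            // Stray continuation bytes are malformed, and three- or
            // four-byte sequences encode code points above U+07FF.
            return std::nullopt;
        }
    }
    return out;
}
```

With this sketch, the UTF-8 bytes 0xC3 0xA9 for 'é' convert to the single byte 0xE9, while the bytes 0xC9 0x99 for 'ə' yield no result, which matches the behaviour I see with the two filenames.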
So the current C/C++ API on Win32 does not support UTF-8 encoded filenames. Are you aware of any workarounds that may be available?

Best Regards,
Louis.

On Mon, Aug 31, 2009 at 7:19 PM, Even Rouault <[email protected]> wrote:

> Louis, Chaitanya,
>
> I just wanted to mention that the encoding of filenames handled by GDAL or OGR is a known issue that has not been addressed yet. You can read http://trac.osgeo.org/gdal/wiki/rfc5_unicode which was a proposal but has not been implemented. Some infrastructure for re-encoding has been introduced during the implementation of http://trac.osgeo.org/gdal/wiki/rfc23_ogr_unicode (but RFC23 only addresses the issue of encoding in OGR field values, not for filenames).
>
> My understanding is that:
> * on Windows the current API used by GDAL/OGR does not expect UTF-8 or Unicode but ANSI.
> * on Linux systems, UTF-8 is now assumed.
>
> Best regards,
>
> Even
>
> Quoting Lodewijk Pool <[email protected]>:
>
> > Hi Chaitanya,
> >
> > I appreciate you taking the time to check. The TAB extension is MapInfo's vector file format. The odd thing is that I did exactly the same test as you did, I renamed a GeoTiff file to the offending filename and tried the normal Raster Driver and got the same problem. Still, as far as you are aware these functions should support UTF-8 encoded strings? There could possibly be a peculiarity in the way I pack UTF-8 strings, though I am reasonably certain that they are encoded correctly.
> >
> > Could you perhaps send me the code snippet you used to test the functionality (the part where you pass the string to GDALOpen)? Do you think there is a chance that my compiled version may differ from your own, i.e. is it possible that I compiled a version of GDAL without UTF support?
> >
> > Best Regards,
> > Louis.
> >
> > On Mon, Aug 31, 2009 at 6:35 PM, Chaitanya kumar CH <[email protected]> wrote:
> >
> > > Louis,
> > >
> > > I couldn't reproduce the problem on my WinXP-32 system with VC8 with locale set to UK English. However, I used the filename on a GeoTiff file. I couldn't identify the .TAB extension. I am not sure that is a problem.
> > >
> > > Some of the drivers may not handle non-ASCII data but file names should not be a problem.
> > >
> > > If you don't find any problem at your application side, submit a bug report at http://trac.osgeo.org/gdal/
> > >
> > > On Mon, Aug 31, 2009 at 8:02 PM, Lodewijk Pool <[email protected]> wrote:
> > >
> > >> Hi Chaitanya,
> > >>
> > >> Yes, this is using the C/C++ API, the functions I am using are declared in *gdal.h* and *ogrsf_frmts.h* respectively. I am using WinXP 32bit (UK English locale) and a version of GDAL 1.6.2 that I compiled for Win32 using the supplied nmake script files for VC8. The specific filename that is causing me problems is this one: *"découpage_geographique.TAB"*. If I remove the 'é' character in that string and replace it with a normal 'e' the file opens without any problems.
> > >>
> > >> Any help would be appreciated.
> > >>
> > >> Best Regards,
> > >> Louis.
> > >>
> > >> On Mon, Aug 31, 2009 at 4:10 PM, Chaitanya kumar CH <chaitanya.ch@gmail.com> wrote:
> > >>
> > >>> Louis,
> > >>>
> > >>> GDAL/OGR usually supports UTF-8 encoding. I just don't know where it doesn't support it.
> > >>> Can you provide the details of the OS you are working on? Also, some sample file names that caused you problems will come in handy.
> > >>> I presume you are working in C/C++.
> > >>>
> > >>> On Mon, Aug 31, 2009 at 6:37 PM, Lodewijk Pool <[email protected]> wrote:
> > >>>
> > >>>> Hi All,
> > >>>>
> > >>>> I'm having problems opening Raster and Vector Datasources that have filenames and paths with special characters. I'm using GDALOpen for Raster sources and OGRSFDriverRegistrar::Open() for Vector sources, the strings I pass for the filenames are UTF-8 encoded. Does anyone know whether these functions support UTF-8 encoding, and if not, whether there are any other API entry points that do support UTF-8 and/or UTF-16?
> > >>>>
> > >>>> Thank you in advance,
> > >>>> Louis.
> > >>>>
> > >>>> _______________________________________________
> > >>>> gdal-dev mailing list
> > >>>> [email protected]
> > >>>> http://lists.osgeo.org/mailman/listinfo/gdal-dev
> > >>>
> > >>> Best regards,
> > >>> --
> > >>> Chaitanya kumar CH.
> > >
> > > Best regards,
> > > --
> > > Chaitanya kumar CH.
_______________________________________________
gdal-dev mailing list
[email protected]
http://lists.osgeo.org/mailman/listinfo/gdal-dev
