On Monday 19 January 2015 12:46:20, MITANCHEY Richard - CEREMA/DTecTV/ESI/GNSI wrote:
> Hi,
> I need to get WindowsLatin1 encoded Strings (mapinfo .tab files), and I
> cannot really convert original data to UTF-8 before...
> I'm using OGR (GDAL Java binding) with GetFieldAsString() but string
> lengths (and chars within) are most of the times incorrect
> Is there any way to specify read and write string encodings ?
> Should it be a pb of GDAL Java binding ?

Richard,

This is a problem in the TAB driver, which should recode strings to UTF-8 internally, since UTF-8 is the conventional encoding adopted in OGR. It is also a problem in the Java bindings, which should offer a binary interface for this case, since GetFieldAsString() can only be used to convert native UTF-8 strings into Java Unicode strings. Both issues could potentially be fixed.

A potential workaround is to convert the .tab into a .shp by passing --config SHAPE_ENCODING "" to ogr2ogr, so that the Latin1 strings are written directly, unmodified. Then read the shapefile, in which case the strings will be recoded from Latin1 to UTF-8, and you can use GetFieldAsString().

Even

> TIA for your answers
> Richard
> _______________________________________________
> gdal-dev mailing list
> [email protected]
> http://lists.osgeo.org/mailman/listinfo/gdal-dev

-- 
Spatialys - Geospatial professional services
http://www.spatialys.com
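
To illustrate why the string lengths and characters come out wrong, here is a minimal sketch in plain Java. It makes no GDAL calls: the class `EncodingDemo` and its `decode` helper are hypothetical names used only for illustration, and ISO-8859-1 stands in for the WindowsLatin1 charset of the .tab file. It shows what happens when Latin1 bytes are interpreted as UTF-8, which is presumably what occurs when un-recoded field values reach GetFieldAsString().

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {

    // Hypothetical helper: interpret raw field bytes using the given charset.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // "café" written with a Unicode escape so the source file encoding does not matter.
        String original = "caf\u00e9";

        // Bytes as they would be stored in a Latin1-encoded attribute (assumption).
        byte[] raw = original.getBytes(StandardCharsets.ISO_8859_1);

        // Correct: decode with the encoding the data was actually written in.
        String right = decode(raw, StandardCharsets.ISO_8859_1);

        // Incorrect: treat the same bytes as UTF-8. The byte 0xE9 is not a valid
        // UTF-8 sequence on its own, so the decoder substitutes replacement characters.
        String wrong = decode(raw, StandardCharsets.UTF_8);

        System.out.println("round trip ok: " + right.equals(original));
        System.out.println("misdecoded equals original: " + wrong.equals(original));
    }
}
```

If the bindings exposed the raw bytes of a field, decoding them with the right charset as in `decode(raw, StandardCharsets.ISO_8859_1)` would be enough. Until then, the ogr2ogr workaround above avoids the problem by making sure the strings are already UTF-8 by the time they reach Java.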
