Even,
Thank you for your excellent suggestion (as always).
I'll try to implement it in my Talend TOS project (a bit
complicated, but it may work).
Richard
On 19/01/2015 13:17, Even Rouault (via Internet) wrote:
On Monday 19 January 2015 12:46:20, MITANCHEY Richard - CEREMA/DTecTV/ESI/GNSI
wrote:
Hi,
I need to read WindowsLatin1-encoded strings (MapInfo .tab files), and I
cannot really convert the original data to UTF-8 beforehand.
I'm using OGR (GDAL Java bindings) with GetFieldAsString(), but the string
lengths (and the characters within) are incorrect most of the time.
Is there any way to specify the read and write string encodings?
Could this be a problem with the GDAL Java bindings?
Richard,
This is a problem in the TAB driver, which should recode strings to UTF-8
internally, as this is the conventional encoding decided on in OGR.
It is also a problem in the Java bindings, which should offer a binary interface
in that case, since GetFieldAsString() can only be used to convert native
UTF-8 strings into Java Unicode strings.
Both issues could potentially be fixed.
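To illustrate the decoding problem described above, here is a minimal Java
sketch that uses only the standard library (no GDAL calls). The class name
Latin1Demo and the sample bytes are made up for illustration, and it assumes
the JDK provides the windows-1252 charset. It only shows that Latin1 bytes
interpreted as UTF-8 do not give back the original text; it does not
reproduce what the bindings do internally.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class Latin1Demo {
    // Decode raw bytes with an explicitly chosen charset.
    static String decode(byte[] raw, Charset cs) {
        return new String(raw, cs);
    }

    public static void main(String[] args) {
        // "café" as stored in a Windows Latin1 file: é is the single byte 0xE9.
        byte[] raw = {'c', 'a', 'f', (byte) 0xE9};
        String expected = "caf\u00e9";

        // Treating the bytes as UTF-8: 0xE9 is not valid UTF-8 on its own,
        // so the decoded text no longer matches the original.
        String asUtf8 = decode(raw, StandardCharsets.UTF_8);

        // Decoding with the encoding the data was actually written in.
        String asLatin1 = decode(raw, Charset.forName("windows-1252"));

        System.out.println("decoded as UTF-8 matches:        " + asUtf8.equals(expected));
        System.out.println("decoded as windows-1252 matches: " + asLatin1.equals(expected));
    }
}
```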
A potential workaround is to convert the .tab into a .shp using --config
SHAPE_ENCODING "" in ogr2ogr, so that the Latin1 strings are written
unmodified. Then read the shapefile, in which case the strings will be recoded
from Latin1 to UTF-8, and you can use GetFieldAsString().
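For a Java-based workflow, the conversion step could be driven along these
lines. This is only a sketch: the class TabToShp, its helper methods and the
file names input.tab and output.shp are hypothetical, and it assumes ogr2ogr
is on the PATH. How an empty argument is passed through to the child process
may vary by platform, so check the resulting shapefile.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

class TabToShp {
    // Builds: ogr2ogr --config SHAPE_ENCODING "" <dst.shp> <src.tab>
    // The empty string is the value given to SHAPE_ENCODING.
    static List<String> command(String srcTab, String dstShp) {
        return Arrays.asList("ogr2ogr", "--config", "SHAPE_ENCODING", "", dstShp, srcTab);
    }

    // Runs the command and returns the exit code of ogr2ogr.
    static int run(List<String> cmd) throws IOException, InterruptedException {
        return new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }

    public static void main(String[] args) {
        // Only prints the command; call run(...) to actually execute it.
        List<String> cmd = command("input.tab", "output.shp");
        System.out.println(String.join(" ", cmd));
    }
}
```

The resulting shapefile would then be opened with the OGR Java bindings as
usual and the fields read with GetFieldAsString().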
Even
TIA for your answers
Richard
_______________________________________________
gdal-dev mailing list
[email protected]
http://lists.osgeo.org/mailman/listinfo/gdal-dev