Hello, list.

Recently, I noticed that ArcGIS software (at least since version 10.3) can produce shapefiles where the DBF file is encoded with UTF-16.
https://desktop.arcgis.com/en/arcmap/latest/extensions/production-mapping/converting-a-geodatabase-to-shapefiles.htm
But they have made it difficult to do so, since you need the "Production Mapping" license. Without that, produced shapefiles will by default be in UTF-8; one can use some other code page by modifying a system registry setting, dbfDefault, but there doesn't seem to be any setting that will produce UTF-16.

I have never encountered a shapefile in UTF-16, but I am beginning to wonder if we ought to support them. I guess they would be more space-efficient for languages like Chinese and Japanese, where most characters need three bytes in UTF-8 but only two in UTF-16. This could be important since DBF reserves only 10 bytes for field names.

Some questions:

Can the OGR Shape driver handle UTF-16? More generally, are there many GIS systems that can handle UTF-16 in shapefiles? Or perhaps I should just ask: has anyone ever seen a shapefile in UTF-16?

If so, would the content of the CPG file always be UTF-16LE, always UTF-16BE, or just UTF-16? I suppose the only things encoded in UTF-16 would be the field values of type String, plus the field names?

(I also wonder whether shapefiles in UTF-16 are a good idea, or if the GIS community just ought to forget about them, but I guess there is no definite answer to that!)

Kind regards,
Mikael Rittri
Carmenta Geospatial Technologies
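
P.S. To make the byte counts above concrete, here is a small Python sketch. It only uses Python's built-in codecs and does not touch OGR or any real shapefile. The sample field name and the helper names encoded_size and chars_that_fit are made up for illustration, and I am assuming that a field name must fit in 10 bytes as described above. The last few lines show why the LE/BE question matters: Python's plain "utf-16" codec prepends a byte order mark, while "utf-16-le" and "utf-16-be" do not.

```python
import codecs

FIELD_NAME_LIMIT = 10  # assumed usable bytes for a DBF field name


def encoded_size(text, encoding):
    """Return the number of bytes text occupies in the given encoding."""
    return len(text.encode(encoding))


def chars_that_fit(text, encoding, limit=FIELD_NAME_LIMIT):
    """Count the leading characters of text that fit within limit bytes."""
    used = 0
    count = 0
    for ch in text:
        size = len(ch.encode(encoding))
        if used + size > limit:
            break
        used += size
        count += 1
    return count


# Hypothetical field name: 8 Japanese characters, all in the Basic Multilingual Plane.
sample = "都道府県名称一覧"

for enc in ("utf-8", "utf-16-le"):
    print(enc, encoded_size(sample, enc), "bytes,",
          chars_that_fit(sample, enc), "characters fit in the limit")
# utf-8 needs 24 bytes and fits 3 characters; utf-16-le needs 16 bytes and fits 5.

# Byte order: the same character gives swapped byte pairs in LE and BE.
print("都".encode("utf-16-le"))  # b'\xfd\x90'
print("都".encode("utf-16-be"))  # b'\x90\xfd'

# Plain "utf-16" adds a 2-byte BOM in the platform's native byte order.
with_bom = "都".encode("utf-16")
print(with_bom[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE))
```

So within the assumed limit, UTF-16 would allow five such characters in a field name where UTF-8 allows three. Whether a writer would put a BOM into each DBF field is exactly the kind of thing I do not know.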
_______________________________________________
gdal-dev mailing list
https://lists.osgeo.org/mailman/listinfo/gdal-dev
