Jukka Rahkonen <jukka.rahkonen <at> mmmtike.fi> writes:
>
> Jukka Rahkonen <jukka.rahkonen <at> mmmtike.fi> writes:
>
> > I suspect that the reason for the trouble is that this field is a
> > 17 character wide VARCHAR2 and I have in the data values like
> > "ÖVRE SÖDERGÄRDAN ". Database is using UTF-8 and Ö, Ö and Ä are
> > taking more than one byte each. Perhaps OCI driver developers come
> > from some ASCII country and did not bother to think about Oracle's
> > character and byte semantics thoroughly. It seems somehow fuzzy for
> > me even after reading this article
> > http://myorastuff.blogspot.fi/2009/02/character-and-byte-semantics-in-oracle.html
>
> I can repeat the error with a minimal one-row test table having a field
> NAME VARCHAR2(6)
> and value ÄäÖöÅå
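The byte counts behind the quoted test case can be checked with a short Python sketch. It is only an illustration of character versus byte length and does not touch Oracle or the OCI driver; the variable names are made up for the example.

```python
# Test value from the quoted message, written with escapes so that the
# characters are guaranteed to be the precomposed forms.
value = "\u00c4\u00e4\u00d6\u00f6\u00c5\u00e5"  # ÄäÖöÅå

chars = len(value)                                # number of characters
utf8_bytes = len(value.encode("utf-8"))           # bytes when stored as UTF-8
latin9_bytes = len(value.encode("iso-8859-15"))   # bytes when stored as ISO-8859-15

print(chars, utf8_bytes, latin9_bytes)
```

The value is six characters long but needs twelve bytes in UTF-8 and six bytes in ISO-8859-15, so whether it fits into a six-byte-wide field depends on which encoding the client and the database agree on.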
I was following a wrong track, and I think I need to add one more mail to this thread in case someone happens to read it later.

I had discovered earlier by trial and error that I could write the "äöåÄÖÅ" characters into this Oracle database correctly with ogr2ogr by setting the Windows environment variable as

SET NLS_LANG=finnish_finland.utf-8

That made me think that it was the correct setting and that it would also be good for reading data with ogr2ogr. That was not the case. The correct NLS_LANG for the database is really "finnish_finland.ISO8859-P15". With the environment set to that, ogr2ogr reads all the data from Oracle.

However, I must do the character encoding conversion as a separate step because ogr2ogr cannot handle it. This means that I am not totally happy, because I have not yet found any way to make ogr2ogr write the non-ASCII characters correctly into, for example, Spatialite. Direct writing leads to garbage in the Spatialite db.

A somewhat usable workaround is to write a temporary GML file first by using --config OGR_FORCE_ASCII=NO. By doing this the resulting GML file will be in ISO-8859-15. However, ogr2ogr always writes on the first line of GML files that the character encoding is UTF-8. That must be corrected by hand to make a valid GML file, which can then be converted correctly into Spatialite. Not a big deal really, but some of the resulting GML files are 2-3 GB in size, and merely opening and saving the file takes some time.

While wasting time on this I have been thinking about two options:
- For writing out GML, the user could have an option to set the character encoding manually
- For reading GML, the user could give an option to treat the encoding as something other than what the file itself advertises.

However, I think that the real solution would be to have some common OGR-wide way to handle character encodings and conversions instead of different implementations for each driver.
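For the large temporary GML files mentioned above, one alternative to opening and saving the file in an editor is to convert the bytes in a streaming fashion, so that the UTF-8 declaration on the first line becomes true. The following is a minimal Python sketch under the assumption that the file content really is ISO-8859-15; the function name transcode_file and the file names in the usage note are hypothetical, and this is not an ogr2ogr feature.

```python
def transcode_file(src_path, dst_path,
                   src_encoding="iso-8859-15", dst_encoding="utf-8",
                   chunk_chars=1024 * 1024):
    """Copy src_path to dst_path, converting the character encoding.

    The file is processed in chunks, so memory use stays small even
    for files of several gigabytes. newline="" keeps the original line
    endings untouched.
    """
    with open(src_path, "r", encoding=src_encoding, newline="") as src, \
         open(dst_path, "w", encoding=dst_encoding, newline="") as dst:
        while True:
            chunk = src.read(chunk_chars)  # reads up to chunk_chars characters
            if not chunk:
                break
            dst.write(chunk)
```

It could be called, for example, as transcode_file("temp.gml", "temp_utf8.gml"), after which the converted file would be the one to feed into the Spatialite conversion. The sketch does not look at the XML declaration at all; it relies on the declaration already saying UTF-8.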
-Jukka Rahkonen-