New branch for charset encoding issues.

John Darrington Thu, 26 Mar 2009 22:36:59 -0700

I've started a new branch for fixing character set encoding issues.


So far, it reads record 7, subtype 20 of a system file to find out the
ostensible encoding of a dataset.  It stores this encoding name in the
dictionary.  The global "PSPP" encoding is no more.
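
To illustrate the idea (this is not the code on the branch, which is C),
here is a minimal Python sketch of pulling the encoding name out of a 7(20)
record.  It assumes the usual extension record layout of four 32-bit
integers (record type, subtype, element size, element count) followed by
the data bytes.  The function name and the byte_order parameter are made up
for the example.

```python
import struct


def read_encoding_record(buf, offset=0, byte_order="<"):
    """Return the encoding name stored in a 7(20) record, or None.

    Hypothetical helper: `buf` holds raw system file bytes and `offset`
    points at the start of an extension record.
    """
    rec_type, subtype, size, count = struct.unpack_from(
        byte_order + "4i", buf, offset)
    if rec_type != 7 or subtype != 20:
        return None  # some other record; the caller should skip it
    if size != 1:
        raise ValueError("7(20) is expected to hold 1-byte elements")
    start = offset + 16  # the four header integers take 16 bytes
    raw = buf[start:start + count]
    if len(raw) != count:
        raise ValueError("truncated 7(20) record")
    # Encoding names are plain ASCII labels such as "EUC-KR".
    return raw.decode("ascii").strip()
```

A caller would walk the extension records and put the first non-None result
in the dictionary.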

Things to do before this branch is merged:

* Saving files should write record 7(20).
* More intelligent fallback if 7(20) isn't found.
* Update developers guide.
* Check what happens when merging (e.g. with MATCH, ADD, UPDATE)
  data files with different encodings.
* Add some way to override the encoding manually.
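
As a rough sketch of how the writing, fallback and override items might fit
together (again in Python, and only one possible design, not what the branch
does): the writer emits the same four-integer header followed by the name,
and the reader prefers a manual override, then the recorded name, then a
fallback.  Both function names are hypothetical, and using the locale's
preferred encoding as the last resort is an assumption made for the example.

```python
import locale
import struct


def build_encoding_record(name, byte_order="<"):
    """Serialize `name` as a 7(20) record (hypothetical helper)."""
    data = name.encode("ascii")
    # Header: record type 7, subtype 20, element size 1, element count.
    return struct.pack(byte_order + "4i", 7, 20, 1, len(data)) + data


def choose_encoding(override=None, recorded=None, fallback=None):
    """Pick an encoding: manual override, then 7(20), then a fallback."""
    for candidate in (override, recorded, fallback):
        if candidate:
            return candidate
    # Assumption: with nothing else to go on, use the locale's encoding.
    return locale.getpreferredencoding(False)
```

With this ordering a user can still force an encoding for a file whose
7(20) record is missing or wrong.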

Anyway, it now opens and correctly displays Korean, Japanese and
Slovenian files.

Comments welcome.

J'

-- 
PGP Public key ID: 1024D/2DE827B3 
fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
See http://pgp.mit.edu or any PGP keyserver for public key.


_______________________________________________
pspp-dev mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/pspp-dev
