You are right. Sadly Perl does not provide all encodings :-(

The README in Perl's encoding directory states the following:

This directory contains binary encoding maps for some selected encodings.
If they are placed in a directory listed in @XML::Parser::Expat::Encoding_Path,
then they are automatically loaded by the XML::Parser::Expat::load_encoding
function as needed. Otherwise you may load what you need directly by
explicitly calling this function.

These maps were generated by a perl script that comes with the module
XML::Encoding, compile_encoding, from XML formatted encoding maps that
are distributed with that module. These XML encoding maps were generated
in turn with a different script, domap, from mapping information contained
on the Unicode version 2.0 CD-ROM. This CD-ROM comes with the Unicode
Standard reference manual and can be ordered from the Unicode Consortium
at http://www.unicode.org. The identical information is available on the
internet at ftp://ftp.unicode.org/Public/MAPPINGS.


With this information at hand I searched Google and found a windows-1252.xml file, which I copied and manually edited to describe the windows-1253 codepage (http://en.wikipedia.org/wiki/Windows-1253). Here are the two files; please try them. Perhaps you will have better luck. If it still fails to generate the proper characters, please check the windows-1253.xml file. It is possible that I made a mapping error.
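A rough way to cross-check a hand-edited map like this one is to compare it against an independent implementation of the codepage. The sketch below assumes Python is available and uses its bundled cp1253 codec as the reference; the helper names (load_encmap, compare_with_codec) and the SAMPLE excerpt are made up for illustration, and the x88 entry in SAMPLE is a deliberately left-over windows-1252 mapping so that the check has something to report. It only handles the <ch> and <range> elements seen in the attachment.

```python
import xml.etree.ElementTree as ET


def parse_hex(value):
    """Turn an encmap attribute such as 'x20ac' into an integer."""
    return int(value[1:] if value.startswith("x") else value, 16)


def load_encmap(xml_text):
    """Expand <ch> and <range> elements into a {byte: code point} dict."""
    mapping = {}
    for elem in ET.fromstring(xml_text):
        byte = parse_hex(elem.get("byte"))
        uni = parse_hex(elem.get("uni"))
        # A <range> covers 'len' consecutive bytes; a <ch> covers one.
        length = int(elem.get("len", "1")) if elem.tag == "range" else 1
        for offset in range(length):
            mapping[byte + offset] = uni + offset
    return mapping


def compare_with_codec(mapping, codec="cp1253"):
    """Return (byte, mapped, expected) triples where map and codec disagree."""
    problems = []
    for byte, mapped in sorted(mapping.items()):
        try:
            expected = ord(bytes([byte]).decode(codec))
        except UnicodeDecodeError:
            expected = None  # the byte is undefined in the reference codec
        if mapped != expected:
            problems.append((byte, mapped, expected))
    return problems


# Hypothetical excerpt; the x88 line is a windows-1252 mapping kept on purpose.
SAMPLE = """<encmap name='windows-1253' expat='yes'>
  <ch byte='x80' uni='x20ac'/>
  <ch byte='x88' uni='x02c6'/>
  <range byte='x91' len='2' uni='x2018'/>
</encmap>"""

if __name__ == "__main__":
    for byte, mapped, expected in compare_with_codec(load_encmap(SAMPLE)):
        want = "undefined" if expected is None else f"U+{expected:04X}"
        print(f"byte 0x{byte:02x}: map says U+{mapped:04X}, codec says {want}")
```

To check the real file, read windows-1253.xml from disk and pass its text to load_encmap instead of SAMPLE. Any line it prints is only a hint to look at that byte again, since the reference codec and the intended map may differ on purpose.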

Best regards
Dirk



Constantine Dokolas wrote:
Dirk <vss2svn <at> nogga.de> writes:

Hi Constantine,

"ssphys" info -eiso-8859-7 "VssAbc/data/names.dat"
As Toby wrote, please try to run the above command standalone and redirect its output to a file.

The encoding patch is not 100% correct, and I still have no idea how to make it 100% correct. We have to deal with two problems:

1.) different encodings: This one should be solved with the encoding attribute, but while playing with this I still had problems outputting characters that are allowed in one codepage but discouraged by the XML standard. See http://www.w3.org/TR/REC-xml/#charsets, where some characters that are still allowed in the windows-1252 codepage are discouraged in XML, especially most of the characters in the band [x80-x9f].
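The effect can be reproduced outside of ssphys with a small Python sketch. It assumes Python's bundled iso8859_7 and cp1253 codecs behave like the encodings discussed here, and the helper name is_discouraged_in_xml is made up for illustration. The ranges it tests are the control-character ranges [#x7F-#x84] and [#x86-#x9F] that the XML 1.0 charsets section lists as discouraged; the other discouraged ranges in that section are ignored.

```python
def is_discouraged_in_xml(code_point):
    """True for the control ranges [#x7F-#x84] and [#x86-#x9F].

    U+0085 (NEL) sits between the two ranges and is not included.
    """
    return 0x7F <= code_point <= 0x84 or 0x86 <= code_point <= 0x9F


def classify(raw_byte, codec):
    """Decode one byte with the given codec and report its XML status."""
    code_point = ord(bytes([raw_byte]).decode(codec))
    return code_point, is_discouraged_in_xml(code_point)


if __name__ == "__main__":
    # The same input byte lands on different code points per encoding.
    for codec in ("iso8859_7", "cp1253"):
        code_point, discouraged = classify(0x92, codec)
        print(f"{codec}: U+{code_point:04X} discouraged={discouraged}")
```

With an ISO-8859 encoding the byte decodes into the control band itself, while a Windows codepage that defines the byte maps it to an ordinary printable character, so the choice of encoding attribute decides whether the resulting XML contains discouraged characters.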

[snip]

Please have a look at the XML file generated from names.dat and check which character is the problematic one.

Done. The culprit is a 0x92 character (a right single quotation mark in the Windows codepages). It seems like 8859-7 is not the right encoding. Instead, "windows-1253" supports that particular character. Unfortunately, using that I get this:

Couldn't open encmap windows-1253.enc:
No such file or directory
 at /PerlApp/XML/Parser.pm line 187

It seems like the proper encoding mapping file is not included in the exe.

Second: your output shows that you are missing a ParserDetails.Ini file. Please check the previous mail thread "Idiots' guide to setting up a perl environment for vss2svn?". There is a sample ParserDetails file:

Perhaps you didn't notice, but I'm using the exe. So, I guess it's also an
exe generation problem.

Thanks a bunch.

Doc

_______________________________________________
vss2svn-users mailing list
Project homepage:
http://www.pumacode.org/projects/vss2svn/
Subscribe/Unsubscribe/Admin:
http://lists.pumacode.org/mailman/listinfo/vss2svn-users-lists.pumacode.org
Mailing list web interface (with searchable archives):
http://dir.gmane.org/gmane.comp.version-control.subversion.vss2svn.user



Attachment: windows-1253.enc
Description: Binary data

<encmap name='windows-1253' expat='yes'>
  <ch byte='x80' uni='x20ac'/>
  <ch byte='x82' uni='x201a'/>
  <ch byte='x83' uni='x0192'/>
  <ch byte='x84' uni='x201e'/>
  <ch byte='x85' uni='x2026'/>
  <range byte='x86' len='2' uni='x2020'/>
  <!-- x88, x8a, x8c, x8e, x98, x9a, x9c, x9e and x9f are undefined in windows-1253 -->
  <ch byte='x89' uni='x2030'/>
  <ch byte='x8b' uni='x2039'/>
  <range byte='x91' len='2' uni='x2018'/>
  <range byte='x93' len='2' uni='x201c'/>
  <ch byte='x95' uni='x2022'/>
  <range byte='x96' len='2' uni='x2013'/>
  <ch byte='x99' uni='x2122'/>
  <ch byte='x9b' uni='x203a'/>
  <ch byte='xa0' uni='x00a0' />
  <range byte='xa1' len='2' uni='x385'/>
  <range byte='xa3' len='7' uni='xa3'/>
  <range byte='xab' len='4' uni='xab'/>
  <ch byte='xaf' uni='x2015' />
  <range byte='xb0' len='4' uni='xb0'/>
  <ch byte='xb4' uni='x384' />
  <range byte='xb5' len='3' uni='xb5'/>
  <range byte='xb8' len='3' uni='x388'/>
  <ch byte='xbb' uni='xbb' />
  <ch byte='xbc' uni='x38c' />
  <ch byte='xbd' uni='xbd' />
  <range byte='xbe' len='2' uni='x38e'/>
  <range byte='xc0' len='18' uni='x390'/>
  <range byte='xd3' len='44' uni='x3a3'/>
</encmap>