Smokey,

On 21/04/2011 08:20, Smokey Ardisson wrote:
> Just for my own information, your recent commits have only been for the
> double-byte languages; there's still no support for the single-byte ones
> in WP-Mac files, right?

I have a problem with this at the moment, because the codes are basically
in the same range for each charset. I don't know how they are placed in
the script function. It is possible that each group carries one byte
indicating which group it is, followed by the code, but I am far from
sure. If that were the case, we could probably convert them.

For the other "extended character function", we do the following. We
first read the one-byte Mac character code and interpret it using the
MacRoman table. If this character is not valid (there are special codes
for that), we read the two bytes of the WP5 charset/char pair and
interpret them the same way as we do for WP5 documents.
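
To make the order above concrete, here is a rough sketch (C++17), not
the actual libwpd code. All names are hypothetical, the tables hold only
a few illustrative entries, and treating 0x00 as the "no Mac character"
code is an assumption, since the real special codes are not listed here.

```cpp
#include <cstdint>
#include <optional>

// One decoded extended character: the Mac byte plus the WP5 pair.
// Hypothetical struct; it says nothing about the on-disk byte order.
struct ExtendedChar
{
    std::uint8_t macChar;
    std::uint8_t wpCharset;
    std::uint8_t wpChar;
};

// ASSUMPTION: 0x00 stands in for the special "not valid" Mac codes.
constexpr std::uint8_t kNoMacChar = 0x00;

// Toy MacRoman lookup: ASCII maps to itself, plus two accented letters.
std::optional<std::uint32_t> macRomanToUCS4(std::uint8_t c)
{
    if (c == kNoMacChar) return std::nullopt;
    if (c < 0x80) return c;
    switch (c)
    {
    case 0x8E: return 0x00E9u; // e with acute
    case 0x8F: return 0x00E8u; // e with grave
    default:   return std::nullopt; // not covered by this toy table
    }
}

// Toy WP5 lookup; entries are illustrative, not copied from real tables.
std::optional<std::uint32_t> wp5PairToUCS4(std::uint8_t charset, std::uint8_t ch)
{
    if (charset == 1 && ch == 41) return 0x00E9u;
    if (charset == 1 && ch == 47) return 0x00E8u;
    return std::nullopt;
}

// Mac character first; the WP5 pair is only the fallback.
std::optional<std::uint32_t> decodeMacFirst(const ExtendedChar &ec)
{
    if (auto ucs4 = macRomanToUCS4(ec.macChar))
        return ucs4;
    return wp5PairToUCS4(ec.wpCharset, ec.wpChar);
}
```

With this order the WP5 pair is consulted only when the Mac byte gives
nothing usable.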

I originally implemented this using only the WP5 part, but if I recall
correctly, we had some problems with particular Mac characters that were
misinterpreted (probably a bug in WP3). So we reverted to giving priority
to the Mac character. It is nevertheless conceivable, although I don't
have empirical evidence for it, that if you write a document using
another system language, the Mac character will be from a different set.
There, we cannot do much unless we somehow get the information about
which charset those characters are from. That is also why I was saying
that we could use some additional information about what the WorldScript
encoding looks like.

So, unless we know which encoding the document uses for the Mac
character part of the 0xC0 functions, we are a bit stuck. I am almost
inclined to convert the WP5 pair first and fall back to the Mac char
only if the WP5 pair gives no result, since that might handle more cases
correctly, even though it might mix up some 2-3 acute vs. grave accents.
But the proper way would be to get the information from the docs about
which encoding is used. Anybody volunteering?
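
As a rough illustration of that trade-off (C++17, hypothetical names,
toy tables with made-up coverage, not the libwpd implementation), a
priority switch could look like this. The interesting case is a record
whose Mac byte and WP5 pair disagree on the accent.

```cpp
#include <cstdint>
#include <optional>

using Ucs4 = std::optional<std::uint32_t>;

// Toy MacRoman lookup with two entries only.
Ucs4 lookupMac(std::uint8_t c)
{
    if (c == 0x8E) return 0x00E9u; // e with acute
    if (c == 0x8F) return 0x00E8u; // e with grave
    return std::nullopt;
}

// Toy WP5 charset/char lookup; illustrative entries only.
Ucs4 lookupWP5(std::uint8_t charset, std::uint8_t ch)
{
    if (charset == 1 && ch == 41) return 0x00E9u;
    if (charset == 1 && ch == 47) return 0x00E8u;
    return std::nullopt;
}

enum class Priority { MacFirst, WP5First };

// Try the preferred source first and fall back to the other one.
Ucs4 decode(std::uint8_t macChar, std::uint8_t wpCharset,
            std::uint8_t wpChar, Priority priority)
{
    const Ucs4 mac = lookupMac(macChar);
    const Ucs4 wp5 = lookupWP5(wpCharset, wpChar);
    if (priority == Priority::WP5First)
        return wp5 ? wp5 : mac;
    return mac ? mac : wp5;
}
```

For a record carrying Mac 0x8E together with the WP5 pair (1, 47), the
two priorities give different accents, which is exactly the kind of
mismatch mentioned above.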

F.

-- 
Please avoid sending me Word, Excel or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html

_______________________________________________
Libwpd-devel mailing list
Libwpd-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/libwpd-devel
