On 1/5/2011 12:48 PM, Ian Walberg wrote:
Hello list and Happy New Year,

We are seeing incorrectly rendered Arabic text, typically a 'square'
displayed instead of one of the characters or sometimes one or more
characters missing.

The data is coming from shapefiles and most of the names appear to be
displayed correctly.

This is being seen on both ms4w and the target Linux installation.

Any idea where to look would be greatly appreciated. Is the 'square'
character what Freetype displays if it cannot find a character in the
font?

Hi Ian,

The square that is displayed is the default glyph used when a requested character cannot be found in the font being used.

I looked at the sample images that you emailed me directly and here are my thoughts on the problem:

1. encoding is ok, because 99.9% of the characters are displayed correctly

Possible causes for what you are seeing:

2. you have random characters that are not valid utf8 characters - but this is probably refuted by the fact that in other programs the text is displayed correctly.

One way to easily verify this is to do a dbfdump of the dbf file in the shapefile like:

  dbfdump myshapefile.dbf > myshapefile.txt

Then view myshapefile.txt in firefox where you can change the character encoding from the menu:

View -> Character Encoding -> ...

You can change it until you find the one that looks correct. This is also a good way to determine the ENCODING value for the mapfile if you are unsure. Also, if your data is mangled because an attribute column contains mixed encodings or the data was encoded badly, it will most likely look like garbage.

3. the font you are using with mapserver is not the same font used by other programs. Hence the other programs have the glyph in their font and mapserver does not.

4. typically in utf8 text, only the "content" characters are stored and it is up to the rendering program to determine whether it is left-to-right or right-to-left rendering and whether or not some adjacent characters need to be joined in some manner. This joining process is typically different in different applications. So while mapserver might be failing in this regard (more on that below), maybe your other application is doing a better job.
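The encoding check described under item 2 can also be scripted rather than done by eye in a browser. What follows is only a rough sketch: the helper names and the list of candidate encodings are my own assumptions, and it expects the raw bytes of the dbfdump output (read in binary mode), not anything mapserver provides.

```python
# Hypothetical helpers, not part of mapserver or dbfdump.
# Assumed candidate encodings for Arabic attribute data.
CANDIDATES = ["utf-8", "cp1256", "iso-8859-6"]


def try_encodings(raw, candidates=CANDIDATES):
    """Return {encoding: decoded text} for every candidate that decodes cleanly."""
    results = {}
    for enc in candidates:
        try:
            results[enc] = raw.decode(enc)
        except UnicodeDecodeError:
            pass  # these bytes are not valid in this encoding
    return results


def find_invalid_utf8(raw):
    """Return the byte offsets at which strict UTF-8 decoding fails."""
    bad = []
    pos = 0
    while pos < len(raw):
        try:
            raw[pos:].decode("utf-8")
            break  # the remainder is valid UTF-8
        except UnicodeDecodeError as err:
            bad.append(pos + err.start)  # err.start is relative to the slice
            pos += err.end               # skip past the offending bytes
    return bad
```

An encoding that decodes cleanly is only a candidate; you still have to look at the decoded text to see whether it reads correctly. An empty list from find_invalid_utf8 would support the view that stray invalid bytes (item 2) are not the cause.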

So my pick for the likely issue here is 3 or, more likely, 4. Mapserver uses the fribidi library to convert the utf8 "content" string into a utf8 "display" string, then asks freetype to render the "display" string using your font. The difference between the "content" string and the "display" string is that fribidi deals with the positioning of the characters for both RTL and LTR rendering, and it adds joining characters that must be present in the font that you use to display the string.

It is likely that whatever your comparison application is, it is not using fribidi.
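To make the "content" versus "display" distinction concrete, here is a small sketch using only Python's unicodedata module. It assumes that the joining step substitutes code points from the Arabic Presentation Forms blocks, which is my assumption about how the shaping works and may not match your fribidi build. The helper names are hypothetical.

```python
import unicodedata


def is_presentation_form(ch):
    """True for code points in Arabic Presentation Forms-A or Forms-B."""
    cp = ord(ch)
    return 0xFB50 <= cp <= 0xFDFF or 0xFE70 <= cp <= 0xFEFC


def describe(text):
    """Return (code point, Unicode name, is presentation form) per character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"), is_presentation_form(ch))
        for ch in text
    ]


content = "\u0628"  # ARABIC LETTER BEH, as stored in the data
display = "\uFE91"  # ARABIC LETTER BEH INITIAL FORM, one possible shaped form

# Compatibility normalization folds the shaped form back to the stored letter.
assert unicodedata.normalize("NFKC", display) == content
```

The point is that a font can contain U+0628 and still lack U+FE91, in which case the stored text is fine but the shaped string renders with a square.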

So the fact that mapserver is rendering most of your data correctly means that you have things built correctly and your data and mapfile are set up correctly. I think you will need to chase down this problem on the fribidi list. If they issue a patch to the library, then the MS4W team can pick it up if they know about it, and anyone on *NIX can pull the new package and build it.

The other fix to this problem, again the fribidi list can probably help, is to identify which glyph is missing and try to find someone who can add a glyph for the missing character to your font.
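Identifying the missing glyph comes down to comparing the code points in the string with the code points your font covers. A minimal sketch, assuming you can obtain the font's coverage as a set of integers; the function name is hypothetical, and the fontTools lines in the comment rely on a third-party package that may not be installed.

```python
def missing_code_points(text, covered):
    """Return the sorted code points in text that are absent from covered."""
    wanted = {ord(ch) for ch in text if not ch.isspace()}
    return sorted(wanted - set(covered))


# Assumption: with the third-party fontTools package installed, the coverage
# set could be read from the font's character map along these lines:
#   from fontTools.ttLib import TTFont
#   covered = set(TTFont("yourfont.ttf").getBestCmap())

# Illustrative coverage only, not taken from any real font.
example_covered = {0x0628, 0xFE91}
example_missing = missing_code_points("\u0628\uFE92 \uFE91", example_covered)
```

Run it on both the stored text and the shaped text if you can get at it, since the shaped string may need code points the stored string never contains.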

As far as I can tell this is not a mapserver problem per se, except that it is annoyingly only showing its ugly square box on the maps we render ;)

I hope this helps you solve this problem.

-Steve W
_______________________________________________
mapserver-users mailing list
[email protected]
http://lists.osgeo.org/mailman/listinfo/mapserver-users