Emad Nawfal (عماد نوفل) wrote:
Hi Tutors,
I have a bunch of text files that have many occurrences like the following
which I believe, given the context, are numbers:
&#1633;&#1640;&#1639;&#1634;
&#1637;&#1639;
&#1634;&#1632;&#1632;&#1640;
etc.
So, can somebody please explain what kind of numbers these are, and how I
can get the original numbers back. The files are in Arabic and were
downloaded from an Arabic website.
I'm running python2.6 on Ubuntu 9.04
Those are standard HTML encodings (numeric character references) for
some Unicode characters. Skipper has identified one of them as the
digit '1' written in Arabic. I presume the others will also be
recognizable to you, since you apparently know Arabic. The following
text should be copied to a file with the extension .html. Then open
that file in a browser to see the characters.
DaveA
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<TITLE>Test Arabic Characters</TITLE>
</HEAD>
<BODY>
<p>
Table of characters <br>
1632 - &#1632;<br>
1633 - &#1633;<br>
1634 - &#1634;<br>
1635 - &#1635;<br>
1636 - &#1636;<br>
1637 - &#1637;<br>
1638 - &#1638;<br>
1639 - &#1639;<br>
1640 - &#1640;<br>
1641 - &#1641;<br>
</p>
</BODY>
</HTML>
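To get the numbers back inside Python rather than in a browser, something
like the following might work. It is only a sketch, and it assumes the files
contain decimal numeric character references of the form &#1633; (code
points 1632 to 1641 are the Arabic-Indic digits 0 to 9). The helper names
decode_numeric_refs and to_int are made up for illustration. It is written
for Python 3; on Python 2.6 you would need unichr() in place of chr() and
unicode strings throughout.

```python
import re

# Matches decimal numeric character references such as &#1633;
NUMERIC_REF = re.compile(r"&#(\d+);")


def decode_numeric_refs(text):
    """Replace each &#NNNN; reference with the Unicode character it names."""
    return NUMERIC_REF.sub(lambda m: chr(int(m.group(1))), text)


def to_int(text):
    """Decode any numeric references, then parse the digits as an integer.

    int() accepts any Unicode decimal digit, so the Arabic-Indic
    digits are converted without a hand-written lookup table.
    """
    return int(decode_numeric_refs(text))


if __name__ == "__main__":
    for sample in ("&#1633;&#1640;&#1639;&#1634;", "&#1637;&#1639;"):
        print(to_int(sample))
```

If the files already hold the Arabic-Indic digits themselves instead of the
&#...; form, to_int() should still work, because the substitution simply
finds nothing to replace.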
_______________________________________________
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor