Emad Nawfal (عماد نوفل) wrote:
Hi Tutors,
I have a bunch of text files that have many occurrences like the following
which I believe, given the context, are numbers:

&#1633;&#1640;&#1639;&#1634;

&#1637;&#1639;

&#1634;&#1632;&#1632;&#1640;

etc.

So, can somebody please explain what kind of numbers these are, and how I
can get the original numbers back. The files are in Arabic and were
downloaded from an Arabic website.
I'm running python2.6 on Ubuntu 9.04

Those are standard HTML numeric character references for some Unicode characters. Skipper has identified one of them as the digit '1' written in Arabic. I presume the others will also be recognizable to you, since you apparently know Arabic. The following text should be copied to a file with the extension .html. Then open that file in a browser to see the characters.

DaveA

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">

<HTML>
<HEAD>
<TITLE>Test Arabic Characters</TITLE>
</HEAD>
<BODY>
  <p>
Table of characters <br>
1632 - &#1632;<br>
1633 - &#1633;<br>
1634 - &#1634;<br>
1635 - &#1635;<br>
1636 - &#1636;<br>
1637 - &#1637;<br>
1638 - &#1638;<br>
1639 - &#1639;<br>
1640 - &#1640;<br>
1641 - &#1641;<br>
  </p>
</BODY>
</HTML>
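
To get the original numbers back in Python rather than in a browser, something along the following lines may work. It is only a sketch: the helper names decode_entities and to_ascii_digits are made up for illustration, it assumes the text has already been read in as a Unicode string, and it only handles decimal references of the form &#NNNN;. It is written for Python 3 with a small shim for Python 2, where the input would need to be a unicode object. It should cope whether the files hold the references or the Arabic-Indic digit characters themselves.

```python
import re
import unicodedata

try:
    unichr  # Python 2 has unichr
except NameError:
    unichr = chr  # Python 3: chr already returns a Unicode character

# Matches decimal numeric character references such as &#1633;
ENTITY_RE = re.compile(r"&#(\d+);")


def decode_entities(text):
    """Replace each decimal reference (&#NNNN;) with the character it names."""
    return ENTITY_RE.sub(lambda m: unichr(int(m.group(1))), text)


def to_ascii_digits(text):
    """Map any Unicode decimal digit (e.g. Arabic-Indic) to ASCII 0-9."""
    out = []
    for ch in text:
        value = unicodedata.decimal(ch, None)  # None for non-digit characters
        out.append(ch if value is None else str(value))
    return "".join(out)


if __name__ == "__main__":
    sample = "&#1633;&#1640;&#1639;&#1634;"
    print(to_ascii_digits(decode_entities(sample)))  # prints 1872
```

Code points 1632 through 1641 are the Arabic-Indic digits 0 through 9, which is why the sample comes out as 1872. If only the integer value is needed, int() applied to the decoded string should also work, since it accepts any Unicode decimal digits.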

_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
