Salam,

Ahmed, if you are trying to parse the Quranic Corpus XML file, I think NLTK
is not the library you need.

You need an XML parser such as ElementTree (http://effbot.org/downloads/)
and a text parser such as pyparsing.
You also need to read about the Quranic Corpus morphology syntax.
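
As a rough illustration of the ElementTree side, here is a minimal sketch.
The XML snippet and its element and attribute names (corpus, word, chapter,
verse, pos) are invented for this example and are NOT the real Quranic
Corpus schema, so check the actual file and adjust the names. The helper
find_words is also just a name I made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: tag and attribute names are assumptions,
# not the real Quranic Corpus XML schema.
SAMPLE_XML = """
<corpus>
  <word chapter="1" verse="1" pos="N">bisomi</word>
  <word chapter="1" verse="1" pos="PN">Allahi</word>
  <word chapter="1" verse="2" pos="N">AlHamodu</word>
</corpus>
"""


def find_words(xml_text, chapter, verse):
    """Return (text, pos) pairs for every <word> in the given chapter and verse."""
    root = ET.fromstring(xml_text.strip())
    wanted = (str(chapter), str(verse))
    return [
        (w.text, w.get("pos"))
        for w in root.iter("word")
        if (w.get("chapter"), w.get("verse")) == wanted
    ]


if __name__ == "__main__":
    for text, pos in find_words(SAMPLE_XML, 1, 1):
        print(text, pos)
```

For a real file you would call ET.parse(path).getroot() instead of
ET.fromstring, and the filtering logic would stay the same.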

I am working on a Python API for the Quranic Corpus, but it is not ready to publish yet.
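
In the meantime, for the pipe-delimited text file in your message, plain
Python may be enough. The sketch below assumes every data row has exactly
the five fields from the format line (<chapter> | <verse> | <word> |
<token> | <part-of-speech>), that the first three are integers, and that
tokens never contain a "|" character. The function names are my own, not
part of any library. Note that the quoted code needs str.split('|') rather
than splitlines('|'), and strip() rather than strib().

```python
import io
from collections import defaultdict


def load_corpus(lines):
    """Index rows as index[chapter][verse] -> list of (word_no, token, pos)."""
    index = defaultdict(lambda: defaultdict(list))
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment lines such as "# Format: ..."
        if not line or line.startswith("#"):
            continue
        fields = [f.strip() for f in line.split("|")]
        # Assumption: a valid row has five fields, the first three numeric.
        if len(fields) != 5 or not all(f.isdigit() for f in fields[:3]):
            continue
        chapter, verse, word_no, token, pos = fields
        index[int(chapter)][int(verse)].append((int(word_no), token, pos))
    return index


def verses_in_chapter(index, chapter):
    """Return the sorted verse numbers found in a chapter."""
    return sorted(index.get(chapter, {}))


def words_in_verse(index, chapter, verse):
    """Return (word_no, token, pos) tuples for one verse, in word order."""
    return sorted(index.get(chapter, {}).get(verse, []))


# Usage with the file from the quoted message:
# with io.open(r"D:\quran\quranic-corpus-text-0.1.txt", encoding="utf-8") as f:
#     index = load_corpus(f)
# print(verses_in_chapter(index, 1))
# print(words_in_verse(index, 1, 1))
```

Keeping the numbers as integers means the verses and words sort in reading
order instead of string order ("10" before "2").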




2010/5/24, Kais Dukes <k...@kaisdukes.com>:
> Salam Ahmed,
>
> I don't know much about python. But I have forwarded your e-mail to the
> comp-quran mailing list, Inshallah someone will be able to help you!
>
> w/salam,
>
> -- Kais
>
> -------------------------------------------
> From: Ahmed Salem [SMTP:ahmed.elsayed.sa...@gmail.com]
> Sent: Sunday, May 23, 2010 11:51:43 PM
> To: Kais Dukes; k...@kaisdukes.com
> Subject: Python with Quranic Arabic Corpus Help
> Auto forwarded by a Rule
>
> Assalamu alaikum,
> Hi Kais,
>
>
> I'm a student, and I am trying to build a Quranic Arabic search program with
> Python and the Quran Corpus, but I'm a beginner at Python and NLTK.
>
> I work on the Quran corpus at the link<http://corpus.quran.com/download/>, which
> is built on this format:
> # Format: <chapter> | <verse> | <word> | <token> | <part-of-speech>
>
> Now I need your help in finding all verses in a selected chapter, and then all
> words in a selected (chapter, verse).
>
> I started to separate them with code like:
>
> path = nltk.data.find('D:\\quran\quranic-corpus-text-0.1.txt')
>
> ar = {}
> arabic =codecs.open(path, encoding='utf-8')
>
> line = arabic.readline()
> while line!='':
>    tmp = line.splitlines('|')
>    ch= tmp[0]
>    v = tmp[1]
>    txt=tmp[2]
>    kkk=ch.strib()+":"+v.strib()
>    ar[kkk]=txt.strib()
>
>    line = arabic.readline()
> arabic.close()
>
>
> But that approach does not work yet, and I think there is probably an easier way
> to do it, so if you can, please help or advise.
>
> Thanks,
> --
> Ahmed Salem Resume<http://www.scribd.com/doc/14256056/Ahmed-LSayed-Salem-CV>
> MUFIX Community<http://www.mufix.org>
> Mobile : +2 (018) 23 79 073
>
