For basic language tools for Persian, see:
http://stp.lingfil.uu.se/~mojgan/

Jörg


On Thu, Apr 18, 2013 at 12:38 PM, amin farajian <[email protected]> wrote:

>  Dear Wang,
>
> Here are the links to the publicly available Persian-English corpora:
>
>    - TEP: Tehran English-Persian parallel corpus, built on subtitles. It
>    is free and you can find it here:
>    <http://opus.lingfil.uu.se/download.php?f=OpenSubtitles2011/xml/en-fa.xml.gz>
>    - ELRA-W0051, generic domain. To obtain this corpus, take a look at
>    this link: <http://catalog.elra.info/product_info.php?products_id=1111>
>    - PEN: Parallel English-Persian News corpus, which is a small corpus
>    built on news stories. It is not publicly available yet, but I am going to
>    release it soon. Link to the paper:
>    <http://world-comp.org/p2011/ICA4953.pdf>
>
> For tokenization you can use any available tokenizer, such as the Moses
> tokenizer.
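
To make the tokenization step concrete, here is a minimal Python sketch of a
whitespace-and-punctuation tokenizer for Persian text. It is not the Moses
tokenizer and does not reproduce its rules; the function name `tokenize_fa`,
the punctuation set, and the choice to normalize Arabic yeh and kaf to their
Persian forms are all assumptions made for illustration.

```python
import re

# Punctuation split off as separate tokens (an illustrative, incomplete set).
# \u060C, \u061B, \u061F are the Arabic-script comma, semicolon, question mark.
PUNCT = ".,!?;:()[]{}\"'«»\u060C\u061B\u061F"
PUNCT_RE = re.compile("([" + re.escape(PUNCT) + "])")

# Assumption: map Arabic yeh and kaf to the Persian code points, a common
# preprocessing step so that visually identical words compare equal.
NORMALIZE = str.maketrans({"\u064A": "\u06CC", "\u0643": "\u06A9"})


def tokenize_fa(text):
    """Return a list of tokens for a line of Persian (or mixed) text."""
    text = text.translate(NORMALIZE)
    # Surround each punctuation mark with spaces, then split on whitespace.
    # The zero-width non-joiner (U+200C) is not whitespace, so words that
    # contain it stay in one piece.
    text = PUNCT_RE.sub(r" \1 ", text)
    return text.split()


if __name__ == "__main__":
    print(tokenize_fa("\u0633\u0644\u0627\u0645\u060C \u062F\u0646\u06CC\u0627!"))
```

A sketch this simple also splits decimal numbers and abbreviations at the
period, so a real tokenizer is the better choice for training data.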
>
>
>  If you have more questions, feel free to ask.
>
>
>  Regards,
> Amin
>
>
>
> On 04/18/2013 10:45 AM, Wang, JinPeng(AWF) wrote:
>
> Hi, everyone
>
> Have you got any Persian and English parallel text or related
> corpus links? And how should Persian be tokenized?
>
> Thanks
>
> Regards
>
> Wang, JinPeng(AWF)
> eBay, Inc.
> Stubhub
>
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
>


-- 
**********************************************************************************
Jörg Tiedemann
http://stp.lingfil.uu.se/~joerg/
