RE: Persian PC-Kimmo 0.8 released

Ehsan Akhgari Tue, 11 May 2004 20:34:30 -0700

> For anyone who's interested, Persian PC-Kimmo version
> 0.8 has just been released.  It's available here:
>
> http://home.byu.net/jmd56/download/persian-pckimmo-0.8.tar.gz


Thanks, Jon, for releasing this version.  It looks a lot better than the
previous one!

> The biggest thing holding them back from being a 1.0 is a relatively
> small lexicon (~1350 words).  The morphology engine achieves about
> two-thirds recognition on a corpus of about 3.5 million words.
> And of course, it's GPL'ed.

Hmmm, do you have a list of the words in the current lexicon?  (I'm not
familiar with the PC-KIMMO-specific commands, so I can't parse the lexicon
files on my own.)  What can I do to help add more words?

> Any helpful feedback would be appreciated.

I find the new tree-style recognition output very helpful:

n+mi+]+im     NEG+DUR+come.PRES+1P

1:
            Top
             |
           Verb
     ________|________
VNEGPREFIX        VNStem
    n+         ______|_______
   NEG+     VPREFIX       VStem
              mi+           |
             DUR+        V1Stem
                        ____|_____
                     V2Stem  VPSUFFIX
                        |       +im
                     V3Stem     +1P
                        |
                        V
                        ]
                    come.PRES

Top:
[ cat:   Top ]

1 parse found

n+mi+]+m     NEG+DUR+come.PRES+1S

1:
            Top
             |
           Verb
     ________|________
VNEGPREFIX        VNStem
    n+         ______|_______
   NEG+     VPREFIX       VStem
              mi+           |
             DUR+        V1Stem
                        ____|_____
                     V2Stem  VPSUFFIX
                        |       +m
                     V3Stem     +1S
                        |
                        V
                        ]
                    come.PRES

Top:
[ cat:   Top ]

1 parse found

I was wondering if there's some way to retrieve the tree-structured data in
a format that is easy to parse (the ASCII-art style is too difficult for a
computer program to parse), something like an XML format maybe?
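
To illustrate the kind of output I mean, here is a rough Python sketch. It is
not something PC-Kimmo produces: the tree is copied by hand from the first
parse above, and the element names (`node`, `leaf`) and attribute names
(`cat`, `form`, `gloss`) are made up purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hand-copied version of the first parse tree above (hypothetical encoding):
# internal nodes are (category, [children]); leaves are (category, form, gloss).
TREE = ("Top", [
    ("Verb", [
        ("VNEGPREFIX", "n+", "NEG+"),
        ("VNStem", [
            ("VPREFIX", "mi+", "DUR+"),
            ("VStem", [
                ("V1Stem", [
                    ("V2Stem", [
                        ("V3Stem", [
                            ("V", "]", "come.PRES"),
                        ]),
                    ]),
                    ("VPSUFFIX", "+im", "+1P"),
                ]),
            ]),
        ]),
    ]),
])


def to_xml(node):
    """Convert one tuple-encoded tree node into an ElementTree element."""
    if len(node) == 3:
        # Leaf: category plus the surface form and its gloss.
        cat, form, gloss = node
        return ET.Element("leaf", cat=cat, form=form, gloss=gloss)
    # Internal node: category plus a list of child nodes.
    cat, children = node
    elem = ET.Element("node", cat=cat)
    for child in children:
        elem.append(to_xml(child))
    return elem


root = to_xml(TREE)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Any format with explicit nesting like this would do; with it, a program could
walk the tree using a standard XML library instead of scanning the ASCII
drawing.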

-------------
Ehsan Akhgari

Farda Technology (http://www.farda-tech.com/)

List Owner: [EMAIL PROTECTED]

[ Email: [EMAIL PROTECTED] ]
[ WWW: http://www.beginthread.com/Ehsan ]



_______________________________________________
PersianComputing mailing list
[EMAIL PROTECTED]
http://lists.sharif.edu/mailman/listinfo/persiancomputing
