Dear Golam bhai,

Any updates on Anubadok? :)

Thanks,
Jamil


----- Original Message -----
From: "Golam Mortuza Hossain" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Sunday, May 29, 2005 11:55 PM
Subject: [Ankur-core] Anubadok: status report


Friends,

This is a kind of status report on "Anubadok"
(and also a bit of beating my own drum :-)). I have now more or
less completed the coding for ONE of the core subroutines of
the translator. As a result, its performance has improved
significantly. The part of the translator that generates
Bengali sentences is now more or less complete (except for
"BnSondhi.pm"; someone has promised to get me a Bengali
grammar book, and once I have it I will try to complete
the coding for the rules of "Sondhi". It needs only the
elementary rules, not all of them!). The translator can now
deal with about 50 English prepositions (ignoring their dual
meanings). Apparently, English has more than a hundred of
them, but not all of them are commonly used. I have been
feeding it news.bbc pages, so I guess the prepositions
included are among the common ones.


Right now, I am mainly working on another
core module, which will take some time. This one takes an
English sentence and splits it into sub-sentences, each of
which is structurally independent (each is itself a basic
sentence). It seems that in computational linguistics there
is no known exact algorithm for doing this. This also means
that one can only keep improving the code for this module
but never complete it :-(. Please note that this doesn't
mean that longer sentences can't be translated properly.
Rather, it will perform poorly only if the syntactic
structure of the sentence is highly complex. For example, it
can translate a much longer sentence very well but may fail
on a relatively short one. Fortunately, such complex
sentences are relatively rare in normal circumstances.
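
To make the idea of splitting into sub-sentences concrete, here is a
minimal sketch in Python. It is NOT the algorithm used in Anubadok
(that module is not shown in this message); the function name
`split_clauses` and the boundary rule (split at semicolons and at a
comma followed by "and", "but", "or" or "so") are assumptions chosen
only for illustration.

```python
import re

# Hypothetical boundary rule, for illustration only: a semicolon, or a
# comma that is followed by a coordinating conjunction. The lookahead
# keeps the conjunction at the start of the next clause.
_BOUNDARY = re.compile(r"\s*;\s*|,\s+(?=(?:and|but|or|so)\b)", re.IGNORECASE)


def split_clauses(sentence):
    """Split a sentence into candidate sub-sentences at naive boundaries."""
    body = sentence.strip().rstrip(".")
    return [part.strip() for part in _BOUNDARY.split(body) if part.strip()]


if __name__ == "__main__":
    for clause in split_clauses(
        "He opened the file, and she read the report; they left."
    ):
        print(clause)
```

A rule like this leaves a short sentence with an embedded clause, such
as "The dog that bit the man ran away.", in one piece, which matches
the point above: the difficulty comes from syntactic complexity, not
from length.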

On the second issue: it is really an encouraging
development that the word count of the English-to-Bengali
dictionary has gone UP by a quantum jump. A BIG THANKS to
Progga and his enthusiasm. Progga sent me Ankur's
"commonwords.po" file, containing a large number of "msgid"
and "msgstr" entries. I have converted them into the
dictionary format using a little bit of scripting.
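
As an illustration of that kind of conversion, here is a minimal
sketch in Python that collects "msgid"/"msgstr" pairs from gettext
.po text. It is not the script actually used: the function names and
the tab-separated output format are assumptions, since the Anubadok
dictionary format is not described in this message. Entries with an
empty "msgid" (the .po header) or an empty "msgstr" (untranslated)
are skipped.

```python
def unquote(token):
    """Strip the surrounding double quotes and undo two common escapes."""
    token = token.strip()
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        token = token[1:-1]
    return token.replace('\\"', '"').replace("\\n", "\n")


def po_to_dictionary(po_text):
    """Return {english (lowercased): translation} from .po file text."""
    pairs = {}
    fields = {"msgid": "", "msgstr": ""}
    current = None  # field that quoted continuation lines belong to

    def flush():
        source = fields["msgid"].strip()
        target = fields["msgstr"].strip()
        if source and target:  # skip the header and untranslated entries
            pairs[source.lower()] = target
        fields["msgid"] = fields["msgstr"] = ""

    for raw in po_text.splitlines():
        line = raw.strip()
        if line.startswith("msgid "):
            flush()  # a new entry begins, so store the previous one
            current = "msgid"
            fields[current] = unquote(line[len("msgid "):])
        elif line.startswith("msgstr "):
            current = "msgstr"
            fields[current] = unquote(line[len("msgstr "):])
        elif line.startswith('"') and current:
            fields[current] += unquote(line)  # multi-line string
        else:
            current = None  # comment, blank line, or unsupported keyword
    flush()
    return pairs


def dictionary_lines(pairs):
    """Format pairs as 'english<TAB>bengali' lines (assumed format)."""
    return [f"{source}\t{target}" for source, target in sorted(pairs.items())]
```

For example, `dictionary_lines(po_to_dictionary(text))` with `text`
read from "commonwords.po" would give one line per translated word.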

So, its word count now stands above three thousand
five hundred. At the time of my first posting, it was
only around 500. I have added a page for the "list of
contributors", and it is also included in the anubadok
package.

I have uploaded the latest version (still "0.0",
and it will remain so for some time). It includes 3500+
words in its E2B dictionary. You may try google("anubadok")
for downloading :-)

Suggestions, comments ... are welcome.

Cheers,
golam


_______________________________________________
Bengalinux-core mailing list
Bengalinux-core@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bengalinux-core

