This is great!

However, it is a bit counter-intuitive that the lower weight wins. In other cases
(like lexical selection rules), higher weights win.

Would it be possible to do something so weights work consistently?
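
To make the current behaviour concrete, here is a minimal Python sketch of the "lowest weight wins" selection, based only on the `lt-proc -W` sample output quoted below. The helper names (parse_weighted_unit, best_translation) are hypothetical and not part of lttoolbox, and the sketch assumes that every target carries a <W:...> tag and that there are no escaped slashes inside the lexical unit.

```python
import re

# Matches a weight tag such as <W:1.500000>, as seen in the quoted sample output.
WEIGHT_TAG = re.compile(r"<W:([0-9]+(?:\.[0-9]+)?)>")


def parse_weighted_unit(unit):
    """Split one ^source/target1/target2$ unit into (source, [(target, weight), ...]).

    Assumption: no escaped '/' characters occur inside the unit.
    """
    body = unit.strip()
    if not (body.startswith("^") and body.endswith("$")):
        raise ValueError("expected a lexical unit of the form ^...$")
    source, *targets = body[1:-1].split("/")
    parsed = []
    for target in targets:
        match = WEIGHT_TAG.search(target)
        if match is None:
            raise ValueError("target has no <W:...> tag: " + target)
        # Store the target with the weight tag stripped, next to its weight.
        parsed.append((WEIGHT_TAG.sub("", target), float(match.group(1))))
    return source, parsed


def best_translation(unit):
    """Return the target with the lowest weight (the current convention)."""
    _, targets = parse_weighted_unit(unit)
    return min(targets, key=lambda pair: pair[1])[0]


sample = "^estación<n><f><sg>/season<n><W:1.000000><sg>/station<n><W:1.500000><sg>$"
print(best_translation(sample))
```

With the sample line this picks "season", the entry weighted 1.0, which is the opposite direction from lexical selection rules, where the higher weight wins.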

Message from Francis Tyers <fty...@prompsit.com> on Sat, 4 Aug 2018 at 17:53:

> On 2018-08-03 15:42, Abinash Senapati wrote:
> > Hello developers,
> > I am a student currently working on the idea "Extend lttoolbox to have
> > the power of HFST" for my GSoC project. I am here to talk about the
> > new modifications that are now part of lttoolbox, and I would like all
> > of you to try them out. As part of my coding challenge I developed
> > a module that converts lexc files to the dix file format. The
> > repo for the package is https://github.com/Techievena/lexc2dix. So
> > these are the changes we have in lttoolbox right now.
> >
> > Currently lttoolbox supports weights in the binary files. Here
> > is a snippet of that.
>
> Thanks Abinash! Excellent work!
>
> What this means is that you can now weight your morphological analysers,
> generators and bilingual dictionaries.
>
> Here are some problems that this can solve:
>
> 1) Having zero-context rules in your .lrx files. Now you can just put the
>     weights directly in your bilingual dictionary:
>
> $ echo "^estación<n><f><sg>$" | lt-proc -W -b testbidix.bin
> ^estación<n><f><sg>/season<n><W:1.000000><sg>/station<n><W:1.500000><sg>$
>
> $ echo "^estación<n><f><sg>$" | lt-proc -b testbidix.bin
> ^estación<n><f><sg>/season<n><sg>/station<n><sg>$
>
> Analyses will be output according to lowest weight first. So you can mark
> your default translation as "1.0" and then all others as >1.0 ... because
> of how transfer works, it will always take the first, which will be the one
> with the lowest weight.
>
> 2) Improving POS-tagging accuracy by ordering analyses by probability. This
>     way if your CG doesn't mop up all the ambiguity, you will get the best
>     remaining analysis. This works kind of like the unigram tagger, but
>     because it can be in the analyser itself, it can be easier to control.
>
> 3) Dealing with non-standard forms: instead of having to use LR/RL direction
>     restrictions, you can just make non-standard forms have a high weight and
>     ask lt-proc to only generate the surface form with the lowest weight.
>
> There will no doubt be even more fun stuff that we can do with weights. I
> for one think it's very exciting and would encourage people to play around
> with it and see what they can come up with.
>
> Fran
>
>
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> Apertium-stuff mailing list
> Apertium-stuff@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
>


-- 
< Xavi Ivars >
< http://xavi.ivars.me >