Hello developers,
I am a student working on the idea *Extend lttoolbox to have the power of
HFST* for my GSoC project. I am writing to describe the modifications that
are now part of lttoolbox, and I would like all of you to try them out. As
part of my coding challenge I developed a module that converts *lexc* files
to the *dix* file format. The repo for the package is
https://github.com/Techievena/lexc2dix. The changes currently in lttoolbox
are described below.
First, lttoolbox now supports weights in the binary files. Here is a
snippet showing that:
$ cat test.att
0 1 c c 4.567895
1 2 a a 0.989532
2 3 t t 2.796193
3 4 @0@ + -3.824564
4 5 @0@ n 1.824564
5 0.525487
4 5 @0@ v 2.845989
$ lt-comp lr test.att test.bin
main@standard 6 6
$ lt-print test.bin
0 1 c c 4.567895
1 2 a a 0.989532
2 3 t t 2.796193
3 4 ε + -3.824564
4 5 ε n 1.824564
4 5 ε v 2.845989
5 0.525487
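To make the layout of test.att easier to read, here is a minimal Python
sketch of how such a file can be split into transitions and final states.
It is only an illustration, not the parser used inside lt-comp. It assumes
whitespace-separated fields, that a line with four or five fields is a
transition (source, target, input symbol, output symbol, optional weight),
that a shorter line is a final state with an optional weight, and that a
missing weight counts as 0. The names parse_att and ATT_TEXT are made up
for this example.

```python
from decimal import Decimal

# Contents of test.att, copied from the snippet in this mail
ATT_TEXT = """\
0 1 c c 4.567895
1 2 a a 0.989532
2 3 t t 2.796193
3 4 @0@ + -3.824564
4 5 @0@ n 1.824564
5 0.525487
4 5 @0@ v 2.845989
"""


def parse_att(text):
    """Split ATT-style lines into transitions and final states.

    Assumption: fields are separated by whitespace and a missing
    weight is treated as 0.
    """
    arcs = []
    finals = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) >= 4:
            # Transition: source, target, input symbol, output symbol, [weight]
            src, dst, upper, lower = fields[:4]
            weight = Decimal(fields[4]) if len(fields) > 4 else Decimal(0)
            arcs.append((int(src), int(dst), upper, lower, weight))
        else:
            # Final state: state, [weight]
            weight = Decimal(fields[1]) if len(fields) > 1 else Decimal(0)
            finals[int(fields[0])] = weight
    return arcs, finals


arcs, finals = parse_att(ATT_TEXT)

# Print transitions first and final states last, showing @0@ as ε
for src, dst, upper, lower, weight in arcs:
    shown = "ε" if upper == "@0@" else upper
    print(src, dst, shown, lower, weight)
for state, weight in finals.items():
    print(state, weight)
```

Decimal is used instead of float so that the six-digit weights are kept
exactly as written in the file.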
The second modification is that *lt-proc* now has optional arguments to
support printing of weights. Here is a snippet showing that:
$ echo "cats" | lt-proc test.bin
^cat/cat+n/cat+v$s
$ echo "cats" | lt-proc -W test.bin
^cat/cat+n<W:6.353620>/cat+v<W:7.375045>$s
$ echo "cats" | lt-proc -N 1 test.bin
^cat/cat+n$s
$ echo "cats" | lt-proc -W -N 1 test.bin
^cat/cat+n<W:6.353620>$s
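The weights printed here can be checked by hand: each one equals the sum of
the transition weights along the path for that analysis in test.att. The
final-state weight 0.525487 does not appear to be included in these
figures. Below is a minimal Python sketch of that arithmetic and of the
N-best selection. It is not the code inside lt-proc. It assumes that a
lower weight is better, which is inferred from -N 1 keeping cat+n; the
names PATHS, path_weight and n_best are made up for this example.

```python
from decimal import Decimal

# Transition weights along each analysis path, copied from test.att.
# The final-state weight (0.525487) is deliberately left out, because
# the values printed by lt-proc -W match the sums without it.
PATHS = {
    "cat+n": ["4.567895", "0.989532", "2.796193", "-3.824564", "1.824564"],
    "cat+v": ["4.567895", "0.989532", "2.796193", "-3.824564", "2.845989"],
}


def path_weight(weights):
    """Add up the transition weights of one path."""
    return sum((Decimal(w) for w in weights), Decimal(0))


def n_best(analyses, n):
    """Return the n analyses with the lowest weight.

    Assumption: a lower weight means a better analysis.
    """
    ranked = sorted(analyses.items(), key=lambda item: item[1])
    return ranked[:n]


weights = {form: path_weight(ws) for form, ws in PATHS.items()}

# Show every analysis with its weight, then only the single best one
for form, weight in weights.items():
    print(f"{form}<W:{weight}>")
for form, weight in n_best(weights, 1):
    print(f"best: {form}<W:{weight}>")
```

With these inputs the sums are 6.353620 for cat+n and 7.375045 for cat+v,
the same values that appear in the -W output.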
In total, lt-proc now has three more arguments: *-W*, *-N* and *-L*.
$ lt-proc -h
lt-proc: process a stream with a letter transducer
USAGE: lt-proc [ -a | -b | -c | -d | -e | -g | -n | -p | -s | -t | -v | -h -z -w ]
               [-W] [-N N] [-L N] [ -i icx_file ] [ -r rcx_file ] fst_file
               [input_file [output_file]]
Options:
  -a, --analysis:          morphological analysis (default behavior)
  -b, --bilingual:         lexical transfer
  -c, --case-sensitive:    use the literal case of the incoming characters
  -d, --debugged-gen       morph. generation with all the stuff
  -e, --decompose-nouns:   Try to decompound unknown words
  -g, --generation:        morphological generation
  -i, --ignored-chars:     specify file with characters to ignore
  -r, --restore-chars:     specify file with characters to diacritic restoration
  -l, --tagged-gen:        morphological generation keeping lexical forms
  -m, --tagged-nm-gen:     same as -l but without unknown word marks
  -n, --non-marked-gen     morph. generation without unknown word marks
  -o, --surf-bilingual:    lexical transfer with surface forms
  -p, --post-generation:   post-generation
  -s, --sao:               SAO annotation system input processing
  -t, --transliteration:   apply transliteration dictionary
  -v, --version:           version
  -z, --null-flush:        flush output on the null character
  -w, --dictionary-case:   use dictionary case instead of surface case
  -C, --careful-case:      use dictionary case if present, else surface
  -I, --no-default-ignore: skips loading the default ignore characters
  -W, --show-weights:      Print final analysis weights (if any)
  -N, --analyses:          Output no more than N analyses (if the
                           transducer is weighted, the N best analyses)
  -L, --weight-classes:    Output no more than N best weight classes
                           (where analyses with equal weight constitute a class)
  -h, --help:              show this help
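The difference between -N and -L is easiest to see with a tie. The
following Python sketch illustrates the "weight class" idea from the help
text: analyses with equal weight form one class, and the N best classes are
kept whole. It is only a sketch of the described behaviour, not the lt-proc
implementation. It assumes that a lower weight is better and that weights
are compared for exact equality. The analysis cat+x and the function
best_weight_classes are hypothetical, added only to create a tie.

```python
from decimal import Decimal
from itertools import groupby

# Two analyses from the cat example plus one hypothetical analysis
# (cat+x) that ties with cat+n, so that one class has two members.
ANALYSES = [
    ("cat+n", Decimal("6.353620")),
    ("cat+v", Decimal("7.375045")),
    ("cat+x", Decimal("6.353620")),  # hypothetical, for illustration only
]


def best_weight_classes(analyses, n):
    """Keep every analysis that belongs to one of the n best weight classes.

    Assumption: lower weight is better, and equal weights form one class.
    """
    ranked = sorted(analyses, key=lambda item: item[1])
    kept = []
    for index, (_, group) in enumerate(groupby(ranked, key=lambda item: item[1])):
        if index >= n:
            break  # all remaining classes are worse than the n best
        kept.extend(group)
    return kept


# One weight class can contain several analyses
for form, weight in best_weight_classes(ANALYSES, 1):
    print(f"{form}<W:{weight}>")
```

With this input, limiting to one weight class keeps both cat+n and the
hypothetical cat+x, whereas limiting to one analysis would keep only one of
them.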
Please try these changes out and send me your feedback.
Regards,
Abinash Senapati
Final Year Undergraduate,
Department of Electronics and Electrical Communication Engineering,
Indian Institute of Technology, Kharagpur-721302, India
Website: https://techievena.github.io
_______________________________________________
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff