On 4 January 2011 16:16, Josep M. Fontana <[email protected]> wrote:
> Sorry if you receive this message more than once. I tried to send it
> from another address and it bounced saying I was not a member of the
> list.
> ------------
>
> Hi,
>

Hi.

> I was trying to convert an Apertium formatted dictionary into a format
> usable by Freeling. From what I saw in
> <http://wiki.apertium.org/wiki/Freeling> the tool to do that is a script
> called 'dix-to-maco.py'. I've tried the conversion but I'm having no
> luck. I wonder whether someone on this list can help me.
>
> OK, here's what I did. I have a file called
> 'apertium-oldca-XX.oldca.dix'. Following instructions from Mikel
> Forcada, I used 'lt-expand' from 'lttoolbox' to expand the dictionary
> which I piped into a file called dict.txt. I put this file together with
> the file 'es-tags.parole.txt' in the same directory and I ran
> 'dix-to-maco.py' as follows:
>
> $./dix-to-maco.py -l dict.txt es-tags.parole.txt
>

'es-tags.parole.txt' will not work. 'oldca' is not part of Apertium,
per se, and uses its own tagset. You will need to supply a mapping of
that tagset to Parole to get usable output.
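
To see what such a mapping has to cover, it helps to list every distinct
tag sequence in the expanded dictionary. This is only a rough sketch:
tag_sequences is a hypothetical helper, not part of Apertium or
Freeling, and it assumes one 'form:lemma<tags>' entry per line, as in
the example from the script header.

```python
import re
from collections import Counter

# Matches one Apertium-style tag such as <n> or <sg>.
TAG = re.compile(r"<[^<>]+>")


def tag_sequences(lines):
    """Count each distinct tag sequence in lt-expand style lines.

    Hypothetical helper: assumes 'form:lemma<tag><tag>...' per line.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if ":" not in line:
            continue  # skip blank or malformed lines
        analysis = line.rsplit(":", 1)[1]  # text after the last colon
        tags = "".join(TAG.findall(analysis))
        if tags:
            counts[tags] += 1
    return counts


if __name__ == "__main__":
    # The first line is the example from the script header; the second
    # is a made-up variant, for illustration only.
    sample = ["tadoù:tad<n><m><sg>", "tadoù:tad<n><m><pl>"]
    for tags, n in tag_sequences(sample).most_common():
        print(n, tags)
```

Each sequence it prints needs one line in the lookup file. Pass it
open('dict.txt', encoding='utf-8') to run it on the real expansion,
assuming the file is UTF-8.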

>
> The program apparently ran without any problems (no error messages) but
> nothing happened. I don't see any new file produced as output and
> nothing has changed in the dict.txt file. I have not been able to find
> any documentation for 'dix-to-maco.py'

It's right there, in the file:
#
# This is a conversion script for the format outputted by lt-expand to the
# format accepted by the freeling indexdict program.
#
# Input is an expanded Apertium dictionary:
#
#   tadoù:tad<n><m><sg>
#
# Output is a Freeling dictionary:
#
#   tadoù tag NCMPV0
#
# To convert the Apertium tagset into a PAROLE-compatible tagset, a file
# with the parole tag and Apertium tag list is used. The two are separated
# by a tab:
#
#   NCMPV0      <n><m><sg>
#
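
The conversion described in that header can be sketched roughly as
follows. This is not the code in dix-to-maco.py: load_lookup and convert
are hypothetical names, and the sketch assumes each output line is the
surface form, the lemma and the PAROLE tag, separated by spaces.

```python
def load_lookup(lines):
    """Build {apertium_tags: parole_tag} from 'PAROLE<TAB>tags' lines."""
    lookup = {}
    for line in lines:
        parts = [p.strip() for p in line.split("\t") if p.strip()]
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        parole, apertium = parts[0], parts[-1]
        lookup[apertium] = parole
    return lookup


def convert(line, lookup):
    """Turn 'form:lemma<tags>' into 'form lemma TAG'.

    Returns None when the line is malformed or its tag sequence is not
    in the lookup table.
    """
    line = line.strip()
    if ":" not in line:
        return None
    form = line.split(":", 1)[0]        # text before the first colon
    analysis = line.rsplit(":", 1)[1]   # text after the last colon
    start = analysis.find("<")
    if start <= 0:
        return None  # no lemma or no tags
    lemma, tags = analysis[:start], analysis[start:]
    parole = lookup.get(tags)
    if parole is None:
        return None
    return "{} {} {}".format(form, lemma, parole)


if __name__ == "__main__":
    # Both sample lines are taken from the script header quoted above.
    lookup = load_lookup(["NCMPV0\t<n><m><sg>"])
    print(convert("tadoù:tad<n><m><sg>", lookup))
```

An entry whose tag sequence is missing from the lookup file is skipped
here, which is one way a mismatched tagset can lead to empty output.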


> and so I don't know whether I have
> to use any particular syntax to obtain an output file with the converted
> format for the dictionary. I tried changing -l for -m or -n with the
> same results. I don't really know what these parameters do but I just tried.
>
> Then I realized that if I did '$python dix-to-maco.py' I got the message:
>
> Usage: ./dix-to-maco.py [-l|-m|-n] <dix file> <parole lookup>
>
>
> So I tried with the name of the original .dix file (not expanded with
> lt-expand) but I also got a bunch of errors:
>
>
>
> jfont...@ubuntu:~/Downloads/apertium-oldca-XX-0.7$ ./dix-to-maco.py -l
> apertium-oldca-XX.oldca.dix es-tags.parole.txt
> Traceback (most recent call last):
>  File "./dix-to-maco.py", line 249, in <module>
>    print key + ' ' +  analyses;
>  File "/usr/lib/python2.6/codecs.py", line 351, in write
>    data, consumed = self.encode(object, self.errors)
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 7:
> ordinal not in range(128)

That's the mating call of 'Badly Written Python'/'Python Sucks At
Unicode'. I'm having a look at it.
>
> -----------------------
>
> jfont...@ubuntu:~/Downloads/apertium-oldca-XX-0.7$ ./dix-to-maco.py -m
> apertium-oldca-XX.oldca.dix es-tags.parole.txt
> casem-s'ho casar+es+ho
> Traceback (most recent call last):
>  File "./dix-to-maco.py", line 249, in <module>
>    print key + ' ' +  analyses;
>  File "/usr/lib/python2.6/codecs.py", line 351, in write
>    data, consumed = self.encode(object, self.errors)
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 8:
> ordinal not in range(128)
>
> --------------------------
>
>
> jfont...@ubuntu:~/Downloads/apertium-oldca-XX-0.7$ ./dix-to-maco.py -n
> apertium-oldca-XX.oldca.dix es-tags.parole.txt
> Nuakchott
> Traceback (most recent call last):
>  File "./dix-to-maco.py", line 249, in <module>
>    print key + ' ' +  analyses;
>  File "/usr/lib/python2.6/codecs.py", line 351, in write
>    data, consumed = self.encode(object, self.errors)
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 7:
> ordinal not in range(128)
>
>
> ---
>
> Finally, I tried changing the extension of the file created with
> lt-expand to .dix, which also gave me an error.
>
> jfont...@ubuntu:~/Downloads/apertium-oldca-XX-0.7$ ./dix-to-maco.py -lm
> dict-freeling.txt es-tags.parole.txt
> Traceback (most recent call last):
>  File "./dix-to-maco.py", line 222, in <module>
>    analysis = row[1].strip();
> IndexError: list index out of range
>
> ----------
>
> Obviously I'm doing something wrong. Could anybody lend me a hand with
> this? Thanks in advance.
>
>
> Josep M.
>
> ------------------------------------------------------------------------------
> Learn how Oracle Real Application Clusters (RAC) One Node allows customers
> to consolidate database storage, standardize their database environment, and,
> should the need arise, upgrade to a full multi-node Oracle RAC database
> without downtime or disruption
> http://p.sf.net/sfu/oracle-sfdevnl
> _______________________________________________
> Apertium-stuff mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
>



-- 
<Leftmost> jimregan, that's because deep inside you, you are evil.
<Leftmost> Also not-so-deep inside you.
