Hi,
I am trying to convert an Apertium-format dictionary into a format
usable by Freeling. According to
<http://wiki.apertium.org/wiki/Freeling>, the tool for this is a script
called 'dix-to-maco.py'. I've tried the conversion but I'm having no
luck. I wonder whether someone on this list can help me.
OK, here's what I did. I have a file called
'apertium-oldca-XX.oldca.dix'. Following instructions from Mikel
Forcada, I used 'lt-expand' from 'lttoolbox' to expand the dictionary
and redirected the output into a file called dict.txt. I put this file
in the same directory as 'es-tags.parole.txt' and ran
'dix-to-maco.py' as follows:
$ ./dix-to-maco.py -l dict.txt es-tags.parole.txt
The program apparently ran without any problems (no error messages) but
nothing happened. I don't see any new output file, and nothing has
changed in dict.txt. I have not been able to find any documentation for
'dix-to-maco.py', so I don't know whether I have to use a particular
syntax to obtain an output file with the converted dictionary. I tried
replacing -l with -m or -n, with the same results. I don't really know
what these options do; I just tried them.
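For reference, lt-expand writes plain text rather than XML: one entry per
line, normally of the form surface:lemma<tag1><tag2>, with ':>:' or ':<:'
in place of ':' for entries restricted to one direction. The Python
sketch below splits such lines apart. The sample entries are made up for
illustration, and parse_expanded_line is a hypothetical helper, not part
of dix-to-maco.py or lttoolbox; whether dix-to-maco.py expects this
format at all is something I have not been able to confirm.

```python
import re

def parse_expanded_line(line):
    """Split one lt-expand style line into (surface, lemma, tags).

    Returns None when the line has no ':' separator at all.
    For multi-part analyses only the first lemma is returned.
    """
    line = line.rstrip("\n")
    # Direction-restricted separators are checked before the plain ':'.
    for sep in (":>:", ":<:", ":"):
        if sep in line:
            surface, analysis = line.split(sep, 1)
            lemma = analysis.split("<", 1)[0]
            tags = re.findall(r"<([^>]+)>", analysis)
            return surface, lemma, tags
    return None

# Made-up sample entries, not taken from apertium-oldca-XX.oldca.dix.
for sample in ["cases:casa<n><f><pl>", "casa:>:casa<n><f><sg>", "no separator"]:
    print(parse_expanded_line(sample))
```

Running something like this over dict.txt would at least show whether
every line has the two parts a converter would need.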
Then I realized that running 'python dix-to-maco.py' without arguments
prints this message:
Usage: ./dix-to-maco.py [-l|-m|-n] <dix file> <parole lookup>
So I tried passing the original .dix file (not expanded with
lt-expand), but that gave me an error with each of the three options:
jfont...@ubuntu:~/Downloads/apertium-oldca-XX-0.7$ ./dix-to-maco.py -l apertium-oldca-XX.oldca.dix es-tags.parole.txt
Traceback (most recent call last):
  File "./dix-to-maco.py", line 249, in <module>
    print key + ' ' + analyses;
  File "/usr/lib/python2.6/codecs.py", line 351, in write
    data, consumed = self.encode(object, self.errors)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 7: ordinal not in range(128)
-----------------------
jfont...@ubuntu:~/Downloads/apertium-oldca-XX-0.7$ ./dix-to-maco.py -m apertium-oldca-XX.oldca.dix es-tags.parole.txt
casem-s'ho casar+es+ho
Traceback (most recent call last):
  File "./dix-to-maco.py", line 249, in <module>
    print key + ' ' + analyses;
  File "/usr/lib/python2.6/codecs.py", line 351, in write
    data, consumed = self.encode(object, self.errors)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 8: ordinal not in range(128)
--------------------------
jfont...@ubuntu:~/Downloads/apertium-oldca-XX-0.7$ ./dix-to-maco.py -n apertium-oldca-XX.oldca.dix es-tags.parole.txt
Nuakchott
Traceback (most recent call last):
  File "./dix-to-maco.py", line 249, in <module>
    print key + ' ' + analyses;
  File "/usr/lib/python2.6/codecs.py", line 351, in write
    data, consumed = self.encode(object, self.errors)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 7: ordinal not in range(128)
---
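A note on the error itself: 0xc3 is the first byte of the UTF-8 encoding
of accented letters such as à, é or ò, so each run seems to stop at the
first entry that contains a non-ASCII character. The Python 3 sketch
below reproduces the same kind of failure by decoding UTF-8 bytes as
ASCII. It uses a made-up sample word and a hypothetical helper,
decode_as_ascii; it is not code from dix-to-maco.py, which runs under
Python 2.6 here.

```python
def decode_as_ascii(data):
    """Try to decode bytes as ASCII; return the text or the error message."""
    try:
        return data.decode("ascii")
    except UnicodeDecodeError as err:
        return "error: %s" % err

# Made-up sample word; the final letter is encoded as the bytes 0xc3 0xa0.
sample = "casà".encode("utf-8")

print([hex(b) for b in sample])   # byte values of the UTF-8 encoding
print(decode_as_ascii(sample))    # fails at the 0xc3 byte
print(sample.decode("utf-8"))     # decoding with the matching codec works
```

If this is indeed the cause, the data needs to be decoded as UTF-8
somewhere before it is printed, though I can't say where in the script
that should happen.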
Finally, I tried changing the extension of the file created with
lt-expand to .dix, which also gave me an error:
jfont...@ubuntu:~/Downloads/apertium-oldca-XX-0.7$ ./dix-to-maco.py -lm dict-freeling.txt es-tags.parole.txt
Traceback (most recent call last):
  File "./dix-to-maco.py", line 222, in <module>
    analysis = row[1].strip();
IndexError: list index out of range
----------
Obviously I'm doing something wrong. Could anybody lend me a hand with
this? Thanks in advance.
Josep M.