Hello, I have a few ideas about the "Extracting knowledge from Apertium's 
post-edition logs to improve translation" project.

- As a first idea for a possible solution: searching all the different logs 
that contain a corrected word would not be a problem using the File::Find 
module in Perl, which does a depth-first traversal of the directory tree, but 
it would not really be an "on-the-fly" operation. (It is still helpful. It 
could also be done with a breadth-first search implemented in a custom Perl 
script.)

- To find all the matches for a corrected word, the original form of the 
word would be used, and every instance of the corrected word would be 
recorded, let's say in another file. (Whether this file would be maintained on 
a daily basis, or only for the most frequently corrected words, is a matter 
for further discussion.) This would be done with regular expressions, using 
the properties of the word class.

- And last but not least, the word stored in the graphical translator would be 
chosen from the most frequently corrected words.
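
The three steps above could be sketched roughly as follows. This is only an 
illustration under assumptions of mine: the log format (one 
"original<TAB>corrected" pair per line), the ".log" file extension, and the 
function names find_logs, count_corrections and best_correction are all 
hypothetical, since the real post-edition log format is not decided here.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use Encode qw(decode);

# Step 1: collect every *.log file under $root. File::Find walks the
# directory tree depth-first and calls the callback once per entry.
sub find_logs {
    my ($root) = @_;
    my @files;
    find(sub { push @files, $File::Find::name if -f $_ && /\.log\z/ }, $root);
    return sort @files;
}

# Step 2: count how often each correction was made for $word.
# ASSUMED log format: one "original<TAB>corrected" pair per line.
sub count_corrections {
    my ($word, @files) = @_;
    my %count;
    for my $file (@files) {
        open my $fh, '<:encoding(UTF-8)', $file
            or die "Cannot open $file: $!";
        while (my $line = <$fh>) {
            chomp $line;
            # \Q...\E quotes any regex metacharacters inside the word.
            $count{$1}++ if $line =~ /^\Q$word\E\t(.+)$/;
        }
        close $fh;
    }
    return %count;
}

# Step 3: return the most frequent correction (ties broken
# alphabetically), or undef if nothing was logged for the word.
sub best_correction {
    my (%count) = @_;
    my ($best) = sort { $count{$b} <=> $count{$a} || $a cmp $b } keys %count;
    return $best;
}

# Command-line use: script.pl LOG_DIR WORD
if (@ARGV == 2) {
    my ($root, $word) = @ARGV;
    $word = decode('UTF-8', $word);
    binmode STDOUT, ':encoding(UTF-8)';
    my %count = count_corrections($word, find_logs($root));
    my $best  = best_correction(%count);
    print defined $best ? "$best\n" : "no corrections found\n";
}
```

Called with a log directory and a word, the script prints the correction that 
was logged most often for that word. The counts in %count could just as well 
be written to the separate file mentioned above instead of being recomputed 
on every run.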

Should I put together a specific example with logs I create myself? Please let me know.

Thank you in advance.
Greetings,
 Anastasija Efremovska
_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff