Hi, I have tried translating some texts and got the translation as a large text file that includes all the error codes. I would like a frequency list of the words that get a certain error.
Example:

Odenses *infrastrukur är präglad av @beliggenhed vid Odense Kanal, som *forbinder Odense Hamn med Odense Fjord. Den blev byggd i @åre omkring år 1800 och ger entré från vattnet till stadens centrum. *Herudover har den #ha betydelse for @infrastruktur vid placeringen av *kraftvarmeværket *Fynsværket och den tidigare *losseplads på Stege Ö.

I looked at the page http://wiki.apertium.org/wiki/One-liners and found these scripts:

Get unknown words from chunked text and sort by frequency:

    sed 's/\$\W*\^/$\n^/g' | grep '@' | sed 's/><.*/>$/g' | sort -f | uniq -ci | sort -gr

    tr " " "\n" | grep "@" | tr -d "[:punct:]" | sort | uniq -c | sort -r

Unfortunately, I cannot understand how to use them. How do I specify the input and output files? By the way, what scripting language is this?

Yours,
Per Tunedal

_______________________________________________
Apertium-stuff mailing list
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
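
For readers with the same question: the one-liners quoted above are Unix shell pipelines built from standard command-line tools (sed, grep, tr, sort, uniq), and input and output files are usually attached with shell redirection. The following is only a minimal sketch of running the second pipeline on a file. The file names translation.txt and at-words.txt, the marker variable, and the two-line sample text are assumptions made for illustration, not part of the original post. It also swaps in sort -nr for sort -r so that the counts are compared numerically.

```shell
#!/bin/sh
# Hypothetical file names and marker (not from the original post); change as needed.
input=translation.txt
output=at-words.txt
marker='@'

# Made-up two-line sample so the sketch runs as-is.
printf '%s\n' \
  'Odenses *infrastrukur av @beliggenhed vid Odense Kanal' \
  'Den #ha betydelse for @infrastruktur och @beliggenhed.' > "$input"

# "< file" feeds the input file to the first command;
# "> file" writes the result of the last command to the output file.
tr ' ' '\n' < "$input" |     # put one token on each line
  grep -F -- "$marker" |     # keep only tokens carrying the marker
  tr -d '[:punct:]' |        # strip the marker and other punctuation
  sort | uniq -c |           # count identical words
  sort -nr > "$output"       # most frequent first

cat "$output"
```

With a real translation file, the printf step would be dropped and input pointed at that file. Setting marker='*' or marker='#' should count the words carrying those markers instead.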
