LT flags 'students', suggests 'pupils'
Malaysia was colonised by the British, so British English is more or less the standard. But American English has now gained acceptance here too. Even as early as 20 years ago, when I was in university, our English lecturers allowed both varieties of spelling. Some American spellings have even become more widely used, e.g., 'students'. So I was a little surprised to find it flagged.

I learnt that this is not found in grammar.xml, so I can't just untick it in Options. The file that provides for this is replace.txt under en-GB. For my own purposes, I can remove the file to solve the problem. Less tech-savvy users may find it annoying.

Request: Provide an option to not flag this category in Options. Move the 8 American/British phrase rules now in grammar.xml into replace.txt.

Also, I noticed some spelling errors in replace.txt:
* trash=rubbis (rubbish)
* trashcan=dustbin (trash can)

kb

--
October Webinars: Code for Performance
Free Intel webinars can help you accelerate application performance. Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most from the latest Intel processors and coprocessors. See abstracts and register
http://pubads.g.doubleclick.net/gampad/clk?id=60134071iu=/4140/ostg.clktrk
___
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel
Intro
Hi,

I am new to LanguageTool and have just learned about it as the HU translator of OmegaT. I wonder if it would be possible to start specifying rules for Hungarian.

Best wishes to the List.
Karoly

Karoly Fabricz, Ph.D.
Logon Translation Agency
H-6725 Szeged, Szentháromság u. 57.
Phone/fax: 62 423 585
Business hours: M-F, 8.00 a.m. -12.00 a.m.
Net: www.logon.co.hu
RE: Regional variants of Catalan (ca-ES-valencia)
All,

The structure that Marcin suggests seems good to me.

Regards,
Mike Unwalla
Contact: www.techscribe.co.uk/techw/contact.htm

-----Original Message-----
From: Marcin Milkowski [mailto:list-addr...@wp.pl]

[snip]
Agreed, and we do have Esperanto, which is a language created in the hope of erasing boundaries between peoples and countries. For this reason, Simple Technical English or Easy German could be en-ANY-ASD_STD or de-ANY-easy. What do you think?

Marcin
[snip]
Re: Regional variants of Catalan (ca-ES-valencia)
On 2013-10-07 16:03, Marcin Miłkowski wrote:

> For this reason, Simple Technical English or Easy German could be
> en-ANY-ASD_STD or de-ANY-easy.

de-DE-x-simple-language, which we currently use, should already be in accordance with BCP 47 (http://tools.ietf.org/html/bcp47). The x- marks a private-use tag. I'm not sure if we can just use "easy" without the x-.

Regards
Daniel

--
http://www.danielnaber.de
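To make the role of the x- marker concrete, here is a minimal Python sketch that splits a language tag at the private-use singleton. It is only an illustration: the function names are made up for this example, it does not check subtags against the IANA registry, and it is not how LanguageTool itself handles language codes.

```python
def split_private_use(tag):
    """Split a tag into (standard subtags, private-use subtags).

    Everything after the first singleton "x" is treated as private use.
    """
    subtags = tag.split("-")
    lowered = [s.lower() for s in subtags]
    if "x" in lowered:
        i = lowered.index("x")
        return subtags[:i], subtags[i + 1:]
    return subtags, []


def is_wellformed_private_use(private_subtags):
    """Private-use subtags must each be 1 to 8 ASCII letters or digits."""
    return bool(private_subtags) and all(
        s.isascii() and s.isalnum() and 1 <= len(s) <= 8
        for s in private_subtags
    )


standard, private = split_private_use("de-DE-x-simple-language")
print(standard)  # ['de', 'DE']
print(private)   # ['simple', 'language']
print(is_wellformed_private_use(private))  # True
```

A tag such as de-ANY-easy has no x singleton, so the sketch returns an empty private-use part; whether its subtags are acceptable as standard subtags is a separate question that this code does not answer.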
Re: Intro
On 2013-10-09 10:40, Logon wrote:

Hi Karoly,

> I am new to LanguageTool, have just learned about it as the HU
> translator of OmegaT. I wonder if it would be possible to start
> specifying rules for Hungarian.

Thanks for your interest in LanguageTool. We're very much interested in rules for Hungarian. As we don't support Hungarian yet, some code changes are needed. If you're a Java developer and want to give it a try, the process is described at http://wiki.languagetool.org/adding-a-new-language

If you're not, we'll make the changes for you. But for now, I suggest you just take some other language, remove the existing rules in its grammar.xml file, and add your new rules. Once you have some working rules, we will add real support for Hungarian and it will get its own grammar.xml file. I suggest you use the English grammar.xml for this. Some things will not work yet; for example, you cannot address a word's part of speech, because a Hungarian dictionary is needed for that.

I guess you have already found the documentation at http://languagetool.org/development/? Let us know if you need help.

Regards
Daniel

--
http://www.danielnaber.de
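The "empty out an existing grammar.xml and add your own rules" step can also be scripted. The following is a minimal Python sketch under stated assumptions: it works on a tiny inline sample rather than the real English grammar.xml, the helper names are hypothetical, and the rule it adds uses placeholder tokens (foo foo) instead of real Hungarian. Real rules normally need more than this (for instance example sentences), so see the development documentation for the full format.

```python
import xml.etree.ElementTree as ET

# Tiny stand-in for a grammar.xml file (assumption: the real file is far larger).
SAMPLE = """<rules lang="en">
  <category name="Misc">
    <rule id="OLD_RULE" name="an existing rule">
      <pattern><token>old</token></pattern>
      <message>Old message</message>
    </rule>
  </category>
</rules>"""


def strip_rules(root):
    """Remove every rule and rulegroup, keeping the category skeleton."""
    for category in root.iter("category"):
        for child in list(category):
            if child.tag in ("rule", "rulegroup"):
                category.remove(child)


def add_token_rule(category, rule_id, name, tokens, message):
    """Append a simple rule that matches a fixed sequence of tokens."""
    rule = ET.SubElement(category, "rule", id=rule_id, name=name)
    pattern = ET.SubElement(rule, "pattern")
    for token in tokens:
        ET.SubElement(pattern, "token").text = token
    ET.SubElement(rule, "message").text = message
    return rule


root = ET.fromstring(SAMPLE)
strip_rules(root)
category = root.find("category")
# HU_DEMO and its tokens are placeholders, not a real Hungarian rule.
add_token_rule(category, "HU_DEMO", "demo rule", ["foo", "foo"], "Repeated word?")
print(ET.tostring(root, encoding="unicode"))
```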
Re: Intro
It might be possible to generate Hungarian POS tags from the information in the Hungarian Hunspell dictionary.

Ruud
Re: Spell checkers
I am very interested if data for German is available!

--Jan

R.J. Baars r.j.ba...@xs4all.nl wrote:

> Are there any people out there interested in new probable words for
> their spell checkers and/or POS tagging files?
>
> The data that I can create are:
> - language
> - word
> - number of times found within correct words
> - indication of likelihood for the language (based on character
>   distribution)
> - when desired, word context
>
> Anyone interested? It is not for all languages yet, but if you are
> interested and your language is not available yet, we can cooperate
> on getting the data.
>
> Ruud
Re: Spell checkers
It would be great to get it for Slovenian. Unfortunately, I am still lost when it comes to creating a POS tag dictionary for use with LT.

Regards,
m.

2013/10/9 Jan Schreiber j...@duplexnegatioaffirmat.com

> I am very interested if data for German is available!
Re: Intro
On 2013-10-09 14:04, Daniel Naber wrote:

> I guess you have already found the documentation at
> http://languagetool.org/development/? Let us know if you need help.

Also, there is an old version of LanguageTool in Python that had rules for Hungarian:

http://sourceforge.net/p/languagetool/code/HEAD/tree/trunk/archive/languagetool/rules/hugrammar.xml

You could probably adapt some of these rules. There are also Hungarian rules in LightProof that you might want to adapt:

http://numbertext.org/lightproof/

Regards,
Marcin
Re: LT flags 'students', suggests 'pupils'
On 2013-10-09 10:27, Kumara Bhikkhu wrote:

> lecturers allowed both varieties of spelling. Some American spellings
> have even become more widely used, e.g., 'students'. So I was a
> little surprised to find it flagged. I learnt that this is not found
> in grammar.xml, so I can't just untick it in Options.

It's under Misc: "American words easily confused in British English".

> Move the 8 American/British phrase rules now in grammar.xml into
> replace.txt. Also, I noticed some spelling errors in replace.txt:
> * trash=rubbis (rubbish)

Thanks, I have fixed that.

> * trashcan=dustbin (trash can)

I have fixed this, too, but it seems multi-term words (i.e. phrases) are not supported by the rule, so it doesn't get triggered. That's probably why all multi-term words are commented out. Also, because of that, we cannot move all rules into replace.txt.

Regards
Daniel

--
http://www.danielnaber.de
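As a rough illustration of why a phrase entry never fires, here is a minimal Python sketch of a token-by-token lookup. It is not LanguageTool's actual SimpleReplaceRule code. The wrong=right line format is taken from the entries quoted above; the # comment marker, the function names, and the whitespace tokenisation are assumptions made for this example.

```python
def load_replacements(lines):
    """Parse wrong=right lines into a dict, skipping blanks and # comments."""
    table = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        wrong, _, right = line.partition("=")
        table[wrong.strip()] = right.strip()
    return table


def find_replacements(text, table):
    """Look up each single token; a key containing a space can never match."""
    hits = []
    for token in text.split():
        word = token.strip(".,;:!?'\"").lower()
        if word in table:
            hits.append((word, table[word]))
    return hits


table = load_replacements([
    "# sample entries, not the real replace.txt",
    "trash=rubbish",
    "trash can=dustbin",
])
print(find_replacements("Put the trash in the trash can.", table))
# [('trash', 'rubbish'), ('trash', 'rubbish')]
```

The phrase key "trash can" is loaded into the table but is never looked up, because the text is compared one token at a time. Supporting phrases would need matching over token sequences, which is what the XML rules provide.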
Re: LT flags 'students', suggests 'pupils'
On 2013-10-09 10:27, Kumara Bhikkhu wrote:

> Malaysia was colonised by the British, so British English is more or
> less the standard. But American English has now gained acceptance
> here too. Even as early as 20 years ago, when I was in university,
> our English lecturers allowed both varieties of spelling. Some
> American spellings have even become more widely used, e.g.,
> 'students'. So I was a little surprised to find it flagged. I learnt
> that this is not found in grammar.xml, so I can't just untick it in
> Options. The file that provides for this is replace.txt under en-GB.
> For my own purposes, I can remove the file to solve the problem. Less
> tech-savvy users may find it annoying.

Overall, the idea of flagging "student" at all times was bad. I commented it out.

> Request: Provide an option to not flag this category in Options.

This is very hard, and would defeat the purpose of the SimpleReplaceRule.

> Move the 8 American/British phrase rules now in grammar.xml into
> replace.txt.

This is logically impossible. SimpleReplaceRule does not have all the features of our XML rules.

> Also, I noticed some spelling errors in replace.txt:
> * trash=rubbis (rubbish)
> * trashcan=dustbin (trash can)

Thanks, I fixed those.

Best,
Marcin