LT flags 'students', suggests 'pupils'

2013-10-09 Thread Kumara Bhikkhu
Malaysia was colonised by the British, so British English is sort of 
the standard. But now American English has gained acceptance here too. 
Even as early as 20 years ago, when I was in university, our English 
lecturers allowed both varieties of spelling. Some American spellings 
have even become more widely used, e.g., 'students'. So, I was a 
little bit surprised to find it flagged.


I learnt that this is not found in grammar.xml. So, I can't just 
untick it in Options.


The file that provides this is replace.txt under en-GB. For my own 
purposes, I can remove the file to solve the problem, but less 
tech-savvy users may find the flagging annoying.


Request:
   * Provide an option in Options to not flag this category.
   * Move the 8 American/British phrase rules now in grammar.xml into 
     replace.txt.


Also, I noticed some spelling errors in replace.txt:
   * trash=rubbis (rubbish)
   * trashcan=dustbin (trash can)

kb

--
October Webinars: Code for Performance
Free Intel webinars can help you accelerate application performance.
Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most from 
the latest Intel processors and coprocessors. See abstracts and register 
http://pubads.g.doubleclick.net/gampad/clk?id=60134071iu=/4140/ostg.clktrk
___
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel


Intro

2013-10-09 Thread Logon
Hi,

I am new to LanguageTool and have just learned about it as the HU (Hungarian) 
translator of OmegaT.

I wonder if it would be possible to start specifying rules for Hungarian.

Best wishes to the List.

Karoly

Karoly Fabricz, Ph.D.

Logon Translation Agency
H-6725 Szeged, Szentháromság u. 57.
Phone/fax: 62 423 585
Business hours: M-F, 8.00 a.m. -12.00 a.m.
Net: www.logon.co.hu 




RE: Regional variants of Catalan (ca-ES-valencia)

2013-10-09 Thread Mike Unwalla
All,

The structure that Marcin suggests seems good to me.

Regards,

Mike Unwalla
Contact: www.techscribe.co.uk/techw/contact.htm 



-----Original Message-----
From: Marcin Milkowski [mailto:list-addr...@wp.pl] 
snip

Agreed, and we do have Esperanto, which is a language created in the hope of 
erasing boundaries between peoples and countries.

For this reason, Simple Technical English or Easy German could be 
en-ANY-ASD_STD or de-ANY-easy.

What do you think?

Marcin
snip




Re: Regional variants of Catalan (ca-ES-valencia)

2013-10-09 Thread Daniel Naber
On 2013-10-07 16:03, Marcin Miłkowski wrote:

 For this reason, Simple Technical English or Easy German could be
 en-ANY-ASD_STD or de-ANY-easy.

de-DE-x-simple-language, which we currently use, should already be in 
accordance with BCP 47 (http://tools.ietf.org/html/bcp47). The x- marks 
the start of a private-use part. I'm not sure if we can just use "easy" 
without the x-.
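
As a rough illustration of the private-use point (a sketch only, not
LanguageTool code; the function names are made up for this example): in
BCP 47 a single-letter x subtag starts the private-use part, and every
subtag after it only has to be 1 to 8 alphanumeric characters. A variant
subtag, by contrast, needs 5 to 8 alphanumerics, or 4 characters starting
with a digit, which is why a bare "easy" would not be well-formed as a
variant. The check below covers only these two syntax rules; it does not
validate a whole tag or consult the subtag registry.

```python
import re

# Syntax of a single subtag after "x" (private use): 1 to 8 alphanumerics.
PRIVATE_SUBTAG = re.compile(r"[A-Za-z0-9]{1,8}")
# Syntax of a variant subtag: 5 to 8 alphanumerics, or a digit plus 3 alphanumerics.
VARIANT_SUBTAG = re.compile(r"[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}")


def split_private_use(tag):
    """Split a tag into (subtags before 'x', subtags after 'x')."""
    subtags = tag.split("-")
    for i, subtag in enumerate(subtags):
        if subtag.lower() == "x":
            return subtags[:i], subtags[i + 1:]
    return subtags, []


def private_use_is_wellformed(tag):
    """True if the tag has a private-use part and all its subtags fit the syntax.

    Returns False when the tag has no 'x' subtag at all.
    """
    _, private = split_private_use(tag)
    return bool(private) and all(PRIVATE_SUBTAG.fullmatch(s) for s in private)


def could_be_variant(subtag):
    """True if the subtag has the shape of a variant (syntax only, no registry lookup)."""
    return VARIANT_SUBTAG.fullmatch(subtag) is not None


if __name__ == "__main__":
    print(split_private_use("de-DE-x-simple-language"))
    print(private_use_is_wellformed("de-DE-x-simple-language"))
    print(could_be_variant("easy"))
```

With these rules, "simple" and "language" are fine after x-, while "easy"
on its own is too short to have the shape of a variant.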

Regards
  Daniel

-- 
http://www.danielnaber.de




Re: Intro

2013-10-09 Thread Daniel Naber
On 2013-10-09 10:40, Logon wrote:

Hi Karoly,

 I am new to LanguageTool, have just learned about it as the HU 
 translator of
 OmegaT.
 
 I wonder if it would be possible to start specifying rules for 
 Hungarian.

Thanks for your interest in LanguageTool; we're very much interested in 
rules for Hungarian. As we don't support Hungarian yet, some code 
changes are needed. If you're a Java developer and want to give it a 
try, the process is described at 
http://wiki.languagetool.org/adding-a-new-language

If you're not, we'll make the changes for you. But for now, I suggest 
you just take some other language and remove the existing rules in its 
grammar.xml file and add your new rules. Once you have some working 
rules we will add real support for Hungarian and it will get its own 
grammar.xml file.

I suggest you use the English grammar.xml for this. Some things will not 
work yet; for example, you cannot match on a word's part of speech, 
because that requires a Hungarian dictionary.

I guess you have already found the documentation at 
http://languagetool.org/development/? Let us know if you need help.

Regards
  Daniel

-- 
http://www.danielnaber.de




Re: Intro

2013-10-09 Thread R.J. Baars

It might be possible to generate Hungarian POS tags from the information in
the Hungarian Hunspell dictionary.

Ruud

 On 2013-10-09 10:40, Logon wrote:

 Hi Karoly,

 I am new to LanguageTool, have just learned about it as the HU
 translator of
 OmegaT.

 I wonder if it would be possible to start specifying rules for
 Hungarian.

 thanks for your interest in LanguageTool, we're very much interested in
 rules for Hungarian. As we don't support Hungarian yet, some code
 changes are needed. If you're a Java developer and want to give it a
 try, the process is described at
 http://wiki.languagetool.org/adding-a-new-language

 If you're not, we'll make the changes for you. But for now, I suggest
 you just take some other language and remove the existing rules in its
 grammar.xml file and add your new rules. Once you have some working
 rules we will add real support for Hungarian and it will get its own
 grammar.xml file.

 I suggest you use the English grammar.xml for this. Some things will not
 work yet, for example you cannot address a word's part-of-speech because
 a Hungarian dictionary is needed for that.

 I guess you have already found the documentation at
 http://languagetool.org/development/? Let us know if you need help.

 Regards
   Daniel

 --
 http://www.danielnaber.de








Re: Spell checkers

2013-10-09 Thread Jan Schreiber
I am very interested if data for German is available! --Jan

R.J. Baars r.j.ba...@xs4all.nl wrote:

Are there any people out there interested in new probable words for their
spell checkers and/or postagging files?

The data that I can create are:
- language
- word
- # of times found within correct words
- indication of likelihood for the language (based on character distribution)
- when desired, word context.

Anyone interested?

The data is not available for all languages yet, but if you are interested
and your language is not covered yet, we can cooperate on getting the data.

Ruud






Re: Spell checkers

2013-10-09 Thread Martin Srebotnjak
It would be great to get it for Slovenian.

Unfortunately, I am still stuck on how to create a POS tag dictionary for
use with LT.

Regards, m.


2013/10/9 Jan Schreiber j...@duplexnegatioaffirmat.com

 I am very interested if data for German is available! --Jan

 R.J. Baars r.j.ba...@xs4all.nl wrote:

 Are there any people out there interested in new probable words for their
 spell checkers and/or postagging files?
 
 The data that I can create are:
 - language
 - word
 - # of times found within correct words
 - indication of likelihood for the language (based on character
 distribution)
 - when desired, word context.
 
 Anyone interested?
 
 The data is not available for all languages yet, but if you are interested
 and your language is not covered yet, we can cooperate on getting the data.
 
 Ruud
 
 
 
 




Re: Intro

2013-10-09 Thread Marcin Miłkowski
On 2013-10-09 14:04, Daniel Naber wrote:
 On 2013-10-09 10:40, Logon wrote:

 Hi Karoly,

 I am new to LanguageTool, have just learned about it as the HU
 translator of
 OmegaT.

 I wonder if it would be possible to start specifying rules for
 Hungarian.

 thanks for your interest in LanguageTool, we're very much interested in
 rules for Hungarian. As we don't support Hungarian yet, some code
 changes are needed. If you're a Java developer and want to give it a
 try, the process is described at
 http://wiki.languagetool.org/adding-a-new-language

 If you're not, we'll make the changes for you. But for now, I suggest
 you just take some other language and remove the existing rules in its
 grammar.xml file and add your new rules. Once you have some working
 rules we will add real support for Hungarian and it will get its own
 grammar.xml file.

 I suggest you use the English grammar.xml for this. Some things will not
 work yet, for example you cannot address a word's part-of-speech because
 a Hungarian dictionary is needed for that.

 I guess you have already found the documentation at
 http://languagetool.org/development/? Let us know if you need help.

Also, there is an old version of LanguageTool in Python that had rules 
for Hungarian:

http://sourceforge.net/p/languagetool/code/HEAD/tree/trunk/archive/languagetool/rules/hugrammar.xml

You could probably adapt some of these rules.

There are also Hungarian rules in LightProof that you might want to adapt:

http://numbertext.org/lightproof/

Regards,
Marcin


 Regards
Daniel





Re: LT flags 'students', suggests 'pupils'

2013-10-09 Thread Daniel Naber
On 2013-10-09 10:27, Kumara Bhikkhu wrote:

 lecturers allowed both varieties of spelling. Some American spellings
 have even become more widely used, e.g., 'students'. So, I was a
 little bit surprised to find it flagged.
 
  I learnt that this is not found in grammar.xml. So, I can't just
 untick it in Options.

It's in Options under Misc: "American words easily confused in British English"

  Move the 8 American/British phrase rules now in grammar.xml into
 replace.txt.
 
  Also, I noticed some spelling errors in replace.txt:
 
   * trash=rubbis (rubbish)

Thanks, I have fixed that.

   * trashcan=dustbin (trash can)

I have fixed this, too, but it seems multi-term entries (i.e. phrases) are 
not supported by the rule, so the entry never gets triggered. That's 
probably why all multi-term entries are commented out. For the same 
reason, we cannot move all rules into replace.txt.
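
To illustrate the limitation, here is a simplified sketch (not the actual
rule implementation). It assumes the wrong=right line format quoted above,
assumes that commented-out lines start with #, and splits text on
whitespace only. The names load_replacements and find_hits are made up for
this example. Because the lookup sees one token at a time, a key that
contains a space can never match.

```python
def load_replacements(lines):
    """Parse wrong=right lines into a dict, skipping blanks and '#' comments."""
    table = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        wrong, _, right = line.partition("=")
        table[wrong.strip().lower()] = right.strip()
    return table


def find_hits(text, table):
    """Return (word, suggestion) pairs, checking one token at a time."""
    hits = []
    for token in text.split():
        word = token.strip(".,;:!?\"'").lower()
        if word in table:
            hits.append((word, table[word]))
    return hits


# Hypothetical excerpt in the style of the entries discussed in this thread.
SAMPLE = [
    "# single-word entry",
    "trash=rubbish",
    "# multi-word entry: loaded, but never matched by a per-token lookup",
    "trash can=dustbin",
]

if __name__ == "__main__":
    table = load_replacements(SAMPLE)
    # Only the single-word key "trash" is found; "trash can" is never looked up.
    print(find_hits("Put it in the trash can.", table))
```

Matching phrases would need a lookup over several tokens at once, which
the XML pattern rules already provide.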

Regards
  Daniel

-- 
http://www.danielnaber.de




Re: LT flags 'students', suggests 'pupils'

2013-10-09 Thread Marcin Miłkowski
On 2013-10-09 10:27, Kumara Bhikkhu wrote:
 Malaysia was colonised by the British, so British English is sort of the
 standard. But now American English has gained acceptance here too. Even as
 early as 20 years ago, when I was in university, our English lecturers
 allowed both varieties of spelling. Some American spellings have even
 become more widely used, e.g., 'students'. So, I was a little bit
 surprised to find it flagged.

 I learnt that this is not found in grammar.xml. So, I can't just untick
 it in Options.

 The file that provides for this is replace.txt under en-GB. For my own
 purpose, I can remove the file to solve the problem. Less tech-savvy
 ones may find it annoying.

Overall, flagging 'student' unconditionally was a bad idea. I have 
commented it out.

 Request:
 Provide option to not flag this category in Options.

This is very hard, and would defeat the purpose of the SimpleReplaceRule.

 Move the 8 American/British phrase rules now in grammar.xml into
 replace.txt.

This is logically impossible. SimpleReplaceRule does not have all the 
features of our XML rules.


 Also, I noticed some spelling errors in replace.txt:

   * trash=rubbis (rubbish)
   * trashcan=dustbin (trash can)

Thanks, I fixed those.

Best,
Marcin
