Re: move mailing list to forum?

2016-09-27 Thread Andriy Rysin
I felt like the mailing list is used more by developers, while users write to the forum. I personally prefer the mailing list, but I understand if maintaining both is too much work. Andriy 2016-09-26 12:14 GMT-04:00 Andre Couture : > I personally like the idea of moving to a

Re: Inflecting second token with postag from the first

2016-09-13 Thread Andriy Rysin
> It looks very strange to me to include ".*" in a replacement expression. > > > > But now I stated my observation so it is up to you if you want to go into > > it. > > > > Best, > > Jesper > > > > > > 2016-09-13 23:21

Re: Inflecting second token with postag from the first

2016-09-13 Thread Andriy Rysin
des of experience in regular expressions, and the .* looks > strange to me in a replacement expression: > > postag_replace="$1lname$2.*" > > Are you sure it shouldn't simply be > > postag_replace="$1lname$2" > > ? > > Best, > Jesper > > > > 2
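A note on the quoted question, as a minimal sketch only: in plain java.util.regex, a `.*` inside a replacement string is copied literally, because only `$` and `\` are special there. The tag string and pattern below are hypothetical and chosen for illustration. Whether LanguageTool later interprets the resulting postag as a pattern is a separate question that this sketch does not cover.

```java
class ReplacementDemo {
    // Applies a plain java.util.regex replacement to a tag-like string.
    static String rewrite(String postag, String regex, String replacement) {
        return postag.replaceAll(regex, replacement);
    }

    public static void main(String[] args) {
        // Hypothetical tag and pattern, not taken from any real dictionary.
        String postag = "noun:fname:v_rod";
        String regex = "(.*:)fname(:.*)";

        // ".*" in the replacement is literal text, so it ends up in the result.
        System.out.println(rewrite(postag, regex, "$1lname$2.*")); // noun:lname:v_rod.*
        System.out.println(rewrite(postag, regex, "$1lname$2"));   // noun:lname:v_rod
    }
}
```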

Inflecting second token with postag from the first

2016-09-13 Thread Andriy Rysin
Sorry if this is already written somewhere - I looked at wiki pages but could not find anything relevant. I have two tokens (first name and last name) and in the suggestion I want to inflect second token the same as the first. I tried to do this: \2 but it sends the tests into 100% CPU loop and

Re: AbstractSimpleReplaceRule: look through multiple lemmas

2016-09-13 Thread Andriy Rysin
Thanks, pushed. Andriy 2016-09-13 3:03 GMT-04:00 Daniel Naber <daniel.na...@languagetool.org>: > On 2016-09-13 00:51, Andriy Rysin wrote: > > > All language tests passed but as it's a core rule used by many I'd > > like to do a review before I push. > >

AbstractSimpleReplaceRule: look through multiple lemmas

2016-09-12 Thread Andriy Rysin
I just got a report that for Ukrainian the simple replace rule does not pick the right lemma for the replacements. E.g. if the token is not found in replace list the lemmas are searched and first found is used for replacement list. If token has multiple lemmas this may not be the right one to
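The difference between using the first lemma that is found and looking through all lemmas can be sketched as follows. This is an illustration under assumptions, not the AbstractSimpleReplaceRule code: the class, method names and sample data are hypothetical, and the message above is truncated before it describes the actual fix.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LemmaLookupDemo {
    // One reading of the reported behaviour: stop at the first lemma that has an entry.
    static List<String> firstLemmaWithEntry(List<String> lemmas, Map<String, List<String>> table) {
        for (String lemma : lemmas) {
            List<String> found = table.get(lemma);
            if (found != null) {
                return found;
            }
        }
        return Collections.emptyList();
    }

    // Alternative: collect replacements for every lemma, without duplicates.
    static List<String> allLemmas(List<String> lemmas, Map<String, List<String>> table) {
        List<String> result = new ArrayList<>();
        for (String lemma : lemmas) {
            List<String> found = table.get(lemma);
            if (found == null) {
                continue;
            }
            for (String replacement : found) {
                if (!result.contains(replacement)) {
                    result.add(replacement);
                }
            }
        }
        return result;
    }

    // Hypothetical replacement table with placeholder names.
    static Map<String, List<String>> sampleTable() {
        Map<String, List<String>> table = new HashMap<>();
        table.put("lemmaA", Arrays.asList("replacementA"));
        table.put("lemmaB", Arrays.asList("replacementB"));
        return table;
    }

    public static void main(String[] args) {
        List<String> lemmas = Arrays.asList("lemmaA", "lemmaB");
        System.out.println(firstLemmaWithEntry(lemmas, sampleTable())); // [replacementA]
        System.out.println(allLemmas(lemmas, sampleTable()));           // [replacementA, replacementB]
    }
}
```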

Paste into web-page check field on mobile devices

2016-07-02 Thread Andriy Rysin
I just noticed it's really hard to paste a text to check e.g. on https://languagetool.org from mobile devices when the text field is empty. On regular text areas long tap works for mobile devices by bringing Paste menu whether there's a text in the field or not. For our text area if the text

Non-overlapping anti-pattern

2016-06-22 Thread Andriy Rysin
Currently for an antipattern to work it needs to overlap with the pattern. Would it make sense to add an attribute (e.g. "overlap") that can be set to false so the antipattern can be applied anywhere in the sentence? E.g. often in rules when I need to guess the context by some words in the sentence I

Re: Announcement: old HTTP API will be replaced

2016-06-06 Thread Andriy Rysin
Hi Daniel I've tested switching my web-page to /v2 according to http://wiki.languagetool.org/integration-on-websites And I found one little problem: in the instructions it says to use "https://languagetool.org/api/v2/check" so I've replaced it with "http://r2u.org.ua/api/v2/check" but it returned

Re: Chrome extension update

2016-06-03 Thread Andriy Rysin
I don't remember if this was mentioned: if I try to check Ukrainian text from languagetool.org with LT addon in Chrome I get some extra errors with some internal ids in the error text (e.g. UK_SIMPLE_REPLACE). Unfortunately I don't have a machine with Chrome right now but I can try to reproduce

Re: immunized words do not stop repeat word rule

2016-06-02 Thread Andriy Rysin
Thanks, I've pushed a fix: 653d1b979d8a170ecf143df983474551018c5430 Andriy 2016-06-01 17:17 GMT-04:00 Daniel Naber <daniel.na...@languagetool.org>: > On 2016-06-01 19:07, Andriy Rysin wrote: > >> I found this for Ukrainian but it seems to work the same for English >>

Java rule disabled by default

2016-06-01 Thread Andriy Rysin
I have a new Java rule (inflection agreement between adjective and the noun) that catches very interesting errors that often escape the eye. But this rule also produces a lot of false positives (I spent some time improving it but it gets very tricky quickly). So I am thinking to leave

immunized words do not stop repeat word rule

2016-06-01 Thread Andriy Rysin
I found this for Ukrainian but it seems to work the same for English as well, in a sentence: And side to side and up-down. "side to side" is immunized but it triggers repeated word (for "and") and LT highlights "side and". Shall immunized words stop the rule instead of "stretching it out"? Not

Re: Chrome extension update

2016-05-27 Thread Andriy Rysin
BTW as Google Apps Script is pretty much JavaScript, could we reuse a good chunk of the Chrome extension code for the Google Docs LT plugin? 2016-05-19 3:27 GMT-04:00 Daniel Naber <daniel.na...@languagetool.org>: > On 2016-05-19 05:33, Andriy Rysin wrote: > >> So it looks like for Chrom

Re: Chrome extension update

2016-05-18 Thread Andriy Rysin
hat sounds correct I can prepare the patch. Regards, Andriy 2016-05-16 20:02 GMT-04:00 Andriy Rysin <ary...@gmail.com>: > Ah, have to dig deeper then :) > > So in the API result the spelling error comes first but in the > editor_plugin.js we iterate backwards and ignore position

Re: Chrome extension update

2016-05-16 Thread Andriy Rysin
viousSpanStart) { // overlapping errors - these are not supported by our underline approach, // as we would need overlapping <span>s for that, so skip the error: continue; } 2016-05-16 16:12 GMT-04:00 Daniel Naber <daniel.na...@languagetool.org>: > On 2

Re: Chrome extension update

2016-05-16 Thread Andriy Rysin
important but still) consistent order with rule id I prefer approach 3) which besides consistent results also gives ability to sort by rule categories. Regards, Andriy 2016-05-16 12:17 GMT-04:00 Daniel Naber <daniel.na...@languagetool.org>: > On 2016-05-16 17:37, Andriy Rysin wrote: > >

Re: Chrome extension update

2016-05-16 Thread Andriy Rysin
place rule on the webpage, but in Chrome extension I see spelling error. Would you know what's the reason for this difference? Thanks Andriy 2016-05-16 11:25 GMT-04:00 Andriy Rysin <ary...@gmail.com>: > Looks great, thanks a lot! > > Andriy > > 2016-05-16 11:22 GMT-04:00 Daniel Naber

Re: Chrome extension update

2016-05-16 Thread Andriy Rysin
Looks great, thanks a lot! Andriy 2016-05-16 11:22 GMT-04:00 Daniel Naber <daniel.na...@languagetool.org>: > On 2016-05-16 16:37, Andriy Rysin wrote: > > Hi Andriy, > >> I noticed though that the extension description (the text that starts >> with "Wit

Re: Chrome extension update

2016-05-16 Thread Andriy Rysin
Hi Daniel it looks great, thanks! I noticed though that the extension description (the text that starts with "With this extension you can check text...") is not translated into Ukrainian although the title is translated correctly. Is it just taking some time or do we need to adjust something?

Re: Please help test Chrome extension beta

2016-05-06 Thread Andriy Rysin
I just tried this extension in Firefox 48.0a2: I had to turn off signature checking to install and there's no Ukrainian translation but otherwise it works well. One thing I noted (and looks like it's the same behavior as in Firefox extension) - if there's a spelling error and grammar error on the

Re: Multithreaded LT optimization (take 2)

2016-04-27 Thread Andriy Rysin
PerformanceTest, but I am travelling for several days starting tomorrow so probably won't happen until next week. Regards, Andriy 2016-04-26 5:08 GMT-04:00 Daniel Naber <daniel.na...@languagetool.org>: > On 2016-04-25 20:18, Andriy Rysin wrote: > >> How many cores do you have? >

Re: Multithreaded LT optimization (take 2)

2016-04-25 Thread Andriy Rysin
small but still noticeable (~9.0 vs 7.2 s). How many cores do you have? Regards, Andriy 2016-04-25 12:14 GMT-04:00 Daniel Naber <daniel.na...@languagetool.org>: > On 2016-04-25 16:54, Andriy Rysin wrote: > >> I've just pushed a forkjoinpool branch that uses ForkJoinPool for >

Re: Multithreaded LT optimization (take 2)

2016-04-25 Thread Andriy Rysin
for others to try it > out. > > Regards, > Andriy > > 2016-03-07 8:21 GMT-05:00 Daniel Naber <daniel.na...@languagetool.org>: >> On 2016-01-28 14:18, Andriy Rysin wrote: >> >>> As you may know I am running regression tests for any LT changes I >>

Re: Multithreaded LT optimization (take 2)

2016-04-21 Thread Andriy Rysin
-03-07 8:21 GMT-05:00 Daniel Naber <daniel.na...@languagetool.org>: > On 2016-01-28 14:18, Andriy Rysin wrote: > >> As you may know I am running regression tests for any LT changes I >> make by checking huge Ukrainian media/book archives I collected over >> time. The

Re: Website integration for LT

2016-04-20 Thread Andriy Rysin
.@languagetool.org>: > On 2016-04-20 17:48, Andriy Rysin wrote: > >> Can I also suggest we make 500 message more user-friendly? E.g. with >> something like this > > The common causes for 500 errors should all already have a friendly > error message. Something like Cla

Re: Website integration for LT

2016-04-20 Thread Andriy Rysin
= HttpURLConnection.HTTP_INTERNAL_ERROR; } logError(text, remoteAddress, e, errorCode); 2016-04-20 11:03 GMT-04:00 Daniel Naber <daniel.na...@languagetool.org>: > On 2016-04-20 16:44, Andriy Rysin wrote: > >> 4) if I submit a check via the web-page (directly to API, no proxy)

Re: Website integration for LT

2016-04-20 Thread Andriy Rysin
2016-04-20 9:29 GMT-04:00 Daniel Naber <daniel.na...@languagetool.org>: > On 2016-04-20 15:02, Andriy Rysin wrote: > >> this works if I point to https://languagetool.org/api/v1/, but I was >> not able to run with direct API pointing to the server running locally >> on m

Re: Website integration for LT

2016-04-20 Thread Andriy Rysin
Thanks Daniel this works if I point to https://languagetool.org/api/v1/, but I was not able to run with direct API pointing to the server running locally on my machine. I've added --public and it listens on *:8081 but it never returns a request. Is there some other config options I need to turn

Re: mvn clean package fails

2016-03-26 Thread Andriy Rysin
There are suggestions on stack overflow for this error to delete everything under ~/.m2/repository/ and try again Andriy On Mar 26, 2016 10:38 AM, "Dominique Pellé" wrote: > Hi > > Running "mvn clean package" fails on my xubuntu-14.04.4 > Linux machine. > > Does

Preventing inflections in suggestions

2016-03-11 Thread Andriy Rysin
Hi all I have some word forms (colloquial forms of the verbs) I would like to tag and recognize but I don't want them to show up in suggestions. I found that I can remove the tags I don't want from the list when building synthesizer but I am wondering if that's the right way to do it (I did

Re: Android Spellchecker using LanguageTool

2016-03-01 Thread Andriy Rysin
Probably missed text in here: The application sends what the introduces in regular text boxes 2016-02-28 16:22 GMT-05:00 Andriy Rysin <ary...@gmail.com>: > English repeated twice in this message: It is available for Catalan, > English, French, German, Polish and English languages. >

Re: Android Spellchecker using LanguageTool

2016-02-28 Thread Andriy Rysin
English repeated twice in this message: It is available for Catalan, English, French, German, Polish and English languages. 2016-02-28 11:52 GMT-05:00 Jordi Mas <j...@softcatala.org>: > El 28/02/2016 a les 08:43, Dominique Pellé ha escrit: >> Andriy Rysin <ary...@gmail.com>

Re: Android Spellchecker using LanguageTool

2016-02-27 Thread Andriy Rysin
Thanks guys, just a little note that it would be nice to have context for strings like %d (last %s) as translating that without context is hard. Thanks Andriy 2016-02-27 17:22 GMT-05:00 Daniel Naber : > On 2016-02-27 22:09, Jordi Mas wrote: > >> Let me know how we

Re: Android Spellchecker using LanguageTool

2016-02-27 Thread Andriy Rysin
Hi Jordi, this is very nice, could you please add Ukrainian and let me know how I can help localizing the app? Thanks Andriy On Feb 27, 2016 1:54 PM, "Jordi Mas" wrote: > El 28/12/2015 a les 08:13, Jordi Mas ha escrit: > > El 26/12/2015 a les 11:32, Daniel Naber ha escrit:

Re: Valency dictionary and attribute [long mail]

2016-02-07 Thread Andriy Rysin
Hi Marcin I was actually thinking of something even more abstract. To adjust your example:

Re: MS Word add-in for LT

2016-02-03 Thread Andriy Rysin
Hi Jaume it seems that Ukrainian (uk-UA) is not in the list, can you please take a look at that? Thanks Andriy 2016-02-02 15:54 GMT-05:00 Jaume Ortolà i Font : > Thanks, Daniel. That was the bug. I have fixed it and published a new > release. > > I have also completed the

Re: Valency dictionary and attribute [long mail]

2016-02-02 Thread Andriy Rysin
Hey Marcin this is a great addition, though I have one remark. Besides valency information some other types of information could be useful too (if we are starting to head in this direction). E.g. I have rules in Ukrainian that suggest the superlative form for an adjective when "самий" (very) + base form is used.

Re: Multithreaded LT optimization (take 2)

2016-01-29 Thread Andriy Rysin
p! It is a very good knowledge base for someone > who wants to work on improving LT's multithreaded performance. > > > On Thu, Jan 28, 2016 at 01:29:13PM -0500, Andriy Rysin wrote: >> so currently what we (approximately) do is >> 1) read file (line by line in general c

Multithreaded LT optimization (take 2)

2016-01-28 Thread Andriy Rysin
Hi all As you may know I am running regression tests for any LT changes I make by checking huge Ukrainian media/book archives I collected over time. The full check for 6 text files I check was taking more than an hour on my i3 so I upgraded to i7. The time got under 60 min which was much better

Re: Multithreaded LT optimization (take 2)

2016-01-28 Thread Andriy Rysin
checks for some languages - that could be a good benchmarking test. Regards, Andriy 2016-01-28 8:47 GMT-05:00 Dominique Pellé <dominique.pe...@gmail.com>: > Andriy Rysin wrote: > > >> Then I realized that in the check method we split rules into callables >> and their co

Re: MS Word add-in for LT

2016-01-26 Thread Andriy Rysin
This is great news Jaume, (unfortunately) many people still use MS Office so this will give them a chance to use LT in their work. Regards, Andriy 2016-01-26 17:42 GMT-05:00 Marcin Miłkowski : > Hi Jaume, > > this is very good news! > > W dniu 26.01.2016 o 10:47, Jaume

Re: Splitting segment.srx?

2016-01-25 Thread Andriy Rysin
the process for non-developers. Regards, Andriy 2016-01-25 3:50 GMT-05:00 Marcin Miłkowski <list-addr...@wp.pl>: > W dniu 25.01.2016 o 03:29, Andriy Rysin pisze: >> Currently 95% of the language handling is done in language module so >> when I edit segment.srx I need to remember

Re: Splitting segment.srx?

2016-01-25 Thread Andriy Rysin
:)). Regards, Andriy 2016-01-25 12:32 GMT-05:00 Marcin Miłkowski <list-addr...@wp.pl>: > W dniu 25.01.2016 o 17:08, Andriy Rysin pisze: >> Well I am currently trying to involve several linguist who are not >> proficient with development tools in developing Ukrainian module for

Re: Splitting segment.srx?

2016-01-24 Thread Andriy Rysin
-01-24 17:13 GMT-05:00 Marcin Miłkowski <list-addr...@wp.pl>: > W dniu 24.01.2016 o 17:15, Andriy Rysin pisze: >> Would it make sense to split segment.srx into language modules (and >> assemble dynamically from available languages)? For now it seems to be >> the o

Splitting segment.srx?

2016-01-24 Thread Andriy Rysin
Would it make sense to split segment.srx into language modules (and assemble dynamically from available languages)? For now it seems to be the only language-specific piece that belongs to core module. Were there any attempts at this and if yes what was the obstacle? Thanks Andriy

Re: misnamed community.languagetool.org

2016-01-17 Thread Andriy Rysin
I'd say path approach is good Andriy On Jan 17, 2016 8:59 AM, "Daniel Naber" wrote: > Hi, > > http://community.languagetool.org lists LT rules and offers the > Wikipedia check. Thus, it has been misnamed almost from the beginning. > Can anybody come up with a

Re: introduce new color for style errors

2016-01-05 Thread Andriy Rysin
Hi David this looks pretty nice! Shall we introduce an attribute on category/rulegroup/rule that will trigger different coloring? Thanks Andriy 2016-01-05 9:59 GMT-05:00 Daniel Naber : > On 2016-01-04 13:36, Daniel Naber wrote: > >> are difficult to find. I

Re: introduce new color for style errors

2016-01-04 Thread Andriy Rysin
I vote for it, users of Ukrainian have been asking for different color for style errors for a while. Thanks, Andriy On Jan 4, 2016 7:36 AM, "Daniel Naber" wrote: > Hi, > > LanguageTool currently uses two colors to mark errors: red for spelling > errors,

Re: limit the number of suggestions in LibreOffice

2015-12-22 Thread Andriy Rysin
15 max sounds reasonable Andriy On Dec 21, 2015 4:45 PM, "Jaume Ortolà i Font" wrote: > Hi, > > I would like to limit the maximum number of suggestions that are shown in > LibreOffice. In Catalan the Morfologik speller is used for spelling > suggestions, and this number

Re: Behavior of non-breaking space U+00A0 in LanguageTool

2015-10-11 Thread Andriy Rysin
I would agree with that, I had to add 00A0 in a lot of places including sentence tokenizer, word tokenizer and some rules for Ukrainian. But from text analysis it's pretty much the same as normal space so it would make sense to handle this at common level (early in the process). Thanks Andriy

Re: towards grammar checking in Firefox

2015-09-29 Thread Andriy Rysin
That's good news. I've added my vote to the bug. Andriy 2015-09-28 16:36 GMT-04:00 Yakov Reztsov : > Excellent! > > > Воскресенье, 27 сентября 2015, 11:39 +02:00 от Daniel Naber > : > > Hi, > > there's now a patch available that allows the

LanguageTool demo

2015-08-06 Thread Andriy Rysin
Hey guys I am going to do a presentation on LanguageTool and its Ukrainian module here in Kyiv next week and wondering if you can suggest some materials/slides for LT architecture so I don't create them from scratch. I have only 1 hour so high level overview is enough. English would be perfect,

Re: non-breaking space / spacebefore=no

2015-05-24 Thread Andriy Rysin
Good question, I would vote to include non-breaking space in spacebefore. On a related note I recently found out that \s does not include non-breaking space so I had to change \s to [\s\u00A0] in segment.srx and this improved sentence tokenizing for Ukrainian quite a bit. Andriy 2015-05-18 18:26
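The `\s` observation can be checked directly with java.util.regex, where the default `\s` class covers only space, tab, newline, vertical tab, form feed and carriage return. The sketch below assumes segment.srx rules are evaluated as Java regular expressions; the class and method names are made up for the example.

```java
import java.util.regex.Pattern;

class NbspWhitespaceDemo {
    // Returns true if the whole text matches the given regular expression.
    static boolean matches(String regex, String text) {
        return Pattern.compile(regex).matcher(text).matches();
    }

    public static void main(String[] args) {
        String nbsp = "\u00A0"; // non-breaking space

        System.out.println(matches("\\s", " "));            // true: ordinary space
        System.out.println(matches("\\s", nbsp));           // false: default \s skips U+00A0
        System.out.println(matches("[\\s\\u00A0]", nbsp));  // true: explicit character class
    }
}
```

Listing U+00A0 explicitly keeps the change local to the rule that needs it, which matches the workaround described in the message.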

de tests fail

2015-05-07 Thread Andriy Rysin
I just checked out a fresh copy of LT from git and the German tests fail with the following exception. Is this just me or is there a problem? Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.146 sec FAILURE! - in org.languagetool.tagging.disambiguation.rules.de.GermanDisambiguationRuleTest

Re: Multiple zero of min occurances

2015-05-07 Thread Andriy Rysin
I've pushed this fix today. Please let me know if you see any issues. Thanks Andriy 2015-04-30 16:27 GMT-04:00 Andriy Rysin ary...@gmail.com: Daniel, thanks will do. Jaume could you please update the rule in Catalan? I'll push my change once Catalan rules run green. Thanks Andriy

Re: regular expression detection inside token

2015-05-03 Thread Andriy Rysin
forceNoRegexp flag and then we skip the warning. Andriy 2015-05-03 10:27 GMT-04:00 Dominique Pellé dominique.pe...@gmail.com: Andriy Rysin ary...@gmail.com wrote: I started working on some abbreviations with dots in Ukrainian and added some of them to the dictionary. But now when I specify tokenрр

Bug in Matcher postag replacement

2015-05-01 Thread Andriy Rysin
I've found a bug in the matcher postag replacement: if I have 2 out match suggestions and first one uses postag_replace in match element, but second one just uses simple <match no="N"/> the matcher in the second suggestion also gets first postag replacement applied. If I put the suggestion with

Re: Multiple zero of min occurances

2015-04-30 Thread Andriy Rysin
Daniel, thanks will do. Jaume could you please update the rule in Catalan? I'll push my change once Catalan rules run green. Thanks Andriy 2015-04-30 16:25 GMT-04:00 Daniel Naber daniel.na...@languagetool.org: On 2015-04-29 19:38, Andriy Rysin wrote: Hi Andriy, I wrote little patch

Multiple zero of min occurances

2015-04-29 Thread Andriy Rysin
I just found out that if I have multiple tokens with min=0 my patterns don't match. Looking at the code it seems like if min=0 we only check for next pattern to match but that next may also have 0 mins. I wrote a little patch with tests that make my rules work but unfortunately it breaks 1 rule in
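Why consecutive min=0 tokens need more than a one-step lookahead can be shown with a toy matcher. This is a simplified sketch, not the LanguageTool pattern matcher: the `Element` type is hypothetical, it only handles min values of 0 and 1, and it matches a whole token array rather than searching inside a sentence.

```java
class OptionalTokenDemo {
    // Hypothetical pattern element: a literal word with a minimum occurrence of 0 or 1.
    static final class Element {
        final String word;
        final int min;

        Element(String word, int min) {
            this.word = word;
            this.min = min;
        }
    }

    // Recursive match: an optional element may be skipped, and the element after it
    // may be optional too, so skipping has to be retried at every position.
    static boolean matches(String[] tokens, int ti, Element[] pattern, int pi) {
        if (pi == pattern.length) {
            return ti == tokens.length;
        }
        Element e = pattern[pi];
        if (e.min == 0 && matches(tokens, ti, pattern, pi + 1)) {
            return true; // skipped the optional element
        }
        return ti < tokens.length
                && tokens[ti].equals(e.word)
                && matches(tokens, ti + 1, pattern, pi + 1);
    }

    // Pattern: a, optional b, optional c, d.
    static Element[] samplePattern() {
        return new Element[] {
            new Element("a", 1), new Element("b", 0), new Element("c", 0), new Element("d", 1)
        };
    }

    public static void main(String[] args) {
        System.out.println(matches(new String[] {"a", "d"}, 0, samplePattern(), 0));      // true
        System.out.println(matches(new String[] {"a", "c", "d"}, 0, samplePattern(), 0)); // true
        System.out.println(matches(new String[] {"a", "x", "d"}, 0, samplePattern(), 0)); // false
    }
}
```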

regexp case sensitivity

2015-04-26 Thread Andriy Rysin
Looks like in token regexp is case sensitive by default, but in match it's not. Is this only for me? If not was this by design? I would never notice as most of the time I just add tests for one case (usually lower) and don't try the uppercase. The user pointed this problem to me. Thanks Andriy

Integrating LT into web-page

2015-03-12 Thread Andriy Rysin
I am running LT on my web-page for Ukrainian and a bit ago I had a nice border around text area and a status bar at the bottom of mce with resize button. That code was using my local version of tiny_mce (3.5.8). With newer code that I copied from

Re: [PATCH] Ignoring characters

2015-03-10 Thread Andriy Rysin
in JLanguageTool.getRawAnalyzedSentence() we only update tokens starting with i=1. I was not sure what's the right fix so I left it as it is. Please let me know if you see any problems, Thanks, Andriy 2015-03-08 13:21 GMT-04:00 Andriy Rysin ary...@gmail.com: I've found one problem with ignored characters: Morfologik speller

Re: [PATCH] Ignoring characters

2015-03-08 Thread Andriy Rysin
= rule.match(langTool.getAnalyzedSentence(атакуючий)); 2015-01-21 22:33 GMT-05:00 Andriy Rysin ary...@gmail.com: Ok, I've pushed a change to allow per-language set of characters to be ignored in tokens (e.g. Ukrainian adds an accent U+0301 to the soft hyphen). Adding a reading with null tag seems to have

Re: German tests

2015-03-03 Thread Andriy Rysin
I installed jdk1.7.0_75 and German tests pass with it so it's java 8 which makes it fail. andriy 2015-03-03 8:26 GMT-05:00 Andriy Rysin ary...@gmail.com: I manually copied openregex.jar into libs/ and my output is similar to Marcin's, using lowercase die does not help: Fedora 21 (Twenty One

Re: German tests

2015-03-03 Thread Andriy Rysin
I manually copied openregex.jar into libs/ and my output is similar to Marcin's, using lowercase die does not help: Fedora 21 (Twenty One) I tried both openjdk and Oracle jdk: java -version openjdk version 1.8.0_31 OpenJDK Runtime Environment (build 1.8.0_31-b13) OpenJDK 64-Bit Server VM (build

Re: German tests

2015-03-02 Thread Andriy Rysin
(ClassLoader.java:424) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ... 22 more 2015-03-02 16:12 GMT-05:00 Daniel Naber daniel.na...@languagetool.org: On 2015-03-02 21:41, Andriy Rysin wrote: Exception

Re: German tests

2015-03-02 Thread Andriy Rysin
Yes, I've built de, core, and commandline with maven (compile/install) but there's no openregex jar in target/LanguageTool-2.9-SNAPSHOT/LanguageTool-2.9-SNAPSHOT/libs 2015-03-02 17:18 GMT-05:00 Daniel Naber daniel.na...@languagetool.org: On 2015-03-02 22:34, Andriy Rysin wrote: Sorry, my bad, I

Re: German tests

2015-03-02 Thread Andriy Rysin
) ... 9 more 2015-03-02 15:20 GMT-05:00 Daniel Naber daniel.na...@languagetool.org: On 2015-03-02 18:03, Andriy Rysin wrote: I ran mvn clean test still has same issue, I tried en, ca, and pl and they all pass (and no extra output). I did mvn clean install in languagetool-core to make sure I get

Re: German tests

2015-03-02 Thread Andriy Rysin
) at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103) 2015-03-02 11:11 GMT-05:00 Daniel Naber daniel.na...@languagetool.org: On 2015-03-02 17:06, Andriy Rysin wrote: Hmm... $ git branch -a * master 2015-03-02 10:59 GMT-05:00 Daniel Naber daniel.na...@languagetool.org: On 2015-03-02 15:27, Andriy

Re: PATCH: Remove extra comma in AnalyzedSentence.toString()

2015-03-02 Thread Andriy Rysin
Pushed. Please let me know if you see any problem, Andriy 2015-03-02 6:45 GMT-05:00 Daniel Naber daniel.na...@languagetool.org: On 2015-03-01 23:16, Andriy Rysin wrote: Currently if there's no chunk tags we get extra commas in AnalyzedSentence.toString() Would it make sense to not add

Re: German tests

2015-03-02 Thread Andriy Rysin
Hmm... $ git branch -a * master 2015-03-02 10:59 GMT-05:00 Daniel Naber daniel.na...@languagetool.org: On 2015-03-02 15:27, Andriy Rysin wrote: Tests in de generate a lot of sysout and also fail for me. Is it just me or that's how it is right now? master should be fine, but I'm doing some

German tests

2015-03-02 Thread Andriy Rysin
Tests in de generate a lot of sysout and also fail for me. Is it just me or that's how it is right now? Thanks Andriy

Re: German tests

2015-03-02 Thread Andriy Rysin
, Andriy Rysin wrote: java.lang.AssertionError: Got unexpected match(es) for 'Die diplomatischen Beziehungen zwischen Kanada und dem Iran sind seitdem abgebrochen.' Have you tried mvn clean? That sentence doesn't give an error on my computer, nor on languagetool.org. Regards Daniel

Disambiguator tests run twice

2015-03-02 Thread Andriy Rysin
It looks like it's because we have public void testRules() throws Exception { testDisambiguationRulesFromXML(); } in test subclass and also testDisambiguationRulesFromXML() from parent class is run as well. We probably should either not create test method in subclass or rename base class
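The double run can be reproduced with a simplified stand-in for a test runner that discovers public test methods by reflection. This is an assumption-laden sketch: the real tests run under JUnit, whose discovery rules are not shown in the message, and the class names here are invented. The point is only that an inherited public test method and a subclass method that calls it are both picked up.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

class DuplicateTestDemo {
    // Counts how many times the XML tests were executed.
    static int xmlRuns = 0;

    public static class BaseTest {
        public void testDisambiguationRulesFromXML() {
            xmlRuns++;
        }
    }

    public static class LanguageTest extends BaseTest {
        public void testRules() {
            testDisambiguationRulesFromXML();
        }
    }

    // Toy runner: invokes every public no-argument method whose name starts with "test",
    // including inherited ones.
    static List<String> runAll(Object testInstance) {
        List<String> ran = new ArrayList<>();
        for (Method m : testInstance.getClass().getMethods()) {
            if (m.getName().startsWith("test") && m.getParameterCount() == 0) {
                try {
                    m.invoke(testInstance);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException(e);
                }
                ran.add(m.getName());
            }
        }
        return ran;
    }

    public static void main(String[] args) {
        List<String> ran = runAll(new LanguageTest());
        System.out.println(ran.size() + " test methods discovered"); // 2 test methods discovered
        System.out.println("XML tests executed " + xmlRuns + " times"); // XML tests executed 2 times
    }
}
```

Either remedy mentioned in the message removes one of the two discovered entry points.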

PATCH: Remove extra comma in AnalyzedSentence.toString()

2015-03-01 Thread Andriy Rysin
Currently if there's no chunk tags we get extra commas in AnalyzedSentence.toString() Would it make sense to not add this extra comma if we don't need to add chunk tags? Thanks, Andriy diff --git a/languagetool-core/src/main/java/org/languagetool/AnalyzedSentence.java

chunker in disambiguator tests

2015-03-01 Thread Andriy Rysin
It looks like when we run checks we do run chunker before we run disambiguator, but when we run disambiguator tests we don't run chunker so the rules/examples in the disambiguator don't see multiword chunks. Is this correct or am I missing something, and if yes was it done on purpose? Thanks

Re: bounty for Firefox grammar checker interface

2015-02-28 Thread Andriy Rysin
Will do once I get to computer. BTW did we sign up for GSoC 2015? Andriy On Feb 28, 2015 11:08 AM, Daniel Naber daniel.na...@languagetool.org wrote: Hi, I have posted a $150 bounty to bountysource.com for integrating a grammar checker interface to Firefox. Why? Because the way

Disabling disambiguator rules

2015-02-26 Thread Andriy Rysin
Would it make sense to allow disabling disambiguator rules the same way we disable checking rules? I.e. I have a disambiguator rule that will remove tokens with :rare tag if they overlap with ones without :rare. This produces good results for modern texts but does not work as well for books which

Re: @Nullable annotation to warn about NullPointerException

2015-02-23 Thread Andriy Rysin
That's great, latest Eclipse has support for null annotations as well, you might have to turn them on in the compiler settings though. Andriy 2015-02-23 6:00 GMT-05:00 Daniel Naber daniel.na...@languagetool.org: Hi, I've added quite some '@Nullable' annotations to methods. They indicate that

Re: MultiThreadedJLanguageTool

2015-02-22 Thread Andriy Rysin
On 02/22/2015 04:45 AM, Marcin Miłkowski wrote: Hi, W dniu 2015-02-21 o 19:22, Andriy Rysin pisze: So the main problem with this performance improvement is that we read across paragraphs. There are two problems with this: 1) error context shows sentences from another paragraph: I almost

Re: MultiThreadedJLanguageTool

2015-02-22 Thread Andriy Rysin
actually increase number of matches). Could you please check if those are indeed extra consecutive overlapping matches? Thanks Andriy 2015-02-22 4:47 GMT-05:00 Daniel Naber daniel.na...@languagetool.org: On 2015-02-21 19:22, Andriy Rysin wrote: So the main problem with this performance improvement

Re: MultiThreadedJLanguageTool

2015-02-21 Thread Andriy Rysin
Thanks, I've pushed suggested cleanups. Andriy 2015-02-20 8:10 GMT-05:00 Daniel Naber daniel.na...@languagetool.org: On 2015-02-19 22:16, Andriy Rysin wrote: I've merged multithreading branch into master. Please try it out when you have a chance and let me know if you see any issues

Re: MultiThreadedJLanguageTool

2015-02-21 Thread Andriy Rysin
clean up the code logic... Thanks Andriy 2015-02-20 9:00 GMT-05:00 Andriy Rysin ary...@gmail.com: So before wrapping these optimizations up I decided to take a last look at the thread graph in jvisualvm and it showed that the worker threads spend more time in park state than in running

Re: MultiThreadedJLanguageTool

2015-02-20 Thread Andriy Rysin
for SameRuleGroupFilter and then will create another branch for everybody to test it out. Andriy 2015-02-20 8:10 GMT-05:00 Daniel Naber daniel.na...@languagetool.org: On 2015-02-19 22:16, Andriy Rysin wrote: I've merged multithreading branch into master. Please try it out when you have a chance

Re: workflow optimization

2015-02-19 Thread Andriy Rysin
appreciate some help as I am not very proficient with maven Thanks Andriy 2015-02-19 3:26 GMT-05:00 Daniel Naber daniel.na...@languagetool.org: On 2015-02-18 23:56, Andriy Rysin wrote: When I run ./build.sh uk test besides my language module maven also seems to build languagetool-core and runs

Re: workflow optimization

2015-02-19 Thread Andriy Rysin
Thanks, that worked. Though if I run build.sh uk test it does not compile languagetool-core any more but still runs its tests. Andriy 2015-02-19 15:55 GMT-05:00 Daniel Naber daniel.na...@languagetool.org: On 2015-02-19 21:43, Andriy Rysin wrote: I ran mvn install in languagetool-core

Re: MultiThreadedJLanguageTool

2015-02-19 Thread Andriy Rysin
I've merged multithreading branch into master. Please try it out when you have a chance and let me know if you see any issues. Thanks Andriy 2015-02-18 14:10 GMT-05:00 Andriy Rysin ary...@gmail.com: That makes sense, change pushed. Andriy 2015-02-18 11:48 GMT-05:00 Daniel Naber daniel.na

Re: MultiThreadedJLanguageTool

2015-02-19 Thread Andriy Rysin
Sorry, forgot the test change, here's the full patch. 2015-02-19 18:58 GMT-05:00 Andriy Rysin ary...@gmail.com: Daniel I took a look at the problem of SameRuleGroupFilter missing rules on multithreaded execution due to rules with same id being split across threads. So I've added

Re: MultiThreadedJLanguageTool

2015-02-19 Thread Andriy Rysin
a look? Also with this we run SameRuleGroupFilter twice for both modes - one time (per thread) inside performCheck() and once at the end of check() after sorting. I feel like it's redundant and we can remove the first one. Thanks Andriy 2015-02-19 16:16 GMT-05:00 Andriy Rysin ary...@gmail.com

Re: MultiThreadedJLanguageTool

2015-02-18 Thread Andriy Rysin
3:50 GMT-05:00 Daniel Naber daniel.na...@languagetool.org: On 2015-02-18 00:15, Andriy Rysin wrote: I don't have much explanation for this so I introduced a system property (org.languagetool.thread_count) if you want to force different # of threads. We don't use system properties anywhere
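Reading a thread count from a system property with a fallback could look like the sketch below. The property name comes from the message; the helper class, the method and the choice of fallback are assumptions, and the quoted reply suggests the project may not have kept this mechanism.

```java
class ThreadCountDemo {
    // Property name as given in the message above.
    static final String PROPERTY = "org.languagetool.thread_count";

    // Returns the configured thread count, or the default when the property
    // is missing, not a number, or not positive.
    static int threadCount(int defaultCount) {
        Integer configured = Integer.getInteger(PROPERTY);
        return (configured != null && configured > 0) ? configured : defaultCount;
    }

    public static void main(String[] args) {
        int fallback = Runtime.getRuntime().availableProcessors();
        System.out.println("threads: " + threadCount(fallback));
    }
}
```

With this sketch the value would be forced from the command line via the standard JVM option, for example `-Dorg.languagetool.thread_count=4`.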

Re: workflow optimization

2015-02-18 Thread Andriy Rysin
When I run ./build.sh uk test besides my language module maven also seems to build languagetool-core and runs tests on it. It's pretty quick but if I run ./build.sh uk test 20-30 times a day it gets noticeable. Is there an easy way to skip building/testing languagetool-core? Thanks Andriy

Re: MultiThreadedJLanguageTool

2015-02-17 Thread Andriy Rysin
regression. Please take a look and let me know if this works for you and if there's anything else we need to do to merge this into master. Andriy 2015-02-12 22:35 GMT-05:00 Andriy Rysin ary...@gmail.com: So I've played with this a bit today and here's what I found: with 3 relatively small

Re: MultiThreadedJLanguageTool

2015-02-14 Thread Andriy Rysin
for you and if there's anything else we need to do to merge this into master. Andriy 2015-02-12 22:35 GMT-05:00 Andriy Rysin ary...@gmail.com: So I've played with this a bit today and here's what I found: with 3 relatively small changes: 1) reuse thread pool rather than recreate it every time

Re: MultiThreadedJLanguageTool

2015-02-12 Thread Andriy Rysin
-11 05:07, Andriy Rysin wrote: 1) it seems like we're currently creating and destroying thread pool every time we check sentences, would it not make more sense to create pool once and keep threads in the pool and reuse them? I think so. The number of threads should then probably be specified

MultiThreadedJLanguageTool

2015-02-10 Thread Andriy Rysin
I have 2 questions about MultiThreadedJLanguageTool: 1) it seems like we're currently creating and destroying thread pool every time we check sentences, would it not make more sense to create pool once and keep threads in the pool and reuse them? It probably would not improve performance much but
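The idea in question 1, creating the pool once and reusing it for every check, can be sketched with java.util.concurrent. This is not the MultiThreadedJLanguageTool implementation: `PooledChecker` is a hypothetical class and its "check" simply measures sentence length as a stand-in for real rule matching.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PooledChecker implements AutoCloseable {
    // Created once and shared by all check() calls.
    private final ExecutorService pool;

    PooledChecker(int threads) {
        this.pool = Executors.newFixedThreadPool(threads);
    }

    // Placeholder work: one task per sentence, results returned in input order.
    List<Integer> check(List<String> sentences) {
        List<Callable<Integer>> tasks = new ArrayList<>();
        for (String sentence : sentences) {
            tasks.add(() -> sentence.length());
        }
        List<Integer> results = new ArrayList<>();
        try {
            for (Future<Integer> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        }
        return results;
    }

    // Shut the pool down once, when the checker is no longer needed.
    @Override
    public void close() {
        pool.shutdown();
    }

    public static void main(String[] args) {
        try (PooledChecker checker = new PooledChecker(2)) {
            System.out.println(checker.check(Arrays.asList("ab", "cde"))); // [2, 3]
            System.out.println(checker.check(Arrays.asList("x")));         // [1]
        }
    }
}
```

The trade-off is that the owner of the checker becomes responsible for shutting the pool down, since idle worker threads would otherwise keep the JVM alive.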

Re: Grammar rules in more than 1 external file: bug or expected behaviour?

2015-02-06 Thread Andriy Rysin
I spent some time trying to include multiple xml files via ENTITY and gave up. It's not too hard to make it work on local filesystem but making it work inside a jar is pretty tricky. I went and just defined xml names inside the language class (see Ukrainian.java) but that's only for built-in xml

Development tools

2015-02-01 Thread Andriy Rysin
Sorry if this is obvious, but my friends asked me and I'm away from my computer. Is there a way to call parts of sentence analyzer of LT from command line? I.e. sentence tokenizer, tokenizer, tagger, disambiguator? Or currently using Java API is the only way to go? Thanks Andriy

Re: Failing tests in Ukrainian

2015-01-25 Thread Andriy Rysin
Ahh, that's for debugging unknown compounds, I need to turn it off by default, my bad, will fix it today. Thanks for letting me know Andriy On Jan 25, 2015 6:30 AM, Marcin Miłkowski list-addr...@wp.pl wrote: Andriy, you seem to have failed to include one file in your commits. At least the

Re: [PATCH] Ignoring characters

2015-01-21 Thread Andriy Rysin
the cleaned token is but I am afraid the dirty token reading will affect suggestions etc in the way we don't want. Andriy 2015-01-20 9:58 GMT-05:00 Daniel Naber daniel.na...@languagetool.org: On 2015-01-20 14:29, Andriy Rysin wrote: So in JLanguageToolTest.testAnalyzedSentence() (line 133
