Re: PT (PRE and POS)
On 2014-04-02 23:21, Marco A.G.Pinto wrote:
> Where can I find the "github tracker"?

https://github.com/languagetool-org/languagetool/issues?state=open

--
___
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel
Re: XML element and attribute statistics
A common XInclude processor is xmllint, part of Daniel Veillard's libxml2 software. http://xmlsoft.org/xmllint.html

$ xmllint -o outputFile --xinclude inputFile

HTH

On 2 April 2014 18:29, Andriy Rysin wrote:
> When I was splitting grammar.xml file I actually spent almost a day trying to use xml include features to include component grammar files, I must say I was not able to make it work properly in all scenarios: filesystem/jar, for tests/released version.

--
Dave Pawson
XSLT XSL-FO FAQ. Docbook FAQ.
http://www.dpawson.co.uk
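For readers without xmllint at hand, the effect of XInclude expansion can be sketched with Python's standard library. This is only an illustration of the mechanism, not LanguageTool's loading code: the file name `common.xml`, the element names, and the in-memory loader are all assumptions made up for the example.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical included file, kept in memory so the sketch is self-contained.
FILES = {
    "common.xml": '<unification feature="gender"/>',
}

# Hypothetical main file that pulls in common.xml via XInclude.
MAIN = (
    '<rules xmlns:xi="http://www.w3.org/2001/XInclude">'
    '<xi:include href="common.xml"/>'
    '<rule id="DEMO"/>'
    '</rules>'
)


def loader(href, parse, encoding=None):
    """Resolve an href against the in-memory FILES table instead of the disk."""
    if parse == "xml":
        return ET.fromstring(FILES[href])
    return FILES[href]


def expand(xml_text):
    """Parse xml_text and replace every xi:include with the included content."""
    root = ET.fromstring(xml_text)
    ElementInclude.include(root, loader=loader)
    return root


root = expand(MAIN)
children = [child.tag for child in root]
print(children)
```

Swapping the loader is also where the filesystem-versus-jar question shows up: whatever resolves the `href` has to work for both kinds of location.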
Re: XML element and attribute statistics
On 04/02/2014 04:44 PM, Daniel Naber wrote:
> On 2014-04-02 19:29, Andriy Rysin wrote:
>> When I was splitting grammar.xml file I actually spent almost a day trying to use xml include features to include component grammar files, I must say I was not able to make it work properly in all scenarios:
> I guess you tried this one? http://wiki.languagetool.org/tips-and-tricks#toc2
> If that doesn't work, there's no other approach I know of.

Yes, that's what I tried. I could not make the URL work for both filesystem and jar. I even saw some differences in how LT code and xmllint include files (the simple include that worked for xmllint didn't work in LT), so I abandoned that path.

>> If we can't do that can we consider loading all files together similarly to how it's done in production code?
> Mhh, I can't see us doing anything special in production code. All files are handled separately. Are you really 100% sure that these rules actually worked? Or did they maybe work by chance, e.g. because the [...] wasn't actually needed for the examples you tried?

Yes, I can confirm that one of the rules (rulegroup id "SAMYI") works correctly in 2.5 and takes unification into account. It looks like PatternRuleTest.validatePatternFile() checks the XML files one at a time: it loads one, validates it, and goes on to the next. JLanguageTool.activateDefaultPatternRules(), by contrast, loads them all into memory, which (if I understand correctly) keeps the first file, grammar.xml (which contains the common parts), already loaded and parsed while the rest are loaded and parsed.

I guess we have two ways to go from here:
- adjust the tests to load the files and keep them (I am not sure how easy that is; it depends on how flexible our XMLValidator is), or
- change our getRuleFileNames() API to require those files to be independent (which may not be very efficient if every rule file has to load and parse the same common parts, like unification etc.)

Regards,
Andriy
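The difference between checking files one at a time and keeping earlier files loaded can be illustrated with a small sketch. It does not reproduce PatternRuleTest or JLanguageTool; the `<define id="...">` and `<use ref="...">` elements are hypothetical stand-ins for a shared definition and a reference to it, and the check itself is a simplification of schema ID/IDREF validation.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified rule files. Real grammar files look different.
FILES = [
    ("grammar.xml", '<rules><define id="gender"/><use ref="gender"/></rules>'),
    ("grammar-style.xml", '<rules><use ref="gender"/></rules>'),
]


def unresolved_refs(xml_text, known_ids):
    """Return refs that match neither a local id nor an already known id."""
    root = ET.fromstring(xml_text)
    local_ids = {el.get("id") for el in root.iter("define")}
    known_ids |= local_ids  # in-place update: later files can see these ids
    return [el.get("ref") for el in root.iter("use")
            if el.get("ref") not in known_ids]


def check_one_at_a_time(files):
    """Every file starts from an empty set of definitions."""
    return {name: unresolved_refs(text, set()) for name, text in files}


def check_with_shared_state(files):
    """Definitions from earlier files stay visible to later files."""
    known = set()
    return {name: unresolved_refs(text, known) for name, text in files}


isolated = check_one_at_a_time(FILES)
shared = check_with_shared_state(FILES)
print(isolated)
print(shared)
```

In the isolated run the second file reports `gender` as unresolved, while the shared run resolves it, provided the file holding the definition comes first in the list.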
Re: PT (PRE and POS)
Hi Daniel,

Where can I find the "github tracker"?

Thanks!

Kind regards,
Marco A.G.Pinto

On 02/04/2014 22:10, Daniel Naber wrote:
> could you please create an issue for this in the github tracker, describing the requirements?
Re: PT (PRE and POS)
On 2014-04-02 16:11, Marco A.G.Pinto wrote:

Hi Marco,

> I guess I can start working on the Portuguese, pre-agreement and post-agreement.

could you please create an issue for this in the github tracker, describing the requirements? This way everything is kept in one place and we don't have to search through the mailing list archives.

> I am sharing the dictionary files taken from Minho University, on my Dropbox:

We will also need the original source URL so we can document where we got it from.

Regards
Daniel
Maven vs. Gradle
Hi,

Gradle is a build system, similar to Maven. I noticed that there's a "gradle init" command which automatically turns your Maven project into a Gradle project. As our Maven tests are quite slow, I've given it a try to see if Gradle is faster. The conversion didn't work 100%, but the issues could be worked around (see below). Here are the numbers for running the tests with Gradle:

gradle clean -> 0:08 (i.e. 8 seconds)
gradle test -> 5:47
gradle test -> 0:14 (well, nothing has changed)
now change the German grammar.xml
gradle test -> 2:30
now change a Java file in languagetool-wikipedia
gradle test -> 0:45

You can see here that Gradle actually considers the dependencies, i.e. a change in a module will run all the module's tests and all the tests of the modules that depend on it. As a comparison, "mvn clean test" takes about 5 minutes on my computer.

Conclusion? It's probably not worth switching to Gradle, as the full test build is even a bit slower than with Maven and one rarely needs to run a full test. If anybody here has an issue with Maven and the tests being slow, please see http://wiki.languagetool.org/maven-tips to make sure you use all the tricks that keep test times down.

If someone actually wants to try, here are the things you need to do after "gradle init"'s incomplete conversion:
http://stackoverflow.com/questions/5144325/gradle-test-dependency
http://stackoverflow.com/questions/7459755/how-can-i-make-gradle-include-ftl-files-in-war-file

Regards
Daniel
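The dependency behaviour described above (a change in a module reruns that module's tests plus the tests of every module that depends on it) amounts to a reverse-dependency walk. The sketch below illustrates that idea only; it is not how Gradle is implemented, and apart from languagetool-wikipedia the module names and the dependency graph are invented for the example.

```python
from collections import deque

# Hypothetical module graph: module -> modules it depends on.
# Only "languagetool-wikipedia" is named in the message above.
DEPENDS_ON = {
    "core": [],
    "language-de": ["core"],
    "languagetool-wikipedia": ["core", "language-de"],
    "standalone": ["core", "language-de"],
}


def modules_to_retest(changed, depends_on):
    """Return the changed module plus all direct and transitive dependents."""
    # Invert the graph: module -> modules that depend on it.
    dependents = {module: set() for module in depends_on}
    for module, deps in depends_on.items():
        for dep in deps:
            dependents[dep].add(module)

    # Breadth-first walk over the inverted graph.
    seen = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents[current]:
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


print(sorted(modules_to_retest("languagetool-wikipedia", DEPENDS_ON)))
print(sorted(modules_to_retest("language-de", DEPENDS_ON)))
```

A module near the leaves of the graph triggers few reruns and one near the root triggers many, which is consistent with the shorter run after the languagetool-wikipedia change than after the grammar.xml change.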
Re: XML element and attribute statistics
On 2014-04-02 19:29, Andriy Rysin wrote:
> When I was splitting grammar.xml file I actually spent almost a day trying to use xml include features to include component grammar files, I must say I was not able to make it work properly in all scenarios:

I guess you tried this one? http://wiki.languagetool.org/tips-and-tricks#toc2
If that doesn't work, there's no other approach I know of.

> If we can't do that can we consider loading all files together similarly to how it's done in production code?

Mhh, I can't see us doing anything special in production code. All files are handled separately. Are you really 100% sure that these rules actually worked? Or did they maybe work by chance, e.g. because the [...] wasn't actually needed for the examples you tried?

Regards
Daniel
Re: XML element and attribute statistics
When I was splitting the grammar.xml file I actually spent almost a day trying to use XML include features to include component grammar files. I must say I was not able to make it work properly in all scenarios: filesystem/jar, for tests/released version. It could be I just didn't use the right approach, so if somebody can point me to how it can be done while keeping LT working seamlessly in all scenarios, I would really appreciate it.

If we can't do that, can we consider loading all files together, similarly to how it's done in production code?

Thanks
Andriy

2014-04-02 11:13 GMT-04:00 Daniel Naber:
> The test checks one file after the other, so any definition in grammar.xml won't be visible in the other grammar*.xml files. Maybe those definitions could be moved to their own file and then be included via some XML feature?
Re: XML element and attribute statistics
On 2 April 2014 16:13, Daniel Naber wrote:
> On 2014-04-02 16:42, Andriy Rysin wrote:
>> Provided those rules work in 2.5, do you think we just didn't include grammar.xml before testing grammar-style.xml in our tests?
> The test checks one file after the other, so any definition in grammar.xml won't be visible in the other grammar*.xml files. Maybe those definitions could be moved to their own file and then be included via some XML feature?

Preferably XInclude over entities, if your preferred parser supports it?

regards
DaveP
Re: XML element and attribute statistics
On 2014-04-02 16:42, Andriy Rysin wrote:
> Provided those rules work in 2.5, do you think we just didn't include grammar.xml before testing grammar-style.xml in our tests?

The test checks one file after the other, so any definition in grammar.xml won't be visible in the other grammar*.xml files. Maybe those definitions could be moved to their own file and then be included via some XML feature?

Regards
Daniel
Re: XML element and attribute statistics
Thanks Daniel!

I can't figure out what's wrong with those tests you commented out, though. The error is this:

cvc-id.1: There is no ID/IDREF binding for IDREF 'gender'. Problem found at line 484, column 9.

but the gender is properly defined in grammar.xml: [...]

Provided those rules work in 2.5, do you think we just didn't include grammar.xml before testing grammar-style.xml in our tests? grammar.xml should be returned first in the list of filenames from Ukrainian.getRuleFileNames().

Thanks
Andriy

2014-04-01 16:11 GMT-04:00 Daniel Naber:
> On 2014-04-01 05:00, Andriy Rysin wrote:
>> Oops, my bad, but the interesting thing is that the tests pass on this file and the rule actually works in the final release...
> This is fixed now I think. I commented out the Ukrainian rules that would have made the tests fail now.
Re: prototype of new rule editor
On 2014-03-17 17:51, Daniel Naber wrote:
> there's now a prototype of a new rule editor available at http://community.languagetool.org/ruleEditor2/. Main features are:

I have released another update. Major new features:
- "Parse existing XML" link to get an existing XML rule into the editor. This doesn't support everything, but at least it should tell you which element is not supported in those cases.
- Attributes of tokens and exceptions can now be set, even if the editor doesn't know about them ('skip' is an example)
- Small user interface improvements

Regards
Daniel
PT (PRE and POS)
Hello!

I guess I can start working on the Portuguese, pre-agreement and post-agreement. It should appear in the combo box as:
Portuguese -> PT-PRE
Portuguese -> PT-POS

I am sharing the dictionary files taken from Minho University, on my Dropbox:
PT-PRE: https://dl.dropboxusercontent.com/u/30674540/oo4x-pt-PT-preao-14.4.1.1.oxt.zip
PT-POS: https://dl.dropboxusercontent.com/u/30674540/oo4x-pt-PT-posao-14.1.1.1.oxt.zip
They are both dated from yesterday.

Could someone also create the .txt for the compound words post-agreement? I looked in the supermarket and they do have a post-agreement dictionary, but it is "2013" and I can wait a couple of months or so for the "2014" to be released... meanwhile I can use the Priberam site to get some compound words post-agreement. Microsoft Office 2010 uses Priberam.

The grammar.xml works for both, so no need to create another file.

Thanks!

Kind regards,
Marco A.G.Pinto