On 2015-06-29 at 23:18, KARIN PIŠKUR wrote:
> Hello!
>
> I'm working on my diploma thesis and I'd like to improve LanguageTool
> for the Slovenian language and prepare some new rules in Java. For that
> I need to include a POS dictionary in LanguageTool.
>
> To begin with (and for the needs of my diploma) it would be good enough
> to include the open source POS tagger available on this web site:
> http://eng.slovenscina.eu/tehnologije/oznacevalnik
>
> but I don't know how to include it.
>
> Can you please help me with how to do it?

I'm afraid you would have to port it from C# to Java.

Alternatively, you could take a minimal route and use only the morphosyntactic lexicon and lemmatiser, which seem to be available as well:
http://www.slovenscina.eu/sloleks/opis

I cannot find a download link, but if there is one, it should be fairly easy to write an XSLT template that converts the dictionary to the format we use. See here for more documentation:
http://wiki.languagetool.org/developing-a-tagger-dictionary

Of course, you'd miss disambiguation then. Given that the POS tagger is really very accurate (I cannot really believe the numbers, they're too good to be true ;)), porting the C# code is probably the best solution. C# is close to Java, so it should be doable.

Regards,
Marcin

_______________________________________________
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel
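
To make the dictionary-conversion route from the reply more concrete, here is a minimal sketch of the idea. It uses Python's ElementTree rather than XSLT, purely for brevity. The XML layout (`lexicon`, `entry`, `form`, and the `lemma` and `tag` attributes), the sample words, and the tag values are all hypothetical and are not the real Sloleks schema. The output is assumed to be one tab-separated line per word form (inflected form, lemma, POS tag); the wiki page linked above should be checked for the exact format expected.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified lexicon. The real Sloleks files will use
# different element names and a different tag set.
SAMPLE = """<lexicon>
  <entry lemma="miza">
    <form tag="NOUN:sg:nom">miza</form>
    <form tag="NOUN:sg:gen">mize</form>
  </entry>
</lexicon>"""


def lexicon_to_tagger_lines(xml_text):
    """Return one 'form<TAB>lemma<TAB>tag' line per word form."""
    root = ET.fromstring(xml_text)
    lines = []
    for entry in root.iter("entry"):
        lemma = entry.get("lemma")
        for form in entry.iter("form"):
            word = form.text.strip()
            lines.append(f"{word}\t{lemma}\t{form.get('tag')}")
    return lines


if __name__ == "__main__":
    # Print the converted entries, ready to be written to a text file.
    for line in lexicon_to_tagger_lines(SAMPLE):
        print(line)
```

A file produced this way would only provide tag lookup; as the reply notes, it does not give you the disambiguation that the full POS tagger performs.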