Re: readable POS tags

2014-05-07 Thread Daniel Naber
On 2014-04-08 14:44, Daniel Naber wrote: I have now added a branch (readable-pos-tags) for this, simply because the changes are getting so complex. It's still incomplete and buggy. As you may have noticed, I did some work in this branch. You can see it at https://github.com/languagetool-org

Re: readable POS tags

2014-05-07 Thread Daniel Naber
On 2014-05-07 19:07, Marcin Miłkowski wrote: unification. I still don't get why German doesn't use it for disambiguation, for example. Maybe because nobody has seen an urgent need for that yet. I don't work that much on the German rules, but I'm generally okay with the way they work.

XSD and namespaces (was: readable POS tags)

2014-04-10 Thread Marcin Miłkowski
: https://github.com/languagetool-org/languagetool/blob/readable-pos-tags/languagetool-language-modules/en/src/main/java/org/languagetool/tagging/en/EnglishTagger.java Also, have you used namespaces on attributes in XSD? If so, could you provide a small example on how to write the XSD so that token

Re: readable POS tags

2014-04-08 Thread Marcin Miłkowski
On 2014-04-07 23:01, Daniel Naber wrote: On 2014-03-25 09:35, Daniel Naber wrote: I've written an overview of how we could use readable POS tags in LT: http://wiki.languagetool.org/readable-part-of-speech-tags I'm writing a prototypical implementation on this for English now

Re: readable POS tags

2014-04-08 Thread Daniel Naber
think the only thing needed is to parse the tags again, if they are different. Although I'm not sure if I understood what you meant, I have now added a branch (readable-pos-tags) for this, simply because the changes are getting so complex. It's still incomplete and buggy. Here's the basic idea

Re: readable POS tags

2014-04-07 Thread Daniel Naber
On 2014-03-25 09:35, Daniel Naber wrote: I've written an overview of how we could use readable POS tags in LT: http://wiki.languagetool.org/readable-part-of-speech-tags I'm writing a prototypical implementation on this for English now. But there's one point where I'm stuck. Maybe I'm

Re: readable POS tags

2014-03-26 Thread Daniel Naber
On 2014-03-25 21:59, Dominique Pellé wrote: power compared to using regexp. Power users know regexp well as they are used in many programs so they don't have to learn something new. Power users also like the conciseness of regexp. As you said, the old way of matching will still be there for

Re: readable POS tags

2014-03-26 Thread Dave Pawson
On 26 March 2014 10:49, Daniel Naber daniel.na...@languagetool.org wrote: On 2014-03-25 21:59, Dominique Pellé wrote: power compared to using regexp. Power users know regexp well as they are used in many programs so they don't have to learn something new. Power users also like the

Re: readable POS tags

2014-03-26 Thread Daniel Naber
On 2014-03-25 14:24, Marcin Miłkowski wrote: So instead of just adding the POS tag we get from Morfologik to our AnalyzedToken object as a string, we interpret it and store something like pos = preposition, case = accusative. Is that what you mean? Exactly. Any ideas on how the VBP tag
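
To make the idea in this message concrete, here is a minimal sketch of interpreting a raw tag string into separate features instead of storing it as one opaque string. Everything in it is an assumption for illustration: the class ReadableTagSketch, the Features holder, the enum values, and the colon-separated "prep:acc" input format are hypothetical and are not taken from LanguageTool's AnalyzedToken or from the readable-pos-tags branch.

```java
import java.util.Locale;

// Hypothetical sketch: none of these names come from LanguageTool itself.
public class ReadableTagSketch {

    enum Pos { PREPOSITION, NOUN, VERB, UNKNOWN }

    enum Case { NOMINATIVE, ACCUSATIVE, NONE }

    // Plain class members hold the interpreted values; no map is involved.
    static final class Features {
        final Pos pos;
        final Case grammaticalCase;

        Features(Pos pos, Case grammaticalCase) {
            this.pos = pos;
            this.grammaticalCase = grammaticalCase;
        }

        @Override
        public String toString() {
            return "pos = " + pos.name().toLowerCase(Locale.ROOT)
                + ", case = " + grammaticalCase.name().toLowerCase(Locale.ROOT);
        }
    }

    // Assumes a colon-separated raw tag such as "prep:acc" (an illustrative format).
    static Features interpret(String rawTag) {
        String[] parts = rawTag.split(":");
        Pos pos = toPos(parts[0]);
        Case grammaticalCase = parts.length > 1 ? toCase(parts[1]) : Case.NONE;
        return new Features(pos, grammaticalCase);
    }

    static Pos toPos(String s) {
        switch (s) {
            case "prep": return Pos.PREPOSITION;
            case "subst": return Pos.NOUN;
            case "verb": return Pos.VERB;
            default: return Pos.UNKNOWN;
        }
    }

    static Case toCase(String s) {
        switch (s) {
            case "nom": return Case.NOMINATIVE;
            case "acc": return Case.ACCUSATIVE;
            default: return Case.NONE;
        }
    }

    public static void main(String[] args) {
        // prints "pos = preposition, case = accusative"
        System.out.println(interpret("prep:acc"));
    }
}
```

With the features split out like this, a rule could test the case of a token without writing a regular expression over the whole tag string, which seems to be the motivation discussed in the thread.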

RE: readable POS tags

2014-03-26 Thread Mike Unwalla
I agree that backward compatibility is important. Without backward compatibility, the proposed change means that the content of disambiguation files and grammar files must be changed. That is a huge task. Even if you develop a utility that lets people convert files to the new format, there

Re: readable POS tags

2014-03-26 Thread Marcin Miłkowski
is not that tags are cryptic and short; it is that they do not make features easily available separately. My use case for readable pos tags is also speed and simplicity for unification (rules that use agreement between words). It is simply faster to specify features by citing appropriate

Re: readable POS tags

2014-03-26 Thread Marcin Miłkowski
On 2014-03-26 15:20, Mike Unwalla wrote: I agree that backward compatibility is important. Without backward compatibility, the proposed change means that the content of disambiguation files and grammar files must be changed. That is a huge task. Even if you develop a utility that lets

Re: readable POS tags

2014-03-26 Thread Daniel Naber
On 2014-03-26 17:49, Marcin Miłkowski wrote: No, that would be horrible, as this is not an improvement. The problem is not that tags are cryptic and short; That's also a problem, but not so much for power users, and for everybody else we will be able to solve that in the user interface (i.e.

readable POS tags

2014-03-25 Thread Daniel Naber
Hi, I've written an overview of how we could use readable POS tags in LT: http://wiki.languagetool.org/readable-part-of-speech-tags The core part however - what these new POS tags actually look like - is still missing. Any input on the overview and ideas about that core part is welcome

Re: readable POS tags

2014-03-25 Thread Dave Pawson
On 25 March 2014 08:35, Daniel Naber daniel.na...@languagetool.org wrote: Hi, I've written an overview of how we could use readable POS tags in LT: http://wiki.languagetool.org/readable-part-of-speech-tags The core part however - what these new POS tags actually look like - is still

Re: readable POS tags

2014-03-25 Thread Daniel Naber
On 2014-03-25 11:07, Marcin Miłkowski wrote: For all I can see, no HashMaps are required at all, just a consistent way of understanding the values in class members. So instead of just adding the POS tag we get from Morfologik to our AnalyzedToken object as a string, we interpret it and store

Re: readable POS tags

2014-03-25 Thread Marcin Miłkowski
On 2014-03-25 13:29, Daniel Naber wrote: On 2014-03-25 11:07, Marcin Miłkowski wrote: For all I can see, no HashMaps are required at all, just a consistent way of understanding the values in class members. So instead of just adding the POS tag we get from Morfologik to our AnalyzedToken

Re: readable POS tags

2014-03-25 Thread Dominique Pellé
Daniel Naber wrote: Hi, I've written an overview of how we could use readable POS tags in LT: http://wiki.languagetool.org/readable-part-of-speech-tags The core part however - what these new POS tags actually look like - is still missing. Any input on the overview and ideas about