Message from Daniel Swanson <awesomeevildu...@gmail.com> on Tue, 21 Dec 2021 at 7:57:
> Hi Greg,
>
> The file where you want to write rules for this is
> https://github.com/apertium/apertium-pol/blob/master/apertium-pol.pol.rlx
>
> If you want something like "tacy is <det> before <n>", you could get that
> with
>
> SELECT DET IF (0 DET) (0 NOUN) (1 NOUN) ;
>

The problem with this rule is that (1 NOUN) does not guarantee that the next word is a noun, only that it can still be analysed as a noun at the moment the rule is executed. Similarly, the word at position 0 may correctly be something else, such as an adjective, and SELECT DET would discard that reading too. So a more cautious rule could be, for instance:

    REMOVE NOUN IF (0 DET) (0 NOUN) (1C NOUN) ;

Here (1C NOUN) matches only when every remaining analysis of the next word is a noun, and REMOVE drops just the noun reading of the current word.

The problem with this variant is that it matches less often than the first one. It may not solve every case that Daniel's version solves, although it probably makes fewer wrong decisions. Your knowledge of the language, and testing on a corpus, should help you decide which is better, or maybe you will choose something in between.

Tuning can be done by adding a few rules for frequent words or cases before the general one.

Hèctor

>
> Daniel
>
> On Mon, Dec 20, 2021 at 1:40 PM Grzegorz Kulik <gregorykku...@gmail.com>
> wrote:
> >
> > Hello all,
> >
> > I haven't contacted you for some time, I hope you are all well. I
> > developed the pol-szl pair and although the translation is quite
> > reasonable, I decided to make it better by improving the lexical
> > selection. I've been reading the documentation and managed to write
> > several rules for forms that need disambiguation and are the same part
> > of speech. However, I cannot find any information anywhere about what
> > to do if there is a form that can mean two completely different things.
> > Example in Polish:
> >
> > tacy (such) = taki<det><dem><mp><pl><nom>
> > tacy (of a tablet) =
> > taca<n><f><sg><gen>/taca<n><f><sg><dat>/taca<n><f><sg><loc>
> >
> > The first meaning is obviously much more frequent but the translator
> > chooses the second one, which is less than desirable.
> >
> > What can I do to remedy this? Can I write rules for that manually?
> > Should I train the tagger? If so, what method would be the best? There
> > are multiple training methods and I don't know which one to choose for
> > my pair. Could you recommend the best approach?
> >
> > Thank you in advance
> > Greg
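
To make the difference between the two rules in this thread concrete, here is a small Python sketch that imitates them on toy data. It is not how CG-3 is implemented and it is not part of Apertium. The function names, the tag sets, and the extra "adj" and "vblex" readings are assumptions made up for illustration only.

```python
from typing import List, Set

# A cohort is modelled as the list of readings still available for one word;
# each reading is just a set of tags (lemmas are left out for brevity).
Cohort = List[Set[str]]


def some_reading_has(cohort: Cohort, tag: str) -> bool:
    """Plain context test such as (1 NOUN): at least one reading matches."""
    return any(tag in reading for reading in cohort)


def every_reading_has(cohort: Cohort, tag: str) -> bool:
    """Careful context test such as (1C NOUN): all remaining readings match."""
    return bool(cohort) and all(tag in reading for reading in cohort)


def select_det(cur: Cohort, nxt: Cohort) -> Cohort:
    """Imitates: SELECT DET IF (0 DET) (0 NOUN) (1 NOUN) ;"""
    if (some_reading_has(cur, "det") and some_reading_has(cur, "n")
            and some_reading_has(nxt, "n")):
        return [r for r in cur if "det" in r]  # keep only DET readings
    return cur


def remove_noun(cur: Cohort, nxt: Cohort) -> Cohort:
    """Imitates: REMOVE NOUN IF (0 DET) (0 NOUN) (1C NOUN) ;"""
    if (some_reading_has(cur, "det") and some_reading_has(cur, "n")
            and every_reading_has(nxt, "n")):
        return [r for r in cur if "n" not in r]  # drop only NOUN readings
    return cur


# Toy analyses (hypothetical, simplified from the tags quoted in the thread).
tacy = [{"det", "dem"}, {"n", "f", "sg", "gen"}]
tacy_with_adj = tacy + [{"adj"}]   # assumed extra reading, for illustration
sure_noun = [{"n"}]                # next word is unambiguously a noun
maybe_noun = [{"n"}, {"vblex"}]    # next word could be a noun or a verb

# The SELECT rule fires even when the next word is only possibly a noun.
print(select_det(tacy, maybe_noun))
# The cautious REMOVE rule leaves the ambiguity alone in that case.
print(remove_noun(tacy, maybe_noun))
# With a certain noun, REMOVE keeps non-noun readings that SELECT would drop.
print(select_det(tacy_with_adj, sure_noun))
print(remove_noun(tacy_with_adj, sure_noun))
```

Running both functions over real analyses from a corpus, rather than toy cohorts like these, is one way to see how often each variant fires before committing to one of them.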
_______________________________________________ Apertium-stuff mailing list Apertium-stuff@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/apertium-stuff