Gazs,

wordBreaker is what you're looking for. Unfortunately, strongly case-marked
languages like Hungarian raise a number of issues, and writing a custom
wordBreaker to "de-affix" the arguments is only the first of them. There's
an explanation here:

http://mitcho.com/blog/projects/in-case-of-case/
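
To make the "de-affix" idea concrete, here is a rough standalone sketch of
what such a function might do. Everything in it is an assumption for
illustration: the name breakHungarianWords, the short suffix list, and the
minimum stem length are hypothetical, and it is not wired into Ubiquity's
actual parser interface. It also ignores assimilation entirely.

```typescript
// Illustrative subset of Hungarian case suffixes (an assumption, not a
// complete list; vowel-harmony variants are listed explicitly).
const CASE_SUFFIXES: string[] = [
  "nak", "nek", // dative
  "ban", "ben", // inessive
  "val", "vel", // instrumental
  "hoz", "hez", "höz", // allative
];

// Hypothetical guard: require at least this many characters before the
// suffix, so very short words that merely look like a suffix stay intact.
const MIN_STEM_LENGTH = 2;

// Split a trailing case suffix off each whitespace-separated word by
// inserting a space before it, e.g. "Gazsnak" becomes "Gazs nak".
function breakHungarianWords(input: string): string {
  return input
    .split(/\s+/)
    .filter((word) => word.length > 0)
    .map((word) => {
      for (const suffix of CASE_SUFFIXES) {
        const stemLength = word.length - suffix.length;
        if (stemLength < MIN_STEM_LENGTH) {
          continue;
        }
        // Compare the word's ending case-insensitively against the suffix.
        if (word.slice(stemLength).toLowerCase() === suffix) {
          return word.slice(0, stemLength) + " " + word.slice(stemLength);
        }
      }
      return word;
    })
    .join(" ");
}
```

With this sketch, breakHungarianWords("Gazsnak") gives "Gazs nak", while a
bare "ban" is left alone because of the stem-length guard. That guard is
only a crude way to be less merciless than chopping everything that looks
like a suffix; a real parser would need something smarter.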

mitcho

> Hello,
> 
> I'm toying around with creating a Hungarian language parser for
> Ubiquity, but I have a big problem: how can I tell Ubiquity that not
> only is Hungarian left-branching, but the suffixes (which show the
> roles) are also glued to the ends of the words (and sometimes
> assimilate as well, but that's a later problem)?
> 
> There are two ways I thought I could make it work with Ubiquity. The
> wordBreaker function from the Japanese parser seems unfortunately too
> rigid (it mercilessly chops off everything that looks like a suffix).
> The other function that seemed like it could work was the
> normalizeArgument found in romance language parsers, but I couldn't
> make it work. Would this be what I'm looking for?
> 
> Thanks for any help,
> Gazs
> 
> --
> 
> You received this message because you are subscribed to the Google Groups 
> "ubiquity-firefox" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to 
> [email protected].
> For more options, visit this group at 
> http://groups.google.com/group/ubiquity-firefox?hl=en.
> 
> 

--
mitcho (Michael 芳貴 Erlewine)
[email protected]
http://mitcho.com/
linguist, coder, teacher
