Hi Fran,

I guess that by a disambiguated corpus you mean a corpus in which only
the morphological readings are disambiguated and no syntactic
structure is marked (like with CG-style sugar).  But generating barrier
sets from syntax trees (and in general from any syntactically marked
corpus) is a good idea! The research I have seen so far on this topic
has used only morphological tags and n-gram methods for induction,
with no syntactic information.
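
To make the idea a bit more concrete, here is a rough sketch of how
barrier candidates might be collected from a treebank. It is only an
illustration under my own assumptions: the toy sentence format (a list
of (tag, head index) pairs) and the function names intervening_tags and
barrier_candidates are made up for this example, and treating "tags
never seen between a dependent and its head" as barrier candidates is
just one possible heuristic, not an established method.

```python
from collections import defaultdict

# Hypothetical toy format: a sentence is a list of (tag, head) pairs,
# where head is the 0-based index of the token's head, or -1 for the root.

def intervening_tags(sentences):
    """Map (dep_tag, head_tag, direction) -> tags seen between the two tokens."""
    seen = defaultdict(set)
    for sent in sentences:
        for i, (tag, head) in enumerate(sent):
            if head < 0 or abs(head - i) < 2:
                continue  # root or adjacent tokens: nothing in between
            lo, hi = sorted((i, head))
            # direction says on which side of the dependent the head lies
            direction = "left" if head < i else "right"
            key = (tag, sent[head][0], direction)
            seen[key].update(t for t, _ in sent[lo + 1:hi])
    return seen

def barrier_candidates(sentences):
    """Tags that never intervene between a dependent and its head.

    Assumption: such tags are plausible BARRIER candidates for a rule
    that scans from the dependent towards the head.
    """
    all_tags = {t for sent in sentences for t, _ in sent}
    return {
        key: all_tags - between - {key[0], key[1]}
        for key, between in intervening_tags(sentences).items()
    }
```

For a toy sentence tagged det adj n v, where det and adj both attach
to n, the only long-distance pair is det -> n with adj in between, so
the sketch proposes v as the barrier candidate for that pair.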

Now that you mention this topic, I remembered that we should decide
whether we want to use the rules only for morphological disambiguation
or also for marking syntactic structure. Morphological disambiguation
seems reasonable as the first objective, and rules for syntactic
marking could be added later.

Joonas

On 10/20/16, Francis Tyers <[email protected]> wrote:
> On 2016-10-20 14:29, Joonas Kylmälä wrote:
>> Hi peeps!
>>
>> Have you read "Inducing constraint grammars" [1]? That's a really
>> good¹ way to get into the world of Constraint Grammar rule induction!
>> Well, that wasn't my original intent with this email. Instead, I'm
>> looking for people who would like to transfer these theories and
>> experiments that have been made in the area of CG rule induction to
>> something concrete and usable. Anyone want to join in? I'm not yet
>> an expert in this field, but I'm slowly starting to grasp the ideas
>> behind CG rule induction. So some help would be appreciated.
>>
>> I would like the system to be designed with the idea in mind that
>> the induced rules can be edited manually later, e.g., the generated
>> rules should be generalized (= the number of rules reduced) as much
>> as possible and have some sort of semantic label attached to them.
>> An AI (or some other method) generating comments/labels for rules
>> might be needed for this. Having the system generate all the other
>> bits (delimiters, ...) of the constraint files would also be cool,
>> but maybe not the starting point for creating this system.
>>
>> I would like to have the project under a free software license,
>> preferably GPLv3+.
>>
>
> Hey Joonas! Yes, this certainly looks interesting. In particular I would
> be interested in learning the rules not just from a disambiguated corpus
> but also from a treebank, where you could learn long-distance rules
> by looking at the tree structure and use the intervening matter to
> define the barrier sets.
>
> Fran
>
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, SlashDot.org! http://sdm.link/slashdot
> _______________________________________________
> Apertium-stuff mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
>
