Re: [Apertium-stuff] Lint Checker Ideas for GSOC

Jimmy O'Regan Mon, 26 Mar 2012 14:57:42 -0700

On 26 March 2012 22:17, Aaron Rubin <[email protected]> wrote:
> I've adjusted the plan quite a bit - it now gives more time to transfer
> rules and checks for a few other problems that I thought might come up. How
> does it look?
>
> Weeks 1-5, .dix files:
>
> Week 1: Redundant Entry Finder
> Week 2: Testing Full Entries in Lemmas where Part of the Lemma is Specified
> by the Pardef; Testing Misspelled Tags and Pardefs
> Week 3: Testing Incompatible Tags; Testing Tag Missing on One Side of
> Translation Equivalents (in bilingual dictionaries)
> Week 4: Testing Missing Gender on Gendered Languages (in bilingual
> dictionaries)


Running such a tool on the es-ca dictionary would report ~15000 false
positives; on en-es, ~1200. It would be nice to be able to add
something like <!-- no_gender_check --> to a dictionary and have the
gender check exit without reporting anything.
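
A rough sketch of how such an opt-out could work, in Python. The
marker comment is only the suggestion above, not an existing Apertium
convention. The function name, the set of gender symbols, and the rule
"a noun on the left side must carry a gender tag" are assumptions made
up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical opt-out marker; not something the Apertium tools recognise today.
OPT_OUT = re.compile(r"<!--\s*no_gender_check\s*-->")

# Assumed set of gender symbols; a real checker would read these from <sdefs>.
GENDER_TAGS = {"m", "f", "mf", "nt"}


def missing_gender(dix_text):
    """Return left-side noun lemmas with no gender tag, or [] if opted out."""
    # Look for the marker in the raw text, because ElementTree drops
    # comments by default when parsing.
    if OPT_OUT.search(dix_text):
        return []
    root = ET.fromstring(dix_text)
    problems = []
    for side in root.iter("l"):
        tags = {s.get("n") for s in side.iter("s")}
        if "n" in tags and not tags & GENDER_TAGS:
            problems.append((side.text or "").strip())
    return problems
```

With the marker anywhere in the file, the check returns nothing;
without it, every untagged noun is reported.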

> Week 5: Bundling features together in one program; re-organizing code, and
> writing documentation, to make sure that everything is as neat and
> maintainable as possible. Combining tests from previous weeks into a single
> testing program so that all features can be tested at once when the code is
> modified in the future.
> Weeks 6-12, transfer rules:
> Week 6:  Checking inappropriate uses of <equal>, <begins-with>, <ends-with>,
> and <let> in transfer rules (equating a tag with a non-empty string literal,
> etc.)

Using <let> to assign an empty string to a tag is an appropriate use;
be sure to take that into account.
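
To make the exception concrete, here is a minimal Python sketch of a
<let> check that flags a non-empty <lit> assigned to a tag clip but
lets the empty string through. The function name and the list of
non-tag parts are assumptions for illustration; a real checker would
take the tag attributes from the file's <section-def-attrs>.

```python
import xml.etree.ElementTree as ET

# Clip parts that hold strings rather than tags (assumed list).
NON_TAG_PARTS = {"lem", "lemh", "lemq", "whole"}


def suspicious_lets(t1x_fragment):
    """Return (part, value) pairs where a tag clip gets a non-empty <lit>."""
    root = ET.fromstring(t1x_fragment)
    found = []
    for let in root.iter("let"):
        children = list(let)
        if len(children) != 2:
            continue  # malformed <let>; leave that to the validator
        target, value = children
        if target.tag != "clip" or target.get("part") in NON_TAG_PARTS:
            continue
        # <lit v=""/> clears the tag and is fine; only non-empty
        # string literals are reported.
        if value.tag == "lit" and value.get("v", "") != "":
            found.append((target.get("part"), value.get("v")))
    return found
```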

> Week 7: Checking for cases where the user asks for nonexistent tags with
> lit-tag v="some_tag" (always an error) or for a string literal with lit
> v="some_string" that is identical to a tag (suspicious and very likely an
> error).

The latter seems dubious. It seems a reasonable thing to do in
<concat> at least.
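
One way to keep that check without flagging <concat>, sketched in
Python: walk the rule tree and ignore any <lit> that sits inside a
<concat>. The function name and the known_tags parameter are
hypothetical, and whether such a warning is worth having at all is
still open.

```python
import xml.etree.ElementTree as ET


def lits_matching_tags(fragment, known_tags):
    """Return <lit> values equal to a known tag, skipping those in <concat>."""
    root = ET.fromstring(fragment)
    hits = []

    def walk(elem, in_concat):
        if elem.tag == "lit" and not in_concat and elem.get("v") in known_tags:
            hits.append(elem.get("v"))
        for child in elem:
            # Everything below a <concat> counts as "inside concat".
            walk(child, in_concat or elem.tag == "concat")

    walk(root, False)
    return hits
```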

> Week 8: Checking for undefined tags after attr-item in attribute
> definitions, probably due to spelling errors. Checking for calls to anything
> other than a defined attribute, lem, lemh, lemq, whole, or tags after part=
> in a clip.
> Week 9: Checking for patterns that refer to non-existent categories,
> probably due to spelling errors.

Already caught by the validator, though a more descriptive error
message might be helpful.

> Checking for misspelled variables.
> Week 10: Checking for an untagged chunk (ex., in the rule "HACE NUM NOM" in
> apertium-en-es.en-es.t1x, forgetting to give the resulting chunk the tag
> "adverb," which seems like a conceivable mistake to me). Checking for
> incorrect number of arguments in calls to macro.

Also already caught by the validator.

> Week 11: Checking for missing <test> after <when> and for non-boolean
> arguments to <test>, <and>, <not>, and <or> (unless the compiler already
> checks for that sort of thing?).

Again, also caught by the validator.

-- 
<Sefam> Are any of the mentors around?
<jimregan> yes, they're the ones trolling you

_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
