[Juglist] Text Parsing

pathogen Mon, 08 Dec 2003 11:53:20 -0800

Hi,

I am having a bit of a parsing problem in the chemistry domain. I have
the following rule:


A + B --> C

A: [C:1][N:2][C:3]
B: [Cl:4][C:5]
C: [C:1][N:2]([C:5])[C:3]

The integers are the "mappings" of the various atoms. This rule,
including the product, is entered by the user.
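
As a minimal sketch of how the mappings could be read out of a rule,
the Java below pulls the map numbers from each bracketed atom with a
regular expression and reports which reactant atoms do not appear in
the product. The class and method names (AtomMaps, mapNumbers, removed)
are hypothetical, and the regex assumes every mapped atom has the
simple form [Element:number] used in this message.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AtomMaps {
    // Assumed atom form: [C:1], [Cl:4], ... (element symbol, colon, map number).
    private static final Pattern MAPPED =
            Pattern.compile("\\[([A-Z][a-z]?):(\\d+)\\]");

    /** Returns the map numbers in the order they appear in the pattern. */
    static List<Integer> mapNumbers(String pattern) {
        List<Integer> result = new ArrayList<>();
        Matcher m = MAPPED.matcher(pattern);
        while (m.find()) {
            result.add(Integer.parseInt(m.group(2)));
        }
        return result;
    }

    /** Returns reactant map numbers that are absent from the product. */
    static List<Integer> removed(String product, String... reactants) {
        List<Integer> kept = mapNumbers(product);
        List<Integer> gone = new ArrayList<>();
        for (String reactant : reactants) {
            for (Integer n : mapNumbers(reactant)) {
                if (!kept.contains(n)) {
                    gone.add(n);
                }
            }
        }
        return gone;
    }
}
```

For the rule above, removed("[C:1][N:2]([C:5])[C:3]", "[C:1][N:2][C:3]",
"[Cl:4][C:5]") yields a list containing only 4, i.e. the chlorine.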

The user also enters this as a test case:

A: C[C:1][N:2][C:3]C
B: [Cl:4][C:5]COC 

This should produce:
C: C[C:1][N:2]([C:5]COC)[C:3]C

The user will input the test case in a different notation, however,
called SMILES. A, B, and C then correspond to the following in SMILES,
respectively:

A: CCNCC
B: ClCCOC 
C: CCN(CCOC)CC
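
For these particular strings, going from the mapped notation to the
SMILES form amounts to dropping the brackets and map numbers. A small
sketch of that step follows; the names MappedToSmiles and strip are
hypothetical, and it assumes every bracketed atom looks like
[Element:number]. General SMILES uses brackets for other purposes too
(charges, explicit hydrogens), which this does not handle.

```java
class MappedToSmiles {
    /** Replaces each [Element:number] with just the element symbol. */
    static String strip(String mapped) {
        return mapped.replaceAll("\\[([A-Z][a-z]?):\\d+\\]", "$1");
    }
}
```

With this, strip("C[C:1][N:2]([C:5]COC)[C:3]C") gives CCN(CCOC)CC.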

The goal is to apply the rule to the test case and produce the test
case's product, C. It should be flexible enough that the user can go
back, change any part of the rule or of A and B in the test case, and
have a new product calculated. For a human this is not difficult.
Looking at the product of the rule, we see that the product of the test
case must contain [C:1][N:2] (in that order), followed by a branch to B
(denoted by parentheses), followed in turn by the remainder of A. Notice
that [Cl:4] is removed in the process, while unmapped atoms (those not
in brackets) are carried over. Try as I might, I just cannot figure out
a good way to parse this and make it generic enough to handle multiple
rules and test cases. Any help would be greatly appreciated.
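
One naive way to mechanize the reasoning above, offered only as a
sketch: remember the unmapped text that sits next to each mapped atom
in the reactants, then walk the rule's product and re-emit each mapped
atom together with its remembered text. The names RuleSketch, collect,
and apply are hypothetical. The sketch assumes the test-case reactants
already carry map numbers, as in the mapped test case above, so it
never consults the rule's reactant patterns; finding the match in plain
SMILES would need a real substructure search. It also assumes
substituents can be copied as plain text, which holds for simple chains
like this example but not for rings or arbitrary attachment points.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RuleSketch {
    // Assumed atom form: [C:1], [Cl:4], ... (element symbol, colon, map number).
    private static final Pattern MAPPED =
            Pattern.compile("\\[([A-Z][a-z]?):(\\d+)\\]");

    /** Records the unmapped text before the first mapped atom (lead)
     *  and after every mapped atom (tail), keyed by map number. */
    static void collect(String reactant,
                        Map<Integer, String> lead,
                        Map<Integer, String> tail) {
        Matcher m = MAPPED.matcher(reactant);
        int prevEnd = 0;
        int prevMap = -1;
        while (m.find()) {
            String between = reactant.substring(prevEnd, m.start());
            int map = Integer.parseInt(m.group(2));
            if (prevMap < 0) {
                lead.put(map, between);
            } else {
                tail.put(prevMap, between);
            }
            prevEnd = m.end();
            prevMap = map;
        }
        if (prevMap >= 0) {
            tail.put(prevMap, reactant.substring(prevEnd));
        }
    }

    /** Walks the rule's product and re-attaches the collected text.
     *  Atoms absent from the product (e.g. [Cl:4]) are simply dropped. */
    static String apply(String productTemplate, String... reactants) {
        Map<Integer, String> lead = new HashMap<>();
        Map<Integer, String> tail = new HashMap<>();
        for (String r : reactants) {
            collect(r, lead, tail);
        }
        StringBuilder out = new StringBuilder();
        Matcher m = MAPPED.matcher(productTemplate);
        int prevEnd = 0;
        while (m.find()) {
            // Copy parentheses and anything else between mapped atoms.
            out.append(productTemplate, prevEnd, m.start());
            int map = Integer.parseInt(m.group(2));
            out.append(lead.getOrDefault(map, ""));
            out.append(m.group());
            out.append(tail.getOrDefault(map, ""));
            prevEnd = m.end();
        }
        out.append(productTemplate.substring(prevEnd));
        return out.toString();
    }
}
```

Calling apply("[C:1][N:2]([C:5])[C:3]", "C[C:1][N:2][C:3]C",
"[Cl:4][C:5]COC") returns C[C:1][N:2]([C:5]COC)[C:3]C, the expected
product. Because the rule and the reactants are plain arguments,
changing either one and calling apply again recomputes the product.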

Thanks,
Dev Brown

_______________________________________________
Juglist mailing list
[EMAIL PROTECTED]
http://trijug.org/mailman/listinfo/juglist_trijug.org
