Re: [Rdkit-discuss] regexp

Greg Landrum Thu, 19 Nov 2009 12:28:52 -0800

Hi,

On Thu, Nov 19, 2009 at 3:41 PM, bouille <[email protected]> wrote:
>
> The error comes from |c:9|, |c:1,t:5|, |c:5,19|
> I probably have to strip these off with a regexp
>
> bad molecule Fc1ccccc1N1N=C(CC1c1ccccc1Cl)c1ccc(Br)cc1 |c:9|
> bad molecule NC1=C(C#N)C2=CCCCC2C2(C(=O)Nc3ccccc23)C1(C#N)C#N |c:1,t:5|
> bad molecule CCOC(=O)C1=C(C)N(c2ccc(OC)cc2)C(C)=C(C1c1ccc(Cl)cc1)C(=O)OCC |c:5,19|


Yes, that would be a problem. The SMILES for the first molecule is
"Fc1ccccc1N1N=C(CC1c1ccccc1Cl)c1ccc(Br)cc1". The parts between the "|"
characters, like "|c:9|", are ChemAxon extensions to SMILES. The RDKit
doesn't recognize these.

So your approach of using a regexp to remove the extra characters is
exactly correct.
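
For what it's worth, here is a minimal sketch of that kind of cleanup in
Python. It assumes the extension always shows up as a single "|...|" block
at the end of the line, as in your three examples. The names
strip_cx_extension and _CX_EXTENSION are just made up for illustration;
they are not part of the RDKit.

```python
import re

# Assumption: the ChemAxon extension is one trailing "|...|" block,
# optionally preceded and followed by whitespace.
_CX_EXTENSION = re.compile(r"\s*\|[^|]*\|\s*$")


def strip_cx_extension(line):
    """Return the line with any trailing |...| block removed."""
    return _CX_EXTENSION.sub("", line).strip()


# The three inputs from the original report.
examples = [
    "Fc1ccccc1N1N=C(CC1c1ccccc1Cl)c1ccc(Br)cc1 |c:9|",
    "NC1=C(C#N)C2=CCCCC2C2(C(=O)Nc3ccccc23)C1(C#N)C#N |c:1,t:5|",
    "CCOC(=O)C1=C(C)N(c2ccc(OC)cc2)C(C)=C(C1c1ccc(Cl)cc1)C(=O)OCC |c:5,19|",
]

cleaned = [strip_cx_extension(s) for s in examples]
for smi in cleaned:
    print(smi)
```

The cleaned strings can then be handed to Chem.MolFromSmiles() as usual.
Lines without a "|...|" block pass through unchanged.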

Best Regards,
-greg

