On Friday 10 November 2006 09:25, Takayuki KOTANI wrote:
> I am using cdk-20060714.jar I have some requests for cdk improvement.

Hi Takayuki,

Sorry for not having replied earlier; I am at a conference right now
(google: goslar chemoinformatics GCC).

> 1) RuleOfFiveDescriptor uses WeightDescriptor for calculation of molecular
> weight. However it does not give IUPAC official masses published in Pure
> Appl. For example, WeightDescriptor uses 1.00782504 for H mass, but IUPAC
> official mass is 1.00794.
> I think that using getCanonicalMass() at Class MFAnalyser instead of
> WeightDescriptor is a good solution.

In the end all these algorithms should give the same output for:
- natural mass
- exact mass of one isotope

using the Blue Obelisk Data Repository (BODR).
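
To make the two kinds of mass concrete, here is a minimal sketch in plain Java
with no CDK calls. The class name, the method name and the hard-coded values
are only for illustration; the numbers are commonly tabulated ones that I typed
in by hand, not values read from BODR, so treat them as assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MassKindsSketch {

    // Illustrative natural (abundance-weighted) atomic masses; NOT read from BODR.
    static final Map<String, Double> NATURAL =
        Map.of("C", 12.0107, "H", 1.00794, "N", 14.0067, "O", 15.9994);

    // Illustrative exact masses of the most abundant isotope; NOT read from BODR.
    static final Map<String, Double> MAJOR_ISOTOPE =
        Map.of("C", 12.0, "H", 1.00782504, "N", 14.003074, "O", 15.994915);

    /** Sums count * mass over a formula, using whichever mass table is passed in. */
    static double mass(Map<String, Integer> formula, Map<String, Double> table) {
        double sum = 0.0;
        for (Map.Entry<String, Integer> e : formula.entrySet()) {
            Double m = table.get(e.getKey());
            if (m == null) {
                throw new IllegalArgumentException("No mass for " + e.getKey());
            }
            sum += e.getValue() * m;
        }
        return sum;
    }

    public static void main(String[] args) {
        // C12H13NO, the formula from the example further down in this mail.
        Map<String, Integer> formula = new LinkedHashMap<>();
        formula.put("C", 12);
        formula.put("H", 13);
        formula.put("N", 1);
        formula.put("O", 1);

        System.out.printf("natural mass: %.2f%n", mass(formula, NATURAL));
        System.out.printf("exact mass:   %.2f%n", mass(formula, MAJOR_ISOTOPE));
    }
}
```

With these numbers the natural mass comes out at about 187.24 and the exact
mass at about 187.10. The point is only that 1.00794 and 1.00782504 for H
belong to two different tables, and every method should be clear about which
one it uses.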

I haven't had time to look into it yet, so I don't know exactly which method
is using which data. Nor do I know whether the given IUPAC mass is correct.

Takayuki, if you like, you could download BODR 4 from the CDK download
webpage and compare the isotopic masses and natural abundances given by IUPAC
with those in BODR. BODR uses (mostly) primary literature and should therefore
have high-quality data; that is why the project was set up.

> 2) Both WeightDescriptor and getCanonicalMass gave sometimes one or more H
> atom less molecular weight.
> For example, getCanonicalMass of "CN(CC2=CC=CO2)C1=CC=CC=C1" gives 186.23,
> It should be 187.23 (C12H13NO).
> Since analyseAtomContainer at the same class of getCanonicalMass gives the
> correct formula,

I will need to look at this too; do I understand correctly that when parsing
that SMILES, the molecular formula is one hydrogen short?

> 397:                  mass += ac.getAtom(f).getHydrogenCount() *
> getCanonicalMass(h);
>
> could not count exact Hydrogen number.

That looks fine, though I am not sure, at this time, what getCanonicalMass()
precisely does... Explicit hydrogens should not be counted here, as they are
taken care of by the loop over the atoms itself.

> How about calculation of getCanonicalMass using the same algorithms with
> analyseAtomContainer?

Yes, in the end they should use the same algorithm and the BODR data. I will 
look into this.

> getMass() and getNaturalMass() are also including the same problems.

The problem of having one hydrogen short, or ...?

Egon

-- 
http://chem-bla-ics.blogspot.com/

_______________________________________________
Cdk-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/cdk-user
