Dear cdk users,

It seems impossible to get anything other than NaN values for
the following descriptors:

Wgamma1.unity = NaN
Wgamma2.unity = NaN
Wgamma3.unity = NaN
WG.unity = NaN

I've tested with the latest CDK version. Rajarshi's CDKDesc GUI
gives similar results with an earlier CDK version.

Below is a small snippet that calculates the WHIM descriptors and
the PSA as a quick test. Running it on a simple SDF file containing
cyclohexane, pyrrole & 3H-indole, I get the problem described above,
plus NaN for all values of the pyrrole molecule.

I don't know whether this is a known issue, a new bug, or my own
ignorance of how to use these descriptors correctly.

Also, I used the CDK to load several million molecules from
existing chemical providers. During the process, several atom types
were not recognized. If Egon or anyone else is interested in viewing
these compounds, just send me an email.

Cheers :)
Vincent.

=========== SNIPPET ==========

import java.io.FileInputStream;
import java.util.ArrayList;
import java.util.List;
import org.openscience.cdk.DefaultChemObjectBuilder;
import org.openscience.cdk.aromaticity.CDKHueckelAromaticityDetector;
import org.openscience.cdk.atomtype.CDKAtomTypeMatcher;
import org.openscience.cdk.graph.ConnectivityChecker;
import org.openscience.cdk.interfaces.IAtom;
import org.openscience.cdk.interfaces.IAtomType;
import org.openscience.cdk.interfaces.IMolecule;
import org.openscience.cdk.interfaces.IMoleculeSet;
import org.openscience.cdk.interfaces.IPseudoAtom;
import org.openscience.cdk.io.iterator.IteratingMDLReader;
import org.openscience.cdk.qsar.DescriptorValue;
import org.openscience.cdk.qsar.descriptors.molecular.TPSADescriptor;
import org.openscience.cdk.qsar.descriptors.molecular.WHIMDescriptor;
import org.openscience.cdk.tools.CDKHydrogenAdder;
import org.openscience.cdk.tools.manipulator.AtomContainerManipulator;
import org.openscience.cdk.tools.manipulator.AtomTypeManipulator;

/**
  *
  * @author vince
  */
public class CdkTest {

     public static void test(String fileName) throws Exception {
         // Open SD file
         FileInputStream ins = new FileInputStream(fileName) ;
         IteratingMDLReader reader = new IteratingMDLReader(ins,
                 DefaultChemObjectBuilder.getInstance());

         // Load all molecules in memory & clean them
         List<IMolecule> mols = new ArrayList<IMolecule>() ;
         IMolecule mol ;

         while (reader.hasNext()) {
             mol  = (IMolecule) reader.next();
             try {
                 mol = cleanMolecule(mol, true, true, false) ;
                 mols.add(mol) ;
             }
             catch(Exception e) {
                 e.printStackTrace();
             }
         }

         reader.close();

         // Calculate descriptor & print results
         for(int i = 0 ; i < mols.size() ; i++) {
             System.out.println("\n=== Molecule "+i+": \n ");
             DescriptorValue dv  = new WHIMDescriptor().calculate(mols.get(i)) ;
             DescriptorValue psa = new TPSADescriptor().calculate(mols.get(i)) ;

             String [] vals = dv.getValue().toString().split(",") ;
             String [] names = dv.getNames() ;

             for(int j = 0 ; j < vals.length ; j++) {
                 System.out.println(names[j]+" = "+vals[j]);
             }

             System.out.println(psa.getNames()[0]+" = "+psa.getValue().toString());

         }
     }

    /**
      * Clean an input molecule, including configuration of atom types and
      * aromaticity detection. Additional actions are available; see the
      * parameters.
      *
      * @param m                 The molecule to clean
      * @param keepLargestFrag   If true, remove any disconnected fragments
      *                          and keep only the largest one
      * @param explicitH         If true, add explicit hydrogens. If false,
      *                          only implicit hydrogens are added
      * @param forceExotic       If true, do not throw an exception for
      *                          unrecognized atom types
      * @return  The cleaned molecule
      * @throws Exception if an atom type is not recognized and forceExotic
      *                   is false, or if a CDK call fails
      */
     public static IMolecule cleanMolecule(IMolecule m,
                                           boolean keepLargestFrag,
                                           boolean explicitH,
                                           boolean forceExotic)
             throws Exception {

         // Check for salts and such, if asked -> simply keep the largest fragment
         if (keepLargestFrag) {
             if (!ConnectivityChecker.isConnected(m)) {
                 IMoleculeSet fragments =
                         ConnectivityChecker.partitionIntoMolecules(m);

                 int maxID = 0;
                 int maxVal = Integer.MIN_VALUE;
                 int atomCount = -1;

                 for (int i = 0; i < fragments.getMoleculeCount(); i++) {
                     atomCount = fragments.getMolecule(i).getAtomCount();
                     if (atomCount > maxVal) {
                         maxID = i;
                         maxVal = atomCount;
                     }
                 }

                 m = fragments.getMolecule(maxID);
             }
         }

         // Configure the molecule atom types & add implicit hydrogens

         // 1. The fastest way (fastest = less code), but we don't control
         // everything, namely exotic atom types

//        AtomContainerManipulator.percieveAtomTypesAndConfigureAtoms(m);
//
//        CDKHydrogenAdder hAdder = CDKHydrogenAdder.getInstance(m.getBuilder());
//        hAdder.addImplicitHydrogens(m);

         // 2. The custom way: more code, but more control on atom typing
         CDKAtomTypeMatcher matcher = CDKAtomTypeMatcher.getInstance(
                 m.getBuilder());
         CDKHydrogenAdder hAdder = CDKHydrogenAdder.getInstance(
                 m.getBuilder());

         // Assign atom types for all atoms
         for (IAtom atom : m.atoms()) {
             if (!(atom instanceof IPseudoAtom)) {
                 IAtomType matched = matcher.findMatchingAtomType(m, atom);
                 if (matched != null) {
                     AtomTypeManipulator.configure(atom, matched);
                     hAdder.addImplicitHydrogens(m, atom);
                 }
                 else {
                     // Here the CDK doesn't know the atom type...
                     if (!forceExotic) {
                         throw new Exception("Unknown atom type "
                                 + atom.getSymbol());
                     }
                 }
             }
         }

         // Detect aromaticity
         CDKHueckelAromaticityDetector.detectAromaticity(m);

         // Add explicit hydrogens, if asked
         if (explicitH) {
             AtomContainerManipulator.convertImplicitToExplicitHydrogens(m);

             // Perceive atom types again so the new hydrogens get atom types
             AtomContainerManipulator.percieveAtomTypesAndConfigureAtoms(m);
         }

         return m;
     }

     public static void main(String [] args) {
         String file = (args.length > 0 && args[0] != null)
                 ? args[0] : "dummy.sdf" ;
         try {
             test(file);
         }
         catch (Exception ex) {
             ex.printStackTrace();
         }
     }
}
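
To make the NaN entries easier to spot in the output, here is a minimal sketch of a helper that lists which descriptor names come back as NaN. It is plain Java and not part of the CDK: the class name NaNReport, the method findNaN and the sample values in main are made up for illustration. It only assumes the same inputs the snippet above already prints, namely the names array and the comma-separated value string split into an array.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a CDK class.
class NaNReport {

    /**
     * Returns the names whose paired value is NaN or is not a number at all.
     * Only the first min(names.length, values.length) pairs are checked.
     */
    static List<String> findNaN(String[] names, String[] values) {
        List<String> bad = new ArrayList<String>();
        int n = Math.min(names.length, values.length);
        for (int i = 0; i < n; i++) {
            try {
                double v = Double.parseDouble(values[i].trim());
                if (Double.isNaN(v)) {
                    bad.add(names[i]);
                }
            } catch (NumberFormatException e) {
                // Unparseable text is reported as well
                bad.add(names[i]);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        // Made-up sample data, standing in for dv.getNames() and
        // dv.getValue().toString().split(",")
        String[] names = {"Wlambda1.unity", "Wgamma1.unity", "WG.unity"};
        String[] vals = "0.5,NaN,NaN".split(",");
        System.out.println("NaN descriptors: " + findNaN(names, vals));
    }
}
```

In the test method above, this could be called as NaNReport.findNaN(names, vals) right after the two arrays are built, once per molecule.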


