Hi all,

The Isomorphism class has an init method:

public void init(IMolecule reactant, IMolecule product, boolean removeHydrogen,
        boolean cleanAndConfigureMolecule) throws CDKException {
    this.removeHydrogen = removeHydrogen;
    init(new MolHandler(reactant, removeHydrogen, cleanAndConfigureMolecule),
            new MolHandler(product, removeHydrogen, cleanAndConfigureMolecule));
}

The molecules I pass into this method have no explicit hydrogens and are 
already configured, so I should be able to set both flags, removeHydrogen and 
cleanAndConfigureMolecule, to false.
The issue is the removeHydrogen flag. Setting it to "false" cripples 
performance compared to "true". However, even with the flag set to "true", UIT 
(UniversalIsomorphismTester) is faster!

MolHandler Constructor:

    public MolHandler(IAtomContainer container, boolean removeHydrogen, boolean cleanMolecule) {
        String molID = container.getID();
        this.removeHydrogen = removeHydrogen;
        this.atomContainer = container;
        if (removeHydrogen) {
            try {
                // <- path taken when removeHydrogen is set to true
                this.atomContainer = ExtAtomContainerManipulator
                        .removeHydrogensExceptSingleAndPreserveAtomID(atomContainer);
            } catch (Exception ex) {
                logger.error(ex);
            }
        } else {
            // <- path taken when removeHydrogen is set to false.
            //    This is pointless IMHO; it should do nothing.
            this.atomContainer =
                    container.getBuilder().newInstance(IAtomContainer.class, atomContainer);
        }

        if (cleanMolecule) {
            try {
                if (!isPseudoAtoms()) {
                    atomContainer = canonLabeler.getCanonicalMolecule(atomContainer);
                }
                // perceive atoms, set valency etc.
                ExtAtomContainerManipulator.percieveAtomTypesAndConfigureAtoms(atomContainer);
                // add implicit hydrogens
                CDKHydrogenAdder adder = CDKHydrogenAdder.getInstance(atomContainer.getBuilder());
                adder.addImplicitHydrogens(atomContainer);
                // figure out which atoms are in aromatic rings
                CDKHueckelAromaticityDetector.detectAromaticity(atomContainer);
            } catch (CDKException ex) {
                logger.error(ex);
            }
        }
        atomContainer.setID(molID);
    }
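
A minimal sketch of the "do nothing" alternative suggested in the comment above. It uses hypothetical stand-in types (ToyMolecule, ToyHandler) rather than the real CDK/SMSD classes: when removeHydrogen is false, the handler keeps the caller's object instead of wrapping it in a new container. Whether the real MolHandler can safely share the caller's container (for example, if later steps modify it) is an assumption that would need checking.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class HandlerSketch {

    /** Hypothetical stand-in for a molecule: just a list of element symbols. */
    static final class ToyMolecule {
        final List<String> symbols;

        ToyMolecule(List<String> symbols) {
            this.symbols = symbols;
        }
    }

    /** Hypothetical stand-in for MolHandler, reduced to the removeHydrogen branch. */
    static final class ToyHandler {
        final ToyMolecule molecule;

        ToyHandler(ToyMolecule input, boolean removeHydrogen) {
            if (removeHydrogen) {
                // Build a new molecule that contains only the non-hydrogen atoms.
                List<String> heavy = new ArrayList<>();
                for (String symbol : input.symbols) {
                    if (!"H".equals(symbol)) {
                        heavy.add(symbol);
                    }
                }
                this.molecule = new ToyMolecule(heavy);
            } else {
                // Nothing to remove: keep the caller's instance, no copy.
                this.molecule = input;
            }
        }
    }

    public static void main(String[] args) {
        ToyMolecule input = new ToyMolecule(Arrays.asList("C", "H", "O"));
        // true: the caller's instance is reused
        System.out.println(new ToyHandler(input, false).molecule == input);
        // 2: the hydrogen was dropped from the copy
        System.out.println(new ToyHandler(input, true).molecule.symbols.size());
    }
}
```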

I tried to determine what is actually done in both code paths.

Setting removeHydrogen to "true":
Basically clones all atoms (except H) and all bonds (except those that were 
connected to H) into a new Molecule and sets the implicit hydrogens. In my case 
this is just a waste of CPU time.

Setting removeHydrogen to "false":
Executes the following line of code:
this.atomContainer = container.getBuilder().newInstance(IAtomContainer.class, atomContainer);

What is the point of this? IMHO it's 100% pointless and a waste of CPU time. 
I'm not sure why this cripples performance, because what it ends up doing is 
this:

public AtomContainer(IAtomContainer container)
    {
        this.atomCount = container.getAtomCount();
        this.bondCount = container.getBondCount();
        this.lonePairCount = container.getLonePairCount();
        this.singleElectronCount = container.getSingleElectronCount();
        this.atoms = new IAtom[this.atomCount];
        this.bonds = new IBond[this.bondCount];
        this.lonePairs = new ILonePair[this.lonePairCount];
        this.singleElectrons = new ISingleElectron[this.singleElectronCount];
        
        stereoElements = new ArrayList<IStereoElement>(atomCount/2);

        for (int f = 0; f < container.getAtomCount(); f++) {
            atoms[f] = container.getAtom(f);
            container.getAtom(f).addListener(this);
        }
        for (int f = 0; f < this.bondCount; f++) {
            bonds[f] = container.getBond(f);
            container.getBond(f).addListener(this);
        }
        for (int f = 0; f < this.lonePairCount; f++) {
            lonePairs[f] = container.getLonePair(f);
            container.getLonePair(f).addListener(this);
        }
        for (int f = 0; f < this.singleElectronCount; f++) {
            singleElectrons[f] = container.getSingleElectron(f);
            container.getSingleElectron(f).addListener(this);
        }
    }

So it also copies the whole molecule into a new AtomContainer: every atom, 
bond, lone pair and single electron reference is copied, and the new container 
is registered as a listener on each of them. I'm not sure why this is so much 
slower, but it is, besides being pointless. The number of hits found is 
identical to setting removeHydrogen to true or using UIT.
I'm not sure why everyone says UIT is much slower. In theory it is, but in my 
case it is not, probably because of the useless work indicated above.
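
One thing the copy constructor above does besides copying references is call addListener(this) on every atom and bond. The following is a minimal sketch of that side effect, using hypothetical stand-in classes (ToyAtom, ToyContainer), not CDK types. It assumes listeners are kept in a plain list per atom; how CDK actually stores and notifies listeners is not shown in the code above, so this only illustrates a possible source of extra work and is not a confirmed diagnosis.

```java
import java.util.ArrayList;
import java.util.List;

public class ListenerSketch {

    /** Hypothetical atom that keeps its listeners in a simple list. */
    static final class ToyAtom {
        final List<Object> listeners = new ArrayList<>();

        void addListener(Object listener) {
            listeners.add(listener);
        }
    }

    /** Hypothetical container mirroring the shape of the copy constructor above. */
    static final class ToyContainer {
        final ToyAtom[] atoms;

        ToyContainer(ToyAtom[] atoms) {
            this.atoms = atoms;
            for (ToyAtom atom : atoms) {
                atom.addListener(this);
            }
        }

        /** Shallow copy: shares the atoms, but registers one more listener on each. */
        ToyContainer(ToyContainer other) {
            this.atoms = new ToyAtom[other.atoms.length];
            for (int i = 0; i < atoms.length; i++) {
                atoms[i] = other.atoms[i];
                atoms[i].addListener(this);
            }
        }
    }

    /** Wraps one container 'wraps' times and returns the listener count on its first atom. */
    static int listenersAfterWraps(int atomCount, int wraps) {
        ToyAtom[] atoms = new ToyAtom[atomCount];
        for (int i = 0; i < atomCount; i++) {
            atoms[i] = new ToyAtom();
        }
        ToyContainer original = new ToyContainer(atoms);
        for (int i = 0; i < wraps; i++) {
            new ToyContainer(original); // stays reachable via the atoms' listener lists
        }
        return atoms[0].listeners.size();
    }

    public static void main(String[] args) {
        // The original container plus five shallow copies: prints 6
        System.out.println(listenersAfterWraps(3, 5));
    }
}
```

In this toy model, every handler built around the same molecule leaves one more listener on each shared atom, so any per-listener work grows with the number of wraps. Passing the container through unchanged would avoid that.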

Any comments? Am I missing something?

Regards,

Thomas


                                          
_______________________________________________
Cdk-user mailing list
Cdk-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/cdk-user
