On 09/11/2010 13:53, Andrew Dalke wrote:
> I am also willing to write tests which cover these.

On Nov 10, 2010, at 11:22 AM, Chris Morley wrote:
> Even better!

I spent a lot of time on trains yesterday, and test cases were
the right mental level for me.

Here's what I developed:

   http://dalkescientific.com/test_python.py

They are meant so that people have a way to figure out how
to call different functions, and they are meant to be fast,
so they can be used as real unit tests. (The 66 test cases /
~400 tests) run in about 1.5 seconds on my laptop.)


I don't know how you all want that hooked into the existing test
framework, but I figured it was best to start with test cases.

It also looks like I should get commit privs to OpenBabel.


I noticed a number of minor problems when developing the tests.
These are all marked with an "XXX" in the above code, although
there are a few places marked with XXX which are either TODOs
or places where I didn't understand the result when I wrote it
but I believe I do now.

1) If I do something like

            mol = parse_smiles("CCO")
            mol.SetTitle("#1")
            conv.WriteFile(mol, tempdir("blah.sdf"))

I get the message

    Warning in WriteMolecule No 2D or 3D coordinates exist.
    Any stereochemical information will be lost. To generate
    2D or 3D coordinates use --gen2D or --gen3d.

Since I'm not (directly) calling WriteMolecule and since my
test cases don't support the --gen2D or --gen3d flags, this
warning isn't that helpful in a library function.

2) ob.OBFingerprint.List("fingerprints")

This writes to an output stream, and I can't figure out
how to generate a string-based output stream in Python.
However, this is not that important because there are
other ways to access the same data. It does suggest that
List() is not a useful library function.

3) Why does the fingerprint GetBit require a fingerprinter instance?
That is, I have to go through a bound method

        x = ob.OBFingerprint.FindFingerprint("FP2")
        self.assertEquals(x.GetBit(v, 0), True)

instead of using a static/global function

        self.assertEquals(ob.OBFingerprint.GetBit(v, 0), True)

4) Here's a test case I didn't expect. It seems that GetUMapList
interacts with the results of GetMapList.

    def test_basic_match_behavior_which_I_did_not_expect(self):
        mol = parse_smiles("c1ccccc1O")
        pat = parse_smarts("c1ccccc1")
        pat.Match(mol)
        self.assertEquals(pat.NumMatches(), 12)
        results = pat.GetUMapList()
        self.assertEquals(pat.NumMatches(), 1)
        results = pat.GetMapList()
        # I really expected these to be 12.
        # It appears the UMapList does an in-place trim.
        # XXX Is that the right/expected behavior?
        self.assertEquals(pat.NumMatches(), 1)
        self.assertEquals(len(results), 1)

        pat.Match(mol)
        # Here they are 12
        results = pat.GetMapList()
        self.assertEquals(pat.NumMatches(), 12)
        results = pat.GetUMapList()
        self.assertEquals(pat.NumMatches(), 1)
        self.assertEquals(len(results), 1)


5) The BeginMList/EndMList seems broken in Python. That is, 
I couldn't figure how how to use them.

6) I included tests for iterating over atoms and bonds in
a molecule, and for atoms and bonds connected to an atom. I
expected I could also iterate over the atoms in a bond, but
there is no OBBondAtomIter . While not important, it appeals
to my sense of symmetry and it's very simple to write.

7) Why can't I do mol.AddBond(C, N, 3) where C and N are atoms?
Instead, I need to do mol.AddBond(C.GetIdx(), N.GetIdx(), 3)

8) Why are there two ways to add an atom? These are:
  atom = mol.NewAtom()
    -and-
  atom = OBAtom()
  mol.AddAtom(atom)

Is one of these forms preferred?

9) How does OpenBabel define "implicit hydrogen"? It's different
than my definition, which in

  [NH3].C

is the explicitly stated implicit h-count in [NH3] and the
inferred H-count in C (which is 4). OpenBabel treats the H3
as explicit.

It appears that OpenBabel doesn't have an implicit h-count
field and instead tracks the valence, then calculates the
h-count based on the valence.

10) Shouldn't the molecular formula for "[NH4+]" include
the total formal charge?

11) The formal results for c1ccccc1O.[NH4+] are very strange

Take a look at this test

        self.assertEquals(mol.GetSpacedFormula(),  "C 6 H 10 N  1 O  1 ")
        self.assertEquals(mol.GetSpacedFormula(1), "C 6 H 10 N O ")

Why is there a double space in "N  1" and "O  1"? Why is there a
space after the last "1"? And just how useful is the "include
implicit hydrogens" flag to generate

        self.assertEquals(mol.GetSpacedFormula(0, ' ', 0), "C 6 H 4 N  1 O  1 ")

?

12)  I don't think atom.GetCoordinate() is accessible from Python
except through ctypes (Python's foreign function interface)

13) Why does GetAtom() start from 1 while GetBond() start from 0?
(Perhaps this is in the documentation somewhere, but I was on a
train without access to the internet and Google)

        mol = parse_smiles("C#N")
        C = mol.GetAtom(1)
        N = mol.GetAtom(2)
        # XXX Why do bonds starts from 0 and not 1
        self.assertEquals(mol.GetBond(1), None)

14) The documentation for OBBond.GetNbrAtom needs to warn that if
the atom isn't one of the bond atoms then the GetBeginAtom() will
be returned. (This documentation for GetNbrAtomIdx already
has a warning to that effect.)

15) In an OBRing, if there are multiple non-C atoms, which one
is the root? And since ring.GetRootAtom() returns an integer,
shouldn't it be named GetRootAtomIdx()?

16) Why does GetAngle() return degrees instead of radians? I see
the warning to that effect, but it was rather unexpected to see.

17) If you have something like "CNOS" then C.IsConnected(C) is
true. I did not expect that. This is because IsConnected checks
to see if the target is any of the atoms of either side of the
bonds that are connected to C. And of course, C is always one
of the atoms connected to a bond of C.

This has a downstream consequence in that

        C.IsOneFour(C) == True

If I read the code right, the C in "CN" will also be in a
1-4 relationship with itself.

18) There are some minor problems with the spectrophoretest.cpp
test01. (I didn't review all of the tests.)

In

   OB_REQUIRE( (r[13] >  3.470) && (r[13] <  3.472) );
   OB_REQUIRE( (r[14] >  6.698) && (r[14] <  6.800) );
   OB_REQUIRE( (r[15] >  9.485) && (r[15] <  9.487) );

the line with r[14] should have 6.700 as the upper bound.

and
   OB_REQUIRE( (r[18] >  4.899) && (r[18] <  4.902) );
should be slightly smaller in the upper bound
   OB_REQUIRE( (r[18] >  4.899) && (r[18] <  4.901) );


19) I see that MMFF94's ValidateGradients() dumps output
to stdout. The other ValidateGradients I tested (gaff and
ghemical) do not do that, and the output probably isn't
appropriate for a library.

Then again, the API docs say that it's meant for debugging.
Should the debugging API be published for general use?

20) OBAtom.HtoMethtyl() can also dump code to stderr

  v3: < -0.28, 0, 0 > v4: < -0.28, 0, 0 >

This is quite clearly a bit of debugging code which
wasn't removed:

    cerr << "v3: " << v3 << " v4: " << v4 << endl;


Cheers!


                                Andrew
                                da...@dalkescientific.com



------------------------------------------------------------------------------
Centralized Desktop Delivery: Dell and VMware Reference Architecture
Simplifying enterprise desktop deployment and management using
Dell EqualLogic storage and VMware View: A highly scalable, end-to-end
client virtualization framework. Read more!
http://p.sf.net/sfu/dell-eql-dev2dev
_______________________________________________
OpenBabel-discuss mailing list
OpenBabel-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openbabel-discuss

Reply via email to