Re: [BlueObelisk-discuss] [BlueObelisk-SMILES] help needed -- SMILES and MMFF94 atom types

2012-04-28 Thread Robert Hanson
OK, SMARTS-based MMFF94 charges are now checked in for Jmol.

Obviously the settings are not perfect and will take some tweaking. I'm
sure it's more complicated than I make it out to be -- probably some very
odd definitions in there; certainly some I could not fathom on this first
pass. I haven't taken a full look at OpenBabel's definitions, but I did use
the algorithm in OBForceFieldMMFF94::SetPartialCharges() as my basis, even
though some of that makes almost no sense to me. For example:

  switch (type1) {
  case 32:
    // 32  OXYGEN IN CARBOXYLATE ANION
    // 32  NITRATE ANION OXYGEN
    // 32  SINGLE TERMINAL OXYGEN ON TETRACOORD SULFUR
    // 32  TERMINAL O-S IN SULFONES AND SULFONAMIDES
    // 32  TERMINAL O IN SULFONATES
  case 35:
    // 35  OXIDE OXYGEN ON SP2 CARBON, NEGATIVELY CHARGED
  case 72:
    // 72  TERMINAL SULFUR BONDED TO PHOSPHORUS
    factor = 0.5f;
    break;
  case 62:
    // 62  DEPROTONATED SULFONAMIDE N-; FORMAL CHARGE=-1
  case 76:
    // 76  NEGATIVELY CHARGED N IN, E.G, TRI- OR TETRAZOLE ANION
    factor = 0.25f;
    break;
  }

What the heck is that? Convince me that is reasonable! (Or are these not
the MMFF94 type numbers in MMFF-I_AppendixB.ascii?)
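
For readers less used to C++ switch fall-through: the excerpt above gives
atom types 32, 35 and 72 one factor and types 62 and 76 another. The sketch
below only restates that mapping as a lookup table. The function name is
invented here, and the 0.0 default for every other type is an assumption,
since the excerpt does not show a default branch.

```python
# Restates the quoted switch as a lookup table (types and factors are
# copied from the excerpt; nothing else is taken from OpenBabel).
FACTOR_BY_TYPE = {
    32: 0.5,   # carboxylate / nitrate / terminal O on S
    35: 0.5,   # oxide oxygen on sp2 carbon, negatively charged
    72: 0.5,   # terminal sulfur bonded to phosphorus
    62: 0.25,  # deprotonated sulfonamide N-
    76: 0.25,  # negatively charged N in, e.g., tri- or tetrazole anion
}


def formal_charge_factor(mmff_type):
    """Return the factor for an MMFF94 numeric atom type."""
    # ASSUMPTION: types not listed in the excerpt get 0.0.
    return FACTOR_BY_TYPE.get(mmff_type, 0.0)


if __name__ == "__main__":
    for mmff_type in (32, 35, 72, 62, 76, 1):
        print(mmff_type, formal_charge_factor(mmff_type))
```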

Jmol's aromatic definition is structural, not electronic, so that certainly
would be one point of difference.

Ideally it would be verified, but for a lot of purposes at least for now
I'll be satisfied with an approximation.

I can get MMFF94 partial charges far more easily from PubChem than I can
from OpenBabel, I think. All I have to do is

load :x

then save the partial charges in an array and compare them with calculated
values:

load :methanol
A = {*}.partialcharge.all
calculate partialCharge
{*}.partialcharge = A.sub({*}.partialcharge.all)
label %[partialcharge]
print {*}.partialcharge.all.stddev

This will display the differences and print the standard deviation. I must
be doing something right, because I do get this:

$ load :methanol
...
0.0

$ load :acetone
...
8.1649824E-4

$ load :ethyl acetate
...
3.9223014E-4

$ load :benzene
...
0.0

$ load :pyridine
...
0.0

But I'm sure there are plenty that are problematic.
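
The comparison above (saved charges minus recalculated charges, then the
standard deviation of the differences) can be sketched outside Jmol as
follows. The function names and the sample numbers are invented for
illustration, and using the sample standard deviation is an assumption;
the thread does not say which convention Jmol's stddev uses.

```python
import statistics


def charge_differences(reference, calculated):
    """Per-atom difference: reference charge minus calculated charge."""
    if len(reference) != len(calculated):
        raise ValueError("charge lists must have the same length")
    return [ref - calc for ref, calc in zip(reference, calculated)]


def stddev_of_differences(reference, calculated):
    """Standard deviation of the per-atom differences."""
    # ASSUMPTION: sample standard deviation (needs at least two atoms).
    return statistics.stdev(charge_differences(reference, calculated))


if __name__ == "__main__":
    # Hypothetical charges, not taken from PubChem or Jmol.
    reference = [0.28, -0.68, 0.40, 0.0, 0.0, 0.0]
    calculated = [0.28, -0.68, 0.40, 0.0, 0.0, 0.0]
    print(stddev_of_differences(reference, calculated))
```

Identical lists give a standard deviation of zero, which is the situation
reported above for methanol, benzene and pyridine.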

Caffeine must assign different atom types --

load :caffeine
A = {*}.partialcharge.all
calculate partialCharge
{*}.partialcharge = A.sub({*}.partialcharge.all)
label %[partialcharge]
print {*}.partialcharge.all.stddev
0.10437195

But that's what I expect at this stage. Can't expect everything to be
perfect in a day!

Can I get OpenBabel to report what MMFF94 atom types it is assigning?

Bob


On Fri, Apr 27, 2012 at 8:00 PM, Geoffrey Hutchison geo...@pitt.edu wrote:



  Geoff, I will look at that. What does fully validated mean exactly?

 There's an MMFF94 validation set:
 http://ccl.net/cca/data/MMFF94/

 This includes 761 structures, plus energies, etc. But one reason I bring
 this up is that you can use Babel to generate partial charges (e.g., for
 mol2 files):
 echo 'c(s1)ccc1Cl' | babel -ismi --gen3d --partialcharge mmff94 -omol2

 In short, feel free to use Babel to generate a pile of MMFF94 charges for
 testing. IIRC, the structures in the test set also have MMFF94 charges.

  I have the SMARTS business for MMFF94 charges working in Jmol now for
 getting the atom types -- obviously not validated! -- and I suspect it will
 require a bit of hand-crafting.

 We'd be interested in testing SMARTS versus the hand-crafted rules in Open
 Babel.

 Hope that helps,
 -Geoff
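
A minimal sketch of pulling the per-atom charges back out of mol2 text such
as the Babel command above writes. It assumes the usual Tripos mol2 layout,
in which the charge is the ninth whitespace-separated field of each ATOM
record. The function name is invented, and the sample record is made up for
illustration; it is not Babel output.

```python
def charges_from_mol2(text):
    """Return (atom_name, charge) pairs from the ATOM section of mol2 text."""
    charges = []
    in_atoms = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("@<TRIPOS>"):
            # Only lines between the ATOM header and the next header count.
            in_atoms = stripped == "@<TRIPOS>ATOM"
            continue
        if in_atoms and stripped:
            fields = stripped.split()
            # ASSUMPTION: id, name, x, y, z, type, subst_id, subst_name, charge
            if len(fields) >= 9:
                charges.append((fields[1], float(fields[8])))
    return charges


# Made-up sample with placeholder charges, for illustration only.
SAMPLE_MOL2 = """@<TRIPOS>MOLECULE
example
 3 2 0 0 0
SMALL
USER_CHARGES

@<TRIPOS>ATOM
      1 O1          0.0000    0.0000    0.0000 O.3     1  LIG1       -0.5000
      2 H1          0.9600    0.0000    0.0000 H       1  LIG1        0.2500
      3 H2         -0.2400    0.9300    0.0000 H       1  LIG1        0.2500
@<TRIPOS>BOND
     1     1     2    1
     2     1     3    1
"""

if __name__ == "__main__":
    for name, charge in charges_from_mol2(SAMPLE_MOL2):
        print(name, charge)
```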




-- 
Robert M. Hanson
Professor of Chemistry
St. Olaf College
1520 St. Olaf Ave.
Northfield, MN 55057
http://www.stolaf.edu/people/hansonr
phone: 507-786-3107


If nature does not answer first what we want,
it is better to take what answer we get.

-- Josiah Willard Gibbs, Lecture XXX, Monday, February 5, 1900
--
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and 
threat landscape has changed and how IT managers can respond. Discussions 
will include endpoint security, mobile security and the latest in malware 
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/___
Blueobelisk-discuss mailing list
Blueobelisk-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/blueobelisk-discuss


Re: [BlueObelisk-discuss] [BlueObelisk-SMILES] help needed -- SMILES and MMFF94 atom types

2012-04-28 Thread Robert Hanson
I'm not understanding the MMFF94 assumptions about formal charges. For
example, I have acetate (deprotonated acetic acid):

CC(=O)[O-1]

I believe both of these oxygens are MMFF94 type 32 (O, CARBOXYLATE ANION).
Is that correct? If so, how does MMFF94 handle the two different formal
charges? I'm not seeing that...
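
One possible reading, offered as an assumption and not as an answer given in
this message: if both oxygens get the same type because they are treated as
equivalent, the net formal charge could be shared equally between them
before any further charge model is applied. A minimal sketch of that
averaging, with an invented function name:

```python
def share_formal_charge(formal_charges):
    """Spread the net formal charge equally over a group of equivalent atoms."""
    # ASSUMPTION: equal sharing among atoms treated as equivalent; this is
    # an illustration, not a statement of what MMFF94 prescribes.
    if not formal_charges:
        return []
    shared = sum(formal_charges) / len(formal_charges)
    return [shared] * len(formal_charges)


if __name__ == "__main__":
    # The two oxygens of CC(=O)[O-1] as written: one neutral, one with -1.
    print(share_formal_charge([0, -1]))  # [-0.5, -0.5]
```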








Re: [BlueObelisk-discuss] [BlueObelisk-SMILES] help needed -- SMILES and MMFF94 atom types

2012-04-28 Thread Kyle Lutz
On Sat, Apr 28, 2012 at 10:38 PM, Robert Hanson hans...@stolaf.edu wrote:
 Thanks, Kyle. My idea is to replace that very explicit atom typer with a
 SMARTS analysis, and I will check carefully to see that it agrees.

 One thing I don't understand is this: Where is the original MMFF94 code? Is
 that available? My question is whether this code is an interpretation of
 that or an exact duplication of that.

I have never seen the original code and I'm pretty sure that it's not
publicly available. I based my implementation on the (seven?) MMFF
papers and the MMFF94 validation suite which Geoff previously linked.
The MMFF94_opti.log gives a lot of useful output showing atom types,
charges and energies for the calculations. Based on that I created an
XML file [1] with the expected values which I use to test my
implementation.
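
The testing approach described above (expected atom types and charges taken
from the validation suite, then compared with what an implementation
produces) can be sketched as follows. The record layout, the names, the
tolerance and the sample values are all invented for illustration; they are
not the format of the chemkit file.

```python
def compare_to_expected(expected, computed, charge_tol=1e-3):
    """Return (atom_index, field, expected, computed) for every mismatch."""
    if len(expected) != len(computed):
        raise ValueError("atom counts differ")
    mismatches = []
    for index, (exp, got) in enumerate(zip(expected, computed)):
        if exp["type"] != got["type"]:
            mismatches.append((index, "type", exp["type"], got["type"]))
        # ASSUMPTION: charges agree if they differ by at most charge_tol.
        if abs(exp["charge"] - got["charge"]) > charge_tol:
            mismatches.append((index, "charge", exp["charge"], got["charge"]))
    return mismatches


if __name__ == "__main__":
    # Hypothetical records, not taken from the validation suite.
    expected = [{"type": 1, "charge": 0.25}, {"type": 6, "charge": -0.5}]
    computed = [{"type": 1, "charge": 0.25}, {"type": 7, "charge": -0.5}]
    for mismatch in compare_to_expected(expected, computed):
        print(mismatch)
```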

So, for example, I think this is the original description of atom type 82:

   N5AX   82  N-OXIDE NITROGEN IN 5-RING ALPHA POSITION
   N5BX   82  N-OXIDE NITROGEN IN 5-RING BETA POSITION
   N5OX   82  N-OXIDE NITROGEN IN GENERAL 5-RING POSITION

 For which I am clueless as to what N5AX and N5BX could possibly mean.

 I think your code describes N5OX as $([nD3r5] -- an SP2 N-oxide with
 aromatic N in a 5-membered ring. But that doesn't fit the description. First
 of all, I don't think you can have an N-oxide in an aromatic 5-membered
 ring. You could in a 6-membered ring, but not in a five. What am I missing
 there?

 Not that I trust the description. Surely the REAL description is the
 original MMFF94 code. So rather than look at an interpretation of that, I'd
 like to see the original.

As would I have when I was writing chemkit's implementation :-).

 Maybe you can just point me to a molecule that does have an atom type of 82.

There are seven molecules containing an atom with type #82 in the
validation suite file. Their names are: DICPUA, DICRAI, FENCOQ,
FUPJUV, FUPKOQ, GAVKOD and KEPKIZ.

 Charges: I think what you are saying is that I should totally ignore what a
 MOL file has for formal charges and first create a set of starting point
 charges.

Yes. However, some of the types for metal atoms (e.g., copper and iron)
require their partial charges, which are read from the file.

 I think I should be able to do that with SMARTS instead -- applying
 the charge model as necessary on the fly. Certainly looks like everything is
 there that I'll need. So thanks for that. Can you give me a hint where the
 value of a formal charge is coming from in this?


 chemkit::Real q0 = typer->formalCharge(atom);

It is coming from the formal charge assigned during atom typing. In
the MmffAtomTyper class the setType() method has an optional third
parameter for formal charge which gets stored when the atom type is
assigned. The formalCharge() method just returns the formal charge
that was assigned.
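
The mechanism described above can be sketched in a few lines. This is a
Python analogue written for illustration, not chemkit's code: the class and
method names are invented, the 0.0 default for the optional formal charge is
an assumption, and the type and charge values in the usage lines are
placeholders.

```python
class AtomTyperSketch:
    """Stores a type and a formal charge per atom, as described above."""

    def __init__(self):
        self._types = {}
        self._formal_charges = {}

    def set_type(self, atom, mmff_type, formal_charge=0.0):
        # The formal charge is recorded at the moment the type is assigned.
        # ASSUMPTION: it defaults to 0.0 when not given.
        self._types[atom] = mmff_type
        self._formal_charges[atom] = formal_charge

    def atom_type(self, atom):
        return self._types[atom]

    def formal_charge(self, atom):
        # Simply returns what set_type() stored.
        return self._formal_charges[atom]


if __name__ == "__main__":
    typer = AtomTyperSketch()
    typer.set_type(0, 1)          # placeholder type, default formal charge
    typer.set_type(1, 32, -0.5)   # placeholder type and formal charge
    q0 = typer.formal_charge(1)
    print(q0)
```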

Hope this helps.

Cheers,
Kyle

[1] https://github.com/OpenChemistry/chemkit/blob/master/tests/auto/plugins/mmff/mmff94.expected



Re: [BlueObelisk-discuss] [BlueObelisk-SMILES] help needed -- SMILES and MMFF94 atom types

2012-04-27 Thread Robert Hanson
[switching to BlueObelisk-Discuss from BlueObelisk-SMILES here]

Geoff, I will look at that. What does fully validated mean exactly?

I have the SMARTS business for MMFF94 charges working in Jmol now for
getting the atom types -- obviously not validated! -- and I suspect it will
require a bit of hand-crafting.

Egon, I'm using your CDK interpretation of how to handle MMFF94 bci and
pbci. That looks relatively simple.

Still, I would like to know more details. I'll look at that code, Geoff,
and see what I can see about some of those groups. Thanks.

Others with insight into MMFF94 charge calculation?

Bob


On Thu, Apr 26, 2012 at 8:11 PM, Geoff Hutchison ge...@geoffhutchison.net wrote:

  Today I decided I wanted Jmol to be able to generate MEP mapped van der
 Waals surfaces, and the way I see to do that is to use the MMFF94
 algorithms. Especially because I should be able to easily validate with
 PubChem, because their structures are delivered with MMFF94 charges.

 Open Babel has a fully validated implementation of MMFF94 and charges,
 thanks to Tim Vandermeersch. I think ChemKit also has a full MMFF94
 implementation, although you'd have to ask Kyle if he validated it.

 The MMFF94 atom types are quite opaque, and I'm not sure they can all be
 done via SMARTS. Tim hand-coded the aromaticity model and typing in
 src/forcefields/forcefieldmmff94.cpp.

 -Geoff



