Re: [Rdkit-discuss] rdDeprotect & DeprotectData

2023-08-21 Thread James Davidson
Hi Katrina,

I must confess I haven't actually used rdDeprotect before (I have always 
created reactions and called the RunReactants() method)...
I just tried your use case, and I think it is working as you would like (I 
can't immediately see what is wrong in the original code you posted).
Here is a gist showing it (I hope): 
https://gist.github.com/jepdavidson/ec1664a8bfa8b921262fc844c0e523e4
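
One thing that may be worth checking in the original code, offered as a guess rather than a confirmed diagnosis: the line `newDeprotect = rdDeprotect.DeprotectDataVect().append(bdata)` keeps the return value of `append()`, and in-place `append()` methods in Python conventionally return `None`. If the RDKit vector wrapper follows that convention (an assumption, not verified here), `newDeprotect` would be `None` and the custom deprotection would never reach `Deprotect()`. The sketch below shows the pitfall with a plain Python list standing in for the vector; the string entry is a hypothetical stand-in for a `DeprotectData` object.

```python
# A plain list stands in for rdDeprotect.DeprotectDataVect (assumption: the
# RDKit container's append() also mutates in place and returns None).
bdata = "boron deprotection"  # hypothetical stand-in for a DeprotectData object

# Pitfall: the name is bound to the return value of append(), not the container.
chained = [].append(bdata)

# Safer: create the container first, then append to it.
deprotections = []
deprotections.append(bdata)

print(chained)        # None
print(deprotections)  # ['boron deprotection']
```

If this is the cause, building the container on one line and appending on the next (or passing a list via `deprotections=[bdata]`, as in the commented-out line of the original post) should avoid it.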

Kind regards

James

From: Katrina Lexa 
Sent: 21 August 2023 14:58
To: James Davidson 
Cc: RDKit Discuss 
Subject: Re: [Rdkit-discuss] rdDeprotect & DeprotectData

Hi James,

Thanks for the quick reply!

You're quite right, I'm simply interested in the virtual reaction to remove the 
boronates. Thank you for fixing my incorrect mapping. At some point, I had had 
the aryl carbon properly specified, but I clearly lost my way with it along my 
quest.

Sadly, the reaction_smarts = "[c:1][B;R0](O)O>>[*:1]" still does not remove any 
of the boronates from my input smiles, but it sounds like everything else about 
the specification of the reaction is correct, so I'll get there at some point 
with the right reaction_smarts.

Thanks again,

Katrina

On Mon, Aug 21, 2023 at 3:26 AM James Davidson <j.david...@vernalis.com> wrote:
Hi Katrina,

I'm slightly unsure what "deprotection" you are trying to represent, but I 
think there are a couple of problems with the rsmarts...

  reaction_smarts = "[c;H1]([B;R0](O)[O;R0:1])>>[c;H1]"

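This is looking for an aromatic carbon that carries one hydrogen AND is also 
bonded to a non-ring boron.  An aromatic ring carbon that still has its 
hydrogen has no valence left for a substituent, so this pattern will never be 
found!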
Also, you have a mapped atom on the reactant side, but no mapped atoms on the 
product side.

If your reaction is aiming to hydrolyse non-cyclic boronic esters (and return 
the alcohols), then you should map the oxygen atom on the product side as well 
- something like:

  reaction_smarts = "c[B;R0](O)[O:1]>>[*:1]"

If, instead, you are interested in the virtual reaction that removes boronates 
from aryl R-groups (perhaps to calculate R-group fingerprints, etc) - then you 
should map the aryl carbon on both sides instead:

  reaction_smarts = "[c:1][B;R0](O)O>>[*:1]"

In either case you probably want to deduplicate products (the boronic acids and 
esters will match the pattern twice).
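
As a minimal sketch of the second suggestion, using a plain reaction and RunReactants() rather than rdDeprotect: the helper name `deborylate` is hypothetical, and the input is the example SMILES from the original question. Products are collected as canonical SMILES in a set, which takes care of the duplicate matches mentioned above.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Reaction SMARTS from the message above: the aryl carbon is mapped on both sides.
rxn = AllChem.ReactionFromSmarts("[c:1][B;R0](O)O>>[*:1]")


def deborylate(smiles):
    """Return the unique product SMILES from one application of the reaction."""
    mol = Chem.MolFromSmiles(smiles)
    unique = set()
    for products in rxn.RunReactants((mol,)):
        for product in products:
            Chem.SanitizeMol(product)  # recompute valences and implicit Hs
            unique.add(Chem.MolToSmiles(product))
    return sorted(unique)


products = deborylate("Cc1cc(B(O)O)ccc1OC(C)C")
print(products)
```

Because the two boronic acid oxygens are equivalent, the pattern matches twice and RunReactants() returns two identical product sets; the set collapses them to one entry.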

Kind regards

James

From: Katrina Lexa <kl...@umich.edu>
Sent: 21 August 2023 06:03
To: RDKit Discuss <rdkit-discuss@lists.sourceforge.net>
Subject: [Rdkit-discuss] rdDeprotect & DeprotectData

Hi All,

I don't know why I'm struggling so much with this, as it seems like it should 
be pretty straightforward. I'm trying to add some additional deprotection 
smirks to a data-cleaning python script and I'm not having success with the new 
reactions actually transforming my reactants to deprotected smiles. I have 
about 10 I'd like to add, so I know I could do it with simple reactions, but 
I'd rather figure out where I'm going wrong here.

My definition of deprotect data:
#deborylation
deprotection_class = "boron"
reaction_smarts =  "[c;H1]([B;R0](O)[O;R0:1])>>[c;H1]"
abbreviation = "BOO"
full_name = "deboron"
bdata = rdDeprotect.DeprotectData(deprotection_class, reaction_smarts, 
abbreviation, full_name)
assert bdata.isValid()

I tried adding this line:
newDeprotect = rdDeprotect.DeprotectDataVect().append(bdata)

but it seems to make no difference:
try:
    #result = rdDeprotect.Deprotect(dep_m,deprotections=[bdata])
    result = rdDeprotect.Deprotect(dep_m,newDeprotect)


As an example, this is one of the SMILES strings in the file I'm reading in 
that I would expect to be deprotected:
Cc1cc(B(O)O)ccc1OC(C)C

Maybe I'm just awful at writing SMIRKS?


Thanks for the help here,

Katrina



PLEASE READ - This email is confidential and may be privileged. It is intended 
for the named addressee(s) only and access to it by anyone else is 
unauthorised. If you are not an addressee, any disclosure or copying of the 
contents of this email or any action taken (or not taken) in reliance on it is 
unauthorised and may be unlawful. If you have received this email in error, 
please notify the sender or postmas...@vernalis.com. Email is not a secure 
method of communication and the Company cannot accept responsibility for the 
accuracy or completeness of this message or any attachment(s). Please check 
this email for virus infection for which the Company accepts no responsibility. 
If verification of this email is sought then please request a hard copy. Unless 
otherwise stated, any views or opinions presented are solely those of the 
author and do not represent those of the Company.

Vernalis (R) Limited (no. 1985479)
Granta Park, Great Abington
Cambridge, CB21 6GB, United Kingdom
Tel: +44 (0)1223 895 555

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Problem finding potential stereo-centres in bridged bicyclics involving 4-membered rings?

2021-05-19 Thread James Davidson
Hi Greg,

Thanks for the response (and sorry to be the bearer of bad news!).
Issue added:  https://github.com/rdkit/rdkit/issues/4155

Kind regards

James

From: Greg Landrum 
Sent: 19 May 2021 14:59
To: James Davidson 
Cc: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] Problem finding potential stereo-centres in 
bridged bicyclics involving 4-membered rings?

Hi James,

I don't think that's the same bug as #3490. I think it's something different; 
"yay".
;-)

It would be great if you could file a github issue for this.

Thanks,
-greg


On Wed, May 19, 2021 at 3:20 PM James Davidson <j.david...@vernalis.com> wrote:
Dear All,

I’ve got a strong suspicion that what I am seeing is related to the open issue 
3490 (https://github.com/rdkit/rdkit/issues/3490), but as I can’t seem to find 
a mention of a non-spiro problem then I thought I would share.
Tested in 2020.09.4 and 2021.03.2 with the same result.

from rdkit import Chem

smi_list = ['CC1CCC(CC1)C(N)=O', 'CC12CCC(CC1)(C2)C(N)=O', 'CC1CC(C1)C(N)=O',
            'CC12CC(C1)(CC2)C(N)=O']
for smi in smi_list:
    mol = Chem.MolFromSmiles(smi)
    # show_mol is a wrapper function for the new drawing code in jupyter
    display(show_mol(mol, size=(450, 200)))
    print(list(Chem.FindPotentialStereo(mol)))
    print(Chem.FindMolChiralCenters(mol, includeUnassigned=True,
                                    useLegacyImplementation=False))

The 4 cases are:

  *   Symmetrically-disubstituted 6-membered ring
  *   A bridged version (using a 1-atom bridge to avoid a completely 
symmetrical product)
  *   Symmetrically-disubstituted 4-membered ring
  *   A bridged version (this time using a 2-atom bridge to avoid symmetry)

And this is what I see:

[inline image from the original email, not preserved in this archive]

In the case of the bridged 4-membered ring (or bridged 5-membered ring, 
depending on your viewpoint!), FindPotentialStereo() fails to identify the two 
potential stereo atoms.
If anyone can spot if this is the same issue as 3490, or something different, 
then that would be appreciated!
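
The comparison above can also be run outside a notebook with a small sketch like the following, which drops the `show_mol` drawing wrapper and keeps only the atom-centred entries. The helper name `potential_stereo_atoms` is hypothetical; `type` and `centeredOn` are attributes of the `StereoInfo` objects returned by FindPotentialStereo(). Results depend on the RDKit version, so no expected output is shown.

```python
from rdkit import Chem

smi_list = ['CC1CCC(CC1)C(N)=O', 'CC12CCC(CC1)(C2)C(N)=O',
            'CC1CC(C1)C(N)=O', 'CC12CC(C1)(CC2)C(N)=O']


def potential_stereo_atoms(smiles):
    """Return the sorted indices of atoms flagged as potential stereocentres."""
    mol = Chem.MolFromSmiles(smiles)
    return sorted(
        info.centeredOn
        for info in Chem.FindPotentialStereo(mol)
        if info.type == Chem.StereoType.Atom_Tetrahedral
    )


for smi in smi_list:
    print(smi, potential_stereo_atoms(smi))
```

An empty list for the last SMILES, alongside two indices for each of the others, would reproduce the behaviour described above.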

Kind regards

James





Re: [Rdkit-discuss] Stereochemistry problem with spiro centre

2021-05-09 Thread James Davidson
Thanks Paolo - I should have found this before posting!

Cheers

James

On 9 May 2021, at 15:05, Paolo Tosco  wrote:


Hi James,

IIRC that's a known open issue with the way spirocyclic pseudochiral centers 
are handled:

https://github.com/rdkit/rdkit/issues/3490

Cheers,
p.

On Sun, May 9, 2021 at 10:15 AM James Davidson <j.david...@vernalis.com> wrote:
Dear All,

I am having some issues with tetrahedral stereochemistry perception in RDKit 
(2020.09.4) for a certain class of molecule.
Here’s an example (rendered using cdk-depict):


https://www.simolecule.com/cdkdepict/depict/bot/svg?smi=F%5BC%40H%5D1C%5BC%40%40%5D2(C1)C%5BC%40H%5D(Cl)C2=-1=-1=off=bridgehead=false=3.65=cip=0

If I try to work with this class of molecule in RDKit, it seems impossible(?) 
to assign stereo information to the central, spirocyclic stereo centre.
Exporting back out as SMILES shows that the information is not present:

m = Chem.MolFromSmiles('F[C@H]1C[C@@]2(C1)C[C@H](Cl)C2')
print(Chem.MolToSmiles(m))

>>>   F[C@H]1CC2(C1)C[C@H](Cl)C2

The spiro-atom is clearly being identified as a potential stereo-centre 
(strangely, the CIP labels aren’t being generated for the other two centres – 
just the parity info is returned):

centers = Chem.FindMolChiralCenters(m, includeUnassigned=True,
                                    useLegacyImplementation=False)
print(centers)

>>>   [(1, 'Tet_CW'), (3, '?'), (6, 'Tet_CCW')]

If I look at the atom properties for the central atom, I can see 
_ChiralityPossible == 1, but I also see _ringStereochemCand == False.
Is this the problem?

Any help/advice greatly appreciated!

Kind regards

James





[Rdkit-discuss] A question regarding double bonds and reading molblocks

2020-12-22 Thread James Davidson
Dear All,

I think this question is in some way related to the following closed issue:
https://github.com/rdkit/rdkit/pull/3015

I am working with 2020.09.1, but see the following error when calling 
EnumerateStereoisomers():


RuntimeError: Pre-condition Violation
    Stereo atoms should be specified before specifying CIS/TRANS bond stereochemistry
    Violation occurred on line 288 in file Code/GraphMol/Bond.h
    Failed Expression: what <= STEREOE || getStereoAtoms().size() == 2
    RDKIT: 2020.09.1
    BOOST: 1_72

I may be wrong, but I think my issue has something to do with incoming 
STEREOANY bonds *terminating* at double bonds.
Here is an example (not very pretty, I know!):

[inline image from the original email, not preserved in this archive]

The intention for the wavy bonds is (probably) to say that nothing is known 
about the configuration at the 3 stereocentres.
But the intention for the double bonds is that they are as drawn (the hydrazone 
is trans, and the alkene is cis).
Here is the corresponding molblock:


  Mrv1921 1012592D

 10 11  0  0  0  0            999 V2000
   15.6513    1.4711    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
   16.0635    0.7558    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
   16.8890    0.7546    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   17.3010    0.0394    0.0000 C   0  0  3  0  0  0  0  0  0  0  0  0
   17.3788   -0.6754    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   18.0262   -1.1519    0.0000 C   0  0  3  0  0  0  0  0  0  0  0  0
   18.7209   -1.2321    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   18.6301   -0.4794    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   17.3831   -1.5741    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   17.8753   -0.3874    0.0000 C   0  0  3  0  0  0  0  0  0  0  0  0
  1  2  1  0  0  0  0
  2  3  2  0  0  0  0
  4  3  1  4  0  0  0
  4  5  1  0  0  0  0
  6  5  1  4  0  0  0
  6  7  1  0  0  0  0
  7  8  2  0  0  0  0
  6  9  1  0  0  0  0
 10  9  1  4  0  0  0
 10  4  1  0  0  0  0
 10  8  1  0  0  0  0
M  END


And we can see 3 atoms are set with atom parity = 3 (either or unmarked; 
ignored when read).
And 3 single bonds are set with bond stereo = 4 (either).
Both double bonds are set with bond stereo = 0 (use coords to determine cis or 
trans).
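
The reading of the bond block above can be checked mechanically with a small sketch that needs no RDKit at all. It assumes the standard V2000 layout, in which the first four bond fields (first atom, second atom, bond type, bond stereo) occupy three characters each; the helper name `parse_bond_line` is hypothetical.

```python
# Bond block copied from the molblock above (V2000, 3-character columns).
BOND_LINES = [
    "  1  2  1  0  0  0  0",
    "  2  3  2  0  0  0  0",
    "  4  3  1  4  0  0  0",
    "  4  5  1  0  0  0  0",
    "  6  5  1  4  0  0  0",
    "  6  7  1  0  0  0  0",
    "  7  8  2  0  0  0  0",
    "  6  9  1  0  0  0  0",
    " 10  9  1  4  0  0  0",
    " 10  4  1  0  0  0  0",
    " 10  8  1  0  0  0  0",
]


def parse_bond_line(line):
    """Split a V2000 bond line into atom indices, bond order and stereo flag."""
    first, second, order, stereo = (int(line[i:i + 3]) for i in range(0, 12, 3))
    return {"atoms": (first, second), "order": order, "stereo": stereo}


bonds = [parse_bond_line(line) for line in BOND_LINES]
wavy_singles = [b["atoms"] for b in bonds if b["order"] == 1 and b["stereo"] == 4]
double_flags = {b["atoms"]: b["stereo"] for b in bonds if b["order"] == 2}

print(wavy_singles)  # [(4, 3), (6, 5), (10, 9)]
print(double_flags)  # {(2, 3): 0, (7, 8): 0}
```

So the file itself carries stereo flag 4 only on the three single bonds, and both double bonds carry flag 0.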

If I read this into RDKit, however, I see one of the double bonds (the 
hydrazone one) is interpreted as STEREOANY and not STEREONONE as the molblock 
intended:

mol = Chem.MolFromMolBlock(test_mb_20)  # test_mb_20 holds the molblock above
for bond in mol.GetBonds():
    if bond.GetBondType() == Chem.BondType.DOUBLE:
        print(bond.GetStereo())


STEREOANY

STEREONONE

And if I make a call to EnumerateStereoisomers() I see the above error.
Is there a step (or understanding) I am missing?

Kind regards

James




[Rdkit-discuss] Simple question about double bond stereo in molblock output

2020-12-22 Thread James Davidson
Dear All,

I wonder if I can quickly sanity-check something(?).

I have noticed that symmetrical double bonds output with a bond stereo setting 
of "3" (cis or trans (either) double bond) in the standard molblock output.
Is this expected/intentional?  I would have expected a setting of "0" (use 
coords to determine cis or trans) for a non-stereo double bond.
(I am using 2020.09.1)

Here's a simple example:

m = Chem.MolFromSmiles('FC(F)=CC1=CC=CC=C1')
print(Chem.MolToMolBlock(m))


     RDKit          2D

 10 10  0  0  0  0  0  0  0  0999 V2000
    5.2500   -1.2990    0.0000 F   0  0  0  0  0  0  0  0  0  0  0  0
    3.7500   -1.2990    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.0000   -2.5981    0.0000 F   0  0  0  0  0  0  0  0  0  0  0  0
    3.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.5000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.7500   -1.2990    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.7500   -1.2990    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.5000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.7500    1.2990    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.7500    1.2990    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0
  2  3  1  0
  2  4  2  3
  4  5  1  0
  5  6  2  0
  6  7  1  0
  7  8  2  0
  8  9  1  0
  9 10  2  0
 10  5  1  0
M  END


This behaviour is maybe what I would expect if the bond was explicitly set 
using bond.SetStereo(Chem.BondStereo.STEREOANY), but in the absence of this I 
would expect the bond to default to STEREONONE, and I guess I would expect this 
to be bond stereo "0" in the output molblock.  What am I missing?
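
A short sketch for reproducing the comparison: it prints the in-memory stereo setting of the exocyclic double bond and the matching bond line from the written molblock. It assumes the V2000 writer emits bonds in bond-index order after the three header lines, the counts line and the atom block. The value in the stereo column is left unstated here because it may differ between RDKit versions.

```python
from rdkit import Chem

m = Chem.MolFromSmiles('FC(F)=CC1=CC=CC=C1')
bond = m.GetBondBetweenAtoms(1, 3)  # the exocyclic C=C bond
print(bond.GetBondType(), bond.GetStereo())

# Locate the same bond in the V2000 output: 3 header lines, 1 counts line,
# then one line per atom, then one line per bond (assumed bond-index order).
lines = Chem.MolToMolBlock(m).splitlines()
bond_line = lines[4 + m.GetNumAtoms() + bond.GetIdx()]
print(repr(bond_line))
```

The fourth field of the printed bond line is the bond stereo value discussed above.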

Kind regards

James




[Rdkit-discuss] Open3DAlign scoring of existing alignment?

2019-11-26 Thread James Davidson
Dear All (especially Paolo!),

I have a strong suspicion I have already asked this at some point in the past - 
so apologies in advance (but I can't seem to find the answer)...
I am interested in taking an existing overlay of two RDKit molecules in 3D and 
scoring the overlay using Open3DAlign scoring scheme (eg with MMFF atom-types), 
but *without* trying to optimise the alignment or score.

I thought setting maxIters=0 in the call to AllChem.GetO3A() would do the trick 
(I even tried setting options=3 to "trigger local optimization").  Eg

o3a = AllChem.GetO3A(prb_mol, ref_mol, maxIters=0, options=3)
o3a.Matches()  # Show the matches

But while the options setting certainly changes the matching atoms (and the 
score), the matches don't seem to correspond well to my starting alignment...
Any advice is greatly appreciated (including, of course, simply pointing me to 
the old answer that I am likely missing!)

Kind regards

James



[Rdkit-discuss] Is it possible to get a breakdown of conformational energy terms?

2018-03-22 Thread James Davidson
Dear All,

Recently I have been assessing some ligand conformations from crystal 
structures to identify any non-ideal bond lengths, angles, torsions, or 
non-bonded contacts.
What I am doing at the moment is adding some positional constraints to the 
crystallographic heavy atom positions, and calculating the energy before and 
after minimisation:

>>>   m = Chem.MolFromMolFile('input.mol')
>>>   mh = AllChem.AddHs(m, addCoords=True)
>>>   mp = AllChem.MMFFGetMoleculeProperties(mh, mmffVariant='MMFF94s')
>>>   ff = AllChem.MMFFGetMoleculeForceField(mh, mp)
>>>   ff.CalcEnergy()

This gives the 'raw' energy.

>>>   for atom in mh.GetAtoms():
...       if atom.GetAtomicNum() != 1:
...           idx = atom.GetIdx()
...           ff.MMFFAddPositionConstraint(idx, maxDispl=0.5, forceConstant=100)
>>>   ff.Minimize(maxIts=1)
>>>   ff.CalcEnergy()

And this gives the energy after applying a moderate restraint (a force constant 
of 100 kcal/mol/A^2, with a maximum displacement of 0.5 A).

So I think this is ok(?), and I can compare the two energies and inspect the 
conformations visually.
What I was wondering was whether there is a way of obtaining the individual 
energy terms (ie each bonded and non-bonded term, angle, and torsion)?
Because what I'd really like to do is identify the areas of the molecule that 
contribute the most to the pre- and post- minimisation energy difference.
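
One possible route, sketched here for the bond-stretch term only and not a built-in breakdown: fetch the per-bond MMFF parameters with GetMMFFBondStretchParams() and evaluate the MMFF94 stretch expression by hand. The helper names `mmff_bond_stretch_energy` and `bond_stretch_terms` are hypothetical, and ethanol with an embedded conformer stands in for a real crystallographic ligand. Angle, torsion and van der Waals terms could presumably be handled in the same way through the corresponding parameter accessors.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def mmff_bond_stretch_energy(kb, r0, r):
    """MMFF94 bond-stretch energy in kcal/mol (kb in md/A, distances in A)."""
    dr = r - r0
    cs = -2.0  # cubic-stretch constant, 1/A
    return (143.9325 * kb / 2.0 * dr ** 2
            * (1.0 + cs * dr + 7.0 / 12.0 * cs ** 2 * dr ** 2))


def bond_stretch_terms(mol, conf_id=-1):
    """Return (atom_i, atom_j, energy) per bond, largest contribution first."""
    props = AllChem.MMFFGetMoleculeProperties(mol, mmffVariant='MMFF94s')
    conf = mol.GetConformer(conf_id)
    terms = []
    for bond in mol.GetBonds():
        i, j = bond.GetBeginAtomIdx(), bond.GetEndAtomIdx()
        params = props.GetMMFFBondStretchParams(mol, i, j)
        if params is None:  # no parameters found for this bond
            continue
        _bond_type, kb, r0 = params
        r = conf.GetAtomPosition(i).Distance(conf.GetAtomPosition(j))
        terms.append((i, j, mmff_bond_stretch_energy(kb, r0, r)))
    return sorted(terms, key=lambda term: term[2], reverse=True)


# Ethanol with an embedded 3D conformer stands in for a crystallographic ligand.
mh = Chem.AddHs(Chem.MolFromSmiles('CCO'))
AllChem.EmbedMolecule(mh, randomSeed=42)
for i, j, energy in bond_stretch_terms(mh):
    print(i, j, round(energy, 3))
```

Running the same function on the pre- and post-minimisation conformers and subtracting the per-bond values would point to the bonds that relax the most.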

Any suggestions would be greatly appreciated!

Kind regards

James



Re: [Rdkit-discuss] Stereochemistry issue for spirocycles/pseudochiral centres(?)

2017-02-08 Thread James Davidson
Hi Greg (et al.),

Thanks for looking into it.
And thanks to Paolo, who gave me a good workaround suggestion – which was to 
desymmetrise the spirocyclic centre by modifying the isotope on one of the 
neighbours.
This is good for attended processing of single molecules, but not so good for 
unattended processing of unknown molecules…

Reading in molecules with sanitize=False is a good start, but my first thought 
was then to do some sort of rSMARTS transform to automate the isomer assignment.
It soon became apparent that this wasn’t the way to go – as abilities are 
limited with an unsanitised molecule(!).

So I ended-up with the following:

from rdkit import Chem

m3 = Chem.MolFromSmiles('O[C@H]1CC[C@]11CC[C@@](Cl)(Br)CC1', sanitize=False)
for atom in m3.GetAtoms():
    # chiral centres currently intact
    print("Stereo: %s Neighbours: %s" % (
        atom.GetChiralTag(), [n.GetSymbol() for n in atom.GetNeighbors()]))

# Find possible spirocentres
for atom in m3.GetAtoms():
    # compare against the enum value (a comparison with a string never matches)
    if (len(atom.GetNeighbors()) == 4 and atom.IsInRing()
            and atom.GetChiralTag() != Chem.ChiralType.CHI_UNSPECIFIED):
        # We have found a candidate spirocentre: label its first neighbour
        first_neighbour = atom.GetNeighbors()[0]
        first_neighbour.SetIsotope(100)
Chem.SanitizeMol(m3)  # Now we can sanitise
test3_mols = summarise_conformers(m3)  # helper from the gist: generate the conformers (as before)
sdf = Chem.SDWriter('test3.sdf')  # and write them out (but resetting the isotopes first)
for mol in test3_mols:
    for atom in mol.GetAtoms():
        if atom.GetIsotope() == 100:
            atom.SetIsotope(0)
    sdf.write(mol)
sdf.close()


GIST is updated to include this:  
https://gist.github.com/jepdavidson/fdfbf6366a17f4829de3d4de22f3b442

Kind regards

James


From: Greg Landrum [mailto:greg.land...@gmail.com]
Sent: 08 February 2017 03:45
To: James Davidson
Cc: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] Stereochemistry issue for spirocycles/pseudochiral 
centres(?)

Hi James,

This is definitely a bug. The problem seems to be connected to the way what the 
RDKit calls "ring stereochemistry" is handled when there are spiro linkages.

Here's the github issue: https://github.com/rdkit/rdkit/issues/1294

I'll take a look.

Best,
-greg



On Tue, Feb 7, 2017 at 8:32 PM, James Davidson <j.david...@vernalis.com> wrote:
Dear All,

I have hit what I think is a problem with stereochemistry perception/handling 
for certain types of pseudochiral and/or spirocyclic systems.
Basically I am observing that some types of input tetrahedral stereochemical 
information gets lost when an RDKit molecule is generated.
But I only realised this because I was wanting to generate conformers and was 
seeing stereochemical scrambling…

Anyway, an example with pictures will probably explain things better:
https://gist.github.com/jepdavidson/fdfbf6366a17f4829de3d4de22f3b442

Any help/advice appreciated.

Kind regards

James


[Rdkit-discuss] Stereochemistry issue for spirocycles/pseudochiral centres(?)

2017-02-07 Thread James Davidson
Dear All,

I have hit what I think is a problem with stereochemistry perception/handling 
for certain types of pseudochiral and/or spirocyclic systems.
Basically I am observing that some types of input tetrahedral stereochemical 
information gets lost when an RDKit molecule is generated.
But I only realised this because I was wanting to generate conformers and was 
seeing stereochemical scrambling...

Anyway, an example with pictures will probably explain things better:
https://gist.github.com/jepdavidson/fdfbf6366a17f4829de3d4de22f3b442

Any help/advice appreciated.

Kind regards

James



Re: [Rdkit-discuss] RDKit patch releases in conda?

2016-11-02 Thread James Davidson
Hi Riccardo,

> are you working on Windows? Pre-built conda packages targeting the 2016.03 
> patch releases are at this time only available for linux and osx.
Yes, I'm afraid so...

> an additional patch release was tagged before the UGM, and I think it wasn't 
> yet pushed to the anaconda channel. if there's interest for making this 
> revision available,
>  I can try to include some windows packages (for the amd64 platform at 
> least), otherwise it might make sense to wait for the upcoming release?
I would definitely be interested in this (py2 and py3) for win64, but if it is 
anything more than a small amount of work, then please don't do it just for me!

Kind regards

James



Re: [Rdkit-discuss] RDKit patch releases in conda?

2016-11-02 Thread James Davidson
Hi Greg (and Riccardo),

> Riccardo had pushed binaries for Linux and I have done most of the mac 
> versions, but doing windows builds, which I suspect is what you want, is 
> enough of a barrier that we haven't done those.
> There is an ongoing discussion about resolving this problem, but it is 
> non-trivial.
It sounds like there is a chance that Riccardo will look at a win64 2016_03 
patch build for conda (which would be great!)

> P.S. Obligatory plug: this is a matter of focusing resources on a 
> less-than-pleasant task; exactly the kind of thing that RDKit 
> maintenance/support customers can reasonably request.
That sounds fair...

Kind regards

James


_____
From: James Davidson <j.david...@vernalis.com>
Sent: Wednesday, November 2, 2016 2:32 PM
Subject: [Rdkit-discuss] RDKit patch releases in conda?
To: <rdkit-discuss@lists.sourceforge.net>

Dear All,
 
I think I probably know the answer to this already, but wanted to double check 
- did any of the four 2016_03 patch releases ever get pushed to conda?
I only seem to get 2016_03_1 with "conda update -c 
https://conda.anaconda.org/rdkit rdkit"
 
(if not available then I guess this is academic with 2016_09_1 around the 
corner?)
 
Kind regards
 
James



[Rdkit-discuss] RDKit patch releases in conda?

2016-11-02 Thread James Davidson
Dear All,

I think I probably know the answer to this already, but wanted to double check 
- did any of the four 2016_03 patch releases ever get pushed to conda?
I only seem to get 2016_03_1 with "conda update -c 
https://conda.anaconda.org/rdkit rdkit"

(if not available then I guess this is academic with 2016_09_1 around the 
corner?)

Kind regards

James



[Rdkit-discuss] Problem adding hydrogens to peptides

2016-11-01 Thread James Davidson
Dear All,

Enthused by all the great talks at the UGM, for the last couple of days I have 
been getting more hands-on with RDKit than I have in quite a while!
I was keen to work with some peptides/proteins in 3D, but am having some 
problems when adding hydrogens...

I have uploaded a GIST to demonstrate the issue (apologies - the py3Dmol js 
doesn't render in the nbviewer, but this doesn't affect understanding):
https://gist.github.com/jepdavidson/f5220187c18be0fc9e119f9da2e7d955

The main problem is that added hydrogens don't automatically get assigned 
monomer info from the monomer they are being added to, but there are other 
issues as well (the hydrogens are marked 'HETATM', the occupancy for the ATOM 
blocks are set to "-nan", and the CONECT block doesn't list the added Hs).

Propagating the monomer info from the amino acids to the added Hs isn't too 
difficult (can call atom.GetNeighbors() and take the info from the neighbouring 
atom) - but there are also some preferred (or required?) naming and numbering 
conventions to adhere to ("H" for the backbone NH, "HA" for the hydrogen on the 
alpha carbon, etc).

Perhaps I am missing something here (a secret 'flavour' option? :)) - but if 
not, it would be interesting to hear what behaviour others would expect when 
adding explicit hydrogens (I think the same issues will relate to any sequence 
where monomer information is present).

Kind regards

James



Re: [Rdkit-discuss] New problem compiling RDKit on Windows

2015-11-27 Thread James Davidson
Hi again, Greg

>  If you still have problems with this (or hc.c), please let me know,

hc.c fails to compile.
The errors are shown below, and then I get related linking errors.  I'm hoping 
all the errors are related(?)
The first line affected is line 42:

static doublereal inf = 1e20;

Kind regards

James


Severity  #  Description  File  Line  Column  Project
Error  2426  error C2143: syntax error : missing ';' before 'type'  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  42  1  hc
Error  2427  error C2275: 'integer' : illegal use of this type as an expression  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  45  1  hc
Error  2428  error C2146: syntax error : missing ';' before identifier 'i__1'  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  45  1  hc
Error  2429  error C2065: 'i__1' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  45  1  hc
Error  2430  error C2065: 'i__2' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  45  1  hc
Error  2431  error C2275: 'doublereal' : illegal use of this type as an expression  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  46  1  hc
Error  2432  error C2146: syntax error : missing ';' before identifier 'd__1'  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  46  1  hc
Error  2433  error C2065: 'd__1' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  46  1  hc
Error  2434  error C2065: 'd__2' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  46  1  hc
Error  2435  error C2143: syntax error : missing ';' before 'type'  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  49  1  hc
Error  2436  error C2143: syntax error : missing ';' before 'type'  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  50  1  hc
Error  2437  error C2143: syntax error : missing ';' before 'type'  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  51  1  hc
Error  2438  error C2143: syntax error : missing ';' before 'type'  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  52  1  hc
Error  2439  error C2143: syntax error : missing ';' before 'type'  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  53  1  hc
Error  2440  error C2143: syntax error : missing ';' before 'type'  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  54  1  hc
Error  2441  error C2143: syntax error : missing ';' before 'type'  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  55  1  hc
Error  2442  error C2143: syntax error : missing ';' before 'type'  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  56  1  hc
Error  2443  error C2065: 'i__1' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  72  1  hc
Error  2444  error C2065: 'i__' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  73  1  hc
Error  2445  error C2065: 'i__1' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  73  1  hc
Error  2446  error C2065: 'i__' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  74  1  hc
Error  2447  error C2065: 'i__' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  75  1  hc
Error  2448  error C2065: 'ncl' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  77  1  hc
Error  2449  error C2065: 'i__1' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  79  1  hc
Error  2450  error C2065: 'ind' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  80  1  hc
Error  2451  error C2065: 'i__1' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  80  1  hc
Error  2452  error C2065: 'ind' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  81  1  hc
Error  2453  error C2065: 'i__1' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  90  1  hc
Error  2454  error C2065: 'i__' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  91  1  hc
Error  2455  error C2065: 'i__1' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  91  1  hc
Error  2456  error C2065: 'dmin__' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  92  1  hc
Error  2457  error C2065: 'inf' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  92  1  hc
Error  2458  error C2065: 'i__2' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  93  1  hc
Error  2459  error C2065: 'j' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  94  1  hc
Error  2460  error C2065: 'i__' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  94  1  hc
Error  2461  error C2065: 'i__2' : undeclared identifier  C:\RDKit\Code\ML\Cluster\Murtagh\hc.c  94  1  hc
Error  2462  error C2065: 'ind' : undeclared 

Re: [Rdkit-discuss] New problem compiling RDKit on Windows

2015-11-26 Thread James Davidson

That looks like a leftover from a source-control conflict. I can't find it in 
github:
https://github.com/rdkit/rdkit/blob/master/Code/RDBoost/Wrap.h#L133

Could it be that you are pulling from github and that you had local 
modifications to the file that lead to a conflict?
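
If the marker reported in Wrap.h is indeed a leftover from a conflicted update, other files in the working copy may carry similar leftovers. A minimal sketch for spotting them, assuming Python 3; the function names and the marker pattern are illustrative only and are not part of RDKit or of any version-control tool:

```python
import re

# Lines that begin with a run of '<', '=' or '>' characters are typical of
# unresolved conflict blocks left behind by version-control tools.
MARKER = re.compile(r'^(<{3,}|={7}|>{3,})')


def find_conflict_markers(lines):
    """Return (line_number, text) pairs for lines that look like conflict markers."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if MARKER.match(line):
            hits.append((number, line.rstrip('\n')))
    return hits


def scan_file(path):
    """Scan one source file; undecodable bytes are replaced rather than raising."""
    with open(path, errors='replace') as handle:
        return find_conflict_markers(handle)
```

Calling scan_file() on Code/RDBoost/Wrap.h would be expected to report the offending line if the marker is still present; an empty list suggests the file is clean.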





[Rdkit-discuss] New problem compiling RDKit on Windows

2015-11-22 Thread James Davidson
Dear All,

For quite some time I have been successfully compiling RDKit on Windows using 
Visual Studio 2012.
However, recently (and perhaps triggered by a recent VS update that I accepted) 
I am getting errors.

The problem seems to be in Wrap.h (line 133):

<<< .mine


VS is complaining "error C2059: syntax error:'<<'"  and the corresponding error 
when inspecting the code is "Error: expected a declaration".
Does anyone have any suggestions for working through this?

Kind regards

James



[Rdkit-discuss] Rev. 5775 (Windows) - pyGraphMolWrap test fails

2015-07-13 Thread James Davidson
Dear All,

I have just built revision 5775 on Windows, and the pyGraphMolWrap test fails.  
The relevant bit of the verbose output is below:

78: ERROR: testGithub498 (__main__.TestCase)
78: ----------------------------------------------------------------------
78: Traceback (most recent call last):
78:   File "C:/RDKit/Code/GraphMol/Wrap/rough_test.py", line 3033, in testGithub498
78:     outf = gzip.open(tempfile.mktemp(),'wt+')
78:   File "C:\Anaconda\lib\gzip.py", line 34, in open
78:     return GzipFile(filename, mode, compresslevel)
78:   File "C:\Anaconda\lib\gzip.py", line 94, in __init__
78:     fileobj = self.myfileobj = __builtin__.open(filename, mode or 'rb')
78: ValueError: Invalid mode ('wt+b')


This seems to be due to a difference in python/gzip behaviour on Windows vs. 
e.g. Linux:

On Ubuntu (Anaconda python):

In [1]: import tempfile, gzip
In [2]: outf = gzip.open(tempfile.mktemp(), 'wt+')
In [3]:


On Windows (again Anaconda python):

In [1]: import tempfile, gzip
In [2]: outf = gzip.open(tempfile.mktemp(), 'wt+')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-2-6bee12287576> in <module>()
----> 1 outf = gzip.open(tempfile.mktemp(), 'wt+')

C:\Anaconda\lib\gzip.pyc in open(filename, mode, compresslevel)
     32
     33
---> 34     return GzipFile(filename, mode, compresslevel)
     35
     36 class GzipFile(io.BufferedIOBase):

C:\Anaconda\lib\gzip.pyc in __init__(self, filename, mode, compresslevel, fileobj, mtime)
     92             mode += 'b'
     93         if fileobj is None:
---> 94             fileobj = self.myfileobj = __builtin__.open(filename, mode or 'rb')
     95         if filename is None:
     96             # Issue #13781: os.fdopen() creates a fileobj with a bogus name

ValueError: Invalid mode ('wt+b')

In [3]:



Is this an easy one to fix?
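
The traceback shows gzip appending 'b' to the requested mode, so 'wt+' reaches the underlying open() as 'wt+b', which is rejected on Windows. One possible portable alternative is sketched below. It is not the change made in the RDKit test; it assumes Python 3.3 or later, where gzip.open() accepts the text modes 'wt' and 'rt', and both helper names are hypothetical:

```python
import gzip
import os
import tempfile


def write_gzip_text(text):
    """Write text to a new gzip-compressed temporary file and return its path."""
    # mkstemp creates the file safely; close the descriptor and reopen via gzip
    fd, path = tempfile.mkstemp(suffix='.gz')
    os.close(fd)
    # 'wt' (no '+') is a text mode that gzip.open handles itself on Python 3
    with gzip.open(path, 'wt') as outf:
        outf.write(text)
    return path


def read_gzip_text(path):
    """Read the whole gzip-compressed file back as text."""
    with gzip.open(path, 'rt') as inf:
        return inf.read()
```

On Python 2, where gzip.open() has no text mode, opening with 'wb' and writing byte strings is the equivalent choice.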

Kind regards

James



[Rdkit-discuss] rev5771 and boost_chrono?

2015-07-02 Thread James Davidson
Dear All,

I recently rebuilt RDKit under 64-bit Windows and things worked great for me.  
However, I found that when I shared the build with another user, things weren't 
so good - "from rdkit.Chem import AllChem" gave a DLL error that pointed to 
rdForceFieldHelpers.pyd.

So I then ran Dependency Walker and, as well as pointing at the usual boost 
libraries (python, system, thread), it also pointed at chrono.  This is the 
first time I had seen this.  Adding boost_chrono-vc110-mt-1_56.dll to the other 
user's rdkit/lib folder sorted the issue - which is great.

So this is a heads-up, in case it helps others; but also a question:  is there 
a good way to figure out all the boost dependencies ahead of deploying?
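
One rough way to get a first list before deploying is sketched below, as a heuristic rather than a replacement for Dependency Walker. It relies on the fact that DLL names in a Windows module's import table are stored as plain ASCII strings, so a byte-level search of each .pyd and .dll can list the boost DLLs it mentions. The function names and the file-name pattern are assumptions made for illustration:

```python
import os
import re

# Matches names such as boost_chrono-vc110-mt-1_56.dll inside raw bytes.
BOOST_DLL = re.compile(rb'boost_[\w.-]+?\.dll', re.IGNORECASE)


def boost_dlls_in_file(path):
    """Return the sorted, lower-cased boost DLL names mentioned in one binary."""
    with open(path, 'rb') as handle:
        data = handle.read()
    return sorted({m.group(0).decode('ascii').lower()
                   for m in BOOST_DLL.finditer(data)})


def boost_dlls_in_tree(root, extensions=('.pyd', '.dll')):
    """Map each binary under root to the boost DLL names it mentions."""
    needed = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            if name.lower().endswith(extensions):
                found = boost_dlls_in_file(os.path.join(folder, name))
                if found:
                    needed[name] = found
    return needed
```

Running boost_dlls_in_tree() over the rdkit folder and comparing the names it reports against the DLLs actually shipped should flag a missing library such as chrono. Because the boost DLLs themselves are scanned too, dependencies between boost libraries are also listed, but a textual match is only a hint and may miss or over-report entries.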

Kind regards

James

__
PLEASE READ: This email is confidential and may be privileged. It is intended 
for the named addressee(s) only and access to it by anyone else is 
unauthorised. If you are not an addressee, any disclosure or copying of the 
contents of this email or any action taken (or not taken) in reliance on it is 
unauthorised and may be unlawful. If you have received this email in error, 
please notify the sender or postmas...@vernalis.com. Email is not a secure 
method of communication and the Company cannot accept responsibility for the 
accuracy or completeness of this message or any attachment(s). Please check 
this email for virus infection for which the Company accepts no responsibility. 
If verification of this email is sought then please request a hard copy. Unless 
otherwise stated, any views or opinions presented are solely those of the 
author and do not represent those of the Company.

The Vernalis Group of Companies
100 Berkshire Place
Wharfedale Road
Winnersh, Berkshire
RG41 5RD, England
Tel: +44 (0)118 938 

To access trading company registration and address details, please go to the 
Vernalis website at www.vernalis.com and click on the Company address and 
registration details link at the bottom of the page..


Re: [Rdkit-discuss] Python GetShortestPath()?

2015-04-22 Thread James Davidson
Hi Greg,

I just built the latest revision - and the functionality is exposed - thanks 
(and, of course, thanks Paolo!).

Kind regards

James



[Rdkit-discuss] Python GetShortestPath()?

2015-04-21 Thread James Davidson
Dear All,

I might be having a 'moment' here, but for the life of me I can't seem to find 
the equivalent of RDKit::MolOps::getShortestPath exposed in python(?).
I want to pass in two atom ids, and get back a list of atom ids in the shortest 
path.  I could possibly try to roll my own by using GetDistanceMatrix() and 
GetAdjacencyMatrix(), but I think I may struggle(!).

So, any pointer to GetShortestPath() greatly appreciated!
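
For reference, the "roll my own" route can be sketched as a breadth-first search over the adjacency matrix. This is only an illustration under stated assumptions: the function name is invented, and the matrix is a hand-written stand-in for what Chem.GetAdjacencyMatrix(mol) would return (a five-membered ring, atoms 0 to 4, with a substituent, atom 5, on atom 2). Recent RDKit releases provide Chem.GetShortestPath(mol, idx1, idx2) in Python, which makes this unnecessary in practice.

```python
from collections import deque


def shortest_path(adjacency, start, end):
    """Return the atom indices on one shortest path from start to end, inclusive."""
    previous = {start: None}  # atom -> atom it was first reached from
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        if atom == end:
            break
        for neighbour, bonded in enumerate(adjacency[atom]):
            if bonded and neighbour not in previous:
                previous[neighbour] = atom
                queue.append(neighbour)
    if end not in previous:
        return []  # the two atoms are in different fragments
    path = []
    atom = end
    while atom is not None:  # walk back from end to start
        path.append(atom)
        atom = previous[atom]
    return path[::-1]


# Toy adjacency matrix: ring 0-1-2-3-4-0 plus atom 5 bonded to atom 2.
adjacency = [
    [0, 1, 0, 0, 1, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 1, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 0],
    [1, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
]
print(shortest_path(adjacency, 0, 5))  # [0, 1, 2, 5]
```

When several shortest paths exist, the search returns whichever it reaches first.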

Kind regards

James



Re: [Rdkit-discuss] Python GetShortestPath()?

2015-04-21 Thread James Davidson
Hi Nick,

Well on the plus side I don't get a segfault(!)
On the minus side - unfortunately I think this approach only gives the length 
of the shortest path, rather than a list of the atom ids in the shortest path.
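
If only path lengths are available, the atom list can still be recovered from a topological distance matrix by stepping from the start atom to any bonded neighbour that is one bond closer to the end atom. A minimal sketch, assuming the two atoms are in the same fragment; the function name is invented, and the matrix is a hand-written stand-in for what Chem.GetDistanceMatrix(mol) would return for a five-membered ring (atoms 0 to 4) with a substituent (atom 5) on atom 2.

```python
def path_from_distances(dist, start, end):
    """Rebuild one shortest path from a topological distance matrix.

    Assumes start and end are connected; otherwise no closer neighbour
    exists and next() raises StopIteration.
    """
    path = [start]
    current = start
    while current != end:
        # Pick a bonded neighbour (distance 1) that is one step closer to end.
        current = next(
            n
            for n in range(len(dist))
            if dist[current][n] == 1 and dist[n][end] == dist[current][end] - 1
        )
        path.append(current)
    return path


# Toy distance matrix: ring 0-1-2-3-4-0 plus atom 5 bonded to atom 2.
dist = [
    [0, 1, 2, 2, 1, 3],
    [1, 0, 1, 2, 2, 2],
    [2, 1, 0, 1, 2, 1],
    [2, 2, 1, 0, 1, 2],
    [1, 2, 2, 1, 0, 3],
    [3, 2, 1, 2, 3, 0],
]
print(path_from_distances(dist, 0, 5))  # [0, 1, 2, 5]
```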

Kind regards

James

 -Original Message-
 From: Nicholas Firth [mailto:nicholas.fi...@icr.ac.uk]
 Sent: 21 April 2015 17:44
 To: James Davidson; rdkit-discuss@lists.sourceforge.net
 Subject: RE: Python GetShortestPath()?
 
 Dear James,
 
 I tried to be helpful and show you how I do it with GetAdjacencyMatrix,
 however I ran into my old friend the segmentation fault 11 as there is still
 some weird error with this function.
 
 Here's what I have though, should work for you.
 
 
 >>> from rdkit import Chem
 >>> m = Chem.MolFromSmiles('CC[C@H](CO)NC1=NC(=C2C(=N1)N(C=N2)C(C)C)NCC3=CC=CC=C3')
 >>> atomIdx1 = 0
 >>> atomIdx2 = 10
 >>> print(Chem.GetAdjacencyMatrix(m)[atomIdx1][atomIdx2])
 Segmentation fault: 11
 
 
 
 Best,
 Nick
 
 Nicholas C. Firth | PhD Student | Cancer Therapeutics The Institute of Cancer
 Research | 15 Cotswold Road | Belmont | Sutton | Surrey | SM2 5NG T 020
 8722 4033 | E nicholas.fi...@icr.ac.uk | W www.icr.ac.uk | Twitter @ICRnews
 
 
 From: James Davidson [j.david...@vernalis.com]
 Sent: 21 April 2015 17:06
 To: rdkit-discuss@lists.sourceforge.net
 Subject: [Rdkit-discuss] Python GetShortestPath()?
 
 Dear All,
 
 I might be having a 'moment' here, but for the life of me I can't seem to find
 the equivalent of RDKit::MolOps::getShortestPath exposed in python(?).
 I want to pass in two atom ids, and get back a list of atom ids in the
 shortest path.  I could possibly try to roll my own by using
 GetDistanceMatrix() and GetAdjacencyMatrix(), but I think I may struggle(!).
 
 So, any pointer to GetShortestPath() greatly appreciated!
 
 Kind regards
 
 James
 
 
 The Institute of Cancer Research: Royal Cancer Hospital, a charitable
 Company Limited by Guarantee, Registered in England under Company No.
 534147 with its Registered Office at 123 Old Brompton Road, London SW7
 3RP.
 
 This e-mail message is confidential and for use by the addressee only.  If the
 message is received by anyone other than the addressee, please return the
 message to the sender by replying to it and then delete the message from
 your computer and network.


Re: [Rdkit-discuss] Problem building recent revisions on Windows

2015-04-14 Thread James Davidson
Thanks Greg

I have now tried using boost 1.56 (the cmake configuration once BOOST_ROOT is 
set is a little different vs. 1.55…).  Either way I can’t seem to build with 
threadsafe/multithreaded support – but perhaps we should draw a line under it 
for now / follow-up ‘off-line’(?).

The good news is that (with the thread settings OFF) the changes you checked-in 
(rev5616) have indeed sorted the MolHash piece, and all tests pass – thanks!

Kind regards

James

From: Greg Landrum [mailto:greg.land...@gmail.com]
Sent: 14 April 2015 05:35
To: James Davidson
Cc: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] Problem building recent revisions on Windows

Hi James

Thanks for your patience here. I will invest the time in trying to get 
automated builds happening on this Windows side so that this stops happening.

I just pushed some changes that should clear up the MolHash-related build 
problems. I had forgotten that that bit of code still needed to be tested on 
Windows. I haven't tested building the cartridge on Windows (I'm not set up to 
do that), but the python wrappers definitely do build for me now with both 
RDK_BUILD_THREADSAFE_SSS and RDK_TEST_MULTITHREADED set to ON. I don't think it 
should make a difference, but just in case: I am doing this using boost 1.56.

Best,
-greg




On Mon, Apr 13, 2015 at 11:30 AM, James Davidson 
<j.david...@vernalis.com> wrote:
Here’s an update:

Tried building rev5211, but saw similar linking errors (to do with Boost 
threading libraries).  The most recent build that I have successfully managed 
without the errors is rev5016.

It occurred to me that a couple of my cmake options relate to threading 
(RDK_BUILD_THREADSAFE=ON; RDK_TEST_MULTITHREADED=ON) – and perhaps other people 
(Greg, Paolo) with successful Windows builds had these set OFF (default)(?).  
Indeed, if I set both of these to OFF, I can successfully build rev5211, and 
all of the tests pass.
So this is good – but I still don’t understand what has changed (in relation to 
Boost threading) that means that later versions don’t build…

Threading aside, I was now feeling pretty confident that I would be ok to build 
the latest revision (rev5611).  Unfortunately this was not the case – I 
initially hit a problem with MolHash.cpp, which I thought was down to MSVC 
being stupid about snprintf() (line 265).  After a little stack-overflowing, I 
thought changing this to _snprintf() would solve the problem – which it appears 
to (at least MolHash.cpp now compiles), but then I get a couple of further 
errors down-stream (probably related to the change I made?):


Error  1  error LNK2019: unresolved external symbol class 
std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > 
__cdecl RDKit::Descriptors::calcMolFormula(class RDKit::ROMol const &,bool,bool) 
(?calcMolFormula@Descriptors@RDKit@@YA?AV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@AEBVROMol@2@_N1@Z)
referenced in function void __cdecl 
RDKit::MolHash::generateMoleculeHashSet(class RDKit::ROMol const &,struct 
RDKit::MolHash::HashSet &,class std::vector<unsigned int,class 
std::allocator<unsigned int> > const *,class std::vector<unsigned int,class 
std::allocator<unsigned int> > const *) 
(?generateMoleculeHashSet@MolHash@RDKit@@YAXAEBVROMol@2@AEAUHashSet@12@PEBV?$vector@IV?$allocator@I@std@@@std@@2@Z)
   C:\RDKit\build\Code\GraphMol\MolHash\Wrap\MolHash.lib(MolHash.obj)
   rdMolHash

Error  2  error LNK1120: 1 unresolved externals
   C:\RDKit\build\rdkit\Chem\Release\rdMolHash.pyd
   rdMolHash


So I guess it would still be good to understand the threading issue (or at 
least for someone else to be able to reproduce it); and perhaps the observed 
MolHash issue is an easier one to sort(?)

Kind regards

James


Re: [Rdkit-discuss] Problem building recent revisions on Windows

2015-04-13 Thread James Davidson
Here's an update:

Tried building rev5211, but saw similar linking errors (to do with Boost 
threading libraries).  The most recent build that I have successfully managed 
without the errors is rev5016.

It occurred to me that a couple of my cmake options relate to threading 
(RDK_BUILD_THREADSAFE=ON; RDK_TEST_MULTITHREADED=ON) - and perhaps other people 
(Greg, Paolo) with successful Windows builds had these set OFF (default)(?).  
Indeed, if I set both of these to OFF, I can successfully build rev5211, and 
all of the tests pass.
So this is good - but I still don't understand what has changed (in relation to 
Boost threading) that means that later versions don't build...
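
For anyone wanting to reproduce the two configurations, the options can be passed on the cmake command line. A minimal sketch, assuming Python is used to assemble the command: the helper name and its arguments are invented for illustration, the boost locations are the ones reported elsewhere in this thread, and RDK_BUILD_THREADSAFE_SSS is assumed to be the full name of the option written above as RDK_BUILD_THREADSAFE.

```python
def cmake_command(source_dir, boost_root, boost_librarydir, threading=False):
    """Build the cmake argument list with both threading options ON or OFF."""
    flag = "ON" if threading else "OFF"
    return [
        "cmake",
        f"-DBOOST_ROOT={boost_root}",
        f"-DBOOST_LIBRARYDIR={boost_librarydir}",
        f"-DRDK_BUILD_THREADSAFE_SSS={flag}",  # assumed full option name
        f"-DRDK_TEST_MULTITHREADED={flag}",
        source_dir,
    ]


command = cmake_command(
    "..",
    "C:/local/boost_1_55_0-msvc-11.0-64",
    "C:/local/boost_1_55_0-msvc-11.0-64/lib64-msvc-11.0",
    threading=False,
)
print(" ".join(command))
# To actually configure from the build folder:
#   import subprocess; subprocess.run(command, check=True)
```

Switching threading=True gives the configuration that fails to link here, which may help someone else reproduce the problem.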

Threading aside, I was now feeling pretty confident that I would be ok to build 
the latest revision (rev5611).  Unfortunately this was not the case - I 
initially hit a problem with MolHash.cpp, which I thought was down to MSVC 
being stupid about snprintf() (line 265).  After a little stack-overflowing, I 
thought changing this to _snprintf() would solve the problem - which it appears 
to (at least MolHash.cpp now compiles), but then I get a couple of further 
errors down-stream (probably related to the change I made?):


Error  1  error LNK2019: unresolved external symbol class 
std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > 
__cdecl RDKit::Descriptors::calcMolFormula(class RDKit::ROMol const &,bool,bool) 
(?calcMolFormula@Descriptors@RDKit@@YA?AV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@AEBVROMol@2@_N1@Z)
referenced in function void __cdecl 
RDKit::MolHash::generateMoleculeHashSet(class RDKit::ROMol const &,struct 
RDKit::MolHash::HashSet &,class std::vector<unsigned int,class 
std::allocator<unsigned int> > const *,class std::vector<unsigned int,class 
std::allocator<unsigned int> > const *) 
(?generateMoleculeHashSet@MolHash@RDKit@@YAXAEBVROMol@2@AEAUHashSet@12@PEBV?$vector@IV?$allocator@I@std@@@std@@2@Z)
   C:\RDKit\build\Code\GraphMol\MolHash\Wrap\MolHash.lib(MolHash.obj)
   rdMolHash

Error  2  error LNK1120: 1 unresolved externals
   C:\RDKit\build\rdkit\Chem\Release\rdMolHash.pyd
   rdMolHash


So I guess it would still be good to understand the threading issue (or at 
least for someone else to be able to reproduce it); and perhaps the observed 
MolHash issue is an easier one to sort(?)

Kind regards

James



Re: [Rdkit-discuss] Problem building recent revisions on Windows

2015-04-09 Thread James Davidson
Hi Greg,

 James: one odd thing I notice about the error messages you posted is that 
 they are all referencing a boost library that seems to be present in your 
 build directory:
 
 Error  2651   error LNK2005: public: virtual __cdecl 
 boost::detail::thread_data_base::~thread_data_base(void) 
 (??1thread_data_base@detail@boost@@UEAA@XZ) already defined in 
 boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll)  
 C:\RDKit\build\Code\GraphMol\DistGeomHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj)
    rdDistGeom
 
 Is this just MSVC being odd about how it reports errors or is there really a 
 version of the boost threading library in your 
 build\Code\GraphMol\DistGeomHelpers\Wrap directory?

No boost threading library there...  And I clear-out the build folder between 
builds anyway.  So I guess it is a quirk of the reporting.
I will have another go at building the latest version and report back on 
success or otherwise(!)
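
For what it's worth, the two library names in each message differ only by the lib prefix. Under Boost's Windows naming convention libboost_*.lib is a static library and boost_*.lib is the import library for the DLL, so one possible reading (an assumption, not something confirmed in this thread) is that both variants are being linked into the same module. The sketch below pulls the two names out of an LNK2005 message and flags that pattern. The regular expression and function names are made up for illustration and only cover messages laid out like the ones quoted here.

```python
import re

# Captures the library named after "already defined in" and the second
# library that appears in the file column of the same message.
LNK2005 = re.compile(
    r"already\s+defined\s+in\s+(?P<first>\S+?\.lib)\(.*?\)\s+(?P<second>\S+\.lib)\(",
    re.DOTALL,
)


def conflicting_libs(message):
    """Return (first, second) library file names from one LNK2005 message, or None."""
    match = LNK2005.search(message)
    if match is None:
        return None
    # Drop the directory part of the second library's path.
    second = match.group("second").replace("\\", "/").rsplit("/", 1)[-1]
    return match.group("first"), second


def _without_lib_prefix(name):
    return name[3:] if name.startswith("lib") else name


def mixes_static_and_dynamic(first, second):
    """True when the two names differ only by the 'lib' prefix."""
    return first != second and _without_lib_prefix(first) == _without_lib_prefix(second)


message = (
    r"error LNK2005: public: void __cdecl boost::thread::detach(void) "
    r"(?detach@thread@boost@@QEAAXXZ) already defined in "
    r"boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll) "
    r"C:\RDKit\build\Code\GraphMol\DistGeomHelpers\Wrap"
    r"\libboost_thread-vc110-mt-1_55.lib(thread.obj) rdDistGeom"
)
first, second = conflicting_libs(message)
print(first, second, mixes_static_and_dynamic(first, second))
```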

Kind regards

James



Re: [Rdkit-discuss] Problem building recent revisions on Windows

2015-04-08 Thread James Davidson
Hi Paolo,

 Unfortunately I have the impression that James' problem is related to
 neither of those. Might it be a boost/libboost naming issue?

Perhaps, but cmake seems happy (see below)...

 James, could it be that you have multiple versions of boost on your Windows
 machine and CMake is not picking the correct one? You might try to explicitly
 define on the CMake command line both BOOST_ROOT and
 BOOST_LIBRARYDIR location as I do on my system:
 
 C:\Program Files (x86)\CMake\bin\cmake
 -DBOOST_LIBRARYDIR=c:\32\boost_1_55_0_py34\lib32-msvc-12.0
 -DBOOST_ROOT=c:\32\boost_1_55_0_py34 ..

I do have multiple versions around, but I have the following set (from the 
cmake GUI):

BOOST_LIBRARYDIR  C:/local/boost_1_55_0-msvc-11.0-64/lib64-msvc-11.0
BOOST_ROOT        C:/local/boost_1_55_0-msvc-11.0-64

and cmake reports that this version of boost is found:

Boost version: 1.55.0
Found the following Boost libraries:
  regex

So I am not sure what change is giving me the issue...  Let's wait and see what 
Greg finds as well(!)

Kind regards

James



[Rdkit-discuss] Problem building recent revisions on Windows

2015-04-08 Thread James Davidson
Dear All,

I just tried building the latest RDKit build (rev. 5204) from the github 
repository, and hit a lot of link errors...  So (somewhat at random) I tried an 
older build (5042), and saw very similar things (errors for this attempt are 
below).
I am running on 64-bit Windows, and use cmake and Visual Studio 2012 - my build 
process hasn't changed since the last time I successfully built (rev. 4274 - 
and I can confirm that if I roll-back to this revision, the build is once again 
successful), so I wondered if anyone more skilled in the art than me could 
suggest what the problem might be from the errors below(?)


These are the errors when building the 'ALL_BUILD' project:

Error  2651   error LNK2005: public: virtual __cdecl 
boost::detail::thread_data_base::~thread_data_base(void) 
(??1thread_data_base@detail@boost@@UEAA@XZ) already defined in 
boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll)  
C:\RDKit\build\Code\GraphMol\DistGeomHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj)
   rdDistGeom
Error  2652   error LNK2005: public: void __cdecl 
boost::thread::detach(void) (?detach@thread@boost@@QEAAXXZ) already defined in 
boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll)
C:\RDKit\build\Code\GraphMol\DistGeomHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj)
rdDistGeom
Error  2653   error LNK2005: class boost::thread::id __cdecl 
boost::this_thread::get_id(void) 
(?get_id@this_thread@boost@@YA?AVid@thread@2@XZ) already defined in 
boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll)   
C:\RDKit\build\Code\GraphMol\DistGeomHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj)
   rdDistGeom
Error  2654   error LNK2005: public: class boost::thread::id __cdecl 
boost::thread::get_id(void)const  (?get_id@thread@boost@@QEBA?AVid@12@XZ) 
already defined in 
boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll)   
C:\RDKit\build\Code\GraphMol\DistGeomHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj)
rdDistGeom
Error  2656   error LNK2005: private: bool __cdecl 
boost::thread::join_noexcept(void) (?join_noexcept@thread@boost@@AEAA_NXZ) 
already defined in 
boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll)   
C:\RDKit\build\Code\GraphMol\DistGeomHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj)
rdDistGeom
Error  2658   error LNK2005: public: bool __cdecl 
boost::thread::joinable(void)const  (?joinable@thread@boost@@QEBA_NXZ) already 
defined in boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll)   
C:\RDKit\build\Code\GraphMol\DistGeomHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj)
rdDistGeom
Error  2659   error LNK2005: private: bool __cdecl 
boost::thread::start_thread_noexcept(void) 
(?start_thread_noexcept@thread@boost@@AEAA_NXZ) already defined in 
boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll)   
C:\RDKit\build\Code\GraphMol\DistGeomHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj)
   rdDistGeom
Error  2660   error LNK1169: one or more multiply defined symbols found 
   C:\RDKit\build\rdkit\Chem\Release\rdDistGeom.pyd  rdDistGeom
Error  2675   error LNK2005: public: virtual __cdecl 
boost::detail::thread_data_base::~thread_data_base(void) 
(??1thread_data_base@detail@boost@@UEAA@XZ) already defined in 
boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll)  
C:\RDKit\build\Code\GraphMol\ForceFieldHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj)
   rdForceFieldHelpers
Error  2676   error LNK2005: public: void __cdecl 
boost::thread::detach(void) (?detach@thread@boost@@QEAAXXZ) already defined in 
boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll)
C:\RDKit\build\Code\GraphMol\ForceFieldHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj)
rdForceFieldHelpers
Error  2677   error LNK2005: class boost::thread::id __cdecl 
boost::this_thread::get_id(void) 
(?get_id@this_thread@boost@@YA?AVid@thread@2@XZ) already defined in 
boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll)   
C:\RDKit\build\Code\GraphMol\ForceFieldHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj)
   rdForceFieldHelpers
Error  2678   error LNK2005: public: class boost::thread::id __cdecl 
boost::thread::get_id(void)const  (?get_id@thread@boost@@QEBA?AVid@12@XZ) 
already defined in 
boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll)   
C:\RDKit\build\Code\GraphMol\ForceFieldHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj)
rdForceFieldHelpers
Error  2679   error LNK2005: private: bool __cdecl 
boost::thread::join_noexcept(void) (?join_noexcept@thread@boost@@AEAA_NXZ) 
already defined in 

Re: [Rdkit-discuss] Problem building recent revisions on Windows

2015-04-08 Thread James Davidson
Hi Greg – thanks!
One extra piece:  as of a few minutes ago, I can confirm that revision 4947 
(last revision in Feb) builds, and passes all of the tests.

Kind regards

James

From: Greg Landrum [mailto:greg.land...@gmail.com]
Sent: 08 April 2015 15:14
To: James Davidson
Cc: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] Problem building recent revisions on Windows

I will fire up windows tomorrow morning and ensure that things can build. It's 
been a couple weeks since I last did that.

-greg


On Wed, Apr 8, 2015 at 3:43 PM, James Davidson 
<j.david...@vernalis.com> wrote:
Dear All,

I just tried building the latest RDKit build (rev. 5204) from the github 
repository, and hit a lot of link errors…  So (somewhat at random) I tried an 
older build (5042), and saw very similar things (errors for this attempt are 
below).
I am running on 64-bit Windows, and use cmake and Visual Studio 2012 – my build 
process hasn’t changed since the last time I successfully built (rev. 4274 – 
and I can confirm that if I roll-back to this revision, the build is once again 
successful), so I wondered if anyone more skilled in the art than me could 
suggest what the problem might be from the errors below(?)


These are the errors when building the ‘ALL_BUILD’ project:

Error  2651   error LNK2005: public: virtual __cdecl 
boost::detail::thread_data_base::~thread_data_base(void) 
(??1thread_data_base@detail@boost@@UEAA@XZ) already defined in 
boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll)  
C:\RDKit\build\Code\GraphMol\DistGeomHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj)
   rdDistGeom
Error  2652   error LNK2005: public: void __cdecl 
boost::thread::detach(void) (?detach@thread@boost@@QEAAXXZ) already defined in 
boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll)
C:\RDKit\build\Code\GraphMol\DistGeomHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj)
rdDistGeom
Error  2653   error LNK2005: class boost::thread::id __cdecl 
boost::this_thread::get_id(void) 
(?get_id@this_thread@boost@@YA?AVid@thread@2@XZ) already defined in 
boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll)   
C:\RDKit\build\Code\GraphMol\DistGeomHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj)
   rdDistGeom
Error  2654   error LNK2005: public: class boost::thread::id __cdecl 
boost::thread::get_id(void)const  (?get_id@thread@boost@@QEBA?AVid@12@XZ) 
already defined in 
boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll)   
C:\RDKit\build\Code\GraphMol\DistGeomHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj)
rdDistGeom
Error  2656   error LNK2005: private: bool __cdecl 
boost::thread::join_noexcept(void) (?join_noexcept@thread@boost@@AEAA_NXZ) 
already defined in 
boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll)   
C:\RDKit\build\Code\GraphMol\DistGeomHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj)
rdDistGeom
Error  2658   error LNK2005: public: bool __cdecl 
boost::thread::joinable(void)const  (?joinable@thread@boost@@QEBA_NXZ) already 
defined in boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll)   
C:\RDKit\build\Code\GraphMol\DistGeomHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj)
rdDistGeom
Error  2659   error LNK2005: private: bool __cdecl 
boost::thread::start_thread_noexcept(void) 
(?start_thread_noexcept@thread@boost@@AEAA_NXZ) already defined in 
boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll)   
C:\RDKit\build\Code\GraphMol\DistGeomHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj)
   rdDistGeom
Error  2660   error LNK1169: one or more multiply defined symbols found 
   C:\RDKit\build\rdkit\Chem\Release\rdDistGeom.pyd
   rdDistGeom
Error  2675   error LNK2005: public: virtual __cdecl 
boost::detail::thread_data_base::~thread_data_base(void) 
(??1thread_data_base@detail@boost@@UEAA@XZ) already defined in 
boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll)  
C:\RDKit\build\Code\GraphMol\ForceFieldHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj)
   rdForceFieldHelpers
Error  2676   error LNK2005: public: void __cdecl 
boost::thread::detach(void) (?detach@thread@boost@@QEAAXXZ) already defined in 
boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll)
C:\RDKit\build\Code\GraphMol\ForceFieldHelpers\Wrap\libboost_thread-vc110-mt-1_55.lib(thread.obj)
rdForceFieldHelpers
Error  2677   error LNK2005: class boost::thread::id __cdecl 
boost::this_thread::get_id(void) 
(?get_id@this_thread@boost@@YA?AVid@thread@2@XZ) already defined in 
boost_thread-vc110-mt-1_55.lib(boost_thread-vc110-mt-1_55.dll)   
C:\RDKit\build\Code\GraphMol\ForceFieldHelpers

Re: [Rdkit-discuss] Tests failing on Windows: more info

2015-02-10 Thread James Davidson
Hi Paolo, Greg, et al.

I have also been having some problems recently building (64-bit Windows) from 
recent github versions, but I don't know if this is related to what you see, 
Paolo...
My environment is Win 7 64-bit, CMake 3.0.0, boost_1_55_0-msvc-11.0-64, MS 
Visual Studio Express 2012.

I have done a bit of version rolling-back and forwards to see if I can pinpoint 
the last version that builds with no errors, and this is what I have found so 
far (sorted by revision, not by sequence of attempts!):

4577   - compiles fine, passes all tests
4618   - as above
4649   - some errors during compile, passes all tests except the molDraw2D 
         bits (which are also involved in the errors)
4651   - as above
4743   - as above
4765   - as above
4780   - pyGraphMolWrap now fails its test
4826   - this is where significant problems start (for me at least): 
         pyGraphMolWrap still fails, but now with a segfault
4859   - same segfault as above; the pymolDraw2D test also fails


The errors I start to see for molDraw2D are this sort of thing (is this 
expected?):

Error  49   error C2668: 'boost::tuples::tie' : ambiguous call to overloaded function
   C:\RDKit\Code\GraphMol\MolDraw2D\MolDraw2D.cpp   341   1   MolDraw2D
Error  50   error C2668: 'boost::tuples::tie' : ambiguous call to overloaded function
   C:\RDKit\Code\GraphMol\MolDraw2D\MolDraw2D.cpp   353   1   MolDraw2D
Error  51   error C2668: 'boost::tuples::tie' : ambiguous call to overloaded function
   C:\RDKit\Code\GraphMol\MolDraw2D\MolDraw2D.cpp   357   1   MolDraw2D
Error  61   error C2668: 'boost::tuples::tie' : ambiguous call to overloaded function
   C:\RDKit\Code\GraphMol\MolDraw2D\MolDraw2D.cpp   544   1   MolDraw2D
Error  63   error C2668: 'boost::tuples::tie' : ambiguous call to overloaded function
   C:\RDKit\Code\GraphMol\MolDraw2D\MolDraw2D.cpp   591   1   MolDraw2D
Error  131  error LNK1181: cannot open input file '..\..\..\lib\Release\MolDraw2D.lib'
   C:\RDKit\build\Code\GraphMol\MolDraw2D\LINK   moldraw2DTest1
Error  149  error LNK1181: cannot open input file '..\..\..\lib\Release\MolDraw2D.lib'
   C:\RDKit\build\Code\GraphMol\Wrap\LINK   rdmolops


If I see the above errors when building 'ALL_BUILD', I also see the following 
error when building the 'INSTALL' section:

Error  41   error MSB3073: The command setlocal
C:\Program Files (x86)\CMake\bin\cmake.exe -DBUILD_TYPE=Release -P 
cmake_install.cmake
if %errorlevel% neq 0 goto :cmEnd
:cmEnd
endlocal  call :cmErrorLevel %errorlevel%  goto :cmDone
:cmErrorLevel
exit /b %1
:cmDone
if %errorlevel% neq 0 goto :VCEnd
:VCEnd exited with code 1.
   C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V110\Microsoft.CppCommon.targets   134   5   INSTALL



Anyway, 4618 is the latest revision that I have tested where I see no build 
errors, and 4765 is the latest revision I've found before I start to see 
pyGraphMolWrap tests failing (or segfaults).  For now, I have rolled back my 
installation to 4618 (but would be very happy if anyone can figure out what 
causes the problems with later revisions).
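
The rolling back and forwards described above is essentially a bisection over revisions. A minimal sketch of that search follows, assuming the revision list is sorted and that once a revision breaks, every later one stays broken. The names `first_bad_revision` and `builds_cleanly` are hypothetical, and the actual checkout, build and test step is left to the caller.

```python
from typing import Callable, Optional, Sequence


def first_bad_revision(
    revisions: Sequence[int],
    builds_cleanly: Callable[[int], bool],
) -> Optional[int]:
    """Return the earliest revision for which builds_cleanly() is False.

    Assumes `revisions` is sorted ascending and that good revisions form a
    prefix of the list (hypothetical simplification).
    """
    lo, hi = 0, len(revisions)
    while lo < hi:
        mid = (lo + hi) // 2
        if builds_cleanly(revisions[mid]):
            lo = mid + 1   # first bad revision must be later
        else:
            hi = mid       # this one is bad; look at or before it
    return revisions[lo] if lo < len(revisions) else None


# Stand-in predicate using the outcomes reported in this message:
# everything up to 4618 built without errors.
tested = [4577, 4618, 4649, 4651, 4743, 4765, 4780, 4826, 4859]
print(first_bad_revision(tested, lambda rev: rev <= 4618))
```

In practice `builds_cleanly` would check out the revision, run the build and return whether it finished without errors, so each probe is expensive and halving the search range saves real time.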

Kind regards

James

__
PLEASE READ: This email is confidential and may be privileged. It is intended 
for the named addressee(s) only and access to it by anyone else is 
unauthorised. If you are not an addressee, any disclosure or copying of the 
contents of this email or any action taken (or not taken) in reliance on it is 
unauthorised and may be unlawful. If you have received this email in error, 
please notify the sender or postmas...@vernalis.com. Email is not a secure 
method of communication and the Company cannot accept responsibility for the 
accuracy or completeness of this message or any attachment(s). Please check 
this email for virus infection for which the Company accepts no responsibility. 
If verification of this email is sought then please request a hard copy. Unless 
otherwise stated, any views or opinions presented are solely those of the 
author and do not represent those of the Company.

The Vernalis Group of Companies
100 Berkshire Place
Wharfedale Road
Winnersh, Berkshire
RG41 5RD, England
Tel: +44 (0)118 938 

To access trading company registration and address details, please go to the 
Vernalis website at www.vernalis.com and click on the Company address and 
registration details link at the bottom of the page..

[Rdkit-discuss] Avalon test failing(?)

2014-12-10 Thread James Davidson
Hi Greg,

I wondered if you (or anyone else) have been seeing any issues with win64 builds 
of the RDKit (with Avalon toolkit support) recently?
Yesterday I updated my local SVN copy of RDKit (to rev4274) and rebuilt.  
Everything seemed to go ok, but the testAvalonLib1 test is now failing (the 
pyAvalonTools test passes) - see below.
I can see that test1.cpp has changed recently, but my AvalonTools source 
hasn't...  Has a problem been introduced into the test?

Kind regards

James


C:\RDKit\build>ctest -R testAvalon -V
UpdateCTestConfiguration  from :C:/RDKit/build/DartConfiguration.tcl
UpdateCTestConfiguration  from :C:/RDKit/build/DartConfiguration.tcl
Test project C:/RDKit/build
Constructing a list of tests
Done constructing a list of tests
Checking test dependency graph...
Checking test dependency graph end
test 2
Start 2: testAvalonLib1

2: Test command: C:\RDKit\build\External\AvalonTools\Release\testAvalonLib1.exe
2: Test timeout computed to be: 9.99988e+006
2: [12:31:18] testing canonical smiles generation
2: [12:31:18] done
2: [12:31:18] testing coordinate generation
2: [12:31:18] done
2: [12:31:18] testing fingerprint generation
2: [12:31:18] c1n1 18
2:   returning
2: [12:31:18] c1n1 6
2: [12:31:18] c1nnccc1 28
2: [12:31:18] c1ncncc1 25
2: [12:31:18] c1cccnc1 18
2: [12:31:18] c1c1 6
2: [12:31:18] c1cccnc1 19
2: [12:31:18] c1cocc1 48
2: [12:31:18]
2:
2: 
2: Test Assert
2: Expression Failed:
2: Violation occurred on line 146 in file 
..\..\..\External\AvalonTools\test1.cpp
2: Failed Expression: bv.getNumOnBits()==53
2: 
2:
1/1 Test #2: testAvalonLib1 ...***Failed    2.87 sec

0% tests passed, 1 tests failed out of 1

Total Test time (real) =   2.98 sec

The following tests FAILED:
  2 - testAvalonLib1 (Failed)
Errors while running CTest
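
When comparing several builds it can help to pull the failure summary out of saved ctest output rather than reading each log by eye. A minimal sketch, assuming the summary layout shown above ("The following tests FAILED:" followed by lines of the form `N - name (Reason)`); `failed_tests` is a hypothetical helper, not part of CTest or the RDKit.

```python
import re
from typing import List, Tuple

# Matches summary lines such as "  2 - testAvalonLib1 (Failed)".
_FAILED_LINE = re.compile(r"^\s*(\d+)\s+-\s+(\S+)\s+\((.+)\)\s*$")


def failed_tests(ctest_output: str) -> List[Tuple[int, str, str]]:
    """Return (test number, test name, reason) for each failed test."""
    results = []
    in_summary = False
    for line in ctest_output.splitlines():
        if line.strip() == "The following tests FAILED:":
            in_summary = True
            continue
        if in_summary:
            match = _FAILED_LINE.match(line)
            if match:
                results.append(
                    (int(match.group(1)), match.group(2), match.group(3))
                )
            elif line.strip():
                break  # first non-matching, non-blank line ends the summary
    return results


sample = (
    "0% tests passed, 1 tests failed out of 1\n"
    "\n"
    "The following tests FAILED:\n"
    "\t  2 - testAvalonLib1 (Failed)\n"
    "Errors while running CTest\n"
)
print(failed_tests(sample))
```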


Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Avalon test failing(?)

2014-12-10 Thread James Davidson
Hi Greg,


> The new version of the test code is targeting the 1.2 avalon toolkit
> version.
> Here's the commit that did that.
> https://github.com/rdkit/rdkit/commit/42dab414ee6fbe5489078e5e52046608bbf785cb
>
> As an FYI, to make these tests pass on windows, you need to edit the code
> to fix a bug:
>
> you need to comment out line 1446 of reaccsio.c:
> //MyFree((char *)tempdir);

Following your advice, I downloaded the 1.2 source from Sourceforge 
(http://sourceforge.net/projects/avalontoolkit/files/AvalonToolkit_1.2/); 
commented-out the line in reaccsio.c; and then reconfigured in cmake and 
rebuilt in VS.  The tests pass now - thanks!

Kind regards

James



[Rdkit-discuss] MMFF constraints question

2014-06-03 Thread James Davidson
Dear All (but mainly Paulo!),

I have really been appreciating the MMFF implementation in RDKit - particularly 
now with the ability to add position / distance / angle / torsional constraints!
I have a couple of naïve questions; and apologise in advance if I have missed 
answers to these in the documentation / method doc-strings...

1.  This is a simple one - but just to categorically confirm - 
ff.CalcEnergy() gives results in kcal/mol units, right?
2.  Now onto force constants...  I see from the unittest (aka documentation 
if you can't find anything else) 'testConstraints.py' that the value of 1.0e5 
is used in the tests.  1.0e5 what?  And should this be viewed as a strong, 
modest, or weak restraint?  Presuming this constant is somewhat like saying how 
springy a spring is(?), then what is a sensible value to give me a super-strong 
steel joist that would essentially resist everything (or is 1.0e5 it)?
3.  Final thing:  the problem with making something really good / useful is 
that people get used to it and then start wanting more!  How's the GBSA 
implicit solvation coming on?   : )
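
For intuition on question 2, the spring analogy can be put into numbers. The sketch below assumes a flat-bottomed harmonic penalty, 0.5 * k * d^2 for a distance that lies d outside an allowed window. That functional form, the factor of 0.5, and the reading of the result as kcal/mol are assumptions made for illustration; nothing in this message confirms what the RDKit actually does. `restraint_energy` is a hypothetical helper, not an RDKit function.

```python
def restraint_energy(distance, min_len, max_len, force_constant):
    """Penalty for a distance outside [min_len, max_len].

    Assumed form: zero inside the window, 0.5 * k * excess**2 outside it.
    """
    if distance < min_len:
        excess = min_len - distance
    elif distance > max_len:
        excess = distance - max_len
    else:
        excess = 0.0
    return 0.5 * force_constant * excess * excess


# How the penalty grows with the size of the violation for k = 1.0e5.
for violation in (0.0, 0.01, 0.1):
    energy = restraint_energy(2.0 + violation, 1.5, 2.0, 1.0e5)
    print(f"violation {violation:.2f} -> penalty {energy:.1f}")
```

Under these assumptions a force constant of 1.0e5 charges roughly 5 energy units for a violation of 0.01 and roughly 500 for a violation of 0.1, which would make it behave much more like the steel joist than a soft spring.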

Kind regards

James




Re: [Rdkit-discuss] MMFF constraints question

2014-06-03 Thread James Davidson
Hi Paolo,

First of all - please see this time my brain has engaged quicker than my 
English-biased touch-typing - and I have spelt your name correctly(!).
Thanks for the very clear explanation on force constants - this is really 
helpful!

And, regarding your new non-academic position vs continued 'forcefield tools' 
development in RDKit - I kind of suspected the answer before I asked!  Oh well, 
my non-academic position of course gives me access to commercial 
implementations of MMFF + implicit solvation...  It's just not nearly as fun!  
: (

Kind regards

James




Re: [Rdkit-discuss] RDKit cartridge similarity search speeds(?)

2014-05-09 Thread James Davidson
Hi Greg,

> What these are telling you is that the second query is not using the index:
> it's a sequential scan, so it has to test all rows of the database. This
> happens because the index is defined for the operator %, but not for the
> function tanimoto_sml(). There may be an approach to get the index set up
> using that function, but there we reach the limits of my expertise.

Well, I will stick to the recommended operator use then!


> One final advanced topic: if you are planning on making regular use of the
> similarity features in the cartridge and are running on a linux system or
> Mac I would recommend recompiling the cartridge with some optimizations for
> tanimoto similarity. To do this, you need to edit the cartridge Makefile
> from:
> PG_CPPFLAGS = -I${BOOSTHOME} -I${RDKIT}/Code -DRDKITVER='"007200"'
> ${INCHIFLAGS} #-DUSE_BUILTIN_POPCOUNT -msse4.2
>
> to:
> PG_CPPFLAGS = -I${BOOSTHOME} -I${RDKIT}/Code -DRDKITVER='"007200"'
> ${INCHIFLAGS} -DUSE_BUILTIN_POPCOUNT -msse4.2
>
> (I just removed a comment character here). This speeds the Tanimoto
> calculation up a fair bit (it's still not nearly as fast as Andrew's
> chemfp, but it's better than the default behavior).

I'm on linux (Ubuntu), and have just re-built with the above recommendation.
I'll see what the speeds look like afterwards (out of interest, I presume the 
timings in your examples were with this optimisation in place?).

Does this also affect dice?

And a final question: after rebuilding the cartridge, does the extension need to 
be dropped and then re-created in all databases; does the PostgreSQL server need 
restarting; or neither?
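
On the Dice question, both Tanimoto and Dice similarity reduce to counting set bits, which is the operation a popcount instruction accelerates. A minimal sketch using Python integers as stand-in bit vectors follows; it is illustrative only and is not the cartridge's code, and whether the cartridge's Dice path actually benefits from the same compiler flag is not confirmed here.

```python
def popcount(bits: int) -> int:
    """Number of set bits in a non-negative integer."""
    return bin(bits).count("1")


def tanimoto(a: int, b: int) -> float:
    """|A & B| / |A | B|, computed from three popcounts."""
    common = popcount(a & b)
    union = popcount(a) + popcount(b) - common
    return common / union if union else 0.0


def dice(a: int, b: int) -> float:
    """2 * |A & B| / (|A| + |B|), computed from the same three popcounts."""
    total = popcount(a) + popcount(b)
    return 2.0 * popcount(a & b) / total if total else 0.0


fp_a, fp_b = 0b1101, 0b1011  # toy 4-bit fingerprints
print(tanimoto(fp_a, fp_b), dice(fp_a, fp_b))
```

Since the two measures use the same three counts, a faster popcount would be expected to help both in an implementation structured like this one.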


> Hope this helps,
> -greg

It does - thanks!

Kind regards

James



[Rdkit-discuss] RDKit cartridge similarity search speeds(?)

2014-05-08 Thread James Davidson
Dear All,

I have recently been spending a bit more time with the RDKit cartridge, and 
have what is probably a very naïve question...
Having built some RDKit fingerprints for ChEMBL_18, I see the following 
behaviour (for clarification - 'ecfp4_bv' is the column in my rdk.fps table 
that has been generated using morganbv_fp(mol, 2)):


chembl_18=# \timing on
Timing is on.

chembl_18=# set rdkit.tanimoto_threshold=0.5;
SET
Time: 0.167 ms

chembl_18=# select chembl_id from rdk.fps where ecfp4_bv % 
morganbv_fp('c1nnccc1'::mol,2);
  chembl_id
-------------
 CHEMBL15719
(1 row)

Time: 2033.348 ms

chembl_18=# select chembl_id from rdk.fps where tanimoto_sml(ecfp4_bv, 
morganbv_fp('c1nnccc1'::mol, 2)) > 0.5;
  chembl_id
-------------
 CHEMBL15719
(1 row)

Time: 6843.605 ms


I can see that the query plans are different in the two cases, but I don't 
fully understand why - see below:

QUERY 1 (with explain analyze)
chembl_18=# explain analyze select chembl_id from rdk.fps where ecfp4_bv % 
morganbv_fp('c1nnccc1'::mol,2);

                          QUERY PLAN
--------------------------------------------------------------
 Bitmap Heap Scan on fps  (cost=106.91..5298.31 rows=1352 width=13) (actual 
 time=1774.986..1774.987 rows=1 loops=1)
   Recheck Cond: (ecfp4_bv % '\x0100084200048204'::bfp)
   ->  Bitmap Index Scan on fps_ecfp4bv_idx  (cost=0.00..106.57 rows=1352 
       width=0) (actual time=1774.969..1774.969 rows=1 loops=1)
         Index Cond: (ecfp4_bv % '\x0100084200048204'::bfp)
 Total runtime: 1775.035 ms
(5 rows)

Time: 1776.133 ms


QUERY 2 (with explain analyze)
chembl_18=# explain analyze select chembl_id from rdk.fps where 
tanimoto_sml(ecfp4_bv, morganbv_fp('c1nnccc1'::mol, 2)) > 0.5;

                          QUERY PLAN
--------------------------------------------------------------
 Seq Scan on fps  (cost=0.00..388808.17 rows=450793 width=13) (actual 
 time=1278.115..6953.977 rows=1 loops=1)
   Filter: (tanimoto_sml(ecfp4_bv, '\x0100084200048204'::bfp) 
   > 0.5::double precision)
   Rows Removed by Filter: 1352377
 Total runtime: 6954.010 ms
(4 rows)

Time: 6955.103 ms


It seems conceptually 'easier' to add the similarity value as part of the 
query, rather than setting it as a variable ahead of the query; but clearly I 
should be doing it the latter way for performance reasons.  So even if I don't 
fully understand why at the moment, am I correct in thinking that queries of 
this sort should always be run with the similarity operators (%, #)?  And if 
so, is the rdkit.tanimoto_threshold variable set at the level of the session, 
the user, or the database?
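
For reference, the index-friendly form of the search is a pair of statements: set the threshold first, then query with the % operator. A minimal sketch that assembles the two statements follows, reusing the table and column names from the queries above; `indexed_similarity_sql` is a hypothetical helper, and building SQL by string formatting is only acceptable here because it is a sketch with trusted input. As general PostgreSQL behaviour, a plain `SET` lasts for the current session only; whether a user-level or database-level default is also possible for this particular variable is not something I have checked.

```python
from typing import List


def indexed_similarity_sql(
    smiles: str, threshold: float, radius: int = 2
) -> List[str]:
    """Return the SET + SELECT pair that uses the % similarity operator."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    quoted = smiles.replace("'", "''")  # minimal escaping for a SQL literal
    return [
        f"set rdkit.tanimoto_threshold={threshold};",
        "select chembl_id from rdk.fps where ecfp4_bv % "
        f"morganbv_fp('{quoted}'::mol, {radius});",
    ]


for statement in indexed_similarity_sql("c1nnccc1", 0.5):
    print(statement)
```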

Kind regards

James


Re: [Rdkit-discuss] Building RDKit on Windows

2014-03-05 Thread James Davidson
Thanks Greg - that did the trick!
(I still see pythonTestDbCLI failing, as previously posted)

Kind regards

James



Re: [Rdkit-discuss] Building RDKit on Windows

2014-03-03 Thread James Davidson
Hi All,

I have just rebuilt RDKit on Windows using the latest source, and am seeing a 
problem with smaTest1 failing (as well as still seeing the same DbCLI failure 
posted previously...)
The smaTest1 failure seems a little strange because it actually throws a 
Windows executable error  (smaTest1.exe has stopped working, etc).
If I run ctest -V -R smaTest1 I see the output below.  Any thoughts?

Kind regards

James



C:\RDKit\build>ctest -V -R smaTest1
UpdateCTestConfiguration  from :C:/RDKit/build/DartConfiguration.tcl
UpdateCTestConfiguration  from :C:/RDKit/build/DartConfiguration.tcl
Test project C:/RDKit/build
Constructing a list of tests
Done constructing a list of tests
Checking test dependency graph...
Checking test dependency graph end
test 32
Start 32: smaTest1

32: Test command: C:\RDKit\build\Code\GraphMol\SmilesParse\Release\smaTest1.exe
32: Test timeout computed to be: 9.99988e+006
32: [17:42:57] -
32: [17:42:57] Testing patterns which should parse.
32: [17:42:57] SMARTS Parse Error: syntax error for input: c1b1
32: [17:42:57]
32:
32: 
32: Invariant Violation
32: c1b1
32: Violation occurred on line 90 in file 
..\..\..\..\Code\GraphMol\SmilesParse\smatest.cpp
32: Failed Expression: mol
32: 
32:
1/1 Test #32: smaTest1 .***Failed    4.03 sec

0% tests passed, 1 tests failed out of 1

Total Test time (real) =   4.29 sec

The following tests FAILED:
 32 - smaTest1 (Failed)
Errors while running CTest



Re: [Rdkit-discuss] Building RDKit on Windows

2014-01-29 Thread James Davidson
Hi Greg,

> Try: ctest -V -R DbCLI
> that should run the test in Verbose mode so that you can see the failures.

Thanks - I have pasted the output below - looks like a file access issue (but I 
don't know why...).

Kind regards
James



C:\RDKit\build>ctest -V -R DbCLI
UpdateCTestConfiguration  from :C:/RDKit/build/DartConfiguration.tcl
UpdateCTestConfiguration  from :C:/RDKit/build/DartConfiguration.tcl
Test project C:/RDKit/build
Constructing a list of tests
Done constructing a list of tests
Checking test dependency graph...
Checking test dependency graph end
test 76
Start 76: pythonTestDbCLI

76: Test command: C:\Python27\python.exe C:/RDKit/Projects/test_list.py 
--testDir C:/RDKit/Projects
76: Test timeout computed to be: 9.99988e+006
76: [10:13:51] INFO: Reading molecules and constructing molecular database.
76: [10:13:51] INFO: Generating molecular database in file 
testData/bzr\Compounds.sqlt
76: [10:13:51] INFO:   Processing 10 molecules
76: [10:13:51] INFO: Generating fingerprints and descriptors:
76: [10:13:51] INFO: Finished.
76: [10:13:52] INFO: Reading molecules and constructing molecular database.
76: [10:13:52] INFO: Generating molecular database in file 
testData/bzr\Compounds.sqlt
76: [10:13:52] INFO:   Processing 163 molecules
76: [10:13:53] INFO:   done 100
76: [10:13:53] INFO: Generating fingerprints and descriptors:
76: [10:14:02] INFO: Finished.
76: [10:14:02] INFO: Reading query molecules and generating fingerprints
76: [10:14:03] INFO: Finding Neighbors
76: [10:14:04] INFO: The search took 0.6 seconds
76: [10:14:04] INFO: Creating output
76: [10:14:04] INFO: Done!
76: [10:14:05] INFO: Reading query molecules and generating fingerprints
76: [10:14:05] INFO: Finding Neighbors
76: [10:14:05] INFO: The search took 0.3 seconds
76: [10:14:05] INFO: Creating output
76: [10:14:05] INFO: Done!
76: [10:14:06] INFO: Reading query molecules and generating fingerprints
76: [10:14:06] INFO: Finding Neighbors
76: [10:14:06] INFO: The search took 0.1 seconds
76: [10:14:06] INFO: Creating output
76: [10:14:06] INFO: Done!
76: [10:14:07] INFO: Doing property query
76: [10:14:07] INFO: Found 30 molecules matching the query
76: [10:14:07] INFO: Creating output
76: [10:14:07] INFO: Done!
76: [10:14:08] INFO: Doing property query
76: [10:14:08] INFO: Found 30 molecules matching the query
76: [10:14:08] INFO: Creating output
76: [10:14:08] INFO: Done!
76: [10:14:08] INFO: Doing substructure query
76: [10:14:09] INFO:Fingerprint screenout rate: 112 of 163 (%68.71)
76: [10:14:09] INFO: Found 49 molecules matching the query
76: [10:14:09] INFO: Creating output
76: [10:14:09] INFO: Done!
76: [10:14:09] INFO: Doing substructure query
76: [10:14:09] INFO:Fingerprint screenout rate: 112 of 163 (%68.71)
76: [10:14:09] INFO: Found 49 molecules matching the query
76: [10:14:09] INFO: Creating output
76: [10:14:09] INFO: Done!
76: [10:14:10] INFO: Doing substructure query
76: [10:14:10] INFO: Found 114 molecules matching the query
76: [10:14:10] INFO: Creating output
76: [10:14:10] INFO: Done!
76: [10:14:11] INFO: Doing substructure query
76: [10:14:11] INFO:Fingerprint screenout rate: 23 of 30 (%76.67)
76: [10:14:11] INFO: Found 5 molecules matching the query
76: [10:14:11] INFO: Creating output
76: [10:14:11] INFO: Done!
76: [10:14:12] INFO: Doing substructure query
76: [10:14:12] INFO: Found 25 molecules matching the query
76: [10:14:12] INFO: Creating output
76: [10:14:12] INFO: Done!
76: [10:14:13] INFO: Reading query molecules and generating fingerprints
76: [10:14:18] INFO: Finding Neighbors
76: [10:14:19] INFO: The search took 0.9 seconds
76: [10:14:19] INFO: Creating output
76: [10:14:19] INFO: Done!
76: [10:14:20] INFO: Reading query molecules and generating fingerprints
76: [10:14:20] INFO: Finding Neighbors
76: [10:14:20] INFO: The search took 0.0 seconds
76: [10:14:20] INFO: Creating output
76: [10:14:20] INFO: Done!
76: [10:14:21] INFO: Reading molecules and constructing molecular database.
76: [10:14:21] INFO: Generating molecular database in file 
testData/bzr\Compounds.sqlt
76: [10:14:21] INFO:   Processing 10 molecules
76: [10:14:21] INFO: Generating fingerprints and descriptors:
76: [10:14:21] INFO: Finished.
76: [10:14:23] INFO: Reading molecules and constructing molecular database.
76: [10:14:23] INFO: Generating molecular database in file 
testData/bzr\Compounds.sqlt
76: [10:14:23] INFO:   Processing 10 molecules
76: [10:14:23] INFO: Generating fingerprints and descriptors:
76: [10:14:23] INFO: Finished.
76: .Traceback (most recent call last):
76:   File "CreateDb.py", line 460, in <module>
76:     CreateDb(options,dataFilename)
76:   File "CreateDb.py", line 214, in CreateDb
76:     startAnew=not options.updateDb
76:   File "C:\RDKit\rdkit\Chem\MolDb\Loader_sa.py", line 111, in LoadDb
76:     os.unlink(dbName)
76: WindowsError: [Error 32] The process cannot access the file because it is 
being used by another process: 'testData/bzr\\Compounds.sqlt'
76: [10:14:24] INFO: Reading 
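
The traceback above ends in os.unlink() failing because something still holds the file open. As a rough illustration only (this is not the RDKit loader code, and the helper name and retry settings are made up), a delete that tolerates a briefly locked file could be sketched like this, assuming the lock is released shortly afterwards:

```python
import os
import time


def unlink_with_retry(path, attempts=5, delay=0.2):
    """Delete `path`, retrying a few times if the OS says it is still in use.

    Hypothetical helper: on Windows under Python 3, a sharing violation
    (error 32) is raised as PermissionError.
    """
    for attempt in range(attempts):
        try:
            os.unlink(path)
            return True
        except FileNotFoundError:
            # Nothing to delete, which is the state we wanted anyway.
            return True
        except PermissionError:
            if attempt == attempts - 1:
                raise  # still locked after the last attempt
            time.sleep(delay)
    return False
```

If the file is held open for the whole run (for example by an unclosed database connection), retrying will not help and the handle has to be closed first.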

[Rdkit-discuss] Building RDKit on Windows

2014-01-28 Thread James Davidson
Dear All,

As part of a New Year's resolution, I decided I should try to enjoy the 
benefits of a cutting-edge version of RDKit built from source(!)  So far this 
has proven to be much more realistic than eg 'not drinking for January' - as I 
now have a working build to show for my efforts.

However, I wonder if I could quickly list the steps I took, and also ask a 
couple of questions (relating to InChI and Avalon)?  For reference, I am running 
Windows 7 64-bit but use Python 2.7.6 32-bit, so I am building 32-bit RDKit.  I 
essentially followed the guide on the wiki 
(https://code.google.com/p/rdkit/wiki/BuildingOnWindows) but thought the 
version info for boost, etc may be of use to others, and the steps may help put 
my questions into context:


1.   Downloaded Visual Studio Express 2012 for Desktop, installed, and 
accepted the updates

2.   Downloaded matching version of Windows boost binaries 
(boost_1_55_0-msvc-11.0-32.exe) from 
http://sourceforge.net/projects/boost/files/boost-binaries/ and extracted to 
the default path

3.   Used TortoiseSVN to add a repository link to 
https://github.com/rdkit/rdkit.git/trunk (and not the SF path as currently 
shown in the wiki guide) in C:/RDKit

4.   Set the environment variables as described on the wiki.

5.   Downloaded the INCHI src as described in the wiki and set the 
RDK_BUILD_INCHI_SUPPORT option later in cmake.  Incidentally, the location for 
the downloads from IUPAC has changed (ie the info in the README is out of 
date): http://www.iupac.org/home/publications/e-resources/inchi/download.html

6.   Ran CMake configure (GUI) following the wiki, and based on the output, 
made some boost-related additions to environment variables

a.   Added C:\local\boost_1_55_0\lib32-msvc-11.0 to PATH

b.  Created BOOST_ROOT=C:\local\boost_1_55_0

c.   Created BOOST_LIBRARYDIR=C:\local\boost_1_55_0\lib32-msvc-11.0

7.   Re-ran configure, then generate, then followed the rest of the wiki 
instructions to build and test - all tests passed except the DbCLI one.
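
The boost settings in step 6 are easy to get subtly wrong, so a quick sanity check before re-running CMake may save a configure cycle. This is only a sketch: the function name is invented, and it assumes the three settings listed above (BOOST_ROOT, BOOST_LIBRARYDIR, and the library directory on PATH) are the ones that matter.

```python
import os

REQUIRED_VARS = ("BOOST_ROOT", "BOOST_LIBRARYDIR")


def check_boost_env(env=None):
    """Return a list of problems found; an empty list means nothing obvious is wrong."""
    env = os.environ if env is None else env
    problems = []
    for name in REQUIRED_VARS:
        value = env.get(name)
        if not value:
            problems.append(f"{name} is not set")
        elif not os.path.isdir(value):
            problems.append(f"{name} points to a missing directory: {value}")
    # The boost DLLs also need to be findable at run time via PATH.
    libdir = env.get("BOOST_LIBRARYDIR")
    path_entries = env.get("PATH", "").split(os.pathsep)
    if libdir and libdir not in path_entries:
        problems.append("BOOST_LIBRARYDIR is not on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_boost_env():
        print(problem)
```

The comparison against PATH is an exact string match, so a trailing backslash or different letter case would be reported as missing.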


So now for the questions:
I thought I did everything right for adding INCHI support.  However, I see the 
following:



In [1]: from rdkit import Chem

In [2]: Chem.inchi.INCHI_AVAILABLE

Out[2]: False



CMake shows:


Could NOT find InChI in system locations (missing:  INCHI_LIBRARY 
INCHI_INCLUDE_DIR)

Found InChI software locally



Do I also need to download the InChI binary and set these two variables 
appropriately in CMake?



Also, I am struggling to build with Avalon support...  Choosing the 
RDK_BUILD_AVALON_SUPPORT appears to configure fine, but when I try to 
'Generate' I see the following error:



CMake Error at Code/cmake/Modules/RDKitUtils.cmake:26 (add_library):
Cannot find source file:

/common/layout.c

Tried extensions .c .C .c++ .cc .cpp .cxx .m .M .mm .h .hh .h++ .hm .hpp
.hxx .in .txx
Call Stack (most recent call first):
External/AvalonTools/CMakeLists.txt:43 (rdkit_library)

Any thoughts on how to get around this?  Do I need to download the Avalon 
project src and put it somewhere?

And final question - can I happily ignore the CMake messages about not finding 
FLEX and BISON, or are these needed when incorporating any of the non-default 
entries (SWIG wrappers, etc)?


Kind regards

James

__
PLEASE READ: This email is confidential and may be privileged. It is intended 
for the named addressee(s) only and access to it by anyone else is 
unauthorised. If you are not an addressee, any disclosure or copying of the 
contents of this email or any action taken (or not taken) in reliance on it is 
unauthorised and may be unlawful. If you have received this email in error, 
please notify the sender or postmas...@vernalis.com. Email is not a secure 
method of communication and the Company cannot accept responsibility for the 
accuracy or completeness of this message or any attachment(s). Please check 
this email for virus infection for which the Company accepts no responsibility. 
If verification of this email is sought then please request a hard copy. Unless 
otherwise stated, any views or opinions presented are solely those of the 
author and do not represent those of the Company.

The Vernalis Group of Companies
100 Berkshire Place
Wharfedale Road
Winnersh, Berkshire
RG41 5RD, England
Tel: +44 (0)118 938 

To access trading company registration and address details, please go to the 
Vernalis website at www.vernalis.com and click on the Company address and 
registration details link at the bottom of the page..

[Rdkit-discuss] Minimising bits of molecules?

2013-11-26 Thread James Davidson
Dear All,

I think this is probably one for Paolo - I was looking at fixing certain atoms 
during MMFF minimisation, but couldn't find the option...  Then I re-read the 
UGM slides, and found the one titled "Force-field wish list", and fixed atoms 
were one of the listed items!

My intended use-case is the following:


1.   Load protein-ligand complex into PyMOL

2.   Make some changes to the bound ligand (using the Builder functionality)

3.   Select atoms that are allowed to move (manual selection, then use of 
PyMOL's 'flag' command)

4.   Pass the molecule over to RDKit (already incorporated in a plugin we 
use), to minimise and then pass back (either as a new object, or apply the new 
coordinates to the existing object in situ)

Actually, this process is already well-used by some of our chemists here - as a 
way of doing some simple modelling / idea exploration - but is currently using 
a much 'flakier' MMFF implementation.  So I would definitely like to move to 
RDKit for the minimisation - any idea when a 'fixed atoms' option is likely to 
be added?
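
To make the 'fixed atoms' idea concrete, here is a toy sketch. It is not MMFF and not RDKit code: it is a plain steepest-descent minimiser over harmonic bond springs, and the function name, parameters and spring model are all assumptions made purely for illustration. The only point it shows is that a frozen atom is one whose gradient is thrown away at every step.

```python
def minimise_with_fixed_atoms(coords, bonds, fixed, rest_length=1.5, k=1.0,
                              step=0.05, n_steps=2000):
    """Steepest descent on harmonic bond springs; atoms in `fixed` never move.

    coords: list of (x, y, z); bonds: list of (i, j) index pairs;
    fixed: set of atom indices to hold in place.
    """
    pos = [list(p) for p in coords]
    for _ in range(n_steps):
        grad = [[0.0, 0.0, 0.0] for _ in pos]
        for i, j in bonds:
            d = [pos[i][a] - pos[j][a] for a in range(3)]
            r = sum(c * c for c in d) ** 0.5
            if r == 0.0:
                continue  # direction undefined for coincident atoms
            # Gradient of E = 0.5 * k * (r - r0)^2 with respect to atom i
            f = k * (r - rest_length) / r
            for a in range(3):
                grad[i][a] += f * d[a]
                grad[j][a] -= f * d[a]
        for idx, g in enumerate(grad):
            if idx in fixed:
                continue  # frozen atom: discard its gradient
            for a in range(3):
                pos[idx][a] -= step * g[a]
    return pos
```

In the PyMOL workflow described above, `fixed` would be every atom that was not flagged as allowed to move.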

Kind regards

James



Re: [Rdkit-discuss] Chemistry 101 question...

2013-10-28 Thread James Davidson
Greg wrote:
> This is what it looks like the state of play at the moment is:
>
> - Adding nitro groups tends to make molecules more lipophilic, at least as
>   measured by retention time in chromatography.
> - Nitro groups are H-bond acceptors, at least according to the papers I
>   found above and the evidence one finds in the CSD.
>
> This seems like an argument for having nitro groups in the default fdef file
> as both lumped hydrophobes (the whole group) and acceptors (the Os).
>
> Make sense?


Makes sense to me!

Cheers

James



Re: [Rdkit-discuss] Beta of Q3 2013 release available

2013-10-25 Thread James Davidson
Hi Sereina,

Sereina wrote:
> Regarding the AssignBondOrdersFromTemplate() method:
> As far as I understood, the PDB reader assigns bond orders to the amino acids
> in a protein, but if a ligand is present it puts all bonds of it to SINGLE
> bonds as auto bond-type perception is not trivial (see Roger's comments).
> However, usually one knows which ligand was crystallized (i.e. the SMILES is
> available), so the AssignBondOrdersFromTemplate() method can be used to set
> the bond orders based on the known ligand structure.
> This is the idea of the method. Now, to your real-world application. I'm
> sorry but I don't think I understand it completely. Do you want to set only
> the bond orders of a specific substructure?
> Or would you like to give the function a set of ligands and a set of
> templates and it figures out which template belongs to which ligand and sets
> the bonds orders accordingly?

This is very likely to be me being stupid - so please bear with me!
If I read in a complex (pdb), and already have my reference ligand (lig), then 
AllChem.AssignBondOrdersFromTemplate(lig, pdb) fails because the reference 
ligand has not been matched to the ligand in the pdb 'complex' (dot-separated 
list of molecules).
The doc-string states that the method works on two molecules - but I want to 
work on a reference molecule (lig) and a *substructure* of the macromolecule 
(pdb).  How should I be getting the bound ligand out as a molecule object to 
then use the AssignBondOrdersFromTemplate() method?  Am I missing some new 
PDB-related methods, or have I forgotten some fundamental RDKit methods for 
dealing with multi-component molecules?

I guess a sensible process would be:
1. Identify any HETATM residues
2. For each residue (or at least those that have bonds!) extract or copy the 
mol (unless it can be addressed 'in place'?)
3. Use AssignBondOrdersFromTemplate() - relying on lookup by eg residue name, 
etc
4. Insert the molecule back into the complex (or update the info if it has been 
modified 'in place')

Is this how the method is intended to be used with complexes (and if so, do you 
have an example for steps 2 and 4)?
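
Step 1 of the process above (finding the HETATM residues) can be sketched without any toolkit at all. The function below is not an RDKit method; its name is made up, and it assumes the input follows the standard fixed-column PDB layout. It groups HETATM atom serial numbers by residue so that each ligand can then be looked up by residue name.

```python
from collections import defaultdict


def group_hetatm_residues(pdb_text, skip=("HOH",)):
    """Map (resName, chainID, resSeq) -> list of atom serial numbers.

    Only HETATM records are considered; residue names in `skip`
    (waters by default) are ignored.
    """
    groups = defaultdict(list)
    for line in pdb_text.splitlines():
        if not line.startswith("HETATM"):
            continue
        res_name = line[17:20].strip()   # columns 18-20
        if res_name in skip:
            continue
        chain_id = line[21]              # column 22
        res_seq = int(line[22:26])       # columns 23-26
        serial = int(line[6:11])         # columns 7-11
        groups[(res_name, chain_id, res_seq)].append(serial)
    return dict(groups)
```

Each key then identifies one candidate ligand, and the serial numbers give the atoms that would need to be pulled out (step 2) and matched against the reference structure (step 3).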

Thanks

James



Re: [Rdkit-discuss] Beta of Q3 2013 release available

2013-10-24 Thread James Davidson
Hi Greg (et al.),

Thanks for the beta!  I have been going through some of the recently-added 
functionality, and had a couple of questions regarding the PDB reading / 
writing.


1.   Do I remember correctly that there was a proposal (from Roger) to add 
some auto bond-type perception to the PDB parser for ligands (or is that just 
wishful thinking!)?

2.   If not, I notice that there is an AssignBondOrdersFromTemplate() 
method - but the example in the doc-string only shows (I think) the case where 
the input PDB is just a single small molecule - so the matching is pretty easy! 
I think a more real-world case is when one wants to set the bond orders for 
multiple ligands (HETATM residues) based on substructure matches - which will 
then return an atom index selection that can be used as a start point.  Is 
there any way to have the AssignBondOrdersFromTemplate() convenience function 
optionally accept a list of atom indexes to specify a substructure?

3.   Is there some explanation for what the 'flavor' option does for 
reading/writing PDB?

4.   Having read in a PDB file I see the correct atoms flagged as HETATM 
(from GetIsHeteroAtom()).  But when I call Chem.MolToPDBBlock() these atoms get 
written as ATOM records...  Also, a Chem.MolToPDBFile() method would be nice 
for completeness / symmetry : )

5.   It seems to me that GetResidueNumber() and GetSerialNumber() may have 
got mixed-up at some point(?).  At least, when I call GetSerialNumber() I see 
what appears to be the residue number; and when I call GetResidueNumber() I get 
0!

6.   I also seem to be seeing all of the bonds (for all residues) being 
written out in CONECT records - such that they all appear as single bonds in eg 
PyMOL - is this expected behaviour at the moment?
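
For points 4 and 5, it may help to spell out what a writer is expected to emit. The sketch below is not RDKit's writer; the function name and argument list are invented, and it only fills the fields up to the coordinates. It shows that the record name (ATOM vs HETATM), the atom serial number and the residue sequence number are three separate fixed-width fields.

```python
def format_atom_record(serial, atom_name, res_name, chain_id, res_seq,
                       x, y, z, is_hetero=False):
    """Format the first 54 columns of a PDB ATOM/HETATM line.

    Simplification: the atom name is left-justified in columns 13-16,
    whereas real files often start one-letter elements in column 14.
    """
    record = "HETATM" if is_hetero else "ATOM"
    return (f"{record:<6}{serial:>5} {atom_name:<4} {res_name:>3} "
            f"{chain_id:1}{res_seq:>4}    {x:8.3f}{y:8.3f}{z:8.3f}")


# Hypothetical ligand atom: serial 2 in residue number 301
print(format_atom_record(2, "C1", "LIG", "A", 301, 1.0, 2.0, 3.0, is_hetero=True))
```

Reading the line back, columns 7-11 give the serial number and columns 23-26 the residue number, so the two values should never be interchangeable.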

Cheers

James





Re: [Rdkit-discuss] Chemistry 101 question...

2013-10-22 Thread James Davidson
Hi JP, Nik, Greg, RDKitters

The question about the lipophilicity (or otherwise) of nitro groups was 
interesting to me...  I came from a CNS background, where there was, of course, 
a stricter requirement for molecules to be suitably lipophilic to cross the 
blood-brain barrier.  My recollection was that the observed lipophilicity of 
nitro groups was dependent on their local environments (ie electron rich / +m 
gave more polar character, and electron poor / -m gave less polar character)... 
 But rather than rely on my hazy recollections, I decided to have a quick look 
back at some historical reverse-phase analytical LC data.

What I did was take all retention times (in mins) under one well-used gradient 
method, and generate the matched-molecular pairs using George's KNIME node.  I 
was then only interested in *[H] >> [*][N+](=O)[O-] transformations, so 
filtered-down to just those changes involving 5 atoms in the transformation 
(because this was quicker than chemically searching!).  I then grouped across 
the examples of transformations to give some average changes in retention time, 
plus n, range, sd:

Transformation              Mean RT change (min)   RT range (min)   SD       n
*[H]>>[*]CCC                 2.5                    3.3              0.999    28
*[H]>>[*]C(C)C               2.19                   5.47             1.09     37
*[H]>>[*]CCCl                1.91                   1.5              1.06     2
*[H]>>[*]C(F)F               1.22                   1.36             0.748    3
*[H]>>[*]C1CC1               1.18                   1.04             0.436    4
*[H]>>[*]N(C)C               1.08                   1.21             0.472    6
*[H]>>[*]CSC                 0.67                   0                0        1
*[H]>>[*]OCC                 0.479                  4.67             1.17     15
*[H]>>[*][N+](=O)[O-]        0.169                  2.82             0.645    35
*[H]>>[*]NCC                 0.0625                 0.045            0.0318   2
*[H]>>[*]CCO                 0.06                   0.04             0.0283   2
*[H]>>[*]COC                -0.001                  2.46             0.62     14
*[H]>>[*]CC=C               -0.357                  0                0        1
*[H]>>[*]C(C)=O             -0.397                  1.21             0.696    3
*[H]>>[*]C(=O)O             -0.848                  4.3              2.17     3
*[H]>>[*]CC#N               -1.3                    2.35             1.66     2
*[H]>>[*]C(N)=O             -2.72                   0                0        1
*[H]>>[*]CCN                -2.77                   0                0        1



So on average over the 35 examples of H --> NO2 the change made the molecules 
slightly more lipophilic (or, at least, they were retained slightly longer on a 
C18 column).
I expect there is much more data-digging that could be done - particularly with 
larger data sets, and (maybe) with proper logP / logD measurements; but for now 
I am going to stick to thinking NO2 groups can be lipophilic additions(!)
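
For anyone wanting to repeat the grouping step, the per-transformation summary (mean, range, SD, n) can be sketched with the standard library alone. The function name is made up. The two input lists are not raw data from the original set: they are back-calculated from the mean and range reported for the two n = 2 rows (NCC and CCO), which also suggests the SD column is the sample standard deviation.

```python
from statistics import mean, stdev


def summarise(deltas):
    """Summarise retention-time changes (min) for one transformation."""
    n = len(deltas)
    return {
        "mean": mean(deltas),
        "range": max(deltas) - min(deltas),
        # A sample SD needs at least two values; report 0 for singletons
        "sd": stdev(deltas) if n > 1 else 0.0,
        "n": n,
    }


# Values reconstructed from the reported mean and range (assumption, see above)
rt_changes = {
    "*[H]>>[*]NCC": [0.04, 0.085],
    "*[H]>>[*]CCO": [0.04, 0.08],
}

for transformation, deltas in rt_changes.items():
    print(transformation, summarise(deltas))
```

With only one or two pairs behind a row, the mean says very little, which is worth keeping in mind when comparing against the 35-example nitro row.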

Cheers

James



Re: [Rdkit-discuss] Chemistry 101 question...

2013-10-22 Thread James Davidson
Hi Nik,

Nik wrote:
> Interesting. I wonder if this is also dependent on the transport phase that
> was used. Do you have any info on that? Was it a typical 10% MeOH or more
> something with dichlormethane?

I dug-out the conditions:
LC retention time Method A refers to elution of a sample through an XTERRA RP18 
(50 mm x 4.6 mm) 5 µm column under gradient conditions.  The initial eluent 
comprises 50% Methanol (pump-A) and 50% of a 10 mM aqueous ammonium acetate 
solution containing 5% IPA (pump-B) at a flow rate of 2 mL/min.  After 1 min, a 
gradient is run over 5 min to an end point of 80% pump-A and 20% pump-B, which 
is isocratically maintained for a further 3 min.  UV peak detection is 
generally carried out at a wavelength of 220 nm.
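
The gradient described above is a simple piecewise-linear profile, which is easier to see written out. This is just a sketch of the stated conditions (50% pump-A for 1 min, a linear ramp to 80% over the next 5 min, then a 3 min hold); the function name is invented.

```python
def percent_a(t_min):
    """Percentage of pump-A (methanol) at time t_min (minutes) under Method A."""
    if t_min < 0 or t_min > 9:
        raise ValueError("Method A runs from 0 to 9 min")
    if t_min <= 1:
        return 50.0                      # initial isocratic minute
    if t_min <= 6:
        # linear ramp from 50% to 80% over 5 minutes
        return 50.0 + (t_min - 1) * (80.0 - 50.0) / 5.0
    return 80.0                          # final 3 minute hold


def percent_b(t_min):
    """Percentage of pump-B (aqueous ammonium acetate / IPA) at time t_min."""
    return 100.0 - percent_a(t_min)
```

So a compound eluting at 3.5 min sees a mobile phase of roughly 65% methanol at the pump, ignoring any delay between the pump and the column.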


I should also say that, in my experience, even under normal-phase conditions 
(ie silica column and organic eluent) nitro-aromatics tend to behave 
'greasily'.  Who in big pharma wants to mine some nitration reaction data to 
pull out TLC plate Rf data (normal phase) + LC retention (reverse phase)?  I 
think your DB may be bigger than ours!  : )

Cheers

James



Re: [Rdkit-discuss] Handling reaction stereochemistry

2013-04-07 Thread James Davidson
Hi Greg,

> Correct, relative (or other forms of enhanced) stereochemistry is not
> possible. It's worth talking about how to deal with this, but it's going to be
> more than a little bit of work, I suspect.

I suspect so, too!


> The conversation about representation of and handling of enhanced
> stereochemistry, and what the actual use cases are, would be a good one to
> have. I think it's probably going to be difficult via email though. Maybe a topic
> for the UGM...

I agree re: email.  A topic for discussion at the UGM sounds like a very
good idea - that gives everybody 6 months to mull it over!


Kind regards

James



Re: [Rdkit-discuss] Handling reaction stereochemistry

2013-04-03 Thread James Davidson
Hi Greg

 
> I should have provided a bit more context around what the current behavior
> is, or at least what it's supposed to be. Sorry I forgot that.

My fault - I should have (re)read the manual (I thought it seemed a bit
familiar..!)


> Currently, when creating a reaction from rxnSMARTS, inversion/retention is
> handled by looking at the relative stereochemistry of atoms in the reactants
> and products.
>
> If they're different you get inversion (apologies for the extremely bogus
> example reaction):
>
> In [13]: rxn = AllChem.ReactionFromSmarts("[C@:1]>>[C@@:1]")
>
> In [14]: ps = rxn.RunReactants((Chem.MolFromSmiles('F[C@](Cl)(Br)I'),))
>
> In [15]: Chem.MolToSmiles(ps[0][0],True)
> Out[15]: 'F[C@@](Cl)(Br)I'
>
> In [16]: ps = rxn.RunReactants((Chem.MolFromSmiles('F[C@@](Cl)(Br)I'),))
>
> In [17]: Chem.MolToSmiles(ps[0][0],True)
> Out[17]: 'F[C@](Cl)(Br)I'
>
> and if they're the same you get retention:
>
> In [7]: rxn2 = AllChem.ReactionFromSmarts("[C@:1]>>[C@:1]")
>
> In [8]: ps = rxn2.RunReactants((Chem.MolFromSmiles('F[C@](Cl)(Br)I'),))
>
> In [9]: Chem.MolToSmiles(ps[0][0],True)
> Out[9]: 'F[C@](Cl)(Br)I'
>
> In [10]: rxn3 = AllChem.ReactionFromSmarts("[C@@:1]>>[C@@:1]")
>
> In [11]: ps = rxn3.RunReactants((Chem.MolFromSmiles('F[C@](Cl)(Br)I'),))
>
> In [12]: Chem.MolToSmiles(ps[0][0],True)
> Out[12]: 'F[C@](Cl)(Br)I'
>
>
> This much feels logical to me, though of course it can be changed if there's
> disagreement.

It sort of does to me too, but I can't shift the sensation that there
might be a can of worms here - more on that in a moment...


> If you call the reaction with non-chiral starting material, you get non-chiral
> output:
>
> In [20]: rxn3 = AllChem.ReactionFromSmarts('[C@@:1]>>[C@@:1]')
>
> In [21]: ps = rxn3.RunReactants((Chem.MolFromSmiles('FC(Cl)(Br)I'),))
>
> In [22]: Chem.MolToSmiles(ps[0][0],True)
> Out[22]: 'FC(Cl)(Br)I'
>
> This is probably also ok; it certainly reflects what would happen in the lab
> (er, at least I think it does).

Just to be a pedant for a moment (but actually, this could be important
later) - this is actually calling the reaction with *chiral* (albeit
presumably racemic) starting material


> So far so good. We've got inversion of stereochemistry and retention of
> stereochemistry. There are two cases left: resolution/creation and
> scrambling.
>
> One obvious thing to do here would be:
>
>   [C@:1]>>[C:1]   scrambling
>   [C:1]>>[C@:1]   resolution/induction
>
> This is where my extremely bogus example starts to make things more
> difficult to understand, so here's a more real example of the induction case:
> [#6:1]/[C:2]=[C:3](/[#6:4])>>[#6:1][C@H:2](Br)[C@H:3](Br)[#6:4]
>
> Seem right?
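
To keep the four cases in one place, here is a rough Python sketch of the convention as described so far in this thread. It is only a lookup over the chirality tags written on one mapped atom in the reactant and product templates. The function name `template_outcome` is made up for illustration, nothing here calls or reproduces the RDKit implementation, and the scrambling and induction rows are the proposal under discussion rather than confirmed behaviour.

```python
def template_outcome(reactant_tag, product_tag):
    """Classify a mapped atom by its chirality tags in the two templates.

    Tags are '@', '@@', or '' (no chirality written on the atom).
    Hypothetical helper: it only restates the convention from the thread.
    """
    valid = ('', '@', '@@')
    if reactant_tag not in valid or product_tag not in valid:
        raise ValueError('tags must be "", "@" or "@@"')
    if reactant_tag and product_tag:
        # Both sides specified: compare the relative stereochemistry.
        return 'retention' if reactant_tag == product_tag else 'inversion'
    if reactant_tag:
        return 'scrambling'            # e.g. [C@:1]>>[C:1] (proposed)
    if product_tag:
        return 'resolution/induction'  # e.g. [C:1]>>[C@:1] (proposed)
    return 'unspecified'


for pair in [('@', '@@'), ('@', '@'), ('@@', '@@'), ('@', ''), ('', '@')]:
    print(pair, '->', template_outcome(*pair))
```

The point of writing it this way is that the outcome depends only on the pair of template tags, never on the absolute configuration of the incoming molecule.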

Can of worms alert 1!!  At first sight this seems perfectly ok(?) -
as long as we accept that we know what we mean by the (R) flags on the
carbons (by my reckoning we probably mean syn addition of Br2 across a
double-bond?).  But - problems of symmetry and atom priorities aside(!)
- what do I do if I want to employ the same transformation but with no
absolute stereo-control (ie if I don't have the same wonder-catalyst)?
At the moment I guess there is no way to represent relative
stereochemistry in the absence of an enhanced stereochemistry model?

This brings me on to the main can of worms sensation - and I think it
may revolve around trying to service both real and 'virtual/fake' reactions
in the same system, as well as some obvious concerns about enhanced
stereochemistry.  So some examples / questions:

1.  I have a super-useful enzyme that will only hydrolyse (R)-esters (or
more precisely I should say it won't hydrolyse (S)-esters).  So:

CC[C@H](C)C(=O)OC>>CC[C@H](C)C(=O)O  ## R gets hydrolysed
CC[C@@H](C)C(=O)OC>>CC[C@@H](C)C(=O)OC  ## S doesn't
CCC(C)C(=O)OC  ## Oh dear, what do we want to happen here?  I know what
my enzyme will do - but we do have to assume that we are implying a
racemic mix (it gets more worrying if we might mean a single, but
unknown, enantiomer, or we might know nothing at all - we're back to
enhanced stereochemistry again!)
CCC(C)C(=O)OC>>CC[C@H](C)C(=O)O.CC[C@@H](C)C(=O)OC  ## So this is
what the enzyme would do - because we have treated the chiral centre as
a racemic mix - essentially expanding out to:
CC[C@H](C)C(=O)OC.CC[C@@H](C)C(=O)OC>>CC[C@H](C)C(=O)O.CC[C@@H](C)C(=O)OC

The problem with this is that it doesn't fit with the existing rSMARTS
nomenclature for retention and inversion, because the absolute
stereochemistry of the starting material affects the outcome of the
reaction!  But I guess my enzyme reaction above would be represented as
something like

[C@:1][C:2](=[O:3])[O:4]C>>[C@:1][C:2](=[O:3])[O:4]H

But we would have to (a) assume now that '@' in the starting material
only matched (R), and (b) treat incoming racemates intrinsically as
two-component mixtures of (R) and (S) to then apply the transformation
to just the (R) and add the (S) starting material to the products...
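
Points (a) and (b) can be spelled out as a toy model in Python. This is only a sketch under loud assumptions: the stereo labels 'R', 'S' and None stand in for real structures, `enzymatic_hydrolysis` is a hypothetical name, and no toolkit is involved. It just shows that the outcome now depends on the absolute configuration of the input, which the retention/inversion reading of the template does not capture.

```python
def enzymatic_hydrolysis(ester_label):
    """Return the product mixture for an ester with the given stereo label.

    ester_label is 'R', 'S', or None (no stereo given, treated as a racemate).
    Hypothetical toy model: labels stand in for real structures.
    """
    if ester_label is None:
        # Assumption (b): expand the racemate into its two components.
        components = ['R', 'S']
    elif ester_label in ('R', 'S'):
        components = [ester_label]
    else:
        raise ValueError("label must be 'R', 'S' or None")

    products = []
    for label in components:
        if label == 'R':
            # Assumption (a): only the (R)-ester matches the template.
            products.append(('acid', 'R'))
        else:
            # The (S)-ester is carried through to the products unchanged.
            products.append(('ester', 'S'))
    return products


for label in ('R', 'S', None):
    print(label, '->', enzymatic_hydrolysis(label))
```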


2.  I am a database admin, and I want to transform some mis-assigned
racemates to the (S) enantiomers

Eg 

Re: [Rdkit-discuss] Handling reaction stereochemistry

2013-04-02 Thread James Davidson
Hi Greg,

 

> I've got a question for the community about how chirality should be
> handled in reactions.
>
> This morning I managed to fix one of the outstanding reaction
> stereochemistry problems in the RDKit: the loss of chirality when one
> bond to a stereocenter is to an unmapped atom. Here's a quick demo of
> the new behavior (not yet checked in; there are still a couple things to
> be cleaned up):
>
> In [7]: rxn = AllChem.ReactionFromSmarts('[C:1]-O>>[C:1]-S')
>
> In [8]: ps = rxn.RunReactants((Chem.MolFromSmiles('F[C@H](O)Cl'),))
>
> In [9]: Chem.MolToSmiles(ps[0][0],True)
> Out[9]: 'F[C@H](S)Cl'
>
> It seems nice to be able to preserve chirality in these cases.
>
> The question that comes up is: *Should* we be preserving chirality in
> these cases? The change makes it impossible to indicate a reaction
> that scrambles stereochemistry. That doesn't seem right.
>
> So... the question to you guys: How should stereochemistry
> inversion/retention/loss be indicated in Reaction SMARTS?

 

 

Good question - and let me be the first to jump in, feet first, without
thinking enough!  : )

Instinctively, I would say it would be good to (a) scramble
stereochemistry if not otherwise specified - at least this way we
default to losing information rather than risking keeping incorrect
information; (b) use a flag at each centre if we want to retain
stereochemistry (what about '@'?); (c) use another flag if we want to
invert (and, inventive I know, what about '@@'?).

 

So in the above example, let's say I want to always invert (eg to
represent an SN2 reaction) - the rSMARTS could then be something like
[C:1]-O>>[C:1@@]-S, and the example input above would give F[C@@H](S)Cl
out.

The same output with no specification could give FC(H)(S)Cl and, of
course, achiral input would always give achiral output - regardless of
the flag in the rSMARTS.
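
As a quick sanity check of suggestions (a) to (c), here is a minimal Python sketch of the proposal. It is not how RDKit behaves: `apply_centre_flag` is a hypothetical helper, and it treats '@' / '@@' as abstract handedness labels, ignoring the neighbour-ordering subtleties of real SMILES.

```python
def apply_centre_flag(input_tag, flag):
    """Output chirality tag for one centre under the proposed flags.

    input_tag: '@', '@@' or '' (achiral or unspecified input centre).
    flag: '@' (retain), '@@' (invert) or '' (nothing specified: scramble).
    Hypothetical helper illustrating the proposal only.
    """
    valid = ('', '@', '@@')
    if input_tag not in valid or flag not in valid:
        raise ValueError('tags and flags must be "", "@" or "@@"')
    if input_tag == '':
        return ''          # achiral in, achiral out, whatever the flag
    if flag == '':
        return ''          # (a) default: lose the stereo information
    if flag == '@':
        return input_tag   # (b) retain
    return '@@' if input_tag == '@' else '@'   # (c) invert


print(repr(apply_centre_flag('@', '@@')))   # invert an '@' centre
print(repr(apply_centre_flag('@', '')))     # no flag: scrambled
print(repr(apply_centre_flag('', '@@')))    # achiral input stays achiral
```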

 

 

> Bonus points to anyone who can explain to me how the
> inversion/retention flags in RXN files should be handled. At the moment
> the RDKit uses what's in the products and ignores them in the reactants.

 

Something like the above? (I told you I hadn't thought about it enough!)

 

Kind regards

 

James




[Rdkit-discuss] Polymers, S-Groups, and molblock-parsing (oh my!)

2011-10-19 Thread James Davidson
Dear All,
 
I just wanted to raise an observation about the behaviour of the
molblock parser.  I was running some SMARTS-based substructure queries
in KNIME, and happened to be looking for aromatic N-oxides - the query
was just nO - which should maybe be the answer as well!  : )
 
Anyway, I was actually searching DrugBank (via the SDF -
http://www.drugbank.ca/system/downloads/current/structures/small_molecule.sdf.zip)
and found Heparin was a hit for my query - which I thought
was a bit funny as there are no aromatic nitrogens.  It seems, however,
that the match is due to the * atoms in the molblock (see below) that
are representing the polymer repeat points (leading to *-O, which is
matching n-O).  As I understand it, the rest of the info about the
polymer is stored as S-Group data - and I am presuming that RDKit is not
currently interpreting this(?)
 
So I guess the simple question is - should polymers, etc be handled by
the parser (maybe if not fully, just partially - eg by deleting the *
atoms if the S-Group data are found)?
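
For what it's worth, spotting the * atoms before they reach a substructure search is easy to sketch in plain Python. The snippet below is only an illustration and does not use RDKit: `star_atom_indices` is a hypothetical helper, and the three atom lines are written by hand in the standard V2000 fixed-width layout (the 0.0000 z coordinate is an assumption), not copied verbatim from the record below.

```python
def star_atom_indices(atom_lines):
    """Return 1-based indices of atoms whose V2000 symbol field is '*'.

    In a V2000 atom line the symbol occupies columns 32-34, after three
    10-character coordinate fields and a single space.
    """
    hits = []
    for index, line in enumerate(atom_lines, start=1):
        symbol = line[31:34].strip()
        if symbol == '*':
            hits.append(index)
    return hits


atom_block = [
    "   12.8725  -11.1521    0.0000 C   0  0  1  0  0  0  0  0  0  0  0  0",
    "   11.8517  -11.7493    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0",
    "    6.9121  -13.5594    0.0000 *   0  0  0  0  0  0  0  0  0  0  0  0",
]
print(star_atom_indices(atom_block))  # [3]
```

Whether such atoms should then be deleted, capped, or kept as attachment points is exactly the open question above; the sketch only finds them.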
 
Kind regards
 
James
 
 
 
  Mrv0541 09201117322D  
 
14  0  0  1  0999 V2000
   12.8725  -11.15210. C   0  0  1  0  0  0  0  0  0  0  0  0
   13.5903  -11.56670. C   0  0  1  0  0  0  0  0  0  0  0  0
   12.8725  -10.32720. C   0  0  2  0  0  0  0  0  0  0  0  0
   11.8517  -11.74930. O   0  0  0  0  0  0  0  0  0  0  0  0
   14.2992  -11.15210. C   0  0  2  0  0  0  0  0  0  0  0  0
   13.5903  -12.39140. O   0  0  0  0  0  0  0  0  0  0  0  0
   13.5903   -9.91720. O   0  0  0  0  0  0  0  0  0  0  0  0
   12.1547   -9.91720. C   0  0  0  0  0  0  0  0  0  0  0  0
   10.8307  -12.33350. C   0  0  1  0  0  0  0  0  0  0  0  0
   14.2992  -10.32720. C   0  0  2  0  0  0  0  0  0  0  0  0
   14.9729  -11.91850. N   0  0  0  0  0  0  0  0  0  0  0  0
   11.4415  -10.32720. O   0  0  0  0  0  0  0  0  0  0  0  0
   10.8307  -13.15820. C   0  0  1  0  0  0  0  0  0  0  0  0
   10.1175  -11.92320. O   0  0  0  0  0  0  0  0  0  0  0  0
   15.3200   -9.74330. O   0  0  0  0  0  0  0  0  0  0  0  0
   16.1934  -11.91390. S   0  0  0  0  0  0  0  0  0  0  0  0
   10.1175  -13.57280. C   0  0  2  0  0  0  0  0  0  0  0  0
   11.7684  -14.14450. O   0  0  0  0  0  0  0  0  0  0  0  0
9.3996  -12.33350. C   0  0  2  0  0  0  0  0  0  0  0  0
   16.3409   -9.15040. C   0  0  2  0  0  0  0  0  0  0  0  0
   16.1889  -12.73870. O   0  0  0  0  0  0  0  0  0  0  0  0
   16.1889  -11.08920. O   0  0  0  0  0  0  0  0  0  0  0  0
   17.0225  -11.91390. O   0  0  0  0  0  0  0  0  0  0  0  0
9.3996  -13.15820. C   0  0  1  0  0  0  0  0  0  0  0  0
   10.1175  -14.39750. O   0  0  0  0  0  0  0  0  0  0  0  0
8.6864  -11.92320. C   0  0  0  0  0  0  0  0  0  0  0  0
   16.3409   -8.32570. C   0  0  2  0  0  0  0  0  0  0  0  0
   17.0586   -9.56500. C   0  0  1  0  0  0  0  0  0  0  0  0
8.6819  -13.56830. O   0  0  0  0  0  0  0  0  0  0  0  0
7.9730  -12.33350. O   0  0  0  0  0  0  0  0  0  0  0  0
8.6864  -11.09850. O   0  0  0  0  0  0  0  0  0  0  0  0
   17.0586   -7.91540. O   0  0  0  0  0  0  0  0  0  0  0  0
   15.6276   -7.91540. C   0  0  0  0  0  0  0  0  0  0  0  0
   17.7720   -9.15040. C   0  0  2  0  0  0  0  0  0  0  0  0
   17.0586  -10.39420. O   0  0  0  0  0  0  0  0  0  0  0  0
6.9121  -13.55940. *   0  0  0  0  0  0  0  0  0  0  0  0
   17.7720   -8.32570. C   0  0  2  0  0  0  0  0  0  0  0  0
   14.9099   -8.32570. O   0  0  0  0  0  0  0  0  0  0  0  0
   15.6276   -7.09070. O   0  0  0  0  0  0  0  0  0  0  0  0
   19.3208  -10.04870. O   0  0  0  0  0  0  0  0  0  0  0  0
   18.7974   -7.73260. O   0  0  0  0  0  0  0  0  0  0  0  0
   19.8138   -7.14420. C   0  0  2  0  0  0  0  0  0  0  0  0
   19.8138   -6.31940. C   0  0  2  0  0  0  0  0  0  0  0  0
   20.5314   -7.55890. C   0  0  1  0  0  0  0  0  0  0  0  0
   20.5314   -5.90930. O   0  0  0  0  0  0  0  0  0  0  0  0
   19.1005   -5.90930. C   0  0  0  0  0  0  0  0  0  0  0  0
   21.2449   -7.14420. C   0  0  2  0  0  0  0  0  0  0  0  0
   20.5314   -8.38800. O   0  0  0  0  0  0  0  0  0  0  0  0
   21.2449   -6.31940. C   0  0  0  0  0  0  0  0  0  0  0  0
   18.2713   -5.90040. O   0  0  0  0  0  0  0  0  0  0  0  0
   22.5298   -7.73480. N   0  0  0  0  0  0  0  0  0  0  0  0
   22.7828   -5.43680. *   0  0  0  0  0  0  0  0  0  0  0  0
   17.4465   -5.89590. S   0  0  0  0  0  0  0  0  0  0  0  0
   22.5342   -8.56390. C   0  0  0  0  0  0  0  0  0  0  0  0
   17.4421   -6.72070. O   0  0  0  0  0  0  0  0  0  0  0  0

Re: [Rdkit-discuss] 2011.09 (Q3 2011) RDKit release

2011-10-17 Thread James Davidson
Thanks Greg, and George.
 
I have not tested the new win-py27 binary fully - but it does at least
behave itself when importing AllChem!
 
Kind regards
 
James



Re: [Rdkit-discuss] 2011.09 (Q3 2011) RDKit release

2011-10-16 Thread James Davidson
Hi Greg,
 
I probably should have picked this up in the beta (but didn't...)  When
I try to import AllChem, I see the following:
 
>>> from rdkit import Chem
>>> from rdkit.Chem import AllChem

Traceback (most recent call last):
  File "<pyshell#6>", line 1, in <module>
    from rdkit.Chem import AllChem
  File "C:\Python27\RDKit_2011_09_1\rdkit\Chem\AllChem.py", line 28, in <module>
    from rdkit.Chem.rdSLNParse import *
ImportError: DLL load failed: The specified module could not be found.
 
Any advice?
 
Kind regards
 
James



Re: [Rdkit-discuss] Beta of Q3 2011 Release Available

2011-10-02 Thread James Davidson
Hi Greg,

> If there's demand for it, I will also put up a windows binary.

As usual, I'd appreciate a Windows build against python 2.7  : )

Thanks

James



Re: [Rdkit-discuss] Lipinski HBD count

2011-09-30 Thread James Davidson
Hi Greg,

Greg wrote:
> You actually don't need to add the Hs:
> >>> p1 = Chem.MolFromSmarts('[#7,#8;H1]')
> >>> p2 = Chem.MolFromSmarts('[#7,#8;H2]')
> >>> p3 = Chem.MolFromSmarts('[#7,#8;H3]')
> >>> m = Chem.MolFromSmiles('CC(=O)N')
> >>> m2 = Chem.MolFromSmiles('OCC(=O)N')
> >>> def NHOHCount(mol):
> ...     return (len(mol.GetSubstructMatches(p1))
> ...             + 2*len(mol.GetSubstructMatches(p2))
> ...             + 3*len(mol.GetSubstructMatches(p3)))
> ...
> >>> NHOHCount(m)
> 2
> >>> NHOHCount(m2)
> 3

I think this system works well in almost all cases : )  However, I had a
nagging concern over a couple of 'edge' cases - namely water, and
ammonia (and for that matter, the oxonium and ammonium ions).

I guess the simple inclusion of p4 = Chem.MolFromSmarts('[#7,#8;H4]')
(weighted by 4 in the sum) would make sure all cases were covered(?).
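
To make the edge case concrete without needing a toolkit, here is a small pure-Python sketch of what the weighted pattern sum computes. The names are hypothetical and the atoms are given as hand-written (symbol, hydrogen count) pairs rather than parsed from a structure. The idea is simply that an atom carrying more hydrogens than the largest pattern contributes nothing.

```python
def nhoh_count(atoms, max_h_pattern=3):
    """Sum hydrogens on N and O atoms, as the H1/H2/H3 patterns would.

    atoms: list of (element_symbol, total_hydrogen_count) pairs.
    max_h_pattern: highest hydrogen count that has its own pattern.
    Atoms with more hydrogens than that are missed entirely.
    """
    total = 0
    for symbol, h_count in atoms:
        if symbol in ('N', 'O') and 1 <= h_count <= max_h_pattern:
            total += h_count
    return total


acetamide = [('C', 3), ('C', 0), ('O', 0), ('N', 2)]
ammonium = [('N', 4)]

print(nhoh_count(acetamide))                  # 2
print(nhoh_count(ammonium))                   # 0 - missed by H1..H3
print(nhoh_count(ammonium, max_h_pattern=4))  # 4 - caught with an H4 pattern
```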

Out of interest, I decided to compile a small list of 'normal' and
'edge' case SMILES, and ran it through the MOE descriptor node in KNIME.
For all these cases, lip_don behaves as I would expect (tab-separated
output included below)

Kind regards

James

SMILES      a_acc  a_don  lip_acc  lip_don
CO          1.0    1.0    1.0      1.0
C(=O)N      1.0    1.0    2.0      2.0
O           1.0    1.0    1.0      2.0
CN          1.0    1.0    1.0      2.0
[O+]        1.0    0.0    1.0      3.0
C[O+]       1.0    0.0    1.0      2.0
[N+]        0.0    0.0    1.0      4.0
C[N+]       0.0    0.0    1.0      3.0
[N-]        0.0    1.0    1.0      2.0
[O-]        0.0    1.0    1.0      1.0
C(=O)[N-]   0.0    1.0    2.0      1.0



Re: [Rdkit-discuss] Lipinski HBD count

2011-09-30 Thread James Davidson
Hi Greg,

Greg wrote:
> For what it's worth: the results here are definitely not
> correct for the SMILES as provided. Atoms in SMILES that are
> in square brackets have no implicit Hs, so [N+] actually has
> zero hydrogens. I guess you actually provided the molecules
> to MOE in some other form.

Oops - you're quite right - I converted them to MOL format with ChemAxon
MolConverter.  However, the point about implicit hydrogens for atoms in
square brackets had completely passed me by - thanks!
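
For anyone else who had missed this, the rule can be sketched for the single-atom SMILES used in this thread. This is a toy in plain Python, not a SMILES parser and not RDKit: the function name, the regular expression and the small valence table are illustrative assumptions, and it only handles one unbonded atom with an optional H count and a simple charge.

```python
import re

# Hydrogens implied for a bare, unbonded organic-subset atom
# (its lowest normal valence).
DEFAULT_VALENCE = {'B': 3, 'C': 4, 'N': 3, 'O': 2, 'P': 3, 'S': 2,
                   'F': 1, 'Cl': 1, 'Br': 1, 'I': 1}

# [symbol, optional H count, optional charge], e.g. [NH4+], [OH-], [N+]
BRACKET = re.compile(r'^\[([A-Z][a-z]?)(?:H(\d*))?([+-]\d*)?\]$')


def single_atom_hydrogens(smiles):
    """Hydrogen count for a one-atom SMILES such as 'N', '[N+]' or '[NH4+]'."""
    match = BRACKET.match(smiles)
    if match:
        h_field = match.group(2)
        if h_field is None:
            return 0              # bracket atom, no H written: zero hydrogens
        return int(h_field or 1)  # 'H' alone means one hydrogen
    return DEFAULT_VALENCE[smiles]  # bare atom: fill the normal valence


for smi in ('N', '[N+]', '[NH4+]', 'O', '[OH-]'):
    print(smi, single_atom_hydrogens(smi))
```

This is why the bracketed entries need their hydrogens written out, as in [NH4+] or [OH3+], to mean what was intended.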

> Output with the SVN version of the RDKit:
>
> #--
> Smiles       NOCount  NHOHCount
> CO           1        1
> C(=O)N       2        2
> O            1        2
> CN           1        2
> [OH3+]       1        3
> C[OH2+]      1        2
> [NH4+]       1        4
> C[NH3+]      1        3
> [NH2-]       1        2
> [OH-]        1        1
> C(=O)[NH-]   2        1
> #--


Looks great!

Kind regards

James



Re: [Rdkit-discuss] Beta of Q2 2011 Release Available

2011-07-04 Thread James Davidson
Hi Greg, 

> > windows binary (py27, please  : )  )
>
> It's up on the google download page; hopefully I remembered
> all the DLLs this time. :-S
>
> -greg


The binary works a treat - no sign of missing DLLs - thanks!



Re: [Rdkit-discuss] rdkit.Chem.Draw.spingCanvas.py (and py27 aggdraw / cairo help?)

2011-06-11 Thread James Davidson
Hi Greg,

> Greg wrote:
> The attached .pyd is 32-bit aggdraw build for python2.7 on
> windows. I tested it very briefly and it seems to work; let
> me know if you have problems with it.

It works a treat - very much appreciated!  My molecules have never
looked better  : )



Re: [Rdkit-discuss] rdkit.Chem.Draw.spingCanvas.py (and py27 aggdraw / cairo help?)

2011-06-10 Thread James Davidson
Dear Greg, Riccardo, et al.


> Riccardo wrote:
> I don't know exactly about the other problems, but this one
> should be related to the version of the installed PIL. If I
> remember correctly, BGRA raw mode requires PIL 1.1.7.

@Riccardo - 
Thanks for the advice, Riccardo.  I think I was already on 1.1.7 - but
maybe an alpha release(?)  Anyway, I have now standardised across Python
2.6 and 2.7 with the latest PIL installers from
http://www.pythonware.com/products/pil/.


> Greg wrote:
> I will try to do an aggdraw build for 2.7. If I succeed, I'll
> post something.

@Greg -
Thanks for the kind offer.  I for one would be very pleased to be using
aggdraw again (as I think the image quality seems the best).  I am
pleased to say, however, that it is less critical now that I have worked
through my cairo issues!  : )


Previously I was getting:

Python 2.7.1 (r271:86832, Nov 27 2010, 18:30:46) [MSC v.1500 32 bit
(Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> from rdkit import Chem
>>> from rdkit.Chem import AllChem, Draw
>>> mol = Chem.MolFromSmiles('c1ccccc1')
>>> AllChem.Compute2DCoords(mol)
>>> im = Draw.MolToImage(mol)
!!!PYTHONW.EXE CRASH!!!


Finally, after a morning of going round in circles (and following
red-herring Dependency Walker trails(!)), things are running well; with
RDKit now happily calling cairo!  I thought it might be useful for
others to list the versions of software / DLLs that I finally found to
work:

Windows XP Pro. SP3 (32-bit)
Python 2.7.1 (http://www.python.org/ftp/python/2.7.1/python-2.7.1.msi)
PIL 1.1.7 (installer -
http://effbot.org/downloads/PIL-1.1.7.win32-py2.7.exe)
Pycairo-1.8.10.win32-py2.7 (installer -
http://ftp.gnome.org/pub/GNOME/binaries/win32/pycairo/1.8/pycairo-1.8.10.win32-py2.7.exe)
libcairo-2.dll (get from the following archive:
http://wxpython.org/cairo/cairo_1.8.6-1_win32.zip)
libpng12-0.dll (get from the following archive:
http://wxpython.org/cairo/libpng_1.2.34-1_win32.zip)
zlib1.dll (get from the following archive:
http://wxpython.org/cairo/zlib123-dll.zip)

I then put the 3 DLLs into the C:\Python27\Lib\site-packages\cairo\
folder, and made sure that this is on the system path.  The use of the
wxPython DLLs seemed to be the key to sorting things out (I certainly
tried a few other versions!) - thanks to Alex Matan's blog-post for the
instructions
(http://electromagnetictelegraph.com/install-cairo-wxpyton-pycairo-python-windows)
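
In case it saves someone else a morning, the two things that mattered (the DLLs being in the folder, and the folder being on the path) can be checked with a few lines of standard-library Python. This is just a sketch: the helper names are made up, and the folder and DLL names are the ones listed above, so adjust them for your own install.

```python
import os


def missing_dlls(folder, dll_names):
    """Return the names from dll_names that are not present in folder."""
    present = {name.lower() for name in os.listdir(folder)}
    return [name for name in dll_names if name.lower() not in present]


def folder_on_path(folder, path_value=None):
    """True if folder is one of the entries in PATH (or in path_value)."""
    if path_value is None:
        path_value = os.environ.get('PATH', '')
    wanted = os.path.normcase(os.path.normpath(folder))
    entries = [entry for entry in path_value.split(os.pathsep) if entry]
    return any(os.path.normcase(os.path.normpath(entry)) == wanted
               for entry in entries)


CAIRO_DIR = r'C:\Python27\Lib\site-packages\cairo'
DLLS = ['libcairo-2.dll', 'libpng12-0.dll', 'zlib1.dll']

if os.path.isdir(CAIRO_DIR):
    print('missing:', missing_dlls(CAIRO_DIR, DLLS))
    print('on PATH:', folder_on_path(CAIRO_DIR))
```

It will not tell you whether the DLL versions are compatible with each other, which was the real problem here, but it rules out the simple mistakes first.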

The setup under 2.6 was exactly the same - except I used the
corresponding 2.6 installer for PIL, and used the
Pycairo-1.8.4.win32-py26 from the wxPython site
(http://wxpython.org/cairo/pycairo-1.8.4.win32-py2.6.exe) as detailed in
Alex's blog.


I can add these instructions to the wiki if you like(?)

Kind regards

James



[Rdkit-discuss] rdkit.Chem.Draw.spingCanvas.py (and py27 aggdraw / cairo help?)

2011-06-09 Thread James Davidson
Dear All,
 
I am in the process of upgrading to python 2.7 under Windows, and part
of this has included moving to the RDKit_2011_03_2 (py27) build.  I had
previously done most work with earlier versions of RDKit under python
2.6, but have found a problem with calling Draw.MolToImage() with the
latest RDKit binary for both py26 and py27:
 
Traceback (innermost last):
  File "C:\Python26\lib\site-packages\Pmw\Pmw_1_3\lib\PmwBase.py", line 1747, in __call__
    return apply(self.func, args)
  File "C:\Python26\lib\site-packages\pmg_tk\startup\VerMOL.py", line 1188, in <lambda>
    command=lambda s=self:s.draw_ligand(self.modelling_chainlist.listbox, self.ligcanvas, self.smiles, '3D',200,200, 'modelling_lig_image'))
  File "C:\Python26\lib\site-packages\pmg_tk\startup\VerMOL.py", line 2871, in draw_ligand
    im = Draw.MolToImage(mol, size=(x,y))
  File "C:\Python26\RDKit_2011_03_2\rdkit\Chem\Draw\__init__.py", line 71, in MolToImage
    drawer.AddMol(mol,**kwargs)
  File "C:\Python26\RDKit_2011_03_2\rdkit\Chem\Draw\MolDrawing.py", line 361, in AddMol
    color=color,width=width,color2=color2)
  File "C:\Python26\RDKit_2011_03_2\rdkit\Chem\Draw\MolDrawing.py", line 190, in _drawBond
    dash=self.dash)
  File "C:\Python26\RDKit_2011_03_2\rdkit\Chem\Draw\MolDrawing.py", line 169, in _drawWedgedBond
    self.canvas.addCanvasDashedWedge(poly[0],poly[1],poly[2],color=color)
  File "C:\Python26\RDKit_2011_03_2\rdkit\Chem\Draw\spingCanvas.py", line 104, in addCanvasDashedWedge
    pts1 = _getLinePoints(p1,p2,dash)
<type 'exceptions.NameError'>: global name '_getLinePoints' is not defined
 
 
Not a big problem to sort - I think spingCanvas.addCanvasDashedWedge()
should read:
 
pts1 = self._getLinePoints(p1,p2,dash)
pts2 = self._getLinePoints(p1,p3,dash)
 
on lines 104, 105 instead of:
 
pts1 = _getLinePoints(p1,p2,dash)
pts2 = _getLinePoints(p1,p3,dash)
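
To illustrate why the self. prefix matters, here is a minimal sketch using a hypothetical SketchCanvas class (a stand-in, not the real spingCanvas code). A helper defined as a method is not visible as a bare global name from inside another method, which is the kind of NameError shown in the traceback.

```python
class SketchCanvas(object):
    """Hypothetical stand-in for a canvas class; not the RDKit implementation."""

    def _getLinePoints(self, p1, p2, dash):
        # Placeholder helper: just return the two end points.
        return [p1, p2]

    def broken(self, p1, p2, dash=(2, 2)):
        # Bare name lookup searches local and global scope, not the class.
        return _getLinePoints(p1, p2, dash)

    def fixed(self, p1, p2, dash=(2, 2)):
        # Attribute lookup through self finds the method.
        return self._getLinePoints(p1, p2, dash)


canvas = SketchCanvas()
try:
    canvas.broken((0, 0), (1, 1))
    broken_result = "no error"
except NameError:
    broken_result = "NameError"

fixed_result = canvas.fixed((0, 0), (1, 1))
print(broken_result, fixed_result)
```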
 
 
Anyway, this is only a problem if spingCanvas is being called - which I
think only happens as a last resort if aggdraw or cairo aren't found.
So on that note, the reason I was calling spingCanvas was that I don't
have a build of aggdraw for python 2.7, and I have found that when
cairo/pycairo are available to python 2.7 I get a pythonw.exe
Application Error at the point of calling Draw.MolToImage().  Under
python 2.6 I thought I would see what would happen if I removed aggdraw
to force cairo into play (different version of PIL, different version of
cairo - not an ideal comparison!):
 
  File "C:\Python26\RDKit_2011_03_2\rdkit\Chem\Draw\__init__.py", line 54, in MolToImage
    canvas = Canvas(img)
  File "C:\Python26\RDKit_2011_03_2\rdkit\Chem\Draw\cairoCanvas.py", line 38, in __init__
    imgd = image.tostring("raw","BGRA")
  File "C:\Python26\lib\site-packages\PIL\Image.py", line 516, in tostring
    e = _getencoder(self.mode, encoder_name, args)
  File "C:\Python26\lib\site-packages\PIL\Image.py", line 389, in _getencoder
    return apply(encoder, (mode,) + args + extra)
<type 'exceptions.SystemError'>: unknown raw mode
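
The fallback order described above (aggdraw first, then cairo, then sping as a last resort) can be sketched generically. This is only an assumed illustration of the try-each-import pattern, not the actual rdkit.Chem.Draw code, and first_available is a hypothetical helper name. It also only catches ImportError, so it would not protect against a backend that imports cleanly and then crashes later.

```python
import importlib


def first_available(module_names):
    """Return (name, module) for the first importable module, else (None, None)."""
    for name in module_names:
        try:
            return name, importlib.import_module(name)
        except ImportError:
            continue
    return None, None


# Assumed preference order, taken from the description in the message.
backend_name, backend = first_available(["aggdraw", "cairo"])
if backend_name is None:
    backend_name = "sping"  # last-resort fallback described in the text

print("drawing backend:", backend_name)
```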
 
 
If it helps, I can follow-up with more details on exact versions of
DLLs, etc; but for now wondered if:
 
(a) anybody had a version of aggdraw for windows, built with python 2.7?
(b) or any recommendations for reliable PIL / cairo / pycairo
combinations for python 2.7 / windows?
 
Kind regards
 
 
James


Re: [Rdkit-discuss] Sample RD Files?

2011-06-08 Thread James Davidson
Hi Greg,

Thanks for the python-full reply!

 # let's test the reaction to make sure it works.
 # due to a (already reported) bug in the way atom properties are handled,
 # nrxn cannot be directly used, so we use a hack and reparse it:
  nrxn = AllChem.ReactionFromSmarts(AllChem.ReactionToSmarts(nrxn))
  nrxn.Validate()
 # now we can run a molecule through to make sure it works:
  nmol = Chem.MolFromSmiles('c1ccccc1C')
  nps = nrxn.RunReactants((nmol,))
  print Chem.MolToSmiles(nps[0][0])
 # output is: BrCc1ccccc1
 
 Is that what you're looking for?

It certainly allows me to do what I want - which is get a mapped RXN
out.  And this can even be done with coordinates - which I have added
below as a reminder to anyone (which included me until about 10 mins
ago!) who had forgotten:

AllChem.Compute2DCoordsForReaction(nrxn)
rxnBlock = AllChem.ReactionToRxnBlock(nrxn)
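
For anyone wanting to try the two calls in isolation, a minimal self-contained sketch follows. The reaction SMARTS is an assumption chosen for illustration (bromination of an aryl methyl group); it is not the reaction from Greg's reply.

```python
from rdkit.Chem import AllChem

# Illustrative mapped reaction; the SMARTS is an assumption for this sketch.
rxn = AllChem.ReactionFromSmarts('[c:1][C:2]>>[c:1][C:2]Br')

# Lay out the reactant and product templates in 2D ...
AllChem.Compute2DCoordsForReaction(rxn)

# ... then write the mapped reaction out as an RXN block.
rxn_block = AllChem.ReactionToRxnBlock(rxn)
print(rxn_block)
```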


So Thanks very much!  : )

 - and thanks for the reminder about sanitizing products from
reactions...

 The molecules that come back from reactions have not been 
 sanitized, so all you need to do is add a call to 
 Chem.SanitizeMol first:
 
  Chem.SanitizeMol(prod)
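
A short sketch of where that call sits in practice, assuming an illustrative bromination SMARTS and toluene as the input (neither is taken from the original thread):

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Illustrative reaction (an assumption for this sketch): brominate an aryl methyl group.
rxn = AllChem.ReactionFromSmarts('[c:1][CH3:2]>>[c:1][C:2]Br')
toluene = Chem.MolFromSmiles('Cc1ccccc1')

smiles = []
for product_set in rxn.RunReactants((toluene,)):
    for prod in product_set:
        # Products come back unsanitized; recompute valences, aromaticity and ring info.
        Chem.SanitizeMol(prod)
        smiles.append(Chem.MolToSmiles(prod))

print(smiles)
```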

Kind regards

James



[Rdkit-discuss] Problem with ConstrainedEmbed()

2011-05-20 Thread James Davidson
Dear All,

I am currently having some problems using the AllChem.ConstrainedEmbed() -
which I have previously used successfully in version 2010_09_1 (Windows py26
binary).  The following example demonstrates the issue:

 from rdkit import Chem
 from rdkit.Chem import AllChem
 template = Chem.MolFromSmiles('c1cnn(Cc2ccccc2)c1')
 mol = Chem.MolFromSmiles('c1ccc(Cn2ncc(-c3ccccc3)c2)cc1')

Now I give the template some 3D coordinates:

 AllChem.EmbedMolecule(template)
 AllChem.UFFOptimizeMolecule(template)

and finally, try to force an overlay of 'mol' onto 'template'

 AllChem.ConstrainedEmbed(mol, template, True)

Traceback (most recent call last):
  File "<pyshell#7>", line 1, in <module>
    AllChem.ConstrainedEmbed(mol, template, True)
  File "C:\RDKit_2010_12_1\rdkit\Chem\AllChem.py", line 294, in ConstrainedEmbed
    rms = AlignMol(mol,core,atomMap=algMap)
RuntimeError: Range Error


I am not getting a ValueError - so I think this shows the substructure match
is ok, but wasn't sure where to go to dig into the AlignMol() function...

As the error message above shows, this is running 2010_12_1.  I have just
tried with 2011_03_1beta1 and 2011_03_2 and get the same thing.  In
2010_09_1 the ConstrainedEmbed() passes back an RDKit molecule fine.

Kind regards

James


Re: [Rdkit-discuss] Beta of Q1 2011 Release Available

2011-04-04 Thread James Davidson
Hi Greg - great news about the beta / new functionality!

 Greg wrote:
 This morning I tagged the beta for the Q1 2011 (2011.03 in the new
 numbering) release in svn:
 http://rdkit.svn.sourceforge.net/viewvc/rdkit/tags/Release_2011_03_1beta1/
 
 and uploaded a source distribution to the google code site:
 http://code.google.com/p/rdkit/downloads/detail?name=RDKit_2011_03_1beta1.tgz
 If there's demand for it, I will also put up a windows binary.

As usual, yes, please for a python 2.6 windows binary if possible  : )

Kind regards

James



[Rdkit-discuss] Beta of Q4 2010 release up

2011-01-05 Thread James Davidson
Hi Greg,

 Greg wrote:
 If there's demand for it, I will also put up a windows binary.
 
 As usual: if no show-stopper bugs appear, I will do the release itself
 in about a week.

I would appreciate a Windows binary to check out the beta release - but
if it is just me, I can obviously wait for the full release (presuming a
windows binary would be available at that point?)

Kind regards

James



Re: [Rdkit-discuss] Canonical smiles for medium and large rings?

2011-01-04 Thread James Davidson
Hi Greg,

 On Sat, Dec 18, 2010 at 6:27 AM, Greg Landrum 
 greg.land...@gmail.com wrote:
 
 I just checked in a set of changes that should get this 
 (mostly) working correctly. Here's a demonstration with Geldanamycin:
 
 In [7]: 
 smi=r'NC(=O)o...@h]1c(/C)=C/[...@h](C)[C@@H](O)[C@@H](OC)c...@h](C
 )C\C2=C(/OC)C(=O)\C=C(\NC(=O)C(\C)=C\C=C/[C@@H]1OC)C2=O'
 
 In [8]: print Chem.CanonSmiles(smi)
 COC1=C2C[C@@H](C)c...@h](OC)[...@h](O)[C@@H](C)/C=C(\C)[...@h](OC(N
 )=O)[C@@H](OC)/C=C\C=C(/C)C(=O)NC(=CC1=O)C2=O

Thanks for looking into this so quickly!

 It would be *really* useful to have some more real-world 
 cases like this one to use as tests. So if you happen to have 
 others you can send I would be quite happy to have them.

On that note, I have added a comment to the bug tracker
(https://sourceforge.net/tracker/?func=detail&aid=3139534&group_id=160139&atid=814650)
- but was not sure how to attach a file (eg sdf) there,
so apologies for it ending up on more lines than I intended...  Also, I
logged in with my google account, but it looks like it may not be clear
who it is!

The first two examples are two marine natural products that only differ
in the geometry of the double bond in the medium ring.  The final
example is a cis- analogue that I synthesised during my PhD for which a
crystal structure was also obtained.  The stereochemistry in these
systems is 'challenging' to say the least, so I thought they would make
reasonable test cases.  I should say that even for the cis- double bond
cases, RDKit does a rather ugly job of the 2D depiction - but I am not
sure if other depictors will perform much better...

On a related note, I was keen to manually double-check the
stereochemistry that had been assigned to each of the chiral centres
(particularly the ones involving the 9-5 ring connections - as these are
potentially troublesome), and found myself wishing there was a way to
easily label a 2D depiction of the molecules with the atom ID.  What I
ended-up doing was the following:

1.  Getting the R/S info + atomIdx back from RDKit (example output):
 Chem.FindMolChiralCenters(mol)
[(3, 'R'), (7, 'R'), (8, 'S'), (9, 'R'), (11, 'R'), (18, 'R'), (24,
'R')]
2.  Opening the molfile in a program where I know how to label with atom
IDs (pymol)
3.  Checking which atom is which manually (I had to add 1 to the RDKit
atomIdx values as they start at 0), then double-checking against
reference values.
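
The index shift in step 3 can be sketched as below. It assumes, as the step describes, that the viewer numbers atoms from 1 in the same order as the molfile; the list is the example output copied from step 1.

```python
# Example output copied from the Chem.FindMolChiralCenters(mol) call in step 1.
chiral_centers = [(3, 'R'), (7, 'R'), (8, 'S'), (9, 'R'),
                  (11, 'R'), (18, 'R'), (24, 'R')]

# RDKit atom indices start at 0; atom IDs in the viewer are assumed to start at 1.
one_based = [(idx + 1, label) for idx, label in chiral_centers]

for atom_id, label in one_based:
    print("atom ID %d: %s" % (atom_id, label))
```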

RDKit performed admirably - but I presume this is dependent on the
quality of the wedge info coming in from the SDF(?)

Kind regards

James



[Rdkit-discuss] Canonical smiles for medium and large rings?

2010-12-17 Thread James Davidson
Dear All,
 
I have been investigating an issue that a colleague of mine identified.
He was working with the RDKit Canon Smiles node in Knime, and found that
for the natural product, Geldanamycin, the double-bond geometry
information was being lost during canonicalisation.  I repeated this
result outside of knime:
 
from rdkit import Chem
from rdkit.Chem import AllChem

 smi =
r'NC(=O)o...@h]1c(/C)=C/[...@h](C)[C@@H](O)[C@@H](OC)c...@h](C)C\C2=C(/OC)C(
=O)\C=C(\NC(=O)C(\C)=C\C=C/[C@@H]1OC)C2=O'
 AllChem.CanonSmiles(smi)

'COC1=C2C[C@@H](C)c...@h](OC)[...@h](O)[C@@H](C)C=C(C)[...@h](OC(N)=O)[C@@H](
OC)C=CC=C(C)C(=O)NC(=CC1=O)C2=O'


The simpler example below may be better:

 smi1 = r'O1CC/C=C\CCCC1' # cyclic ether
 smi2 = r'OCC/C=C\CCCC' # corresponding acyclic alcohol

 AllChem.CanonSmiles(smi1)
'C1C=CCCOCCC1' - stereochemistry lost
 AllChem.CanonSmiles(smi2)
'CCCC/C=C\\CCO' - stereochemistry retained


So, I am guessing that double-bonds in rings are being 'ignored'(?) by
the canonicaliser?  For 'classic' aliphatic systems, double-bonds in
3-7-membered rings can only sensibly exist in the cis orientation, so
'ignoring' them would be ok.  However, for 8-membered and above, cis or
trans are certainly both possible, so it becomes more important to keep
track - particularly if canonical smiles are being used to check for
unique structures, as my colleague was doing with the geldanamycin
example above.
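
As a rough way to find the bonds this argument applies to, the sketch below flags non-aromatic ring double bonds whose smallest ring has eight or more atoms. The function name and the threshold parameter are assumptions for illustration, and the SMILES are stereo-free test inputs rather than the structures from this thread.

```python
from rdkit import Chem


def ring_double_bonds_needing_stereo(mol, min_ring_size=8):
    """Return indices of ring double bonds whose smallest ring has >= min_ring_size atoms."""
    ring_info = mol.GetRingInfo()
    flagged = []
    for bond in mol.GetBonds():
        if bond.GetBondType() != Chem.BondType.DOUBLE or not bond.IsInRing():
            continue
        if ring_info.MinBondRingSize(bond.GetIdx()) >= min_ring_size:
            flagged.append(bond.GetIdx())
    return flagged


cyclohexene = Chem.MolFromSmiles('C1CCC=CC1')
nine_ring_ether = Chem.MolFromSmiles('O1CCC=CCCCC1')

print(ring_double_bonds_needing_stereo(cyclohexene))
print(ring_double_bonds_needing_stereo(nine_ring_ether))
```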
 
Any thoughts / suggestions are much appreciated as always!

Kind regards

James



[Rdkit-discuss] Handling certain sterochemistry in reactions

2010-11-26 Thread James Davidson
Dear All,
 
I wonder if anybody can help with the following?  I am trying to
figure-out how to handle double-bond stereochemistry in reactions when
the stereochemistry is involved with the making / breaking bond.
Hopefully this example will explain better than that sentence(!):
 
rxn = AllChem.ReactionFromSmarts('[c:1][Cl,Br,I].[#6:2][B]>>[*:1][*:2]')
mol1 = Chem.MolFromSmiles('c1ccccc1Br')
mol2 = Chem.MolFromSmiles('C\C=C\B(O)O')
ps = rxn.RunReactants((mol1, mol2))
Chem.MolToSmiles(ps[0][0], True)
 
--->  'CC=Cc1ccccc1' (stereochemical information lost)
 
whereas using mol2 = Chem.MolFromSmiles('C\C=C\c1ccccc1B(O)O') gives
 
--->  'C/C=C/c1ccccc(-c2ccccc2)1' (stereochemical information retained)
 
Not quite the same, but I have read through some related SMIRKS info
here: http://www.daylight.com/dayhtml/doc/theory/theory.smirks.html .
However, this explains how to handle stereo centres and stereo bonds in
reactions when they are explicitly defined on both sides of the
reaction.  I guess what I am looking for is a shortcut for saying
'retain' or 'invert' stereochemistry at reacting centre (sp3) or bond
attached to reacting centre (sp2)...
 
Having got to the end of explaining that, I am thinking that the way I
should handle this is to check for 'problem' reactants and pass to a
more specific rSMARTS when required!
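
One possible shape for that check, as a sketch only: the SMARTS query and the function name are assumptions, and the test looks just for a double bond directly attached to boron that carries E/Z information. Reactants that return True would be the ones to route to a more specific rSMARTS.

```python
from rdkit import Chem

# Hypothetical query: a C=C directly attached to boron (alkenyl boronic acid/ester).
ALKENYL_BORON = Chem.MolFromSmarts('[#6]=[#6][B]')


def has_stereo_alkene_on_boron(mol):
    """True if a double bond next to boron carries E/Z stereo information."""
    for c_far, c_near, _boron in mol.GetSubstructMatches(ALKENYL_BORON):
        bond = mol.GetBondBetweenAtoms(c_far, c_near)
        if bond.GetStereo() != Chem.BondStereo.STEREONONE:
            return True
    return False


for smi in ('C/C=C/B(O)O', 'C=CB(O)O', 'C/C=C/c1ccccc1B(O)O'):
    print(smi, has_stereo_alkene_on_boron(Chem.MolFromSmiles(smi)))
```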
 
Kind regards
 
James



Re: [Rdkit-discuss] Beta of RDKit knime nodes available

2010-11-26 Thread James Davidson

Hi Greg and Thorsten,


 Greg:

 Thorsten:
 On the other hand, 4000 rows should not take that long in KNIME. How
 much times does it currently take?

 I just did 1000 rows on my macbook. Assuming I'm reading the knime log
 correctly, that took about a minute.


Thanks for testing this out, Greg.  I must confess, I didn't wait for
the hierarchical clustering to finish for the 4000!  Going back and
selecting a random 1000 molecule subset, I reproduce your result of ~ 1
min (I get 67 secs).  If I then go to 2000, it takes 520 secs - so to me
this looks like cubic complexity - which is what the documentation for
the node states (this would mean over 1 hr for my original 4000...)
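
The arithmetic behind that estimate can be sketched from the two timings quoted above. It assumes the run time follows a simple power law in the number of rows, which two data points cannot confirm.

```python
import math

# Timings reported above: (number of rows, seconds).
n1, t1 = 1000, 67.0
n2, t2 = 2000, 520.0

# Empirical exponent k in t ~ n**k from the two measurements.
exponent = math.log(t2 / t1) / math.log(float(n2) / n1)

# Extrapolate to 4000 rows using the fitted exponent.
t_4000 = t2 * (4000.0 / n2) ** exponent

print("exponent ~ %.2f, estimated time for 4000 rows ~ %.0f min"
      % (exponent, t_4000 / 60.0))
```

The fitted exponent comes out just under 3, consistent with the cubic scaling stated in the node documentation, and the extrapolated time is a little over an hour.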

For completeness - this result was with the Hierarchical
Clustering(DistMatrix) node set with 'Tanimoto' similarity and 'Complete
Linkage' for cluster comparison.  Changing the comparison to 'Single
Linkage' did not reduce the time.

Interestingly, the documentation for the 'standard' 'Hierarchical
Clustering' (ie non-distance matrix) node states that it operates with
n-squared complexity.  I guess other clustering algorithms available
in knime must scale better than cubically as well (k-means, fuzzy
c-means?) - but as far as I can see they don't currently operate on
distance matrices (or directly on bit vectors).  If they could, then
this may be a solution; or implementing the Murtagh algorithm (I am
guessing the scaling is below cubic from my recollection of the speeds
observed in rdkit).

Kind regards

James



[Rdkit-discuss] Beta of RDKit knime nodes available

2010-11-24 Thread James Davidson
Dear Greg (and, of course, Thorsten and Bernd!)
 
Great job on the Knime nodes!  I have been giving these a go and am
impressed (and excited about the future development!).  A couple of
observations / comments / questions:
 
1.  I have observed that sometimes the FP node seems to generate blank
fingerprints (it doesn't appear to just be the rendering - eg blank if I
swap to 'Bit Scratch' render as well).  I have mainly been trying the
default Morgan FPs, and find that if I reset the node and re-run, the FP
is still blank.  If, however, I swap the node to eg atompair, run, then
swap back to Morgan - it seems to work...  I am running on knime 2.2.2
on Windows 32-bit.
 
2.  The next point is probably down to cheminformatics / knime naivety,
but I must confess I am struggling a little to cluster compounds based
on the FP...   I have used the 'Distance Matrix Calculate' node (with
Tanimoto similarity) to get a matrix that can be used by the
'Hierarchical Clustering (DistMatrix)' or 'k-Medoids' nodes.  However,
both of these appear to perform VERY slowly for a set of ~ 4000
compounds.  I also attempted to cluster on the fingerprints directly,
using the Neighborgrams nodes - but must confess I am some way off
understanding what I am doing!  My limited experience of using the RDKit
functionality to cluster compounds and eg select a representative set
(based on the FP Tanimoto distances and the Murtagh clustering) was that
it performed rather rapidly.  Is there the intention to expose this
functionality in knime (or is the functionality already there and I just
don't know how?)
 
3.  Any plans for Windows 64-bit support?
 
4.  I would be interested to know what the team views as the next
priorities - property calcs, 3D conformations, pharmacophores,
rendering?  So much great stuff to choose from!  :-)
 
Kind regards
 
James



Re: [Rdkit-discuss] How can I escape to this error

2010-09-29 Thread James Davidson
Hi Greg,

Apologies for resurrecting a rather old thread, but I have been
investigating the Q32010_1beta1 release on a set of commercial amines
(from ACD) and came across the 'hypervalent P' issue as well.

Greg wrote:
 To continue and try to answer Christian's question: it is currently
 impossible to really work with this hypervalent molecule in the RDKit.
 The only real solution is to tell the RDKit that P is allowed to have
 7 substituents. If you really want to do this you can edit the file
 $RDBASE/Code/GraphMol/atomic_data.cpp and change the allowed valence
 list for P from 3 5 to 3 5 7. After you do this, you will need to
 rebuild the code.

With the new release currently in beta, I wondered whether this would be
a good time to consider whether the change you suggest above for P should
make it into the release code.  Having said that, I expect that your
comment, "it is currently impossible to really work with this hypervalent
molecule in the RDKit", suggests that a robust solution is not as simple
as just changing the allowed valence list...

Anyway, what I was finding from my list of amines was that the
hexafluorophosphate counterion  [PF6]-  was triggering the error.  Not a
particularly common counterion - so I can certainly live without(!) -
but not particularly esoteric either :-)
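
When screening a vendor list like this, one option is to record the failures rather than let them interrupt the run. The following is only a minimal sketch, assuming the input is available as SMILES; `check_smiles` is a hypothetical helper name, and whether [PF6]- itself passes depends on the RDKit version in use:

```python
from rdkit import Chem


def check_smiles(smiles):
    """Return (ok, message) describing whether RDKit accepts the SMILES."""
    # Parse without sanitization so that valence problems can be reported
    # instead of silently producing None.
    mol = Chem.MolFromSmiles(smiles, sanitize=False)
    if mol is None:
        return False, "could not be parsed"
    try:
        Chem.SanitizeMol(mol)
    except Exception as exc:  # RDKit raises valence/kekulization errors here
        return False, str(exc)
    return True, "ok"


for smi in ["CCO", "F[P-](F)(F)(F)(F)F"]:
    print(smi, check_smiles(smi))
```

Entries that fail can then be written to a separate file and inspected by hand.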

Kind regards

James



[Rdkit-discuss] Cleaning SD files

2010-09-16 Thread James Davidson
Dear All,
 
Today I have spent some time processing a freely-available SDF that
contains many compounds and melting-points / ranges (
http://www.mdpi.org/molmall/mdpi1-51sd.zip).  The reason for doing this
is that I wanted to implement a melting-point predictor following the
work of Andreas Bender (J. Chem. Inf. Model. 2005, 45, 581-590) and more
recently Reifeng Liu at AZ (J. Chem. Inf. Model. 2008, 48, 981-987).
 
I have attached the python-script that I have at the moment (a) in case
it is of some use to anybody else, (b) in the hope that I can improve my
python and rdkit abilities through any suggested alterations (I'm sure
there are many!), and (c) to form the basis of a couple of questions.
At the moment, the script is just running through each compound;
checking if the molecule is valid; and if so, noting how many
components, and whether any of the atoms are outside of the desired
list.  These two results are then written out to a new SDF.  I am then
using this to make sure my data-set contains only compounds that I would
say are 'reasonable' to build a melting-point model with.  Now for the
questions:
 
1.  In RDKit, has the 'cleaning / washing / salt-stripping' of molecules
already been formalised based on a set of rules, etc?
2.  When identifying compounds that contain a non-allowed atom-type, why
do I find the SMARTS def [!H;!C;!N;!O;!F;!S;!Cl;!Br;!I] gives unexpected
results, but [!#1;!#6;!#7;!#8;!#9;!#16;!#17;!#35;!#53] works as I would
expect?
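
The per-molecule checks described above (number of components, presence of a non-allowed element) can be sketched as follows. This is an illustration under assumptions, not the attached script: `summarise` is a hypothetical helper. The last lines show one likely contributor to the behaviour in question 2: in SMARTS an element symbol such as C or N matches only aliphatic atoms, so "!C" is true for an aromatic carbon, whereas "!#6" is false for every carbon.

```python
from rdkit import Chem

# Atomic-number form of "anything other than H, C, N, O, F, S, Cl, Br, I".
DISALLOWED = Chem.MolFromSmarts("[!#1;!#6;!#7;!#8;!#9;!#16;!#17;!#35;!#53]")


def summarise(mol):
    """Return (number of components, True if a disallowed element is present)."""
    n_components = len(Chem.GetMolFrags(mol))
    return n_components, mol.HasSubstructMatch(DISALLOWED)


print(summarise(Chem.MolFromSmiles("CC[NH3+].[Cl-]")))  # (2, False)
print(summarise(Chem.MolFromSmiles("C[Si](C)(C)C")))    # (1, True)

# Element symbols are aliphatic-only, atomic numbers are not.
pyridine = Chem.MolFromSmiles("c1ccncc1")
by_symbol = Chem.MolFromSmarts("[!C;!N]")
by_number = Chem.MolFromSmarts("[!#6;!#7]")
print(pyridine.HasSubstructMatch(by_symbol))  # True: aromatic c and n pass !C and !N
print(pyridine.HasSubstructMatch(by_number))  # False
```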
 
Kind regards
 
James

inorg_or_mix.py
Description: inorg_or_mix.py


[Rdkit-discuss] Align SDF to user-supplied template coordinates (2D)

2010-08-16 Thread James Davidson
Dear All,
 
I am currently struggling with something that I expect is very easy to
solve (I have just got back from holiday, so I think my brain isn't
quite in the zone!)
 
I am trying to read in an SDF and align each molecule to a template
scaffold provided in molfile format.  I want to supply a tool that
allows a user to sketch in a template and view their SDF entries in 2D
all aligned (where there is a match) to the supplied template.
 
I have essentially followed this entry in the Chemistry Toolkit Rosetta:
http://ctr.wikia.com/wiki/Align_the_depiction_using_a_fixed_substructure
which in essence is pretty much the same as the info in the RDKit
documentation.
 
However, when I am using pre-supplied 2D coordinates for the template,
rather than generating them from the first substructure match (as in the
CTR example), I find that the alignment proceeds as required, but there
is a mis-match between the scaling of the bond-lengths in the aligned
substructure compared with the rest of the molecule...
 
Is there a way to 'scale' the molecules according to the template mol
(or alternatively scale the template according to the RDKit default)?
Or is it that I am tackling this in the wrong way?
 
Kind regards
 
James



Re: [Rdkit-discuss] Align SDF to user-supplied template coordinates (2D)

2010-08-16 Thread James Davidson
Thanks Greg,

Greg wrote:
> Ah yes, the depictions that you get look rather silly, no?

Yes they do!

> You're doing it correctly; no worries there. The problem is
> that most pieces of chemical drawing software generate 2D
> coordinates for molecules such that a C-C single bond is 1.0A
> long. The RDKit, on the other hand, sets the C-C single bond
> to be 1.5A long. The consequence is a depiction with a core
> that's smaller than it should be.

I was using ISISDraw for sketching the core motif.  It seems that the
single-bond length (from the Origdraw settings) is ~ 0.825 A (!)

So I modified the scalar to 1.818 (1.5/0.825) in your code and it works
beautifully!
 
> It should be possible to specify the scale used in the RDKit
> depictions so that these contortions are not necessary. I
> will put a feature request in for this and get it in the next version.

Thanks for this - I certainly think this would be a worthwhile feature.
What I may implement in the meantime is running through all the bonds of
type SINGLE in the core molecule, taking the average, then using
1.5/(average) as the scalar to protect against differing user settings!
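
A rough sketch of that averaging idea, working on plain coordinate lists rather than RDKit objects. The function names are hypothetical, and it is assumed that the caller passes only the bonds of type SINGLE:

```python
import math


def mean_bond_length(coords, bonds):
    """Average 2D length of the given bonds.

    coords: list of (x, y) tuples, one per atom
    bonds:  list of (i, j) zero-based atom index pairs
    """
    lengths = [
        math.hypot(coords[i][0] - coords[j][0], coords[i][1] - coords[j][1])
        for i, j in bonds
    ]
    return sum(lengths) / len(lengths)


def template_scale(coords, bonds, target=1.5):
    """Factor that brings the average bond length to the RDKit value of 1.5."""
    return target / mean_bond_length(coords, bonds)


def scale_coords(coords, factor):
    """Multiply every coordinate by the same factor."""
    return [(x * factor, y * factor) for x, y in coords]


# Three atoms in a row, drawn with the 0.825 bond length mentioned above.
coords = [(0.0, 0.0), (0.825, 0.0), (1.65, 0.0)]
bonds = [(0, 1), (1, 2)]
factor = template_scale(coords, bonds)
print(round(factor, 3))  # 1.818
```

Deriving the factor from the template itself means the user's drawing settings no longer need to be known in advance.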

Kind regards

James



Re: [Rdkit-discuss] Reading Molfiles with 'ambiguous' 5-membered aromatics

2010-07-20 Thread James Davidson
Dear All,
 
It's been a couple of weeks since Greg first helped me with this, and
after some further help I agreed that I would do my best to summarise
things for the benefit of the Group.
 
The attached file 'sanifix3.py' was provided to me by Greg, and
essentially does exactly what I (thought I) wanted - ie if required,
'cleans' up an input molecule by modifying aromatic nitrogen-containing
ring systems until a 'sanitizable' form is generated.
 
However, having tested this a bit further, I found that N-containing
heteroaromatics (which I originally posted the question about) are only
one of many possible issues when dealing with automated atom- and
bond-typing from PDB files!  So taking this approach would require a
significantly larger set of 'rules' to cover all possible problems (I'm
sure many people more experienced than me will have been aware of this
for a long time!).  As Greg said:
 
> Figuring out the correct chemistry for a pdb ligand is one of
> those challenges that I wouldn't dream of attempting. Between
> the various sources of ligand structures out there you can
> probably find something at least halfway acceptable. For in
> house stuff, I would assume that you can use the registry
> number to get a smiles or mol block, right?
> You could use that with the rdkit substructure matching code
> to test the pymol-assigned structures.

And indeed, this is the way that I ended-up going for in-house
structures - a script that extracts our corporate ID from the PDB file
and searches our database to return the SMILES.  Then (again, thanks to
Greg for more help here, and steering me away from some clumsy usage of
ConstrainedEmbed!) a substructure match is conducted between an RDKit
mol from the SMILES (referred to as 'db_mol' in the function below), and
the original ligand.

The main point here is to convert the original ligand structure to a set
of non-aromatic atoms joined by 'unspecified' bond-types.  Below is the
excerpt from what I am using with PyMOL: 'molfile3D' is a temporary
molfile that has been created using the PyMOL 'save' command, that gets
converted to the required 'connectivity substructure' that carries the
3D coordinates we will need later:


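from rdkit import Chem

def make3DTemplate(molfile3D):
    # Read without sanitization so that problematic aromatic rings do not
    # cause the molecule to be rejected.
    mol = Chem.MolFromMolFile(molfile3D, sanitize=False)
    # Reduce the ligand to bare connectivity: no aromatic atom flags and
    # no specified bond orders.
    for atom in mol.GetAtoms():
        atom.SetIsAromatic(False)
    for bond in mol.GetBonds():
        bond.SetBondType(Chem.BondType.UNSPECIFIED)

    return mol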


Then once we have this '3D template', the substructure match can be
conducted for the molecule built from the database SMILES string
(db_mol).  If the match is successful, the original 3D coordinates for
the atoms in the 'template' are then applied back to a conformer of our
new molecule.  Finally, this new molecule + conformation is returned as
the molblock, which I then read back in PyMOL to give a 'sanitized'
version of the bound ligand for any in-house crystal structure:


def outputMolBlock(db_mol, template_mol):
    # Each match lists db_mol atom indices in template atom order.
    matches = db_mol.GetSubstructMatches(template_mol)
    if not matches:
        raise ValueError("no substruct match")
    if len(matches) > 1:
        print("warning! more than one isomorphism found!")

    db_conf = db_mol.GetConformer()
    template_conf = template_mol.GetConformer()

    match = matches[0]

    # This sets the 3D coordinates for the matched atoms of db_mol,
    # taken from the corresponding template atoms
    for i, mIdx in enumerate(match):
        db_conf.SetAtomPosition(mIdx, template_conf.GetAtomPosition(i))

    db_conf.Set3D(True)

    return Chem.MolToMolBlock(db_mol)


It wouldn't now be too much of a leap(?) to extend the same methodology
to public PDB structures - using the LigandExpo SDF.  See this post from
Noel on Blue Obelisk for background:

http://blueobelisk.shapado.com/questions/how-to-get-an-experimental-ligand-structure-from-the-pdb


Also, just for interest - I am using cx_Oracle to connect to our
corporate database from Python, which is now allowing me to add a few
extra bits - like flagging up to people if the in-house structure they
have just opened has been previously crystallised in any other targets,
etc, etc.  If anybody is trying to do similar, but has not used
cx_Oracle, then give me a shout and I will see if I can help (although
SQL is definitely also on the list of things I know only barely enough
about!).

Kind regards

James


[Rdkit-discuss] Interacting with molecules in PyMOL

2010-07-02 Thread James Davidson
Dear All,
 
I am trying to work out the best way to accomplish some tasks involving
RDKit, using PyMOL as an interface, and would appreciate some help.  I
would like to be able to start from a PDB file of a ligand-bound crystal
structure loaded in PyMOL and be able to 'virtually' build some
analogues - initially just simple substitutions - and visually inspect
the results.
 
(1)  So my first question is - having started PyMOL with the -R option,
is there an easy or recommended way of transferring molecules from PyMOL
to RDKit?  I can accomplish this by writing molfiles to a temporary
file, but wondered if I am creating work, if eg RDKit could
automatically create Mol objects from non-biopolymer atoms in PyMol(?).
ie it would be nice if:
 
from rdkit import Chem
from rdkit.Chem import PyMol
v = PyMol.MolViewer()
 
# Invented function to create an RDKit mol object
mol = v.GetAtomsAsMol(selection)
 
(2)  Once the ligand is converted to an RDKit mol object (by whatever
means) I want to enumerate some libraries of virtual products - eg
choosing an atom in PyMol as the attachment point, then running a set of
reactions to get products with a set of substituents added.  In
principle I think this is quite straightforward; however, what I am
struggling with is a mechanism to 'freeze' the 3D coordinates of the
original ligand atoms, but still be able to use RDKit to generate
sensible 3D positions for the newly added atoms so that the products can
be passed back to PyMOL and minimised in situ (if required) using
'mengine.exe'.  Am I looking at this the wrong way, and should I
actually try aligning the virtual products back on the starting ligand
conformation?
 
(3)  Apologies that this last point is maybe a bit off-topic, but I
wondered if anyone has an opinion as to whether MMTK is the way to go
for 'simple' minimisations of modified ligands bound to proteins? (I
don't have any real experience with MMTK...)
 
Kind regards
 
James



Re: [Rdkit-discuss] Interacting with molecules in PyMOL

2010-07-02 Thread James Davidson
Dear Greg,

Thanks for your very rapid response - 'AllChem.ConstrainedEmbed' was
just what I was looking for!
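
For anyone following the thread, a minimal sketch of what a call to AllChem.ConstrainedEmbed can look like. The molecules are stand-ins chosen for illustration (a benzene "core" embedded in 3D in place of the crystallographic ligand, and toluene as the analogue); in the real use case the core coordinates would come from the PDB ligand:

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Stand-in for the bound ligand: a core with 3D coordinates.
core = Chem.MolFromSmiles("c1ccccc1")
AllChem.EmbedMolecule(core, randomSeed=42)

# An analogue that contains the core, with explicit hydrogens for embedding.
mol = Chem.AddHs(Chem.MolFromSmiles("Cc1ccccc1"))

# Embed the analogue while tethering the matching atoms to the core coordinates.
AllChem.ConstrainedEmbed(mol, core, useTethers=True, randomseed=42)

# Atom indices in mol that correspond to the core atoms, in core order.
match = mol.GetSubstructMatch(core)
print(match)
print(mol.GetProp("EmbedRMS"))  # RMSD between the core and the matched atoms
```

ConstrainedEmbed raises a ValueError if the core is not a substructure of the molecule, so a loop over many analogues may need to catch that case.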


Kind regards

James



Re: [Rdkit-discuss] Interacting with molecules in PyMOL

2010-07-02 Thread James Davidson
I just wanted to quickly update the List on how I've got on with this,
in case it is of use / interest to others.  I followed Greg's advice and
did the following:
 
1.  Exported molfile from PyMOL
2.  Read into RDKit
3.  Read in an SDF of already-constructed molecules based on the core
(could have built the products in RDKit, but the SDF was already
available!)
4.  Iterated over the objects in the supplier to do the
AllChem.ConstrainedEmbed as discussed, then load the results into PyMOL
 
NOTE - Because the molecules weren't built in RDKit, I couldn't rely on
the atom numbering when read into PyMOL (maybe this can't be relied on
anyway?).  So I ran mol.GetSubstructMatch(core) for each molecule to get
the aligned product atom IDs that matched the core.  I then flagged
these in PyMOL with flag 3 [Fixed Atoms (no movement allowed)] (flag 2 -
harmonically constrained may be better..?) so that I could subsequently
run the mengine 'clean' command in PyMOL to tidy-up the UFF output
without allowing the template to move:
 
from rdkit import Chem
from rdkit.Chem import AllChem, PyMol

v = PyMol.MolViewer()
core = Chem.MolFromMolFile(mol_filepath)
supplier = Chem.SDMolSupplier(sdf_filepath)
for n, mol in enumerate(supplier):
    mol = Chem.AddHs(mol)
    newMol = AllChem.ConstrainedEmbed(mol, core, True)
    name = 'mol' + str(n)
    # atom IDs in the product that match the core
    fix = ','.join([str(idx) for idx in mol.GetSubstructMatch(core)])
    v.ShowMol(newMol, name=name, showOnly=False)
    selection = '(' + name + ' and (id ' + fix + '))'
    v.server.do('flag 3, ' + selection + ', set')
 
 
Kind regards
 
James



Re: [Rdkit-discuss] [Rdkit-announce] Q2 2010 Release

2010-06-30 Thread James Davidson
Congratulations on the release, Greg!

I am really a very recent adopter of RDKit, but even in the short time I have 
been using it I have been amazed at the quality and depth of functionality!  
Please keep up the good work, and I hope I can continue to help a tiny amount 
in the only way I know how - by selfishly requesting new features :)

Kind regards

James


-----Original Message-----
From: Greg Landrum [mailto:greg.land...@gmail.com]
Sent: Wed 30/06/2010 21:16
To: RDKit Discuss; RDKit Developers List; rdkit-annou...@lists.sourceforge.net
Subject: [Rdkit-announce] Q2 2010 Release
 
Dear all,

I'm very happy to announce that the next version of the RDKit --
Q22010_1 -- is released.

The release notes are below.

The source release and windows binaries (python 2.6 only this time,
please let me know if anyone needs a python 2.5 release) will be on
the sourceforge downloads page:
http://sourceforge.net/projects/rdkit/files/rdkit/Q2_2010/
The files can also be downloaded from the google project page:
http://code.google.com/p/rdkit/downloads/list

I have also updated the online documentation.

Thanks to the everyone who submitted bug reports and suggestions for
this release!

Please let me know if you find any problems with the release or have
suggestions for the next one.

-greg

**  Release_Q22010_1 ***
(Changes relative to Release_Q12010_1)

!! IMPORTANT !!
 - There are a couple of refactoring changes that affect people using
   the RDKit from C++. Please look in the Other section below for a list.
 - If you are building the RDKit yourself, changes made in this
   release require that you use a reasonably up-to-date version of
   flex to build it. Please look in the Other section below for more
   information.

Acknowledgements:
 - Andrew Dalke, James Davidson, Kirk DeLisle, Thomas Heller, Peter Gedeck,
   Greg Magoon, Noel O'Boyle, Nik Stiefl,

Bug Fixes:
 - The depictor no longer generates NaNs for some molecules on
   windows (issue 2995724)
 - [X] query features work correctly with chiral atoms. (issue
   3000399)
 - mols will no longer be deleted by python when atoms/bonds returned
   from mol.Get{Atom,Bond}WithIdx() are still active. (issue 3007178)
 - a problem with force-field construction for five-coordinate atoms
   was fixed. (issue 3009337)
 - double bonds to terminal atoms are no longer marked as any bonds
   when writing mol blocks. (issue 3009756)
 - a problem with stereochemistry of double bonds linking rings was
   fixed. (issue 3009836)
 - a problem with R/S assignment was fixed. (issue 3009911)
 - error and warning messages are now properly displayed when cmake
   builds are used on windows.
 - a canonicalization problem with double bonds incident onto aromatic
   rings was fixed. (issue 3018558)
 - a problem with embedding fused small ring systems was fixed.
   (issue 3019283)

New Features:
 - RXN files can now be written. (issue 3011399)
 - reaction smarts can now be written.
 - v3000 RXN files can now be read. (issue 3009807)
 - better support for query information in mol blocks is present.
   (issue 2942501)
 - Depictions of reactions can now be generated.
 - Morgan fingerprints can now be calculated as bit vectors (as
   opposed to count vectors).
 - the method GetFeatureDefs() has been added to
   MolChemicalFeatureFactory
 - repeated recursive SMARTS queries in a single SMARTS will now be
   recognized and matched much faster.
 - the SMILES and SMARTS parsers can now be run safely in
   multi-threaded code.

Deprecated modules (to be removed in next release):
 - rdkit/qtGui
 - Projects/SDView

Removed modules:
 - SVD code: External/svdlibc External/svdpackc rdkit/PySVD
 - rdkit/Chem/CDXMLWriter.py

Other:
 - The large scale changes in the handling of stereochemistry were
   made for this release. These should make the code more robust.
 - If you are building the RDKit yourself, changes made in this
   release require that you use a reasonably up-to-date version of
   flex to build it. This is likely to be a problem on Redhat, and
   redhat-derived systems. Specifically: if your version of flex is
   something like 2.5.4 (as opposed to something like 2.5.33, 2.5.34,
   etc.), you will need to get a newer version from
   http://flex.sourceforge.net in order to build the RDKit.

 - Changes only affecting C++ programmers:
   - The code for calculating topological-torsion and atom-pair
     fingerprints has been moved from $RDBASE/Code/GraphMol/Descriptors
     to $RDBASE/Code/GraphMol/Fingerprints.
   - The naming convention for methods of ExplicitBitVect and
     SparseBitVect have been changed to make it more consistent with
     the rest of the RDKit.
   - the bjam-based build system should be considered
     deprecated. This is the last release it will be actively
     maintained.


Re: [Rdkit-discuss] A couple of questions regarding ReactionFromSmarts

2010-06-21 Thread James Davidson
Thanks Greg - this is great!  I must confess, I was eager to try this out asap 
- but have not built rdkit before.  I did start having a go over the weekend on 
my home PC (Windows MCE2005) but ran into a couple of unexpected issues with 
the software installs that made me think I would wait and retry on my work PC.

[not really relevant, but for interest - I think the problems may have been 
related to the Visual Studio 2010 Express installation.  The result was an 
infuriating clicking in the audio when streaming live or recorded TV to an 
extender!!  Not an issue that I felt was easy to troubleshoot... I reinstalled 
my system from a drive image backup and the problem was gone... That's when I 
decided to leave well alone, as my family may not have seen the benefit of 
up-to-the-minute builds at home at the expense of TV enjoyment : ) ]

I will get my PC at work setup to build from SVN snapshots - but I was very 
pleased to see your post 
(http://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg01097.html) 
saying that Q2 binaries should be available next week - great news!

Kind regards

James

-----Original Message-----
From: Greg Landrum [mailto:greg.land...@gmail.com] 
Sent: 18 June 2010 06:08
To: rdkit-discuss@lists.sourceforge.net
Cc: James Davidson
Subject: Re: [Rdkit-discuss] A couple of questions regarding ReactionFromSmarts

Dear all,

A followup/update on a request from a couple weeks ago:

On Fri, Jun 4, 2010 at 6:13 AM, Greg Landrum <greg.land...@gmail.com> wrote:
> On Thu, Jun 3, 2010 at 7:51 PM, James Davidson <j.david...@vernalis.com>
> wrote:
>
>> (1)  I see that the reaction objects can be created from MDL Reaction
>> Files/Blocks - is there a way to do the reverse, and save a reaction
>> object in MDL .rxn format?  I tried investigating the
>> rxn.ToBinary() attribute, but didn't get very far...  The reason I
>> wanted to do this, is that I was trying to figure-out how to generate
>> a form of the reaction object (generated from reaction SMARTS) that
>> was suitable for converting into a 2D depiction of the transformation.
>
> At the moment the reactions are essentially input-only. There's really
> no way to get them out in any format that could be used elsewhere.
> This is a sadly missing feature: it would be really nice to be able to
> generate either .rxn files (or at least reaction smarts) from
> reactions. I will add a feature request for this, but it may take a
> while to happen.[1]

I've added a partial solution to this that at least provides some help with 
visualizing reactions.

Here's my reaction:
In [12]: rxn = AllChem.ReactionFromSmarts('[C:1](=[O:2])-[O;-,H].[N;!$(N-C=[O,N,S]);!$(N=*):3]>>[C:1](=[O:2])-[N:3]')

You can now output reaction smarts:
In [13]: AllChem.ReactionToSmarts(rxn)
Out[13]: '[C:1](=[O:2])-[O;-,H1].[N;!$(N-C=[O,N,S]);!$(N=*):3]>>[C:1](=[O:2])-[N:3]'

You can also generate coordinates for a reaction and then create an rxn file:
In [14]: AllChem.Compute2DCoordsForReaction(rxn)
In [15]: print AllChem.ReactionToRxnBlock(rxn)
------> print(AllChem.ReactionToRxnBlock(rxn))
$RXN

      RDKit

  2  1
$MOL

     RDKit          2D

  3  2  0  0  0  0  0  0  0  0999 V2000
   -0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  1  0  0
   -0.0000   -1.5000    0.0000 O   0  0  0  0  0  0  0  0  0  2  0  0
   -0.0000    1.5000    0.0000 *   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  2  0
  1  3  1  0
V    3 [O;-,H1]
M  END
$MOL

     RDKit          2D

  1  0  0  0  0  0  0  0  0  0999 V2000
    0.5000    0.0000    0.0000 *   0  0  0  0  0  0  0  0  0  3  0  0
V    1 [N;!$(N-C=[O,N,S]);!$(N=*):3]
M  END
$MOL

     RDKit          2D

  3  2  0  0  0  0  0  0  0  0999 V2000
    1.5000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  1  0  0
    1.5000   -1.5000    0.0000 O   0  0  0  0  0  0  0  0  0  2  0  0
    1.5000    1.5000    0.0000 N   0  0  0  0  0  0  0  0  0  3  0  0
  1  2  2  0
  1  3  1  0
M  END

#

Notice that query features on atoms in the rxn blocks are not output as 
proper ctab query features. Instead I use the atom-value feature of ctabs and 
output the SMARTS query for the atoms. This has the marked disadvantage that it 
won't actually generate reactions that do sensible things in other tools, but 
at least you can do some debugging of reactions. At some point in the future it 
would be nice to have ctab queries handled correctly, but this is at least 
something.
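
For anyone wanting to try this outside an interactive session, the steps above can be sketched as a short Python 3 script. This is only an illustration using the same reaction and the function names shown in the session; the variable names are arbitrary, and the exact text of the output may differ between RDKit versions.

```python
from rdkit.Chem import AllChem

# Amide-coupling reaction from the session above; '>>' separates the
# reactant templates from the product template.
rxn = AllChem.ReactionFromSmarts(
    '[C:1](=[O:2])-[O;-,H1].[N;!$(N-C=[O,N,S]);!$(N=*):3]'
    '>>[C:1](=[O:2])-[N:3]'
)

# Write the reaction back out, first as reaction SMARTS and then as an
# MDL rxn block (atom queries appear as atom-value lines in the block).
rxn_smarts = AllChem.ReactionToSmarts(rxn)
rxn_block = AllChem.ReactionToRxnBlock(rxn)

print(rxn_smarts)
print(rxn_block)
```

The rxn block is mainly useful for inspecting or depicting the reaction, for the reasons given above about query features.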

These changes are checked into subversion and will be in the next release.

Best Regards,
-greg

__
PLEASE READ: This email is confidential and may be privileged. It is intended 
for the named addressee(s) only and access to it by anyone else is 
unauthorised. If you are not an addressee, any disclosure or copying of the 
contents of this email or any action taken (or not taken) in reliance on it is 
unauthorised and may be unlawful. If you have received this email in error, 
please notify

Re: [Rdkit-discuss] Number of Aromatic Rings

2010-06-11 Thread James Davidson
Hi Greg

Well, I managed to have a go at this earlier than I expected.  So first some 
apologies, provisos, and caveats to warn you, and other readers, that your eyes 
will soon experience things inelegant and unpythonic, but it's the best I could 
come up with, with my limited faculties and experience!

On the plus side - I think it is doing what I wanted - ie giving a count of the 
number of aromatic systems (if you always want to count a fused aromatic as 1 
aromatic system).  The downside is that the way I have done this now makes your 
script output, eg, (6,1) for anthracene - where the 1 is the count of aromatic 
systems (fused or otherwise).  It would be most generic if it returned 
(6,3,1) as (all unique aromatic substructures, unique mono-cyclic 
substructures, aromatic systems).  I'm sure this is fairly straightforward, but 
that is for another day!

So what I added was:



def GetOuterSet(rings):
    # Initialise a counter for parent aromatic 'super' rings
    result = 0

    # Set up a dictionary so that items can be referenced and deleted
    ring_set = dict(enumerate(rings))

    # While there is something to process
    while ring_set:
        # Take the last ring in the list as the reference - it should be
        # the biggest
        reference = sorted(ring_set)[-1]
        reference_ring = ring_set[reference]

        for k, v in sorted(ring_set.items()):
            # If the current ring is contained in the reference ring, remove
            # it from the dictionary (subset test, assuming each ring is a
            # set of atom indices; this also removes the reference itself)
            if v <= reference_ring:
                ring_set.pop(k)
            # If we are at the reference, then we have found our 'super' ring
            if k == reference:
                result += 1
                break

    return result



and I passed in the aromaticRings list from your script, then returned both the 
length of the aromaticRings list (as before) plus the output of GetOuterSet().  
ie:


superRings = GetOuterSet(aromaticRings)

return len(aromaticRings), superRings


So once again, thanks for the help, and I would welcome any pointers from 
anyone on tidying-up and improving this modification!  (or corrections if 
anyone spots them - I have only briefly tested this)
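
One possible way to tidy this up, offered only as a sketch and not as the script Greg attached: rather than relying on the biggest ring being last in the list, merge any rings that share atoms and count the merged groups. The function name count_ring_systems is hypothetical, and the sketch assumes each ring is given as a collection of atom indices.

```python
def count_ring_systems(rings):
    """Count groups of rings that are connected through shared atoms.

    `rings` is an iterable of collections of atom indices (an assumption
    about the input format, not something taken from the original script).
    """
    systems = []  # disjoint sets of atom indices, one per fused system
    for ring in rings:
        merged = set(ring)
        remaining = []
        for system in systems:
            if system & merged:
                # This system shares atoms with the ring: absorb it
                merged |= system
            else:
                remaining.append(system)
        remaining.append(merged)
        systems = remaining
    return len(systems)


# Anthracene-like example: three fused six-membered rings
ring_a = {0, 1, 2, 3, 4, 5}
ring_b = {4, 5, 6, 7, 8, 9}
ring_c = {8, 9, 10, 11, 12, 13}
print(count_ring_systems([ring_a, ring_b, ring_c]))
```

Because overlapping rings are merged regardless of order, the same count comes back whether the list holds only the three individual rings or also the fused combinations.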


Kind regards

James


-----Original Message-----
From: Greg Landrum [mailto:greg.land...@gmail.com] 
Sent: 11 June 2010 06:02
To: James Davidson
Cc: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] Number of Aromatic Rings

Dear James,

On Thu, Jun 10, 2010 at 2:35 PM, James Davidson j.david...@vernalis.com wrote:

 I have been trying to figure out how to return the count of aromatic
 rings for molecules (in Python), and am going to have to admit defeat!
 I saw in an earlier message
 (http://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg00153.html)
 a similar query, but I'm afraid it didn't help me very much.
 I also read the section on Aromaticity in the rdkit book, and realised
 that maybe this isn't a trivial exercise!

Correct. Counting the number of non-fused rings that are aromatic, like the 
post you reference does, is pretty easy; including the fused rings that are 
aromatic is more challenging.

 I would like to count aromatic ring-systems such that a
 bicyclic (eg indole or naphthalene) would only count as 1.  For
 reference, this appears to be the behaviour of the OpenEye
 OEDetermineAromaticRingSystems function - where the molecule derived
 from the smiles C(O)(=O)c1cccc2c1[nH]c(C3CCCc4ccccc34)c2 (which
 contains an indole and a tetrahydronaphthalene) gives a count of 2.

 Any help would be greatly appreciated.

I've attached a script that's not quite what you want, but it gets you almost 
there: it finds all aromatic ring systems, including fused ones. Anthracene, 
for example, gives 6 rings. The modifications to this to get what you're 
looking for aren't a straightforward post-processing step, but shouldn't be too 
bad. If there's not enough here, let me know and I will take a look at adding 
the extra code.

This code isn't perfectly polished and could certainly be faster, but it does 
seem mostly functional.

-greg


Re: [Rdkit-discuss] A couple of questions regarding ReactionFromSmarts

2010-06-04 Thread James Davidson
Thanks for the help, Greg - my reaction SMARTS are now behaving themselves!  I 
must confess, I had not actually realised that the documentation from install 
(ie the 'Book') was different to the 'Getting Started' one that I had linked 
from the website.

Kind regards,

James 


-----Original Message-----
From: Greg Landrum [mailto:greg.land...@gmail.com]
Sent: Fri 04/06/2010 05:13
To: James Davidson
Cc: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] A couple of questions regarding ReactionFromSmarts
 
Dear James,

On Thu, Jun 3, 2010 at 7:51 PM, James Davidson j.david...@vernalis.com wrote:

 First of all, I'd like to start by saying how much I've been enjoying
 exploring the functionality of RDKit - great job, Greg!

Thanks!

 I have a couple of questions regarding
 'rdkit.Chem.AllChem.ReactionFromSmarts':

 (1)  I see that the reaction objects can be created from MDL Reaction
 Files/Blocks - is there a way to do the reverse, and save a reaction object
 in MDL .rxn format?  I tried investigating the rxn.ToBinary()
 method, but didn't get very far...  The reason I wanted to do this is
 that I was trying to figure out how to generate a form of the reaction
 object (generated from reaction SMARTS) that was suitable for converting
 into a 2D depiction of the transformation.

At the moment the reactions are essentially input-only. There's really
no way to get them out in any format that could be used elsewhere.
This is a sadly missing feature: it would be really nice to be able to
generate either .rxn files (or at least reaction smarts) from
reactions. I will add a feature request for this, but it may take a
while to happen.[1]

A workaround that kind of works is to paste the reaction smarts into
something like Marvin Sketch. It will normally display something that
at least gives some idea of what the reaction is.

 (2) I know that reaction SMARTS isn't SMIRKS, but I have noticed some
 behaviour that I would not expect - however, this could be down to my
 SMARTS-naivety; my SMIRKS-naivety; or both!

Anytime reactions behave in ways you don't expect, it's probably best
to just blame me for coming up with yet another way of expressing them
that is slightly incompatible with the existing ones. :-)

 I initially tried the
 following:

 from rdkit import Chem
 from rdkit.Chem import AllChem
 rxn_smarts = ('[!#1:1]-[NH:2]-[C:3](=[O:4])-[C,c:5]'
               '>>[!#1:1]-[C:3](=[O:4])-[NH:2]-[C,c:5]')
 sm = Chem.MolFromSmiles('CC(=O)NC')
 rxn = AllChem.ReactionFromSmarts(rxn_smarts)
 prods = rxn.RunReactants((sm,))
 prod = Chem.MolToSmiles(prods[0][0])


 This gives me prod = '[H]C(=O)NC'

There's a discussion of this kind of case in the RDKit Book
($RDBASE/Docs/Book/RDKit_Book.pdf) starting on page 3. The short
answer is that if you have a query feature (atom list, recursive
smarts, etc.) in the reactants and you would like the matching atom to
be copied into the products you should include a dummy for that atom
in the products. A working form of your example is then:

[11] rxn_smarts = '[!#1:1]-[NH:2]-[C:3](=[O:4])-[C,c:5]>>[*:1]-[C:3](=[O:4])-[NH:2]-[*:5]'

[12] rxn = AllChem.ReactionFromSmarts(rxn_smarts)

[13] prods = rxn.RunReactants((Chem.MolFromSmiles('c1ccccc1C(=O)NCC1CC1'),))

[14] Chem.MolToSmiles(prods[0][0])
Out[14] 'O=C(CC1CC1)Nc1ccccc1'

As an aside, in SMARTS it's shorter (and I think clearer) to write
[C,c] as [#6]. It also produces a query that runs a bit quicker, but
you probably won't notice that difference in most cases.

 If I replace with rxn_smarts =
 '[!H:1]-[NH:2]-[C:3](=[O:4])-[C,c:5]>>[!H:1]-[C:3](=[O:4])-[NH:2]-[C,c:5]',
 I get the behaviour I want - with prod = 'CNC(=O)C'.  So I think I can get
 the behaviour I want, but was curious if I am using the SMARTS ! operator
 incorrectly in conjunction with atomic numbers, or whether this may be a
 bug?

Not really a bug. The behavior when you have queries in the products
is undocumented: depending on the details of the query it will
sometimes do the right thing, sometimes not. It's much safer to just
use *. What I probably should do is add a warning message if the
reaction contains a query in the products; I will think about this.
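
As a rough Python 3 sketch of that advice (not a definitive recipe), the reaction below keeps the queries on the reactant side only and uses mapped dummies in the products, with [#6] in place of [C,c]. The variable names are arbitrary, and the sanitization step is an extra precaution that was not part of the session above.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Queries ([!#1], [#6]) appear only in the reactant template; the products
# use mapped dummies ([*:1], [*:5]) so the matched atoms are copied over.
rxn = AllChem.ReactionFromSmarts(
    '[!#1:1]-[NH:2]-[C:3](=[O:4])-[#6:5]'
    '>>[*:1]-[C:3](=[O:4])-[NH:2]-[*:5]'
)

reactant = Chem.MolFromSmiles('c1ccccc1C(=O)NCC1CC1')
product_sets = rxn.RunReactants((reactant,))

# Each entry in product_sets is a tuple of product molecules for one match.
product = product_sets[0][0]
Chem.SanitizeMol(product)
product_smiles = Chem.MolToSmiles(product)
print(product_smiles)
```

This should give the same reversed amide as the session above, although the exact SMILES string depends on the canonicalization in the installed RDKit version.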

Best Regards,
-greg

[1] The underlying problem isn't actually generating the rxn files
themselves, they are just a collection of mol blocks with a bit of
extra verbiage sprinkled around. The problem is generating reasonable
mol blocks for molecules with query features. I already have a feature
request in for that one
(http://sourceforge.net/tracker/?group_id=160139&atid=814653), but it
turns out to not be quite as easy as it sounds.



[Rdkit-discuss] A couple of questions regarding ReactionFromSmarts

2010-06-03 Thread James Davidson
Hi,
 
First of all, I'd like to start by saying how much I've been enjoying
exploring the functionality of RDKit - great job, Greg!
I have a couple of questions regarding
'rdkit.Chem.AllChem.ReactionFromSmarts':
 
(1)  I see that the reaction objects can be created from MDL Reaction
Files/Blocks - is there a way to do the reverse, and save a reaction
object in MDL .rxn format?  I tried investigating the
rxn.ToBinary() method, but didn't get very far...  The reason I
wanted to do this is that I was trying to figure out how to generate a
form of the reaction object (generated from reaction SMARTS) that was
suitable for converting into a 2D depiction of the transformation.
 
(2) I know that reaction SMARTS isn't SMIRKS, but I have noticed some
behaviour that I would not expect - however, this could be down to my
SMARTS-naivety; my SMIRKS-naivety; or both!  I initially tried the
following:
 
from rdkit import Chem
from rdkit.Chem import AllChem
rxn_smarts = ('[!#1:1]-[NH:2]-[C:3](=[O:4])-[C,c:5]'
              '>>[!#1:1]-[C:3](=[O:4])-[NH:2]-[C,c:5]')
sm = Chem.MolFromSmiles('CC(=O)NC')
rxn = AllChem.ReactionFromSmarts(rxn_smarts)
prods = rxn.RunReactants((sm,))
prod = Chem.MolToSmiles(prods[0][0])
 
 
This gives me prod = '[H]C(=O)NC'
 
If I replace with rxn_smarts =
'[!H:1]-[NH:2]-[C:3](=[O:4])-[C,c:5]>>[!H:1]-[C:3](=[O:4])-[NH:2]-[C,c:5]',
I get the behaviour I want - with prod = 'CNC(=O)C'.  So I think I
can get the behaviour I want, but was curious if I am using the SMARTS !
operator incorrectly in conjunction with atomic numbers, or whether this
may be a bug?
 
Kind regards
 
James


The Vernalis Group of Companies
Oakdene Court
613 Reading Road
Winnersh, Berkshire
RG41 5UA.
Tel: +44 118 977 3133

To access trading company registration and address details, please go to the 
Vernalis website at www.vernalis.com and click on the Company address and 
registration details link at the bottom of the page.