[Rdkit-discuss] Strange behaviour for GetSubstructMatches with dative bonds

2024-03-20 Thread Jan Halborg Jensen
The following finds no matches:

m = Chem.MolFromSmiles('C1P->[Zr+3]<-1')
m.GetSubstructMatches(Chem.MolFromSmarts('C1P->[Zr+3]<-1'))

But all these work:

m.GetSubstructMatches(Chem.MolFromSmiles('C1P->[Zr+3]<-1'))

m.GetSubstructMatches(Chem.MolFromSmarts('[*]->[Zr+3]'))

m = Chem.MolFromSmiles('C1P-[Zr+3]-1')
m.GetSubstructMatches(Chem.MolFromSmarts('C1P-[Zr+3]-1'))


Is this a bug, or is there something I’m missing with regard to the first case?

Best regards, Jan
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


[Rdkit-discuss] Bug in ResonanceMolSupplier?

2024-03-19 Thread Jan Halborg Jensen
Why does ResonanceMolSupplier only give me one resonance structure for 
O[NH+]=[C-]NC when O[NH+]=[CH]NC gives me two structures? Is that a bug?

Best regards, Jan




Re: [Rdkit-discuss] Problems reading XYZ file

2023-04-11 Thread Jan Halborg Jensen
Hi Gustavo

from rdkit import Chem
from rdkit.Chem import rdDetermineBonds

raw_mol = Chem.MolFromXYZFile('acetate.xyz')
mol = Chem.Mol(raw_mol)
rdDetermineBonds.DetermineBonds(mol, charge=-1)
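
The same steps can be tried without any input file. The following is a minimal sketch, assuming RDKit 2022.09 or later (where MolFromXYZBlock and rdDetermineBonds are available); the methane geometry is made up for illustration and does not come from the thread:

```python
from rdkit import Chem
from rdkit.Chem import rdDetermineBonds

# Hypothetical methane geometry (C-H about 1.09 Angstrom), for illustration only.
xyz_block = """5
methane
C  0.000  0.000  0.000
H  0.629  0.629  0.629
H -0.629 -0.629  0.629
H -0.629  0.629 -0.629
H  0.629 -0.629 -0.629
"""

raw_mol = Chem.MolFromXYZBlock(xyz_block)
bonds_before = raw_mol.GetNumBonds()  # an XYZ block carries coordinates only

mol = Chem.Mol(raw_mol)  # work on a copy, as in the message above
rdDetermineBonds.DetermineBonds(mol, charge=0)  # perceive connectivity and bond orders
bonds_after = mol.GetNumBonds()

print(bonds_before, bonds_after)
```

The total charge has to be supplied by the caller because an XYZ file does not store it, which is why the acetate example passes charge=-1.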

Best regards, Jan

On 7 Apr 2023, at 22.57, Gustavo Seabra wrote:

Hi everyone,

I'm having difficulties using RDKit to read molecules from an XYZ file, and I 
would really appreciate some help.

The problem is that whenever I read a molecule from an XYZ file, I get just a 
disconnected clump of atoms, not a molecule. For example, with the following code:

import rdkit
from rdkit import Chem
from rdkit.Chem import AllChem, Draw, rdmolfiles
mol = Chem.MolFromSmiles('COC1=C(O)C[C@@](O)(CO)CC1=O')
mol = Chem.AddHs(mol)
mol

AllChem.EmbedMolecule(mol)
Chem.MolToXYZFile(mol, "rdkit_mol.xyz")
mol2 = Chem.MolFromXYZFile('rdkit_mol.xyz')
mol2

Is there a bug on the XYZ code, or am I missing something?

Thanks!
--
Gustavo Seabra.


Re: [Rdkit-discuss] Understanding and visualizing counts fingerprints using GetHashedMorganFingerprint

2022-11-13 Thread Jan Halborg Jensen
Here’s code to draw the on-bit fragments in the hashed Morgan fingerprints.

from rdkit import Chem
from rdkit.Chem import AllChem, Draw

mol = Chem.MolFromSmiles('CCC')
bi = {}
cfp = AllChem.GetHashedMorganFingerprint(mol, 2, nBits=1024, bitInfo=bi)
on_bits = list(bi)  # bit IDs that are set for this molecule
prints = [(mol, x, bi) for x in on_bits]
Draw.DrawMorganBits(prints, molsPerRow=4, legends=[str(x) for x in on_bits])

On 2 Nov 2022, at 16.52, Brianna Greenstein wrote:



Hi, I had some questions about Morgan fingerprint counts. I used 
AllChem.GetHashedMorganFingerprint(mol, 2, nBits=2048) to get counts as 
descriptors for ML models. I am looking at the feature importance and some of 
these bits came up as important. I had a few questions on understanding these 
hashed fingerprints.


  1.  Are the structures the bits represent the same for 
GetHashedMorganFingerprint and GetMorganFingerprintAsBitVect?

  2.  How can I visualize what a specific bit in the hashed Morgan fingerprint 
looks like? Can I use DrawMorganBit to visualize it the same way I would for 
the normal fingerprint?




Re: [Rdkit-discuss] SMARTS definition for basic nitrogen

2022-10-06 Thread Jan Halborg Jensen
Hi Axel

Have a look at https://github.com/jensengroup/protonator

Best regards, Jan

On 6 Oct 2022, at 12.09, Axel Pahl wrote:


Dear RDKitters,

does someone have a good SMARTS definition for basic nitrogen (aliphatic and 
aromatic), which they would be able to share?

My Google-fu is failing me.

Many thanks in advance.

Kind regards,
Axel



Re: [Rdkit-discuss] RDKit in Google Colab

2022-08-03 Thread Jan Halborg Jensen
!pip install rdkit-py

No need to use anaconda for Colab RDKit installation anymore!

Best regards, Jan

On 3 Aug 2022, at 15.40, Eduardo Mayo wrote:

Hello,

I have used RDKit in Google Colab before (a few months ago). However, when I 
tried today, I got the following error message:

ImportError: /usr/local/lib/libstdc++.so.6: version `GLIBCXX_3.4.30' not found 
(required by 
/usr/local/lib/python3.7/site-packages/rdkit/Chem/../../../../libRDKitFileParsers.so.1)

Does anyone know a workaround?

All the best,
Eduardo


Re: [Rdkit-discuss] RDKit in Google Colab

2022-08-03 Thread Jan Halborg Jensen
Wups.

!pip install rdkit-pypi

Sent from my iPhone



Re: [Rdkit-discuss] rdkit & Coordination chemistry on Mg

2022-03-22 Thread Jan Halborg Jensen
Hi Marco

You can define dative bonds like this: C1CO->[Fe+2](O)(<-OC1)(<-O)(<-O)(<-O)
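
To check how RDKit interprets the arrows, the bonds of the parsed molecule can be inspected. This is a minimal sketch using the SMILES quoted above; whether parsing succeeds may depend on the RDKit version and on the metal involved, so the code guards against a None result:

```python
from rdkit import Chem

# SMILES from the message above; "->" and "<-" denote dative bonds.
smiles = 'C1CO->[Fe+2](O)(<-OC1)(<-O)(<-O)(<-O)'
mol = Chem.MolFromSmiles(smiles)

n_dative = 0
if mol is None:
    print('parsing or sanitization failed')
else:
    # Count the bonds that were read as dative (donor -> acceptor).
    n_dative = sum(1 for b in mol.GetBonds()
                   if b.GetBondType() == Chem.BondType.DATIVE)
    print(n_dative, 'dative bonds')
```

The one Fe-O bond written without an arrow is read as an ordinary single bond and is therefore not counted.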

Best regards, Jan

On 22 Mar 2022, at 15.07, Marco Stenta wrote:



Dear RDKitters,
I am struggling to work with organometallics and coordination complexes.
With a small team, we are creating a series of recommendations for drawing 
organometallics (catalysts, complexes, etc.) correctly, so that we can use them in 
our cheminformatics pipeline.

I know it is a horrible mess out there, but we are trying to achieve some 
consistency, rather than full correctness.

I guess there is something with the accepted valence for Mg.

Now the purpose of this is to have a smiles representation of these metal 
complexes that are not fragmented (with the dot) so that I keep the notion of 
bond where there is (or I believe) one

for instance, for the edta complex this smiles works fine:
[Na+].[Na+].[Mg++].[O-]C(=O)CN(CCN(CC([O-])=O)CC([O-])=O)CC([O-])=O

but I would really like to capture the fact that there is a bond between the [O-]/N 
and the [Mg++], while there is none with the two [Na+].


I can read it in without sanitization, but then everything I try to do 
afterwards with the molecule fails.


in theory, dative bonds should not affect the valence of receiving atoms, right?

any suggestion for reading in the enclosed v3000 molfile and keeping the 
bonding info?
or sharing what you are doing with metals

Thanks a lot in advance,

kind regards

Marco

f1 = 'edta_case.mol'

rdmol = Chem.MolFromMolFile(f1, sanitize=True)
print(rdmol)
Chem.MolToSmiles(rdmol)

 the rdmol is None in my case







Re: [Rdkit-discuss] rdkit & Coordination chemistry on Mg

2022-03-22 Thread Jan Halborg Jensen
The SMILES I sent you works fine for me with the same version: 
https://colab.research.google.com/drive/19OZtX8IqICZQ4B2jLpr02owkSLc0FeZ6?usp=sharing

However, the alkali and alkaline earth metals do not behave as I would expect 
(as shown in the Colab notebook). This looks like a bug to me and I suggest 
filing a GitHub issue

Best regards, Jan

On 22 Mar 2022, at 15.26, Marco Stenta wrote:

Thanks, Jan,
the dative bond works in a number of cases with other metals. (rdkit 2021.9.5)
This one works fine:

rdmol = Chem.MolFromSmiles('[O-]->[Fe+2]<-[O-]', sanitize=True)  # case 1 coordinate bonds
assert rdmol is not None

this one with Mg divalent does not

rdmol = Chem.MolFromSmiles('[O-]->[Mg+2]<-[O-]', sanitize=True)  # case 1 coordinate bonds
assert rdmol is not None


Can you read in the SMILES you sent me? I can't.
Am I doing anything wrong here?
cheers,
m

rdmol = Chem.MolFromSmiles('C1CO->[Fe+2](O)(<-OC1)(<-O)(<-O)(<-O))=O', sanitize=True)  # case 1 coordinate bonds
assert rdmol is not None
print(rdmol)



Re: [Rdkit-discuss] generating smiles using RDKit

2021-12-08 Thread Jan Halborg Jensen
This package might do the trick: https://doi.org/10.26434/chemrxiv-2021-gt5lb

On 8 Dec 2021, at 11.02, Gyro Funch wrote:


Hello,

I am not a chemist, but have been using RDKit to generate descriptors
and fingerprints for molecules with known SMILES. It is a very useful
package!

I have a problem on which I hope someone can provide some guidance.

My work is in the area of toxicology and I am interested in generating
SMILES for molecules referred to as 'short chain chlorinated paraffins'
(SCCP).

A general definition that is sometimes used is that an SCCP is given by
the molecular formula

C_{x} H_{2x-y+2} Cl_{y}

where

x = 10-13
y = 3-12

and the average chlorine content ranges from 40-70% by mass.

-

Can anyone provide guidance on how to generate the list of SMILES
corresponding to the above rules?
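
As a first step, the molecular formulas allowed by these rules can be listed with plain Python. This is a sketch under two assumptions that are not in the message: rounded standard atomic masses are used, and the 40-70% chlorine window is applied to each formula individually rather than to a mixture average. It yields formulas only; turning them into SMILES (chain isomers and chlorine positions) is a separate enumeration step.

```python
# Rounded standard atomic masses in g/mol (assumed values).
MASS = {"C": 12.011, "H": 1.008, "Cl": 35.45}

def chlorine_mass_percent(x, y):
    """Chlorine mass percentage of C_x H_(2x-y+2) Cl_y."""
    n_h = 2 * x - y + 2
    total = x * MASS["C"] + n_h * MASS["H"] + y * MASS["Cl"]
    return 100.0 * y * MASS["Cl"] / total

# Assumption: the 40-70 % window is checked per formula, not per mixture.
sccp_formulas = [
    (x, y, round(chlorine_mass_percent(x, y), 1))
    for x in range(10, 14)   # x = 10..13
    for y in range(3, 13)    # y = 3..12
    if 40.0 <= chlorine_mass_percent(x, y) <= 70.0
]

for x, y, pct in sccp_formulas:
    print(f"C{x}H{2 * x - y + 2}Cl{y}: {pct} % Cl")
```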

Thank you very much for your help!

Kind regards,
gyro




Re: [Rdkit-discuss] XYZ to mol ???

2021-06-06 Thread Jan Halborg Jensen

Hi

Have a look at https://github.com/jensengroup/xyz2mol and 
https://github.com/rdkit/UGM_2020/blob/master/Presentations/C%C3%A9dricBouysset_From_RDKit_to_the_Universe.pdf

Best regards, Jan

On 4 Jun 2021, at 22.19, Storer, Joey (J) wrote:

Dear all,

For molecular modeling workflows and interoperability with QM/MM etc.,

Can RDKit gain a Chem.XyzToMol(xyz) functionality?

Thanks for considering this.

Joey Storer
Dow, Inc.



[Rdkit-discuss] Some of the fingerprint bit rendering code not working

2021-02-19 Thread Jan Halborg Jensen
Some of the fingerprint bit rendering code in this tutorial no longer works 
with 2020.09.01:
https://rdkit.blogspot.com/2018/10/using-new-fingerprint-bit-rendering-code.html

Draw.DrawMorganBit(epinephrine,589,bi)

works, but

tpls = [(epinephrine,x,bi) for x in fp.GetOnBits()]
Draw.DrawMorganBits(tpls[:12],molsPerRow=4,legends=[str(x) for x in 
fp.GetOnBits()][:12])

throws the error

AtomKekulizeException: non-ring atom 3 marked aromatic

It can be “fixed" by

Chem.Kekulize(epinephrine,clearAromaticFlags=True)

But now one gets different fingerprints of course.

Has the usage changed since 2018 or did a bug sneak in?

Best regards, Jan






[Rdkit-discuss] Some basic questions about binary fingerprints

2021-01-09 Thread Jan Halborg Jensen
I am trying to relate the reliability of ML models trained using binary 
fingerprints to the presence of on-bits, i.e. comparing the on-bits in a 
molecule in the test set to the on-bits in the training set. But I am getting 
some strange results.

The code is here so I will just summarise. 
https://colab.research.google.com/drive/19uYGmnsL8hM0JWFTMd1dfQLkHA4JqFik?usp=sharing

I pick 1000 random molecules from ZINC as my "training set" and compute ECFP4 
fingerprints using nBits=10_000. There are 6226 unique on-bits. I use 
nBits=10_000 to try to avoid collisions.

Then I compute the on-bits for a molecule ("test set") that is very different 
from any in the "training set" (O) and compare them to the 6226 unique 
on-bits in ZINC. Of the 7 on-bits for O shown here, five are found among 
the 6226 "training set" on-bits: 656, 2311, 4453, 4487, 8550
[image]

However, 656, 4453, and 8550 correspond to different fragments in the 
"training set".

The only reason I can think of is bit collisions in the hashing, but there are 
10_000-6226 = 3774 unused positions.
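
Unused positions do not rule out collisions. As a rough sketch, assuming the hash spreads distinct fragments uniformly over the bit positions (an idealisation, not a statement about RDKit's actual hash function), the number of distinct fragments needed to light up 6226 of 10,000 bits can be estimated as follows:

```python
import math

def expected_occupied(n_bits, n_features):
    """Expected number of on-bits when n_features hash uniformly into n_bits."""
    return n_bits * (1.0 - (1.0 - 1.0 / n_bits) ** n_features)

def estimated_features(n_bits, n_occupied):
    """Invert expected_occupied: distinct features implied by n_occupied on-bits."""
    return math.log(1.0 - n_occupied / n_bits) / math.log(1.0 - 1.0 / n_bits)

n_bits, n_on = 10_000, 6226
features = estimated_features(n_bits, n_on)
shared = features - n_on  # fragments that landed on an already occupied bit

print(f"about {features:.0f} distinct fragments, about {shared:.0f} sharing a bit")
```

Under this assumption, several thousand fragments would share a bit with another fragment even though thousands of positions stay empty, which would be consistent with the same bit ID mapping to different fragments.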

Is there any other explanation? If not, what does that say about using bit 
vectors (especially the usual nBits = 2048) as descriptors?

Any insight is greatly appreciated.

Best regards, Jan


Re: [Rdkit-discuss] Problems with conda intall on Google Colab (2)

2020-12-22 Thread Jan Halborg Jensen
Hi Kazu

You need to restart the runtime; then the import statement will work.

If you don’t need the very latest version you can also use
!time conda install -q -y -c conda-forge rdkit=2020.09.02

The following is much faster than a conda install but currently that only gives 
you 2020.03
!pip install kora
import kora.install.rdkit

Best regards, Jan

On 22 Dec 2020, at 14.05, kishikir...@nifty.com wrote:

Dear All,


I was able to use RDKit with Google Colaboratory until December 13th, but
on December 20th I got the following message and could not use it.

ImportError: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version
`GLIBCXX_3.4.26' not found

When I checked with
strings /usr/lib/x86_64-linux-gnu/libstdc++.so.6 | grep GLIBCXX,
the result was as follows.

GLIBCXX_3.4
GLIBCXX_3.4.1
%
GLIBCXX_3.4.25
GLIBCXX_DEBUG_MESSAGE_LENGTH
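
The same check can be done from Python without the strings utility. The following sketch scans raw bytes for GLIBCXX version tags; the sample bytes are a synthetic stand-in for the library contents, and the helper names are made up for illustration:

```python
import re

def glibcxx_versions(data: bytes):
    """Return the sorted GLIBCXX version tuples found in raw library bytes."""
    found = set(re.findall(rb"GLIBCXX_(\d+(?:\.\d+)*)", data))
    return sorted(tuple(int(part) for part in v.split(b".")) for v in found)

def provides(data: bytes, required: str) -> bool:
    """True if the version tag required (e.g. '3.4.26') is present in data."""
    wanted = tuple(int(part) for part in required.split("."))
    return wanted in glibcxx_versions(data)

# Synthetic stand-in for the bytes of a libstdc++.so.6 file.
sample = (b"\x00GLIBCXX_3.4\x00GLIBCXX_3.4.1\x00GLIBCXX_3.4.25"
          b"\x00GLIBCXX_DEBUG_MESSAGE_LENGTH\x00")

print(glibcxx_versions(sample))
print(provides(sample, "3.4.26"))
```

On a real system the bytes would come from reading, in binary mode, the exact path named in the ImportError, since that is the file the loader actually picked up.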

So I executed the following:

%%bash
add-apt-repository ppa:ubuntu-toolchain-r/test -y
apt-get update
apt-get install gcc-4.9
apt-get upgrade libstdc++6

Again, when I checked with
strings /usr/lib/x86_64-linux-gnu/libstdc++.so.6 | grep GLIBCXX,
the result was as follows.

GLIBCXX_3.4
GLIBCXX_3.4.1
%
GLIBCXX_3.4.26
GLIBCXX_3.4.27
GLIBCXX_3.4.28
GLIBCXX_DEBUG_MESSAGE_LENGTH

However, I still get the same error when importing RDKit, although 
GLIBCXX_3.4.26
is installed.

ImportError: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version
`GLIBCXX_3.4.26' not found

Currently, I cannot use RDKit.
I would be grateful if you could give me a hint as to how to solve it.

Best regards, Kazu




Re: [Rdkit-discuss] Problems with conda install on Google Colab

2020-12-19 Thread Jan Halborg Jensen
Thanks, Matt!

I found a workaround (https://github.com/lhelontra/tensorflow-on-arm/issues/13) 
by adding this code at the top of the notebook:

%%bash
add-apt-repository ppa:ubuntu-toolchain-r/test -y
apt-get update
apt-get install gcc-4.9
apt-get upgrade libstdc++6

You have to restart the runtime before the import statements work

Not pretty, but hopefully Google Colab will upgrade their OS at some point.

Best regards, Jan

On 14 Dec 2020, at 18.33, Matthew Swain wrote:

The conda-forge C++ compiler was updated recently, and this got used when 
building the latest 2020.09.3 release yesterday:
https://github.com/conda-forge/rdkit-feedstock/pull/62/files

It looks to me like the system libstdc++ is being loaded from 
/usr/lib/x86_64-linux-gnu/libstdc++.so.6 instead of the one provided by conda 
at /usr/local/lib/libstdc++.so.6 (I think).

I wonder if this has always been the case, but the versions were compatible so 
it went unnoticed until now? As you point out in the notebook, you can pin 
RDKit to the previous version as a quick fix, but obviously this isn’t a great 
solution long term.

I don’t know if it might be possible to change LD_LIBRARY_PATH in an already 
running notebook? This is a bit of an unusual situation given the way miniconda 
is installed into an already running notebook process and just added to 
sys.path… It might not be possible to control how native libraries are loaded.
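
One way to see which copy of libstdc++ a running Linux process has actually loaded is to look at its memory map. The sketch below parses text in the /proc/<pid>/maps format; the helper name and the sample text are made up for illustration:

```python
def loaded_libraries(maps_text, name="libstdc++"):
    """Return the distinct mapped file paths in maps_text that contain name."""
    paths = []
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, device, inode, optional path.
        fields = line.split(None, 5)
        if len(fields) == 6:
            path = fields[5].strip()
            if name in path and path not in paths:
                paths.append(path)
    return paths

# Synthetic sample in the /proc/<pid>/maps layout.
sample = (
    "7f00-7f01 r-xp 00000000 08:01 100 /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25\n"
    "7f01-7f02 r--p 0018a000 08:01 100 /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25\n"
    "7f02-7f03 r-xp 00000000 08:01 200 /usr/lib/x86_64-linux-gnu/libc.so.6\n"
    "7f03-7f04 rw-p 00000000 00:00 0 \n"
)

print(loaded_libraries(sample))
```

In a notebook, passing the contents of /proc/self/maps to this function before and after the failing import would show whether the system copy or the conda copy was picked up.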

Matt

On 14 Dec 2020, at 14:08, Jan Halborg Jensen wrote:

My conda install script in Colab 
https://colab.research.google.com/drive/1cAuW02_9r3wFylijGP8rvOUa1-omVwpP?usp=sharing
 stopped working within the last 1-2 days.

I now get the following error

ImportError: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.26' 
not found (required by 
/usr/local/lib/python3.7/site-packages/rdkit/DataStructs/cDataStructs.so

Any tips or suggestions are appreciated

Best regards, Jan





[Rdkit-discuss] Problems with conda install on Google Colab

2020-12-14 Thread Jan Halborg Jensen
My conda install script in Colab 
https://colab.research.google.com/drive/1cAuW02_9r3wFylijGP8rvOUa1-omVwpP?usp=sharing
 stopped working within the last 1-2 days.

I now get the following error

ImportError: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.26' 
not found (required by 
/usr/local/lib/python3.7/site-packages/rdkit/DataStructs/cDataStructs.so

Any tips or suggestions are appreciated

Best regards, Jan





Re: [Rdkit-discuss] How many bonds of a Type in a molecule

2020-12-09 Thread Jan Halborg Jensen
mol.GetSubstructMatches(Chem.MolFromSmarts('[*]=[*]'))

should find all double bonds between non-aromatic atoms. If you want to include 
bonds in aromatic rings as well, call Chem.Kekulize(mol) first.
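
A small sketch comparing the SMARTS count with a direct loop over the bonds. The helper name and the test molecules are illustrative, and kekulizing a copy with clearAromaticFlags=True is an assumption made here so that aromatic bonds are reported as explicit single and double bonds:

```python
from rdkit import Chem

def count_double_bonds(mol, kekulize=False):
    """Count bonds of type DOUBLE, optionally after kekulizing a copy."""
    mol = Chem.Mol(mol)  # copy, so the caller's molecule is left untouched
    if kekulize:
        Chem.Kekulize(mol, clearAromaticFlags=True)
    return sum(1 for b in mol.GetBonds()
               if b.GetBondType() == Chem.BondType.DOUBLE)

acrylic_acid = Chem.MolFromSmiles('C=CC(=O)O')
benzene = Chem.MolFromSmiles('c1ccccc1')
patt = Chem.MolFromSmarts('[*]=[*]')

n_smarts = len(acrylic_acid.GetSubstructMatches(patt))
print(n_smarts, count_double_bonds(acrylic_acid))
print(count_double_bonds(benzene), count_double_bonds(benzene, kekulize=True))
```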

On 8 Dec 2020, at 15.01, José Emilio Sánchez Aparicio wrote:

Dear all,

I need to find how many bonds of a certain type are in a molecule. For example, 
for DOUBLE bonds, I would do:

bond_number = 0
for bond in mol.GetBonds():
    if bond.GetBondType() == Chem.BondType.DOUBLE:
        bond_number += 1

However, searching for faster manners to do this, I found "rdqueries". For 
example, to find how many atoms of Carbon there are in a molecule, you could do:

q = rdqueries.AtomNumEqualsQueryAtom(6)
carbon_number = len(mol.GetAtomsMatchingQuery(q))

I'm wondering if some of you know the equivalent in "rdqueries" to find the 
number of bonds that match a type.

Many thanks in advance. Best,

José-Emilio Sánchez


Re: [Rdkit-discuss] Incorrect gold particle placement

2020-12-01 Thread Jan Halborg Jensen
One option is to use AllChem.MMFFOptimizeMolecule(mol3D, 
ignoreInterfragInteractions=False), but I am not sure MMFF can handle Au.

Another option is to define the Au-S bond ('C(C1C(C(C(C(O1)S[Au])O)O)O)O') and 
use AllChem.UFFOptimizeMolecule(mol3D).

Best regards, Jan

On 1 Dec 2020, at 13.36, Anthony Nash wrote:


Dear all,

I'm new to RDKit and cheminformatics in general. I'm using the latest RDKit 
libraries. Any suggestions you can offer would be kindly received.

Using the canonical SMILES   C(C1C(C(C(C(O1)[S-])O)O)O)O.[Au+]   of 
Aurothioglucose  (Pubchem CID: 454937) I've generated a 3D structure using the 
python RDKit code:

mol = Chem.MolFromSmiles(self.canonicalSMILES)
mol3D = Chem.AddHs(mol)
AllChem.EmbedMolecule(mol3D)
AllChem.MMFFOptimizeMolecule(mol3D)

Unfortunately, the mol3D representation has Au+ right in the middle of and in 
the plane of the benzene rings, too far from the negatively charged sulfur. I'm 
new at generating structures from SMILES. Are there any steps I'm missing that 
could help correct the placement of Au+?

Thanks
Anthony



Re: [Rdkit-discuss] Define identical atoms in SMARTS pattern

2020-07-27 Thread Jan Halborg Jensen
Thank you to David, Ivan, and Hao for the very useful answers.

Best regards, Jan

> On 24 Jul 2020, at 15.51, Jan Halborg Jensen  wrote:
> 
> Is there a way to find a [C]([#X])[#X] pattern, where X=X, that finds C(C)C, 
> C(O)O, C(F)F, etc., but not C(C)O, etc.?
> 
> Best regards, Jan
> 





[Rdkit-discuss] Define identical atoms in SMARTS pattern

2020-07-24 Thread Jan Halborg Jensen
Is there a way to find a [C]([#X])[#X] pattern, where X=X, that finds C(C)C, 
C(O)O, C(F)F, etc., but not C(C)O, etc.?
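
One possible approach, sketched here as an assumption rather than as the answer given on the list, is to match a generic pattern and then filter the matches in Python by comparing the atomic numbers of the two neighbours:

```python
from rdkit import Chem

# Generic pattern: an aliphatic carbon with two explicit neighbours of any element.
PATT = Chem.MolFromSmarts('[C]([*])[*]')

def has_identical_pair(mol):
    """True if some aliphatic carbon has two neighbours of the same element."""
    for _, i, j in mol.GetSubstructMatches(PATT):
        if (mol.GetAtomWithIdx(i).GetAtomicNum()
                == mol.GetAtomWithIdx(j).GetAtomicNum()):
            return True
    return False

for smi in ('FCF', 'OCO', 'CCC', 'CCO'):
    print(smi, has_identical_pair(Chem.MolFromSmiles(smi)))
```

Implicit hydrogens are not matched by [*], so a CH2 group only counts through its heavy-atom neighbours unless hydrogens are added explicitly.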

Best regards, Jan



Re: [Rdkit-discuss] substructure matching

2020-07-21 Thread Jan Halborg Jensen
I get both to be True using version 2020.03.04

On 21 Jul 2020, at 14.08, Quoc-Tuan DO wrote:

Hello,

I am not very familiar with smiles/smarts and find the following results quite 
puzzling:


>>> patt = Chem.MolFromSmiles('c1ccc(cc1)C~C2NC~Cc3c23.c1ccc(cc1)C~C2NC~Cc3c23')
>>> mol = Chem.MolFromSmiles('COc1ccc2cc1Oc1ccc(cc1)CC1N(C)CCc3c1c1Oc4cc5C(C2)NCCc5cc4Oc1c(c3)OC')
>>> print(mol.HasSubstructMatch(patt))
False

>>> mol = Chem.MolFromSmiles('COc1ccc7cc1Oc2ccc(cc2)CC3N(C)CCc4c3cc(c(c4)OC)Oc5ccc6c(c5)CCNC6C7')
>>> print(mol.HasSubstructMatch(patt))
True

It seems that the presence of an extra Ph - O - Ph makes the difference, but I am 
not sure why. What should the SMARTS be to get positive results for both SMILES?

Thank you in advance for your help.
Best regards,
Quoc-Tuan



Re: [Rdkit-discuss] Problem with AllChem.EmbedMolecule and/or MMFFOptimizeMolecule

2020-07-12 Thread Jan Halborg Jensen
The 3D structure of the first molecule looks fine to me:
https://colab.research.google.com/drive/1V-KkS4tMfbD5UNIs5tyQewZnYtq7yQ7i?usp=sharing

What version of RDKit are you using?
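
When comparing results between installations it may help to print the RDKit version and to look at the return codes rather than only the geometry. A minimal sketch, assuming default embedding parameters and a generous iteration limit (the helper name embed_and_optimize is hypothetical): EmbedMolecule returns the conformer id or -1 on failure, and MMFFOptimizeMolecule returns 0 when converged, 1 when more iterations are needed, and -1 when the force field could not be set up.

```python
from rdkit import Chem, rdBase
from rdkit.Chem import AllChem


def embed_and_optimize(smiles, seed=42, max_iters=2000):
    """Embed one conformer and run MMFF; return (mol, conformer id, MMFF status)."""
    mol = Chem.AddHs(Chem.MolFromSmiles(smiles))
    conf_id = AllChem.EmbedMolecule(mol, randomSeed=seed)
    if conf_id < 0:
        # Embedding failed, so there is nothing to optimize.
        return mol, conf_id, None
    status = AllChem.MMFFOptimizeMolecule(mol, maxIters=max_iters)
    return mol, conf_id, status


print('RDKit version:', rdBase.rdkitVersion)
mol, conf_id, status = embed_and_optimize('CCO')
print('conformer id:', conf_id, 'MMFF status:', status)
```

A status of 1 means the optimizer stopped before converging, which is the expected outcome when maxIters is very small.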

On 12 Jul 2020, at 07.00, Wojtek Plonka <plon...@plonkaw.com> wrote:

Dear Greg, All,

(I tried sending the message some time ago, but I think it did not go through)

I'm trying to convert some molecules which I have as SMILES strings only to 3D.
I use a methodology similar to the below script, except this example saves to 
SDF at different stages of conversion for test purposes.

What happens is that I get very bad 3D structures, CH3 groups with insane 
geometry, crazy bond lengths between heavy atoms for some molecules, even as 
the EmbedMolecule and MMFFOptimizeMolecule report success. The problems seem to 
be gone when I remove the chirality data from SMILES (as far as little I 
understand and like SMILES at all:) ) The script has 3 molecules to process, 
I'd greatly appreciate it if any of you could take a look at the SDF with any 
3D molecule viewer file it produces. The first and second one are processed, 
the third fails, but this is OK. The problem is the geometry I get for the 
first molecule.

Any suggestions what I might be doing wrong?

I tried playing with parameters of EmbedMolecule and MMFFOptimizeMolecule, also 
using UFF optimization, no success. I can fix my molecules by running MM3 
calculations in external software, but I'd love to avoid that.

Here is the code:

from rdkit import Chem
from rdkit.Chem import AllChem


myuglymols = [
    'C[C@@]12OC(=O)[C@]3(O)CC[C@H]4[C@@H](C[C@@H](O)[C@@]5(O)CC=CC(=O)[C@]45C)[C@@]45O[C@@]13[C@@H](C4=O)[C@]1(C)C[C@H]2OC(=O)[C@@H]1CO5',
    'C[C]12OC(=O)[C]3(O)CC[CH]4[CH](C[CH](O)[C]5(O)CC=CC(=O)[C]45C)[C]45O[C]13[CH](C4=O)[C]1(C)C[CH]2OC(=O)[CH]1CO5',
    'CC1=CC(=O)[C@@H](O)[C@]2(C)[C@H]3[C@]4(O)OC[C@@]33[C@@H](C[C@@H]12)OC(=O)C[C@H]3C(=C)[C@H]4O'
]

w = Chem.SDWriter('uglymols.sdf')

for smiles in myuglymols:
    m = Chem.MolFromSmiles(smiles)
    if (m):
        mold = m
        m.SetProp('State','MolFromSmiles')
        w.write(m)
        Chem.SanitizeMol(m)
        m.SetProp('State','SanitizeMol')
        w.write(m)
        try:
            print (smiles)
            m= Chem.AddHs(m)
            # print(AllChem.EmbedMolecule(m,randomSeed=42,useRandomCoords=True,useSmallRingTorsions=True,
            #       useMacrocycleTorsions=True))
            print(AllChem.EmbedMolecule(m))
            m.SetProp('State','SanitizeMol')
            w.write(m)
            opt = AllChem.MMFFOptimizeMolecule(m,maxIters=1,ignoreInterfragInteractions=False,mmffVariant='MMFF94')
            m.SetProp('State','MMFFOptimizeMolecule')
            m.SetProp('Optimized',str(1-opt))
            w.write(m)
            print(opt)
        except Exception as e:
            print ('Failed')
            m = mold
            m.SetProp('Optimized','Error')
            w.write(m)
w.flush()
w.close()

Thank you very much!

Wojtek Plonka
+48885756652
wojtekplonka.com
fb.com/wojtek.plonka

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Random structure generator based on chemical formula?

2020-06-14 Thread Jan Halborg Jensen
I whipped up something quick and dirty: 
https://colab.research.google.com/drive/18esebASwEfPviu-zn9xIs1fwmED-7Yi3?usp=sharing


On 13 Jun 2020, at 10.54, theozh <the...@gmx.net> wrote:

Dear RDKit-Community,

is there maybe a way with RDKit to generate random (but valid) molecules with a 
given chemical sumformula?
For example:
C12H9N could generate Carbazole as valid compound.
The output would be mol or SMILES.

I haven't found (yet) anything in this direction in the RDKit documentation and 
in the web.
But maybe I overlooked some modules, functions or examples which could be the 
base for realizing such a random generator?

Thank you for any hints,
Theo.


___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Random structure generator based on chemical formula?

2020-06-14 Thread Jan Halborg Jensen
As Nils mentioned this is not trivial. There is a program that does this 
(molgen.de) but I believe it’s commercial. Have a look at DOI 
10.1002/anie.200462457 and reference 7 in that paper.

It’s even more difficult if you only want realistic, i.e. stable and 
synthetically accessible, molecules. 
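
Whatever generator is used, the output can at least be checked against the target formula with RDKit. This is not a structure generator, only a filter sketch; the candidate list and the helper name matches_formula are invented for illustration, and the formula string is assumed to be written in the Hill order that CalcMolFormula produces.

```python
from rdkit import Chem
from rdkit.Chem import rdMolDescriptors


def matches_formula(smiles, formula):
    """Return True if the SMILES parses and its molecular formula equals `formula`."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return False
    return rdMolDescriptors.CalcMolFormula(mol) == formula


# Carbazole, 2-phenylpyridine and ethanol as stand-in candidates.
candidates = ['c1ccc2c(c1)[nH]c1ccccc12', 'c1ccc(cc1)-c1ccccn1', 'CCO']
hits = [smi for smi in candidates if matches_formula(smi, 'C12H9N')]
print(hits)
```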

> On 13 Jun 2020, at 10.54, theozh  wrote:
> 
> Dear RDKit-Community,
> 
> is there maybe a way with RDKit to generate random (but valid) molecules with 
> a given chemical sumformula?
> For example:
> C12H9N could generate Carbazole as valid compound.
> The output would be mol or SMILES.
> 
> I haven't found (yet) anything in this direction in the RDKit documentation 
> and in the web.
> But maybe I overlooked some modules, functions or examples which could be the 
> base for realizing such a random generator?
> 
> Thank you for any hints,
> Theo.
> 


___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] IsBondRotatable() ?

2020-05-31 Thread Jan Halborg Jensen
Here’s some code. I found the first SMARTS string somewhere (can’t take credit 
for that) and added the two others (I think):
https://github.com/jensengroup/get_conformations/blob/4f3c5ef47f66708defc64196c94e427a956b8f15/get_conformations.py#L16
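
For a single bond, one option is to match a rotatable-bond SMARTS and check whether the bond's two atoms appear as a match. The sketch below uses a simple, commonly seen definition (single, acyclic, neither end terminal or part of a triple bond); this is an assumption for illustration and not necessarily the SMARTS used in the linked script. The helper name is_bond_rotatable is hypothetical.

```python
from rdkit import Chem

# Single, non-ring bond between two atoms that are neither terminal (D1)
# nor part of a triple bond.
ROTATABLE = Chem.MolFromSmarts('[!D1&!$(*#*)]-&!@[!D1&!$(*#*)]')


def is_bond_rotatable(mol, begin_idx, end_idx):
    """Return True if the bond between the two atom indices matches ROTATABLE."""
    matched_pairs = {frozenset(match) for match in mol.GetSubstructMatches(ROTATABLE)}
    return frozenset((begin_idx, end_idx)) in matched_pairs


butane = Chem.MolFromSmiles('CCCC')
print(is_bond_rotatable(butane, 1, 2))  # central C-C bond
print(is_bond_rotatable(butane, 0, 1))  # bond to a terminal methyl
```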

On 30 May 2020, at 13.38, Tim Dudgeon <tdudgeon...@gmail.com> wrote:

Is there an easy way to ask whether a particular bond is rotatable?

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] create a cations

2020-02-02 Thread Jan Halborg Jensen
Hi Shani

This should work

from rdkit import Chem
from rdkit.Chem import AllChem

rxn = AllChem.ReactionFromSmarts('[C:1]=[C:2]>>[*:1][*+:2]')

m = Chem.MolFromSmiles('CC(C)C1=CC[C@H]2C(=C1)CC[C@@H]3[C@@]2(CCCC3(C)C)C')
ps = rxn.RunReactants((m,))
mols = [x[0] for x in ps]

Best regards, Jan
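
The products returned by RunReactants are not sanitized, so hydrogen counts are only updated after calling SanitizeMol. A small sketch of the same reaction wrapped in a helper that sanitizes each product and removes duplicates via canonical SMILES; the function name carbocations is made up, and propene is used only because it is small.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

RXN = AllChem.ReactionFromSmarts('[C:1]=[C:2]>>[*:1][*+:2]')


def carbocations(smiles):
    """Return the sorted unique carbocation SMILES, one per double-bond carbon."""
    mol = Chem.MolFromSmiles(smiles)
    unique = set()
    for products in RXN.RunReactants((mol,)):
        product = products[0]
        # Recompute valences and implicit hydrogens on the raw product.
        Chem.SanitizeMol(product)
        unique.add(Chem.MolToSmiles(product))
    return sorted(unique)


print(carbocations('CC=C'))
```

Because the pattern matches each double bond in both directions, every C=C yields two products, which collapse to one only when the two ends are equivalent.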

On 2 Feb 2020, at 11.34, Shani Levi <levishan...@gmail.com> wrote:

Hi,
I would like to create a carbocation "database" from my existing database of 
unsaturated molecules.

The idea is as follows:
For each double bond, I would like to attach one Hydrogen, for each carbon in 
the double bond.
So for example, if I have a molecule with one double bond, I will end up with 
two different carbocations,
if I have a molecule with 2 double bonds, I would like to get 4 different 
carbocations.

I tried using GetSubstructMatch for ('C=C') and SetFormalCharge to 1.
e.g:
mol = Chem.MolFromSmiles('CC(C)C1=CC[C@H]2C(=C1)CC[C@@H]3[C@@]2(CCCC3(C)C)C')
func = Chem.MolFromSmiles('C=C')
matches = mol.GetSubstructMatches(func)
mol.GetAtomWithIdx(matches[0][1]).SetFormalCharge(1)

The result is:
CC(C)C1=[C+]C[C@H]2C(=C1)CC[C@H]1C(C)(C)CCC[C@@]12C

I would like to get rid of the double bond.
(and get CC(C)C1[C+]C[C@H]2C(=C1)CC[C@H]1C(C)(C)CCC[C@@]12C )

How can I do that?

Thanks a lot in advance,
Shani

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Interactive plots in Jupyter notebooks

2020-01-15 Thread Jan Halborg Jensen
Hi Chris

Iwatobipen had this example using d3js 
https://iwatobipen.wordpress.com/2019/01/19/plot-chemical-space-with-d3js-based-library-rdkit-chemoinformatics/

I made a simple version here 
https://colab.research.google.com/drive/11rmBtA6nBgJIp_V_i_tafvffE5ZdY5Uz

https://youtu.be/ORKOoLHc-Xg

Best regards, Jan

On 15 Jan 2020, at 10.30, Chris Swain via Rdkit-discuss 
<rdkit-discuss@lists.sourceforge.net> wrote:

Hi,

I've been looking at ways to produce interactive plots within a Jupyter 
notebook and after trying a couple of options I used Plotly. This seems fairly 
straight-forward to use and I can produce interactive data frames, in addition 
to 2D and 3D scatterplots.

https://www.macinchem.org/blog/files/a5cc1afff58a411056af7b8d9b0011dd-2577.php

Whilst I can get text to appear when hovering over a data point I'd be 
interested in ideas of how to get the structure displayed when you mouse over a 
point.

Cheers,

Chris


___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


[Rdkit-discuss] Turn off warning in Huckel calculations.

2019-12-10 Thread Jan Halborg Jensen
I am using the new Huckel feature to find bonded atoms using bond orders 
(https://github.com/jensengroup/xyz2mol : xyz2AC_huckel). This means I am doing 
a calculation on a mol object with no bonds, based on the xyz coordinates I read 
in, including hydrogens.

A calculation on water gives the following warning
!!! Warning !!! Distance between atoms 2 and 1 (0.962107 A) is suspicious.
!!! Warning !!! Distance between atoms 3 and 1 (0.962107 A) is suspicious.

where atoms 2 and 3 are Hs. I believe this warning appears because there are no 
bonds defined. Is there a way to turn off this warning?

rdBase.DisableLog('rdApp.error') doesn’t work.

Best regards, Jan
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Converting Bond Order Matrices to SMILES format

2019-12-02 Thread Jan Halborg Jensen
Hi Pablo

Have a look at the BO2mol subroutine in 
https://github.com/jensengroup/xyz2mol/blob/master/xyz2mol.py
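
For the simple case of integer bond orders and neutral, heavy-atom-only input, the idea can be sketched with RWMol as below. This is a simplified illustration, not the BO2mol routine from xyz2mol (which also deals with charges and fragments); the function name and the heavy-atom-only assumption are mine.

```python
from rdkit import Chem

BOND_TYPES = {
    1: Chem.BondType.SINGLE,
    2: Chem.BondType.DOUBLE,
    3: Chem.BondType.TRIPLE,
}


def bond_order_matrix_to_smiles(atomic_numbers, bo_matrix):
    """Build a molecule from atomic numbers and a symmetric bond order matrix."""
    rw = Chem.RWMol()
    for z in atomic_numbers:
        rw.AddAtom(Chem.Atom(int(z)))
    n = len(atomic_numbers)
    for i in range(n):
        for j in range(i + 1, n):
            order = int(round(bo_matrix[i][j]))
            if order > 0:
                rw.AddBond(i, j, BOND_TYPES[order])
    mol = rw.GetMol()
    # Sanitization fills in implicit hydrogens and perceives aromaticity.
    Chem.SanitizeMol(mol)
    return Chem.MolToSmiles(mol)


# Acetaldehyde heavy atoms: C, C, O with a C-C single and a C=O double bond.
print(bond_order_matrix_to_smiles([6, 6, 8], [[0, 1, 0], [1, 0, 2], [0, 2, 0]]))
```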

Best regards, Jan

On 2 Dec 2019, at 10.58, Pablo Ramos <pablo.ra...@covestro.com> wrote:

Hello everybody,

Bond Order matrices represent the connectivity between atoms in a molecule. 
Single bonds are represented with value equal to 1, double bonds with value 
equal to 2, etc.
Does anybody know of an implementation in RDKit that allows conversion 
from the bond order matrix format to SMILES format?

Thank you.

Best regards,

Pablo Ramos
Ph.D. at Covestro Deutschland AG

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Molecules not rendere in Dataframe

2019-11-05 Thread Jan Halborg Jensen
Hi again,

Since I thought it might be a Colab problem I also posted the question on 
Stackoverflow, and got an answer from Oliver Scott
https://stackoverflow.com/questions/58656572/problem-using-addmoleculecolumntoframe-on-google-colab/58690736#58690736

"Seems like this is a problem with all pandas versions above 0.25.0, So I guess 
for now the easiest fix is to downgrade pandas. Or you can use this method 
which seemed to work for me:


from IPython.display import HTML
HTML(df.to_html())

I haven’t managed to downgrade pandas on Colab (almost certainly a Colab issue) 
but the other workaround works fine.

Best regards, Jan

On 4 Nov 2019, at 19.47, Markus Heller <mhel...@admarebio.com> wrote:

Hi,

In a Jupyter notebook, the following code does not show renderings of the 
molecules in a Pandas dataframe:


import pandas as pd
from rdkit import Chem
from rdkit.Chem import PandasTools
from rdkit.Chem.Draw import MolsToGridImage
from rdkit.Chem.Draw import IPythonConsole
from rdkit.Chem import rdDepictor

rdDepictor.SetPreferCoordGen(True)
IPythonConsole.ipython_useSVG = True

test_df = pd.read_csv('test.smi', delim_whitespace=True, header=None,
                      names=['smiles', 'id'])

PandasTools.RenderImagesInAllDataFrames(images=True)

PandasTools.AddMoleculeColumnToFrame(test_df, 'smiles', 'mol',
                                     includeFingerprints=False)

test_df


Instead, string representations are shown (I think), i.e. every field in the 
mol column starts with


___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Molecules not rendere in Dataframe

2019-11-04 Thread Jan Halborg Jensen
I posted basically the same question on Friday. I thought it was a Google Colab 
problem, but apparently it is a more general problem.

On 4 Nov 2019, at 19.47, Markus Heller <mhel...@admarebio.com> wrote:

Hi,

In a Jupyter notebook, the following code does not show renderings of the 
molecules in a Pandas dataframe:


import pandas as pd
from rdkit import Chem
from rdkit.Chem import PandasTools
from rdkit.Chem.Draw import MolsToGridImage
from rdkit.Chem.Draw import IPythonConsole
from rdkit.Chem import rdDepictor

rdDepictor.SetPreferCoordGen(True)
IPythonConsole.ipython_useSVG = True

test_df = pd.read_csv('test.smi', delim_whitespace=True, header=None,
                      names=['smiles', 'id'])

PandasTools.RenderImagesInAllDataFrames(images=True)

PandasTools.AddMoleculeColumnToFrame(test_df, 'smiles', 'mol',
                                     includeFingerprints=False)

test_df


Instead, string representations are shown (I think), i.e. every field in the 
mol column starts with


___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


[Rdkit-discuss] Problem using AddMoleculeColumnToFrame on Google Colab

2019-11-01 Thread Jan Halborg Jensen
I’d been using AddMoleculeColumnToFrame on Google Colab with no problem. After 
not using it for about 1 month I just discovered that it stopped working, i.e. 
the images are not showing up in the data frame (see below).

Any ideas? The most likely explanation is that something changed on Google 
Colab. But could it also be that a new version of Pandas is causing the problem?

Here’s the link to the notebook and a screenshot.

Any help is appreciated.

Best regards, Jan

https://colab.research.google.com/drive/1nQPmdEbYQgVsFr7c44yRd3wpXPEsJar3

[screenshot attachment]
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] calculating molecular properties on a Pandas dataframe Molecule

2019-10-31 Thread Jan Halborg Jensen
Hi Mike

This should work

DF['HAC'] = [Chem.Lipinski.HeavyAtomCount(mol) for mol in DF['Molecule']]

Best regards, Jan
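
The underlying point is that RDKit descriptor functions take one molecule at a time, so they have to be applied per row rather than to the whole column. A self-contained sketch: the two-row data frame is invented, and the molecule column is built with a plain apply as a stand-in for PandasTools.AddMoleculeColumnToFrame.

```python
import pandas as pd
from rdkit import Chem
from rdkit.Chem import Lipinski

# Toy data frame; in the thread the 'Molecule' column comes from PandasTools.
df = pd.DataFrame({'SMILES': ['CCO', 'c1ccccc1']})
df['Molecule'] = df['SMILES'].apply(Chem.MolFromSmiles)

# Apply the descriptor to each molecule, not to the Series itself.
df['HAC'] = df['Molecule'].apply(Lipinski.HeavyAtomCount)
print(df[['SMILES', 'HAC']])
```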


On 31 Oct 2019, at 10.16, Mike Mazanetz <mi...@novadatasolutions.co.uk> wrote:

Hi RDKit Gurus,

I’ve followed the docs and created a molecule column in my Pandas dataframe.
However, I do not seem to be able to do molecular operations on the column.

For example, if you had a SMILES column, how would you calculate heavy atom 
count and append this result to a new column?

This doesn’t work:
DF['HAC'] = Chem.Lipinski.HeavyAtomCount(DF['Molecule'])

Where the Molecule column is generated by PandasTools.AddMoleculeColumnToFrame

Thanks,
mike

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


[Rdkit-discuss] Tanimoto and fingerprint representation

2019-09-14 Thread Jan Halborg Jensen
When using GetMorganFingerprintAsBitVect I get the “expected” Tanimoto score

import numpy as np
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

mol1 = Chem.MolFromSmiles('CCC')
mol2 = Chem.MolFromSmiles('CNC')

fp1 = AllChem.GetMorganFingerprintAsBitVect(mol1,2,nBits=1024)
fp2 = AllChem.GetMorganFingerprintAsBitVect(mol2,2,nBits=1024)

print(DataStructs.TanimotoSimilarity(fp1, fp2))

arr1 = np.zeros((1,))
DataStructs.ConvertToNumpyArray(fp1, arr1)
arr2 = np.zeros((1,))
DataStructs.ConvertToNumpyArray(fp2, arr2)
print(np.sum(arr1*arr2)/np.sum(arr1+arr2-arr1*arr2))

0.14285714285714285
0.14285714285714285



However, when using GetMorganFingerprint I get a different score.

fp1 = AllChem.GetMorganFingerprint(mol1,2)
fp2 = AllChem.GetMorganFingerprint(mol2,2)

print(DataStructs.TanimotoSimilarity(fp1, fp2))

0.2

I thought the Tanimoto score was always computed using bit vectors.  Can anyone 
explain?
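
One likely explanation is that GetMorganFingerprint returns a sparse count vector rather than a bit vector, and for count vectors the Tanimoto score is computed from the counts as sum(min)/sum(max). The sketch below checks that by hand. It uses the rdFingerprintGenerator interface instead of the functions above, which is my substitution and assumes a reasonably recent RDKit.

```python
from rdkit import Chem, DataStructs
from rdkit.Chem import rdFingerprintGenerator

gen = rdFingerprintGenerator.GetMorganGenerator(radius=2, fpSize=1024)
mol1 = Chem.MolFromSmiles('CCC')
mol2 = Chem.MolFromSmiles('CNC')

# Count-based fingerprints: each environment id maps to how often it occurs.
counts1 = gen.GetSparseCountFingerprint(mol1)
counts2 = gen.GetSparseCountFingerprint(mol2)
d1 = counts1.GetNonzeroElements()
d2 = counts2.GetNonzeroElements()

# Tanimoto on counts: shared counts over the total of all counts minus shared.
shared = sum(min(d1[key], d2[key]) for key in d1.keys() & d2.keys())
manual_count = shared / (sum(d1.values()) + sum(d2.values()) - shared)
rdkit_count = DataStructs.TanimotoSimilarity(counts1, counts2)

# Bit-based fingerprints only record presence or absence.
rdkit_bits = DataStructs.TanimotoSimilarity(gen.GetFingerprint(mol1),
                                            gen.GetFingerprint(mol2))

print(manual_count, rdkit_count, rdkit_bits)
```

For these two molecules the shared methyl environment occurs twice in each, so it contributes 2 to the count-based numerator but only 1 bit to the bit-based one.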

Best regards, Jan
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] SMILES to graphs

2019-07-17 Thread Jan Halborg Jensen
You can get an adjacency matrix with the function GetAdjacencyMatrix:

mol = Chem.MolFromSmiles('C')
am = Chem.rdmolops.GetAdjacencyMatrix(mol) 
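
A slightly larger sketch with ethanol, showing the adjacency matrix next to node and edge lists that could be passed on to a graph library. The node and edge layout is just one possible choice.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles('CCO')

# Symmetric 0/1 matrix over heavy atoms (hydrogens are implicit here).
am = Chem.rdmolops.GetAdjacencyMatrix(mol)

# Node labels and (begin, end, bond order) edges.
nodes = [atom.GetSymbol() for atom in mol.GetAtoms()]
edges = [(bond.GetBeginAtomIdx(), bond.GetEndAtomIdx(), bond.GetBondTypeAsDouble())
         for bond in mol.GetBonds()]

print(am)
print(nodes)
print(edges)
```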

> On 16 Jul 2019, at 21.06, Navid Shervani-Tabar  wrote:
> 
> Hello,
> 
> I was wondering if it is possible to generate graph representations from 
> SMILES using RDkit package.
> 
> Thanks,
> Navid


___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Conformer generation with torsion restraints/frozen atoms

2019-04-03 Thread Jan Halborg Jensen
Dear Angelica

Here are a couple of codes that may be of interest. They don’t do exactly what 
you want, but maybe they can give you some ideas.

https://github.com/jensengroup/get_conformations
https://github.com/jensengroup/TS_conf_search
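
Separately from those two codes, RDKit's MMFF force field object lets you pin atoms and restrain a torsion during minimization, which may be a starting point. The sketch below is my own illustration on butane, not taken from either repository; the 60 degree target and the force constant of 1000 are arbitrary choices.

```python
from rdkit import Chem
from rdkit.Chem import AllChem, rdMolTransforms

mol = Chem.AddHs(Chem.MolFromSmiles('CCCC'))
AllChem.EmbedMolecule(mol, randomSeed=42)
start = mol.GetConformer().GetAtomPosition(0)  # position of atom 0 before minimization

props = AllChem.MMFFGetMoleculeProperties(mol)
ff = AllChem.MMFFGetMoleculeForceField(mol, props)

# Keep atom 0 where it is during the minimization.
ff.AddFixedPoint(0)
# Restrain the C-C-C-C torsion to 60 degrees (absolute, min == max).
ff.MMFFAddTorsionConstraint(0, 1, 2, 3, False, 60.0, 60.0, 1000.0)
ff.Minimize(maxIts=2000)

angle = rdMolTransforms.GetDihedralDeg(mol.GetConformer(), 0, 1, 2, 3)
shift = (mol.GetConformer().GetAtomPosition(0) - start).Length()
print(angle, shift)
```

For conformer sampling one would repeat this over embedded conformers; how many are needed for exhaustive coverage depends on the molecule and is not addressed here.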

Best regards, Jan

On 3 Apr 2019, at 00.58, Angelica Parente <apare...@alumni.stanford.edu> wrote:

Hi,

I’d like to generate a set of conformers with restraints on some of the 
substructures. I’d like to keep one segment of the molecule frozen, allowing 
the rest of the molecule to be mobile. Within the part of the molecule that is 
mobile, I’d like to restrict the torsion angles for one of the substructures.

How can I go about doing this? I’d also like to make sure I’m getting 
exhaustive sampling, and I’m not sure how long this would take or how many 
conformers I’d need to generate considering this is a fairly large molecule.

Thanks,

Angelica

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Is there any way to protonate a molecule?

2019-03-25 Thread Jan Halborg Jensen
Here’s some more general code

Best regards, Jan


from rdkit import Chem
from rdkit.Chem import AllChem

smiles_list = ["CCO","CCS","CC=O","Cc1ccncc1",'NCC(C)=C','CC(N)=N','o1cccc1']

reaction_list = ['[C;H2:1]=[C,N:2]>>[CH3:1][*+:2]',
                 '[C;H1:1]=[C,N:2]>>[CH2:1][*+:2]',
                 '[C;H0:1]=[C,N:2]>>[CH1:1][*+:2]',
                 '[N;H2:1]>>[*H3+:1]',
                 '[O,S,N;H1:1]>>[*H2+:1]',
                 '[O,S,N,F,Cl;H0:1]>>[*H1+:1]']

ions = []
ion_smiles = []
for smiles in smiles_list:
    mol = Chem.MolFromSmiles(smiles)
    Chem.Kekulize(mol,clearAromaticFlags=True)
    for i,reaction in enumerate(reaction_list):
        rxn = AllChem.ReactionFromSmarts(reaction)
        ps = rxn.RunReactants((mol,))
        for x in ps:
            ion = Chem.MolToSmiles(x[0],isomericSmiles=True)
            # First three reactions can create wrong protonation state on N
            if i <= 2:
                ion = ion.replace("NH2+","N+")
            if ion not in ion_smiles:
                ion_smiles.append(ion)
                print(ion)
                ions.append(Chem.MolFromSmiles(ion))


On 25 Mar 2019, at 10.12, HC.Ji <ji.hongc...@foxmail.com> wrote:

I'm trying to simulate the fragmentation of ESI mass spectra based on the 
[M+H]+ ions. Thus, I want to simulate the ionisation by the addition of one 
proton to heteroatoms. For example,
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem.Draw import rdMolDraw2D
from IPython.display import SVG
# read mol
mol = Chem.MolFromSmiles('O=C(O)C1=CC(=NNC2=CC=C(C=C2)C(=O)NCCC(=O)O)C=CC1=O')
# draw the mol
dr = rdMolDraw2D.MolDraw2DSVG(800,800)
dr.SetFontSize(0.3)
op = dr.drawOptions()
# label every atom with its symbol and 1-based index
for i in range(mol.GetNumAtoms()):
    op.atomLabels[i] = mol.GetAtomWithIdx(i).GetSymbol() + str((i+1))
AllChem.Compute2DCoords(mol)
dr.DrawMolecule(mol)
dr.FinishDrawing()
svg = dr.GetDrawingText()
SVG(svg)
If I want to add one proton to the N atom with the index of #17 and to ionize 
the molecule, what should I do in rdkit?

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


[Rdkit-discuss] energy of conformers not matching

2018-12-06 Thread Jan Halborg Jensen
The following code gives

63.50505459068998
66.40551367349616

I don’t understand why the two numbers are not the same

Any tips would be appreciated

Thanks, Jan





from rdkit import Chem
from rdkit.Chem import AllChem

smiles = "c1ccc(cc1)OCCNC[C@@H](c2ccc(c(c2)CO)O)O" #CONF_107
mol = Chem.MolFromSmiles(smiles)
mol = Chem.AddHs(mol)


m = Chem.Mol(mol)
confs = 20
AllChem.EmbedMultipleConfs(m,numConfs=confs,randomSeed=2,useExpTorsionAnglePrefs=True,useBasicKnowledge=True)
energies = AllChem.MMFFOptimizeMoleculeConfs(m,maxIters=2000)
energies_list = [e[1] for e in energies]
min_e_index = energies_list.index(min(energies_list))
#print(energies_list)
print(energies_list[min_e_index])

mol.AddConformer(m.GetConformer(min_e_index))
prop = AllChem.MMFFGetMoleculeProperties(mol, mmffVariant="MMFF94")
ff = AllChem.MMFFGetMoleculeForceField(mol,prop)
low_energy = ff.CalcEnergy()
print(low_energy)



___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] issues with explicit / implicit valence

2018-11-15 Thread Jan Halborg Jensen
I’ve written a program that does this: https://github.com/jensengroup/xyz2mol

You need to set "charged_fragments = False" to work with radicals.

Best regards, Jan

On 15 Nov 2018, at 15:05, Peter St. John <peterc.stj...@gmail.com> wrote:

Makes sense, apologies for the lack of details -- it was a bit of a convoluted 
process to get to that molecule.
Attached is a python script that hopefully reproduces it.

Essentially I'm taking the result of a Gaussian optimization (for a radical); 
constructing an SDF file with OpenBabel (via cclib), and then trying to read 
the result in RDKit.
I have the SMILES string of the radical, but the connectivity is lost in the 
gaussian output file. So the SDF that gets created by OpenBabel has to assume 
bond orders based on distances that it sometimes gets wrong.
I also had to edit the AssignBondOrdersFromTemplate function in AllChem to 
handle the radical atoms.

If you had another recommendation on going from a gaussian output file to an 
RDKit mol though, I'd certainly like to hear it.

Thanks!
-- Peter

On Wed, Nov 14, 2018 at 10:53 PM Greg Landrum <greg.land...@gmail.com> wrote:
Hi Peter,

Without seeing how you're building the molecule this one is a bit tricky to 
help with.

If I start with a standard molecule and just adjust the valence count things 
are fine:
In [22]: m = Chem.MolFromSmiles('CNC(C)C')

In [23]: m.GetAtomWithIdx(0).SetNumRadicalElectrons(1)

In [24]: mh = Chem.AddHs(m)

In [25]: print(Chem.MolToMolBlock(mh))

 RDKit  2D

 16 15  0  0  0  0  0  0  0  0999 V2000
    0.0000    0.0000    0.0000 C   0  0  0  0  0  4  0  0  0  0  0  0
    1.5000   -0.0000    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
    2.2500   -1.2990    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.9510   -2.0490    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.5490   -0.5490    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.5000    0.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    0.0000    1.5000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
   -0.0972   -0.7912    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    2.0861    1.3808    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    3.0000   -2.5981    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
   -0.3481   -2.7990    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    0.3314   -1.5474    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    1.7010   -3.3481    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    4.8481    0.2010    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    4.2990   -1.8481    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    2.9630    0.8317    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0
  2  3  1  0
  3  4  1  0
  3  5  1  0
  1  6  1  0
  1  7  1  0
  1  8  1  0
  2  9  1  0
  3 10  1  0
  4 11  1  0
  4 12  1  0
  4 13  1  0
  5 14  1  0
  5 15  1  0
  5 16  1  0
M  RAD  1   1   2
M  END


In [26]: Chem.SanitizeMol(mh)
Out[26]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE

In [27]: Chem.SanitizeMol(m)
Out[27]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE

How are you constructing the molecule with the radical?

Best,
-greg


On Wed, Nov 14, 2018 at 7:36 PM Peter St. John <peterc.stj...@gmail.com> wrote:
I have a molecule with radicals for which I'm trying to correct the bond orders.
The mol block I have currently is shown below.

Ultimately it thinks the first carbon (which is supposed to have 2 explicit 
hydrogens, 1 C-C bond, and 1 radical electron) has a valence of 5. So when I 
try to call `SanitizeMol`, it errors out with too high a valence.

for the problematic atom 'a',

>>> a.GetNumImplicitHs()

RuntimeError: Pre-condition Violation
getNumImplicitHs() called without preceding call to 
calcImplicitValence()


>>> a.GetTotalValence()

3 (odd, since this is what I want)


>>> a.UpdatePropertyCache()

ValueError: Sanitization error: Explicit valence for atom # 0 C, 5, is greater 
than permitted


And when I print the mol block, it clearly thinks that the first carbon has a 
valence of 5.

Any suggestions how to fix this?


>>> print(Chem.MolToMolBlock(mol))


9572
 RDKit  3D

 15 14  0  0  0  0  0  0  0  0999 V2000
    2.0411   -0.0455   -0.1061 C   0  0  0  0  0  5  0  0  0  0  0  0
    0.8127   -0.5644    0.2519 N   0  0  0  0  0  0  0  0  0  0  0  0
   -0.3953    0.0049   -0.3294 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.6511    1.4326    0.1487 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.5741   -0.9060   -0.0263 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.1578    0.2387   -1.1430 H   0  0  0  0  0  0  0  0  0  0  0  0
    2.9032   -0.4021    0.4366 H   0  0  0  0  0  0  0  0  0  0  0  0
    0.7154   -0.7889    1.2330 H   0  0  0  0  0  0  0  0  0  0  0  0
   -0.2282    0.0219   -1.4109 H   0  0  0  0  0  0  0  0  0  0  0  0
   -0.8463    1.4378    1.2242 H   0  0  0  0  0  0  0  0  0  0  0  0
    0.2197    2.0597   -0.0426 H   0  0  0  0  0  0  0  0  0  0  0  0
   -1.5161    1.8651   -0.3565 

Re: [Rdkit-discuss] organometallics?

2018-09-13 Thread Jan Halborg Jensen
Here’s a modest step in the right direction 
https://www.wildcardconsulting.dk/useful-information/how-to-solve-problems-with-coordinate-bonds-in-rdkit/

Best regards, Jan
On 13 Sep 2018, at 15:14, Greg Landrum <greg.land...@gmail.com> wrote:

Hi Michal,

Though the RDKit theoretically has many of the infrastructure pieces required 
to handle organometallics (though there's not a lot you can do with them once 
you've loaded them), the difficult part almost always ends up being finding 
input files that have reasonably machine-readable structures in them.

If you have some examples you can share, I'd be happy to take a look to see if 
I can suggest ways to read them in.

Best,
-greg


On Wed, Sep 12, 2018 at 10:30 PM Michal Krompiec <michal.kromp...@gmail.com> wrote:
Hello,
I've been asked to analyze a dataset of organometallic compounds (provided in 
SDF), but it turns out that most of them are not compatible with RDKit (due to 
having pi-alkene, pi-allyl, cyclopentadienyl et al. ligands). The structures 
can be correctly represented in Marvin, though. Can anybody point me to a 
toolkit (or RDKit hack) that can handle these?
Best,
Michal


Dr. Michal Krompiec
Adjunct Professor
School of Chemistry, University of Southampton
Highfield, Southampton SO17 1BJ, UK

and
Head of Computational Modelling | Performance Materials | Early Research and 
Business Development
Merck

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] enumeration of smiles question

2018-08-06 Thread Jan Halborg Jensen
This blogpost links to two other ones that may have done that (haven’t read 
them carefully): 
https://baoilleach.blogspot.com/2018/06/cheminformatics-for-deep-learners.html

Best regards, Jan

On 06 Aug 2018, at 11:57, Guillaume GODIN <guillaume.go...@firmenich.com> wrote:

Dear Greg,

Fantastic, thank you to give both explanation and solution to this “simple 
question”, I know this is not so simple & it’s fundamental for data 
augmentation in deep learning.

If I may, I have another question related, do you know if someone has worked on 
a generator of all unique smiles independently of RDKit ?

Thanks again,

Guillaume

From: Greg Landrum <greg.land...@gmail.com>
Date: Monday, 6 August 2018 at 11:40
To: Guillaume GODIN <guillaume.go...@firmenich.com>
Cc: RDKit Discuss <rdkit-discuss@lists.sourceforge.net>
Subject: Re: [Rdkit-discuss] enumeration of smiles question


On Thu, Aug 2, 2018 at 8:59 AM Guillaume GODIN <guillaume.go...@firmenich.com> wrote:

I have a simple question about generating all possible smiles of a given 
molecule:

It's a simple question, but the answer is somewhat complicated. :-)


RDKit provides only 4 different SMILES for my molecule “CCC1CC1”:
C1C(CC)C1
CCC1CC1
C1(CC)CC1
C(C)C1CC1

While by hand we can write those 7 smiles:
CCC1CC1
C(C)C1CC1
C(C1CC1)C
C1CC(CC)1
C1C(CC)C1
C1CC1CC
C(CC)1CC1

I use this function for the enumeration:

import numpy as np
from rdkit import Chem

def allsmiles(smil):
    m = Chem.MolFromSmiles(smil) # Construct a molecule from a SMILES string.
    if m is None:
        return smil
    N = m.GetNumAtoms()
    if N==0:
        return smil
    try:
        # pick a random atom to start the SMILES from
        n= np.random.randint(0,high=N)
        t= Chem.MolToSmiles(m, rootedAtAtom=n, canonical=False)
    except :
        return smil
    return t

n= 50
SMILES = ["CCC1CC1"]
SMILES_mult = [allsmiles(S) for S in SMILES for i in range(n)]

Why can't we generate all 7 SMILES?

The RDKit has rules that it uses to decide which atom to branch to when 
generating a SMILES. These are used regardless of whether you are generating 
canonical SMILES or not.
The upshot of this is that it will never generate a SMILES where there's a 
branch before a ring closure.
The other important factor here is that atom rank is determined by the index of 
the atom in the molecule when you aren't using canonicalization. So changing 
the atom order on input can help:
In [12]: set(allsmiles('CCC1CC1') for i in range(50))
Out[12]: {'C(C)C1CC1', 'C1(CC)CC1', 'C1C(CC)C1', 'CCC1CC1'}

In [13]: set(allsmiles('C1CC1CC') for i in range(50))
Out[13]: {'C(C1CC1)C', 'C1(CC)CC1', 'C1CC1CC', 'CCC1CC1'}
You can do this all at once as follows:

```
In [20]: def allsmiles(smil):
    ...:     m = Chem.MolFromSmiles(smil) # Construct a molecule from a SMILES string.
    ...:     if m is None:
    ...:         return smil
    ...:     N = m.GetNumAtoms()
    ...:     if N==0:
    ...:         return smil
    ...:     aids = list(range(N))
    ...:     random.shuffle(aids)
    ...:     m = Chem.RenumberAtoms(m,aids)
    ...:     try:
    ...:         n= random.randint(0,N-1)
    ...:         t= Chem.MolToSmiles(m, rootedAtAtom=n, canonical=False)
    ...:     except :
    ...:         return smil
    ...:     return t
    ...:

In [21]: set(allsmiles('C1CC1CC') for i in range(50))
Out[21]: {'C(C)C1CC1', 'C(C1CC1)C', 'C1(CC)CC1', 'C1C(CC)C1', 'C1CC1CC', 'CCC1CC1'}
```
Note that I switched to using python's built in random module instead of using 
the one in numpy.
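
As a complement to the renumbering approach above, here is a minimal sketch that relies on the doRandom option of Chem.MolToSmiles. This assumes an RDKit version that supports doRandom; the helper name random_smiles and the number of tries are only illustrative. It also checks that every variant parses back to the same canonical SMILES.

```python
from rdkit import Chem


def random_smiles(smiles, n_tries=200):
    """Collect distinct non-canonical SMILES for one molecule (hypothetical helper)."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return set()
    # doRandom=True randomizes the atom traversal order on every call
    return {Chem.MolToSmiles(mol, canonical=False, doRandom=True)
            for _ in range(n_tries)}


variants = random_smiles("CCC1CC1")
# Every variant should describe the same molecule
canonical = {Chem.MolToSmiles(Chem.MolFromSmiles(s)) for s in variants}
print(sorted(variants))
print(canonical)
```

Because the sampling is random, the number of distinct strings found can differ from run to run.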

-greg




Thanks guys,

Best regards,

Guillaume

Re: [Rdkit-discuss] 3D structure generator

2018-07-14 Thread Jan Halborg Jensen
Dear Jahn

I've written some code that might be useful to you. More info here:
http://proteinsandwavefunctions.blogspot.com/2017/12/tsconfsearch-conformer-search-for.html

Best regards, Jan

From: Jahn Nitschke [jahn.nitsc...@uni-konstanz.de]
Sent: Friday, July 13, 2018 5:19 PM
To: rdkit-discuss@lists.sourceforge.net
Subject: [Rdkit-discuss] 3D structure generator

Dear community,

I work with structural variants of macrocycles and want to generate 3D 
conformers from SMILES. However, the implemented methods have problems dealing 
with the macrocycle and generate frequently non-planar or cis-peptide bonds. I 
know (from MD) that the macrocycle conformation itself is quite stable. 
Therefore, I wondered if the conformer generators could be used in such a way 
that the macrocycle atoms are fixed a priori and the sidechain atom coordinates 
would be generated by the structure generator. What do you think?

Best,
Jahn




Re: [Rdkit-discuss] ReactionFromSmarts on bimolecular systems

2018-06-25 Thread Jan Halborg Jensen
Thanks, Greg!

I suspected as much so I started on a workaround, which I’ll post here for the 
record

Best regards, Jan

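from rdkit import Chem
from rdkit.Chem import AllChem

mol = Chem.MolFromSmiles('COC.C=C.C=CC')

rxn_smarts = ['[C:1]=[C:2]>>[*:1][*:2]']

# treat each disconnected fragment separately so that none is dropped
fragments = Chem.GetMolFrags(mol,asMols=True)

for i,fragment in enumerate(fragments):
    for smarts in rxn_smarts:
        patt = Chem.MolFromSmarts(smarts.split(">>")[0])
        # keep applying the reaction until the pattern no longer matches
        while fragment.HasSubstructMatch(patt):
            rxn = AllChem.ReactionFromSmarts(smarts)
            ps = rxn.RunReactants((fragment,))
            fragment = ps[0][0]
    # stitch the (possibly transformed) fragments back together
    if i == 0:
        mol = fragment
    else:
        mol = Chem.CombineMols(mol,fragment)

print(Chem.MolToSmiles(mol))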


On 25 Jun 2018, at 15:14, Greg Landrum <greg.land...@gmail.com> wrote:

Hi Jan,

Not at the moment. The reaction code brings across mapped atoms in the 
reactants along with anything reachable from them via a bond.
It's probably not that hard to add an option to allow this to happen, but it 
seems like something of an unusual use case.

If you want to require that there's an additional fragment in the reactants, 
you could do the following:
In [17]: mol = Chem.MolFromSmiles('COC.C=C')
...:
...: rxn_smarts = '([*:3].[C:1]=[C:2])>>([*:3].[*:1][*:2])'
...:
...: rxn = AllChem.ReactionFromSmarts(rxn_smarts)
...: ps = rxn.RunReactants((mol,))
...: new_mol = ps[0][0]
...: print(Chem.MolToSmiles(new_mol))
...:
...:
CC.COC

But that won't work if you don't have a second fragment there

-greg



On Mon, Jun 25, 2018 at 5:10 AM Jan Halborg Jensen <jhjen...@chem.ku.dk> wrote:
The following code returns ‘CC’, i.e. the COC molecule is removed. I had a look 
at the documentation for RunReactants and do not see a way to change this.

Is there any way to get the code to return ‘COC.CC’?

Thanks, Jan


mol = Chem.MolFromSmiles('COC.C=C')

rxn_smarts = '[C:1]=[C:2]>>[*:1][*:2]'

rxn = AllChem.ReactionFromSmarts(rxn_smarts)
ps = rxn.RunReactants((mol,))
new_mol = ps[0][0]
print(Chem.MolToSmiles(new_mol))


[Rdkit-discuss] ReactionFromSmarts on bimolecular systems

2018-06-25 Thread Jan Halborg Jensen
The following code returns ‘CC’, i.e. the COC molecule is removed. I had a look 
at the documentation for RunReactants and do not see a way to change this.  

Is there any way to get the code to return ‘COC.CC’?

Thanks, Jan


from rdkit import Chem
from rdkit.Chem import AllChem

mol = Chem.MolFromSmiles('COC.C=C')

rxn_smarts = '[C:1]=[C:2]>>[*:1][*:2]'

rxn = AllChem.ReactionFromSmarts(rxn_smarts)
ps = rxn.RunReactants((mol,))
new_mol = ps[0][0]
print(Chem.MolToSmiles(new_mol))


Re: [Rdkit-discuss] convert a smiles file to a xyz file

2018-05-24 Thread Jan Halborg Jensen
Have a look at write_xtb_input_file in this module: 
https://github.com/jensengroup/take_elementary_step/blob/master/write_input_files.py

The xtb input is simply an xyz file with some additional lines below if the
molecule is charged. You can simply remove those lines in the code.

Best regards, Jan
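
For readers who only need the xyz text, a minimal sketch using RDKit alone could look like the following. The function name smiles_to_xyz and the seed are illustrative assumptions; the structure is a force-field-optimized conformer generated from the SMILES, so it should not be expected to coincide with a crystal structure.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def smiles_to_xyz(smiles, seed=42):
    """Return an XYZ-format string for one generated 3D conformer (hypothetical helper)."""
    mol = Chem.AddHs(Chem.MolFromSmiles(smiles))
    # EmbedMolecule returns the conformer id, or -1 if embedding failed
    if AllChem.EmbedMolecule(mol, randomSeed=seed) < 0:
        raise ValueError("could not embed " + smiles)
    AllChem.MMFFOptimizeMolecule(mol)
    return Chem.MolToXYZBlock(mol)


xyz = smiles_to_xyz("CCO")
print(xyz)
```

The first line of the block is the atom count, the second is a comment line, and the remaining lines hold one element symbol and three coordinates each.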

On 23 May 2018, at 17:23, Chenyang Shi wrote:

Hi Everyone,

I am looking for help with converting a SMILES file to a set of coordinates
for the molecule, in xyz format.
I saw some online service that can do the job (e.g. 
http://www.cheminfo.org/Chemistry/Cheminformatics/FormatConverter/index.html), 
but it is not convenient to use.

I am wondering how we can do this by writing RDKit code. A separate question:
is the molecular structure converted from SMILES the same as the one taken
from a crystal structure?

Many thanks!
Chenyang


Re: [Rdkit-discuss] problems with EmbedMol

2018-02-28 Thread Jan Halborg Jensen
Hi Brian

My interest stems from this paper http://dx.doi.org/10.1039/c5cp04706d where 
they alter the bounds matrix to impose constraints on the embedding.

In principle this could also be done with EmbedMolecule and coordMap but that 
doesn't seem to be working
https://sourceforge.net/p/rdkit/mailman/message/36139015/

Best regards, Jan

From: Bennion, Brian [benni...@llnl.gov]
Sent: Wednesday, February 28, 2018 7:09 PM
To: Greg Landrum
Cc: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] problems with EmbedMol


Hello Greg and Jan,


This is a real newbie question, but what is the use case for this function?  Is 
it used to generate all possible connections (limited by some distance) between 
3 or more atoms given in a smiles string?


Brian



From: Greg Landrum <greg.land...@gmail.com>
Sent: Wednesday, February 28, 2018 8:53:59 AM
To: Jan Halborg Jensen
Cc: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] problems with EmbedMol

Hi Jan,

It took me much longer than it should have to figure this one out...

The bounds matrix that is returned by GetMoleculeBoundsMatrix() needs to have 
triangle bounds smoothing applied to it before it can be embedded. The bounds 
smoothing process narrows the possible distance ranges between the atoms. 
Here's a quick demo of that.

We start with your example:

In [19]: mol = Chem.MolFromSmiles("CCC")
...: mol = Chem.AddHs(mol)
...: bounds = AllChem.GetMoleculeBoundsMatrix(mol)
...: EmbedLib.EmbedMol(mol,bounds)
...:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
 in ()
      2 mol = Chem.AddHs(mol)
      3 bounds = AllChem.GetMoleculeBoundsMatrix(mol)
----> 4 EmbedLib.EmbedMol(mol,bounds)

c:\Users\glandrum\RDKit_git\rdkit\Chem\Pharm3D\EmbedLib.py in EmbedMol(mol, bm, atomMatch, weight, randomSeed, excludedVolumes)
    183   for i in range(nAts):
    184     weights.append((i, idx, weight))
--> 185   coords = DG.EmbedBoundsMatrix(bm, weights=weights, numZeroFail=1, randomSeed=randomSeed)
    186   # for row in coords:
    187   #  print(', '.join(['%.2f'%x for x in row]))

ValueError: could not embed matrix


But if we do the triangle bounds smoothing things embed without problems:

In [20]: from rdkit import DistanceGeometry

In [21]: DistanceGeometry.DoTriangleSmoothing(bounds)
Out[21]: True

In [22]: EmbedLib.EmbedMol(mol,bounds)

In [23]:

There is a good argument to be made for GetMoleculeBoundsMatrix() returning the 
smoothed bounds matrix by default. I'll put that on the list for the next 
release.

Best,
-greg
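
To illustrate what "narrowing the distance ranges" means, here is a toy pure-Python sketch of triangle smoothing on a three-point bounds problem. It is not the RDKit implementation: the function triangle_smooth, the separate lower/upper matrices and the numbers are assumptions made for illustration only.

```python
def triangle_smooth(lower, upper, tol=1e-9):
    """Tighten symmetric lower/upper distance bounds in place (toy version).

    Returns False if some lower bound ends up above its upper bound.
    """
    n = len(upper)
    changed = True
    while changed:
        changed = False
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                for k in range(n):
                    if k == i or k == j:
                        continue
                    # Upper bound: d(i,j) <= d(i,k) + d(k,j)
                    via_k = upper[i][k] + upper[k][j]
                    if via_k < upper[i][j] - tol:
                        upper[i][j] = via_k
                        changed = True
                    # Lower bound: d(i,j) >= d(i,k) - d(k,j), in both directions
                    gap = max(lower[i][k] - upper[k][j],
                              lower[j][k] - upper[k][i])
                    if gap > lower[i][j] + tol:
                        lower[i][j] = gap
                        changed = True
                if lower[i][j] > upper[i][j] + tol:
                    return False
    return True


# Points 0-1 are 5.0-5.5 apart, points 1-2 are 1.0-1.2 apart,
# and the 0-2 distance starts out essentially unconstrained.
lower = [[0.0, 5.0, 0.0], [5.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
upper = [[0.0, 5.5, 100.0], [5.5, 0.0, 1.2], [100.0, 1.2, 0.0]]
ok = triangle_smooth(lower, upper)
print(ok, lower[0][2], upper[0][2])
```

After smoothing, the loose 0-2 range is pulled in to roughly 3.8 to 6.7, which is the kind of tightening the embedding step depends on.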



On Wed, Feb 28, 2018 at 10:41 AM, Jan Halborg Jensen <jhjen...@chem.ku.dk> wrote:
The following code works fine with ethane (CC) but for propane (CCC) or 
anything else I get the following error
ValueError: could not embed matrix

Any ideas or solutions would be appreciated

Best regards, Jan


from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem.Pharm3D import EmbedLib


mol = Chem.MolFromSmiles("CCC")
mol = Chem.AddHs(mol)
bounds = AllChem.GetMoleculeBoundsMatrix(mol)

EmbedLib.EmbedMol(mol,bounds)
EmbedLib.OptimizeMol(mol, bounds)



[Rdkit-discuss] problems with EmbedMol

2018-02-28 Thread Jan Halborg Jensen
The following code works fine with ethane (CC) but for propane (CCC) or 
anything else I get the following error
ValueError: could not embed matrix

Any ideas or solutions would be appreciated

Best regards, Jan


from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem.Pharm3D import EmbedLib


mol = Chem.MolFromSmiles("CCC")
mol = Chem.AddHs(mol)
bounds = AllChem.GetMoleculeBoundsMatrix(mol)

EmbedLib.EmbedMol(mol,bounds)
EmbedLib.OptimizeMol(mol, bounds)


Re: [Rdkit-discuss] Finding possible reaction products

2018-02-06 Thread Jan Halborg Jensen
One option is to construct a library of reaction SMARTS for common chemical 
reactions.

Another, more exhaustive, approach is to enumerate all possible connectivity
matrices and convert them to molecules.
See DOI: 10.1039/C7SC03628K and https://github.com/jensengroup/xyz2mol
I am working on implementing this approach in my spare time.

You might also want to look at these papers: 10.1021/ct9003383 and 
10.1002/jcc.23271

Best regards, Jan
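
As a rough sketch of the first option (a library of reaction SMARTS), the snippet below applies a one-entry library to a pair of reactants. The REACTIONS dictionary, the helper enumerate_products and the deliberately crude imine-formation SMARTS are illustrative assumptions, not a curated reaction set; the reactants must be given in the order used by the templates.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Hypothetical one-entry library: carbonyl + primary amine -> imine + water
REACTIONS = {
    "imine formation": "[C:1]=[O:2].[NX3;H2:3]>>[C:1]=[N:3].[O:2]",
}


def enumerate_products(reactant_smiles, reactions=REACTIONS):
    """Run every reaction in the library and collect sanitized product sets."""
    reactants = tuple(Chem.MolFromSmiles(s) for s in reactant_smiles)
    found = {}
    for name, smarts in reactions.items():
        rxn = AllChem.ReactionFromSmarts(smarts)
        if rxn.GetNumReactantTemplates() != len(reactants):
            continue
        products = set()
        for product_set in rxn.RunReactants(reactants):
            smiles = []
            for product in product_set:
                try:
                    # products come back unsanitized
                    Chem.SanitizeMol(product)
                except Exception:
                    break
                smiles.append(Chem.MolToSmiles(product))
            else:
                products.add(".".join(sorted(smiles)))
        found[name] = products
    return found


result = enumerate_products(["CC=O", "CN"])  # acetaldehyde + methylamine
print(result)
```

A real library would need much more specific patterns (this carbonyl pattern also matches amides and acids, for instance), and product sets that fail sanitization are simply skipped here.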

On 06 Feb 2018, at 10:34, Francisco Leskovar wrote:

Hi all!

I was wondering what the best approach is for generating all possible
products for a given set of reactants using RDKit. My goal is to find all
possible products and then carry out a Nudged Elastic Band calculation to
estimate the rate constants of those reactions.

I would like to use this approach for detecting degradation pathways in
pharmaceutical dosage forms. For example, if a molecule with a carbonyl group
and a molecule with a nucleophilic amino group are specified, I would like to
be able to automatically predict a Maillard reaction.

Thank you so much for your help and time.
Kind regards,
Francisco


[Rdkit-discuss] atom mapping using GetSubstructMatch

2018-01-29 Thread Jan Halborg Jensen
I am trying to map atoms in one molecule to those of another using
GetSubstructMatch, but have a problem with molecule pairs like the two below.

The problem is that GetSubstructMatch ignores the H on the n, so that the
protonated "n" in mol1 is mapped to the deprotonated "n" in mol2, and vice
versa.

Is there a way to fix this?

Best regards, Jan



smiles1 = "ClCc1[nH]c(c2n1)2"
smiles2 = "c1(I)cc(I)c2c(c1)nc(CCl)[nH]2"

mol1 = Chem.MolFromSmiles(smiles1)
mol2 = Chem.MolFromSmiles(smiles2)

order_mol1 = list(mol1.GetSubstructMatch(mol1))
order_mol2 = list(mol2.GetSubstructMatch(mol1))

print(order_mol2)

atom_map = dict(zip(order_mol1,order_mol2))
print(atom_map)

Draw.DrawingOptions.includeAtomNumbers=True
img = Draw.MolsToGridImage([mol1,mol2],molsPerRow=4,subImgSize=(200,200),useSVG=False)
img
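
One possible way around this, sketched here on a smaller stand-in molecule (4-methylimidazole, not the molecules from the post), is to use a SMARTS query that spells out the hydrogen counts. As far as I can tell a molecule built from SMILES does not compare H counts when it is used as a query, whereas SMARTS primitives such as [nH] and [nH0] are matched explicitly.

```python
from rdkit import Chem

# Stand-in target: 4-methylimidazole; atom 3 is the bare n, atom 5 is the [nH]
target = Chem.MolFromSmiles("Cc1cnc[nH]1")

# Query built from SMILES: hydrogen counts are not part of the atom comparison
loose_query = Chem.MolFromSmiles("c1c[nH]cn1")
# Query built from SMARTS: [nH] needs exactly one H, [nH0] needs none
strict_query = Chem.MolFromSmarts("c1c[nH]c[nH0]1")

loose_matches = target.GetSubstructMatches(loose_query, uniquify=False)
strict_matches = target.GetSubstructMatches(strict_query, uniquify=False)

print("SMILES query:", loose_matches)
print("SMARTS query:", strict_matches)

# Query atom 2 is the protonated nitrogen; report which target atom it lands on
for match in strict_matches:
    atom = target.GetAtomWithIdx(match[2])
    print(atom.GetSymbol(), atom.GetIdx(), atom.GetTotalNumHs())
```

Adding explicit hydrogens to both molecules with Chem.AddHs before matching is another option, at the cost of having the H atoms appear in the mapping.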


[Rdkit-discuss] changing atomic charges with ReactionFromSmarts

2018-01-25 Thread Jan Halborg Jensen

The following code changes the bond order correctly but does not change the 
charges accordingly

Any idea what I am doing wrong?

Thanks, Jan


from rdkit import Chem
from rdkit.Chem import AllChem

def clean_charges(mol):
    rxn_smarts = ['[N+:1]=[*:2]-[O-:3]>>[N:1]-[*:2]=[O:3]']

    for smarts in rxn_smarts:
        rxn = AllChem.ReactionFromSmarts(smarts)
        ps = rxn.RunReactants((mol,))
        for x in ps:
            mol = x[0]

    #rdmolops.SanitizeMol(mol)
    return mol

mol = Chem.MolFromSmiles("C[NH+]=C(C)[O-]")

mol = clean_charges(mol)
print(Chem.MolToSmiles(mol))
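
A likely explanation is that mapped product atoms keep the reactant's formal charge unless the product template states a charge explicitly. The sketch below is a variant of the function above (not a confirmed fix from this thread) that writes +0 on the product atoms and sanitizes the result.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def clean_charges(mol):
    """Recombine a 1,3-separated N+/O- pair into a neutral amide-like form."""
    # +0 on the product atoms asks for the formal charge to be reset to zero
    smarts = "[N+:1]=[*:2]-[O-:3]>>[N+0:1]-[*:2]=[O+0:3]"
    rxn = AllChem.ReactionFromSmarts(smarts)
    products = rxn.RunReactants((mol,))
    if not products:
        return mol  # pattern not present, nothing to do
    new_mol = products[0][0]
    Chem.SanitizeMol(new_mol)
    return new_mol


cleaned = clean_charges(Chem.MolFromSmiles("C[NH+]=C(C)[O-]"))
print(Chem.MolToSmiles(cleaned))
print([atom.GetFormalCharge() for atom in cleaned.GetAtoms()])
```

Only the first product set is used here; a molecule with several such charge pairs would need the reaction applied repeatedly.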


Re: [Rdkit-discuss] edge matrix

2018-01-18 Thread Jan Halborg Jensen
Dear Guillaume

I understand that the adjacency matrix, together with the atom list, holds all 
the necessary information once bond orders are included, to define the molecule 
object, but do you know if there is an RDKit function that does this: something 
like  mol = Chem.MolFromAdjacencyMatrix(adj)?

Best regards, Jan

On 18 Jan 2018, at 11:07, Guillaume GODIN <guillaume.go...@firmenich.com> wrote:

Dear Jan,

An adjacency matrix encodes a molecule object.

For the reverse process:

You will need the list of atoms.

You will also need the bond orders encoded in the adjacency matrix to recover
the exact structure, i.e. double bonds => 2, triple bonds => 3, aromatic
bonds => 1.5.

In principle, though, that is all you need.

BR,

Guillaume
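
A minimal sketch of that reverse process, assuming an atom list plus a matrix of integer bond orders (aromatic 1.5 entries are left out to keep it short). The helper mol_from_adjacency is a made-up name, not an RDKit function; it simply builds the molecule through RWMol.

```python
from rdkit import Chem

BOND_TYPES = {
    1: Chem.BondType.SINGLE,
    2: Chem.BondType.DOUBLE,
    3: Chem.BondType.TRIPLE,
}


def mol_from_adjacency(symbols, bond_orders):
    """Build a molecule from element symbols and a symmetric bond-order matrix."""
    rw = Chem.RWMol()
    for symbol in symbols:
        rw.AddAtom(Chem.Atom(symbol))
    n = len(symbols)
    for i in range(n):
        for j in range(i + 1, n):  # upper triangle only, the matrix is symmetric
            order = bond_orders[i][j]
            if order:
                rw.AddBond(i, j, BOND_TYPES[order])
    mol = rw.GetMol()
    Chem.SanitizeMol(mol)  # perceives valences, implicit Hs and aromaticity
    return mol


# Acetaldehyde heavy atoms: C-C single bond, C=O double bond
symbols = ["C", "C", "O"]
bond_orders = [[0, 1, 0],
               [1, 0, 2],
               [0, 2, 0]]
rebuilt = mol_from_adjacency(symbols, bond_orders)
print(Chem.MolToSmiles(rebuilt))
```

Formal charges are not recoverable from the matrix alone, so charged species would need that information supplied separately.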

From: Jan Halborg Jensen
Date: Thursday, 18 January 2018 at 10:59
To: GVALMTGG
Cc: Mario Lovrić, RDKit Discuss
Subject: Re: [Rdkit-discuss] edge matrix

Is there a function for the reverse process, i.e. getting a molecule object 
from an adjacency matrix?

Best regards, Jan

On 17 Jan 2018, at 17:19, Guillaume GODIN <guillaume.go...@firmenich.com> wrote:

Dear Mario,

There is an adjacency matrix available:

from rdkit import Chem
mol = Chem.MolFromSmiles('CC(C)CC')
adj = Chem.GetAdjacencyMatrix(mol)
print(adj)

[[0 1 0 0 0]
 [1 0 1 1 0]
 [0 1 0 0 0]
 [0 1 0 0 1]
 [0 0 0 1 0]]

But this is not what you want…

Can you explain your output generation process please ?

BR,

Guillaume


From: Mario Lovrić
Date: Wednesday, 17 January 2018 at 16:31
To: RDKit Discuss
Subject: [Rdkit-discuss] edge matrix

Dear all,

Does anyone have an idea how to get an edge matrix (graph theory) out of
RDKit? I dug deep but didn't find anything.

For example, for:

'CC(C)CC'


it would be:

array([[0, 1, 1, 0],
   [1, 0, 1, 0],
   [1, 1, 0, 1],
   [0, 0, 1, 0]])

Thanks.


--
Mario Lovrić
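
The 4x4 matrix in the question looks like the edge adjacency matrix (the adjacency matrix of the line graph): two bonds are connected when they share an atom. Under that reading, a small pure-Python sketch is enough once the bond list is known. The bond list below is written out by hand for 'CC(C)CC'; with RDKit it could presumably be collected from mol.GetBonds() using GetBeginAtomIdx() and GetEndAtomIdx().

```python
def edge_adjacency(bonds):
    """Return the edge adjacency matrix for a list of (atom_i, atom_j) bonds."""
    n = len(bonds)
    matrix = [[0] * n for _ in range(n)]
    for a in range(n):
        for b in range(a + 1, n):
            # two bonds are adjacent when they have an atom in common
            if set(bonds[a]) & set(bonds[b]):
                matrix[a][b] = matrix[b][a] = 1
    return matrix


# Bonds of 'CC(C)CC' with atoms numbered in SMILES order
bonds = [(0, 1), (1, 2), (1, 3), (3, 4)]
edge_matrix = edge_adjacency(bonds)
for row in edge_matrix:
    print(row)
```

For this bond ordering the result reproduces the matrix given in the question.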


Re: [Rdkit-discuss] edge matrix

2018-01-18 Thread Jan Halborg Jensen
Is there a function for the reverse process, i.e. getting a molecule object 
from an adjacency matrix?

Best regards, Jan

On 17 Jan 2018, at 17:19, Guillaume GODIN wrote:

Dear Mario,

There is an adjacency matrix available:

from rdkit import Chem
mol = Chem.MolFromSmiles('CC(C)CC')
adj = Chem.GetAdjacencyMatrix(mol)
print(adj)

[[0 1 0 0 0]
 [1 0 1 1 0]
 [0 1 0 0 0]
 [0 1 0 0 1]
 [0 0 0 1 0]]

But this is not what you want…

Can you explain your output generation process please ?

BR,

Guillaume


From: Mario Lovrić
Date: Wednesday, 17 January 2018 at 16:31
To: RDKit Discuss
Subject: [Rdkit-discuss] edge matrix

Dear all,

Does anyone have an idea how to get an edge matrix (graph theory) out of
RDKit? I dug deep but didn't find anything.

For example, for:

'CC(C)CC'


it would be:

array([[0, 1, 1, 0],
   [1, 0, 1, 0],
   [1, 1, 0, 1],
   [0, 0, 1, 0]])

Thanks.


--
Mario Lovrić


[Rdkit-discuss] RDKit and Binder

2017-12-06 Thread Jan Halborg Jensen
Does anyone have experience sharing RDKit scripts using https://mybinder.org/?
If so, could you share an example?

Best regards, Jan


[Rdkit-discuss] Problems with coordMap in EmbedMultipleConfs

2017-11-28 Thread Jan Halborg Jensen
The following code should produce 5 conformers of "c1ccccc1CCC" in which the
coordinates of the benzene ring are the same. But it doesn't. What am I doing
wrong?

Best regards, Jan


from rdkit import Chem
from rdkit.Chem import AllChem

template_smiles = "c1ccccc1"
template = Chem.MolFromSmiles(template_smiles)
template = Chem.AddHs(template)

AllChem.EmbedMolecule(template)

prop = AllChem.MMFFGetMoleculeProperties(template, mmffVariant="MMFF94")
ff =AllChem.MMFFGetMoleculeForceField(template,prop)
ff.Minimize()

Chem.SDWriter("template.sdf").write(template)

mol_smiles = "c1ccccc1CCC"
mol = Chem.MolFromSmiles(mol_smiles)
mol = Chem.AddHs(mol)

core = Chem.MolFromSmiles(template_smiles)

mol_match = mol.GetSubstructMatch(core)
template_match = template.GetSubstructMatch(core)

print(mol_match, template_match)

# map each core atom of the molecule to the template coordinates
coordMap = {}
templateConf = template.GetConformer(-1)
for i_template, i_mol in zip(template_match,mol_match):
    corePtI = templateConf.GetAtomPosition(i_template)
    #print corePtI.x, corePtI.y, corePtI.z
    coordMap[i_mol] = corePtI

AllChem.EmbedMultipleConfs(mol,5,coordMap=coordMap)

w = Chem.SDWriter('conformers.sdf')
for conf in mol.GetConformers():
    tm = Chem.Mol(mol,False,conf.GetId())
    w.write(tm)
w.close()


[Rdkit-discuss] Rigid 3D alignment of molecule to fragment

2017-11-17 Thread Jan Halborg Jensen

Is there a way to rigidly align a 3D molecule to a fragment?

I want to compare several conformations by aligning a rigid part of the molecule

Best regards, Jan
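
One way this could be done, sketched on propylbenzene as an assumed example molecule, is to superimpose all conformers on the atoms of the rigid part with AlignMolConformers. The core SMARTS, the seed and the number of conformers are illustrative choices.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Example molecule (an assumption): propylbenzene with a rigid benzene core
mol = Chem.AddHs(Chem.MolFromSmiles("c1ccccc1CCC"))
conf_ids = list(AllChem.EmbedMultipleConfs(mol, numConfs=5, randomSeed=7))

# Atom indices of the rigid fragment within the molecule
core = Chem.MolFromSmarts("c1ccccc1")
core_atoms = list(mol.GetSubstructMatch(core))

# Rigidly superimpose every conformer on the first one, using only the core atoms
rms_list = []
AllChem.AlignMolConformers(mol, atomIds=core_atoms, RMSlist=rms_list)
print(core_atoms)
print(rms_list)
```

The conformers themselves are not distorted; only their position and orientation change. To align against a separate fragment molecule instead, AllChem.AlignMol with an atomMap of (probe index, reference index) pairs should serve the same purpose.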


[Rdkit-discuss] Problem with reading/writing files?

2017-11-15 Thread Jan Halborg Jensen
The following code
1. energy-minimizes a molecule and computes the energy,
2. writes the coordinates to an sdf file,
3. reads the file back in and re-computes the energy.

But the two energies are off by 4 kcal/mol.

Any idea what I am doing wrong?

Best regards, Jan



from rdkit import Chem
from rdkit.Chem import AllChem

smiles = "O=C(Nc1ccc(OCC(O)CNC(C)C)c(c1)C(=O)C)CCC"
m = Chem.MolFromSmiles(smiles)
m = Chem.AddHs(m)
AllChem.EmbedMolecule(m,randomSeed=2)

AllChem.MMFFOptimizeMolecule(m,maxIters=1000,mmffVariant="MMFF94")
prop = AllChem.MMFFGetMoleculeProperties(m, mmffVariant="MMFF94")
ff =AllChem.MMFFGetMoleculeForceField(m,prop)
print(ff.CalcEnergy())

file = "e_test.sdf"
writer = Chem.SDWriter(file)
writer.write(m)
writer.close()

new_m = Chem.MolFromMolFile(file)
#new_m = Chem.SDMolSupplier(file)[0]

prop = AllChem.MMFFGetMoleculeProperties(new_m, mmffVariant="MMFF94")
ff =AllChem.MMFFGetMoleculeForceField(new_m,prop)
print(ff.CalcEnergy())
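
One thing worth checking in a round trip like this is whether the hydrogens survive: Chem.MolFromMolFile removes them by default (removeHs=True), which changes what the force field sees. The sketch below is a guess at the cause rather than a confirmed diagnosis; it uses a smaller stand-in molecule and a temporary file, and reads the file back with removeHs=False.

```python
import os
import tempfile

from rdkit import Chem
from rdkit.Chem import AllChem


def mmff_energy(mol):
    """MMFF94 energy of the molecule's first conformer."""
    props = AllChem.MMFFGetMoleculeProperties(mol, mmffVariant="MMFF94")
    return AllChem.MMFFGetMoleculeForceField(mol, props).CalcEnergy()


# Smaller stand-in molecule (an assumption, not the one from the post)
mol = Chem.AddHs(Chem.MolFromSmiles("CC(C)NCC(O)CO"))
AllChem.EmbedMolecule(mol, randomSeed=2)
AllChem.MMFFOptimizeMolecule(mol, maxIters=1000, mmffVariant="MMFF94")
e_before = mmff_energy(mol)

path = os.path.join(tempfile.mkdtemp(), "e_test.sdf")
writer = Chem.SDWriter(path)
writer.write(mol)
writer.close()  # make sure the file is flushed before reading it back

# Keep the explicit hydrogens when reading the file back in
reloaded = Chem.MolFromMolFile(path, removeHs=False)
e_after = mmff_energy(reloaded)
print(e_before, e_after)
```

Any small remaining difference would come from the limited number of decimals stored for coordinates in the file.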


[Rdkit-discuss] Atom mapping

2017-10-04 Thread Jan Halborg Jensen
I just came across this paper,
http://pubs.acs.org/doi/abs/10.1021/acs.jctc.7b00764, which presents a variant
of the Fewest Bonds First with Constructive Count Vector method by Crabtree and
Mehta (https://dl.acm.org/citation.cfm?id=1498697).

Has anyone implemented anything similar with RDKit?

I'd also be interested in any other atom mapping code using RDKit

Best regards, Jan


[Rdkit-discuss] remove H using ReactionFromSmarts (i.e. creating radicals)

2017-08-28 Thread Jan Halborg Jensen
Is it possible to remove hydrogens using ReactionFromSmarts, for example
changing Cc1ccccc1 to [CH2]c1ccccc1?

I can do it using ReplaceSubstructs, but I am trying to write more general
code that also does other transformations that are better done with
ReactionFromSmarts. I can also create [CH2+]c1ccccc1 and then remove the +
from the SMILES string, but that is a hack.

Here's sample code that, unfortunately, produces Cc1ccccc1:

from rdkit import Chem
from rdkit.Chem import AllChem

m = Chem.MolFromSmiles("Cc1ccccc1")
rxn_smarts = '[CX4;H3:1]>>[CX4;H2:1]'

rxn = AllChem.ReactionFromSmarts(rxn_smarts)
newmol = rxn.RunReactants((m,))
print(Chem.MolToSmiles(newmol[0][0],canonical=False))




[Rdkit-discuss] Conformational search not "converging" to low energy conformation

2017-06-12 Thread Jan Halborg Jensen
The table below shows the lowest energy found for 6 different protomers,
defined by the SMILES strings below, as a function of the number of
conformers. Even with 2000 conformers I am not getting convergence to within
1 kcal/mol for comp109_1=2.

Is this expected? Any advice or tips appreciated.

Conformers        200      800     1000     2500
comp109_0=0      0.16     0.16     0.16     0.16
comp109_1=1    -25.18   -24.43   -25.08   -25.18
comp109_1=2    -16.42   -16.24   -16.21   -15.15
comp109_0=3    -24.05   -24.16   -24.09   -23.96
comp109_1=5    -38.4    -37.38   -38.28   -38.32
comp109_2=8      0.18     1.38    -0.24     0.08

Code

import sys
from rdkit import Chem
from rdkit.Chem import AllChem

confs = 2500
e_cut = 20.0
decimals_in_energies = 2

filename = sys.argv[1]
file = open(filename, "r")

for line in file:
    words = line.split()
    name = words[0]
    smiles = words[1]

    m = Chem.AddHs(Chem.MolFromSmiles(smiles))

    AllChem.EmbedMultipleConfs(m,numConfs=confs)

    AllChem.MMFFOptimizeMoleculeConfs(m,numThreads=8,maxIters=1000,mmffVariant="MMFF94")

    # re-compute the MMFF energy of every optimized conformer
    energies = []
    for conf in m.GetConformers():
        tm = Chem.Mol(m,False,conf.GetId())
        prop = AllChem.MMFFGetMoleculeProperties(tm, mmffVariant="MMFF94")
        ff =AllChem.MMFFGetMoleculeForceField(tm,prop)
        energies.append(round(ff.CalcEnergy(),decimals_in_energies))

    e_min = min(energies)
    print(e_min)


Smiles file
comp109_0=0 C[C@@]1(c2cc(NC(=O)c3ccc(C#N)cn3)ccc2F)C[C@H](C(F)(F)F)OC(N)=N1
comp109_1=1 C[C@@]1(c2cc(NC(=O)c3ccc(C#N)cn3)ccc2F)C[C@H](C(F)(F)F)OC(=[NH2+])N1
comp109_1=2 C[C@@]1(c2cc(NC(=O)c3ccc(C#N)c[nH+]3)ccc2F)C[C@H](C(F)(F)F)OC(N)=N1
comp109_0=3 C[C@@]1(c2cc(NC(=O)c3ccc(C#N)cn3)ccc2F)C[C@H](C(F)(F)F)OC(=N)N1
comp109_1=5 C[C@@]1(c2cc(NC(=O)c3ccc(C#N)c[nH+]3)ccc2F)C[C@H](C(F)(F)F)OC(=N)N1
comp109_2=8 C[C@@]1(c2cc(NC(=O)c3ccc(C#N)c[nH+]3)ccc2F)C[C@H](C(F)(F)F)OC(=[NH2+])N1
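
When checking convergence like this, it may help to use the values returned by MMFFOptimizeMoleculeConfs directly: it returns one (not_converged, energy) pair per conformer, so conformers whose minimization did not finish can be filtered out. The following is a minimal sketch on a small stand-in molecule; the helper name, the seed and the conformer counts are illustrative assumptions.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def lowest_mmff_energy(smiles, num_confs, seed=1234):
    """Lowest MMFF94 energy over num_confs embedded and optimized conformers."""
    mol = Chem.AddHs(Chem.MolFromSmiles(smiles))
    AllChem.EmbedMultipleConfs(mol, numConfs=num_confs, randomSeed=seed)
    # One (not_converged, energy) tuple per conformer; 0 means converged
    results = AllChem.MMFFOptimizeMoleculeConfs(
        mol, maxIters=1000, mmffVariant="MMFF94")
    converged = [energy for flag, energy in results if flag == 0]
    if not converged:
        # fall back to all conformers if none reported convergence
        converged = [energy for flag, energy in results]
    return min(converged)


for n in (5, 10, 20):
    print(n, round(lowest_mmff_energy("CCCCO", n), 2))
```

Fixing randomSeed makes runs reproducible, which helps to separate sampling noise from genuine differences between conformer counts.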


Re: [Rdkit-discuss] RDkit Molecule Fragmenter

2017-06-06 Thread Jan Halborg Jensen
I was also searching for this functionality earlier.

For what it's worth, here's some *very* simple code I hacked together to do
fragmentation. The focus is aromatic heterocycles, but it could be made more
general by, for example, changing '[c,n]-[*]' to '[R]-[*]' and using
ring = Chem.MolFromSmarts('[R]') instead of '[n]'.

Not pretty, but it worked for me.

Best regards, Jan


import sys, os, re
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem import Draw
from rdkit.Chem.Draw import IPythonConsole

rings_mol = []
rings_smiles = []

substituent_mol = []
substituent_smiles = []

smiles_file_name = "/Users/jan/Dropbox/Lundbeck/big.smiles"

smiles_file = open(smiles_file_name, "r")

for line in smiles_file:
    words = line.split()
    name = words[0]
    smiles = words[1]

    mol = Chem.MolFromSmiles(smiles)

    # Single bonds from an aromatic C or N to any other atom
    bis = mol.GetSubstructMatches(Chem.MolFromSmarts('[c,n]-[*]'))
    bs = [mol.GetBondBetweenAtoms(x,y).GetIdx() for x,y in bis]

    # Nothing to cut: keep the whole molecule as a ring
    if len(bs) == 0:
        if smiles not in rings_smiles:
            rings_smiles.append(smiles)
            rings_mol.append(Chem.MolFromSmiles(smiles))
        continue

    fragments_mol = Chem.FragmentOnBonds(mol,bs,addDummies=True)

    big_fragment = Chem.MolToSmiles(fragments_mol,True)

    # Replace labelled dummy atoms such as [5*] with a plain [*]
    big_fragment = re.sub(r'\[\d+\*\]',r'[*]',big_fragment)

    fragments = big_fragment.split(".")

    ring = Chem.MolFromSmarts('[n]')

    # Fragments containing an aromatic N are rings, the rest are substituents
    for fragment in fragments:
        if Chem.MolFromSmiles(fragment).HasSubstructMatch(ring):
            if fragment not in rings_smiles:
                rings_mol.append(Chem.MolFromSmiles(fragment))
                rings_smiles.append(fragment)
        else:
            if fragment not in substituent_smiles:
                substituent_mol.append(Chem.MolFromSmiles(fragment))
                substituent_smiles.append(fragment)

img = Draw.MolsToGridImage(rings_mol,molsPerRow=4,subImgSize=(200,200),useSVG=True)

svg_file_name = "/Users/jan/Dropbox/Lundbeck/rings.svg"
svg_file = open(svg_file_name, 'w')
svg_file.write(img.data)
svg_file.close()
os.system('sed -i "" "s/xmlns:svg/xmlns/" '+svg_file_name)

img
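
The label-stripping and de-duplication step in the script above can be tried on its own, without RDKit. This is only a minimal sketch: the input string is a made-up example of the kind of dot-separated SMILES that FragmentOnBonds produces, and the helper name unique_fragments is hypothetical.

```python
import re

def unique_fragments(fragmented_smiles, seen=None):
    """Split a dot-separated SMILES, strip dummy-atom labels, drop duplicates."""
    seen = [] if seen is None else seen
    # [12*] -> [*], so the same fragment cut at different atoms compares equal
    unlabeled = re.sub(r'\[\d+\*\]', '[*]', fragmented_smiles)
    for fragment in unlabeled.split('.'):
        if fragment not in seen:
            seen.append(fragment)
    return seen

# Hypothetical input, not taken from the original data set
example = "[3*]c1ccccn1.[7*]C.[9*]C"
fragments = unique_fragments(example)
print(fragments)
```

Comparing strings only catches duplicates that are written identically; passing each fragment back through RDKit to re-canonicalise it before comparing would likely be more robust.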


On 06 Jun 2017, at 13:43, Popov, Maxim (Ext) wrote:

Dear All,

I have discovered a very useful tool in Knime, the Molecule Fragmenter by 
RDKit, but can’t find a corresponding class or function outside of Knime. Can 
I use the Fragmenter without Knime?

Thanks!

Maxim


[Rdkit-discuss] FW: ReplaceSubstructs changes chirality

2017-05-30 Thread Jan Halborg Jensen

Also

newmol = AllChem.ReplaceSubstructs(m, patt1, patt2, useChirality=True)

doesn't help

From: Jan Halborg Jensen
Sent: Saturday, May 27, 2017 3:11 PM
To: rdkit-discuss@lists.sourceforge.net
Subject: ReplaceSubstructs changes chirality



The code below protonates select nitrogen atoms using ReplaceSubstructs but in 
some cases the chirality is changed, despite the fact that I used 
MolToSmiles(xx,isomericSmiles=True)

Any help appreciated



Output

C[C@]1(c2cc(NC(=O)c3ccc(C#N)cn3)ccc2F)N=C(N)OCC1(F)F [NX2;H0] [NH+] 
C[C@@]1(c2cc(NC(=O)c3ccc(C#N)cn3)ccc2F)[NH+]=C(N)OCC1(F)F

C[C@]1(c2cc(NC(=O)c3ccc(C#N)cn3)ccc2F)N=C(N)OCC1(F)F [nX2;H0] [NH+] 
C[C@]1(c2cc(NC(=O)c3ccc(C#N)c[nH+]3)ccc2F)N=C(N)OCC1(F)F


code:


from rdkit import Chem
from rdkit.Chem import AllChem

smartsref = ( ('[NX2;H1]','[NH2+]'),
              ('[NX2;H0]','[NH+]'),
              ('[nX2;H0]','[NH+]'))

smiles = "C[C@]1(c2cc(NC(=O)c3ccc(C#N)cn3)ccc2F)N=C(N)OCC1(F)F"
m = Chem.MolFromSmiles(smiles)

for (smarts1, smiles2) in smartsref:
    patt1 = Chem.MolFromSmarts(smarts1)
    patt2 = Chem.MolFromSmiles(smiles2)
    if(m.HasSubstructMatch(patt1)):
        newmol = AllChem.ReplaceSubstructs(m, patt1, patt2)
        for ion in newmol:
            ion = Chem.MolToSmiles(ion,isomericSmiles=True)
            print(smiles,smarts1,smiles2,ion)


[Rdkit-discuss] ReplaceSubstructs changes chirality

2017-05-30 Thread Jan Halborg Jensen


The code below protonates select nitrogen atoms using ReplaceSubstructs but in 
some cases the chirality is changed, despite the fact that I used 
MolToSmiles(xx,isomericSmiles=True)

Any help appreciated



Output

C[C@]1(c2cc(NC(=O)c3ccc(C#N)cn3)ccc2F)N=C(N)OCC1(F)F [NX2;H0] [NH+] 
C[C@@]1(c2cc(NC(=O)c3ccc(C#N)cn3)ccc2F)[NH+]=C(N)OCC1(F)F

C[C@]1(c2cc(NC(=O)c3ccc(C#N)cn3)ccc2F)N=C(N)OCC1(F)F [nX2;H0] [NH+] 
C[C@]1(c2cc(NC(=O)c3ccc(C#N)c[nH+]3)ccc2F)N=C(N)OCC1(F)F


code:


from rdkit import Chem
from rdkit.Chem import AllChem

smartsref = ( ('[NX2;H1]','[NH2+]'),
              ('[NX2;H0]','[NH+]'),
              ('[nX2;H0]','[NH+]'))

smiles = "C[C@]1(c2cc(NC(=O)c3ccc(C#N)cn3)ccc2F)N=C(N)OCC1(F)F"
m = Chem.MolFromSmiles(smiles)

for (smarts1, smiles2) in smartsref:
    patt1 = Chem.MolFromSmarts(smarts1)
    patt2 = Chem.MolFromSmiles(smiles2)
    if(m.HasSubstructMatch(patt1)):
        newmol = AllChem.ReplaceSubstructs(m, patt1, patt2)
        for ion in newmol:
            ion = Chem.MolToSmiles(ion,isomericSmiles=True)
            print(smiles,smarts1,smiles2,ion)
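
One possible workaround, not taken from the thread: if the aim is only to add a proton, the matched nitrogen can be edited in place instead of being swapped out with ReplaceSubstructs, so the atoms and bonds around the stereocentre are left as they were. This is a minimal sketch, assuming that raising the formal charge and the explicit hydrogen count on the matched atom is an acceptable way to protonate it; the helper name protonate_in_place is made up here.

```python
from rdkit import Chem

def protonate_in_place(mol, smarts):
    """Return one protonated copy of mol per atom matching a one-atom SMARTS."""
    patt = Chem.MolFromSmarts(smarts)
    ions = []
    for (idx,) in mol.GetSubstructMatches(patt):
        # Hydrogen count is read from the original, already sanitized molecule
        n_h = mol.GetAtomWithIdx(idx).GetTotalNumHs()
        ion = Chem.RWMol(mol)              # editable copy, same atom order
        atom = ion.GetAtomWithIdx(idx)
        atom.SetFormalCharge(atom.GetFormalCharge() + 1)
        atom.SetNumExplicitHs(n_h + 1)
        Chem.SanitizeMol(ion)
        ions.append(ion.GetMol())
    return ions

smiles = "C[C@]1(c2cc(NC(=O)c3ccc(C#N)cn3)ccc2F)N=C(N)OCC1(F)F"
m = Chem.MolFromSmiles(smiles)

for smarts in ('[NX2;H1]', '[NX2;H0]', '[nX2;H0]'):
    for ion in protonate_in_place(m, smarts):
        print(smarts, Chem.MolToSmiles(ion, isomericSmiles=True))
```

Because no atoms are added or removed, the chiral tag on the quaternary carbon is carried over unchanged. Whether this fits a given workflow depends on how the protonated forms are used afterwards.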
--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss