Re: [Rdkit-discuss] install on macosx with Python 3.8

2021-06-25 Thread Michal Krompiec
Thanks Maciek, Francois and Peter. A clean conda environment was all that
was needed…
Best,
Michal

On Fri, Jun 25, 2021 at 1:44 AM Francois Berenger  wrote:

> On 25/06/2021 02:57, Michal Krompiec wrote:
> > Hello,
> > Is it possible to install RDKit on MacOSX in a Python 3.8 environment?
> > There is no conda binary for 3.8, so I tried homebrew. But the
> > following gives me an error message (brew doesn't like the
> > --with-python3 argument):
> >
> > brew install rdkit --with-python3 --without-numpy
> >
> > So I did just "brew install rdkit", but then rdkit is unimportable in
> > Python ("No module named 'rdkit'"). What am I doing wrong?
>
> You are not using the python interpreter for which rdkit
> was installed by brew.
>
> Check what the brew installer of rdkit is doing, especially
> look which python version it installs rdkit for.
>
> Alternatively, fire up each and every python interpreter
> installed on your computer, and try 'import rdkit'
> until you find the one for which it works.
>
> Regards,
> F.
>
> > I'm using brew 3.2.0 on MacOS 11.4
> >
> > Thanks in advance,
> >
> > Michal Krompiec
> > ___
> > Rdkit-discuss mailing list
> > Rdkit-discuss@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
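Francois's suggestion above (find the interpreter for which rdkit was actually installed) can be checked with a few lines of standard-library Python. This is only a minimal sketch: `describe_module` is a hypothetical helper name, and it reports on whichever interpreter runs the script.

```python
import importlib.util
import sys


def describe_module(name):
    """Return (path of the running interpreter, location of `name` or None)."""
    spec = importlib.util.find_spec(name)
    location = spec.origin if spec is not None else None
    return sys.executable, location


# Report whether rdkit is importable from the interpreter running this script.
interpreter, rdkit_location = describe_module("rdkit")
print("interpreter:", interpreter)
print("rdkit:", rdkit_location if rdkit_location else "not importable here")
```

Running the same script with each Python interpreter installed on the machine shows which one, if any, can see the brew-installed rdkit.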


[Rdkit-discuss] install on macosx with Python 3.8

2021-06-24 Thread Michal Krompiec
Hello,
Is it possible to install RDKit on MacOSX in a Python 3.8 environment?
There is no conda binary for 3.8, so I tried homebrew. But the following
gives me an error message (brew doesn't like the --with-python3 argument):

brew install rdkit --with-python3 --without-numpy

So I did just "brew install rdkit", but then rdkit is unimportable in
Python ("No module named 'rdkit'"). What am I doing wrong?

I'm using brew 3.2.0 on MacOS 11.4


Thanks in advance,


Michal Krompiec


Re: [Rdkit-discuss] Incorrect gold particle placement

2020-12-01 Thread Michal Krompiec
Dear Anthony,
From MMFF or UFF you shouldn’t expect reasonable structures of
metal-containing compounds. If you need a quick and qualitatively OK
method, I recommend xtb.
Best wishes,
Michal Krompiec, Merck KGaA

On Tue, Dec 1, 2020 at 1:09 PM Anthony Nash 
wrote:

>
> Dear all,
>
> I'm new to RDKit and cheminformatics in general. I'm using the latest
> RDKit libraries. Any suggestions you can offer would be kindly received.
>
> Using the canonical SMILES   C(C1C(C(C(C(O1)[S-])O)O)O)O.[Au+]   of
> Aurothioglucose  (Pubchem CID: 454937) I've generated a 3D structure
> using the python RDKit code:
>
> mol = Chem.MolFromSmiles(self.canonicalSMILES)
> mol3D = Chem.AddHs(mol)
> AllChem.EmbedMolecule(mol3D)
> AllChem.MMFFOptimizeMolecule(mol3D)
>
> Unfortunately, the mol3D representation has Au+ right in the middle of and
> in the plane of the benzene rings, too far from the negatively charged
> sulfur. I'm new at generating structures from SMILES. Are there any steps
> I'm missing that could help correct the placement of Au+?
>
> Thanks
> Anthony
>


Re: [Rdkit-discuss] Dragon fingerprints?

2020-11-10 Thread Michal Krompiec
Hi Nils,
Yes, of course, I meant descriptors, not fingerprints. Thanks for the link!
Best,
Michal

On Tue, Nov 10, 2020 at 10:30 PM Nils Weskamp 
wrote:

> Hello Michal,
>
> you are probably referring to the Dragon descriptors
> (https://chm.kode-solutions.net/products_dragon.php, now called
> alvaDesc)? That is a pretty comprehensive set of more than 5,000
> descriptors and I would be surprised if someone had (re-)implemented all
> of them.
>
> The closest thing that I am aware of would be Mordred
> (https://github.com/mordred-descriptor).
>
> Hope this helps,
> Nils
>
> On 10.11.2020 at 23:11, Michal Krompiec wrote:
> > Hello RDKitters,
> > Are Dragon fingerprints (or something close) available in RDKit (or free
> > codes that depend on RDKit)? I'm trying to reproduce results from one
> > paper.
> >
> > Thanks,
> > Michal Krompiec
> > Merck KGaA


[Rdkit-discuss] Dragon fingerprints?

2020-11-10 Thread Michal Krompiec
Hello RDKitters,
Are Dragon fingerprints (or something close) available in RDKit (or free
codes that depend on RDKit)? I'm trying to reproduce results from one paper.

Thanks,
Michal Krompiec
Merck KGaA


Re: [Rdkit-discuss] TIL: Mol objects having varying attributes depending on rdkit imports

2020-09-23 Thread Michal Krompiec
Uhm, you’re right.

On Wed, Sep 23, 2020 at 7:46 PM Rocco Moretti  wrote:

> Python translates object.method() to method(object).
>
>
> Well, yes and no. "Yes" in the sense that instance methods are internally
> implemented equivalently to a free method which takes an instance as the
> first parameter. "No" in the sense that from a namespace and user
> perspective there typically isn't a crossover:
>
> >>> class TestClass:
>         def __init__(self, name):
>             self.name = name
>         def say(self):
>             print("I'm TestClass", self.name)
>
> >>> def recite(test_class):
>         test_class.say()
>
> >>> t = TestClass("Bob")
> >>> t.say()
> I'm TestClass Bob
> >>> recite(t)
> I'm TestClass Bob
> >>> t.recite()
> Traceback (most recent call last):
>   File "", line 1, in 
> AttributeError: 'TestClass' object has no attribute 'recite'
>
> (As the `recite` function is in the local namespace, not in the TestClass
> namespace, `t.recite()` can't find it.)
>
> While, due to the equivalence between free functions and member functions,
> you can certainly inject such a free function into the class:
>
> >>> TestClass.recite = recite
> >>> t.recite()
> I'm TestClass Bob
>
> such an injection isn't universally what one sees in typical Python
> programs, as anyone who's tried to do a `mylist.len()` can attest. Doing
> such an injection dependent on module imports is much rarer, and certainly
> not the expected "standard" behavior in Python. (It's certainly not
> automatic in Python, if that was what you were trying to imply.)
>
> Regards,
> -Rocco
>
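The import-dependent injection Rocco describes can be imitated in plain Python. The sketch below is an illustration only, not RDKit's actual implementation: `Mol`, `compute_2d_coords` and `load_extension` are hypothetical stand-ins, and `load_extension()` plays the role of importing an extra module.

```python
class Mol:
    """Stand-in for a class defined in a core module (not RDKit's Mol)."""

    def __init__(self, name):
        self.name = name


def compute_2d_coords(mol):
    """Free function that would live in a separate 'extension' module."""
    return "2D coords for " + mol.name


def load_extension():
    """Imitates an import side effect: attach the free function to the class."""
    Mol.Compute2DCoords = compute_2d_coords


m = Mol("benzene")
before = hasattr(m, "Compute2DCoords")  # method not visible yet
load_extension()                        # "import" the extension
after = hasattr(m, "Compute2DCoords")   # now visible, even on the existing object
result = m.Compute2DCoords()
print(before, after, result)
```

The method only appears because something explicitly assigned it to the class; Python does not do this automatically, which is the point made above.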


Re: [Rdkit-discuss] TIL: Mol objects having varying attributes depending on rdkit imports

2020-09-23 Thread Michal Krompiec
Hi Rocco & Norwid,
Actually, this is expected given the fact that Python translates
object.method() to method(object). Hence, m.Compute2DCoords(), although
"incorrect" (because Compute2DCoords() is not a method of the Mol class),
is valid Python code and is understood as Compute2DCoords(m). And this
won't work unless AllChem is loaded.
Best,
Michal Krompiec
Merck KGaA

On Wed, 23 Sep 2020 at 16:56, Rocco Moretti  wrote:

> Hi Norwid,
>
> There's a subtle but significant difference between the two examples:
>
> >>> AllChem.Compute2DCoords(m)
> versus
> >>> m.Compute2DCoords()
>
> For the former, it's pretty standard Python behavior not to be able to see
> a function from a module if you haven't loaded the module yet. That's
> expected behavior, and something you'll learn early on when working with
> Python modules.
>
> For the latter, it's not standard Python behavior to have methods which
> aren't visible until some other module is loaded. Generally, if you have an
> object of a class, you have access to all the methods of that class. Just
> having part of the class and then needing to import a separate module to
> get the rest of the methods is certainly not something you typically expect
> in Python.
>
> Regards,
> -Rocco
>
>
> On Wed, Sep 23, 2020 at 5:35 AM Norwid Behrnd via Rdkit-discuss <
> rdkit-discuss@lists.sourceforge.net> wrote:
>
>> Hi Thomas,
>>
>> could your report be already backed by the section titled «Working with
>> 2D molecules: Generating Depictions» of the upper half of page
>>
>> https://www.rdkit.org/docs/GettingStartedInPython.htm
>>
>> about the 2020.03.1 documentation with the following example?
>>
>>  8>< begin snippet --- 
>> >>> m = Chem.MolFromSmiles('c1nccc2n1ccc2')
>> >>> AllChem.Compute2DCoords(m)
>> 0
>>  8>< end snippet --- 
>>
>> Because this snippet is part of a showcase, a minimal working example
>> (at least for the somewhat older RDKit 2019.9.1) translates into
>>
>>  8>< begin snippet --- 
>> from rdkit import Chem
>> from rdkit.Chem import AllChem
>> m = Chem.MolFromSmiles('c1ccccc1')
>> AllChem.Compute2DCoords(m)
>>  8>< end snippet --- 
>>
>> to yield "0" (zero).
>>
>> However, possibly contributing to your struggle, note an entry on page
>>
>> https://www.rdkit.org/docs/GettingStartedInPython.html#chem-vs-allchem
>>
>> with the snippet
>>
>>  8>< begin snippet --- 
>> >>> from rdkit.Chem import AllChem as Chem
>> >>> m = Chem.MolFromSmiles('CCC')
>>  8>< end snippet --- 
>>
>> equivalent to a MWE of
>>
>>  8>< begin snippet --- 
>> import rdkit
>> from rdkit.Chem import AllChem as Chem
>> m = Chem.MolFromSmiles('CCC')
>> Chem.Compute2DCoords(m)
>>  8>< end snippet --- 
>>
>> to equally yield "0" (zero).
>>
>> I only recall this part of the manual because one of my yesterday's
>> problems caused me to revisit the beginner's page again.
>>
>> Norwid


Re: [Rdkit-discuss] PathToSubmol on atom indices?

2020-06-04 Thread Michal Krompiec
Dear Kangway,
Thank you very much!
M.

On Thu, Jun 4, 2020 at 6:56 PM Chuang, Kangway 
wrote:

> Hi Michal,
>
> You can use rdkit.Chem.rdmolfiles.MolFragmentToSmiles (or related
> MolFragmentToSmarts) and specify an atom id list with "atomsToUse":
>
> e.g.
>
> rdkit.Chem.rdmolfiles.MolFragmentToSmiles(mol, atomsToUse=[11,13,22,15])
>
> Kangway
> --
> *From:* Michal Krompiec 
> *Sent:* Thursday, June 4, 2020 10:45 AM
> *To:* RDKit Discuss 
> *Subject:* [Rdkit-discuss] PathToSubmol on atom indices?
>
> Hello, I noticed this was discussed before and I'm wondering if anything's
> changed.
> Is it possible to extract a substructure from a molecule, based on atom
> indices? I understand that Chem.PathToSubmol does something similar, but
> takes bond indices.
> Best,
> Michal
>


[Rdkit-discuss] PathToSubmol on atom indices?

2020-06-04 Thread Michal Krompiec
Hello, I noticed this was discussed before and I'm wondering if anything's
changed.
Is it possible to extract a substructure from a molecule, based on atom
indices? I understand that Chem.PathToSubmol does something similar, but
takes bond indices.
Best,
Michal


Re: [Rdkit-discuss] Code efficiency improvement

2019-12-19 Thread Michal Krompiec
For mid-to-lower-energy conformers, MMFF relative energies are
essentially a fancy random-number generator. Still, all depends on
what you need this for. If you just want to filter out (very) high
energy conformers, your approach might work. But if you also want to
perform Boltzmann averaging over conformational ensemble (of lower
energy conformers), you will be disappointed.
BTW conformational analysis of your molecule with CREST (20 OMP
threads, -quick -norotmd) took 219 seconds and yielded 28 conformers
with energy up to 3.5 kcal/mol higher than lowest energy structure. So
it is ~2 orders of magnitude slower than MMFF.
Best,
Michal
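To illustrate why errors in relative conformer energies matter for Boltzmann averaging, here is a minimal standard-library sketch. The energy values are made up for illustration (they are not MMFF or CREST results), and `boltzmann_weights` is a hypothetical helper name. At 298.15 K, RT is about 0.59 kcal/mol, so shifts of 1 kcal/mol change the populations considerably.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def boltzmann_weights(rel_energies_kcal, temperature=298.15):
    """Normalized Boltzmann populations from relative energies in kcal/mol."""
    rt = R_KCAL * temperature
    lowest = min(rel_energies_kcal)
    factors = [math.exp(-(e - lowest) / rt) for e in rel_energies_kcal]
    total = sum(factors)
    return [f / total for f in factors]


# Illustrative numbers only: the same three conformers with two energy sets
# that differ by up to 1 kcal/mol per conformer.
reference = [0.0, 0.5, 1.0]
perturbed = [0.0, 1.5, 0.0]

print("reference populations:", boltzmann_weights(reference))
print("perturbed populations:", boltzmann_weights(perturbed))
```

Any property averaged with these weights inherits the error in the populations, which is why a rough energy ranking may be enough for filtering but not for averaging.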

On Thu, 19 Dec 2019 at 03:53, topgunhaides .  wrote:
>
> Hi Michal,
>
> Many thanks for the help! I am looking for an ensemble of conformers.
> My priority is to use RDKit to generate a large ensemble of conformers for 
> each molecule.
> For large and flexible molecules, I will need a lot more than 10K (like 100K) 
> to try to cover the entire conformational space.
>
> I do not have to use MMFF to optimize all conformers, but I do want to use 
> MMFF or UFF to get at least the energies of all conformers (which is also 
> quite time-consuming, even without optimization).
> With the conformer energies, I can call some energy_filtering function to 
> filter out conformers with high energies, etc.
> I am thinking that storing and processing a huge number of conformers could 
> be the reason to slow things down, but not quite sure.
> Any suggestions are very welcome!
>
> Best,
> Leon
>
>
>
>
>
>
>
> On Wed, Dec 18, 2019 at 7:08 PM Michal Krompiec  
> wrote:
>>
>> Are you looking for the global minimum or an ensemble of conformers? Either 
>> way, this is already very fast. Bear in mind, however, that MMFF’s accuracy 
>> isn’t great for this type of task (see for example
>> https://arxiv.org/pdf/1705.04308.pdf ). In other words, I don’t see a use 
>> case for generation of 10k or more conformers with MMFF. And super-fast 
>> generation of large conformational ensembles for arbitrary molecules just 
>> isn’t realistic.
>> Best,
>> Michal
>>
>> On Wed, 18 Dec 2019 at 22:40, topgunhaides .  wrote:
>>>
>>> Hi guys,
>>>
>>> Can anyone give me some advice on improving the efficiency of the embedding 
>>> code? See the example below:
>>>
>>>
>>> import time
>>> from rdkit import Chem
>>> from rdkit.Chem import AllChem
>>>
>>> # medium-size molecule (10 heavy atoms)
>>> suppl = Chem.SDMolSupplier('cid831548.sdf')
>>>
>>> for mol in suppl:
>>>     mh = Chem.AddHs(mol, addCoords=True)
>>>
>>>     # embedding
>>>     start = time.time()
>>>     AllChem.EmbedMultipleConfs(mh, numConfs=5000, maxAttempts=100,
>>>                                pruneRmsThresh=0.5, randomSeed=1,
>>>                                numThreads=0, enforceChirality=True,
>>>                                useExpTorsionAnglePrefs=True,
>>>                                useBasicKnowledge=True)
>>>     cids = [conf.GetId() for conf in mh.GetConformers()]
>>>     end = time.time()
>>>     print("time elapsed: ", end - start)
>>>
>>>
>>> The results:
>>> numConfs=1000,   time elapsed: 10 seconds
>>> numConfs=5000,   time elapsed: 66 seconds
>>> numConfs=10000,  time elapsed: 176 seconds
>>>
>>> I need to request a lot more than 10000 conformers per molecule and have a 
>>> lot of molecules to process.
>>> I also wish to compute conformer energies and hopefully can do optimization 
>>> (both are time consuming). So need to make my code as efficient as 
>>> possible. Thank you!
>>>
>>> Best,
>>> Leon


Re: [Rdkit-discuss] Code efficiency improvement

2019-12-18 Thread Michal Krompiec
Are you looking for the global minimum or an ensemble of conformers? Either
way, this is already very fast. Bear in mind, however, that MMFF’s accuracy
isn’t great for this type of task (see for example
https://arxiv.org/pdf/1705.04308.pdf ). In other words, I don’t see a use
case for generation of 10k or more conformers with MMFF. And super-fast
generation of large conformational ensembles for arbitrary molecules just
isn’t realistic.
Best,
Michal

On Wed, 18 Dec 2019 at 22:40, topgunhaides .  wrote:

> Hi guys,
>
> Can anyone give me some advice on improving the efficiency of the embedding
> code? See the example below:
>
>
> import time
> from rdkit import Chem
> from rdkit.Chem import AllChem
>
> # medium-size molecule (10 heavy atoms)
> suppl = Chem.SDMolSupplier('cid831548.sdf')
>
> for mol in suppl:
>     mh = Chem.AddHs(mol, addCoords=True)
>
>     # embedding
>     start = time.time()
>     AllChem.EmbedMultipleConfs(mh, numConfs=5000, maxAttempts=100,
>                                pruneRmsThresh=0.5, randomSeed=1,
>                                numThreads=0, enforceChirality=True,
>                                useExpTorsionAnglePrefs=True,
>                                useBasicKnowledge=True)
>     cids = [conf.GetId() for conf in mh.GetConformers()]
>     end = time.time()
>     print("time elapsed: ", end - start)
>
>
> The results:
> numConfs=1000,   time elapsed: 10 seconds
> numConfs=5000,   time elapsed: 66 seconds
> numConfs=10000,  time elapsed: 176 seconds
>
> I need to request a lot more than 10000 conformers per molecule and have a
> lot of molecules to process.
> I also wish to compute conformer energies and hopefully can do
> optimization (both are time consuming). So need to make my code as
> efficient as possible. Thank you!
>
> Best,
> Leon


Re: [Rdkit-discuss] The "maxAttempts" option in "EmbedMultipleConfs"

2019-12-17 Thread Michal Krompiec
It depends what you need it for, but if you want a more realistic
conformational analysis instead, CREST is the tool of choice.
https://xtb-docs.readthedocs.io/en/latest/crest.html
Best,
Michal


On Tue, Dec 17, 2019 at 16:26 topgunhaides .  wrote:

> Hi guys,
>
> Can anyone tell me more about the "maxAttempts" option in
> "EmbedMultipleConfs"?
>
> In the documentation, it says " maxAttempts: the maximum number of
> attempts to try embedding".
> Does it mean the "maximum number of attempts" to generate each conformer
> or to generate the total number of conformers specified by "numConfs"? Or
> something else?
> I need to generate a huge number of conformers for each molecule, so I
> want to know what the proper "maxAttempts" is to reach a balance between
> accuracy and cost.
> Thank you!
>
> Best,
> Leon


Re: [Rdkit-discuss] how to handle metallocenes?

2019-10-07 Thread Michal Krompiec
Dear Mike,
Try changing all metal-ligand bonds to "dative" or "ionic", and
standardize afterwards (but disable adjusting of implicit Hs). This
way, I was able to process (in KNIME) >99% of organometallics (incl.
metallocenes) downloaded from Reaxys.
Example snippet (which doesn't check the "directionality" of the bond, though):

from rdkit import Chem
import pandas as pd

metals = ['Ti', 'Al', 'Mo', 'Ru', 'Co', 'Rh', 'Ir', 'Ni', 'Zr', 'Hf', 'W']
outmols = []
# input_table / output_table are the tables exchanged with the KNIME Python node
mols = input_table['Molecule']
for mol in mols:
    for bond in mol.GetBonds():
        # does either end of the bond sit on a metal atom?
        if (bond.GetEndAtom().GetSymbol() in metals or
                bond.GetBeginAtom().GetSymbol() in metals):
            print("found metal-ligand bond")
            print("original type: " + str(bond.GetBondType()))
            btype = Chem.rdchem.BondType.DATIVE
            bond.SetBondType(btype)
            print("changed to: " +
                  str(mol.GetBonds()[bond.GetIdx()].GetBondType()))
            try:
                # sanitize, but skip the adjustment of implicit Hs
                Chem.SanitizeMol(
                    mol,
                    sanitizeOps=Chem.SanitizeFlags.SANITIZE_ALL ^
                    Chem.SanitizeFlags.SANITIZE_ADJUSTHS)
            except ValueError as ve:
                print("Sanitization failed")
                print(ve)
output_table = input_table.copy()

Best,
Michal



On Mon, 7 Oct 2019 at 13:45, Greg Landrum  wrote:
>
> Hi Mike,
>
> I think you mean "organometallics", not "metallocenes" (the first two molecules in 
> that SDF are coordination complexes, but neither is a metallocene; I 
> stopped looking after that). The compounds are also drawn in such a way that 
> they are chemically unreasonable. This is pretty typical for organometallics 
> in V2000 mol files.
>
> Unless you have a reliable source of input molecules and/or are willing to 
> look at every one, I would just filter anything that has a metal-nonmetal 
> bond out of the dataset.
>
> If you really want to do something with the molecules:
> The rdMolStandardize code, which is derived from MolVS, currently has one 
> approach for dealing with this type of complex: breaking all the covalent 
> bonds to the metal (this is also what InChI does). Given what a mess these 
> compounds are when they show up in most standard file formats, this seems 
> like a reasonable thing to do:
>
> In [4]: from rdkit import Chem
>
> In [5]: from rdkit.Chem.MolStandardize import rdMolStandardize
>
> In [6]: dcon = rdMolStandardize.MetalDisconnector()
> [14:34:03] Initializing MetalDisconnector
>
> In [8]: suppl = Chem.SDMolSupplier('/home/glandrum/Downloads/RDKit_input.sdf',sanitize=False,removeHs=False)
>
> In [9]: m = suppl[0]
>
> In [10]: om = dcon.Disconnect(m)
> [14:34:29] Running MetalDisconnector
> [14:34:29] Removed covalent bond between Tc and O
> [14:34:29] Removed covalent bond between Tc and O
> [14:34:29] Removed covalent bond between Tc and S
> [14:34:29] Removed covalent bond between Tc and S
> [14:34:29] Removed covalent bond between Tc and P
> [14:34:29] Removed covalent bond between Tc and P
>
> In [11]: Chem.SanitizeMol(om)
> Out[11]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE
>
> In [12]: Chem.MolToSmiles(om)
> Out[12]: 
> 'CSCC[C@@H](NC(=O)[C@@H](CC(C)C)NC(=O)[C@@H](Cc1cnc[nH]1)NC(=O)CNC(=O)[C@H](NC(=O)[C@@H](C)NC(=O)[C@H](CC(=O)[C@@H](CCC(N)=O)NC(=O)NC(=O)C(CC[SH-]CCC[PH-](CO)CO)[SH-]CCC[PH-](CO)CO)c1cc2c2[nH]1)C(C)C)C(N)=O.[99Tc+9].[Cl-].[O-2].[O-2]'
>
>
> It's worth noting that this molecule is still a long way from making chemical 
> sense: the +9 charge on the Tc and the [SH-] and [PH-] groups are not 
> sensible. So there's more manual fixing required here.
>
>
> Best,
> -greg
>
>
> On Mon, Oct 7, 2019 at 12:06 PM Mike Mazanetz  
> wrote:
>>
>> Hello RDKit experts !
>>
>>
>>
>> Is there a function to handle metallocenes in the standardizer?
>>
>>
>>
>> I’ve enclosed some examples of compounds.
>>
>>
>>
>> Thanks,
>>
>> mike


Re: [Rdkit-discuss] Calculating VDW interaction energy between two molecules

2019-06-11 Thread Michal Krompiec
Dear Omar,
Probably, but it is not trivial: you’d need to optimise geometries of a few
1:1 complexes and calculate average interaction energy. Why use rdkit for
that?
Michal

On Tue, 11 Jun 2019 at 01:48, Omar H94  wrote:

> Dear RDKit users,
> Is there a way to calculate the VDW energy between two molecules using the
> MMFF forcefield parameters ?
>
> Thanks,
> Omar


Re: [Rdkit-discuss] Bug with Calculation of aromatic rings?

2019-03-06 Thread Michal Krompiec
It’s because the molecule with atom indices is a tautomer of the other one
(H at the other N), hence different Kekule structure and different
behaviour of the aromaticity perception code.
Best,
Michal

On Wed, 6 Mar 2019 at 10:04, Colin Bournez 
wrote:

> Hi Greg,
>
> Indeed it seems one bond is not tagged as aromatic.
>
> Here are the aromatic bonds (begin atom, end atom):
>
> 0 1
> 1 19
> 19 16
> 11 14
> 14 12
> 12 7
> 7 20
> 11 0
> 20 16
>
> We see that the bond between atoms 11 and 16 is not aromatic.
> It is of type SINGLE:
> 16 11 SINGLE
>
>
> The problem remains after sanitizing the molecule, and both atoms are
> tagged as aromatic. Can a bond between two aromatic atoms be single?
>
> On 06/03/19 10:49, Greg Landrum wrote:
>
> Hi Colin,
> The aromatic ring counting code identifies rings where every *bond* is
> aromatic, so I guess one or more bonds in the rings of the first molecule
> are not aromatic.
> Could it be that you haven't sanitized the molecule before calculating
> descriptors?
> -greg
>
> On Tue, Mar 5, 2019 at 6:00 PM Colin Bournez <
> colin.bour...@univ-orleans.fr> wrote:
>
>> Dear all, I might have encountered a little problem concerning the
>> function rdMolDescriptors.CalcNumAromaticRings(). For this molecule shown
>> with index:
>>
>> Here is what I do :
>>
>> So I get my aromatic atoms as expected, but when I ask for aromatic rings it 
>> returns 0 instead of 2.
>> Does anyone have an idea?
>>
>> For information, if the molecule is in this form,
>> it returns 2 aromatic rings as expected.
>>
>> Colin
>>
>>
>> --
>> Bournez Colin
>> Chemoinformatics PhD Student
>> Institute of Organic and Analytical Chemistry (ICOA UMR7311)
>> Université d'Orléans - Pôle de Chimie
>> Rue de Chartres - BP 6759, 45067 Orléans Cedex 2 - France
>> +33 (0)2 38 49 45 77
>> SBC Tool Platform - SBC Team
>


Re: [Rdkit-discuss] rotate a conformer around an axis?

2019-02-27 Thread Michal Krompiec
Thanks Taka, this is just what I needed!
Michal

On Wed, 27 Feb 2019 at 03:19, Taka Seri  wrote:

> Hi Michal,
>
> I think TransformConformer can rotate a molecule with a given rotation matrix.
>
> I have rotated a molecule using this method.
> The following code rotates a molecule around the x, y and z axes.
>
> https://nbviewer.jupyter.org/github/iwatobipen/playground/blob/master/rotation_mol.ipynb
>
> Unfortunately the molecule will not render on GitHub, but you can view it
> if you run the code on your PC.
> I hope this is of some help.
>
> Best regards,
> Taka
>
>
> On Tue, 26 Feb 2019 at 17:55, Michal Krompiec wrote:
>
>> Hello,
>> I'd like to check if a particular axis is a C2 symmetry axis of a
>> conformer. How do I rotate my conformer around an axis? (apart from
>> extracting the coordinates into numpy, multiplying by a rotation matrix and
>> updating the coordinates)
>> Can TransformConformer function be used for this?
>> Thanks,
>> Michal


[Rdkit-discuss] rotate a conformer around an axis?

2019-02-26 Thread Michal Krompiec
Hello,
I'd like to check if a particular axis is a C2 symmetry axis of a
conformer. How do I rotate my conformer around an axis? (apart from
extracting the coordinates into numpy, multiplying by a rotation matrix and
updating the coordinates)
Can TransformConformer function be used for this?
Thanks,
Michal
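The manual route mentioned in the question (rotate the coordinates and compare) can be sketched without RDKit or numpy. This is a rough illustration under stated assumptions: atoms are given as (symbol, (x, y, z)) tuples already extracted from the conformer, the candidate axis passes through the origin, and `rotate_about_axis`, `is_c2_axis` and the 0.1 Å tolerance are hypothetical choices.

```python
import math


def rotate_about_axis(point, axis, angle):
    """Rotate a point about an axis through the origin (Rodrigues' formula)."""
    ax, ay, az = axis
    norm = math.sqrt(ax * ax + ay * ay + az * az)
    kx, ky, kz = ax / norm, ay / norm, az / norm
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    dot = kx * x + ky * y + kz * z
    # cross product k x p
    cx, cy, cz = ky * z - kz * y, kz * x - kx * z, kx * y - ky * x
    return (x * c + cx * s + kx * dot * (1 - c),
            y * c + cy * s + ky * dot * (1 - c),
            z * c + cz * s + kz * dot * (1 - c))


def is_c2_axis(atoms, axis, tol=0.1):
    """True if a 180-degree rotation maps every atom onto an atom of the same element."""
    for symbol, pos in atoms:
        image = rotate_about_axis(pos, axis, math.pi)
        if not any(other_symbol == symbol and math.dist(image, other_pos) <= tol
                   for other_symbol, other_pos in atoms):
            return False
    return True


# Water-like toy geometry (illustrative coordinates, not an optimized structure).
water = [("O", (0.0, 0.0, 0.1)),
         ("H", (0.76, 0.0, -0.5)),
         ("H", (-0.76, 0.0, -0.5))]
print(is_c2_axis(water, (0.0, 0.0, 1.0)))
print(is_c2_axis(water, (1.0, 0.0, 0.0)))
```

The same 180-degree rotation, written as a 4x4 matrix, is what one would pass to TransformConformer to rotate the conformer in place.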


Re: [Rdkit-discuss] Chem.GetMolFrags and 3D coordinates

2019-02-25 Thread Michal Krompiec
Hi Greg,
Thank you, I can reproduce your example, and my own case works fine now...
Best,
Michal

On Mon, 25 Feb 2019 at 14:26, Greg Landrum  wrote:

> Hi Michal,
>
> Which version of the RDKit are you using? This should already be working.
> Here's an example demonstrating that:
>
> In [16]: m = Chem.AddHs(Chem.MolFromSmiles('c1ccccc1.N'))
>
> In [17]: AllChem.EmbedMolecule(m)
> Out[17]: 0
>
> In [18]: fs = Chem.GetMolFrags(m,asMols=True)
>
> In [19]: print(Chem.MolToMolBlock(fs[0]))
>
>      RDKit          3D
>
>  12 12  0  0  0  0  0  0  0  0999 V2000
>     0.7943   -1.1490    0.0150 C   0  0  0  0  0  0  0  0  0  0  0  0
>     1.3607    0.1298   -0.0022 C   0  0  0  0  0  0  0  0  0  0  0  0
>     0.5898    1.2681   -0.0170 C   0  0  0  0  0  0  0  0  0  0  0  0
>    -0.7866    1.1450   -0.0149 C   0  0  0  0  0  0  0  0  0  0  0  0
>    -1.3413   -0.1124    0.0019 C   0  0  0  0  0  0  0  0  0  0  0  0
>    -0.5806   -1.2656    0.0170 C   0  0  0  0  0  0  0  0  0  0  0  0
>     1.4026   -2.0428    0.0267 H   0  0  0  0  0  0  0  0  0  0  0  0
>     2.4183    0.2113   -0.0036 H   0  0  0  0  0  0  0  0  0  0  0  0
>     1.0075    2.2686   -0.0304 H   0  0  0  0  0  0  0  0  0  0  0  0
>    -1.4182    2.0293   -0.0265 H   0  0  0  0  0  0  0  0  0  0  0  0
>    -2.4222   -0.2172    0.0037 H   0  0  0  0  0  0  0  0  0  0  0  0
>    -1.0241   -2.2652    0.0304 H   0  0  0  0  0  0  0  0  0  0  0  0
>   1  2  2  0
>   2  3  1  0
>   3  4  2  0
>   4  5  1  0
>   5  6  2  0
>   6  1  1  0
>   1  7  1  0
>   2  8  1  0
>   3  9  1  0
>   4 10  1  0
>   5 11  1  0
>   6 12  1  0
> M  END
>
>
> In [20]: print(Chem.MolToMolBlock(fs[1]))
>
>      RDKit          3D
>
>   4  3  0  0  0  0  0  0  0  0999 V2000
>    -0.0066   -0.0099    0.2620 N   0  0  0  0  0  0  0  0  0  0  0  0
>    -0.4136    0.8845   -0.0859 H   0  0  0  0  0  0  0  0  0  0  0  0
>    -0.5574   -0.8197   -0.0901 H   0  0  0  0  0  0  0  0  0  0  0  0
>     0.9775   -0.0549   -0.0860 H   0  0  0  0  0  0  0  0  0  0  0  0
>   1  2  1  0
>   1  3  1  0
>   1  4  1  0
> M  END
>
>
> In [21]: print(Chem.MolToMolBlock(m))
>
>      RDKit          3D
>
>  16 15  0  0  0  0  0  0  0  0999 V2000
>     0.7943   -1.1490    0.0150 C   0  0  0  0  0  0  0  0  0  0  0  0
>     1.3607    0.1298   -0.0022 C   0  0  0  0  0  0  0  0  0  0  0  0
>     0.5898    1.2681   -0.0170 C   0  0  0  0  0  0  0  0  0  0  0  0
>    -0.7866    1.1450   -0.0149 C   0  0  0  0  0  0  0  0  0  0  0  0
>    -1.3413   -0.1124    0.0019 C   0  0  0  0  0  0  0  0  0  0  0  0
>    -0.5806   -1.2656    0.0170 C   0  0  0  0  0  0  0  0  0  0  0  0
>    -0.0066   -0.0099    0.2620 N   0  0  0  0  0  0  0  0  0  0  0  0
>     1.4026   -2.0428    0.0267 H   0  0  0  0  0  0  0  0  0  0  0  0
>     2.4183    0.2113   -0.0036 H   0  0  0  0  0  0  0  0  0  0  0  0
>     1.0075    2.2686   -0.0304 H   0  0  0  0  0  0  0  0  0  0  0  0
>    -1.4182    2.0293   -0.0265 H   0  0  0  0  0  0  0  0  0  0  0  0
>    -2.4222   -0.2172    0.0037 H   0  0  0  0  0  0  0  0  0  0  0  0
>    -1.0241   -2.2652    0.0304 H   0  0  0  0  0  0  0  0  0  0  0  0
>    -0.4136    0.8845   -0.0859 H   0  0  0  0  0  0  0  0  0  0  0  0
>    -0.5574   -0.8197   -0.0901 H   0  0  0  0  0  0  0  0  0  0  0  0
>     0.9775   -0.0549   -0.0860 H   0  0  0  0  0  0  0  0  0  0  0  0
>   1  2  2  0
>   2  3  1  0
>   3  4  2  0
>   4  5  1  0
>   5  6  2  0
>   6  1  1  0
>   1  8  1  0
>   2  9  1  0
>   3 10  1  0
>   4 11  1  0
>   5 12  1  0
>   6 13  1  0
>   7 14  1  0
>   7 15  1  0
>   7 16  1  0
> M  END
>
>
>
> Best,
> -greg
>
>
> On Mon, Feb 25, 2019 at 5:54 AM Michal Krompiec 
> wrote:
>
>> Hello,
>> Let mol be a molecule with a conformer with 3D coordinates, consisting of
>> 2 fragments. Chem.GetMolFrags(mol, asMols=True) returns these fragments as
>> Molecule objects, but the 3D coordinates are lost. Is there any way to
>> preserve them?
>> Best,
>> Michal
>
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


[Rdkit-discuss] Chem.GetMolFrags and 3D coordinates

2019-02-25 Thread Michal Krompiec
Hello,
Let mol be a molecule with a conformer with 3D coordinates, consisting of 2
fragments. Chem.GetMolFrags(mol, asMols=True) returns these fragments as
Molecule objects, but the 3D coordinates are lost. Is there any way to
preserve them?
Best,
Michal
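
For anyone wanting to check this on their own installation, here is a minimal sketch (not taken from the thread). It assumes a reasonably recent RDKit; the function name fragment_conformer_counts and the benzene/ammonia test SMILES are only illustrative choices. It embeds a two-fragment molecule, splits it with GetMolFrags, and reports how many conformers each fragment carries:

```python
def fragment_conformer_counts(smiles="c1ccccc1.N", seed=42):
    """Return (num_atoms, num_conformers) for each fragment of an embedded molecule.

    The function name and default SMILES are illustrative assumptions.
    """
    # Imported inside the function so the sketch can be loaded without RDKit.
    from rdkit import Chem
    from rdkit.Chem import AllChem

    # Build the multi-fragment molecule with explicit hydrogens.
    mol = Chem.AddHs(Chem.MolFromSmiles(smiles))

    # Generate one 3D conformer; EmbedMolecule returns 0 on success.
    if AllChem.EmbedMolecule(mol, randomSeed=seed) != 0:
        raise RuntimeError("3D embedding failed")

    # Split into separate Mol objects, one per fragment.
    frags = Chem.GetMolFrags(mol, asMols=True)
    return [(frag.GetNumAtoms(), frag.GetNumConformers()) for frag in frags]
```

If every fragment reports one conformer, the coordinates were carried over; Chem.MolToMolBlock on a fragment can then be used to inspect them.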


Re: [Rdkit-discuss] Using RdKit in Parallel

2019-02-20 Thread Michal Krompiec
Dear Stamatia,
If the molecules are processed completely independently by your code, it
may be simpler to split the SDF into chunks (e.g. with csplit in a bash
script) and then run separate instances of your python code on each chunk,
wait until all are finished and finally collate the output. Thus you can
avoid the problem altogether.
Best,
Michal
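
The chunking idea above can also be sketched in Python instead of csplit. This is only an illustration: it assumes every record ends with a "$$$$" line, as in standard SD files, and the function names split_sdf_records and chunk_records are made up here.

```python
def split_sdf_records(text):
    """Split SD file text into a list of records, each ending with its '$$$$' line."""
    records, current = [], []
    for line in text.splitlines(keepends=True):
        current.append(line)
        if line.strip() == "$$$$":
            records.append("".join(current))
            current = []
    # Keep a trailing record that lacks the final delimiter, if it has content.
    if "".join(current).strip():
        records.append("".join(current))
    return records


def chunk_records(records, n_chunks):
    """Distribute records over at most n_chunks chunks of nearly equal size."""
    n_chunks = max(1, min(n_chunks, len(records)))
    size, extra = divmod(len(records), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first 'extra' chunks get one additional record.
        end = start + size + (1 if i < extra else 0)
        chunks.append("".join(records[start:end]))
        start = end
    return chunks
```

Each chunk can then be written to its own file and handed to a separate instance of the search script, so no supplier object is ever shared between processes.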

On Wed, 20 Feb 2019 at 11:28, Stamatia Zavitsanou <
stamatia.zavitsa...@oriel.ox.ac.uk> wrote:

> Hello everyone,
>
>
> We have been writing a script that searches through a large number of
> molecules within different files for a common substructure. To speed this
> up we have been attempting to run this script in parallel (see the scripts
> below). However, the online tutorial notes refer to problems with using
> the SDMolSupplier in parallel. We were wondering what the issue is and how
> we could circumvent it to speed up some of our calculations.
>
>
> Non-parallel
>
>
> from __future__ import print_function
> import os
>
> from progressbar import ProgressBar
> from rdkit import Chem
>
> pbar = ProgressBar()
> matches = []
> directory = r'Q:\Data2'  # raw string keeps the backslash literal
> patt = Chem.MolFromSmarts('NC(NNC=O)=O')
>
> for file in pbar(os.listdir(directory)):
>     filename = os.fsdecode(file)
>     if filename.endswith(".sdf"):
>         f = os.path.join(directory, filename)
>         suppl = Chem.SDMolSupplier(f)
>         for mol in suppl:
>             if mol is None:
>                 continue  # skip records that could not be parsed
>             if mol.HasSubstructMatch(patt):
>                 matches.append(mol)
>
> w = Chem.SDWriter(r'C:\Users\tom.watts\Desktop\datasmarts4c.sdf')
> for m in matches:
>     w.write(m)
> w.close()  # flush the output file
> print(filename)
>
>
>
> Parallel
>
>
> import multiprocessing
> import os
>
> from joblib import Parallel, delayed
> from progressbar import ProgressBar
> from rdkit import Chem
>
> pbar = ProgressBar()
> directory = r'E:\Data'  # raw string keeps the backslash literal
> patt = Chem.MolFromSmarts('NC(NNC=O)=O')
> w = Chem.SDWriter(r'C:\Users\tom.watts\Desktop\SearchDataNonly.sdf')
> l = []
>
> # collect the SD files to search
> for file in pbar(os.listdir(directory)):
>     filename = os.fsdecode(file)
>     if filename.endswith(".sdf"):
>         f = os.path.join(directory, filename)
>         l.append(f)
>
> num_cores = multiprocessing.cpu_count()
> print(num_cores)
> lock = multiprocessing.Lock()
>
> def Search(i):
>     matches = []  # local list, so each call returns only its own hits
>     suppl = Chem.SDMolSupplier(i)
>     for mol in suppl:
>         if mol is None:
>             continue
>         if mol.HasSubstructMatch(patt):
>             matches.append(mol)
>     return matches
>
> results = Parallel(n_jobs=20)(delayed(Search)(i) for i in l)
>
>
>
> We also wish to use a second script that opens one SDF file and then
> runs a loop over each molecule in the file. This is currently done
> serially and we were wondering if it could be made parallel.
>
>
>
> suppl = Chem.SDMolSupplier('Red3.sdf')
> for mol in suppl:
>     patt = Chem.MolFromSmarts('NC(N)=O')
>     num = mol.GetSubstructMatches(patt)
>     logger.debug(Chem.MolToSmiles(mol))
>     h = len(num)
>     m3 = Chem.AddHs(mol)
>     cids = AllChem.EmbedMultipleConfs(m3, numConfs)
>
>
>
> Any comments would be useful.
>
>
> Thanks a lot,
>
> Stamatia Zavitsanou


Re: [Rdkit-discuss] RDKit and organometallics

2018-11-13 Thread Michal Krompiec
Isn't this patch already incorporated in the master branch?

On Tue, 13 Nov 2018 at 12:35, Malgorzata Werner <
malgorzata.wer...@molecularhealth.com> wrote:

> Hi Henrique,
>
> You could try using v3000 sd files.
>
> Here's a blog about this:
> https://www.wildcardconsulting.dk/useful-information/how-to-solve-problems-with-coordinate-bonds-in-rdkit/
>
>
>
> Best,
>
> Malgorzata
>
> --
> *From:* Henrique Castro 
> *Sent:* 13 November 2018 11:57:34
> *To:* rdkit-discuss@lists.sourceforge.net
> *Subject:* [Rdkit-discuss] RDKit and organometallics
>
> Dear colleagues,
> I know that this is probably a dumb question, but since my searches showed
> no clarifying results I'm asking here anyway.
> I'm planning to use RDKit on my Ph.D. thesis, but my field of research is
> inorganic chemistry and molecular magnetism. That means that I'm dealing
> with organometallics (transition metals, lanthanides, actinides...). So far
> I was unable to import even a single structure to my RDKit test with
> different error messages like those (they are all the same structure, that
> is attached here):
>
> m = Chem.MolFromMolFile('st1.pdb')
> RDKit WARNING: [08:36:40] CTAB version string invalid at line 4
>
> m = Chem.MolFromMolFile('st1.sdf')
> RDKit ERROR: [08:51:30] Explicit valence for atom # 1 N, 4, is greater
> than permitted
>
> m = Chem.MolFromMolFile('st1.mol2')
> RDKit WARNING: [08:52:15] Counts line too short: 'SMALL' on line4
>
> Based on this, I'd like to ask for hints on how to deal with molecules with
> "unusual" valences like the ones we deal with in inorganic chemistry.
>
> Thanks in advance
>
>
> --
> Henrique C. S. Junior
>


Re: [Rdkit-discuss] Fingerprint collision and machine learning

2018-10-10 Thread Michal Krompiec
Hi Thomas,
Radius 2, 2048 bits, 5200 data points.

On Wed, 10 Oct 2018 at 13:13, Thomas Evangelidis  wrote:

> What's your bitvector length and radius? How many training samples do you
> have?
>
> On Wed, 10 Oct 2018 at 13:51, Michal Krompiec 
> wrote:
>
>> Hi all,
>> I have a slightly off-topic question. I'm trying to train a neural
>> network on a dataset of small molecules and their melting points. I did get
>> a not-so-bad accuracy with Morgan fingerprints, but I've realised that
>> regardless of FP radius and bitvector length, several dozen molecules have
>> the same fingerprints but wildly different melting points. I am pretty sure
>> this is a "solved problem" so I don't want to reinvent the wheel. What is
>> the recommended/usual way of dealing with this?
>> Thanks,
>> Michal
>>
>>
>
>
> --
>
> ==
>
> Dr Thomas Evangelidis
>
> Research Scientist
>
> IOCB - Institute of Organic Chemistry and Biochemistry of the Czech
> Academy of Sciences <https://www.uochb.cz/web/structure/31.html?lang=en>
> Prague, Czech Republic
>   &
> CEITEC - Central European Institute of Technology <https://www.ceitec.eu/>
> Brno, Czech Republic
>
> email: teva...@gmail.com
>
> website: https://sites.google.com/site/thomasevangelidishomepage/
>
>
>


[Rdkit-discuss] Fingerprint collision and machine learning

2018-10-10 Thread Michal Krompiec
Hi all,
I have a slightly off-topic question. I'm trying to train a neural network
on a dataset of small molecules and their melting points. I did get a
not-so-bad accuracy with Morgan fingerprints, but I've realised that
regardless of FP radius and bitvector length, several dozen molecules have
the same fingerprints but wildly different melting points. I am pretty sure
this is a "solved problem" so I don't want to reinvent the wheel. What is
the recommended/usual way of dealing with this?
Thanks,
Michal
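
Before choosing a remedy it helps to quantify the problem. The following is a minimal sketch for locating the colliding entries; it is not an answer from the list. It assumes the fingerprints are available as plain sequences of bits, and the helper names group_collisions and collision_spread are hypothetical.

```python
from collections import defaultdict


def group_collisions(fingerprints, targets):
    """Map each fingerprint shared by more than one molecule to its target values."""
    groups = defaultdict(list)
    for fp, y in zip(fingerprints, targets):
        groups[tuple(fp)].append(y)  # tuples are hashable, lists are not
    return {fp: ys for fp, ys in groups.items() if len(ys) > 1}


def collision_spread(collisions):
    """Return the range (max - min) of the target values within each colliding group."""
    return {fp: max(ys) - min(ys) for fp, ys in collisions.items()}
```

Groups with a large spread are the ones the fingerprint cannot separate. Whether to drop them, average them, or switch descriptors is a modelling decision this sketch does not make.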


Re: [Rdkit-discuss] number of significant digits in molblock?

2018-10-05 Thread Michal Krompiec
6 digits seems perfectly fine for me.

On Fri, 5 Oct 2018 at 14:26, Greg Landrum  wrote:

>
> On Fri, Oct 5, 2018 at 2:42 PM Ivan Tubert-Brohman <
> ivan.tubert-broh...@schrodinger.com> wrote:
>
>> In the newer "V3000", the atom line is not column-based, which I believe
>> gives more freedom to implementers to decide the precision of the
>> coordinates. You can force RDKit to write in this format by calling
>> SetForceV3000(True) on your writer object. I tried it and I get 5 digits
>> after the decimal point instead of 4, so at least that's a start. Looking
>> at the RDKit code (function GetV3000MolFileAtomLine), it just writes the
>> coordinates without setting the precision, so what you get is the default
>> stringstream conversion. Here's where one could in principle adjust this
>> precision, but there's clearly no API to do so at the moment.
>>
>
> Yep. This is not currently possible without editing C++ code.
> If there is a real use case for having more than 6 sig figs for atomic
> positions (this is what is currently available), we can certainly come up
> with a way to make it happen. I don't recall having seen any real-world
> examples where that would be desirable.
>
>


Re: [Rdkit-discuss] number of significant digits in molblock?

2018-10-05 Thread Michal Krompiec
Hi Jan,
Thanks, 6 digits is OK! Forcing V3000 did the trick:
sdf_out=Chem.SDWriter(outfile)
sdf_out.SetForceV3000(True)

Best,
Michal
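
The difference between the two formats can be mimicked in plain Python without RDKit. This is a sketch under two assumptions: V2000 coordinates are fixed-width fields with 4 decimals, and the V3000 writer relies on default C++ stream formatting, which behaves like "%g" with 6 significant digits. The function names are illustrative only.

```python
def v2000_coords(x, y, z):
    """Format coordinates as three fixed-width fields of 10 characters, 4 decimals each."""
    return "".join(format(value, "10.4f") for value in (x, y, z))


def v3000_coords(x, y, z):
    """Format coordinates as default C++ stream output would (about 6 significant digits)."""
    return " ".join("%g" % value for value in (x, y, z))
```

For the position (0.123456789, 0.2, 0.3), the first function gives 0.1235 for x and the second gives 0.123457, matching the output quoted below.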

On Fri, 5 Oct 2018 at 12:59, Jan Holst Jensen  wrote:

> Hi Michal,
>
> V2000 format is restricted by its specification to fixed format with 4
> decimals. V3000 output is not restricted to a fixed format, but the current
> code still rounds it in practice as seen below.
>
> To get extra precision you could change the formatting of x, y, and z
> coordinate output in Code/GraphMol/FileParsers/MolFileWriter.cpp, function 
> GetV3000MolFileAtomLine(),
> look for the
>
> ss << " " << x << " " << y << " " << z;
>
> line. Adding extra digits to the X, Y, and Z coordinates *should* not
> cause issues for compliant V3000 readers.
>
> Cheers
> -- Jan
>
> >>> import rdkit
> >>> from rdkit import Chem
> >>> from rdkit.Chem import AllChem
> >>> m = Chem.MolFromSmiles('CC')
> >>> AllChem.Compute2DCoords(m)
> 0
> >>> m.GetConformer(0).SetAtomPosition(0, rdkit.Geometry.Point3D(0.123456789, 0.2, 0.3))
> >>> print(Chem.MolToMolBlock(m))
>
>  RDKit  2D
>
>   2  1  0  0  0  0  0  0  0  0999 V2000
>     0.1235    0.2000    0.3000 C   0  0  0  0  0  0  0  0  0  0  0  0    <== 4 decimal digits
>     0.7500   -0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
>   1  2  1  0
> M  END
>
> >>> print(Chem.MolToMolBlock(m, forceV3000=True))
>
>  RDKit  2D
>
>   0  0  0  0  0  0  0  0  0  0999 V3000
> M  V30 BEGIN CTAB
> M  V30 COUNTS 2 1 0 0 0
> M  V30 BEGIN ATOM
> M  V30 1 C 0.123457 0.2 0.3 0    <== 6 decimal digits
> M  V30 2 C 0.75 -5.55112e-17 0 0
> M  V30 END ATOM
> M  V30 BEGIN BOND
> M  V30 1 1 1 2
> M  V30 END BOND
> M  V30 END CTAB
> M  END
>
> >>>
>
> On 2018-10-05 11:42, Michal Krompiec wrote:
>
> Hello,
> Is it possible to control the number of significant digits of XYZ
> coordinates? I am modifying coordinates of my molecules
> using SetAtomPosition but when I save them into an SDF it seems that the
> precision is limited to 4 digits after the decimal point (I'd like 10
> instead...).
> Best wishes,
> Michal
>
>
>


[Rdkit-discuss] number of significant digits in molblock?

2018-10-05 Thread Michal Krompiec
Hello,
Is it possible to control the number of significant digits of XYZ
coordinates? I am modifying coordinates of my molecules
using SetAtomPosition but when I save them into an SDF it seems that the
precision is limited to 4 digits after the decimal point (I'd like 10
instead...).
Best wishes,
Michal


Re: [Rdkit-discuss] organometallics?

2018-09-14 Thread Michal Krompiec
Hi Greg,
I'm very glad, too :).
That would be great as it would be faithful to the notion of a single
coordination bond between the metal atom and the ligand (as opposed to
several coordination bonds to individual atoms of the ligand).
Best,
Michal
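
To show what reading that information could look like, here is a rough sketch in plain Python of pulling the ENDPTS list out of a V3000 bond record such as the one quoted below. It is not RDKit code. It assumes the usual CTfile convention that the first number inside the parentheses is the count of atoms that follow; parse_v3000_bond is a hypothetical helper name.

```python
import re

# "M  V30 <index> <type> <atom1> <atom2> [key=value ...]"
_BOND_RE = re.compile(r"^M\s+V30\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)(.*)$")
_ENDPTS_RE = re.compile(r"ENDPTS=\(([\d\s]+)\)")


def parse_v3000_bond(line):
    """Parse one V3000 bond line into a dict, including any ENDPTS atom list."""
    match = _BOND_RE.match(line.strip())
    if match is None:
        raise ValueError("not a V3000 bond line: %r" % line)
    index, bond_type, atom1, atom2 = (int(g) for g in match.groups()[:4])

    endpts = []
    found = _ENDPTS_RE.search(match.group(5))
    if found:
        numbers = [int(tok) for tok in found.group(1).split()]
        count, atoms = numbers[0], numbers[1:]
        # Assumed convention: the leading number is the length of the list.
        if count != len(atoms):
            raise ValueError("ENDPTS count does not match the atom list")
        endpts = atoms
    return {"index": index, "type": bond_type, "atoms": (atom1, atom2), "endpts": endpts}
```

Applied to the record from Greg's message, this yields bond 39 of type 9 between atoms 38 and 37, with atoms 20 and 21 as the endpoints the ligand attaches through.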

On Fri, 14 Sep 2018 at 11:03, Greg Landrum  wrote:

> I'm glad to hear that things work. :-)
>
> I noticed that there is information about the atoms associated with a
> linkage point in the bond records of those v3000 SDFs:
> M  V30 39 9 38 37 ENDPTS=(2 20 21) ATTACH=ALL
>
> I need to do a bit more looking, but it may be possible to add the ability
> to directly parse and interpret this information; that would make things
> easier and the drawings would be less crappy.
>
> -greg
>
>
>
> On Fri, Sep 14, 2018 at 10:55 AM Michal Krompiec <
> michal.kromp...@gmail.com> wrote:
>
>> Hi Greg,
>> Thanks, the molfile you attached shows how the problem can be solved for
>> neutral pi-alkene ligands. I just tried and you can draw this in
>> MarvinSketch in KNIME, and pass (as MOL) to RDKit nodes without problems!
>> pi-allyl and cyclopentadienyl also worked when I drew them as Lewis
>> structures with an explicit negative charge and coordinate bonds to all
>> carbon atoms (see attached). Their 2D depiction perhaps isn't beautiful but
>> it is not a problem for me.
>> Best,
>> Michal
>>
>> On Fri, 14 Sep 2018 at 08:07, Greg Landrum 
>> wrote:
>>
>>> That's more or less what the current code does: dative bonds from an
>>> atom to a metal do not affect the perceived valence of the source atom:
>>>
>>> In [13]: m = Chem.MolFromSmiles('[Fe]<-N(C)(C)C')
>>>
>>> In [14]: m.Debug()
>>> Atoms:
>>> 0 26 Fe chg: 0  deg: 1 exp: 1 imp: 0 hyb: 4 arom?: 0 chi: 0
>>> 1 7 N chg: 0  deg: 4 exp: 3 imp: 0 hyb: 5 arom?: 0 chi: 0
>>> 2 6 C chg: 0  deg: 1 exp: 1 imp: 3 hyb: 4 arom?: 0 chi: 0
>>> 3 6 C chg: 0  deg: 1 exp: 1 imp: 3 hyb: 4 arom?: 0 chi: 0
>>> 4 6 C chg: 0  deg: 1 exp: 1 imp: 3 hyb: 4 arom?: 0 chi: 0
>>> Bonds:
>>> 0 1->0 order: 17 conj?: 0 aromatic?: 0
>>> 1 1->2 order: 1 conj?: 0 aromatic?: 0
>>> 2 1->3 order: 1 conj?: 0 aromatic?: 0
>>> 3 1->4 order: 1 conj?: 0 aromatic?: 0
>>>
>>>
>>> For what it's worth, if you draw coordinate bonds from atoms to the
>>> metal in the MOL file, you get something sensible back from the RDKit.
>>> I've attached a tweaked version of one of Michal's example files showing
>>> how to do this.
>>>
>>> Dealing with the dummy atoms directly is tricky because we'd need to
>>> figure out which atoms they are associated with. I think that there's a way
>>> to do it, but that's not handled in the .mol file you sent
>>>
>>> Best,
>>> -greg
>>>
>>>
>>>
>>>
>>> On Thu, Sep 13, 2018 at 6:51 PM Maciek Wójcikowski <
>>> mac...@wojcikowski.pl> wrote:
>>>
>>>> I would suggest that all coordination bonds to metal that exceed the
>>>> accepted valence of an atom could be marked as zero-order. This is what
>>>> happens in the recent PDB reader changes, and it fixed a lot of problems
>>>> with sanitization.
>>>> 
>>>> Pozdrawiam,  |  Best regards,
>>>> Maciek Wójcikowski
>>>> mac...@wojcikowski.pl
>>>>
>>>>
>>>> On Thu, 13 Sep 2018 at 18:16, Jan Halborg Jensen wrote:
>>>>
>>>>> Here’s a modest step in the right direction
>>>>> https://www.wildcardconsulting.dk/useful-information/how-to-solve-problems-with-coordinate-bonds-in-rdkit/
>>>>>
>>>>> Best regards, Jan
>>>>>
>>>>> On 13 Sep 2018, at 15:14, Greg Landrum  wrote:
>>>>>
>>>>> Hi Michal,
>>>>>
>>>>> Though the RDKit theoretically has many of the infrastructure pieces
>>>>> required to handle organometallics (though there's not a lot you can do
>>>>> with them once you've loaded them), the difficult part almost always ends
>>>>> up being finding input files that have reasonably machine-readable
>>>>> structures in them.
>>>>>
>>>>> If you have some examples you can share, I'd be happy to take a look
>>>>> to see if I can suggest ways to read them in.
>>>>>
>>>>> Best,
>>>>> -greg
>>>>

Re: [Rdkit-discuss] organometallics?

2018-09-13 Thread Michal Krompiec
... and yet another example, bis-mu-dichloro-bis(allyl)dipalladium(II),
drawn according to ChemAxon's instructions:
https://docs.chemaxon.com/display/docs/How+to+draw+coordination+compounds

Michal

On Thu, 13 Sep 2018 at 14:45, Michal Krompiec 
wrote:

> Hi Greg,
> Thanks for your fast reply. I've got two examples of ferrocene MOLfiles
> generated by MarvinSketch in KNIME: one from ferrocene.cdxml (found
> somewhere in the rdkit github repo), the other from the ferrocene template
> in Marvin. But actually they are almost the same.
> The third example is Pd(dba)2 (also from Marvin's template library). As
> you can see, the attachment is made via a dummy atom placed where the bond
> is drawn (middle of Cp ring or middle of double bond).
> Best regards,
> Michal
>
>
> On Thu, 13 Sep 2018 at 14:14, Greg Landrum  wrote:
>
>> Hi Michal,
>>
>> Though the RDKit theoretically has many of the infrastructure pieces
>> required to handle organometallics (though there's not a lot you can do
>> with them once you've loaded them), the difficult part almost always ends
>> up being finding input files that have reasonably machine-readable
>> structures in them.
>>
>> If you have some examples you can share, I'd be happy to take a look to
>> see if I can suggest ways to read them in.
>>
>> Best,
>> -greg
>>
>>
>> On Wed, Sep 12, 2018 at 10:30 PM Michal Krompiec <
>> michal.kromp...@gmail.com> wrote:
>>
>>> Hello,
>>> I've been asked to analyze a dataset of organometallic compounds
>>> (provided in SDF), but it turns out that most of them are not compatible
>>> with RDKit (due to having pi-alkene, pi-allyl, cyclopentadienyl et al.
>>> ligands). The structures can be correctly represented in Marvin, though.
>>> Can anybody point me to a toolkit (or RDKit hack) that can handle these?
>>> Best,
>>> Michal
>>>
>>> 
>>> Dr. Michal Krompiec
>>> Adjunct Professor
>>> School of Chemistry, University of Southampton
>>> Highfield, Southampton SO17 1BJ, UK
>>>
>>> and
>>> Head of Computational Modelling | Performance Materials | Early Research
>>> and Business Development
>>> Merck
>>


Pd_allyl_chloride_dimer.mol
Description: Binary data


Re: [Rdkit-discuss] organometallics?

2018-09-13 Thread Michal Krompiec
Hi Greg,
Thanks for your fast reply. I've got two examples of ferrocene MOLfiles
generated by MarvinSketch in KNIME: one from ferrocene.cdxml (found
somewhere in the rdkit github repo), the other from the ferrocene template
in Marvin. But actually they are almost the same.
The third example is Pd(dba)2 (also from Marvin's template library). As you
can see, the attachment is made via a dummy atom placed where the bond is
drawn (middle of Cp ring or middle of double bond).
Best regards,
Michal


On Thu, 13 Sep 2018 at 14:14, Greg Landrum  wrote:

> Hi Michal,
>
> Though the RDKit theoretically has many of the infrastructure pieces
> required to handle organometallics (though there's not a lot you can do
> with them once you've loaded them), the difficult part almost always ends
> up being finding input files that have reasonably machine-readable
> structures in them.
>
> If you have some examples you can share, I'd be happy to take a look to
> see if I can suggest ways to read them in.
>
> Best,
> -greg
>
>
> On Wed, Sep 12, 2018 at 10:30 PM Michal Krompiec <
> michal.kromp...@gmail.com> wrote:
>
>> Hello,
>> I've been asked to analyze a dataset of organometallic compounds
>> (provided in SDF), but it turns out that most of them are not compatible
>> with RDKit (due to having pi-alkene, pi-allyl, cyclopentadienyl et al.
>> ligands). The structures can be correctly represented in Marvin, though.
>> Can anybody point me to a toolkit (or RDKit hack) that can handle these?
>> Best,
>> Michal
>>
>> 
>> Dr. Michal Krompiec
>> Adjunct Professor
>> School of Chemistry, University of Southampton
>> Highfield, Southampton SO17 1BJ, UK
>>
>> and
>> Head of Computational Modelling | Performance Materials | Early Research
>> and Business Development
>> Merck
>


fecp_from_mrv.sdf
Description: Binary data


pd_dba2.mol
Description: Binary data


fecp_from_cdx.sdf
Description: Binary data


[Rdkit-discuss] organometallics?

2018-09-12 Thread Michal Krompiec
Hello,
I've been asked to analyze a dataset of organometallic compounds (provided
in SDF), but it turns out that most of them are not compatible with RDKit
(due to having pi-alkene, pi-allyl, cyclopentadienyl et al. ligands). The
structures can be correctly represented in Marvin, though. Can anybody
point me to a toolkit (or RDKit hack) that can handle these?
Best,
Michal


Dr. Michal Krompiec
Adjunct Professor
School of Chemistry, University of Southampton
Highfield, Southampton SO17 1BJ, UK

and
Head of Computational Modelling | Performance Materials | Early Research
and Business Development
Merck


Re: [Rdkit-discuss] UFF/MMFF atom types

2018-08-10 Thread Michal Krompiec
Hi Paolo,
Has anything changed (re adding new atom types to UFF or MMFF) since then?
Best,
Michal

On Tue, 5 Nov 2013 at 06:56, Paolo Tosco  wrote:

> Hi both,
>
> now I realize that yesterday I replied to Michal and not to the list;
> sorry about that. Adding the option to force an atom type replacement
> wouldn't be too much work, but would not be ideal because in the case of
> selenium you would get, for instance, the same VdW radius and equilibrium
> bond distances as for sulfur, which would both be too short. That could
> also be handled upstream of the atom typing by replacing Se with S and
> putting Se back downstream, as I suggested yesterday to Michal, but that is
> again a bit of a hack. Probably a better solution would be to allow the
> user to provide some new parameters (as for UFF, adding Python support) and
> fall back to a related atom type (sulfur, in this case) for the missing
> ones. I'll look into that during the next few days and let you know.
>
> Best,
> p.
>
> --
> ==
> Paolo Tosco, Ph.D.
> Department of Drug Science and Technology
> Via Pietro Giuria, 9 - 10125 Torino (Italy)
> Tel: +39 011 670 7680 | Mob: +39 348 5537206
> Fax: +39 011 670 7687 | E-mail: paolo.to...@unito.it
> http://open3dqsar.org | http://open3dalign.org
> ==
>
>
>
> On 5 Nov 2013, at 07:20, Greg Landrum  wrote:
>
> Hi Michal,
>
>
> On Mon, Nov 4, 2013 at 11:45 AM, Michal Krompiec <
> michal.kromp...@gmail.com> wrote:
>
>> Hello,
>> Is Se defined in UFF and/or MMFF94? Apparently, molecules with
>> selenophene moieties don't optimize in RDKit, and a warning appears in
>> the log: UFFTYPER: Unrecognized atom type: Se2+2
>>
>
> Right. UFF Parameters are there for sp3 Se ("Se3+2"), but not for the sp2
> version.
>
> There are no MMFF94 parameters for Se.
>
> Is it possible to define/modify the force field by hand? (for example,
>> use the parameters of S for Se)
>>
>
> If you are working from C++, you can provide UFF parameters when you build
> the force field, but it's not currently possible to do so from Python. It's
> probably possible to add an option to the python UFF code to allow you to
> provide a new atom type (or to over-ride parameters for an existing atom
> type); I'd have to look into that.
> In the meantime, the quickest thing you could do would be to modify
> $RDBASE/Code/ForceField/UFF/Params.cpp to add the atom type you want and
> then to rebuild the RDKit.
>
> I guess that adding new atom types to MMFF94S is considerably more
> complex. Here one could imagine providing an interface to explicitly set
> the type of an atom to another existing atom type. Paolo would have to let
> us know how much work that is.
>
> -greg
>
>


Re: [Rdkit-discuss] RDKit and Google Summer of Code 2018

2018-01-17 Thread Michal Krompiec
+1 vote for Symmetrizer. It would be very useful for preparing input for
computational chemistry codes.
Best,
Michal Krompiec
Merck KGaA

On Mon, 15 Jan 2018 at 15:21, Jason Biggs <jasondbi...@gmail.com> wrote:

>
>- I've had this on my to-do list for a few months now, implementing
>the algorithm described in this paper.  I think the force-field energy
>minimization routines already present in the RDKit can be utilized for this
>pretty easily.  The only part that I don't think is set up already would be
>applying a constant force to all atoms to force them into the xy plane.
>
> Frączek, T., "Simulation-Based Algorithm for Two-Dimensional Chemical
> Structure Diagram Generation of Complex Molecules and Ligand–Protein
> Interactions." J. Chem. Inf. Model. 2016, 56, 2320-2335, DOI:
> 10.1021/acs.jcim.6b00391.
>
>
>
>- Another idea would be to add in point-group symmetry detection.  I'm
>using the Symmetrizer java library, described here
>https://www.ncbi.nlm.nih.gov/pubmed/22549414, and pretty happy with it
>overall.  One could re-implement it in C++, or include the jar in the
>External folder and write python wrappers.
>
>
> Jason Biggs
>
>
> On Mon, Jan 15, 2018 at 1:09 AM, Greg Landrum <greg.land...@gmail.com>
> wrote:
>
>> Dear all,
>>
>> We've been invited again to participate in the OpenChemistry application
>> for Google Summer of Code.
>>
>> In order to participate we need ideas for projects and mentors to go
>> along with them.
>>
>> The current list of RDKit ideas is being maintained here:
>> http://wiki.openchemistry.org/GSoC_Ideas_2018#RDKit_Project_Ideas
>>
>> (Note: at the point that I'm pressing "send", that's still a copy of last
>> year's project ideas).
>>
>> If you're willing to be a mentor (please ask me about the ~5 hours/week
>> required here) or have ideas, please reply to this thread.
>>
>> Best,
>> -greg
>>
>>


Re: [Rdkit-discuss] HasSubstructMatch doesn't work as expected

2017-09-13 Thread Michal Krompiec
I'm afraid it won't work in the general case (i.e. you can make it work for
some classes of compounds, but not without unwanted side effects on others)
if the aromaticity model of the other cartridge is different - and it seems
to be the case here...

On Wednesday, 13 September 2017, Michał Nowotka  wrote:

> OK, so what I have is some substructure results from another (non-rdkit)
> cartridge and I want to use rdkit to generate images of all results
> with the query substructure highlighted and aligned.
> So I have two things: a list of compounds and a query compound.
> Now I need to highlight the query compound for every compound from the
> list and I need to do it at all costs. I can't leave any compound not
> highlighted even if rdkit by default has a different opinion on whether
> the query compound really is a true substructure of a given compound.
>
> So how can I instruct rdkit to ignore aromaticity and other factors,
> preferably one by one, each time going one level deeper where the last
> resort would be simply matching on the level of two planar graphs. Is
> that possible?
>
> On Wed, Sep 13, 2017 at 4:48 PM, Peter S. Shenkin wrote:
> > Your course of action depends upon just what you are really trying to do.
> > If it's only aspirin, then why wouldn't you just do it manually? If it
> > goes beyond aspirin, you have to start by defining in general terms
> > exactly what you want to match to what.
> >
> > For example, given a query molecule (aspirin in this case), if you want
> > all its non-aromatic atoms to match aromatic as well as non-aromatic
> > atoms in the database, you could write a string-alteration routine to
> > munge the SMILES of a query molecule into a SMARTS that would do just
> > that, and then use that SMARTS to match your database molecules. Repeat
> > for each query molecule.
> >
> > But you have to start with a precise definition of just what kind of
> > matching you wish to do. For instance, maybe you don't really want
> > non-aromatic ring atoms in your query to match aromatic rings and vice
> > versa (i.e., a cyclohexyl to match a phenyl); maybe you only want
> > non-ring atoms in the query to match aliphatic as well as aromatic
> > substructures. And so on.
> >
> > -P.
> >
> >
> > On Wed, Sep 13, 2017 at 10:42 AM, Michał Nowotka wrote:
> >>
> >> Is there any flag in RDkit to match both 'normal' aspirin and embedded
> >> aromatic analogues?
> >> The problem is that I can't modify user queries by hand in real time :)
> >>
> >> On Wed, Sep 13, 2017 at 2:12 PM, Chris Earnshaw wrote:
> >> > Hi
> >> >
> >> > The problem is due to RDKit perceiving the embedded pyranone in
> >> > CHEMBL1999443 as an aromatic system, which is probably correct.
> >> > However, in the structure of aspirin the carboxyl carbon and singly
> >> > bonded oxygen are non-aromatic, so if you just use the SMILES of
> >> > aspirin as a query it won't match CHEMBL1999443.
> >> >
> >> > You'll need to use a slightly more generic aspirin-like query to
> >> > allow the possibility of matching both 'normal' aspirin and embedded
> >> > aromatic analogues. CC(=O)Oc1ccccc1[#6](=O)[#8] should work OK.
> >> >
> >> > Regards,
> >> > Chris
> >> >
> >> > On 13 September 2017 at 13:40, Michał Nowotka wrote:
> >> >>
> >> >> Hi,
> >> >>
> >> >> This problem is probably due to my lack of chemistry knowledge but
> >> >> please have a look:
> >> >>
> >> >> If I do a substructure search in ChEMBL using aspirin (CHEMBL25) as a
> >> >> query (ChEMBL API uses the Symyx cartridge):
> >> >>
> >> >> from chembl_webresource_client.new_client import new_client
> >> >> res = new_client.substructure.filter(chembl_id='CHEMBL25')
> >> >>
> >> >> One of the results will be CHEMBL1999443:
> >> >>
> >> >> 'CHEMBL1999443' in (r['molecule_chembl_id'] for r in res)
> >> >> >>> True
> >> >>
> >> >> Now I take the molfile:
> >> >>
> >> >> new_client.molecule.set_format('mol')
> >> >> mol = new_client.molecule.get('CHEMBL1999443')
> >> >>
> >> >> and load it with aspirin into rdkit:
> >> >>
> >> >> from rdkit import Chem
> >> >> m = Chem.MolFromMolBlock(mol)
> >> >> pattern = Chem.MolFromMolBlock(new_client.molecule.get('CHEMBL25'))
> >> >>
> >> >> If I check if it has an aspirin as a substructure using rdkit, I'm
> >> >> getting false...
> >> >>
> >> >> m.HasSubstructMatch(pattern)
> >> >> >>> False
> >> >>
> >> >> Looking at this blog post:
> >> >>
> >> >>
> >> >> https://github.com/rdkit/rdkit-tutorials/blob/master/notebooks/002_SMARTS_SubstructureMatching.ipynb
> >> >> I tried to initialize rings and retry:
> >> >>
> >> >>  Chem.GetSymmSSSR(m)
> >> >>  m.HasSubstructMatch(pattern)
> >> >>  >>>False
> >> >>
> >> >> Chem.GetSymmSSSR(pattern)
> >> >> m.HasSubstructMatch(pattern)
> >> >> 
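
The difference between the plain aspirin query and the more generic
query suggested in the thread above can be sketched as follows. The
"analogue" here is a made-up molecule chosen only because its carboxyl
carbon and oxygen sit inside a fused pyranone ring; it is not
CHEMBL1999443, and the variable names are illustrative assumptions.

```python
from rdkit import Chem

# Aspirin itself, plus a hypothetical analogue (NOT CHEMBL1999443) whose
# carboxyl C and O are members of a fused, aromatic pyranone ring.
aspirin = Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O")
analogue = Chem.MolFromSmiles("CC(=O)Oc1cccc2ccoc(=O)c12")

# A molecule built from SMILES matches bond types strictly, so its single
# bonds will not match aromatic ring bonds in the target.
strict_query = Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O")

# In SMARTS, [#6] and [#8] match aliphatic or aromatic atoms, and an
# unspecified bond means "single or aromatic".
generic_query = Chem.MolFromSmarts("CC(=O)Oc1ccccc1[#6](=O)[#8]")

for name, mol in (("aspirin", aspirin), ("analogue", analogue)):
    print(name,
          "strict:", mol.HasSubstructMatch(strict_query),
          "generic:", mol.HasSubstructMatch(generic_query))
```

Under these assumptions the strict query should only find aspirin,
while the generic SMARTS should find both molecules.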

Re: [Rdkit-discuss] Problem Installing RDKit

2017-08-17 Thread Michal Krompiec
Dear Stephen,
You have installed RDKit in the environment my-rdkit-env, so you need to
activate that environment in order to use it (btw, it seems that you actually
installed it in "mr-rdkit-env").
Alternatively, you can install RDKit in the default environment:
conda install rdkit -c rdkit

Best wishes,
Michal
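
A quick way to check which interpreter a script (or an IDE such as
Spyder) is really running, and whether that interpreter can see rdkit,
is sketched below. It uses only the standard library; the helper name
describe_interpreter is made up for illustration.

```python
import importlib.util
import sys


def describe_interpreter(module_name="rdkit"):
    """Report the running interpreter and whether it can import a module."""
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,   # path of the python binary in use
        "prefix": sys.prefix,           # environment root (e.g. a conda env)
        "module_found": spec is not None,
        "module_location": spec.origin if spec is not None else None,
    }


for key, value in describe_interpreter().items():
    print("{}: {}".format(key, value))
```

If "prefix" does not point at the environment where rdkit was
installed, the import will fail no matter how the installation went.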

On 17 August 2017 at 13:00, Stephen P. Molnar 
wrote:

> I have installed Spyder3.2.1 in the current version of Miniconda3 on my
> Debian v-9.1.0 64 bit Linux platform.  Spyder is performing well, but I am
> having difficulty installing the RDKit.
>
> I followed the directions in "RDKit_Docs_current.pdf":
>
> How to install RDKit with Conda
> Creating a new conda environment with the RDKit installed using these
> packages requires one single command similar to the following::
> $ conda create -c rdkit -n my-rdkit-env rdkit
> Finally, the new environment must be activated, so that the corresponding
> python interpreter becomes available in the same shell:
> $ source activate my-rdkit-env
>
> There were no warning or error messages during the installation, but when
> I attempt running a simple Python script:
>
> #!/usr/bin/env python3
> # -*- coding: utf-8 -*-
> """
> Created on Tue Aug 15 11:41:24 2017
>
> @author: comp
> """
>
> from __future__ import print_function
> from rdkit import Chem
>
> m = Chem.MolFromSmiles('Cc1ccccc1')
> m
>
> I get:
>
> IPython 6.1.0 -- An enhanced Interactive Python.
>
> runfile('/home/comp/Apps/Python/untitled0.py',
> wdir='/home/comp/Apps/Python')
> Traceback (most recent call last):
>
>   File "", line 1, in 
> runfile('/home/comp/Apps/Python/untitled0.py',
> wdir='/home/comp/Apps/Python')
>
>   File "/home/comp/Apps/miniconda3/lib/python3.6/site-packages/spyder/utils/site/sitecustomize.py", line 688, in runfile
> execfile(filename, namespace)
>
>   File "/home/comp/Apps/miniconda3/lib/python3.6/site-packages/spyder/utils/site/sitecustomize.py", line 101, in execfile
> exec(compile(f.read(), filename, 'exec'), namespace)
>
>   File "/home/comp/Apps/Python/untitled0.py", line 10, in 
> from rdkit import Chem
>
> ModuleNotFoundError: No module named 'rdkit'
>
> RDKit is installed in ~/miniconda3/envs/mr-rdkit-env
>
> Unfortunately, I have no clue as to what the problem(s) may be; assistance
> will be much appreciated.
>
> Thanks in advance.
> --
> Stephen P. Molnar, Ph.D.Life is a fuzzy set
> www.molecular-modeling.net  Stochastic and multivariate
> (614)312-7528 (c)
> Skype: smolnar1
>
> 
> --
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> ___
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Jupyter Could not find environment: my-rdkit-env

2017-06-29 Thread Michal Krompiec
Hi Germano,
You can also install rdkit in the default environment:
conda install -c rdkit rdkit

Best,
Michal


On 29 June 2017 at 14:22, Germano Massullo 
wrote:

> Hi there, I am having trouble getting Jupyter to use rdkit
> on Fedora 25.
> I installed Anaconda 4.4.0,
> then, following [1], I ran
>
> $ conda create -c rdkit -n my-rdkit-env rdkit
> $ source activate my-rdkit-env
>
> then
>
> $ jupyter-notebook --ip foo_ip --port 8890
>
> then from jupyter notebook
>
> from rdkit import Chem
> from rdkit.Chem import AllChem
> from rdkit.Chem import Descriptors
>
> but those imports return
> 
> CondaEnvironmentNotFoundError: Could not find environment: my-rdkit-env .
> You can list all discoverable environments with ``.
> 
>
> running
> conda info --envs
> I get
> # conda environments:
> #
> my-rdkit-env          *  /home/user/.conda/envs/my-rdkit-env
> root                     /opt/anaconda3
>
>
> Thank you very much
>
> [1]: http://www.rdkit.org/docs/Install.html
>
>
>
>
> 
> ___
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] How to match any halogen of a structure with any halogen of a substructure?

2017-05-17 Thread Michal Krompiec
Hi Alexis,
Try writing the SMARTS in aromatic form instead of Kekulé notation.
Best,
Michal
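
To illustrate the suggestion, here is a minimal sketch comparing the
Kekulé SMARTS from the question with an aromatic rewrite of it. The
aromatic SMARTS is my own translation of the original pattern, not
something posted in the thread.

```python
from rdkit import Chem

smiles = "ClC1=CC(C2NCCOC2)=C(C=CC=C3)C3=C1"

# Original pattern: uppercase C means *aliphatic* carbon and "=" means a
# double bond, neither of which exists in the aromatized molecule.
kekule_smarts = "[F,Cl,Br,I]C1=CC(C2[N,O,S]CC[N,O,S]C2)=CC=C1"

# Assumed rewrite: lowercase c for the ring atoms, no explicit ring bonds.
aromatic_smarts = "[F,Cl,Br,I]c1cc(C2[N,O,S]CC[N,O,S]C2)ccc1"

mol = Chem.MolFromSmiles(smiles)  # RDKit perceives the rings as aromatic
kekule_hit = mol.HasSubstructMatch(Chem.MolFromSmarts(kekule_smarts))
aromatic_hit = mol.HasSubstructMatch(Chem.MolFromSmarts(aromatic_smarts))

print("Kekule SMARTS matches:  ", kekule_hit)
print("aromatic SMARTS matches:", aromatic_hit)
```

Note that the molecule is always the object being searched and the
SMARTS pattern is the argument to HasSubstructMatch, not the other way
round.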

On 17 May 2017 at 12:55, Alexis Parenty 
wrote:

> Hi everyone,
>
> I am looking for a substructure match between a SMARTS and a SMILES, but I
> want any heteroatom in the SMARTS to match any heteroatom in the SMILES:
>
>
> [image: Inline images 1]
>
>
>
>
>
> The following does not return what I would expect:
>
> smarts1 = "[F,Cl,Br,I]C1=CC(C2[N,O,S]CC[N,O,S]C2)=CC=C1"
> smiles2 = "ClC1=CC(C2NCCOC2)=C(C=CC=C3)C3=C1"
>
> mol1 = Chem.MolFromSmarts(smarts1)
> mol2 = Chem.MolFromSmiles(smiles2)
> print("mol1 is a substructure of mol2: {}".format(mol2.HasSubstructMatch(mol1)))
> print("mol2 is a substructure of mol1: {}".format(mol1.HasSubstructMatch(mol2)))
>
>
>
> => mol1 is a substructure of mol2: False
>
> => mol2 is a substructure of mol1: False
>
> How could I do that?
>
>
>
> Thanks,
>
>
>
> Alexis
>
>
>>
> 
> ___
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
>
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] problems with installation on conda with python 3.5 32-bit

2017-02-22 Thread Michal Krompiec
Hi Greg,
Thanks a lot, your help is much appreciated!
Thanks and kind regards,
Michal

On 22 February 2017 at 07:04, Greg Landrum <greg.land...@gmail.com> wrote:

> Michal,
>
> This morning I did a win32 build of the RDKit with python 3.5 and pushed
> it to anaconda. You should (hopefully) be able to do "conda install -c
> rdkit rdkit" now and have things work. I will try python 2.7 tomorrow.
>
> It turns out that this isn't much extra effort, so assuming this build
> works I should be able to keep doing these.
>
> -greg
>
>
> On Mon, Feb 20, 2017 at 9:33 PM, Michal Krompiec <
> michal.kromp...@gmail.com> wrote:
>
>> Hi Greg,
>> Thanks for your reply. Actually, >50% of my (prospective) users are stuck
>> on 32-bit. It would be really nice to have a python3 build (even once a
>> year) but I understand that the demand is low and waning. I guess the
>> solution is to use python2.7 for the time being...
>> Thanks and kind regards,
>> Michal
>>
>> On Monday, 20 February 2017, Greg Landrum <greg.land...@gmail.com> wrote:
>>
>>> Hi Michal,
>>>
>>> We've only ever done python2.7 builds for win32 and we stopped doing
>>> those with the 2016.03 release.
>>> I will have to check, but I think I probably can start doing these
>>> again, but I'm reluctant due to the amount of effort required.
>>> How many users do you need to support who are stuck on 32bit machines?
>>>
>>> -greg
>>>
>>>
>>> On Mon, Feb 20, 2017 at 2:18 PM, Michal Krompiec <
>>> michal.kromp...@gmail.com> wrote:
>>>
>>>> Hello,
>>>> I can't install rdkit on anaconda with 32-bit python3 on Windows 7.
>>>>
>>>> When I try "the usual", conda tries to install python2.7 into the
>>>> environment:
>>>>
>>>> >conda create -c rdkit -n my-rdkit-env rdkit
>>>> Fetching package metadata .
>>>> Solving package specifications: .
>>>> Package plan for installation in environment
>>>> C:\Anaconda3_32\envs\my-rdkit-env:
>>>> The following NEW packages will be INSTALLED:
>>>> boost:  1.56.0-py27_3 rdkit
>>>> bzip2:  1.0.6-vc9_3 [vc9]
>>>> mkl:2017.0.1-0
>>>> numpy:  1.11.3-py27_0
>>>> pip:9.0.1-py27_1
>>>> python: 2.7.13-0
>>>> rdkit:  2016.03.1-np111py27_1 rdkit
>>>> setuptools: 27.2.0-py27_1
>>>> vs2008_runtime: 9.00.30729.5054-0
>>>> wheel:  0.29.0-py27_0
>>>> zlib:   1.2.8-vc9_3 [vc9]
>>>>
>>>> If I create an empty environment, load python 3.5 into it and try
>>>> installing rdkit, I get an error:
>>>>
>>>> >conda create -n my-rdkit-env python=3.5
>>>> Fetching package metadata ...
>>>> Solving package specifications: .
>>>> Package plan for installation in environment
>>>> C:\Anaconda3_32\envs\my-rdkit-env:
>>>> The following NEW packages will be INSTALLED:
>>>> pip:9.0.1-py35_1
>>>> python: 3.5.2-0
>>>> setuptools: 27.2.0-py35_1
>>>> vs2015_runtime: 14.0.25123-0
>>>> wheel:  0.29.0-py35_0
>>>> Proceed ([y]/n)?
>>>> #
>>>> # To activate this environment, use:
>>>> # > activate my-rdkit-env
>>>> #
>>>> # To deactivate this environment, use:
>>>> # > deactivate my-rdkit-env
>>>> #
>>>> # * for power-users using bash, you must source
>>>> #
>>>>
>>>> >conda install --name my-rdkit-env -f --channel
>>>> https://conda.anaconda.org/rdkit rdkit
>>>> Fetching package metadata .
>>>> Solving package specifications: .
>>>>
>>>> UnsatisfiableError: The following specifications were found to be in
>>>> conflict:
>>>>   - python 3.5*
>>>>   - rdkit -> python 2.7*
>>>> Use "conda info " to see the dependencies for each package.
>>>>
>>>>
>>>> I managed to install rdkit without any problems on the same machine in
>>>> 64-bit anaconda with python3.5, but I need a separate 32-bit build to
>>>> support users with 32-bit machines. Any help will be appreciated.
>>>>
>>>> Thanks and best regards,
>>>>
>>>> Michal
>>>>
>>>>
>>>> 
>>>> ___
>>>> Rdkit-discuss mailing list
>>>> Rdkit-discuss@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>>>
>>>>
>>>
>
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] problems with installation on conda with python 3.5 32-bit

2017-02-20 Thread Michal Krompiec
Hi Greg,
Thanks for your reply. Actually, >50% of my (prospective) users are stuck
on 32-bit. It would be really nice to have a python3 build (even once a
year) but I understand that the demand is low and waning. I guess the
solution is to use python2.7 for the time being...
Thanks and kind regards,
Michal

On Monday, 20 February 2017, Greg Landrum <greg.land...@gmail.com> wrote:

> Hi Michal,
>
> We've only ever done python2.7 builds for win32 and we stopped doing those
> with the 2016.03 release.
> I will have to check, but I think I probably can start doing these again,
> but I'm reluctant due to the amount of effort required.
> How many users do you need to support who are stuck on 32bit machines?
>
> -greg
>
>
> On Mon, Feb 20, 2017 at 2:18 PM, Michal Krompiec <
> michal.kromp...@gmail.com> wrote:
>
>> Hello,
>> I can't install rdkit on anaconda with 32-bit python3 on Windows 7.
>>
>> When I try "the usual", conda tries to install python2.7 into the
>> environment:
>>
>> >conda create -c rdkit -n my-rdkit-env rdkit
>> Fetching package metadata .
>> Solving package specifications: .
>> Package plan for installation in environment
>> C:\Anaconda3_32\envs\my-rdkit-env:
>> The following NEW packages will be INSTALLED:
>> boost:  1.56.0-py27_3 rdkit
>> bzip2:  1.0.6-vc9_3 [vc9]
>> mkl:2017.0.1-0
>> numpy:  1.11.3-py27_0
>> pip:9.0.1-py27_1
>> python: 2.7.13-0
>> rdkit:  2016.03.1-np111py27_1 rdkit
>> setuptools: 27.2.0-py27_1
>> vs2008_runtime: 9.00.30729.5054-0
>> wheel:  0.29.0-py27_0
>> zlib:   1.2.8-vc9_3 [vc9]
>>
>> If I create an empty environment, load python 3.5 into it and try
>> installing rdkit, I get an error:
>>
>> >conda create -n my-rdkit-env python=3.5
>> Fetching package metadata ...
>> Solving package specifications: .
>> Package plan for installation in environment
>> C:\Anaconda3_32\envs\my-rdkit-env:
>> The following NEW packages will be INSTALLED:
>> pip:9.0.1-py35_1
>> python: 3.5.2-0
>> setuptools: 27.2.0-py35_1
>> vs2015_runtime: 14.0.25123-0
>> wheel:  0.29.0-py35_0
>> Proceed ([y]/n)?
>> #
>> # To activate this environment, use:
>> # > activate my-rdkit-env
>> #
>> # To deactivate this environment, use:
>> # > deactivate my-rdkit-env
>> #
>> # * for power-users using bash, you must source
>> #
>>
>> >conda install --name my-rdkit-env -f --channel
>> https://conda.anaconda.org/rdkit rdkit
>> Fetching package metadata .
>> Solving package specifications: .
>>
>> UnsatisfiableError: The following specifications were found to be in
>> conflict:
>>   - python 3.5*
>>   - rdkit -> python 2.7*
>> Use "conda info " to see the dependencies for each package.
>>
>>
>> I managed to install rdkit without any problems on the same machine in
>> 64-bit anaconda with python3.5, but I need a separate 32-bit build to
>> support users with 32-bit machines. Any help will be appreciated.
>>
>> Thanks and best regards,
>>
>> Michal
>>
>>
>> 
>> ___
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>>
>
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] UpdatePropertyCache() after RunReactants

2017-01-12 Thread Michal Krompiec
You need to sanitize the products: just run Chem.SanitizeMol on each
molecule. See
http://www.rdkit.org/docs/GettingStartedInPython.html#chemical-reactions :
"the molecules that are produced by the chemical reaction processing code
are not sanitized".

Best,
Michal
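
As a minimal sketch of the above (also touching on the duplicate-product
question further down the thread): flatten the nested result of
RunReactants, sanitize each product, and keep one molecule per canonical
SMILES. The click-reaction SMARTS is cut off in the quoted message, so
this example uses a simple aromatic fluorination instead; the helper
name unique_sanitized_products is hypothetical.

```python
from itertools import chain

from rdkit import Chem
from rdkit.Chem import AllChem


def unique_sanitized_products(rxn, reactants):
    """Return {canonical SMILES: product} with every product sanitized."""
    seen = {}
    # RunReactants returns a tuple of tuples: one inner tuple per match,
    # holding one molecule per product template.
    for product in chain.from_iterable(rxn.RunReactants(reactants)):
        Chem.SanitizeMol(product)  # products come back unsanitized
        seen.setdefault(Chem.MolToSmiles(product), product)
    return seen


# Stand-in reaction (not the click reaction from the thread):
# replace an aromatic C-H with C-F.
rxn = AllChem.ReactionFromSmarts("[cH&$(c(c)c):2]>>[c:2][F]")
benzene = Chem.MolFromSmiles("c1ccccc1")

products = unique_sanitized_products(rxn, (benzene,))
print(sorted(products))
```

Every aromatic C-H of benzene matches the template, so the raw output
holds the same fluorobenzene several times; keying on canonical SMILES
collapses those duplicates.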

On 12 January 2017 at 17:22, Curt Fischer  wrote:

> What makes you think the molecules are nonsensical?  They look OK to me.
> Converting to SMILES before doing any UpdatePropertyCache() stuff
>
>
>
> products_tuples = copper_click.RunReactants((diyne, azide))
> products = list(chain(*products_tuples))
> print [Chem.MolToSmiles(prod) for prod in products]
>
> gives
>
>> ['C#CC(O)Cc1cnnn1CCC', 'C#CC(O)Cc1cn(CCC)nn1', 'C#CCC(O)c1cn(CCC)nn1',
>> 'C#CCC(O)c1cnnn1CCC']
>
>
> ...and those all look like valid SMILES strings to me.
>
> I'm not sure exactly how to turn off all sanitization, but I did
>
> Draw.MolsToGridImage(products, kekulize = False)
>
> and as long as that is invoked before UpdatePropertyCache, there is a
> *different* error than the one I reported last time.
>
> ---------------------------------------------------------------------------
> RuntimeError                              Traceback (most recent call last)
> in ()
>       7 print [Chem.MolToSmiles(prod) for prod in products]
>       8
> ----> 9 Draw.MolsToGridImage(products, kekulize = False)
>
> /Users/curt/anaconda2/lib/python2.7/site-packages/rdkit/Chem/Draw/IPythonConsole.pyc in ShowMols(mols, **kwargs)
>     198   else:
>     199     fn = Draw.MolsToGridImage
> --> 200   res = fn(mols, **kwargs)
>     201   if kwargs['useSVG']:
>     202     return SVG(res)
>
> /Users/curt/anaconda2/lib/python2.7/site-packages/rdkit/Chem/Draw/__init__.pyc in MolsToGridImage(mols, molsPerRow, subImgSize, legends, highlightAtomLists, useSVG, **kwargs)
>     400   if useSVG:
>     401     return _MolsToGridSVG(mols, molsPerRow=molsPerRow, subImgSize=subImgSize, legends=legends,
> --> 402                           highlightAtomLists=highlightAtomLists, **kwargs)
>     403   else:
>     404     return _MolsToGridImage(mols, molsPerRow=molsPerRow, subImgSize=subImgSize, legends=legends,
>
> /Users/curt/anaconda2/lib/python2.7/site-packages/rdkit/Chem/Draw/__init__.pyc in _MolsToGridSVG(mols, molsPerRow, subImgSize, legends, highlightAtomLists, stripSVGNamespace, **kwargs)
>     374   nmol = rdMolDraw2D.PrepareMolForDrawing(mol, kekulize=kwargs.get('kekulize', True))
>     375   d2d = rdMolDraw2D.MolDraw2DSVG(subImgSize[0], subImgSize[1])
> --> 376   d2d.DrawMolecule(nmol, legend=legends[i], highlightAtoms=highlights)
>     377   d2d.FinishDrawing()
>     378   txt = d2d.GetDrawingText()
>
> RuntimeError: Pre-condition Violation
>   getNumImplicitHs() called without preceding call to 
> calcImplicitValence()
>   Violation occurred on line 153 in file Code/GraphMol/Atom.cpp
>   Failed Expression: d_implicitValence > -1
>   RDKIT: 2016.09.2
>   BOOST: 1_56
>
>
>
> On Thu, Jan 12, 2017 at 4:41 AM, Brian Kelley 
> wrote:
>
>> The outputs of a reaction are a bit confusing.
>>
>> Reactions can have multiple product templates, so the output of
>> RunReactants is a list of lists of molecules.
>>
>> for products in result:
>>   for molecule in products:
>>     molecule.UpdatePropertyCache()
>>
>> However, it looks like your reaction is generating nonsensical molecules,
>> so you may want to draw with sanitization turned off so you can see the
>> reaction output.
>>
>> 
>> Brian Kelley
>>
>> On Jan 11, 2017, at 9:11 PM, Curt Fischer 
>> wrote:
>>
>> Hi all,
>>
>> I recently wanted to use RDKit to model the famous copper-catalyzed
>> cycloaddition of alkynes and azides.
>>
>> I eventually got things working, kind of, but had two questions.  First,
>> I was surprised to find that the products of RunReactants don't have
>> updated property caches.  Is this something I should have expected, or is
>> it a bug?  If the latter, is it an easy-to-fix bug or a hard-to-fix one?
>>
>> Second, how can I modify my SMARTS reaction query to avoid duplication of
>> each product?
>>
>> Here's some example code, also available at
>> https://github.com/tentrillion/ipython_notebooks/blob/master/rdkit_smarts_reactions_needs_updating.ipynb
>>
>> # ---BEGIN CODE-- #
>> # import rdkit components
>> from rdkit import rdBase
>> from rdkit import Chem
>> from rdkit.Chem import AllChem
>> from rdkit.Chem import Draw
>>
>> # use IPythonConsole for pretty drawings
>> from rdkit.Chem.Draw import IPythonConsole
>> # IPythonConsole.ipython_useSVG=True  # leave out for github
>>
>> # for flattening
>> from itertools import chain
>>
>> # define reactants
>> diyne_smiles = 'C#CCC(O)C#C'
>> azide_smiles = 'CCCN=[N+]=[N-]'
>>
>> diyne = Chem.MolFromSmiles(diyne_smiles)
>> azide = Chem.MolFromSmiles(azide_smiles)
>>
>> # define reaction
>> copper_click_smarts = '[C:1]#[C:2].[N:3]=[N+:4]=[N-:
>> 

Re: [Rdkit-discuss] errors with windows10 RDKit installation using conda

2016-11-29 Thread Michal Krompiec
On Tuesday, 29 November 2016, Greg Landrum  wrote:

>
> On Tue, Nov 29, 2016 at 5:17 PM, Bob Funchess  > wrote:
>
>
>> PS: All I really need is the C# wrappers; if those could be included in
>> the binary distribution it would be extremely helpful for me.
>>
>
> Building these hasn't been automated since there aren't really any tests
> available and distributing something that hasn't been tested at all makes
> me very nervous. Of course if someone in the community who knows some C#
> were to contribute a set of tests (or even just a port of the existing Java
> wrapper tests) that would make me feel a lot safer. hint hint. :-)
>
>

If stable C# wrappers were available one could make a COM interop library
and access RDKit from VBA in MS Excel. I guess most of you aren't fans of
Excel, but it is still the workhorse for the less-geeky chemists and no
such add-in is widely available.

Best,

Michal
--
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] canonical atom indexing

2016-03-10 Thread Michal Krompiec
Thanks a lot, this is exactly what I wanted.
Best regards,
Michal
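
An alternative to reading the SMILES output order described below is to
renumber the atoms by their canonical ranks. This is only a sketch
under the assumption that Chem.CanonicalRankAtoms with breakTies=True
is acceptable for the descriptor in question; the helper name
canonically_renumbered is made up, and symmetry-equivalent atoms (for
example the three hydrogens of a methyl group) may still be numbered in
an order that depends on the input.

```python
from rdkit import Chem


def canonically_renumbered(mol):
    """Return a copy of mol whose atoms are ordered by canonical rank."""
    ranks = list(Chem.CanonicalRankAtoms(mol, breakTies=True))
    # new_order[i] is the old index of the atom that becomes new atom i
    new_order = sorted(range(mol.GetNumAtoms()), key=lambda idx: ranks[idx])
    return Chem.RenumberAtoms(mol, new_order)


# The same molecule written in two different atom orders,
# with hydrogens made explicit so that they are ranked as well.
a = canonically_renumbered(Chem.AddHs(Chem.MolFromSmiles("CCO")))
b = canonically_renumbered(Chem.AddHs(Chem.MolFromSmiles("OCC")))

symbols_a = [atom.GetSymbol() for atom in a.GetAtoms()]
symbols_b = [atom.GetSymbol() for atom in b.GetAtoms()]
print(symbols_a)
print(symbols_a == symbols_b)
```

RenumberAtoms keeps any conformers attached to the molecule, so the
same approach should carry over to molecules read from a mol file with
3D coordinates.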

On 10 March 2016 at 12:13, Brian Kelley <fustiga...@gmail.com> wrote:

> The canonicalizer doesn't treat hydrogens any differently than any other
> atom, but they have to be in the graph.  If you are starting from smiles,
> simply add explicit hydrogens, python example below:
>
> >>> from rdkit import Chem
>
> >>> m = Chem.MolFromSmiles("CC")
>
> >>> mh = Chem.AddHs(m)
>
> >>> Chem.MolToSmiles(mh)
>
> '[H]C([H])([H])C([H])([H])[H]'
>
> >>> order = eval(mh.GetProp("_smilesAtomOutputOrder"))
>
> # safer non eval version...
>
> >>> order = mh.GetPropsAsDict(includePrivate=True,
> ...     includeComputed=True)['_smilesAtomOutputOrder']
>
> >>> list(order)
>
> [2,0,3,4,1,5,6,7]
>
> >>>
>
> Note that the output order is given in the context of the output SMILES
> string, i.e. order[0] is the original index of the atom that was written
> first in the output, and so on.  That is,
> order[output_atom_idx] = input_atom_idx
>
> On Thu, Mar 10, 2016 at 6:27 AM, Michal Krompiec <
> michal.kromp...@gmail.com> wrote:
>
>> Hello,
>> I need a "canonical" method for generating atom indices for a given
>> molecule (with 3D coordinates, so the input is e.g. a mol file), for a
>> molecular descriptor which should be invariant with respect to atom
>> indexing. As I understand, canonical SMILES will give the same atom indices
>> for non-hydrogen atoms, but is there a way in RDKit to generate unique
>> indices for hydrogens as well?
>> Best regards,
>> Michal
>>
>>
>> --
>> Transform Data into Opportunity.
>> Accelerate data analysis in your applications with
>> Intel Data Analytics Acceleration Library.
>> Click to learn more.
>> http://pubads.g.doubleclick.net/gampad/clk?id=278785111=/4140
>> ___
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>>
>
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


[Rdkit-discuss] canonical atom indexing

2016-03-10 Thread Michal Krompiec
Hello,
I need a "canonical" method for generating atom indices for a given
molecule (with 3D coordinates, so the input is e.g. a mol file), for a
molecular descriptor which should be invariant with respect to atom
indexing. As I understand, canonical SMILES will give the same atom indices
for non-hydrogen atoms, but is there a way in RDKit to generate unique
indices for hydrogens as well?
Best regards,
Michal
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Question about Run Reaction

2016-02-08 Thread Michal Krompiec
Hi Taka,
Yes, you need to sanitize every product molecule, after each step.
Best wishes,
Michal
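
A minimal sketch of a multi-step enumeration with sanitization after
every step, using the same reaction SMARTS as in the quoted script. The
helper name apply_step and the choice to deduplicate by canonical
SMILES are my own assumptions.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

rxn = AllChem.ReactionFromSmarts("[cH&$(c(c)c):2]>>[c:2][F]")


def apply_step(rxn, mols):
    """Run a one-reactant reaction on each molecule; sanitize and dedupe."""
    unique = {}
    for mol in mols:
        for product_set in rxn.RunReactants((mol,)):
            for product in product_set:
                # Must happen before the product is used as a reactant again.
                Chem.SanitizeMol(product)
                unique.setdefault(Chem.MolToSmiles(product), product)
    return list(unique.values())


step0 = [Chem.MolFromSmiles("c1ccccc1")]
step1 = apply_step(rxn, step0)   # monofluorinated products
step2 = apply_step(rxn, step1)   # difluorinated products

print([Chem.MolToSmiles(m) for m in step1])
print([Chem.MolToSmiles(m) for m in step2])
```

Because each step returns sanitized molecules, its output can be fed
straight into the next call without hitting the implicit-valence
pre-condition error.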

On 7 February 2016 at 01:58, Taka Seri <serit...@gmail.com> wrote:

> Hi Michal,
>
> Thank you for your quick and kind response.
> I tried sanitizing the mol according to your advice,
> and my code worked fine!
> Thank you. ;-)
> By the way, if I want to run several reaction steps, do I need to
> sanitize each molecule?
>
> # reactiontest.py
> from rdkit import Chem
> from rdkit.Chem import AllChem
>
> mol = Chem.MolFromSmiles("c1ccccc1")
> rxn = AllChem.ReactionFromSmarts( "[cH&$(c(c)c):2]>>[c:2][F]" )
> ps1= rxn.RunReactants( (mol,) )
> mol2 = ps1[0][0]
> Chem.SanitizeMol( mol2 )
> ps2= rxn.RunReactants( (mol2,) )
> uniq = set( [ Chem.MolToSmiles(x[0], isomericSmiles=True ) for x in ps2 ] )
> print( uniq )
> ---
> from shell
>
> $ python reactiontest.py
>
> {'Fc1ccc(F)cc1', 'Fc1cccc(F)c1', 'Fc1ccccc1F'}
>
> -
> Best regards,
> Takayuki
>
> On Sat, 6 Feb 2016 at 23:01, Michal Krompiec <michal.kromp...@gmail.com> wrote:
>
>> Hi Taka,
>> You have to call SanitizeMol() on the product(s) explicitly. The error
>> is caused by the reactants not being 'sanitized'.
>> Best wishes,
>> Michal
>>
>>
>> On Saturday, 6 February 2016, Taka Seri <serit...@gmail.com> wrote:
>>
>>> Dear RDKitters,
>>>
>>> I have question about rdkit reaction function.
>>> I want to generate molecules using several reaction steps.
>>> I referred rdkit blog post, and wrote following code.
>>> But second step of reaction caused error.
>>> I could not difference about mol and mol2 object.
>>> I wonder if anyone could help me.
>>> Best regards,
>>>
>>> Takayuki
>>>
>>>
>>> In [1]: from rdkit import Chem
>>>
>>> In [2]: from rdkit.Chem import AllChem
>>>
>>> In [3]: mol = Chem.MolFromSmiles("c1ccc(F)cc1")
>>>
>>> In [6]: rxn = AllChem.ReactionFromSmarts('[cH&$(c(c)c):2]>>[c:2][F]')
>>>
>>> #first step works fine.
>>>
>>> In [7]: ps = rxn.RunReactants((mol,))
>>>
>>> #But the second step did not work...
>>>
>>> In [9]: mol2 = ps[0][0]
>>>
>>> In [11]: ps = rxn.RunReactants((mol2,))
>>>
>>> [22:23:10]
>>>
>>>
>>> 
>>>
>>> Pre-condition Violation
>>>
>>> getNumImplicitHs() called without preceding call to calcImplicitValence()
>>>
>>> Violation occurred on line 166 in file
>>> /Users/landrgr1/anaconda3/anaconda/conda-bld/work/Code/GraphMol/Atom.cpp
>>>
>>> Failed Expression: d_implicitValence>-1
>>>
>>> 
>>>
>>>
>>>
>>> ---
>>>
>>> RuntimeError  Traceback (most recent call
>>> last)
>>>
>>>  in ()
>>>
>>> > 1 ps = rxn.RunReactants((mol2,))
>>>
>>>
>>> RuntimeError: Pre-condition Violation
>>>
>>>
>>>
--
Site24x7 APM Insight: Get Deep Visibility into Application Performance
APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
Monitor end-to-end web transactions and take corrective actions now
Troubleshoot faster and improve end-user experience. Signup Now!
http://pubads.g.doubleclick.net/gampad/clk?id=272487151=/4140___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Question about Run Reaction

2016-02-06 Thread Michal Krompiec
Hi Taka,
You have to call SanitizeMol() on the product(s) explicitly. The error is
caused by the reactants not being 'sanitized'.
Best wishes,
Michal

On Saturday, 6 February 2016, Taka Seri  wrote:

> Dear RDKitters,
>
> I have a question about the rdkit reaction function.
> I want to generate molecules using several reaction steps.
> I referred to an rdkit blog post and wrote the following code,
> but the second reaction step caused an error.
> I could not see the difference between the mol and mol2 objects.
> I wonder if anyone could help me.
> Best regards,
>
> Takayuki
>
>
> In [1]: from rdkit import Chem
>
> In [2]: from rdkit.Chem import AllChem
>
> In [3]: mol = Chem.MolFromSmiles("c1ccc(F)cc1")
>
> In [6]: rxn = AllChem.ReactionFromSmarts('[cH&$(c(c)c):2]>>[c:2][F]')
>
> #first step works fine.
>
> In [7]: ps = rxn.RunReactants((mol,))
>
> #But the second step did not work...
>
> In [9]: mol2 = ps[0][0]
>
> In [11]: ps = rxn.RunReactants((mol2,))
>
> [22:23:10]
>
>
> 
>
> Pre-condition Violation
>
> getNumImplicitHs() called without preceding call to calcImplicitValence()
>
> Violation occurred on line 166 in file
> /Users/landrgr1/anaconda3/anaconda/conda-bld/work/Code/GraphMol/Atom.cpp
>
> Failed Expression: d_implicitValence>-1
>
> 
>
>
> ---
>
> RuntimeError  Traceback (most recent call
> last)
>
>  in ()
>
> > 1 ps = rxn.RunReactants((mol2,))
>
>
> RuntimeError: Pre-condition Violation
>
>
>
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] detect dihedral angles in a conformation

2015-10-20 Thread Michal Krompiec
Hi Jose
I have a similar problem, but I look for dihedrals between
aromatic/unsaturated rings:

rotatable_ring_bonds=Chem.MolFromSmarts("[!$(*#*)]-!@[!$(*#*)]")
rotatable_matches=mol.GetSubstructMatches(rotatable_ring_bonds)
rotatable=set() #set of sorted indices of the rotatable bonds
for bond in rotatable_matches:
if bond[0] wrote:

> Hi RDKitters,
>
> I would like to consider parts of a conformation rigid (fixed dihedral
> angles) during minimization
> My end goal would be to generate only ring conformations starting with
> valid 3D molecules.
>
> I can already consider a specific dihedral angle as rigid:
>
> from rdkit import Chem
> from rdkit.Chem import AllChem, rdMolTransforms
>
> # create test mol
> s = 'COCCN1CCOCC1'
> m = Chem.MolFromSmiles(s)
> m = Chem.AddHs(m)
>
> # add 3D coordinates
> AllChem.EmbedMolecule(m)
>
> # freeze one dihedral angle (composed of atoms 0-3)
> MMFFs_MP = AllChem.MMFFGetMoleculeProperties(m, mmffVariant='MMFF94s')
> MMFFs_FF = AllChem.MMFFGetMoleculeForceField(m, MMFFs_MP)
> MMFFs_FF.MMFFAddTorsionConstraint(0, 1, 2, 3, relative=True,
> minDihedralDeg=0.0, maxDihedralDeg=0.0,forceConstant=99.0)
> c = m.GetConformer()
> print "before min", rdMolTransforms.GetDihedralDeg(c, 0,1,2,3) # -53.0873064656
>
> # minimize molecule with constrained dihedral angle
> MMFFs_FF.Minimize(maxIts=10)
> print "after first min", rdMolTransforms.GetDihedralDeg(c,0,1,2,3) # -53.0873064656
> MMFFs_FF.Minimize(maxIts=10)
> print "after second min", rdMolTransforms.GetDihedralDeg(c,0,1,2,3) # -53.0873064656
>
> However, I have difficulty finding all the dihedral angles to treat as
> rigid...
> I would like to detect dihedral angles with 4 atoms where:
>  - none is hydrogen
>  - no more than 2 atoms are in different rings
>
> First I looked for a function that returns the list of dihedral angles
> so I could iterate over it, but could not find any.
> My other alternative would be to iterate over atoms to get their
> neighbors, and then get their neighbors' neighbors, but that looks very
> very slow.
> Is there any other way to do this?
>
> Thank you!
>
> Jose Manuel
>
>
> --
> ___
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
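The question above, how to list every dihedral without a dedicated API call, can be approached by treating each bond as the central j-k bond and pairing the other neighbors of j with the other neighbors of k. The following is a minimal sketch in plain Python, not a method from this thread. It assumes the bond list and atomic numbers have already been extracted, for example from mol.GetBonds() with GetBeginAtomIdx()/GetEndAtomIdx() and from GetAtomicNum() on each atom. The function name enumerate_dihedrals is hypothetical, and the ring-membership filter from the question is left out.

```python
from collections import defaultdict


def enumerate_dihedrals(bonds, atomic_nums):
    """List each heavy-atom dihedral (i, j, k, l) exactly once.

    bonds       -- list of (begin_idx, end_idx) pairs, one per bond
    atomic_nums -- sequence giving the atomic number of each atom index
    """
    neighbors = defaultdict(list)
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    dihedrals = []
    for j, k in bonds:  # every bond is a candidate central bond
        if atomic_nums[j] == 1 or atomic_nums[k] == 1:
            continue
        for i in neighbors[j]:
            if i == k or atomic_nums[i] == 1:
                continue
            for l in neighbors[k]:
                # l == i can only happen in a three-membered ring
                if l == j or l == i or atomic_nums[l] == 1:
                    continue
                dihedrals.append((i, j, k, l))
    return dihedrals


# n-butane heavy atoms 0-3 plus a single hydrogen (index 4) on atom 0
bonds = [(0, 1), (1, 2), (2, 3), (0, 4)]
atomic_nums = [6, 6, 6, 6, 1]
print(enumerate_dihedrals(bonds, atomic_nums))
```

Because each bond is visited once, the cost grows with the number of bonds times the square of the typical atom degree, which stays small for organic molecules.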


Re: [Rdkit-discuss] UFF geometry optimization of lanthanide complexes

2015-09-15 Thread Michal Krompiec
Hi Greg,
Thanks for your reply. I'm aware that complexes of this kind are completely
outside of this code's scope; I was just hoping it might still work here.
Indeed, RDKit does not parse this molecule in SMILES format, but I was able
to smuggle it through as a MOL block (in KNIME). In any case, since UFF only
defines six-coordinate Eu, this is not the way forward for my purpose.
Best wishes,
Michal
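The UFF label quoted in this thread, Eu6+3, packs three pieces of information into one string. The sketch below decodes labels of that shape. It assumes the usual UFF naming convention (two characters for the element, one character for the coordination geometry, then an optional formal oxidation state); the geometry table and the function name parse_uff_label are my own illustration, not RDKit API, and labels with extra suffixes are deliberately not handled.

```python
import re

# Assumed meaning of the third character of a UFF atom label
GEOMETRY = {
    "1": "linear",
    "2": "trigonal",
    "R": "resonant",
    "3": "tetrahedral",
    "4": "square planar",
    "5": "trigonal bipyramidal",
    "6": "octahedral",
}

_LABEL = re.compile(r"^([A-Z][a-z_])([1-6R])(?:\+(\d))?$")


def parse_uff_label(label):
    """Split a label such as 'Eu6+3' into (element, geometry, oxidation state)."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError("unsupported UFF label: %r" % label)
    element = match.group(1).rstrip("_")  # 'C_' -> 'C'
    oxidation = int(match.group(3)) if match.group(3) else None
    return element, GEOMETRY[match.group(2)], oxidation


print(parse_uff_label("Eu6+3"))
print(parse_uff_label("C_3"))
```

Read this way, Eu6+3 describes an octahedral, six-coordinate Eu(III) centre, which is why an eight-coordinate complex falls outside the available parameters.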

On 15 September 2015 at 10:54, Greg Landrum <greg.land...@gmail.com> wrote:

> Hi Michal,
>
> The problem here, I think, is that organometallic complexes like this one
> involve bond types that are not well represented by SMILES, which really
> assumes that a Lewis dot structure including shared electron pairs for all
> bonds can be drawn. This is decidedly not the case here, where the molecule
> can't even really properly be read in by the RDKit:
> In [39]: m =
> Chem.MolFromSmiles('[Eu]1234567OC(=CC(=[O]1)C)C.C(C=C(O2)C)(=[O]3)C.C(C=C(O4)C)(=[O]5)C.C1=[N]6C2=C(C=C1)C=CC1=C2[N]7=CC=C1')
> [06:23:05] Explicit valence for atom # 5 O, 3, is greater than permitted
>
> The UFF parameters that are available for Eu are for Eu6+3: an
> octahedrally coordinated Eu+3. Your complex has 8 connections to the Eu, so
> it wouldn't be covered by the UFF parameters anyway.
>
> I just did a bit of playing around to see if I could construct a sample
> molecule with a six-coordinate Eu+3 and make that work. I failed. This may
> be an RDKit bug but I'm not quite sure. The corners of the code for dealing
> with non-organic molecules are dark, dusty, and not particularly well
> tested.
>
> Best,
> -greg
>
>
>
>
> On Mon, Sep 14, 2015 at 2:26 PM, Michal Krompiec <
> michal.kromp...@gmail.com> wrote:
>
>> Hello,
>> I was trying to generate 3D coordinates for a europium complex,
>> [Eu(acac)3(phen)], with UFF, using RDKit nodes in KNIME (UFF is
>> parametrized for lanthanides). Whereas the generation of coordinates seems
>> to produce an almost sensible structure:
>> [image: Inline images 3]
>>
>>  subsequent geometry optimization does not: it moves the Eu atom way
>> outside of the coordination sphere:
>> [image: Inline images 4]
>>
>> Are the bond types not specified correctly, or is this
>> just not supposed to work with this type of molecule at all? The molecule
>> is defined by the following SMILES (created with MarvinSketch):
>>
>> [Eu]1234567OC(=CC(=[O]1)C)C.C(C=C(O2)C)(=[O]3)C.C(C=C(O4)C)(=[O]5)C.C1=[N]6C2=C(C=C1)C=CC1=C2[N]7=CC=C1
>> The same result is obtained with La instead of Eu.
>> Best wishes,
>> Michal
>>
>>
>>
>>
>>
>


[Rdkit-discuss] UFF geometry optimization of lanthanide complexes

2015-09-14 Thread Michal Krompiec
Hello,
I was trying to generate 3D coordinates for a europium complex,
[Eu(acac)3(phen)], with UFF, using RDKit nodes in KNIME (UFF is
parametrized for lanthanides). Whereas the generation of coordinates seems
to produce an almost sensible structure:
[image: Inline images 3]

 subsequent geometry optimization does not: it moves the Eu atom way
outside of the coordination sphere:
[image: Inline images 4]

Are the bond types not specified correctly, or is this just
not supposed to work with this type of molecule at all? The molecule is
defined by the following SMILES (created with MarvinSketch):
[Eu]1234567OC(=CC(=[O]1)C)C.C(C=C(O2)C)(=[O]3)C.C(C=C(O4)C)(=[O]5)C.C1=[N]6C2=C(C=C1)C=CC1=C2[N]7=CC=C1
The same result is obtained with La instead of Eu.
Best wishes,
Michal


Re: [Rdkit-discuss] out-of-plane bends

2015-04-16 Thread Michal Krompiec
Hi Paolo,
Thanks, this is an interesting feature - it may come in useful one day!
I found a different solution, via torsion constraints:

from rdkit import Chem
from rdkit.Chem import ChemicalForceFields, rdMolTransforms

# mol is assumed to already carry a 3D conformer
mp = ChemicalForceFields.MMFFGetMoleculeProperties(mol)
ff = ChemicalForceFields.MMFFGetMoleculeForceField(mol, mp)
# thiophene connected to another aromatic ring
sp = Chem.MolFromSmarts('c1ccsc1!@c')

# out-of-plane bend: atoms 1, 0 and 4 are in the thiophene ring,
# atom 5 is the first atom of the next ring
atoms = (1, 0, 4, 5)

maplist = mol.GetSubstructMatches(sp)

for match in maplist:
    angle = 180.0  # desired dihedral angle
    a = [match[idx] for idx in atoms]
    cur_angle = rdMolTransforms.GetDihedralDeg(mol.GetConformer(),
                                               a[0], a[1], a[2], a[3])
    if abs(abs(cur_angle) - abs(angle)) > 45.0:  # closer to 0 than to 180?
        angle = 0.0  # freeze at 0
    ff.MMFFAddTorsionConstraint(a[0], a[1], a[2], a[3], False, angle, angle,
                                1e4)

ff.Minimize()


Best wishes,

Michal
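The choice between freezing a dihedral at 0 or at 180 degrees can be isolated in a small helper. This is a sketch under one assumption that differs from the script above: it switches at the 90-degree midpoint, so it always picks the nearer planar value, whereas the script uses a 45-degree threshold. The name nearest_planar_target is hypothetical.

```python
def nearest_planar_target(cur_angle_deg):
    """Return 0.0 or 180.0, whichever is closer to the current dihedral."""
    # Fold the angle into [0, 180]; the sign does not matter for planarity.
    folded = abs((cur_angle_deg + 180.0) % 360.0 - 180.0)
    return 0.0 if folded < 90.0 else 180.0


for angle in (10.0, -175.0, 350.0, 95.0):
    print(angle, nearest_planar_target(angle))
```

The returned value could then be passed as both the minimum and maximum dihedral of the torsion constraint, as in the script above.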

On 15 April 2015 at 23:06, Paolo Tosco paolo.to...@unito.it wrote:

  Dear Michal,

 please find attached a small script which accomplishes what you describe
 by a different approach, i.e. it minimizes only the methyl group in
 2-methylthiophene while keeping the rest fixed, effectively pushing it back
 in plane. Would that work for you?

 Best,
 Paolo


 On 04/15/2015 10:57 AM, Michal Krompiec wrote:

  Hello,
 I'm trying to manipulate out-of-plane bends of substituents attached to
 aromatic rings. Obviously, they should lie in the plane of the ring, but
 sometimes (after constrained optimization) they come slightly out of plane
 and there doesn't seem to be a way to push/rotate them back to the ring
 plane. Adding constraints to the forcefield (which seems to work for
 out-of-plane bends as well) is not a perfect solution.

  For example, there is no way to manipulate the out-of-plane bend
 (dihedral angle) of the methyl group in the 2-methylthiophene molecule
 c1cc(C)sc1, defined by the first four atoms c1cc(C). If you try to use
 rdMolTransforms.SetDihedralDeg, it will raise an exception because the bond
 around which it tries to rotate is part of a ring (line 375 in
 MolTransforms.cpp). That check is there for a good reason: setDihedralRad
 moves all atoms connected to atom k on the right-hand side of the j-k bond,
 whereas I want to rotate just the methyl group (atom l as defined in
 setDihedralRad, and everything attached to it).

  Would it make sense to make a modified version of setDihedralRad(),
 setOutOfPlaneBendRad(), with the following modifications:
 * omit line 375: if(queryIsBondInRing(bondJK)) throw
 ValueErrorException("bond (j,k) must not belong to a ring"); instead
 perhaps check whether bond k,l belongs to a ring
 * change line 401: _toBeMovedIdxList(mol, jAtomId, kAtomId, alist); to
 _toBeMovedIdxList(mol, kAtomId, lAtomId, alist); (we want to rotate the
 tree rooted in atom l, not atom k)

  Best regards,
 Michal






[Rdkit-discuss] out-of-plane bends

2015-04-15 Thread Michal Krompiec
Hello,
I'm trying to manipulate out-of-plane bends of substituents attached to
aromatic rings. Obviously, they should lie in the plane of the ring, but
sometimes (after constrained optimization) they come slightly out of plane
and there doesn't seem to be a way to push/rotate them back to the ring
plane. Adding constraints to the forcefield (which seems to work for
out-of-plane bends as well) is not a perfect solution.

For example, there is no way to manipulate the out-of-plane bend (dihedral
angle) of the methyl group in the 2-methylthiophene molecule c1cc(C)sc1,
defined by the first four atoms c1cc(C). If you try to use
rdMolTransforms.SetDihedralDeg, it will raise an exception because the bond
around which it tries to rotate is part of a ring (line 375 in
MolTransforms.cpp). That check is there for a good reason: setDihedralRad
moves all atoms connected to atom k on the right-hand side of the j-k bond,
whereas I want to rotate just the methyl group (atom l as defined in
setDihedralRad, and everything attached to it).

Would it make sense to make a modified version of setDihedralRad(),
setOutOfPlaneBendRad(), with the following modifications:
* omit line 375: if(queryIsBondInRing(bondJK)) throw
ValueErrorException("bond (j,k) must not belong to a ring"); instead
perhaps check whether bond k,l belongs to a ring
* change line 401: _toBeMovedIdxList(mol, jAtomId, kAtomId, alist); to
_toBeMovedIdxList(mol, kAtomId, lAtomId, alist); (we want to rotate the
tree rooted in atom l, not atom k)

Best regards,
Michal
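The difference between "the tree rooted in atom k" and "the tree rooted in atom l" can be seen on the bare connectivity of 2-methylthiophene. The sketch below is a simplified stand-in for the atom-collection step described in the message, not a copy of the RDKit implementation: atoms_to_move is a hypothetical helper that gathers every atom reachable from a root without crossing a chosen fixed atom, and the adjacency table is written out by hand with hydrogens omitted.

```python
from collections import deque


def atoms_to_move(adjacency, fixed, root):
    """Atoms reachable from `root` without passing through `fixed`."""
    seen = {fixed, root}
    queue = deque([root])
    while queue:
        atom = queue.popleft()
        for nbr in adjacency[atom]:
            if nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)
    seen.discard(fixed)
    return seen


# 2-methylthiophene, c1cc(C)sc1: atoms 0, 1, 2 and 5 are ring carbons,
# 3 is the methyl carbon, 4 is sulfur
adjacency = {0: [1, 5], 1: [0, 2], 2: [1, 3, 4], 3: [2], 4: [2, 5], 5: [4, 0]}

# dihedral i-j-k-l = 0-1-2-3
print(atoms_to_move(adjacency, fixed=2, root=3))  # tree rooted in atom l
print(atoms_to_move(adjacency, fixed=1, root=2))  # tree rooted in atom k
```

Rooting the search in atom l picks up only the methyl carbon. Rooting it in atom k walks around the ring and reaches atom i as well, which is the situation the ring-bond check guards against.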


[Rdkit-discuss] RDKit interactive table in Knime

2015-04-09 Thread Michal Krompiec
Dear All,
I have just found an interesting behaviour of the RDKit Interactive Table
node in Knime.
My workflow contains two such nodes and they contain pretty much the same
data. If I hilite a molecule in one node, it is automatically hilited in
the other one as well.
I'm using KNIME 2.10.4 with the latest RDKit nodes for this version.
Best regards,

Michal


Re: [Rdkit-discuss] sanitization removes Hs - is this expected?

2015-04-07 Thread Michal Krompiec
Thanks a lot!
By the way, it would be useful to have this feature (MergeQueryHs) also in
the substructure search KNIME node.
Best wishes,
Michal
On 3 April 2015 at 06:29, Greg Landrum greg.land...@gmail.com wrote:

 The changes are now pushed (
 https://github.com/rdkit/rdkit/commit/f0d4cf1ec63a4928a2a28fa62cf1d255099e72d0)
 and are available on master.
 The new functions are qmol_from_smiles() and qmol_from_ctab()

 Best,
 -greg


 On Thu, Apr 2, 2015 at 10:51 AM, Greg Landrum greg.land...@gmail.com
 wrote:

 Hi Michal,

 Glad to hear this matches what you are looking for. I have already added
 the feature to the cartridge and will check it in later today/tomorrow
 morning.

 -greg

 On Thursday, April 2, 2015, Michal Krompiec michal.kromp...@gmail.com
 wrote:

  Hi Greg,
 Thank you, this is exactly what I needed.

 On 2 April 2015 at 05:22, Greg Landrum greg.land...@gmail.com wrote:


 Skipping sanitization, as you propose, isn't going to help here: the
 kekulized form of the ring will not be converted to aromatic and you won't
 get the matches you are looking for.

 Indeed. Previously I stored my dataset as smiles with explicit
 hydrogens, and created the query mols by adding Hs and then deleting
 hydrogens at substitution sites and finally converting to SMARTS - a messy
 workaround, but producing the right result.


   Here's an approach to this that works in Python :



 And this is exactly what I wanted. To illustrate it more precisely: your
 pattern (2-H-pyrimidine) matches pyrimidine and 5-methylpyrimidine but does
 not match 2-methylpyrimidine:

 >>> m = Chem.MolFromSmiles('c1ccnc([H])n1', sanitize=False)
 >>> nm = Chem.MergeQueryHs(m)
 >>> Chem.SanitizeMol(nm)
 rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE
 >>> Chem.MolFromSmiles('c1ccncn1').HasSubstructMatch(nm)
 True
 >>> Chem.MolFromSmiles('c1c(C)cncn1').HasSubstructMatch(nm)
 True
 >>> Chem.MolFromSmiles('c1ccnc(C)n1').HasSubstructMatch(nm)
 False
 




  Being able to do something equivalent in the cartridge would
 certainly be useful. What I'd suggest is the addition of two functions:
 query_mol_from_smiles() and query_mol_from_ctab() that do this.


 I'll do it.


   Then you could do queries like:
 select * from mols where m @ query_mol_from_smiles('c1ccnc([H])n1');
 and have it do the right thing.

 Sound reasonable?

 -greg


 Best wishes,
 Michal





Re: [Rdkit-discuss] sanitization removes Hs - is this expected?

2015-04-02 Thread Michal Krompiec
Hi Greg,
Thank you, this is exactly what I needed.

On 2 April 2015 at 05:22, Greg Landrum greg.land...@gmail.com wrote:


 Skipping sanitization, as you propose, isn't going to help here: the
 kekulized form of the ring will not be converted to aromatic and you won't
 get the matches you are looking for.

Indeed. Previously I stored my dataset as smiles with explicit hydrogens,
and created the query mols by adding Hs and then deleting hydrogens at
substitution sites and finally converting to SMARTS - a messy workaround,
but producing the right result.


 Here's an approach to this that works in Python :



And this is exactly what I wanted. To illustrate it more precisely: your
pattern (2-H-pyrimidine) matches pyrimidine and 5-methylpyrimidine but does
not match 2-methylpyrimidine:

>>> m = Chem.MolFromSmiles('c1ccnc([H])n1', sanitize=False)
>>> nm = Chem.MergeQueryHs(m)
>>> Chem.SanitizeMol(nm)
rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE
>>> Chem.MolFromSmiles('c1ccncn1').HasSubstructMatch(nm)
True
>>> Chem.MolFromSmiles('c1c(C)cncn1').HasSubstructMatch(nm)
True
>>> Chem.MolFromSmiles('c1ccnc(C)n1').HasSubstructMatch(nm)
False





 Being able to do something equivalent in the cartridge would certainly be
 useful. What I'd suggest is the addition of two functions:
 query_mol_from_smiles() and query_mol_from_ctab() that do this.


I'll do it.


Then you could do queries like:
 select * from mols where m @ query_mol_from_smiles('c1ccnc([H])n1');
 and have it do the right thing.

 Sound reasonable?

 -greg


Best wishes,
Michal


[Rdkit-discuss] RDKit on cygwin32: python silently crashes

2015-03-26 Thread Michal Krompiec
Dear All,
I know that building on Cygwin is not a popular topic here, but perhaps
someone will have some ideas. I'm trying to build on 32-bit Cygwin under
Windows 7 32-bit. The build seems successful and the C++ tests pass, but
importing the Python bindings causes Python to crash silently:

$ python
Python 2.7.8 (default, Jul 28 2014, 01:34:03)
[GCC 4.8.3] on cygwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from rdkit import Chem
$ python -v -c 'from rdkit import Chem'
[cut]
import rdkit.Chem.PeriodicTable # precompiled from
/home/M/RDKit/rdkit/Chem/PeriodicTable.pyc
dlopen("/home/M/RDKit/rdkit/Chem/rdchem.dll", 2);
import rdkit.Chem.rdchem # dynamically loaded from
/home/M/RDKit/rdkit/Chem/rdchem.dll
dlopen("/home/M/RDKit/rdkit/Chem/rdmolfiles.dll", 2);
import rdkit.Chem.rdmolfiles # dynamically loaded from
/home/M/RDKit/rdkit/Chem/rdmolfiles.dll
dlopen("/home/M/RDKit/rdkit/Chem/rdmolops.dll", 2);

$


Any ideas on how to investigate this?
It seems that this problem is related to 32-bit Cygwin, as under Cygwin64
everything works just fine (thanks to Jan Holst Jensen for verifying this).

Best wishes,
Michal
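One way to narrow down a silent crash like this is to import each extension module in its own interpreter and look at the exit code, so that a crash in one module does not take the investigating process down with it. The sketch below assumes a modern Python 3 (subprocess.run is not available in the Python 2.7 used in the message), and the helper name import_exit_code is hypothetical. The module names are the ones visible in the "python -v" output above.

```python
import subprocess
import sys


def import_exit_code(module_name):
    """Import `module_name` in a fresh interpreter and return its exit code."""
    result = subprocess.run(
        [sys.executable, "-c", "import " + module_name],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


candidates = ["rdkit.Chem.rdchem", "rdkit.Chem.rdmolfiles", "rdkit.Chem.rdmolops"]
for name in candidates:
    # 0 means the import worked, 1 usually means a Python exception,
    # and other values point at a crash in native code
    print(name, import_exit_code(name))
```

The first module that returns something other than 0 or 1 is a reasonable place to attach a debugger.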


[Rdkit-discuss] cartridge: successful build on Cygwin 64-bit

2015-03-26 Thread Michal Krompiec
Dear All,
In case this may be useful to anybody in the future:
RDKit (current github version) builds nicely on Cygwin 64-bit without
any modifications (GCC 4.9.2, boost 1.55).
The following tests fail:

13 - testUFFForceField (SEGFAULT) - rounding error

15 - pyForceFieldConstraints (Failed) - rounding error

65 - pyMolDraw2D (SEGFAULT)

66 - testFMCS (SEGFAULT)

67 - pyFMCS (SEGFAULT)

85 - pythonTestDirChem (Failed)


More importantly, the postgres cartridge builds without any problems
(postgresql 9.4.1), without modification of the makefile, and passes
all tests.
Perhaps not everybody knows that Cygwin can be installed without admin
rights (invoke the setup application from command line with
--no-admin), so this solution will work in almost every Windows
environment and is almost equivalent to a portable app. Unfortunately,
32-bit Cygwin does not behave so nicely, as I have written before.

Best wishes,
Michal



[Rdkit-discuss] building RDKit on Cygwin: rdBase not linked?

2015-03-20 Thread Michal Krompiec
Hello,
I am struggling to build RDKit on Cygwin, under 32-bit Windows 7,
using gcc 4.9.2, boost 1.55 and python 2.7.8, with the standard
options:
cmake .. && make && make install

The building itself did not raise any errors, but the python tests
fail, for example:

$ python -v -c 'from rdkit import Chem'
...
import rdkit.Chem # precompiled from /home/m/RDKit/rdkit/Chem/__init__.pyc
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/m/RDKit/rdkit/Chem/__init__.py", line 18, in <module>
    from rdkit import rdBase
ImportError: cannot import name rdBase

$RDBASE and $RDBASE/lib are in the PATH and PYTHONPATH. To be sure the
dlls are found, I even copied them to /bin. I am pretty sure that
paths are OK.

There is, however, no rdBase.so or anything like that present in $RDBASE/rdkit:
$ ls ~/RDKit/rdkit
__init__.py   _py2_pickle.py   Avalon  CMakeLists.txt  DataStructs
DistanceGeometry  ForceField  ML  rdBase.pyd   RDConfig.pyc
RDPaths.py   SimDivFilters  six.pyc  test_list.py   utils
__init__.pyc  _py2_pickle.pyc  Chem  DataManip   Dbase
epydoc.config Geometry  Numerics  RDConfig.py  RDLogger.py
RDRandom.py  six.py sping  TestRunner.py  VLib

$ ls ~/RDKit/build/rdkit/
Chem  cmake_install.cmake  CMakeFiles  CTestTestfile.cmake  DataManip
DataStructs  DistanceGeometry  ForceField  Geometry  Makefile  ML
Numerics  rdBase.pyd  RDPaths.py  SimDivFilters

$ locate rdBase
/home/m/RDKit/build/Code/RDBoost/Wrap/CMakeFiles/rdBase.dir
/home/m/RDKit/build/Code/RDBoost/Wrap/CMakeFiles/rdBase.dir/build.make
/home/m/RDKit/build/Code/RDBoost/Wrap/CMakeFiles/rdBase.dir/cmake_clean.cmake
/home/m/RDKit/build/Code/RDBoost/Wrap/CMakeFiles/rdBase.dir/CXX.includecache
/home/m/RDKit/build/Code/RDBoost/Wrap/CMakeFiles/rdBase.dir/depend.internal
/home/m/RDKit/build/Code/RDBoost/Wrap/CMakeFiles/rdBase.dir/depend.make
/home/m/RDKit/build/Code/RDBoost/Wrap/CMakeFiles/rdBase.dir/DependInfo.cmake
/home/m/RDKit/build/Code/RDBoost/Wrap/CMakeFiles/rdBase.dir/flags.make
/home/m/RDKit/build/Code/RDBoost/Wrap/CMakeFiles/rdBase.dir/link.txt
/home/m/RDKit/build/Code/RDBoost/Wrap/CMakeFiles/rdBase.dir/progress.make
/home/m/RDKit/build/Code/RDBoost/Wrap/CMakeFiles/rdBase.dir/RDBase.cpp.o
/home/m/RDKit/build/rdkit/rdBase.pyd
/home/m/RDKit/rdkit/rdBase.pyd

What else could be wrong?

Thanks,
Michal
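One thing worth checking in a situation like this is whether the interpreter accepts the file suffix the build produced. The thread does not confirm that this is the cause; it is only a possibility suggested by the listing, which shows rdBase.pyd and nothing else. The sketch below uses importlib.machinery.EXTENSION_SUFFIXES from Python 3 (under Python 2.7 the comparable information comes from imp.get_suffixes()), and loadable_as_extension is a hypothetical helper name.

```python
import importlib.machinery


def loadable_as_extension(filename):
    """True if this interpreter would treat `filename` as an extension module."""
    return any(filename.endswith(suffix)
               for suffix in importlib.machinery.EXTENSION_SUFFIXES)


# Suffixes this particular interpreter looks for when importing compiled modules
print(importlib.machinery.EXTENSION_SUFFIXES)
print(loadable_as_extension("rdBase.pyd"))
```

If the second line prints False on the build machine, the file exists but the import system will never consider it.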



Re: [Rdkit-discuss] tests failed on Cygwin

2015-03-18 Thread Michal Krompiec
Dear Paolo,
 regarding test 13, would you mind trying to change line 1400 in
 /home/m212767/RDKit/Code/ForceField/UFF/testUFFForceField.cpp from [...]
 and tell me what the return value of
 MolTransforms::getDihedralDeg(mol->getConformer(), 1, 3, 6, 8) actually is?

The return value is -9 (is it a rounding error then?). This is the
full output from the test:

test 13
  Start 13: testUFFForceField

13: Test command:
/home/M212767/RDKit/build/Code/ForceField/UFF/testUFFForceField
13: Test timeout computed to be: 9.99988e+06
13: -
13: Unit tests for force field basics.
13:   done
13: -
13: Unit tests for basics of UFF bond-stretch terms.
13:   done
13: -
13: Unit tests for UFF bond-stretch terms.
13:   done
13: -
13: Unit tests for basics of UFF angle terms.
13:   done
13: -
13: Unit tests for UFF angle-bend terms.
13: theta = 3.14159; theta0 = 3.14159
13: theta = 2.094395, param1.theta0 = 2.094395
13:   done
13: -
13:  Test Simple UFF molecule optimizations.
13:   done
13: -
13: Unit tests for UFF nonbonded terms.
13:   done
13: -
13:  Test UFF torsional terms.
13:   done
13: -
13:  Test UFF Parameter objects
13: -
13:  Test Simple UFF molecule optimization, part 2.
13:   done
13: -
13:  Test UFF Torsion Conflicts.
13: C 0.4783 -0.7477 -0.0753
13: C -0.5465 -0.0176 0.3519
13: C -0.5114 1.4472 0.0870
13: H -1.5201 -0.4838 0.2485
13: H 0.4696 1.8363 0.3367
13: O -1.2618 1.9423 0.6934
13: F -0.7163 1.6298 -0.9622
13:   done
13: -
13: Unit tests for UFF distance constraint terms.
13:   done
13: -
13: Unit tests for all UFF constraint terms.
13: -9
13:   done
13: -
13: Unit tests for copying UFF ForceFields.
13:   done
13/85 Test #13: testUFFForceField    Passed    0.35 sec

Best wishes,
Michal
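A result of -9 where -10 is expected is what an integer cast produces when the underlying value falls just short of -10: the C++ (int) cast truncates toward zero, and Python's int() behaves the same way. The value -9.9999 below is hypothetical, chosen only to illustrate the effect; the thread does not report the actual floating-point dihedral.

```python
import math

angle = -9.9999  # hypothetical value just short of -10 degrees

print(int(angle))         # -9: truncates toward zero
print(math.floor(angle))  # -10: rounds toward minus infinity
print(round(angle))       # -10: rounds to the nearest integer
print(math.isclose(angle, -10.0, abs_tol=1e-3))  # True: tolerance-based check
```

Comparing with a tolerance, or rounding before comparing, avoids a test that depends on which side of an integer boundary a platform's floating-point result happens to land.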

On 17 March 2015 at 23:27, Paolo Tosco paolo.to...@unito.it wrote:
 Hi Michal,

 regarding test 13, would you mind trying to change line 1400 in
 /home/m212767/RDKit/Code/ForceField/UFF/testUFFForceField.cpp from

 TEST_ASSERT((int)MolTransforms::getDihedralDeg(mol->getConformer(), 1, 3, 6,
 8) == -10);

 to

 std::cout << (int)MolTransforms::getDihedralDeg(mol->getConformer(), 1, 3,
 6, 8) << std::endl;

 and tell me what the return value of
 MolTransforms::getDihedralDeg(mol->getConformer(), 1, 3, 6, 8) actually is?
 It might just be a periodicity problem.

 Thanks for your collaboration, kind regards
 Paolo


 On 17/03/2015 14:48, Michal Krompiec wrote:

 Hello,
 I'm trying to build RDKit on Cygwin, on Windows 7 32-bit (current
 GitHub version).
 So far, everything looked fine, but some tests failed:

 The following tests FAILED:
   13 - testUFFForceField (OTHER_FAULT)
   66 - testFMCS (SEGFAULT)
   79 - pythonTestDbCLI (Failed) - that's not a problem, as the
 db is not set up (yet)

 I repeated using  --output-on-failure, and that's what I got:

 testUFFForceField:
 Test Assert
 Expression Failed:
 Violation occurred on line 1400 in file
 /home/m212767/RDKit/Code/ForceField/UFF/testUFFForceField.cpp
 Failed Expression:
 (int)MolTransforms::getDihedralDeg(mol->getConformer(), 1, 3, 6, 8) ==
 -10

 66/85 Test #66: testFMCS .***Exception:
 SegFault  0.28 sec
 [14:16:12] ***
 [14:16:12] FMCS Unit Test
 [14:16:12] -
 [14:16:12] FMCS test1Basics()

 System reboot and /bin/rebaseall did not help.

 Moreover, tests for the cartridge also failed:

 $ make && make install && make installcheck
 make: Nothing to be done for 'all'.
 /usr/bin/mkdir -p '/usr/lib/postgresql'
 /usr/bin/mkdir -p '/usr/share/postgresql/extension'
 /usr/bin/mkdir -p '/usr/share/postgresql/extension'
 /usr/bin/install -c -m 755  rdkit.dll '/usr/lib/postgresql/rdkit.dll'
 /usr/bin/install -c -m 644 rdkit.control
 '/usr/share/postgresql/extension/'
 /usr/bin/install -c -m 644 rdkit--3.4.sql
 '/usr/share/postgresql/extension/'
 /usr/lib/postgresql/pgxs/src/makefiles/../../src/test/regress/pg_regress
 --inputdir=./ --psqldir='/usr/bin'--dbname=contrib_regression
 rdkit-91 props btree molgist bfpgist-91 sfpgist slfpgist fps reaction
 (using postmaster on Unix socket, default port)
 == dropping database contrib_regression ==
 NOTICE:  database contrib_regression does not exist, skipping
 DROP DATABASE
 == creating database contrib_regression ==
 CREATE DATABASE
 ALTER DATABASE
 == running regression test queries==
 test rdkit-91

Re: [Rdkit-discuss] tests failed on Cygwin

2015-03-18 Thread Michal Krompiec
Hello,
I'm still struggling with the Cygwin build. gdb shows that the
segfault in FMCS (test 66) is caused by:
#0  0x67d0115a in
RDKit::MCSProgressCallbackTimeout(RDKit::MCSProgressData const&,
RDKit::MCSParameters const&, void*) () from
C:\cygwin\home\path\RDKit\lib\cygFMCS-1.dll

What could it be? My suspicion is that a wrong version of some DLL
is being loaded, but how do I trace it?
Best wishes,
Michal


On 17 March 2015 at 14:48, Michal Krompiec michal.kromp...@gmail.com wrote:
 Hello,
 I'm trying to build RDKit on Cygwin, on Windows 7 32-bit (current
 GitHub version).
 So far, everything looked fine, but some tests failed:

 The following tests FAILED:
  13 - testUFFForceField (OTHER_FAULT)
  66 - testFMCS (SEGFAULT)
  79 - pythonTestDbCLI (Failed) - that's not a problem, as the
 db is not set up (yet)

 I repeated using  --output-on-failure, and that's what I got:

 testUFFForceField:
 Test Assert
 Expression Failed:
 Violation occurred on line 1400 in file
 /home/m212767/RDKit/Code/ForceField/UFF/testUFFForceField.cpp
 Failed Expression:
 (int)MolTransforms::getDihedralDeg(mol->getConformer(), 1, 3, 6, 8) ==
 -10

 66/85 Test #66: testFMCS .***Exception:
 SegFault  0.28 sec
 [14:16:12] ***
 [14:16:12] FMCS Unit Test
 [14:16:12] -
 [14:16:12] FMCS test1Basics()

 System reboot and /bin/rebaseall did not help.

 Moreover, tests for the cartridge also failed:

 $ make && make install && make installcheck
 make: Nothing to be done for 'all'.
 /usr/bin/mkdir -p '/usr/lib/postgresql'
 /usr/bin/mkdir -p '/usr/share/postgresql/extension'
 /usr/bin/mkdir -p '/usr/share/postgresql/extension'
 /usr/bin/install -c -m 755  rdkit.dll '/usr/lib/postgresql/rdkit.dll'
 /usr/bin/install -c -m 644 rdkit.control '/usr/share/postgresql/extension/'
 /usr/bin/install -c -m 644 rdkit--3.4.sql '/usr/share/postgresql/extension/'
 /usr/lib/postgresql/pgxs/src/makefiles/../../src/test/regress/pg_regress
 --inputdir=./ --psqldir='/usr/bin'--dbname=contrib_regression
 rdkit-91 props btree molgist bfpgist-91 sfpgist slfpgist fps reaction
 (using postmaster on Unix socket, default port)
 == dropping database contrib_regression ==
 NOTICE:  database contrib_regression does not exist, skipping
 DROP DATABASE
 == creating database contrib_regression ==
 CREATE DATABASE
 ALTER DATABASE
 == running regression test queries==
 test rdkit-91 ... LOG:  server process (PID 9076) was
 terminated by signal 11: Segmentation fault
 DETAIL:  Failed process was running: CREATE EXTENSION rdkit;

 Has anybody else encountered such problems? Any idea what to do?

 Thanks,
 Michal



Re: [Rdkit-discuss] postgresql cartridge and fingerprints

2015-03-17 Thread Michal Krompiec
Hi Greg,
Thanks. I haven't started tests yet, so we'll see.
Best wishes,
Michal
On 17 Mar 2015 15:08, Greg Landrum greg.land...@gmail.com wrote:

 Hi Michal,

 On Tue, Mar 17, 2015 at 3:02 PM, Michal Krompiec 
 michal.kromp...@gmail.com wrote:

 Hello,

 I understand that the index for substructure searching is created by,
 for example:

 create index molidx on mols using gist(m)

 where m is the molecule column in mols table. Is this correct?


 yes, that's correct.


 How do I select the type of fingerprint used for the index? I just
 suspect that my dataset may require tweaking the fingerprint type for
 best performance.


 That's not configurable at the moment. If you would like to change the
 fingerprint for substructure screening, you would need to edit the source
 for the cartridge itself. I'm curious about what you think might need to be
 changed

 -greg




[Rdkit-discuss] postgresql cartridge and fingerprints

2015-03-17 Thread Michal Krompiec
Hello,

I understand that the index for substructure searching is created by,
for example:

create index molidx on mols using gist(m)

where m is the molecule column in mols table. Is this correct?

How do I select the type of fingerprint used for the index? I just
suspect that my dataset may require tweaking the fingerprint type for
best performance.

Thanks,
Michal



[Rdkit-discuss] tests failed on Cygwin

2015-03-17 Thread Michal Krompiec
Hello,
I'm trying to build RDKit on Cygwin, on Windows 7 32-bit (current
GitHub version).
So far, everything looked fine, but some tests failed:

The following tests FAILED:
 13 - testUFFForceField (OTHER_FAULT)
 66 - testFMCS (SEGFAULT)
 79 - pythonTestDbCLI (Failed) - that's not a problem, as the
db is not set up (yet)

I repeated the run using --output-on-failure, and this is what I got:

testUFFForceField:
Test Assert
Expression Failed:
Violation occurred on line 1400 in file
/home/m212767/RDKit/Code/ForceField/UFF/testUFFForceField.cpp
Failed Expression:
(int)MolTransforms::getDihedralDeg(mol->getConformer(), 1, 3, 6, 8) ==
-10

66/85 Test #66: testFMCS .***Exception:
SegFault  0.28 sec
[14:16:12] ***
[14:16:12] FMCS Unit Test
[14:16:12] -
[14:16:12] FMCS test1Basics()

System reboot and /bin/rebaseall did not help.

Moreover, tests for the cartridge also failed:

$ make && make install && make installcheck
make: Nothing to be done for 'all'.
/usr/bin/mkdir -p '/usr/lib/postgresql'
/usr/bin/mkdir -p '/usr/share/postgresql/extension'
/usr/bin/mkdir -p '/usr/share/postgresql/extension'
/usr/bin/install -c -m 755  rdkit.dll '/usr/lib/postgresql/rdkit.dll'
/usr/bin/install -c -m 644 rdkit.control '/usr/share/postgresql/extension/'
/usr/bin/install -c -m 644 rdkit--3.4.sql '/usr/share/postgresql/extension/'
/usr/lib/postgresql/pgxs/src/makefiles/../../src/test/regress/pg_regress
--inputdir=./ --psqldir='/usr/bin' --dbname=contrib_regression
rdkit-91 props btree molgist bfpgist-91 sfpgist slfpgist fps reaction
(using postmaster on Unix socket, default port)
== dropping database contrib_regression ==
NOTICE:  database contrib_regression does not exist, skipping
DROP DATABASE
== creating database contrib_regression ==
CREATE DATABASE
ALTER DATABASE
== running regression test queries==
test rdkit-91 ... LOG:  server process (PID 9076) was
terminated by signal 11: Segmentation fault
DETAIL:  Failed process was running: CREATE EXTENSION rdkit;

Has anybody else encountered such problems? Any idea what to do?

Thanks,
Michal



Re: [Rdkit-discuss] Oracle, pypl and rdkit

2015-03-13 Thread Michal Krompiec
Hi Jan and TJ,
Thank you very much for your comments. Yes, I'm going to use
fingerprints, but I was hoping to use UTL_RAW bitwise operations to
handle them (we'll see how this goes).
What worries me is that invoking structure matching via PYPL for each
molecule would be slow. Do you see any way of doing it batchwise (for
example, using Oracle's table functions)?
Best wishes,
Michal
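
As a rough illustration of the batchwise screening discussed in this thread (a sketch only, not PYPL or Oracle code): fingerprints are represented here as plain Python integers, and the names passes_screen and screen_batch are made up for this example. A single call receives the whole batch, so the per-call overhead is paid once, and only the surviving candidates would then need the expensive HasSubstructMatch() check.

```python
def passes_screen(query_fp, mol_fp):
    """Return True if every bit set in query_fp is also set in mol_fp."""
    return (query_fp & mol_fp) == query_fp


def screen_batch(query_fp, rows):
    """Screen a whole batch in one call.

    rows is an iterable of (mol_id, fingerprint) pairs, where each
    fingerprint is an int used as a bit vector (an assumption of this
    sketch). Returns the ids of molecules that pass the screen.
    """
    return [mol_id for mol_id, fp in rows if passes_screen(query_fp, fp)]


# Toy 4-bit fingerprints, purely illustrative
rows = [(1, 0b1011), (2, 0b0110), (3, 0b1111)]
candidates = screen_batch(0b0011, rows)
print(candidates)
```

A molecule that fails the screen cannot contain the query, so it can be skipped; a molecule that passes is only a candidate and still needs the exact match.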

On 13 March 2015 at 07:50, Jan Holst Jensen j...@biochemfusion.com wrote:
 Hi Michal and TJ,

 The nice thing about Postgres extensions is that they are loaded directly
 into the session's process space. Therefore the overhead is minimal, almost
 non-existing. Not so with Oracle cartridges/extensions that are loaded in a
 separate process, the extproc process.

 The overhead per call into PYPL is on the order of tens of microseconds,
 which could be a lot or not, depending on how many calls you do and what
 kind of calls.

 I have tried to do a naïve SSS search with PYPL and HasSubstructMatch() on a
 database of 70 000 compounds (seventy thousand) and it took several minutes
 to complete so it was not really usable. If you need any kind of speed you
 need to use fingerprints to find an initial hit list, and you need to pass
 fingerprints in bulk to PYPL to avoid too much call overhead.

 Do consecutive pypl calls always share the same interpreter?

 On Oracle 10g and 11g, yes. I do have a disclaimer that it might not be the
 case if you run shared server, but in my experience even shared server
 ensures that each session gets its own private instance of an interpreter
 (its own extproc process). And, if you run a multi-threaded extproc
 configuration then there are no guarantees, but I don't know anyone who does
 that.

 On 12c I just don't know yet. The little I have done with it seems to
 indicate that it behaves like 10 and 11, so looking good so far.

 Cheers
 -- Jan

 On 2015-03-13 00:43, TJ O'Donnell wrote:

 I've implemented a suite of rdkit functions
 for postgres using plpython
 https://github.com/tjod/rdchord
 and the overhead is minimal
 since most of the heavy lifting of substructure searching
 is done by rdkit.

 I think the same would be true of oracle.
 ---
 TJ O'Donnell

 On Thu, Mar 12, 2015 at 4:24 PM, Michal Krompiec michal.kromp...@gmail.com
 wrote:

 Hello, has anybody tried to implement substructure searching in an Oracle
 database using PYPL and RDKit? Is it just a matter of writing a wrapper
 function for molecule.HasSubstructMatch(pattern) or is the overhead of
 calling pypl each time too costly timewise? Do consecutive pypl calls always
 share the same interpreter?
 Best wishes,
 Michal






Re: [Rdkit-discuss] portable PostgreSQL + RDKit cartridge?

2014-08-28 Thread Michal Krompiec
Dear Jan, Thanks a lot. So it still remains a painstaking DIY job, then.

Greg: would it be possible to add the binary of the cartridge
(compiled with, say, the latest PostgreSQL) to the binary Win32
distribution? It would allow one to have a portable
python+rdkit+postgresql+knime+... bundle, which would simplify the
lives of many ;)

Best wishes,
Michal

On 28 August 2014 15:04, Jan Holst Jensen j...@biochemfusion.com wrote:
 On 2014-08-28 14:34, Michal Krompiec wrote:

 Hello, has anybody tried to compile a portable Windows binary of
 PostgreSQL with RDKit cartridge? There is a portable PostgreSQL at
 http://sourceforge.net/projects/postgresqlportable/ and I wonder if it
 is possible to use it with the cartridge.
 Best regards,
 Michal


 Hi Michal,

 I got through building a Windows version of the RDKit cartridge a while
 back, but I didn't end up using it for real. I would think that the
 instructions still mostly apply:

 http://sourceforge.net/p/rdkit/mailman/message/30127487/

 If the resulting DLL and the extension control files are put in the right
 place in the portable image, I guess you should be able to get it working.

 Cheers
 -- Jan



Re: [Rdkit-discuss] portable PostgreSQL + RDKit cartridge?

2014-08-28 Thread Michal Krompiec
Dear Greg,
Fair enough;). Indeed this is quite complex and a bit separate from
the core of RDKit's purpose. Nevertheless, I'm sure quite a few
people would appreciate such a build.
Best regards,
Michal

On 28 August 2014 15:37, Greg Landrum greg.land...@gmail.com wrote:


 On Thursday, August 28, 2014, Michal Krompiec michal.kromp...@gmail.com
 wrote:

 Dear Jan, Thanks a lot. It remains a painstaking DIY job still then.

 Greg: would it be possible to add the binary of the cartridge
 (compiled with, say, the latest PostgreSQL) to the binary Win32
 distribution?


 It's definitely possible. The better question is how likely it is. ;-)

 It's probably already clear that I don't spend a lot of time using the RDKit
 under Windows. Aside from some automated builds that we do at work (only
 parts of the code), I tend to really only build it in Windows while
 preparing a release. Adding an additional component to the Windows build
 adds an additional testing requirement and complication to the release
 process. Due to the way Windows licensing works, I can't set up several VMs
 to make this easy/automateable, so it requires some effort. Doing this for
 something that I personally would never use under Windows (the cartridge)
 and where effective testing would be tricky makes me nervous. It gets more
 complicated when you factor in the matrix of things that would need to be
 tested: Win32 or Win64? Is it really ok to only support the latest version
 of PostgreSQL or are people going to want older versions too? etc.

 I will try to find some time before the next release to see how feasible
 this is, but I'm not going to make any promises.

 -greg




Re: [Rdkit-discuss] An ultimate way to compute 3D coordinates?

2014-04-07 Thread Michal Krompiec
On 6 April 2014 05:36, Greg Landrum greg.land...@gmail.com wrote:
 Some substituted oligoarenes with at least 8 rings in the chain, not
 particularly fancy (I think the problem is related more to the length
 of the molecule than to the nature of the repeat units). I tried
 various options in the EmbedMolecule function, but without success.
 This error occurred in less than 10% of the tested structures. If anyone is
 interested in correcting this, I think I can produce a
 non-confidential input example...


 I would certainly be interested to see this. I'm not sure what can be done,
 but it's interesting to have the examples.

Try this one with random coordinate generation:

Cc1cc(cc3c1c2ccc(cc2C3(C)C)c4ccc(c(C)c4C)c5ccc(s5)c7ccc8c6ccc(cc6C(C)(C)c8c7)c%14ccc(c9ccc(s9)c%10cc%12c(cc%10CC)c%11c%11C%12(C)C)c%13cc(C)ccc%13%14)c%15ccc(s%15)c%17ccc(c%16c%16%17)c%18cc%20c(cc%18)c%19c(C)c(C)c(cc%19C%20(C)C)c%21sc(cc%21C)c%23ccc%24c%22ccc(cc%22C(C)(C)c%24c%23)c%25ccc(s%25)c%31ccc(c%27ccc%28c%26c(C)cc(cc%26C(C)(C)c%28c%27)c%29ccc(s%29)c%30cccs%30)c(C)c%31C

AllChem.EmbedMolecule(mol,useRandomCoords=True);
AllChem.MMFFOptimizeMolecule(mol,maxIters=100)

I have just run it 3 times and each time it produced a knot, which
cannot be disentangled by optimization. This example is completely
artificial, but I got similar results in a few % of real cases. It
is not an issue for me, actually, as I now use Corina to get the
starting conformations and then optimize them with MMFF in RDKit.

 and KNIME.
 Which conformation generator in knime?
None, I was using knime just to browse 2D structures.

Best wishes,
Michal



Re: [Rdkit-discuss] An ultimate way to compute 3D coordinates?

2014-04-05 Thread Michal Krompiec
Michal: from my experience, MMFF in rdkit is slower than UFF (ca. 2x
for my test cases) but converges faster, so in certain cases the
overall execution time (embedding+optimization) won't be much shorter
for UFF. It really depends on what molecules you work on. AFAIK
rdkit's 3d coord generation algorithm was designed for small- to
medium-sized druglike molecules, so you may expect it to fail in
areas very far from this territory. For example, it does not work well
for long conjugated oligomers - sometimes it produces molecular knots
instead of straight strands, and is quite slow for large systems.
That's why I switched to CORINA, btw.
Best wishes,
Michal Krompiec


On 5 April 2014 18:05, JP jeanpaul.ebe...@inhibox.com wrote:
 I don't know about the ultimate way: but this works for me (to generate n
 conformers):

 writer = Chem.SDWriter('some_file.sdf')
 # add Hydrogens
 molH = Chem.AddHs(mol)
 # create n conformers for molecule
 confIds = AllChem.EmbedMultipleConfs(molH, n)
 # E optimize
 for confId in confIds:
     AllChem.UFFOptimizeMolecule(molH, confId=confId)
     # write to output file
     writer.write(molH, confId=confId)

 You should replace the EmbedMultipleConfs with EmbedMolecule if you are only
 interested in generating only one conformer.  UFFOptimizeMolecule(...)
 returns an integer, which if 0 tells you the optimization has converged (or
 1 otherwise).

 UFF is significantly faster, and I do not think the results are worse
 than the ones generated for MMFF.  At least for the small molecules I was
 looking at, but I am sure there are exceptions to this.  Paolo has done a
 lot of excellent work on the forcefields, and I think the amide and carbonyl
 planarity issues for UFF have now been fixed.






 -
 Jean-Paul Ebejer
 Early Stage Researcher


 On 5 April 2014 13:35, Michał Nowotka mmm...@gmail.com wrote:

 Hi,

 I've found this
 (http://code.google.com/p/rdkit/wiki/Generating3DCoordinates) wiki page
 suggesting how to compute 3D coordinates:

 from rdkit import Chem
 from rdkit.Chem import AllChem



 m = Chem.MolFromSmiles('c1ccccc1C(=O)O')

 AllChem.EmbedMolecule(m)
 # the molecule now has a crude conformation, clean it up:

 AllChem.UFFOptimizeMolecule(m)

 On the other hand, Getting started document describes this differently:




 AllChem.EmbedMolecule(m2)
 AllChem.UFFOptimizeMolecule(m2)

 In the meantime, someone suggested that I should call:


 Chem.AddHs(m)

 Before calculating 3D properties.


 So what is an ultimate way of doing this? Let's assume I already have rdkit
 molecule:

 m = Chem.MolFromSmiles('Cc1ccccc1')




 or:

 m = Chem.MolFromMolFile('data/input.mol')


 what should I do with 'm' to compute 3D coordinates?

 Also, once we have MMFF implemented in rdkit, is there any benefit of
 using UFF (apart from maybe backwards compatibility, as this is a new
 feature)?



 Is UFF significantly faster than MMFF?

 Kind regards,

 Michał Nowotka






Re: [Rdkit-discuss] An ultimate way to compute 3D coordinates?

2014-04-05 Thread Michal Krompiec
On 5 April 2014 19:11, Paul Emsley pems...@mrc-lmb.cam.ac.uk wrote:
 On 05/04/14 19:04, Michal Krompiec wrote:


 For example, it does not work well
 for long conjugated oligomers - sometimes it produces molecular knots
 instead of straight strands, and is quite slow for large systems.

 Can you expand on that? What sort of long conjugated oligomers were you
 looking at?

Some substituted oligoarenes with at least 8 rings in the chain, not
particularly fancy (I think the problem is related more to the length
of the molecule than to the nature of the repeat units). I tried
various options in the EmbedMolecule function, but without success.
This error occurred in less than 10% of the tested structures. If anyone is
interested in correcting this, I think I can produce a
non-confidential input example...

 What was the nature of the input from which you were making
 rdkit molecules?

SMILES. The same input worked 100% fine with CORINA (which was, btw,
approx. 5-20x faster on the same computer) and KNIME.

Regards,
Michal



[Rdkit-discuss] RDKit fingerprints in Knime - operations on DenseBitVectorCell

2014-02-28 Thread Michal Krompiec
Hello,
I am trying to implement substructure searching with fingerprint-based
screening in Knime, using RDKit fingerprints (I know that a database
would be the preferred solution, but for some reasons I'd rather use
Knime).
In order to do this, I need an equivalent of the
DataStructs.AllProbeBitsMatch() function (or at least just logical operators
on DenseBitVectorCell). Is this possible in Knime? If not, is it
possible to convert a DenseBitVectorCell column to something else?
Best wishes,
Michal
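
For reference, the test performed by DataStructs.AllProbeBitsMatch() is a subset check: every bit set in the probe (query) fingerprint must also be set in the target fingerprint. A minimal sketch in plain Python, assuming the fingerprints have been exported as strings of '0' and '1' characters (an assumption; how KNIME renders a DenseBitVectorCell is not covered here). The helper names on_bits and probe_bits_subset are hypothetical.

```python
def on_bits(bitstring):
    """Return the set of positions holding a '1' in a '0'/'1' string."""
    return {i for i, ch in enumerate(bitstring) if ch == "1"}


def probe_bits_subset(probe, target):
    """True if every on-bit of probe is also an on-bit of target."""
    return on_bits(probe) <= on_bits(target)


# Toy 4-bit fingerprints, purely illustrative
print(probe_bits_subset("0101", "1101"))
print(probe_bits_subset("0101", "1001"))
```

The same check can be expressed with any representation that supports a logical AND: the probe passes when (probe AND target) equals the probe.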



Re: [Rdkit-discuss] sanitization removes Hs - is this expected?

2014-02-25 Thread Michal Krompiec
Thanks Greg, this is exactly what I wanted to know. Would you consider
adding an optional removeHs argument to MolFromSmiles(), as in
mol/mol2/sdf parsers?
Best wishes,
Michal

On 25 February 2014 04:23, Greg Landrum greg.land...@gmail.com wrote:
 Hi Michal,

 On Mon, Feb 24, 2014 at 4:48 PM, Michal Krompiec michal.kromp...@gmail.com
 wrote:

 Hello, I have just noticed this:
 >>> Chem.MolToSmiles(Chem.MolFromSmiles('[H]c1c([H])sc([H])c1[H]'))
 'c1ccsc1'
 >>> Chem.MolToSmiles(Chem.MolFromSmiles('[H]c1c([H])sc([H])c1[H]',sanitize=False))
 '[H]c1sc([H])c([H])c1[H]'
 >>> Chem.MolToSmiles(Chem.RemoveHs(Chem.MolFromSmiles('[H]c1c([H])sc([H])c1[H]',sanitize=False)))
 'c1ccsc1'
 >>> Chem.MolToSmiles(Chem.MolFromSmiles('[H]c1cscc1[H]'))
 'c1ccsc1'
 >>> Chem.MolToSmiles(Chem.MolFromSmiles('[H]c1cscc1[H]',sanitize=False))
 '[H]c1cscc1[H]'

 Is it the expected behaviour? Why does sanitization remove hydrogens?

 Is it controlled by any of the SanitizeFlags?


  It is the expected behavior. When sanitization is turned on, the SMILES
 parser actually calls RemoveHs; this removes the hydrogens from the graph
 and then sanitizes the molecule.

 If you do not want the Hs removed, you can tell MolFromSmiles to skip the
 sanitization (which also skips the RemoveHs) and then sanitize yourself::

 In [3]: m=Chem.MolFromSmiles('[H]c1c([H])sc([H])c1[H]',sanitize=False)

 In [4]: Chem.SanitizeMol(m)
 Out[4]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE

 In [5]: print Chem.MolToSmiles(m)
 [H]c1sc([H])c([H])c1[H]

 I hope this helps,
 -greg




[Rdkit-discuss] sanitization removes Hs - is this expected?

2014-02-24 Thread Michal Krompiec
Hello, I have just noticed this:
>>> Chem.MolToSmiles(Chem.MolFromSmiles('[H]c1c([H])sc([H])c1[H]'))
'c1ccsc1'
>>> Chem.MolToSmiles(Chem.MolFromSmiles('[H]c1c([H])sc([H])c1[H]',sanitize=False))
'[H]c1sc([H])c([H])c1[H]'
>>> Chem.MolToSmiles(Chem.RemoveHs(Chem.MolFromSmiles('[H]c1c([H])sc([H])c1[H]',sanitize=False)))
'c1ccsc1'
>>> Chem.MolToSmiles(Chem.MolFromSmiles('[H]c1cscc1[H]'))
'c1ccsc1'
>>> Chem.MolToSmiles(Chem.MolFromSmiles('[H]c1cscc1[H]',sanitize=False))
'[H]c1cscc1[H]'

Is it the expected behaviour? Why does sanitization remove hydrogens?
Is it controlled by any of the SanitizeFlags?

Best wishes,
Michal



[Rdkit-discuss] RDKit nodes in KNIME stopped working suddenly

2014-02-21 Thread Michal Krompiec
Hello,
We've been using the RDKit nodes in KNIME for quite a while without
any problems. But suddenly they ceased to work on some computers,
while still working on other ones. Tried with a fresh KNIME
installation with latest RDKit nodes - same problem. What could be
wrong? I pasted the warning/error messages below:

WARN   RDKitTypesPluginActivator Library file
GraphMolWrap.dll found:
C:\Temp\knime_2.9.1\plugins\org.rdkit.knime.bin.win32.x86_2.4.0.201402061135\os\win32\x86\GraphMolWrap.dll

ERROR  RDKitTypesPluginActivator Loading of library
GraphMolWrap.dll failed (possibly a subsequent error):
C:\Temp\knime_2.9.1\configuration\org.eclipse.osgi\bundles\libtemp\224_0\GraphMolWrap.dll:
Can't find dependent libraries

ERROR  RDKitTypesPluginActivator The library GraphMolWrap.dll
has dependency issues. Please run a dependency walker on this file to
find out what is missing.

ERROR  RDKitTypesPluginActivator Suggestion for fix: Please
correct your system libraries based on the outcome of the dependency
walker.

WARN   Histogram 2 columns without a valid
domain will be ignored. In order to calculate the domain use the
Nominal Values or Domain Calculator node.

WARN   RDKit From Molecule   Could not load native RDKit
library: 
C:\Temp\knime_2.9.1\plugins\org.rdkit.knime.bin.win32.x86_2.4.0.201402061135\os\win32\x86\boost_system-vc100-mt-1_51.dll:
Can't find dependent libraries

WARN   RDKit From Molecule   Could not load native RDKit
library: 
C:\Temp\knime_2.9.1\plugins\org.rdkit.knime.bin.win32.x86_2.4.0.201402061135\os\win32\x86\boost_system-vc100-mt-1_51.dll:
Can't find dependent libraries

WARN   RDKit From Molecule   Could not load native RDKit
library: 
C:\Temp\knime_2.9.1\plugins\org.rdkit.knime.bin.win32.x86_2.4.0.201402061135\os\win32\x86\boost_system-vc100-mt-1_51.dll:
Can't find dependent libraries

Thanks in advance,

Michal



Re: [Rdkit-discuss] SDWriter kekulizes by default

2013-12-06 Thread Michal Krompiec
Dear Greg,
It seems that all examples I found are in fact incorrect (due to an
error during conversion of SDF produced by ISIS Draw with OpenBabel)
and contain this bit:
c1(sc(c2c1nsn2)) which was meant to be: c1scc2N=S=Nc12.
By the way, ChemSketch understands c1(sc(c2c1nsn2)) as N1SNc2cscc12.
So it is not a bug in your code, unless you consider the fused
thiadiazole ring aromatic.
Best wishes,
Michal

On 5 December 2013 12:19, Greg Landrum greg.land...@gmail.com wrote:
 Hi Michal,

 On Thu, Dec 5, 2013 at 11:52 AM, Michal Krompiec michal.kromp...@gmail.com
 wrote:

 Hello,
 Is it possible to suppress kekulization by SDWriter? I get the
 following error on a call to SDWriter.write:
 ValueError: Sanitization error: Can't kekulize mol
 But some molecules can't be kekulized using RDKit's algorithm, even
 though they are otherwise 'correct'.


 hmm, I'd love to see examples of those if you can share them.


 I browsed the sources and it seems that SDWriter calls MolToMolBlock
 with the default parameter kekulize=True. Can this parameter be
 exposed in SDWriter?


 Yep, I'll get it in for the next release.

 -greg




[Rdkit-discuss] SDWriter kekulizes by default

2013-12-05 Thread Michal Krompiec
Hello,
Is it possible to suppress kekulization by SDWriter? I get the
following error on a call to SDWriter.write:
ValueError: Sanitization error: Can't kekulize mol
But some molecules can't be kekulized using RDKit's algorithm, even
though they are otherwise 'correct'.
I browsed the sources and it seems that SDWriter calls MolToMolBlock
with the default parameter kekulize=True. Can this parameter be
exposed in SDWriter?
Thanks,
Michal



[Rdkit-discuss] distance matrix for non-bonded atoms

2013-12-03 Thread Michal Krompiec
Hello,
Is there any simpler (=faster) way of calculating the shortest
distance between non-bonded atoms in a molecule?

from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem import rdMolTransforms
import numpy
mol=Chem.MolFromSmiles('Cc2ccsc2c1sccc1C')
mol=Chem.AddHs(mol)
AllChem.EmbedMolecule(mol)
AllChem.MMFFOptimizeMolecule(mol)
dm=numpy.multiply(Chem.Get3DDistanceMatrix(mol),numpy.logical_not(Chem.GetAdjacencyMatrix(mol)))
print("minimum non-bonded atom-atom distance: {}".format(numpy.min(dm[numpy.nonzero(dm)])))

Best wishes,

Michal



Re: [Rdkit-discuss] distance matrix for non-bonded atoms

2013-12-03 Thread Michal Krompiec
Dear Nick,
Thanks. I need it for the whole molecule. But indeed it seems to be
faster this way - loop over pairs of atoms:

from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem import rdMolTransforms
mol = Chem.MolFromSmiles('C')
AllChem.EmbedMolecule(mol)
AllChem.UFFOptimizeMolecule(mol)
conf=mol.GetConformer()
natom=mol.GetNumAtoms()
minimum=1e100
for i in range(0, natom):
    for j in range(i+1,natom):
        if mol.GetBondBetweenAtoms(i,j)!=None:
            dist=rdMolTransforms.GetBondLength(conf,i,j)
            if dist<minimum:
                minimum=dist
print(minimum)







On 3 December 2013 16:17, Nicholas Firth nicholas.fi...@icr.ac.uk wrote:

 I'm not sure whether you want to get these distances for the entire molecule 
 or just certain atom pairs. But I guess the simplest way to get the distance 
 between two atoms would be…

 >>> from rdkit import Chem
 >>> import numpy as np
 >>> from rdkit.Chem import AllChem
 >>> mol = Chem.MolFromSmiles('C')
 >>> AllChem.EmbedMolecule(mol)
 >>> AllChem.UFFOptimizeMolecule(mol)
 >>> conf = mol.GetConformer()
 >>> at1Coords = np.array(conf.GetAtomPosition(1))
 >>> at2Coords = np.array(conf.GetAtomPosition(2))
 >>> print np.linalg.norm(at1Coords-at2Coords)
 1.52139356317

 This can be optimised by not using numpy, I've not looked but there must be 
 some sort of euclidean distance using the Point3D class in RDKit.

 Best,
 Nick

 Nicholas C. Firth | PhD Student | Cancer Therapeutics
 The Institute of Cancer Research | 15 Cotswold Road | Belmont | Sutton | 
 Surrey | SM2 5NG

 T 020 8722 4033 | E nicholas.fi...@icr.ac.uk | W www.icr.ac.uk | Twitter 
 @ICRnews

 Facebook www.facebook.com/theinstituteofcancerresearch

 Making the discoveries that defeat cancer



 On 3 Dec 2013, at 15:56, Michal Krompiec michal.kromp...@gmail.com wrote:

 Hello,
 Is there any simpler (=faster) way of calculating the shortest
 distance between non-bonded atoms in a molecule?

 from rdkit import Chem
 from rdkit.Chem import AllChem
 from rdkit.Chem import rdMolTransforms
 import numpy
 mol=Chem.MolFromSmiles(Cc2ccsc2c1sccc1C)
 mol=Chem.AddHs(mol)
 AllChem.EmbedMolecule(mol)
 AllChem.MMFFOptimizeMolecule(mol)
 dm=numpy.multiply(Chem.Get3DDistanceMatrix(mol),numpy.logical_not(Chem.GetAdjacencyMatrix(mol)))
 print(minimum non-bonded atom-atom distance:
 {}.format(numpy.min(dm[numpy.nonzero(dm)])))

 Best wishes,

 Michal




Re: [Rdkit-discuss] distance matrix for non-bonded atoms

2013-12-03 Thread Michal Krompiec
Dear all,
Bug-fixed version of my solution (the previous one reported the shortest bond):

from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem import rdMolTransforms
mol = Chem.MolFromSmiles('C')
AllChem.EmbedMolecule(mol)
AllChem.UFFOptimizeMolecule(mol)
conf = mol.GetConformer()
natom = mol.GetNumAtoms()
minimum = 1e100
for i in range(0, natom-1):
    for j in range(i+1, natom):
        # only consider pairs of atoms that are not directly bonded
        if mol.GetBondBetweenAtoms(i, j) is None:
            dist = rdMolTransforms.GetBondLength(conf, i, j)
            if dist < minimum:
                minimum = dist
print(minimum)
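
The pair loop above can also be tried without RDKit on a hand-made geometry. The following is only a sketch: the function name min_nonbonded_distance, the coordinates and the bond list are all made up for illustration, and "non-bonded" here means simply "not directly bonded", as in the loop above.

```python
import math
from itertools import combinations


def min_nonbonded_distance(coords, bonds):
    """Shortest distance between any two atoms that share no bond.

    coords: list of (x, y, z) tuples, one per atom.
    bonds:  iterable of (i, j) atom-index pairs that are bonded.
    Returns math.inf if every pair of atoms is bonded.
    """
    bonded = {frozenset(pair) for pair in bonds}
    best = math.inf
    for i, j in combinations(range(len(coords)), 2):
        if frozenset((i, j)) in bonded:
            continue  # skip directly bonded pairs
        best = min(best, math.dist(coords[i], coords[j]))
    return best


# Hypothetical three-atom chain 0-1-2 with a right angle at atom 1
coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
bonds = [(0, 1), (1, 2)]
print(min_nonbonded_distance(coords, bonds))
```

With coordinates taken from conf.GetAtomPosition() and bonds from mol.GetBonds(), the same function would reproduce the loop above; whether it is faster than the distance-matrix approach would need to be measured.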




On 3 December 2013 17:00, Michal Krompiec michal.kromp...@gmail.com wrote:
 Dear Nick,
 Thanks. I need it for the whole molecule. But indeed it seems to be
 faster this way - loop over pairs of atoms:

 from rdkit import Chem
 from rdkit.Chem import AllChem
 mol = Chem.MolFromSmiles('C')
 AllChem.EmbedMolecule(mol)
 AllChem.UFFOptimizeMolecule(mol)
 conf=mol.GetConformer()
 natom=mol.GetNumAtoms()
 minimum=1e100
 for i in range(0, natom):
 for j in range(i+1,natom):
 if mol.GetBondBetweenAtoms(i,j)!=None:
 dist=rdMolTransforms.GetBondLength(conf,i,j)
 if distminimum:
 minimum=dist
 print(minimum)







 On 3 December 2013 16:17, Nicholas Firth nicholas.fi...@icr.ac.uk wrote:

 I'm not sure whether you want to get these distances for the entire molecule 
 or just certain atom pairs. But I guess the simplest way to get the distance 
 between two atoms would be…

  from rdkit import Chem
  import numpy as np
  from rdkit.Chem import AllChem
  mol = Chem.MolFromSmiles('C')
  AllChem.EmbedMolecule(mol)
  AllChem.UFFOptimizeMolecule(mol)
  conf = mol.GetConformer()
  at1Coords = np.array(conf.GetAtomPosition(1))
  at2Coords = np.array(conf.GetAtomPosition(2))
  print(np.linalg.norm(at1Coords-at2Coords))
 1.52139356317

 This can be optimised by not using numpy, I've not looked but there must be 
 some sort of euclidean distance using the Point3D class in RDKit.
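
On the Point3D remark: as far as I know, Point3D does carry a Distance method, so the numpy round trip can be skipped. A minimal sketch with two hand-made points (the coordinates are invented for illustration; in practice they would come from conf.GetAtomPosition):

```python
from rdkit.Geometry import Point3D

# Two hand-made points; conf.GetAtomPosition(i) returns the same type.
p1 = Point3D(0.0, 0.0, 0.0)
p2 = Point3D(3.0, 4.0, 0.0)

# Euclidean distance without converting to numpy arrays
distance = p1.Distance(p2)
print(distance)
```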

 Best,
 Nick

 Nicholas C. Firth | PhD Student | Cancer Therapeutics
 The Institute of Cancer Research | 15 Cotswold Road | Belmont | Sutton | 
 Surrey | SM2 5NG

 T 020 8722 4033 | E nicholas.fi...@icr.ac.uk | W www.icr.ac.uk | Twitter 
 @ICRnews

 Facebook www.facebook.com/theinstituteofcancerresearch

 Making the discoveries that defeat cancer



 On 3 Dec 2013, at 15:56, Michal Krompiec michal.kromp...@gmail.com wrote:

 Hello,
 Is there any simpler (=faster) way of calculating the shortest
 distance between non-bonded atoms in a molecule?

 from rdkit import Chem
 from rdkit.Chem import AllChem
 from rdkit.Chem import rdMolTransforms
 import numpy
 mol=Chem.MolFromSmiles('Cc2ccsc2c1sccc1C')
 mol=Chem.AddHs(mol)
 AllChem.EmbedMolecule(mol)
 AllChem.MMFFOptimizeMolecule(mol)
 dm=numpy.multiply(Chem.Get3DDistanceMatrix(mol),numpy.logical_not(Chem.GetAdjacencyMatrix(mol)))
 print("minimum non-bonded atom-atom distance: {}".format(numpy.min(dm[numpy.nonzero(dm)])))

 Best wishes,

 Michal



 The Institute of Cancer Research: Royal Cancer Hospital, a charitable 
 Company Limited by Guarantee, Registered in England under Company No. 534147 
 with its Registered Office at 123 Old Brompton Road, London SW7 3RP.

 This e-mail message is confidential and for use by the addressee only. If 
 the message is received by anyone other than the addressee, please return 
 the message to the sender by replying to it and then delete the message from 
 your computer and network.



Re: [Rdkit-discuss] distance matrix for non-bonded atoms

2013-12-03 Thread Michal Krompiec
Sorry for spamming the list. This is the correct version:
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem import rdMolTransforms
mol = Chem.MolFromSmiles('')
mol = Chem.AddHs(mol)
AllChem.EmbedMolecule(mol)
AllChem.UFFOptimizeMolecule(mol)
conf = mol.GetConformer()
natom = mol.GetNumAtoms()
minimum = 1e100
# loop over all unique atom pairs and keep the shortest non-bonded distance
for i in range(0, natom-1):
    for j in range(i+1, natom):
        if mol.GetBondBetweenAtoms(i, j) is None:
            dist = rdMolTransforms.GetBondLength(conf, i, j)
            if dist < minimum:
                minimum = dist
print(minimum)
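
The pair loop above does not really depend on RDKit. A minimal sketch of the same idea in plain Python, assuming the coordinates and the bond list have already been pulled out of the molecule (the function name and the input format are made up for illustration):

```python
from itertools import combinations
from math import dist  # Python 3.8+

def min_nonbonded_distance(coords, bonds):
    """Return the shortest distance between atoms that share no bond.

    coords: list of (x, y, z) tuples, one per atom (hypothetical input format)
    bonds:  iterable of (i, j) index pairs for bonded atoms
    """
    bonded = {frozenset(b) for b in bonds}
    best = float('inf')
    for i, j in combinations(range(len(coords)), 2):
        if frozenset((i, j)) in bonded:
            continue  # skip directly bonded pairs
        best = min(best, dist(coords[i], coords[j]))
    return best

# Toy three-atom chain: 0-1 and 1-2 are bonded, so only the 0...2 pair counts
toy_coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.5, 0.0, 0.0)]
toy_bonds = [(0, 1), (1, 2)]
result = min_nonbonded_distance(toy_coords, toy_bonds)
print(result)
```

With an RDKit molecule, the bond list could be built from mol.GetBonds() and the coordinates from the conformer.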

Best wishes,
Michal



Re: [Rdkit-discuss] Joining Fragments

2013-12-02 Thread Michal Krompiec
Dear Greg,
But does CombineMols create any new bonds?
What is the simplest/fastest way of joining two molecules (fragments)
together? (i.e. is there anything simpler than using
ReplaceSubstructs)
Best wishes,
Michal



On 1 December 2013 04:40, Greg Landrum greg.land...@gmail.com wrote:
 This is the approach I would use. It allows you to skip the SMILES
 generation and parsing steps.



 On Fri, Nov 29, 2013 at 5:31 PM, Markus Hartenfeller
 markus.hartenfel...@molecularhealth.com wrote:

 Hi Nick,

 I'm not 100% sure, but this might do what you are looking for:

 Chem.CombineMols(molFrags[i], molFrags[i+1])

 Best,
 Markus


 On 11/29/2013 05:03 PM, Nicholas Firth wrote:
  Hi RDKitters,
 
  This may be a silly question, but I'm wondering if there's any
  functionality in RDKit to add two molecules together? I've been writing to
  SMILES and joining with a '.' and then reading back in. This feels very
  cludgy.
 
  Basically I have two fragments of molecules and I want to add them into
  one EditableMol and join these fragments to construct a single molecule.
 
  Thanks in advance.
 
  Best,
  Nick
 
  Nicholas C. Firth | PhD Student | Cancer Therapeutics
  The Institute of Cancer Research | 15 Cotswold Road | Belmont | Sutton |
  Surrey | SM2 5NG
  T 020 8722 4033 | E nicholas.fi...@icr.ac.uk | W www.icr.ac.uk | Twitter
  @ICRnews
 
  The Institute of Cancer Research: Royal Cancer Hospital, a charitable
  Company Limited by Guarantee, Registered in England under Company No. 
  534147
  with its Registered Office at 123 Old Brompton Road, London SW7 3RP.
 
  This e-mail message is confidential and for use by the addressee only.
  If the message is received by anyone other than the addressee, please 
  return
  the message to the sender by replying to it and then delete the message 
  from
  your computer and network.
 
 
 
  The information contained in this transmission may contain privileged
  and confidential information, including patient information protected by
  federal and state privacy laws. It is intended only for the use of the
  person(s) named above. If you are not the intended recipient, you are 
  hereby
  notified that any review, dissemination, distribution, or duplication of
  this communication is strictly prohibited. If you are not the intended
  recipient, please contact the sender by reply email and destroy all copies
  of the original message.








Re: [Rdkit-discuss] Joining Fragments

2013-11-29 Thread Michal Krompiec
Suppose we have 2 fragments, A and B, with free valences marked as
some dummy atoms. For example, A=CH3* and B=HO*. We want a general
method to combine A and B, in this example the result should be CH3OH.
There are at least 3 ways to do it.

1. Use the trick from SMILIB: replace dummies with %11, and then parse
A+'.'+B. The problem is that although this is how SMILIB is
implemented, this approach is, in general, incorrect: it works if and
only if the dummy is preceded by the atom connected to it. It does
work for certain cases, though, at least when implemented with
OpenBabel (I never managed to make it work with RDKit).

2. Use Reaction SMARTS. Not so easy to implement properly, and quite
slow for large molecules (but unlike the above, will always work
correctly).
In this example, Cl and Br are used as dummy atoms.
 from rdkit.Chem import AllChem
 ps=rxnJoin.RunReactants((Chem.MolFromSmiles('C[Cl]'),Chem.MolFromSmiles('CC[Br]')))
 Chem.MolToSmiles(ps[0][0],True)
'CCC'
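
The definition of rxnJoin is not shown above. A minimal sketch of what such a reaction might look like, assuming a reaction SMARTS that drops the Cl and Br placeholders and bonds their neighbours (this SMARTS is a guess, not the original definition):

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Assumed reaction: remove the Cl and Br "dummies" and bond their neighbours.
rxn_join = AllChem.ReactionFromSmarts('[*:1]-[Cl].[*:2]-[Br]>>[*:1]-[*:2]')

frag_a = Chem.MolFromSmiles('C[Cl]')
frag_b = Chem.MolFromSmiles('CC[Br]')
products = rxn_join.RunReactants((frag_a, frag_b))

joined = products[0][0]
Chem.SanitizeMol(joined)  # reaction products come back unsanitized
joined_smiles = Chem.MolToSmiles(joined)
print(joined_smiles)
```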

3. Use AllChem.ReplaceSubstructs.
joined=AllChem.ReplaceSubstructs(A,dummy_in_A,processed_B)[0]
where processed_B is obtained from B by: removing the dummy atom and
renumbering the atoms so the linking atom (which was connected to the
dummy) has index 0. Renumbering could be obviated if ReplaceSubstructs
had an additional argument specifying the linking atom (looking at the
C sources, it seems quite easy to add this parameter).
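
For approach 3, later RDKit releases appear to accept a replacementConnectionPoint keyword in ReplaceSubstructs, which would make the renumbering step unnecessary. A minimal sketch, assuming a build that has this keyword; the fragments are toy examples:

```python
from rdkit import Chem
from rdkit.Chem import AllChem

frag_a = Chem.MolFromSmiles('C*')       # methyl with a dummy atom
dummy = Chem.MolFromSmarts('[#0]')      # pattern matching the dummy atom
frag_b = Chem.MolFromSmiles('CCO')      # atom 2 is the oxygen

# Default behaviour: atom 0 of the replacement is bonded to the molecule
via_atom0 = AllChem.ReplaceSubstructs(frag_a, dummy, frag_b)[0]
# Assumed keyword: choose the linking atom of the replacement explicitly
via_atom2 = AllChem.ReplaceSubstructs(frag_a, dummy, frag_b,
                                      replacementConnectionPoint=2)[0]

for m in (via_atom0, via_atom2):
    Chem.SanitizeMol(m)
smiles_atom0 = Chem.MolToSmiles(via_atom0)
smiles_atom2 = Chem.MolToSmiles(via_atom2)
print(smiles_atom0, smiles_atom2)
```

The first result should be propan-1-ol (linked through carbon), the second ethyl methyl ether (linked through oxygen).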

Best wishes,
Michal

On 29 Nov 2013 16:27, Nicholas Firth nicholas.fi...@icr.ac.uk wrote:

 Hi RDKitters,

 This may be a silly question, but I'm wondering if there's any functionality 
 in RDKit to add two molecules together? I've been writing to SMILES and 
 joining with a '.' and then reading back in. This feels very cludgy.

 Basically I have two fragments of molecules and I want to add them into one 
 EditableMol and join these fragments to construct a single molecule.

 Thanks in advance.

 Best,
 Nick

 Nicholas C. Firth | PhD Student | Cancer Therapeutics
 The Institute of Cancer Research | 15 Cotswold Road | Belmont | Sutton | 
 Surrey | SM2 5NG
 T 020 8722 4033 | E nicholas.fi...@icr.ac.uk | W www.icr.ac.uk | Twitter 
 @ICRnews

 The Institute of Cancer Research: Royal Cancer Hospital, a charitable Company 
 Limited by Guarantee, Registered in England under Company No. 534147 with its 
 Registered Office at 123 Old Brompton Road, London SW7 3RP.

 This e-mail message is confidential and for use by the addressee only.  If 
 the message is received by anyone other than the addressee, please return the 
 message to the sender by replying to it and then delete the message from your 
 computer and network.




Re: [Rdkit-discuss] Minimising bits of molecules?

2013-11-27 Thread Michal Krompiec
Dear Paolo,
Is it possible to add also the option to freeze certain internal
coordinates (I am particularly interested in freezing the dihedral
angles)?
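
For reference, later RDKit releases expose constraint methods on the MMFF force field object. A minimal sketch of holding one dihedral during minimisation, assuming a build that provides MMFFAddTorsionConstraint; butane, the 60 degree target and the force constant are arbitrary choices for illustration:

```python
from rdkit import Chem
from rdkit.Chem import AllChem, rdMolTransforms

mol = Chem.AddHs(Chem.MolFromSmiles('CCCC'))   # butane as a toy example
AllChem.EmbedMolecule(mol, randomSeed=42)

props = AllChem.MMFFGetMoleculeProperties(mol)
ff = AllChem.MMFFGetMoleculeForceField(mol, props)

# Hold the C-C-C-C dihedral (heavy atoms 0-3) at 60 degrees.
# Arguments: four atom indices, relative flag, min/max angle, force constant.
ff.MMFFAddTorsionConstraint(0, 1, 2, 3, False, 60.0, 60.0, 100.0)
ff.Minimize(maxIts=1000)

final_dihedral = rdMolTransforms.GetDihedralDeg(mol.GetConformer(), 0, 1, 2, 3)
print(final_dihedral)
```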
Best wishes,
Michal

On 26 November 2013 18:09, Paolo Tosco paolo.to...@unito.it wrote:
 Dear James,

 I will try to get it done during the weekend - I'll get back to you once
 it's ready.

 Best,
 Paolo



 On 11/26/2013 05:51 PM, James Davidson wrote:

 Dear All,



 I think this is probably one for Paolo – I was looking at fixing certain
 atoms during MMFF minimisation, but couldn’t find the option…  Then I
 re-read the UGM slides, and found the one titled “Force-field wish list”,
 and “fixed atoms” were one of the listed items!



 My intended use-case is the following:



 1.   Load protein-ligand complex into PyMOL

 2.   Make some changes to the bound ligand (using the Builder
 functionality)

 3.   Select atoms that are allowed to move (manual selection, then use
 of PyMOL’s ‘flag’ command)

 4.   Pass the molecule over to RDKit (already incorporated in a plugin
 we use), to minimise and then pass back (either as a new object, or apply
 the new coordinates to the existing object in situ)



 Actually, this process is already well-used by some of our chemists here –
 as a way of doing some simple modelling / idea exploration – but is
 currently using a much ‘flakier’ MMFF implementation.  So I would definitely
 like to move to RDKit for the minimisation – any idea when a ‘fixed atoms’
 option is likely to be added?



 Kind regards



 James


 __
 PLEASE READ: This email is confidential and may be privileged. It is
 intended for the named addressee(s) only and access to it by anyone else is
 unauthorised. If you are not an addressee, any disclosure or copying of the
 contents of this email or any action taken (or not taken) in reliance on it
 is unauthorised and may be unlawful. If you have received this email in
 error, please notify the sender or postmas...@vernalis.com. Email is not a
 secure method of communication and the Company cannot accept responsibility
 for the accuracy or completeness of this message or any attachment(s).
 Please check this email for virus infection for which the Company accepts no
 responsibility. If verification of this email is sought then please request
 a hard copy. Unless otherwise stated, any views or opinions presented are
 solely those of the author and do not represent those of the Company.

 The Vernalis Group of Companies
 100 Berkshire Place
 Wharfedale Road
 Winnersh, Berkshire
 RG41 5RD, England
 Tel: +44 (0)118 938 

 To access trading company registration and address details, please go to the
 Vernalis website at www.vernalis.com and click on the Company address and
 registration details link at the bottom of the page..
 __





 --
 ==
 Paolo Tosco, Ph.D.
 Department of Drug Science and Technology
 Via Pietro Giuria, 9 - 10125 Torino (Italy)
 Tel: +39 011 670 7680 | Mob: +39 348 5537206
 Fax: +39 011 670 7687 | E-mail: paolo.to...@unito.it
 http://open3dqsar.org | http://open3dalign.org
 ==





[Rdkit-discuss] problem with substructure matching using SMARTS

2013-11-18 Thread Michal Krompiec
Hello,
Substructure matching with SMARTS behaves strangely sometimes - see code below.
The pattern with [H] matches, but the pattern with [H,F] does not
(both should match).

from rdkit import Chem
mol=Chem.MolFromSmiles('Clc2sccc2[H]')
mol=Chem.AddHs(mol)
p1=Chem.MolFromSmarts('c2sccc2[H]')
p2=Chem.MolFromSmarts('c2sccc2[H,F]')
print(mol.HasSubstructMatch(p1))
print(mol.HasSubstructMatch(p2))
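
A likely explanation, based on the usual SMARTS rules: a bare [H] is read as a hydrogen atom, but inside a longer bracket expression H is read as "has one attached hydrogen", so [H,F] means "an atom with one H, or a fluorine". A minimal sketch of the workaround, writing hydrogen by atomic number:

```python
from rdkit import Chem

mol = Chem.AddHs(Chem.MolFromSmiles('Clc1sccc1'))

# Here H is read as a hydrogen count, not as the element
ambiguous = Chem.MolFromSmarts('c1sccc1[H,F]')
# Writing hydrogen as #1 removes the ambiguity
explicit = Chem.MolFromSmarts('c1sccc1[#1,F]')

print(mol.HasSubstructMatch(ambiguous))
matches_explicit = mol.HasSubstructMatch(explicit)
print(matches_explicit)
```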

Best wishes,
Michal



[Rdkit-discuss] atom coordinates

2013-11-08 Thread Michal Krompiec
Hello,
In the Python API, is it possible to read the 3D coordinates of an
atom (from a Mol object created from an SDF file with 3D coords)?
Thanks,
Michal



Re: [Rdkit-discuss] atom coordinates

2013-11-08 Thread Michal Krompiec
Thanks a lot!
Michal

On 8 November 2013 12:40, Paolo Tosco paolo.to...@unito.it wrote:
 Hi Michal,

 I think this Python snippet should do what you need:

 from rdkit import Chem
 from rdkit.Chem import AllChem

 sdf = 'benzene.sdf'
 supplier = Chem.SDMolSupplier(sdf, True, False)
 mol = supplier[0]
 for i in range(0, mol.GetNumAtoms()):
   pos = mol.GetConformer().GetAtomPosition(i)
   print('{0:12.4f}{1:12.4f}{2:12.4f}'.format(pos.x, pos.y, pos.z))


 Cheers,
 p.


 On 11/08/2013 01:01 PM, Michal Krompiec wrote:

 Hello,
 In the Python API, is it possible to read the 3D coordinates of an
 atom (from a Mol object created from an SDF file with 3D coords)?
 Thanks,
 Michal





 --
 ==
 Paolo Tosco, Ph.D.
 Department of Drug Science and Technology
 Via Pietro Giuria, 9 - 10125 Torino (Italy)
 Tel: +39 011 670 7680 | Mob: +39 348 5537206
 Fax: +39 011 670 7687 | E-mail: paolo.to...@unito.it
 http://open3dqsar.org | http://open3dalign.org
 ==




[Rdkit-discuss] AllChem.ReplaceSubstructs

2013-11-07 Thread Michal Krompiec
Hello,
I have a question about AllChem.ReplaceSubstructs(mol,
query,replacement). As I understand, it replaces 'query' pattern in
'mol' by 'replacement' fragment. It is clear which atom from 'mol' is
the joining atom, but which is the joining atom in 'replacement'? The
atom with index=0? Is it possible to specify which atom in the
'replacement' should be bonded to 'mol'? It would be lovely to be able
to do so, because the only alternative (using reaction SMARTS) is
much, much slower.
If not, is it possible to generate (from a given Mol object) a SMILES
string starting from the specified atom index?
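
On the last question: MolToSmiles takes a rootedAtAtom argument that, as far as I can tell, forces the output to start at the given atom. A minimal sketch on a toy molecule:

```python
from rdkit import Chem

mol = Chem.MolFromSmiles('CCO')
# Ask for a SMILES that starts at atom index 2 (the oxygen)
rooted = Chem.MolToSmiles(mol, rootedAtAtom=2)
print(rooted)
```

Parsing the rooted SMILES back in should then give a molecule whose atom 0 is the chosen atom.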

Thanks,
Michal



Re: [Rdkit-discuss] AllChem.ReplaceSubstructs

2013-11-07 Thread Michal Krompiec
Hello again,
I browsed through the sources and I found the answer to my question:
the atom at index 0 from the replacement is used for the new bond. It
would be nice to be able to specify the index of this bonding atom as
a parameter in AllChem.ReplaceSubstructs.

Is it possible to reorder atoms in a molecule (i.e. to have a chosen
atom at index 0)?
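
One way to do the reordering is Chem.RenumberAtoms, which takes a list giving, for each new position, the old index of the atom that should go there. A minimal sketch on a toy molecule (link_idx is a made-up name for the atom that should end up at index 0):

```python
from rdkit import Chem

mol = Chem.MolFromSmiles('CCO')
link_idx = 2   # atom we want at index 0 (the oxygen)

# new_order[i] is the old index of the atom that ends up at new index i
new_order = [link_idx] + [i for i in range(mol.GetNumAtoms()) if i != link_idx]
reordered = Chem.RenumberAtoms(mol, new_order)
first_symbol = reordered.GetAtomWithIdx(0).GetSymbol()
print(first_symbol)
```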
Best wishes,
Michal

On 7 November 2013 11:38, Michal Krompiec michal.kromp...@gmail.com wrote:
 Hello,
 I have a question about AllChem.ReplaceSubstructs(mol,
 query,replacement). As I understand, it replaces 'query' pattern in
 'mol' by 'replacement' fragment. It is clear which atom from 'mol' is
 the joining atom, but which is the joining atom in 'replacement'? The
 atom with index=0? Is it possible to specify which atom in the
 'replacement' should be bonded to 'mol'? It would be lovely to be able
 to do so, because the only alternative (using reaction SMARTS) is
 much, much slower.
 If not, is it possible to generate (from a given Mol object) a SMILES
 string starting from the specified atom index?

 Thanks,
 Michal



Re: [Rdkit-discuss] MolFromXYZ?

2013-11-05 Thread Michal Krompiec
Dear Sereina,
No, I have just a table of 3D coordinates (generated by cclib). This
is equivalent to having an xyz file.
I know that this conversion is possible with OpenBabel, but I would
like to avoid using it for this particular purpose.
It seems that the simplest way is to hardcode generation of SDF myself.
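
An alternative to writing the SDF text by hand might be to build the molecule directly. A minimal sketch, assuming the connectivity is known separately (coordinates alone carry no bonds); the water geometry and the table layout are invented for illustration:

```python
from rdkit import Chem
from rdkit.Geometry import Point3D

# Hypothetical coordinate table, e.g. as parsed by cclib: (symbol, x, y, z)
table = [('O', 0.000, 0.000, 0.117),
         ('H', 0.000, 0.757, -0.469),
         ('H', 0.000, -0.757, -0.469)]

mol = Chem.RWMol()
conf = Chem.Conformer(len(table))
for symbol, x, y, z in table:
    idx = mol.AddAtom(Chem.Atom(symbol))
    conf.SetAtomPosition(idx, Point3D(x, y, z))
mol.AddConformer(conf, assignId=True)

# Bonds have to be supplied explicitly; here they are assumed to be known
mol.AddBond(0, 1, Chem.BondType.SINGLE)
mol.AddBond(0, 2, Chem.BondType.SINGLE)
Chem.SanitizeMol(mol)
mol_block = Chem.MolToMolBlock(mol)
print(mol_block)
```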

Thanks,

Michal


On 4 November 2013 13:52, sereina riniker sereina.rini...@gmail.com wrote:
 Hi Michal,

 Well, if you have your 3D coordinates as a PDB file, you can read them in
 with the new PDB parser and assign the bond orders based on a template
 (generated from the SMILES of your molecule):
 tmp = Chem.MolFromPDBFile(yourfilename)
 template = Chem.MolFromSmiles(yoursmiles)
 mol = AllChem.AssignBondOrdersFromTemplate(template, tmp)

 I don't know if this is what you were looking for.

 Best,
 Sereina



 2013/11/4 Michal Krompiec michal.kromp...@gmail.com

 Hello,
 Is it possible to construct a Mol (or EditableMol) object out of a
 list of 3D coordinates? I am trying to write a bridge between cclib
 and RDKit, and I need a function to convert 3D geometries to SDF.
 Thanks,
 Michal







[Rdkit-discuss] UFF/MMFF atom types

2013-11-04 Thread Michal Krompiec
Hello,
Is Se defined in UFF and/or MMFF94? Apparently, molecules with
selenophene moieties don't optimize in RDKit, and a warning appears in
the log: UFFTYPER: Unrecognized atom type: Se2+2
Is it possible to define/modify the force field by hand? (for example,
use the parameters of S for Se)
Thanks,
Michal



[Rdkit-discuss] bug in rdMolTransforms.SetDihedralDeg?

2013-10-22 Thread Michal Krompiec
Hello, I am trying to use the new functionality for manipulation of
dihedral angles in a function similar to OpenBabel's obrotate tool.
But it doesn't work: I get a ValueError exception. Here is an example
that replicates the error:

from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem import rdMolTransforms

mol=Chem.MolFromSmiles('c2ccsc2c1sccc1')#2,2'-bithiophene
mol=Chem.AddHs(mol)
AllChem.EmbedMolecule(mol)
sp=Chem.MolFromSmarts('c2([H])ccsc2c1sccc1')
atoms=(5,6,7,8)
newangle=180.0
maplist = mol.GetSubstructMatches(sp)
if (len(maplist)>0):
    for match in maplist :
        a=[]
        for i in range (0,4) :
            a.append(match[atoms[i]])
        angle=rdMolTransforms.GetDihedralDeg(mol.GetConformer(), a[0], a[1], a[2], a[3])
        print("angle between atoms {}".format(a)+" is {}".format(angle))
        print("trying to set to the same angle")
        rdMolTransforms.SetDihedralDeg(mol.GetConformer(), a[0], a[1], a[2], a[3], angle) #if you comment this line out
        print("trying to set to another angle")
        rdMolTransforms.SetDihedralDeg(mol.GetConformer(), a[0], a[1], a[2], a[3], newangle) #this one will crash as well
        angle=rdMolTransforms.GetDihedralDeg(mol.GetConformer(), a[0], a[1], a[2], a[3])
        print("angle between atoms {}".format(a)+" is {}".format(angle))
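
One possible cause, not verified against the sources: with atoms=(5,6,7,8) the central bond of the dihedral (pattern atoms 6 and 7, a ring carbon and its sulfur) lies inside the second thiophene ring, and SetDihedralDeg seems to refuse to rotate about ring bonds. A minimal sketch that rotates about the inter-ring bond instead, assuming the S-C-C-S torsion is the one of interest:

```python
from rdkit import Chem
from rdkit.Chem import AllChem, rdMolTransforms

mol = Chem.AddHs(Chem.MolFromSmiles('c2ccsc2c1sccc1'))  # 2,2'-bithiophene
AllChem.EmbedMolecule(mol, randomSeed=42)
conf = mol.GetConformer()

# S-C-C-S across the inter-ring bond; "-!@" demands a non-ring single bond
patt = Chem.MolFromSmarts('s:c-!@c:s')
i, j, k, l = mol.GetSubstructMatch(patt)

rdMolTransforms.SetDihedralDeg(conf, i, j, k, l, 180.0)
new_angle = rdMolTransforms.GetDihedralDeg(conf, i, j, k, l)
print(new_angle)
```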

Best wishes,
Michal Krompiec
