Re: [Rdkit-discuss] Bug in ResonanceMolSupplier?

2024-03-19 Thread Paolo Tosco
Dear Jan, Definitely it is a bug. I’ll try and fix it for the next release which is due in ~2 weeks. Thanks for reporting, cheers Paolo > On 19 Mar 2024, at 11:20, Jan Halborg Jensen wrote: > > Why does ResonanceMolSupplier only give me one resonance structure for > O[NH+]=[C-]NC when

Re: [Rdkit-discuss] Chirality wedge disappears in PNG depiction

2023-07-27 Thread Paolo Tosco
ring junctions. Best, Jean-Marc On 27/07/2023 at 10:33, Paolo Tosco wrote: Dear Jean-Marc, You are generating the molecule from SMILES, therefore it does not have molblock wedging information.

Re: [Rdkit-discuss] Chirality wedge disappears in PNG depiction

2023-07-27 Thread Paolo Tosco
Dear Jean-Marc, You are generating the molecule from SMILES, therefore it does not have molblock wedging information. When you call ReapplyMolBlockWedging(), first existing wedging info will be stripped. Then, the molblock wedging will be applied, but there is none. Hence, you get no wedging. You may

Re: [Rdkit-discuss] H atoms at ring junction

2023-06-08 Thread Paolo Tosco
wrote: Dear Paolo, many thanks, your solution worked like a charm! Best regards, Jean-Marc On 01/06/2023 at 23:57, Paolo Tosco

[Rdkit-discuss] H atoms at ring junction

2023-06-01 Thread Paolo Tosco
Dear Jean-Marc, you may retain the original mol block wedging and avoid introducing H atoms as follows: from rdkit import Chem from rdkit.Chem.Draw import rdMolDraw2D from IPython.display import SVG mol = Chem.MolFromMolBlock("""trans-decalin RDKit 2D 10 11 0 0 0 0 0 0 0

Re: [Rdkit-discuss] How to decompose the UFF (or MMFF94) scoring of a small molecule?

2023-05-18 Thread Paolo Tosco
Hi Francois, I have replied on GitHub ~10’ ago. p. > On 18 May 2023, at 10:23, Francois Berenger wrote: > > Dear list, > > I asked this question in rdkit's github discussions: > > https://github.com/rdkit/rdkit/discussions/6377 > > But, apparently that's not more responsive than the ML,

Re: [Rdkit-discuss] Deuterium/Tritium labels in Molfile

2023-04-11 Thread Paolo Tosco
Dear Santiago, Using D and T symbols for deuterium and tritium in MDL molfiles is outside the file format specification. Nonetheless, RDKit correctly parses those non-standard D and T symbols when reading an MDL molfile that contains them, as you can verify yourself through a simple test and also

Re: [Rdkit-discuss] MMFF94 scoring of a protein-ligand complex

2023-01-30 Thread Paolo Tosco
Hi Francois, there is no inherent limitation to small molecules in the RDKit MMFF94 implementation - you may assess the energy of systems of any size, including protein-ligand complexes, provided that all atom types in your complex are defined in MMFF94. Cheers, p. On Mon, Jan 30, 2023 at 2:38

Re: [Rdkit-discuss] Working with SDF from varying locales?

2022-09-30 Thread Paolo Tosco
Hi Rocco, the locale Python module will allow you to do this sort of normalization on strings, e.g.: import locale locale.getlocale() ('en_US', 'UTF-8') locale.setlocale(locale.LC_ALL, "it_IT") 'it_IT' locale.delocalize("1,222") '1.222' But this requires you to know the locale the
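As a hedged aside to the snippet above: locale.setlocale() raises locale.Error when the requested locale is not installed on the machine, so a locale-free fallback can be handy. The sketch below is only an illustration; delocalize_number is a hypothetical helper (not part of the locale module or of RDKit), and it assumes you already know which characters the source file uses as decimal mark and grouping separator.

```python
def delocalize_number(text: str, decimal_point: str = ",", thousands_sep: str = ".") -> str:
    """Rewrite a localized numeric string so that '.' is the decimal point.

    Hypothetical helper: the separators must be supplied by the caller,
    nothing is detected automatically.
    """
    # Drop grouping separators first, then swap the decimal mark.
    cleaned = text.replace(thousands_sep, "") if thousands_sep else text
    return cleaned.replace(decimal_point, ".")


italian_value = delocalize_number("1.234,56")  # Italian-style grouping and decimal comma
simple_value = delocalize_number("1,222")      # same input as in the message above
print(italian_value, simple_value)
```

The result is a plain string that float() can parse regardless of the locales installed on the host.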

Re: [Rdkit-discuss] Using DrawAttachmentLine for bidentate ligands

2022-08-11 Thread Paolo Tosco
Hi Geoff, you can indeed use DrawWavyLine() coupled to some basic 2D geometry as in the example below: from rdkit import Chem from rdkit.Geometry import Point3D, Point2D from rdkit.Chem.Draw import rdDepictor, rdMolDraw2D from IPython.display import SVG mol =

Re: [Rdkit-discuss] substructure query with aromatic query bond

2022-07-27 Thread Paolo Tosco
But I also wanted to ask about the concept of a qmol in the cartridge that > doesn't undergo sanitization versus the corresponding behaviour in Python? > Please correct me if I'm wrong, but there is no concept of a qmol in Python? > > Many thanks! > > Susan > >> On Tue,

Re: [Rdkit-discuss] substructure query with aromatic query bond

2022-07-26 Thread Paolo Tosco
Hi Susan, I see why that happens, and I'll let Greg comment if this is a bug or the intended behavior. In the meantime, I can propose a workaround. The reason why it happens is that aromatization, which is part of the sanitization operations, converts your aromatic query bond into a single bond,

Re: [Rdkit-discuss] How to draw peptides and let the backbone atoms appear in the most extended form?

2022-07-26 Thread Paolo Tosco
Hi Amy, the simplest way is probably to use Schroedinger's CoordGen to generate coordinates: from rdkit import Chem from rdkit.Chem.Draw import rdDepictor rdDepictor.SetPreferCoordGen(True) peptide = Chem.MolFromSmiles("CC(C)C[C@H](NC(=O)OCC1=CC=CC=C1)C(=O)N[C@@H](CC1CCNC1=O)C=O")

Re: [Rdkit-discuss] HasSubstructureMatch using query atom list with hydrogen

2022-07-22 Thread Paolo Tosco
or .sdf should all the > molecules already have explicit hydrogens? > > Thanks, > > Susan > > On Fri, Jul 22, 2022 at 11:51 AM Susan Leung > wrote: > >> Ah, great thanks Paolo! >> >> On Fri, Jul 22, 2022 at 11:44 AM Paolo Tosco >> wrote: >

Re: [Rdkit-discuss] HasSubstructureMatch using query atom list with hydrogen

2022-07-22 Thread Paolo Tosco
Hi Susan, If you use [#1] in your SMARTS query, for your molecule to match there should be a real hydrogen atom in your molecule graph, while in your molecule you only have implicit hydrogens, unless you explicitly add them by calling Chem.AddHs(): print(Chem.AddHs(m).HasSubstructMatch(q)) True
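A minimal, self-contained sketch of the behaviour described above. The molecule (ethanol) and the query are illustrative choices, not the ones from the original thread.

```python
from rdkit import Chem

m = Chem.MolFromSmiles("CCO")        # hydrogens are implicit after SMILES parsing
q = Chem.MolFromSmarts("[#8][#1]")   # oxygen bonded to a real hydrogen atom

match_implicit = m.HasSubstructMatch(q)              # no H atoms in the graph
match_explicit = Chem.AddHs(m).HasSubstructMatch(q)  # H atoms added to the graph
print(match_implicit, match_explicit)
```

The query only matches once the hydrogens exist as atoms in the molecular graph.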

Re: [Rdkit-discuss] Color bonds with value

2022-07-05 Thread Paolo Tosco
Hi Joey, not sure if by "color" you mean text labelling or actually mapping a property to a color. Anyway, here's some code for either use case. The text labelling is easy, the individual bond coloring can be done by fiddling with the SVG text. import re import xml.etree.ElementTree as ET import
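A rough sketch of the "fiddling with the SVG text" idea, using only the standard library. It assumes that each bond in the SVG is a path element carrying a class such as bond-0; the SVG string below is a hand-written stand-in for the drawer output, and recolor_bonds is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
ET.register_namespace("", SVG_NS)  # keep the default namespace on output

# Hand-written stand-in for the text returned by the 2D drawer.
svg_text = (
    f'<svg xmlns="{SVG_NS}" width="100" height="50">'
    '<path class="bond-0 atom-0 atom-1" d="M 10,25 L 50,25" '
    'style="fill:none;stroke:#000000;stroke-width:2px"/>'
    '<path class="bond-1 atom-1 atom-2" d="M 50,25 L 90,25" '
    'style="fill:none;stroke:#000000;stroke-width:2px"/>'
    "</svg>"
)


def recolor_bonds(svg: str, bond_colors: dict) -> str:
    """Set the stroke colour of the paths whose class list contains bond-<idx>."""
    root = ET.fromstring(svg)
    for path in root.iter(f"{{{SVG_NS}}}path"):
        classes = path.get("class", "").split()
        for bond_idx, color in bond_colors.items():
            if f"bond-{bond_idx}" in classes:
                # Parse the inline CSS, replace the stroke, and write it back.
                style = dict(item.split(":", 1) for item in path.get("style", "").split(";") if item)
                style["stroke"] = color
                path.set("style", ";".join(f"{k}:{v}" for k, v in style.items()))
    return ET.tostring(root, encoding="unicode")


recolored = recolor_bonds(svg_text, {1: "#FF0000"})
print(recolored)
```

Mapping a numeric bond property to a colour would then only require building the bond_colors dictionary from the property values.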

Re: [Rdkit-discuss] Building RDKit on Windows for pgAdmin (Postgres)

2022-04-14 Thread Paolo Tosco
Hi Charmaine, my suggestion is to build starting from a conda environment. That will significantly simplify your dependencies, since most packages are available pre-built. You can look into .azure-pipelines\vs_build_dll.yml for how to set up your conda environment and for the cmake flags to use.

Re: [Rdkit-discuss] Adjusting/neutralising the formal charges on a molecule

2022-04-08 Thread Paolo Tosco
Hi Gianmarco, that's a radical cation, not just a cation, so you'll need to adjust the number of radical electrons first, then you may neutralize using Chem.MolStandardize.rdMolStandardize.Uncharger as documented in the RDKit CookBook:

Re: [Rdkit-discuss] Atom removal messes up with the electronic configuration of rings

2022-04-07 Thread Paolo Tosco
atom > removed and I am still puzzled on why that should be increased by 2 for > aromatic bonds (int(GetBondTypeAsDouble) == 2). Could you elaborate that > for me? > > Giammy > > On Thu, 7 Apr 2022 at 10:38, Paolo Tosco > wrote: > >> Hi Gianmarco, >

Re: [Rdkit-discuss] Atom removal messes up with the electronic configuration of rings

2022-04-07 Thread Paolo Tosco
Hi Gianmarco, this issue has been discussed before. Removing bonds with RWMol.RemoveBond() will not adjust the implicit H count of the atoms at the two ends of the bond. While this is not important for the atom that is going to be removed, the count on the atom that stays needs to be adjusted. In

Re: [Rdkit-discuss] Change font size in atom.SetProp("atomNote")

2022-03-28 Thread Paolo Tosco
e[1]) > d2d.DrawMolecule( > rwmol, > highlightAtoms=atoms_to_highlight, > highlightAtomColors=idx2rgb, > highlightBonds=None, > ) > d2d.FinishDrawing() > return d2d.GetDrawingText() > > Giammy > > On Mon, 28 Mar 2022 at 14:00, Pao

Re: [Rdkit-discuss] JS funtionality

2022-03-03 Thread Paolo Tosco
Hi Tim, Apart from the HTML demo page that you mentioned, you can find the current JS bindings here: https://github.com/rdkit/rdkit/blob/4bbbc6611dbe4bf05ed36ea795b3c6dc39bbdebc/Code/MinimalLib/jswrapper.cpp#L112 There's support for visualizing molecules from SMILES and CTAB, but currently no

Re: [Rdkit-discuss] Atom sequence in 3D coordinates (H's at the end)

2022-02-28 Thread Paolo Tosco
Dear Joey, you could use Chem.RenumberAtoms() to enforce Hs to always follow heavy atoms in the atom list, e.g.: from rdkit import Chem from rdkit.Chem.Draw import IPythonConsole IPythonConsole.drawOptions.addAtomIndices = True IPythonConsole.molSize = (400, 400) mol1 =
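A short sketch of the renumbering idea, assuming a molecule whose explicit hydrogens are interleaved with the heavy atoms. The SMILES is an illustrative choice.

```python
from rdkit import Chem

params = Chem.SmilesParserParams()
params.removeHs = False  # keep the explicit H atoms exactly where the SMILES puts them
mol = Chem.MolFromSmiles("[H]C([H])([H])O[H]", params)

# newOrder[i] is the index of the original atom that becomes atom i.
heavy = [a.GetIdx() for a in mol.GetAtoms() if a.GetAtomicNum() > 1]
hydrogens = [a.GetIdx() for a in mol.GetAtoms() if a.GetAtomicNum() == 1]
reordered = Chem.RenumberAtoms(mol, heavy + hydrogens)

symbols = [a.GetSymbol() for a in reordered.GetAtoms()]
print(symbols)
```

Coordinates, if present, are carried along with the atoms, so conformers stay consistent with the new order.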

Re: [Rdkit-discuss] Delete atoms can leave dangling aromaticity

2022-02-14 Thread Paolo Tosco
Hi Tim, after you are done removing the atoms you can loop through the remaining ring atoms and bonds and clear aromatic flags, e.g. from rdkit import Chem rwmol = Chem.RWMol(Chem.MolFromSmiles("c1c1")) rwmol.RemoveAtom(0) for a in rwmol.GetAtoms(): if (not a.IsInRing()) and
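A possible complete version of this idea, as a sketch only. It uses benzene as an illustrative input, refreshes ring membership with Chem.FastFindRings() after the edit, and (as a simplifying assumption of this sketch) turns the leftover aromatic bonds into single bonds; a real workflow may want to kekulize before removing atoms to preserve double bonds.

```python
from rdkit import Chem

rwmol = Chem.RWMol(Chem.MolFromSmiles("c1ccccc1"))
rwmol.RemoveAtom(0)        # opens the ring; the five remaining carbons are still flagged aromatic
Chem.FastFindRings(rwmol)  # refresh ring membership after the edit

for atom in rwmol.GetAtoms():
    if atom.GetIsAromatic() and not atom.IsInRing():
        atom.SetIsAromatic(False)
for bond in rwmol.GetBonds():
    if bond.GetIsAromatic() and not bond.IsInRing():
        bond.SetIsAromatic(False)
        bond.SetBondType(Chem.BondType.SINGLE)  # simplification made in this sketch

Chem.SanitizeMol(rwmol)  # should no longer complain about non-ring aromatic atoms
print(Chem.MolToSmiles(rwmol))
```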

Re: [Rdkit-discuss] Font size when drawing molecules

2022-02-09 Thread Paolo Tosco
Hi Tim, Dave Cosgrove is currently working on a PR which, among other things, addresses exactly the need that you describe through the baseFontSize parameter, which is currently not exposed to Python. The PR is almost ready for merging and it should become part of the March release. Cheers, p.

Re: [Rdkit-discuss] Problem with depicting reaction SMARTS

2022-02-08 Thread Paolo Tosco
Hi Mark, I believe the bug is caused by the fact that isAtomListQuery() returns true for a query that is actually a complex query, and that subsequently getAtomListQueryVals() (called by getAtomListText()) fails to parse. The following patch seems to solve the problem: $ git diff diff --git

Re: [Rdkit-discuss] problem with latest bulds?

2022-01-26 Thread Paolo Tosco
Hi Tim, there was a similar report a few days ago, caused by a sed command overwriting its input and resulting in no output: Cannot import Draw · Issue #4904 · rdkit/rdkit (github.com) Could it be you are experiencing the same issue? Current master

Re: [Rdkit-discuss] Reading an SDF/Mol without shuffling the original coordinates

2022-01-13 Thread Paolo Tosco
Hi Gianmarco, you can add hydrogens with coordinates keeping the current heavy atom coordinates with rdkit_mol = rdkit.Chem.AddHs(rdkit_mol, addCoords=True) so you may avoid having to call EmbedMolecule, which will compute a whole set of new coordinates for your molecule. I hope I interpreted
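A small sketch to check the claim above. The 2D coordinates generated here are only a stand-in for coordinates read from a file, and the coords helper is a hypothetical convenience function.

```python
from rdkit import Chem
from rdkit.Chem import rdDepictor

mol = Chem.MolFromSmiles("CCO")
rdDepictor.Compute2DCoords(mol)  # stand-in for coordinates coming from an SDF/Mol file


def coords(m, n_atoms):
    """Return the (x, y, z) positions of the first n_atoms atoms."""
    conf = m.GetConformer()
    return [(conf.GetAtomPosition(i).x, conf.GetAtomPosition(i).y, conf.GetAtomPosition(i).z)
            for i in range(n_atoms)]


n_heavy = mol.GetNumAtoms()
before = coords(mol, n_heavy)
mol_h = Chem.AddHs(mol, addCoords=True)  # new H atoms are appended and placed geometrically
after = coords(mol_h, n_heavy)
print(before == after, mol_h.GetNumAtoms())
```

The heavy atoms keep their indices and positions; only the hydrogens receive new coordinates.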

Re: [Rdkit-discuss] rdSubstructLibrary and atom indexes involved in substructure matches

2022-01-12 Thread Paolo Tosco
Hi Alexis, the rationale behind the SubstructLibrary is providing functionality to efficiently screen very large libraries (millions of structures) for hits. In this scenario, one is only interested in whether a molecule is a match or not, which is the task the SubstructLibrary is optimized for.

Re: [Rdkit-discuss] HasSubstructMatch method with useChirality argument

2022-01-04 Thread Paolo Tosco
Hi Alexis, The chiral substructure match does not look at the CIP labels, but rather at the atom parities, as you expected. In the examples below I have reordered your mol_structure as it makes things easier to understand. Let's work with a fragment constructed from SMILES rather than from
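A minimal sketch of the effect of useChirality, using a pair of illustrative enantiomers rather than the structures from the original thread.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles("C[C@H](F)Cl")
same = Chem.MolFromSmiles("C[C@H](F)Cl")
mirror = Chem.MolFromSmiles("C[C@@H](F)Cl")

ignoring_chirality = mol.HasSubstructMatch(mirror)                   # chirality not compared
matching_same = mol.HasSubstructMatch(same, useChirality=True)       # same handedness
matching_mirror = mol.HasSubstructMatch(mirror, useChirality=True)   # opposite handedness
print(ignoring_chirality, matching_same, matching_mirror)
```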

Re: [Rdkit-discuss] invalid CTAB substructure query with PostgreSQL cartridge

2021-12-09 Thread Paolo Tosco
Hi Susan, that looks like a bug in the way the MDL query is parsed; I have filed it here: https://github.com/rdkit/rdkit/issues/4785 If you can afford doing some Python massaging to your CTAB queries and converting them to SMARTS before submitting them to PostgreSQL when they fail sanitization,

Re: [Rdkit-discuss] Swig versions

2021-12-05 Thread Paolo Tosco
Hi Tim, RDKit requires SWIG 3, that you may install with sudo apt-get install swig3.0 Cheers, p. On Sun, Dec 5, 2021 at 7:40 PM Tim Dudgeon wrote: > I'm having problems building with Debian bullseye release. > Bullseye now has swig4.0, but RDKit build seems to require exactly 3.0. > Error

Re: [Rdkit-discuss] Using EnumerateLibraryFromReaction without fragmenting reactants

2021-12-02 Thread Paolo Tosco
Hi James, I am not quite sure I understand what you have done and what you'd like to achieve. Ideally, could you please post: * the reaction you are using * some example reactants * the desired product(s) * the undesired product(s) Thanks, cheers p. On Thu, Dec 2, 2021 at 6:03 PM James Wallace

Re: [Rdkit-discuss] Hiding/removing specific atoms in a RDKit molecule

2021-12-02 Thread Paolo Tosco
Hi Gianmarco, I am not aware of a method to simply hide atoms: here's a method to remove atoms given a list of indices, which should be what you need. Note that remove_selected_hs() replaces the "real" hydrogen with an implicit H on the parent atom, which I believe is what you want. from rdkit

Re: [Rdkit-discuss] Programmatic access to MMFF torsion indices and parameters

2021-11-21 Thread Paolo Tosco
to stdout, but I'd > like to access them programmatically. I believe Paolo Tosco must have done > this as part of an effort to port MMFF to OpenMM via rdkit ( > https://github.com/ptosco/rdkit/tree/openmm ) , but I can't find the > portion of code that does it! > > Thanks a lot > Le

Re: [Rdkit-discuss] State of the art for shape alignment

2021-11-11 Thread Paolo Tosco
Hi Tim, Open3DAlign is not shape-based, it is atom-based. The score is proportional to the # of matched atoms, weighted by similarity. It will work well for homologous series of compounds with reasonable scaffold similarity, and will in general perform badly with scaffolds that are very

Re: [Rdkit-discuss] MolToSmiles gives explicit H after ReplaceSubstructs

2021-11-05 Thread Paolo Tosco
Hi Ling, By default hydrogens defining double bond stereochemistry are not removed. You may remove that residual hydrogen by either params = Chem.RemoveHsParameters() params.removeDefiningBondStereo = True Chem.RemoveHs(m6, params) or simply Chem.RemoveAllHs(m6) I think you may obtain the

Re: [Rdkit-discuss] draw molecule question

2021-10-28 Thread Paolo Tosco
Dear Hao, I don't think that's possible through the current rdMolDraw2D API. However, you may obtain that effect fiddling a bit with the SVG XML text: import re import xml.etree.ElementTree as ET from rdkit import Chem from rdkit.Chem.Draw import rdMolDraw2D from IPython.display import SVG mol

Re: [Rdkit-discuss] InsertMol() seems to mess up the molecule

2021-10-25 Thread Paolo Tosco
Hi Tim, you only need to call SanitizeMol on merged_mol: merged_mol [image: image.png] Chem.SanitizeMol(merged_mol) rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE merged_mol [image: image.png] In general, the RDKit defers calling SanitizeMol to the user rather than calling it after each

Re: [Rdkit-discuss] MMFF94 symbolic atom types

2021-10-24 Thread Paolo Tosco
s input. > > Regards > Stefan > > > > On 2021-10-22 06:40, Omar H94 wrote: > > Dear stefan, > > > > I had a similar issue, and based on answer by Paolo Tosco, you can find > > MMFF Symbols/Definitions here : > > https://github.com/openbabel/

Re: [Rdkit-discuss] Removing Hs bonded to sp3 carbons

2021-10-24 Thread Paolo Tosco
Hi Alfredo, if that's really what you want to do, the following Python function should help: from rdkit import Chem bilastine = Chem.AddHs(Chem.MolFromSmiles("O=C(O)C(c1ccc(cc1)CCN4CCC(c2nc3c3n2CCOCC)CC4)(C)C")) bilastine [image: image.png] def removeHsBondedToSp3Carbon(mol): bonds = []

Re: [Rdkit-discuss] Cross platform inconsistency with the Descriptor module

2021-09-09 Thread Paolo Tosco
at 5:40 PM Paolo Tosco wrote: > Hi Alexis, > > I did some more investigation. The fragment descriptors are parsed from a > CSV file located in RDConfig.RDDataDir: > On my machine I see this: > > >>> import os > >>> from rdkit.Chem import Fragments > >

Re: [Rdkit-discuss] Cross platform inconsistency with the Descriptor module

2021-09-08 Thread Paolo Tosco
at 2:50 PM Alexis Parenty wrote: > Hi Paolo, > Thanks a lot for your response. I am going to try rdkit 2021.03.5 right > now... > > I have checked where I have installed the previous built: I did use > conda-forge on both platforms: > > [image: image.png] > > Weird..

Re: [Rdkit-discuss] Cross platform inconsistency with the Descriptor module

2021-09-08 Thread Paolo Tosco
Hi Alexis, I have just installed rdkit 2021.03.5 from the conda-forge channel on a Windows machine and 208 descriptors are indeed available. >>> import sys >>> sys.platform 'win32' >>> import rdkit >>> rdkit.__version__ '2021.03.5' >>> from rdkit.Chem import Descriptors >>>

Re: [Rdkit-discuss] Draw molecule without wedge bonds (i.e no wedge bonds should be seen in the sag image)

2021-08-05 Thread Paolo Tosco
Hi Zoltan, this Jupyter snippet should do what you need: from rdkit import Chem from rdkit.Chem.Draw import rdMolDraw2D from IPython.display import SVG mol = Chem.MolFromSmiles("[C@H](Cl)(Br)F") drawer = rdMolDraw2D.MolDraw2DSVG(200, 200) drawer.drawOptions().prepareMolsBeforeDrawing = False

Re: [Rdkit-discuss] Maximum Common Substructure using SMARTS

2021-07-23 Thread Paolo Tosco
cribed > anywhere else? > > Thank you so much! > -- > Gustavo Seabra. > > > On Fri, Jul 23, 2021 at 4:53 AM Paolo Tosco > wrote: > >> Hi Gustavo, >> >> you should be able to address this with a custom AtomCompare (and >> BondCompare, if you wan

Re: [Rdkit-discuss] Generating images

2021-07-23 Thread Paolo Tosco
Hi Francesca, that molecule has been depicted using its existing 3D coordinates. That's because you only compute 2D coordinates if the molecule does not have any: if not mc.GetNumConformers(): rdDepictor.Compute2DCoords(mc) If the molecule already has a 3D conformation, those
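One way to sidestep the issue, sketched under the assumption that a 2D depiction is always wanted: work on a copy and compute 2D coordinates unconditionally, so that any existing 3D conformer is replaced in the copy only. The molecule is an illustrative choice.

```python
from rdkit import Chem
from rdkit.Chem import AllChem, rdDepictor

mol3d = Chem.AddHs(Chem.MolFromSmiles("CCO"))
AllChem.EmbedMolecule(mol3d, randomSeed=42)  # stand-in for a molecule read with 3D coordinates

mc = Chem.Mol(mol3d)            # copy, so the 3D coordinates of the original survive
rdDepictor.Compute2DCoords(mc)  # unconditional: replaces the conformer(s) of the copy

print(mol3d.GetConformer().Is3D(), mc.GetConformer().Is3D())
```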

Re: [Rdkit-discuss] Maximum Common Substructure using SMARTS

2021-07-23 Thread Paolo Tosco
Hi Gustavo, you should be able to address this with a custom AtomCompare (and BondCompare, if you want to use bond queries too) class, that now is also supported from Python. You can take a look at Code/GraphMol/FMCS/Wrap/testFMCS.py for inspiration on how to use it; here's something that seems to

Re: [Rdkit-discuss] Calculating strain energy of conformers

2021-07-12 Thread Paolo Tosco
Hi Lewis, if you set up a custom MMFF94 force field disabling some energy terms, you will need to do all your energy calculations with the custom instance of the force field rather than using helper functions such as MMFFOptimizeMolecule which instead will use a standard instance of the force

Re: [Rdkit-discuss] RDKit molecule standardization/normalization protocol

2021-06-28 Thread Paolo Tosco
useful to reionize after neutralizing > charges in the pipeline above? > > Many thanks, > > On Thu, 24 Jun 2021 at 18:58, Paolo Tosco > wrote: > >> Hi JP, >> >> the problem is caused by the reaction SMARTS that standardizes pyridine >> *N*

Re: [Rdkit-discuss] How can I remove all angle potentials from a force field before EM?

2021-06-25 Thread Paolo Tosco
Hi Chris, you can switch on/off individual MMFF potential terms in the force field; you can't do that with UFF. To switch off all angle terms in MMFF you can do the following: from rdkit import Chem, ForceField from rdkit.Chem import rdForceFieldHelpers mp =
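A sketch of how this could look end to end, assuming a small test molecule with an embedded 3D conformer. Only the angle-bending term is switched off here; whether the stretch-bend term should be disabled as well depends on what you want to achieve.

```python
import math

from rdkit import Chem
from rdkit.Chem import AllChem, rdForceFieldHelpers

mol = Chem.AddHs(Chem.MolFromSmiles("CCO"))
AllChem.EmbedMolecule(mol, randomSeed=42)

# Reference: standard MMFF94 with all terms enabled.
full_props = rdForceFieldHelpers.MMFFGetMoleculeProperties(mol)
full_ff = rdForceFieldHelpers.MMFFGetMoleculeForceField(mol, full_props)
energy_full = full_ff.CalcEnergy()

# Custom instance with the angle-bending term disabled.
custom_props = rdForceFieldHelpers.MMFFGetMoleculeProperties(mol)
custom_props.SetMMFFAngleTerm(False)
custom_ff = rdForceFieldHelpers.MMFFGetMoleculeForceField(mol, custom_props)
energy_no_angles = custom_ff.CalcEnergy()

# Minimization and energies must go through the same custom instance.
custom_ff.Minimize(maxIts=200)
energy_minimized = custom_ff.CalcEnergy()
print(energy_full, energy_no_angles, energy_minimized)
```

Helper functions such as MMFFOptimizeMolecule() build their own standard force field, so they would not see the disabled term.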

Re: [Rdkit-discuss] RDKit molecule standardization/normalization protocol

2021-06-24 Thread Paolo Tosco
Hi JP, the problem is caused by the reaction SMARTS that standardizes pyridine *N*-oxides being not very specific and also hitting your molecule, which is not actually an *N*-oxide but rather a *N*-hydroxypyridinium ion. I will submit a PR to fix the reaction pattern; in the meantime you can fix

Re: [Rdkit-discuss] Searching in (Downloaded) Databases

2021-06-22 Thread Paolo Tosco
Hi Philipp, It looks like the supplier thinks the line index has gone past the end of file. 1) How large is the SMILES file which leads to this error (ls -l)? 2) Does it consistently happen at the same line number? You can check this with something like: suppl = Chem.SmilesMolSupplier(infile,

Re: [Rdkit-discuss] RDKit Error capturing: Chem.WrapLogs unexpected result

2021-06-18 Thread Paolo Tosco
Hi Adelene, WrapLogs() is in a way the equivalent of the bash tee command; it allows you to redirect stderr to a Python stream of your choice, but it does not suppress the original C++ stderr stream. If you wish to suppress it, you may redirect your wrap_logs.py script's stderr to /dev/null in

Re: [Rdkit-discuss] XYZ to mol ???

2021-06-06 Thread Paolo Tosco
Hi Joey, You may take a look at this project from Jan Jensen: https://github.com/jensengroup/xyz2mol If you have SMILES corresponding to your XYZ files you may also use this notebook of mine: https://gist.github.com/ptosco/4844d3635cf14d11e5e14381993915c1 HTH, cheers p. > On 4 Jun 2021, at

Re: [Rdkit-discuss] para-stereochemistry

2021-05-26 Thread Paolo Tosco
Dear Jean-Marc, I believe it indicates what the IUPAC Gold Book refers to as pseudoasymmetry. Let’s see if others agree with my interpretation. Cheers, P. > On 26 May 2021, at 22:28, Jean-Marc Nuzillard > wrote: > > I believed I sent a message with the same title a few minutes ago, but >

Re: [Rdkit-discuss] Create an asymmetric carbon.

2021-05-26 Thread Paolo Tosco
Hi Jean-Marc, You can use Chem.Atom.SetChiralTag(): from rdkit import Chem from rdkit.Chem.Draw import IPythonConsole IPythonConsole.drawOptions.addAtomIndices = True IPythonConsole.ipython_useSVG=True m = Chem.AddHs(Chem.MolFromSmiles('CCO')) m [image: image.png] a = m.GetAtomWithIdx(6)
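A compact sketch of SetChiralTag() on a carbon that can be a stereocentre. The molecule differs from the one in the message above and is only an illustration.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles("CC(F)Cl")  # no stereochemistry specified
center = mol.GetAtomWithIdx(1)

center.SetChiralTag(Chem.ChiralType.CHI_TETRAHEDRAL_CCW)
smiles_ccw = Chem.MolToSmiles(mol)

center.SetChiralTag(Chem.ChiralType.CHI_TETRAHEDRAL_CW)
smiles_cw = Chem.MolToSmiles(mol)

print(smiles_ccw, smiles_cw)
```

The two SMILES describe opposite enantiomers; which tag corresponds to R or S depends on the atom ordering around the centre.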

Re: [Rdkit-discuss] Possible inconsistent default value for one "MMFFOptimizeMoleculeConfs" parameter

2021-05-10 Thread Paolo Tosco
Hi Leon, you are right, the Python MMFFOptimizeMolecule and MMFFOptimizeMoleculeConfs functions have a default of 100.0, while their C++ counterparts (and their UFF counterparts, both C++ and Python) have a default of 10.0. There is no particular reason for that, other than historical. Even

Re: [Rdkit-discuss] Stereochemistry problem with spiro centre

2021-05-09 Thread Paolo Tosco
Hi James, IIRC that's a known open issue with the way spirocyclic pseudochiral centers are handled: https://github.com/rdkit/rdkit/issues/3490 Cheers, p. On Sun, May 9, 2021 at 10:15 AM James Davidson wrote: > Dear All, > > > > I am having some issues with tetrahedral stereochemistry

Re: [Rdkit-discuss] RDKit compilation from source question

2021-04-28 Thread Paolo Tosco
Hi Guilherme, it looks like it might be this: https://github.com/rdkit/rdkit/issues/2013#issuecomment-553563418 This can happen if you are using pre-compiled Boost libraries that were compiled with a different compiler from the one you are using for RDKit. To check if that's the case, compare

Re: [Rdkit-discuss] What should be the citation reference for this rdMolAlign.GetCrippenO3A() in RDKit?

2021-04-09 Thread Paolo Tosco
he method: rdMolAlign.*GetCrippenO3A**()* in RDKit > for comparing two 3D small molecule conformations. > > I want to know what is the proper citation for this method? > > Is this: Open3DALIGN: an open-source software aimed at unsupervised ligand > alignment > Paolo Tosco •

Re: [Rdkit-discuss] EnumerateStereoisomers fails without notice after using MolStandardize.Standardizer()

2021-04-08 Thread Paolo Tosco
Hi Zhenting, It looks like you need to reassign stereochemistry after calling fragment_parent: smi = 'CC(F)Cl' mol1 = Chem.MolFromSmiles(smi) s = MolStandardize.Standardizer() mol2 = s.fragment_parent(mol1) Chem.AssignStereochemistry(mol2, True, True, True) mol3 = Chem.AddHs(mol2) isomers1 =
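A reduced sketch of the reassignment step that leaves out the standardizer, since only the AssignStereochemistry() call matters for the enumeration. The keyword names spell out the three positional True arguments used above.

```python
from rdkit import Chem
from rdkit.Chem.EnumerateStereoisomers import EnumerateStereoisomers

mol = Chem.MolFromSmiles("CC(F)Cl")
# cleanIt, force and flagPossibleStereoCenters correspond to (True, True, True) above.
Chem.AssignStereochemistry(mol, cleanIt=True, force=True, flagPossibleStereoCenters=True)

isomers = sorted(Chem.MolToSmiles(iso) for iso in EnumerateStereoisomers(mol))
print(isomers)
```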

Re: [Rdkit-discuss] HasSubstructMatch & GetSubstructMatches hang when useChirality is True

2021-03-26 Thread Paolo Tosco
Hi Christos, this is a possible workaround that will address your current problem: https://gist.github.com/ptosco/863cb55ace485c6664c21c244b2ca10a A better solution would be to implement in the C++ layer a callback or timeout similarly to MCS and other similar, potentially time consuming

Re: [Rdkit-discuss] Removing hydrogen atoms without neighbors

2021-01-21 Thread Paolo Tosco
Hi Navid, if I interpret your question correctly, either of these should do what you need: Chem.DeleteSubstructs(mol, Chem.MolFromSmarts("[#1X0]")) Chem.DeleteSubstructs(mol, Chem.MolFromSmarts("[#1]"), onlyFrags=True) HTH, p. On Wed, Jan 20, 2021 at 5:38 PM Navid Shervani-Tabar wrote: > Dear
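A small sketch of the first variant on an illustrative input: ethanol plus one stray hydrogen atom that has no neighbours.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles("CCO.[H]")      # the lone [H] is kept because it has no neighbours
pattern = Chem.MolFromSmarts("[#1X0]")   # hydrogen with zero connections
cleaned = Chem.DeleteSubstructs(mol, pattern)

print(mol.GetNumAtoms(), cleaned.GetNumAtoms(), Chem.MolToSmiles(cleaned))
```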

Re: [Rdkit-discuss] Atom object comparison in python

2021-01-08 Thread Paolo Tosco
Hi Brian, when you fetch a Chem.Atom object from a Chem.Mol a Python object is created on-the-fly that wraps the underlying C++ object. Every time you do this, a new Python object is created. This does not apply only to GetNeighbors(); please see an example below: In [1]: from rdkit import Chem
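A short sketch of what this means in practice: compare atoms by index (together with their owning molecule, if several molecules are involved) rather than by Python object identity.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles("CCO")
first = mol.GetAtomWithIdx(0)
second = mol.GetAtomWithIdx(0)

same_wrapper = first is second                  # two distinct Python wrapper objects
same_atom = first.GetIdx() == second.GetIdx()   # but the same underlying atom
print(same_wrapper, same_atom)
```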

Re: [Rdkit-discuss] reading in PDB file with altloc B

2020-12-09 Thread Paolo Tosco
Dear Susan, the reason is that PDBAtomLine() ignores records where the alternate location is different from ' ', 'A' or '1': https://github.com/rdkit/rdkit/blob/e7e17adc4ef822d2663fa6e1ba5b978512c7a8b4/Code/GraphMol/FileParsers/PDBParser.cpp#L62 In the past I have myself run into PDB files

Re: [Rdkit-discuss] canonicalization of two aromatic molecules returning two different forms (kekule and aromatic)

2020-12-04 Thread Paolo Tosco
Hi Alexis, you may cast the _vectdouble to a list or a tuple and then you'll be able to pickle it. The 2020.09.01 is just an oversight; you are indeed using the 2020.09.02 version. Cheers, p. On Fri, Dec 4, 2020 at 1:32 PM Alexis Parenty wrote: > Dear Rdkiters, > > I could not pickle my models

Re: [Rdkit-discuss] Drawing atom in an undefined position

2020-11-30 Thread Paolo Tosco
Hi Ivan, The RDKit supports this through the standard SDF V3000 ENDPTS and ATTACH keywords; see this notebook from Greg for an example: https://github.com/rdkit/UGM_2020/blob/master/Notebooks/Landrum_WhatsNew.ipynb (search for "Position variation bonds" in the notebook) and the Biovia

Re: [Rdkit-discuss] canonicalization of two aromatic molecules returning two different forms (kekule and aromatic)

2020-11-27 Thread Paolo Tosco
er the 6-membered ring is seen as aromatic. > > > > Regards, > > Mark. > > > > *From:* Paolo Tosco > *Sent:* 27 November 2020 17:04 > *To:* Alexis Parenty > *Cc:* RDKit Discuss > *Subject:* Re: [Rdkit-discuss] canonicalization of two aromatic molecules

Re: [Rdkit-discuss] canonicalization of two aromatic molecules returning two different forms (kekule and aromatic)

2020-11-27 Thread Paolo Tosco
5-membered ring was broken, and that (to my eyes at least) should not > affect whether the 6-membered ring is seen as aromatic. > > > > Regards, > > Mark. > > > > *From:* Paolo Tosco > *Sent:* 27 November 2020 17:04 > *To:* Alexis Parenty > *Cc:* RDKit Discuss > *

Re: [Rdkit-discuss] canonicalization of two aromatic molecules returning two different forms (kekule and aromatic)

2020-11-27 Thread Paolo Tosco
Hi Alexis, The second molecule (smiles2) is indeed aromatic, but the first (smiles1) is not, as the imidazole ring condensed to the pyridine is partially saturated. The smiles1a analogue where I have added a double bond is aromatic, and upon canonicalization it yields an aromatic SMILES as

Re: [Rdkit-discuss] Nitrogen sp2 isomers get the same InChI Key

2020-10-29 Thread Paolo Tosco
Hi Gustavo, you can pass InChI options to the underlying InChI API through the options parameter of Chem.inchi.MolToInchi() and Chem.inchi.MolToInchiKey(); e.g.: inchi.MolToInchi(mol, options="/FixedH") Source:

Re: [Rdkit-discuss] Compile redkit_2020_09_1 on macOS Catalina

2020-10-28 Thread Paolo Tosco
Hi Zoltan, try adding to your cmake command -DBoost_NO_BOOST_CMAKE=ON. Cheers, p. On Wed, Oct 28, 2020 at 9:32 PM Zoltan Takacs wrote: > Hi, > > I am trying to compile rdkit version 2020-09-1 from source on MacOS > Catalina 10.15.7. Has anyone managed to do this without the boost errors? I >

Re: [Rdkit-discuss] How to preserve undefined stereochemistry?

2020-10-21 Thread Paolo Tosco
ing, you > would like to change the convention and have unspecified double bonds be > marked as unknown, it's straightforward to write a script that loops over > the molecule and makes that change (watch out for ring bonds). > > -greg > [1] Perhaps "mistake" isn't the right

Re: [Rdkit-discuss] How to preserve undefined stereochemistry?

2020-10-20 Thread Paolo Tosco
formatics > > UNIVERSITÉ DU LUXEMBOURG > > > LUXEMBOURG CENTRE FOR SYSTEMS BIOMEDICINE > > 6, avenue du Swing, L-4367 Belvaux > > T +356 46 66 44 67 18 > > [image: github.png] adelenelai > > > > > > -- > *From:* P

Re: [Rdkit-discuss] How to preserve undefined stereochemistry?

2020-10-20 Thread Paolo Tosco
Hi Adelene, this gist https://gist.github.com/ptosco/1e1c23ad24c90444993fa1db21ccb48b shows how to add stereo annotations to RDKit 2D depictions, and also how to access the double bond stereochemistry programmatically. Cheers, p. On Tue, Oct 20, 2020 at 12:24 PM Adelene LAI wrote: > Hi

Re: [Rdkit-discuss] MMFFMolProperties objects

2020-10-16 Thread Paolo Tosco
Hi Ling, MMFFOptimizeMolecule is a shorthand that does not allow passing a MMFFMolProperties object. Instead, you may follow the approach described in this gist: https://gistpreview.github.io/?24c2338ce18943da3e878de5585eb83a Cheers, p. On Fri, Oct 16, 2020 at 7:04 AM Ling Chan wrote: >

Re: [Rdkit-discuss] Problems with xyz-coordinates after setting PDBResidueinfo

2020-09-26 Thread Paolo Tosco
Hi Illimar, that's because most of your PDBResidueInfo fields are blank, including atom names, residue numbers, etc. If you wish to generate a PDB file that can be visualized correctly you need to copy this information from the existing atoms, e.g.: new_res_inf =

Re: [Rdkit-discuss] A question of molecule structure

2020-09-23 Thread Paolo Tosco
Hi Jason, This gist explains why you are seeing an unexpected result: https://gist.github.com/ptosco/20b06985cd8830d5e549165f6b9fc969 I think that, independently from the aromaticity model, tautomers with exocyclic double bonds should be penalised compared to tautomers with endocyclic double

Re: [Rdkit-discuss] Rdkit-discuss] MACCS keys - revisited

2020-09-08 Thread Paolo Tosco
Hi Mike, I put together a gist that might help: https://gist.github.com/ptosco/7bbad9e6441724e9638bc4093f48e31b This is basically a modification of the MACCSkeys._pyGenMACCSKeys() RDKit Python function, combined with a function I wrote some time ago to count non-overlapping matches in a

Re: [Rdkit-discuss] Isotope labeling

2020-09-05 Thread Paolo Tosco
Hi Eduardo, You can use the Chem.Atom.SetIsotope method; setting the isotope to 0 will remove isotopic labelling. Cheers, p. > On 5 Sep 2020, at 02:35, Eduardo Mayo wrote: > > Hi RDKit community. > How do I could remove isotope labels? Is there a way so I could avoid > converting to and
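A minimal sketch of clearing isotope labels in place, on an illustrative doubly labelled methylamine.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles("[13CH3][15NH2]")
for atom in mol.GetAtoms():
    atom.SetIsotope(0)  # 0 means "no isotope specified"

unlabelled = Chem.MolToSmiles(mol)
print(unlabelled)
```

Deuterium or tritium atoms that were explicit in the input stay in the graph as ordinary explicit hydrogens after this loop, so a subsequent Chem.RemoveHs() may be desirable in that case.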

Re: [Rdkit-discuss] c++ atomic lifetime

2020-08-28 Thread Paolo Tosco
Hi Jason, to pinpoint potential memory issues you may run your code through valgrind. For example, it would flag access to previously freed memory in your program: 1 #include 2 #include 3 #include 4 5 int main() { 6 auto mol =

Re: [Rdkit-discuss] 2D coord for hydrogens

2020-08-27 Thread Paolo Tosco
Hi Mark, further to Fio's reply, I think the confusion stems from the fact that when you call AddHs() after MolFromSmiles() no coordinates are actually generated; these are only generated on the fly for visualization, and hence are correct before and after AddHs(), as the layout is always

Re: [Rdkit-discuss] How to set rdMolStandardize.CleanupParameters.maxTautomer for tautomer canonicalization

2020-08-27 Thread Paolo Tosco
Hi Fio, there is an open PR that addresses this and other issues with the TautomerEnumerator: https://github.com/rdkit/rdkit/pull/3327 As soon as it is merged into the main trunk this functionality will be available. Hope that helps, cheers p. On Thu, Aug 27, 2020 at 9:08 PM Fiorella Ruggiu

[Rdkit-discuss] Valence coding in atom block of SDF files written by RDKit

2020-08-26 Thread Paolo Tosco
Hi Jean-Marc, You can strip the valence field from the MolBlock with a regex: import re regex = re.compile(r"^(\s*\d+\.\d{4}\s*\d+\.\d{4}\s*\d+\.\d{4} ... \d \d \d \d \d )(\d)(.*)$") print("\n".join(regex.sub(r"\g<1>0\g<3>", ...: line) for line in Chem.MolToMolBlock(Chem....:
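An alternative sketch that swaps the regex for fixed-column slicing, which is how the V2000 atom block is laid out (the valence field vvv sits in columns 49-51, 1-based). zero_valence_field is a hypothetical helper, the single-atom block is built by hand so that the columns are explicit, and V3000 blocks are not handled.

```python
def zero_valence_field(molblock: str) -> str:
    """Overwrite the vvv (valence) field of every V2000 atom line with '  0'."""
    lines = molblock.split("\n")
    n_atoms = int(lines[3][0:3])  # the counts line starts with the atom count
    for i in range(4, 4 + n_atoms):
        if len(lines[i]) >= 51:   # vvv occupies characters 48..50 (0-based)
            lines[i] = lines[i][:48] + "  0" + lines[i][51:]
    return "\n".join(lines)


# Hand-built atom line: x, y, z, symbol, dd, then ccc sss hhh bbb vvv HHH rrr iii mmm nnn eee.
atom_line = ("%10.4f%10.4f%10.4f %-3s%2d" % (0.0, 0.0, 0.0, "N", 0)
             + ("%3d" * 11) % (0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0))
molblock = "\n".join([
    "",
    "  hand-made example",
    "",
    "  1  0  0  0  0  0  0  0  0  0999 V2000",
    atom_line,
    "M  END",
])

cleaned = zero_valence_field(molblock)
print(cleaned)
```

With RDKit available, the same function could be applied to the string returned by Chem.MolToMolBlock().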

Re: [Rdkit-discuss] Need help with MMFFOptimize

2020-08-22 Thread Paolo Tosco
Hi Joanna, could you please provide some context and paste your failing code? You seem to be referring to an e-mail thread that I can't find. Thanks, cheers p. On Sat, Aug 22, 2020 at 1:56 PM ITS RDC wrote: > Dear all/Greg, > > The conformation generation worked when I used the >

Re: [Rdkit-discuss] Keeping 3D coordinates from sdf file

2020-08-21 Thread Paolo Tosco
zeMolecule(mol) > > When I print out the coordinates I can see the heavy atoms have also been > modified. Would you be able to help me out here? > > Best regards, > Puck > > On Thu, 20 Aug 2020 at 13:26, Paolo Tosco > wrote: > >> Hi Puck, >> >> When y

Re: [Rdkit-discuss] Keeping 3D coordinates from sdf file

2020-08-20 Thread Paolo Tosco
Hi Puck, When you read an SDF file using an SDMolSupplier, RDKit will retain 3D coordinates. You can access them from the mol Conformer (a molecule can have multiple Conformers; one is generated for you when you read a set of coordinates): mol.GetConformer().GetAtomPosition(atom_idx) If you
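
A self-contained sketch of reading coordinates back through a Conformer. To avoid depending on a file, the SDF record is built in memory first; the molecule and the coordinates are made up for illustration.

```python
from rdkit import Chem
from rdkit.Geometry import Point3D

# Build a tiny SDF record in memory (arbitrary coordinates, illustration only)
mol = Chem.MolFromSmiles("CO")
conf = Chem.Conformer(mol.GetNumAtoms())
conf.SetAtomPosition(0, Point3D(0.0, 0.0, 0.0))
conf.SetAtomPosition(1, Point3D(1.2, 0.3, 0.5))
mol.AddConformer(conf)
sdf_text = Chem.MolToMolBlock(mol) + "$$$$\n"

# Read it back with an SDMolSupplier; the coordinates end up in a Conformer
suppl = Chem.SDMolSupplier()
suppl.SetData(sdf_text)
mol_from_sdf = suppl[0]

positions = []
for atom in mol_from_sdf.GetAtoms():
    pos = mol_from_sdf.GetConformer().GetAtomPosition(atom.GetIdx())
    positions.append((pos.x, pos.y, pos.z))
print(positions)
```

With a real file, Chem.SDMolSupplier("path/to/file.sdf") would replace the SetData() call.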

Re: [Rdkit-discuss] Atom Order Canonicalization

2020-08-14 Thread Paolo Tosco
Hi Jeffrey, this gist shows how to achieve what you need: https://gist.github.com/ptosco/36574d7f025a932bc1b8db221903a8d2 i.e., how to reorder atoms based on the result of Chem.CanonicalRankAtoms(). HTH, cheers p. On Fri, Aug 14, 2020 at 8:36 PM Jeffrey Van santen < jeffrey_van_san...@sfu.ca>
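
I have not reproduced the gist here; the following is only a sketch of the general idea, reordering atoms according to Chem.CanonicalRankAtoms() with Chem.RenumberAtoms(). The helper name and the test molecules are assumptions.

```python
from rdkit import Chem


def canonical_atom_order(mol):
    """Return a copy of mol with atoms renumbered by canonical rank (hypothetical helper)."""
    ranks = list(Chem.CanonicalRankAtoms(mol, breakTies=True))
    # RenumberAtoms expects, for each new position, the old index of the atom
    # that should go there, i.e. the old indices sorted by rank
    new_order = sorted(range(mol.GetNumAtoms()), key=lambda idx: ranks[idx])
    return Chem.RenumberAtoms(mol, new_order)


# The same molecule written with two different atom orderings
mol_a = canonical_atom_order(Chem.MolFromSmiles("OCCN"))
mol_b = canonical_atom_order(Chem.MolFromSmiles("NCCO"))
symbols_a = [a.GetSymbol() for a in mol_a.GetAtoms()]
symbols_b = [a.GetSymbol() for a in mol_b.GetAtoms()]
print(symbols_a, symbols_b)
```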

Re: [Rdkit-discuss] Aligning target molecule onto reference molecules

2020-06-30 Thread Paolo Tosco
Hi Lara, I have put together a gist that uses GetO3A() to align benzene to naphthalene and I got the correct mappings, so I am not sure what went wrong for you: https://gist.github.com/ptosco/1dc3eeae87b5676b60c3a47ad60cea0a To do the sort of alignments that you describe you could also
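
This is not the content of the gist, just a rough sketch of what a GetO3A() alignment of benzene onto naphthalene could look like. The embedding step, the random seed and the helper name embed are my assumptions.

```python
from rdkit import Chem
from rdkit.Chem import AllChem, rdMolAlign


def embed(smiles, seed=42):
    """Build a 3D structure with explicit Hs (hypothetical helper, fixed seed)."""
    mol = Chem.AddHs(Chem.MolFromSmiles(smiles))
    AllChem.EmbedMolecule(mol, randomSeed=seed)
    return mol


probe = embed("c1ccccc1")        # benzene, the molecule that gets moved
ref = embed("c1ccc2ccccc2c1")    # naphthalene, the fixed reference

o3a = rdMolAlign.GetO3A(probe, ref)
rmsd = o3a.Align()       # transforms the probe coordinates in place
matches = o3a.Matches()  # pairs of matched atom indices
print(rmsd, matches)
```

Inspecting the pairs returned by Matches() is a quick way to check whether the mapping is the one you expect.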

Re: [Rdkit-discuss] check if there is a path between atoms in fragmented molecule

2020-06-04 Thread Paolo Tosco
Dear Michal, You can use http://rdkit.org/docs/source/rdkit.Chem.rdmolops.html?highlight=path#rdkit.Chem.rdmolops.GetShortestPath or http://rdkit.org/docs/source/rdkit.Chem.rdmolops.html?highlight=distance#rdkit.Chem.rdmolops.GetDistanceMatrix e.g.: from rdkit import Chem
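
A short sketch of the GetShortestPath() route: as far as I know it returns an empty tuple when the two atoms sit in different fragments, which is what this relies on. The helper name and the two-fragment example are assumptions.

```python
from rdkit import Chem


def are_connected(mol, idx_a, idx_b):
    """True if a bond path links the two atoms (hypothetical helper)."""
    if idx_a == idx_b:
        return True
    # Assumption: no path between the atoms gives an empty result
    return len(Chem.GetShortestPath(mol, idx_a, idx_b)) > 0


# Two fragments: ethanol (atoms 0-2) and methylamine (atoms 3-4)
mol = Chem.MolFromSmiles("CCO.CN")
same_fragment = are_connected(mol, 0, 2)
different_fragments = are_connected(mol, 0, 4)
print(same_fragment, different_fragments)
```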

Re: [Rdkit-discuss] Unknown CMake command "downloadAndCheckMD5"

2020-05-28 Thread Paolo Tosco
unable to build against boost 1.71 but when I switched back to 1.67 it was OK. Seems to be building fine now. Tim On Thu, May 28, 2020 at 5:09 PM Paolo Tosco <paolo.tosco.m...@gmail.com> wrote: Hi Tim, could you please run a pwd in your build directory? Also, could y

Re: [Rdkit-discuss] Unknown CMake command "downloadAndCheckMD5"

2020-05-28 Thread Paolo Tosco
h is where it has always been run from. e.g. mkdir build cd build cmake -DPYTHON_EXECUTABLE=/usr/bin/python3  -DRDK_BUILD_INCHI_SUPPORT=ON  -DRDK_BUILD_AVALON_SUPPORT=ON  -DRDK_BUILD_PYTHON_WRAPPERS=ON  -DRDK_BUILD_SWIG_WRAPPERS=ON .. Tim On Thu, May 28, 2020 at 4:46 PM P

Re: [Rdkit-discuss] Unknown CMake command "downloadAndCheckMD5"

2020-05-28 Thread Paolo Tosco
Hi Tim, downloadAndCheckMD5 is a function defined in Code/cmake/Modules/RDKitUtils.cmake, which is included by the main CMakeLists.txt file and then is available to all children CMakeLists.txt files. From the line number where the error occurs $ grep -n downloadAndCheckMD5 `find . -name

Re: [Rdkit-discuss] Converting csv/xls file containing SMILES to .sdf

2020-05-28 Thread Paolo Tosco
Hi Joanna, I put a small gist here: https://gist.github.com/ptosco/49bdfc55db7277c7c94aca71b69f64b5 which reads SMILES and compound names from a CSV string; you may easily modify the code to read from a CSV file. Note that you could actually even just use
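
Independently of the gist, a minimal sketch of the CSV-to-SDF idea using the standard csv module and Chem.SDWriter. The column names smiles and name and the two example rows are assumptions.

```python
import csv
import io

from rdkit import Chem

# Example CSV content; the column names are assumed for illustration
csv_text = "smiles,name\nCCO,ethanol\nc1ccccc1,benzene\n"

sdf_buffer = io.StringIO()
writer = Chem.SDWriter(sdf_buffer)
for row in csv.DictReader(io.StringIO(csv_text)):
    mol = Chem.MolFromSmiles(row["smiles"])
    if mol is None:
        continue  # skip SMILES that fail to parse
    mol.SetProp("_Name", row["name"])  # becomes the title line of the record
    writer.write(mol)
writer.close()

sdf_text = sdf_buffer.getvalue()
print(sdf_text)
```

To work with files, open the CSV with open() and pass an output file name to Chem.SDWriter instead of the StringIO buffer.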

Re: [Rdkit-discuss] Substructure search issue with aliphatic/aromatic bonds

2020-05-20 Thread Paolo Tosco
Hi Theo, that's because you omitted the sanitization step completely, so the molecule is missing crucial information for the SubstructureMatch to do a proper job. If you put back sanitization, only leaving out the aromatization step, things work as expected. Also, you do not need to create
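
A sketch of what "sanitization without the aromatization step" can look like; the toluene SMILES and the SMARTS queries are my own examples, not the ones from the thread.

```python
from rdkit import Chem

# Parse without sanitization, then sanitize with everything except aromaticity perception
mol = Chem.MolFromSmiles("C1=CC=CC=C1C", sanitize=False)
Chem.SanitizeMol(
    mol,
    sanitizeOps=Chem.SANITIZE_ALL ^ Chem.SANITIZE_SETAROMATICITY,
)

# The ring stays in Kekule form, so an aliphatic C=C query matches
aliphatic_match = mol.HasSubstructMatch(Chem.MolFromSmarts("C=C"))
aromatic_match = mol.HasSubstructMatch(Chem.MolFromSmarts("c"))
any_aromatic = any(atom.GetIsAromatic() for atom in mol.GetAtoms())
print(aliphatic_match, aromatic_match, any_aromatic)
```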

Re: [Rdkit-discuss] Substructure search issue with aliphatic/aromatic bonds

2020-05-19 Thread Paolo Tosco
schrieb Paolo Tosco: Hi Theo, the lack of match is due to different aromaticity flags on atoms and bonds in the larger molecule. This gist provides some explanation and a possible solution: https://gist.github.com/ptosco/e410e45278b94e8f047ff224193d7788 Cheers, p. On 19/05/2020 14:13, theozh

Re: [Rdkit-discuss] Substructure search issue with aliphatic/aromatic bonds

2020-05-19 Thread Paolo Tosco
Hi Theo, the lack of match is due to different aromaticity flags on atoms and bonds in the larger molecule. This gist provides some explanation and a possible solution: https://gist.github.com/ptosco/e410e45278b94e8f047ff224193d7788 Cheers, p. On 19/05/2020 14:13, theozh wrote: Dear

Re: [Rdkit-discuss] unable to build from source with cmake

2020-05-14 Thread Paolo Tosco
Hi Gabriele, add -DBoost_DEBUG=ON to your cmake command to have more information regarding the failure. Also, does adding -DBoost_NO_BOOST_CMAKE=ON help or change things? Cheers, p. On 14/05/2020 13:45, baldu...@units.it wrote: hello I'm clearly missing something here, but not being able

Re: [Rdkit-discuss] Sanitize molecule with explicit Hydrogens to catch an error

2020-05-11 Thread Paolo Tosco
Dear Pablo, You might do something along these lines: from rdkit import Chem smi = "[H]C([H])O" params = Chem.SmilesParserParams() params.sanitize = True params.removeHs = False mol = Chem.MolFromSmiles(smi, params) for a in mol.GetAtoms(): if a.GetNumImplicitHs():
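
The snippet above is cut off after the if, so what follows is only my guess at one way to continue: collect the atoms that still carry implicit hydrogens although the SMILES was meant to be fully explicit. The reporting part is an assumption, not the original code.

```python
from rdkit import Chem

smi = "[H]C([H])O"
params = Chem.SmilesParserParams()
params.sanitize = True
params.removeHs = False  # keep the explicit [H] atoms as real atoms
mol = Chem.MolFromSmiles(smi, params)

# Assumed continuation: list every atom that still has implicit Hs
atoms_with_implicit_hs = [
    (atom.GetIdx(), atom.GetSymbol(), atom.GetNumImplicitHs())
    for atom in mol.GetAtoms()
    if atom.GetNumImplicitHs()
]
if atoms_with_implicit_hs:
    print("Atoms with missing explicit hydrogens:", atoms_with_implicit_hs)
```

Whether to raise an exception or just log at this point depends on how strict the check needs to be.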
