Re: [Rdkit-discuss] canonical fragment SMILES

2025-03-27 Thread Wim Dehaen
Pavel, this is a bit hacky, but you can try the below: ``` def get_frag_smi(mol,frag_atoms): if len(frag_atoms) > 1: b2b = [] # bonds to break fsmi = "" #fragment smiles # get bonds outside of fragment for b in mol.GetBonds(): b_idx = b.GetBeginAtomId

Re: [Rdkit-discuss] RDKit PostgreSQL extension: Unexpected behaviour of substruct()

2024-06-27 Thread Wim Dehaen
I would expect the problem here is kekulization. The SMARTS pattern is written as a Kekulé structure (i.e. double and single bonds, non-aromatic atoms) and is not sanitized, whereas the SMILES, after parsing and sanitization, has aromatic bonds and aromatic atoms. Try what happens when you do a
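
The mismatch described in this post can be sketched in a few lines. This uses the RDKit Python API rather than the PostgreSQL cartridge, and the benzene query strings are hypothetical inputs chosen for illustration, not the ones from the thread:

```python
from rdkit import Chem

# Molecule parsed from SMILES: sanitization perceives the ring as aromatic,
# even though it was written in Kekule form.
mol = Chem.MolFromSmiles("C1=CC=CC=C1")

# Kekule-style SMARTS: "C" means aliphatic carbon, "=" means double bond.
kekule_query = Chem.MolFromSmarts("C1=CC=CC=C1")
# Aromatic SMARTS: "c" means aromatic carbon.
aromatic_query = Chem.MolFromSmarts("c1ccccc1")

kekule_hit = mol.HasSubstructMatch(kekule_query)
aromatic_hit = mol.HasSubstructMatch(aromatic_query)
print("kekule query matches:", kekule_hit)
print("aromatic query matches:", aromatic_hit)
```

Only the aromatic query matches here, because SMARTS queries are not sanitized and their atom and bond primitives are taken literally.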

Re: [Rdkit-discuss] One tautomer not included in list of enumerated tautomers

2024-02-05 Thread Wim Dehaen
hi lewis, if i am not mistaken this is because the tautomer transform "1,3 aromatic heteroatom H shift" does not account for other chalcogens than oxygen, so no selenium, tellurium or sulfur. you can find the list of transforms here: https://github.com/rdkit/rdkit/blob/8dae48b7a17fd984c69d04549e6d9b

Re: [Rdkit-discuss] Aromatic atoms

2023-11-05 Thread Wim Dehaen
how about: len(list(mol.GetAromaticAtoms())) best wishes wim On Sun, 5 Nov 2023, 08:41 Chris Swain via Rdkit-discuss, < rdkit-discuss@lists.sourceforge.net> wrote: > Hi, > > Perhaps I’m missing something obvious, but is there a way to calculate the > number of aromatic atoms in a molecule? > >
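
A small runnable version of the one-liner above. Toluene is a hypothetical test input, and the second count via GetIsAromatic() is added here only as a cross-check:

```python
from rdkit import Chem

mol = Chem.MolFromSmiles("Cc1ccccc1")  # toluene, illustrative input

# Count via the aromatic-atom iterator suggested in the post.
n_aromatic = len(list(mol.GetAromaticAtoms()))

# Cross-check by testing the aromaticity flag on every atom.
n_aromatic_alt = sum(1 for atom in mol.GetAtoms() if atom.GetIsAromatic())

print(n_aromatic, n_aromatic_alt)
```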

Re: [Rdkit-discuss] mol properties in SDWriter

2023-09-25 Thread Wim Dehaen
Why there is a counter between parentheses there, I don't know, but in case there's no option to remove it, you might just manually remove it using a regex to remove anything between parentheses on a line that starts with > for example: from rdkit import Chem import re from io import StringIO m =
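
The regex idea can be sketched without RDKit at all, working on the SD text directly. The function name and the sample record are hypothetical, and this version is deliberately narrower than "anything between parentheses": it only strips a trailing "(digits)" from lines that start with ">".

```python
import re

def strip_sd_counters(sdf_text: str) -> str:
    """Remove a trailing "(N)" counter from SD data-header lines."""
    # Only lines beginning with ">" are touched; data values are left alone.
    return re.sub(r"^(>.*?)[ \t]*\(\d+\)[ \t]*$", r"\1", sdf_text, flags=re.MULTILINE)

example = ">  <logP>  (1)\n2.5 (approx)\n\n$$$$\n"
cleaned = strip_sd_counters(example)
print(cleaned)
```

Restricting the match to digits keeps parentheses that are part of a property value, such as "2.5 (approx)" in the sample.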

Re: [Rdkit-discuss] Distinguishing bridgeheads from ring-fusions with SMARTS

2023-08-25 Thread Wim Dehaen
wim On Fri, Aug 25, 2023 at 8:28 PM Wim Dehaen wrote: > Dear Andreas, > that's a good find. i agree the breaking case can be considered bridgehead > structure, as it's essentially bicyclo-[3.2.1]-octane plus an extra bond. I > need to think about this some more, but it might be

Re: [Rdkit-discuss] Distinguishing bridgeheads from ring-fusions with SMARTS

2023-08-25 Thread Wim Dehaen
identify just the bridgehead atoms. > > Best wishes, > Andreas > > On Sat, Dec 3, 2022 at 12:53 PM Wim Dehaen wrote: > >> Hi Andreas, >> I don't have a good SMARTS pattern available for this but here is a >> function that should return bridge

Re: [Rdkit-discuss] Two functional groups in a single SMARTS pattern

2023-08-24 Thread Wim Dehaen
Dear Andreas, the issue is with your aldehyde/ketone smarts. it looks for an explicit aldehyde H that is not there. When the input smi is NCC(=O)C the substructure matches. An alternative smarts you can use that will match aldehyde but not esters and amides: [#7H2].[#6][C;!$(C-O);!$(C-N)](=[O]) b
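
A quick check of the alternative SMARTS quoted above. The ketone NCC(=O)C is the input mentioned in the post; the ester and amide SMILES are hypothetical counter-examples added for illustration:

```python
from rdkit import Chem

# Two disconnected components: a primary amine and a C=O carbon that is
# not singly bonded to O or N (so esters, acids and amides are excluded).
pattern = Chem.MolFromSmarts("[#7H2].[#6][C;!$(C-O);!$(C-N)](=[O])")

test_smiles = {
    "ketone": "NCC(=O)C",
    "ester": "NCC(=O)OC",
    "amide": "NCC(=O)NC",
}
results = {
    name: Chem.MolFromSmiles(smi).HasSubstructMatch(pattern)
    for name, smi in test_smiles.items()
}
print(results)
```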

Re: [Rdkit-discuss] Varying ring size substructure match.

2023-08-20 Thread Wim Dehaen
Hi, i'm not sure if i understand the question perfectly, so apologies if the below is beside the point. i think in general, for analysis like this it is better to make use of rdkit's SSSR functionality and then use the ring information in the way required for your purpose. this tends to be much mor
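
A minimal sketch of the ring-information approach, assuming the goal is to pick out atoms in rings within a size range. The molecule and the 5 to 7 range are hypothetical choices, not taken from the thread:

```python
from rdkit import Chem

mol = Chem.MolFromSmiles("C1CC1C1CCCCC1")  # cyclopropylcyclohexane, illustrative
ring_info = mol.GetRingInfo()

# Each entry of AtomRings() is a tuple of atom indices forming one ring.
ring_sizes = sorted(len(ring) for ring in ring_info.AtomRings())

# Collect atoms that sit in at least one ring with 5 to 7 members.
atoms_in_5_to_7 = sorted(
    {idx for ring in ring_info.AtomRings() if 5 <= len(ring) <= 7 for idx in ring}
)
print(ring_sizes, atoms_in_5_to_7)
```

Filtering in Python on the ring tuples avoids having to encode a variable ring size in a single SMARTS pattern.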

Re: [Rdkit-discuss] Multiple products with runReactants in C++

2023-08-06 Thread Wim Dehaen
You need to run the reaction twice (or more generally you can rerun RunReactants on the products iteratively until there are no more new products, so examples like phloroglucinol to 1,3,5-trifluorobenzene will work too.) best wishes wim On Sun, Aug 6, 2023 at 4:50 PM Andreas Luttens wrote: > Dea
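
The iterative idea can be sketched as follows. This is in Python rather than C++, and the reaction SMARTS (aromatic OH to aromatic F) and the helper name are assumptions made up for the phloroglucinol example, not the poster's code:

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Hypothetical one-reactant transformation: aromatic hydroxyl -> aromatic fluoride.
rxn = AllChem.ReactionFromSmarts("[c:1][OX2H1]>>[c:1]F")

def run_until_exhausted(reaction, smiles, max_rounds=10):
    """Reapply the reaction to its own products until nothing new appears.

    Returns canonical SMILES of the molecules the reaction no longer changes.
    """
    seen = set()
    terminal = set()
    frontier = {Chem.CanonSmiles(smiles)}
    for _ in range(max_rounds):
        next_frontier = set()
        for smi in frontier:
            outcomes = reaction.RunReactants((Chem.MolFromSmiles(smi),))
            if not outcomes:
                terminal.add(smi)  # no reactive site left
            for products in outcomes:
                for product in products:
                    Chem.SanitizeMol(product)
                    # Round-trip through SMILES so keys are comparable.
                    psmi = Chem.CanonSmiles(Chem.MolToSmiles(product))
                    if psmi not in seen:
                        seen.add(psmi)
                        next_frontier.add(psmi)
        if not next_frontier:
            break
        frontier = next_frontier
    return terminal

final_products = run_until_exhausted(rxn, "Oc1cc(O)cc(O)c1")  # phloroglucinol
print(final_products)
```

Deduplicating on canonical SMILES keeps the loop from revisiting the symmetry-equivalent products that RunReactants returns for each matching site.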

Re: [Rdkit-discuss] Question of substructure "neighborhood"

2023-06-30 Thread Wim Dehaen
Hi Joey, I think the most straightforward way to do this is to use GetNeighbors() on all atoms. See below for an example: from rdkit import Chem mol=Chem.MolFromSmiles("O1COc2c1ccc(CC(NC)C)c2") substruct=Chem.MolFromSmarts("c1ccccc1") a=mol.GetSubstructMatch(substruct) print("substructure benzene
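
A self-contained sketch of the GetNeighbors() approach, using the molecule from the post and assuming the intended query is benzene written as c1ccccc1. The variable names are hypothetical:

```python
from rdkit import Chem

mol = Chem.MolFromSmiles("O1COc2c1ccc(CC(NC)C)c2")
query = Chem.MolFromSmarts("c1ccccc1")  # assumed benzene query

match = set(mol.GetSubstructMatch(query))

# Atoms bonded to the matched substructure but not part of it.
neighborhood = set()
for idx in match:
    for nbr in mol.GetAtomWithIdx(idx).GetNeighbors():
        if nbr.GetIdx() not in match:
            neighborhood.add(nbr.GetIdx())

print("benzene atoms:", sorted(match))
print("neighbor atoms:", sorted(neighborhood))
```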

Re: [Rdkit-discuss] REACTIVE SMARTS yields product with explicit valence greater than 5 for carbon

2023-05-30 Thread Wim Dehaen
I think the problem is that you specify exact, explicit hydrogen count on the product end of your reaction SMARTS. Instead you can use this (rxnSMARTS bolded): rxn=AllChem.ReactionFromSmarts(" *[F][Zr@]([Cl])([Br])[CH2:1][C:2]([C:3])[C!H3:4].[CH2:5]=[CH1:6][C:7]>>[F][Zr@@]([Cl])([Br])[*:5][*@:6]([

Re: [Rdkit-discuss] Virtual hydrogens for metals (smiles and smarts)

2023-05-18 Thread Wim Dehaen
[Zr;H1] this smarts pattern should match a zirconium with hcount of exactly one. see below for a demonstration: m=Chem.MolFromSmiles("[ZrH]CC") pat=Chem.MolFromSmarts("[Zr;H1]") len(m.GetSubstructMatches(pat)) 1 hope this helps, wim On Fri, May 19, 2023 at 12:33 AM Jarod Younker wrote: > I’

Re: [Rdkit-discuss] how to get indexes and atoms with H from smiles

2023-05-09 Thread Wim Dehaen
Hi, I think if you simply need H and the H count appended it is by far the easiest by just appending it to the symbol string. See the codeblock below: def get_symbol_with_Hs(a): symbol=a.GetSymbol() charge=a.GetFormalCharge() hcount=a.GetTotalNumHs() if hcount > 0: symbol+=

Re: [Rdkit-discuss] Molfile from smiles

2023-05-02 Thread Wim Dehaen
Hi all, unfortunately I can't offer a "fix" but I can offer these minor comments: -it seems like the SMILES has some parsing error. You can make use of RDKit's extension for dative bonds in SMILES ("->") and replace the SMILES with the below, which will parse, and give (what i assume is) the intend

Re: [Rdkit-discuss] Unwanted explicit Hs

2023-04-29 Thread Wim Dehaen
The reason for this is that it will prevent ambiguities due to nonstandard, higher valences. Because of this, it is not possible to infer the implicit hydrogen count, so it must be specified explicitly. For S and P the standard valence would be 2 and 3 respectively, just like for O and N. But S has

Re: [Rdkit-discuss] Deuterium/Tritium labels in Molfile

2023-04-11 Thread Wim Dehaen

Re: [Rdkit-discuss] Deuterium/Tritium labels in Molfile

2023-04-10 Thread Wim Dehaen
rdkit outputs a molfile with correct isotope labels for me using just: mol=Chem.MolFromSmiles("[3H]c1ccccc1[2H]") Chem.MolToMolFile(mol,"test.mol") or labelling the atoms post hoc: mol=Chem.MolFromSmiles("c1ccccc1") mol=Chem.AddHs(mol) mol.GetAtomWithIdx(6).SetIsotope(3) mol.GetAtomWithIdx(7).Se

Re: [Rdkit-discuss] problem when reading in a .sdf file w/ hydrogens already present and removeHs=False

2023-04-03 Thread Wim Dehaen
the sdf doesn't parse so well for me when pasted from the mail (and seems to contain an unusual conformation) but have you tried to turn includeNeighbors=True in this line: numHs = a.GetTotalNumHs(includeNeighbors=True) in case this fixes your issue, discussion why this flag is needed can be found i
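
The effect of the flag can be seen on a toy case. Methane with hydrogens added as real atoms is a hypothetical stand-in for the poster's SDF, which is not reproduced here:

```python
from rdkit import Chem

# After AddHs the hydrogens are graph atoms, not implicit counts.
mol = Chem.AddHs(Chem.MolFromSmiles("C"))
carbon = mol.GetAtomWithIdx(0)

h_default = carbon.GetTotalNumHs()                             # ignores H neighbors
h_with_neighbors = carbon.GetTotalNumHs(includeNeighbors=True)  # counts them

print(h_default, h_with_neighbors)
```

The same situation arises when a file is read with removeHs=False, since the hydrogens then stay in the graph as neighbors.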

Re: [Rdkit-discuss] MolToMolBlock problem

2023-02-22 Thread Wim Dehaen
Hi all, No error on rdkit 2022.09.04 boost 1_78 (Ubuntu 20.04). However I am able to reproduce the error on 2022.03.2, here it is: Invariant Violation no eligible neighbors for chiral center Violation occurred on line 238 in file /project/build/temp.linux-x86_64-cpython-39/rdkit/Code/GraphMol

Re: [Rdkit-discuss] Adding hydrogen to conformations works srangely.

2022-12-20 Thread Wim Dehaen
Hello, I think the place the hydrogens get lost is during the "MolFromMolBlock" operation. Try adding the flag removeHs=False. best wishes, wim On Tue, Dec 20, 2022 at 10:56 AM Omar H94 wrote: > Dear Petro, > > Try using: Chem.AddHs(mol, addCoords=True) to get hydrogen atoms with 3D > coordina
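
A minimal round trip that shows where the hydrogens disappear. Ethane is a hypothetical input and no 3D coordinates are generated, so this only illustrates the atom counts:

```python
from rdkit import Chem

mol = Chem.AddHs(Chem.MolFromSmiles("CC"))  # ethane with explicit H atoms
block = Chem.MolToMolBlock(mol)

# Default parsing strips hydrogens again.
stripped = Chem.MolFromMolBlock(block)
# removeHs=False keeps them as atoms in the graph.
kept = Chem.MolFromMolBlock(block, removeHs=False)

print(stripped.GetNumAtoms(), kept.GetNumAtoms())
```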

Re: [Rdkit-discuss] Distinguishing bridgeheads from ring-fusions with SMARTS

2022-12-03 Thread Wim Dehaen
Hi Andreas, I don't have a good SMARTS pattern available for this but here is a function that should return bridgehead idx and not include non bridgehead fused ring atoms: ``` def return_bridgeheads_idx(mol): bh_list=[] intersections=[] sssr_idx = [set(x) for x in list(Chem.GetSymmSSSR

Re: [Rdkit-discuss] Cannot match reaction SMART with reactant

2022-11-23 Thread Wim Dehaen
]2[*:4]-[n]=[*:6][c:7]12" On Wed, Nov 23, 2022 at 4:02 PM Wim Dehaen wrote: > Hello, > I think the error is in the placement of the $ in the SMARTS query, I am > not sure what is the purpose in there. if i run > > reaction_smarts = > > '[cX3]1[cX3][cX3][cX3]([N

Re: [Rdkit-discuss] Cannot match reaction SMART with reactant

2022-11-23 Thread Wim Dehaen
Hello, I think the error is in the placement of the $ in the SMARTS query, I am not sure what is the purpose in there. if i run reaction_smarts = '[cX3]1[cX3][cX3][cX3]([NX3H2:1])[cX3]([NX3H2:2])[cX3]1>>[cX3]1[cX3][cX3][cX3][cX3]2[nX3:1]=[nX3]-[nX3H:2][cX3]12' rxn = AllChem.ReactionFromSmarts(re

Re: [Rdkit-discuss] [bug] ResonanceMolSupplier not working as expected

2022-11-14 Thread Wim Dehaen
correctly give two discrete resonance structures: CC1=CC=CC=C1C CC1=C(C)C=CC=C1 best wishes wim On Mon, Nov 14, 2022 at 1:13 PM Wim Dehaen wrote: > Hello, > I can reproduce the different behavior between a 2020.09 version and an up > to date one on my end as well. I think it is re

Re: [Rdkit-discuss] [bug] ResonanceMolSupplier not working as expected

2022-11-14 Thread Wim Dehaen
Hello, I can reproduce the different behavior between a 2020.09 version and an up to date one on my end as well. I think it is related to this issue: https://github.com/rdkit/rdkit/issues/3973 during the kekulization some unintended normalization seems to be happening leading to the same kekule str

Re: [Rdkit-discuss] SMARTS definition for basic nitrogen

2022-10-06 Thread Wim Dehaen
Hi, For an even simpler definition that works "well enough" for many cases I have been using "[NX3,NX2,nX2;!$(NC=O);!$(NS=O)]" this excludes amides, nitro, nitroso, sulfonamide and pyrrole nitrogens, but includes aliphatic amines, anilines, pyridines, hydroxylamines, imines etc best wishes wim On Th
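
A short check of the pattern on a handful of molecules. The test set is a hypothetical selection, and only the amine, pyridine, amide, sulfonamide and pyrrole cases are exercised here:

```python
from rdkit import Chem

basic_n = Chem.MolFromSmarts("[NX3,NX2,nX2;!$(NC=O);!$(NS=O)]")

examples = {
    "triethylamine": "CCN(CC)CC",
    "pyridine": "c1ccncc1",
    "acetamide": "CC(N)=O",
    "methanesulfonamide": "CS(N)(=O)=O",
    "pyrrole": "c1cc[nH]c1",
}
hits = {
    name: Chem.MolFromSmiles(smi).HasSubstructMatch(basic_n)
    for name, smi in examples.items()
}
for name, hit in hits.items():
    print(f"{name}: {hit}")
```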

Re: [Rdkit-discuss] SMARTS pattern

2022-06-07 Thread Wim Dehaen
The above solution with !r4 doesn't work because for SSSR reasons these atoms are considered to be in a 4 membered ring even if the 4 membered ring is "exo" to the central 6 membered one. AFAIK there is no good way to do a general ring size filter in an atom definition using SMARTS. Below is a quit

Re: [Rdkit-discuss] RunReactants and chirality?

2022-01-24 Thread Wim Dehaen
Hi, I think that is not expected behavior and perhaps it could be a consequence of a faulty reaction definition. using the following with a generic amide coupling: rxn = AllChem.ReactionFromSmarts('[C:1](=[O:2])[OH:3].[N:4]>>[C:1](=[O:2])[N:4].[OH:3]') reactant1 = Chem.MolFromSmiles("C[C@@H](OC1=C

Re: [Rdkit-discuss] generating smiles using RDKit

2021-12-08 Thread Wim Dehaen
Hi all, Just noticed why some of the SMILES in my script don't parse: I accidentally put brackets at the terminal carbons, my bad. Here is a fixed script also with updated molecular weights https://gist.github.com/dehaenw/bb5704fc4d108eec8f8e999d6ab79118 I looked into the different total amount of

Re: [Rdkit-discuss] generating smiles using RDKit

2021-12-08 Thread Wim Dehaen
Due to the nature of SCCP, which are just based on chlorination of n-alkanes (so just linear), enumerating them via a more limited method than the interesting linked preprint is also possible. This can be done exhaustively on the string level: a given carbon in the chain will have either 0 or 1 or 2
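
A string-level sketch of this enumeration, written independently of the linked gist. It assumes each carbon carries 0, 1 or 2 chlorines (the snippet is cut off at this point), ignores stereochemistry, and treats a chain and its reverse as the same compound. The function name is hypothetical:

```python
from itertools import product

def enumerate_chloroalkanes(n_carbons):
    """Return one SMILES per distinct chlorination pattern of a linear alkane."""
    # SMILES fragments for a carbon with 0, 1 or 2 chlorines.
    inner = ("C", "C(Cl)", "C(Cl)(Cl)")
    # The last carbon is written without a trailing branch.
    last = ("C", "CCl", "C(Cl)Cl")
    seen = set()
    smiles = []
    for counts in product(range(3), repeat=n_carbons):
        # A pattern and its mirror image describe the same molecule.
        key = min(counts, counts[::-1])
        if key in seen:
            continue
        seen.add(key)
        parts = [inner[c] for c in key[:-1]] + [last[key[-1]]]
        smiles.append("".join(parts))
    return smiles

c3_patterns = enumerate_chloroalkanes(3)
print(len(c3_patterns), c3_patterns[:5])
```

The list includes the unchlorinated alkane itself, so filter on chlorine count afterwards if only chlorinated species are wanted.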

Re: [Rdkit-discuss] Using EnumerateLibraryFromReaction without fragmenting reactants

2021-12-05 Thread Wim Dehaen
Hi, I think there is an issue on the level of your reaction SMARTS, in fact i cannot get it to work on your example molecules, as there is a methyl [CH3:0] amine defined in the query which is not to be found in the substrate. I imagine some more explicit mapping of what connects and what breaks whe

Re: [Rdkit-discuss] GetSubstructMatch bug? + mol depiction issue

2021-11-04 Thread Wim Dehaen
Re: your second issue with the 2D depiction, especially for these kinds of bicyclic structures, the Schrödinger coordgen that is a part of RDKit gives better looking results (at the cost of a somewhat longer running time). You can turn it on using: from rdkit.Chem import rdDepictor rdDepictor.SetPre