Pavel,
this is a bit hacky, but you can try the below:
```
def get_frag_smi(mol,frag_atoms):
    if len(frag_atoms) > 1:
        b2b = []  # bonds to break
        fsmi = ""  # fragment SMILES
        # get bonds outside of fragment
        for b in mol.GetBonds():
            b_idx = b.GetBeginAtomId
```
I would expect the problem here is kekulization. The SMARTS pattern is
matched as a Kekulé structure (i.e. double and single bonds, non-aromatic
atoms) and is not sanitized, whereas the SMILES, after parsing and
sanitization, has aromatic bonds and aromatic atoms. Try what happens when
you do a
hi lewis,
if i am not mistaken this is because the tautomer transform "1,3 aromatic
heteroatom H shift" does not account for other chalcogens than oxygen, so
no selenium, tellurium or sulfur.
you can find the list of transforms here:
https://github.com/rdkit/rdkit/blob/8dae48b7a17fd984c69d04549e6d9b
how about:
len(list(mol.GetAromaticAtoms()))
best wishes
wim
On Sun, 5 Nov 2023, 08:41 Chris Swain via Rdkit-discuss, <
rdkit-discuss@lists.sourceforge.net> wrote:
> Hi,
>
> Perhaps I’m missing something obvious, but is there a way to calculate the
> number of aromatic atoms in a molecule?
>
>
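A minimal sketch of the aromatic atom count, assuming RDKit's default aromaticity model is what you want. The helper name count_aromatic_atoms and the toluene test molecule are only illustrative; the per-atom flag check should give the same number as the GetAromaticAtoms() one-liner.

```python
from rdkit import Chem


def count_aromatic_atoms(mol):
    """Count atoms flagged as aromatic by RDKit's aromaticity perception."""
    return sum(1 for atom in mol.GetAtoms() if atom.GetIsAromatic())


toluene = Chem.MolFromSmiles("Cc1ccccc1")  # illustrative molecule
n_flag = count_aromatic_atoms(toluene)
n_iter = len(list(toluene.GetAromaticAtoms()))  # the one-liner from the reply
print(n_flag, n_iter)
```

The methyl carbon is not counted, only the six ring atoms.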
Why there is a counter between parentheses there, I don't know, but in case
there's no option to remove it, you might just manually remove it using a
regex to remove anything between parentheses on a line that starts with >
for example:
from rdkit import Chem
import re
from io import StringIO
m =
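The regex idea can be sketched on plain text, without RDKit. This assumes the counter only shows up on SD property header lines (the ones starting with ">"); the function name strip_sdf_counters and the sample record are made up for illustration.

```python
import re


def strip_sdf_counters(sdf_text):
    """Remove parenthesised text from lines that start with '>'."""
    cleaned = []
    for line in sdf_text.splitlines():
        if line.startswith(">"):
            # drop "(...)" plus the whitespace in front of it, header lines only
            line = re.sub(r"\s*\([^)]*\)", "", line).rstrip()
        cleaned.append(line)
    return "\n".join(cleaned)


sample = ">  <ID>  (1)\naspirin (keep me)\n\n>  <logP>  (1)\n1.3\n\n$$$$"
cleaned = strip_sdf_counters(sample)
print(cleaned)
```

Parentheses inside the property values are left alone because only header lines are rewritten.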
wim
On Fri, Aug 25, 2023 at 8:28 PM Wim Dehaen wrote:
> Dear Andreas,
> that's a good find. i agree the breaking case can be considered bridgehead
> structure, as it's essentially bicyclo-[3.2.1]-octane plus an extra bond. I
> need to think about this some more, but it might be
identify just the bridgehead atoms.
>
> Best wishes,
> Andreas
>
> On Sat, Dec 3, 2022 at 12:53 PM Wim Dehaen wrote:
>
>> Hi Andreas,
>> I don't have a good SMARTS pattern available for this but here is a
>> function that should return bridge
Dear Andreas,
the issue is with your aldehyde/ketone smarts. it looks for an explicit
aldehyde H that is not there. When the input smi is NCC(=O)C the
substructure matches.
An alternative smarts you can use that will match aldehyde but not esters
and amides:
[#7H2].[#6][C;!$(C-O);!$(C-N)](=[O])
b
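A quick check of the alternative SMARTS, as a sketch only. The four test SMILES other than NCC(=O)C are my own picks, not from the thread, chosen so that each contains a primary amine plus one carbonyl type.

```python
from rdkit import Chem

pattern = Chem.MolFromSmarts("[#7H2].[#6][C;!$(C-O);!$(C-N)](=[O])")

# illustrative inputs: primary amine plus ketone, aldehyde, ester, amide
examples = {
    "NCC(=O)C": "ketone",
    "NCCC=O": "aldehyde",
    "NCC(=O)OC": "ester",
    "NCC(=O)NC": "amide",
}

hits = {
    smi: Chem.MolFromSmiles(smi).HasSubstructMatch(pattern)
    for smi in examples
}
for smi, label in examples.items():
    print(label, smi, hits[smi])
```

The ketone and the aldehyde should match, while the ester and the amide are rejected by the two recursive exclusions.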
Hi,
i'm not sure if i understand the question perfectly, so apologies if the
below is beside the point. i think in general, for analysis like this it is
better to make use of rdkit's SSSR functionality and then use the ring
information in the way required for your purpose. this tends to be much
mor
You need to run the reaction twice (or, more generally, you can rerun
RunReactants on the products iteratively until there are no more new
products, so examples like phloroglucinol to 1,3,5-trifluorobenzene will
work too).
best wishes
wim
On Sun, Aug 6, 2023 at 4:50 PM Andreas Luttens
wrote:
> Dea
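The "rerun until there are no new products" loop can be sketched independently of the actual reaction. In the sketch below, exhaust and toy_step are hypothetical names, and toy_step is a plain string stand-in (swap one O for F) for what would really be RunReactants followed by sanitization and conversion back to SMILES.

```python
def exhaust(start, step):
    """Apply `step` repeatedly and return the items that give no further products."""
    seen = {start}
    frontier = [start]
    terminal = set()
    while frontier:
        item = frontier.pop()
        products = set(step(item))
        if not products:
            terminal.add(item)  # nothing left to react
        for product in products:
            if product not in seen:
                seen.add(product)
                frontier.append(product)
    return terminal


def toy_step(smiles):
    # hypothetical stand-in for one reaction pass: replace a single 'O' by 'F'
    return [smiles[:i] + "F" + smiles[i + 1:]
            for i, ch in enumerate(smiles) if ch == "O"]


final = exhaust("Oc1cc(O)cc(O)c1", toy_step)
print(final)
```

Tracking the already seen products keeps the loop from revisiting intermediates that are reachable by more than one route.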
Hi Joey,
I think the most straightforward way to do this is to use GetNeighbors() on
all atoms. See below for an example:
from rdkit import Chem
mol=Chem.MolFromSmiles("O1COc2c1ccc(CC(NC)C)c2")
substruct=Chem.MolFromSmarts("c1ccccc1")
a=mol.GetSubstructMatch(substruct)
print("substructure benzene
I think the problem is that you specify exact, explicit hydrogen count on
the product end of your reaction SMARTS. Instead you can use this
(rxnSMARTS bolded):
rxn=AllChem.ReactionFromSmarts("
*[F][Zr@]([Cl])([Br])[CH2:1][C:2]([C:3])[C!H3:4].[CH2:5]=[CH1:6][C:7]>>[F][Zr@@]([Cl])([Br])[*:5][*@:6]([
[Zr;H1] this SMARTS pattern should match a zirconium with an H count of
exactly one. see below for a demonstration:
m=Chem.MolFromSmiles("[ZrH]CC")
pat=Chem.MolFromSmarts("[Zr;H1]")
len(m.GetSubstructMatches(pat))
1
hope this helps,
wim
On Fri, May 19, 2023 at 12:33 AM Jarod Younker
wrote:
> I’
Hi,
I think if you simply need H and the H count appended, it is by far
easiest to just append it to the symbol string. See the codeblock below:
def get_symbol_with_Hs(a):
symbol=a.GetSymbol()
charge=a.GetFormalCharge()
hcount=a.GetTotalNumHs()
if hcount > 0:
symbol+=
Hi all,
unfortunately I can't offer a "fix" but I can offer these minor comments:
-it seems like the SMILES has some parsing error. You can make use of
RDKit's extension for dative bonds in SMILES ("->") and replace the SMILES
with the below, which will parse, and give (what i assume is) the intend
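As a small illustration of the "->" extension (not the SMILES from this thread, which is cut off above), here is a toy ammonia to copper donor bond. The molecule is hypothetical and only meant to show that the parsed bond comes out as a dative bond.

```python
from rdkit import Chem

# hypothetical toy complex: ammonia nitrogen donating to a copper atom
mol = Chem.MolFromSmiles("N->[Cu]")
bond = mol.GetBondBetweenAtoms(0, 1)
is_dative = bond.GetBondType() == Chem.BondType.DATIVE
print(bond.GetBondType(), is_dative)
```

The arrow direction matters: it points from the donor atom to the acceptor.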
The reason for this is that it will prevent ambiguities due to nonstandard,
higher valences. Because of this, it is not possible to infer the implicit
hydrogen count, so it must be specified explicitly. For S and P the
standard valence would be 2 and 3 respectively, just like for O and N. But
S has
rdkit outputs a molfile with correct isotope labels for me using just:
mol=Chem.MolFromSmiles("[3H]c1ccccc1[2H]")
Chem.MolToMolFile(mol,"test.mol")
or labelling the atoms post hoc:
mol=Chem.MolFromSmiles("c1ccccc1")
mol=Chem.AddHs(mol)
mol.GetAtomWithIdx(6).SetIsotope(3)
mol.GetAtomWithIdx(7).Se
the SDF doesn't parse so well for me when pasted from the mail (and seems to
contain an unusual conformation), but have you tried setting
includeNeighbors=True in this line:
numHs = a.GetTotalNumHs(includeNeighbors=True)
in case this fixes your issue, discussion why this flag is needed can be
found i
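A small sketch of what the flag changes, using methane with explicit hydrogens as an illustrative case. Once hydrogens are real atoms in the graph, the default call no longer sees them.

```python
from rdkit import Chem

mol = Chem.AddHs(Chem.MolFromSmiles("C"))  # methane, hydrogens as explicit atoms
carbon = mol.GetAtomWithIdx(0)

default_count = carbon.GetTotalNumHs()  # ignores H atoms present in the graph
with_neighbors = carbon.GetTotalNumHs(includeNeighbors=True)  # counts them too
print(default_count, with_neighbors)
```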
Hi all,
No error on rdkit 2022.09.04 boost 1_78 (Ubuntu 20.04).
However I am able to reproduce the error on 2022.03.2, here it is:
Invariant Violation
no eligible neighbors for chiral center
Violation occurred on line 238 in file
/project/build/temp.linux-x86_64-cpython-39/rdkit/Code/GraphMol
Hello,
I think the place the hydrogens get lost is during the "MolFromMolBlock"
operation. Try adding the flag removeHs=False.
best wishes,
wim
On Tue, Dec 20, 2022 at 10:56 AM Omar H94 wrote:
> Dear Petro,
>
> Try using: Chem.AddHs(mol, addCoords=True) to get hydrogen atoms with 3D
> coordina
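To see where the hydrogens disappear, a round trip through a mol block is enough. This is a sketch with methane as an illustrative molecule; the variable names are arbitrary.

```python
from rdkit import Chem

methane = Chem.AddHs(Chem.MolFromSmiles("C"))
block = Chem.MolToMolBlock(methane)  # the block contains all five atoms

default_read = Chem.MolFromMolBlock(block)  # hydrogens stripped on reading
keep_h = Chem.MolFromMolBlock(block, removeHs=False)  # hydrogens kept
print(default_read.GetNumAtoms(), keep_h.GetNumAtoms())
```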
Hi Andreas,
I don't have a good SMARTS pattern available for this but here is a
function that should return bridgehead idx and not include non bridgehead
fused ring atoms:
```
def return_bridgeheads_idx(mol):
    bh_list=[]
    intersections=[]
    sssr_idx = [set(x) for x in list(Chem.GetSymmSSSR
```
]2[*:4]-[n]=[*:6][c:7]12"
On Wed, Nov 23, 2022 at 4:02 PM Wim Dehaen wrote:
> Hello,
> I think the error is in the placement of the $ in the SMARTS query, I am
> not sure what is the purpose in there. if i run
>
> reaction_smarts =
>
> '[cX3]1[cX3][cX3][cX3]([N
Hello,
I think the error is in the placement of the $ in the SMARTS query, I am
not sure what is the purpose in there. if i run
reaction_smarts =
'[cX3]1[cX3][cX3][cX3]([NX3H2:1])[cX3]([NX3H2:2])[cX3]1>>[cX3]1[cX3][cX3][cX3][cX3]2[nX3:1]=[nX3]-[nX3H:2][cX3]12'
rxn = AllChem.ReactionFromSmarts(re
correctly give two discrete resonance structures:
CC1=CC=CC=C1C
CC1=C(C)C=CC=C1
best wishes
wim
On Mon, Nov 14, 2022 at 1:13 PM Wim Dehaen wrote:
> Hello,
> I can reproduce the different behavior between a 2020.09 version and an up
> to date one on my end as well. I think it is re
Hello,
I can reproduce the different behavior between a 2020.09 version and an up
to date one on my end as well. I think it is related to this issue:
https://github.com/rdkit/rdkit/issues/3973 during the kekulization some
unintended normalization seems to be happening leading to the same kekule
str
Hi,
For an even simpler definition that works "well enough" for many cases I
have been using
"[NX3,NX2,nX2;!$(NC=O);!$(NS=O)]"
this excludes amide, nitro, nitroso, sulfonamide and pyrrole nitrogens, but
includes aliphatic amines, anilines, pyridines, hydroxylamines, imines etc.
best wishes
wim
On Th
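A quick way to try the pattern on a handful of nitrogen types. The test molecules are my own picks and cover only some of the classes listed above, so treat this as a spot check rather than a validation.

```python
from rdkit import Chem

basic_n = Chem.MolFromSmarts("[NX3,NX2,nX2;!$(NC=O);!$(NS=O)]")

# illustrative molecules, one nitrogen each
smiles = {
    "methylamine": "CN",
    "aniline": "Nc1ccccc1",
    "pyridine": "c1ccncc1",
    "acetamide": "CC(N)=O",
    "methanesulfonamide": "CS(N)(=O)=O",
    "pyrrole": "c1cc[nH]c1",
}

matches = {
    name: Chem.MolFromSmiles(smi).HasSubstructMatch(basic_n)
    for name, smi in smiles.items()
}
for name, hit in matches.items():
    print(name, hit)
```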
The above solution with !r4 doesn't work because, for SSSR reasons, these
atoms are considered to be in a 4-membered ring even if the 4-membered ring
is "exo" to the central 6-membered one. AFAIK there is no good way to do a
general ring size filter in an atom definition using SMARTS. Below is a
quit
Hi,
I think that is not expected behavior and perhaps it could be a consequence
of a faulty reaction definition.
using the following with a generic amide coupling:
rxn =
AllChem.ReactionFromSmarts('[C:1](=[O:2])[OH:3].[N:4]>>[C:1](=[O:2])[N:4].[OH:3]')
reactant1 = Chem.MolFromSmiles("C[C@@H](OC1=C
Hi all,
Just noticed why some of the SMILES in my script don't parse: I
accidentally put brackets at the terminal carbons, my bad. Here is a fixed
script also with updated molecular weights
https://gist.github.com/dehaenw/bb5704fc4d108eec8f8e999d6ab79118
I looked into the different total amount of
Due to the nature of SCCPs, which are just chlorinated n-alkanes (so
strictly linear), they can also be enumerated via a more limited method
than the one in the interesting linked preprint.
This can be done exhaustively on the string level:
a given carbon in the chain will have either 0 or 1 or 2
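A minimal sketch of the string-level enumeration, under some loud assumptions: every carbon (terminal ones included) carries 0, 1 or 2 chlorines, stereochemistry is ignored, and a chain read backwards counts as the same molecule. The names INNER, LAST and enumerate_chloroalkanes are made up here and this is not the script from the gist.

```python
from itertools import product

# Inner carbons keep each chlorine in its own branch; the last carbon is
# written without a trailing branch so no string ends in a parenthesis.
INNER = {0: "C", 1: "C(Cl)", 2: "C(Cl)(Cl)"}
LAST = {0: "C", 1: "CCl", 2: "C(Cl)Cl"}


def enumerate_chloroalkanes(n_carbons):
    """Yield one SMILES per distinct 0/1/2 chlorination pattern of a linear chain."""
    seen = set()
    for pattern in product((0, 1, 2), repeat=n_carbons):
        key = min(pattern, pattern[::-1])  # reversed chain is the same molecule
        if key in seen:
            continue
        seen.add(key)
        units = [INNER[k] for k in key[:-1]] + [LAST[key[-1]]]
        yield "".join(units)


c3 = list(enumerate_chloroalkanes(3))
print(len(c3), c3[:5])
```

The unchlorinated alkane is part of the output, and the count grows roughly as 3^n / 2, so filtering on chlorine content is probably needed before going to realistic chain lengths.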
Hi,
I think there is an issue on the level of your reaction SMARTS, in fact i
cannot get it to work on your example molecules, as there is a methyl
[CH3:0] amine defined in the query which is not to be found in the
substrate. I imagine some more explicit mapping of what connects and what
breaks whe
Re: your second issue with the 2D depiction, especially for these kinds of
bicyclic structures, the Schrödinger CoordGen that is a part of RDKit gives
better looking results (at the cost of a somewhat longer running time). You
can turn it on using:
from rdkit.Chem import rdDepictor
rdDepictor.SetPre