I can reproduce this with an older version, but the problem is that you
have RemoveHs instead of removeHs.
On Mon, Oct 2, 2017 at 7:20 AM, Guillaume GODIN <
guillaume.go...@firmenich.com> wrote:
> Dear Greg,
>
>
> I don't know if it's related but I have this issue on my mac version since
> this
Hi,
Is it reasonable to expect that a SMARTS should match itself when
useQueryQueryMatches=True?
query = Chem.MolFromSmarts('[C;!$(C=O)]Cl')
query.HasSubstructMatch(query, useQueryQueryMatches=True)
The above returns False. Without useQueryQueryMatches, it returns True, but
I think I need
On Fri, Mar 9, 2018 at 12:13 AM, Greg Landrum <greg.land...@gmail.com>
wrote:
> Hi Ivan,
>
> On Wed, Mar 7, 2018 at 8:58 PM, Ivan Tubert-Brohman <ivan.tubert-brohman@
> schrodinger.com> wrote:
>
>>
>> Is it reasonable to expect that a SMARTS should match i
no error is thrown. The
> aromaticity perception (step 6) does not consider the ring to be aromatic,
> so the final molecule is the equivalent of C1=N(C)C=CN1.
>
> It ought to be possible to clear this in the sanitization code relatively
> easily; I just need to think about i
Hi,
I was surprised to see that a (dubious) structure that goes through
SanitizeMol OK can fail a subsequent sanitization call:
print("Start")
mol = Chem.MolFromSmiles('C1=n(C)-c=Cn1', sanitize=False)
print("Before first sanitization")
Chem.SanitizeMol(mol)
print("Before second sanitization")
Hi Michal,
The old SDF format (aka V2000 CTAB) is column-based, as things often were
in the era of Fortran 77 and punch cards. Not only the precision but also
the exact position of each value on the line is specified! Here's what the
spec says:
The Atom Block is made up of atom lines, one line
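To make the "column-based" point concrete, here is a minimal sketch of writing and reading a V2000 atom line by position. It assumes the usual layout of three 10-character coordinate fields (four decimals each), one space, and a 3-character left-justified element symbol, with every remaining field set to zero. The helper names are made up for illustration and are not RDKit functions.

```python
def format_atom_line(x, y, z, symbol):
    """Build a V2000 atom line: fixed-width coordinates, symbol, zeroed fields."""
    coords = f"{x:10.4f}{y:10.4f}{z:10.4f}"
    # One 2-character field followed by eleven 3-character fields, all zero here.
    trailing = " 0" + "  0" * 11
    return f"{coords} {symbol:<3}{trailing}"


def parse_atom_line(line):
    """Read the values back purely by column position, not by splitting on spaces."""
    x = float(line[0:10])
    y = float(line[10:20])
    z = float(line[20:30])
    symbol = line[31:34].strip()
    return x, y, z, symbol


line = format_atom_line(1.5, -0.25, 0.0, "Cl")
print(repr(line))
print(parse_atom_line(line))
```

Because the reader slices by column, a coordinate written with extra digits would shift everything after it and corrupt the record, which is why the precision is fixed.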
Hi Baptiste,
RDKit focuses on "simple rings". As far as I know, it has no builtin
function to return all possible cycles in a molecule.
For a molecule with a "basis set" of N rings, there can be up to 2^N-1 ring
systems, which can be obtained by taking all possible subsets (aka the
powerset) of
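As a rough sketch of the powerset idea, the snippet below enumerates every non-empty subset of a ring basis. It assumes each ring is stored as a set of bonds and that a subset is combined by symmetric difference, so bonds shared by two rings cancel; that combination rule is my assumption, and the two fused six-membered rings are a hypothetical example, not RDKit output.

```python
from itertools import combinations


def nonempty_subsets(items):
    """Yield every non-empty subset of items (the powerset minus the empty set)."""
    for size in range(1, len(items) + 1):
        yield from combinations(items, size)


def combine(rings):
    """Symmetric difference of the bond sets: shared bonds cancel out."""
    bonds = set()
    for ring in rings:
        bonds ^= ring
    return bonds


def bond(a, b):
    """An undirected bond between atom indices a and b."""
    return frozenset((a, b))


# Hypothetical basis: two six-membered rings fused at bond 0-5.
ring_a = {bond(0, 1), bond(1, 2), bond(2, 3), bond(3, 4), bond(4, 5), bond(5, 0)}
ring_b = {bond(0, 5), bond(5, 6), bond(6, 7), bond(7, 8), bond(8, 9), bond(9, 0)}

candidates = [combine(subset) for subset in nonempty_subsets([ring_a, ring_b])]
print([len(c) for c in candidates])
```

With N = 2 this gives 2^2 - 1 = 3 candidates: the two basis rings and the ten-bond outer cycle. For larger systems some subsets do not form a single cycle, which is why the count is an upper bound.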
> isotopes, you can set unique isotope numbers for every reacting atom. Those
> will be preserved in the products so you can get the atom-atom mapping
> after running the reaction.
>
> Connor
>
> On Mon, Sep 17, 2018 at 10:36 AM Ivan Tubert-Brohman schrodinger.com> wrote:
>
I'd like to know where each atom in a reaction product came from, but as
far as I can tell, RDKit doesn't provide enough information. Here's what I
found out empirically so far.
There are four kinds of product atoms:
1. New atoms: atoms are defined in the product template without a mapping
The problem is this line:
>
core_smiles_2='C1=C/C2=C/c3ccc4n3[Zn]n3/c(cc/c3=C/C3=N/C(=C\4)C=C3)=C\C1=N2'
Python is interpreting the \4 as an escape sequence. You either need to
double the backslash or use an "r string" to protect the backslash from
being interpreted that way. That is, either of
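A small sketch of the two fixes, using only a short fragment of the SMILES above so that the single escape sequence is easy to see:

```python
# In a normal string literal, "\4" is an octal escape and becomes chr(4).
escaped = 'N/C(=C\4)C=C3'

# Fix 1: double the backslash.
doubled = 'N/C(=C\\4)C=C3'

# Fix 2: use a raw string so the backslash is kept literally.
raw = r'N/C(=C\4)C=C3'

print(repr(escaped))
print(repr(doubled))
print(doubled == raw)
```

The first literal silently loses the backslash and the ring-closure digit, so the resulting text is no longer the intended SMILES.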
Hi Jean-Marc,
Try the reaction smarts '[C:1]([OH:2])=[N:3]>>[C:1](=[OH0:2])[NH:3]'. The
only difference is the addition of "H0" to product atom :2. The problem is
that the hydrogen count from the reactant atom gets copied over unless
specified otherwise.
Hope this helps,
Ivan
On Wed, Feb 6,
Hi,
For reasons too complicated to get into here, I ended up with a molecule
containing a =CH2 in which one of the hydrogens was explicit and had E/Z
stereo info. For example, consider [H]/C=C/F.
I was surprised that RemoveHs() refused to remove the hydrogen, although
later I found that that's
It is aromatic according to the RDKit aromaticity model described here:
https://www.rdkit.org/docs/RDKit_Book.html#aromaticity
The O and N each contribute 2 electrons. Each of the carbons shared with
the 6-member ring contributes one electron. The carbonyl is sp2 and
contributes zero electrons.
This is from lipinski.cpp:
if (strict == NonStrict) {
  std::string pattern = "[!$(*#*)&!D1]-&!@[!$(*#*)&!D1]";
  pattern_flyweight m(pattern);
  return m.get().countMatches(mol);
} else if (strict == Strict) {
  std::string strict_pattern =
results
for strict and non-strict, as expected, and the default was the same as
strict.
On Tue, Oct 15, 2019 at 1:57 PM Ivan Tubert-Brohman <
ivan.tubert-broh...@schrodinger.com> wrote:
> This is from lipinski.cpp:
>
> if (strict == NonStrict) {
> std::string patte
) conj?: 0 aromatic?: 0
> 2 2->3 order: 1 dir: 4 conj?: 0 aromatic?: 0
>
>
> Given that the two substituents on the first C are the same, the double
> bond shouldn't be marked as STEREOE at all.
>
> I'll get this fixed.
> -greg
>
>
>
> On Wed, Nov 6, 2019
t to pull up things like
> anthracene which might not be something you’d want to class as a macrocycle.
> Cheers,
> Dave
>
> On Wed, 9 Oct 2019 at 14:39, Ivan Tubert-Brohman <
> ivan.tubert-broh...@schrodinger.com> wrote:
>
>> Hi Thomas,
>>
>> I don't know
Hi Pablo,
SMILES by definition has implicit hydrogens (enough to satisfy the typical
valence) for atoms that are not within brackets.
It doesn't matter if you write C, C[H], [H]C[H], or [H]C([H])([H])[H]; they
are all methane. The number of hydrogens that are returned by
GetNumImplicitHs() and
Hi Curt,
According to
https://www.rdkit.org/docs/RDKit_Book.html#smarts-support-and-extensions ,
it's not supported:
> Here’s the (hopefully complete) list of SMARTS features that are *not*
> supported:
>
> - Non-tetrahedral chiral classes
> - the @? operator
> - explicit atomic
Hi Vin,
If you are running the IPython console on a terminal emulator that supports
graphics, you could display the molecule by printing out the necessary
terminal escape codes followed by the image buffer. The solution is
terminal-specific; here's an example that works using the Kitty terminal:
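The general idea can be sketched as follows. This is not the code from the original message; it is a minimal illustration that assumes the Kitty graphics protocol framing (ESC _G, control keys, a semicolon, a base64 payload, then ESC \), with f=100 marking PNG data, a=T meaning transmit and display, and m=1 marking that more chunks follow. The function name is made up for this example.

```python
import base64
import sys


def kitty_png_escape(png_bytes, chunk_size=4096):
    """Wrap PNG bytes in Kitty graphics escape sequences, chunking the payload."""
    payload = base64.standard_b64encode(png_bytes).decode("ascii")
    chunks = [payload[i:i + chunk_size]
              for i in range(0, len(payload), chunk_size)] or [""]
    parts = []
    for index, chunk in enumerate(chunks):
        more = 0 if index == len(chunks) - 1 else 1
        # Only the first chunk carries the action and format keys.
        control = f"a=T,f=100,m={more}" if index == 0 else f"m={more}"
        parts.append(f"\x1b_G{control};{chunk}\x1b\\")
    return "".join(parts)


def show_png(png_bytes):
    """Write the escape sequences to the terminal, followed by a newline."""
    sys.stdout.write(kitty_png_escape(png_bytes) + "\n")
    sys.stdout.flush()
```

The PNG bytes themselves would come from whatever renders the molecule to an image buffer; on a terminal that does not implement this protocol the output is just noise.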
Hi Steven,
MolWt uses naturally occurring average atomic weights, the ones you find in
a typical periodic table. For example, Cl = 35.453.
ExactMolWt uses the weight of a specific isotope (the most naturally
abundant isotope unless the structure specifies a different one for an
atom). These are
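A worked example of the difference, done by hand for chloromethane (CH3Cl). The Cl average weight is the value quoted above; the other masses are rounded values I am supplying for illustration and may differ slightly from the tables RDKit uses.

```python
# Average atomic weights (natural isotope mixture).
AVERAGE = {"C": 12.011, "H": 1.008, "Cl": 35.453}

# Masses of the most abundant isotopes: 12C, 1H, 35Cl (rounded).
MONOISOTOPIC = {"C": 12.0, "H": 1.007825, "Cl": 34.968853}


def weight(formula, table):
    """Sum the atomic masses from table over an element -> count formula."""
    return sum(table[element] * count for element, count in formula.items())


chloromethane = {"C": 1, "H": 3, "Cl": 1}
average_weight = weight(chloromethane, AVERAGE)
exact_weight = weight(chloromethane, MONOISOTOPIC)
print(round(average_weight, 3), round(exact_weight, 3))
```

Nearly all of the gap of about half a mass unit comes from chlorine, because natural chlorine contains a large fraction of 37Cl.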
We use "reaction-based enumeration" to distinguish it from "R-group
enumeration". Both are types of virtual library enumeration.
R-group enumeration allows you to attach any R-group anywhere. It is simple
and fast but you can easily create implausible (or hard to synthesize)
molecules if you are
Hi Norwid,
The inner loop over mols here:
for i in smiles_list:
    for mol in mols:
        for atom in mol.GetAtoms():
            atom.SetAtomMapNum(atom.GetIdx())
    mols.append(Chem.MolFromSmiles(i))
is not in the right place. First, because you'll go over the same
Hi Jean-Marc,
RDKit says that the oxygen is sp2 because it has a special rule that
considers the conjugation. Whether that is the "true" hybridization for the
oxygen could be a long debate; I sometimes hear that it's somewhere between
sp2 and sp3, perhaps not as close to sp2 as the nitrogen in
Thank you everyone for the suggestions. For now I don't have immediate
plans to adopt the cartridge but it's good to know these things when the
time comes.
Best,
Ivan
On Mon, Jun 8, 2020 at 6:49 PM Finnerty, Jim via Rdkit-discuss <
rdkit-discuss@lists.sourceforge.net> wrote:
> If you have a
Hi,
I've never tried the RDKit PostgreSQL cartridge but I'm curious about it.
In particular I wonder how far people have pushed it in terms of
database size. The documentation gives examples with several million rows;
has anyone tried it with a couple billion rows? How fast are substructure
Hi Quoc-Tuan,
I can't reproduce your observations; I get True in both cases. Which
version of RDKit are you using?
One thing to note is that you are parsing a SMARTS with MolFromSmiles. I
wouldn't recommend that in general, although it appears that in this case
RDKit is lenient enough to accept
I think there was some confusion between left and right in the original
message, but RDKit prefers the representation that preserves the octet at
the expense of having more formal charges:
In [9]: mol = Chem.MolFromSmiles('O=I(=O)([O-])')
In [10]: Chem.MolToSmiles(mol)
Out[10]:
Hi Philipp,
This is an embarrassingly parallel problem (that's the actual technical
term, so no need to feel embarrassed. :-), meaning there's no need for
communication between threads or processes, which makes it really easy:
just split the search space, run a separate job for each fraction, and
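A minimal sketch of the split-and-run pattern, assuming the search space fits in a list. The per-chunk job here (keeping even numbers) is a hypothetical stand-in for the real search. Threads are used only to keep the example short; for CPU-bound work, processes or separate cluster jobs would be the more usual choice.

```python
from concurrent.futures import ThreadPoolExecutor


def split(items, n_chunks):
    """Split items into n_chunks interleaved slices of near-equal size."""
    return [items[i::n_chunks] for i in range(n_chunks)]


def search_chunk(chunk):
    """Hypothetical per-chunk job: keep the even numbers."""
    return [x for x in chunk if x % 2 == 0]


def parallel_search(items, n_jobs=4):
    """Run search_chunk on each slice independently, then merge the hits."""
    chunks = split(items, n_jobs)
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        partial_results = list(pool.map(search_chunk, chunks))
    return sorted(hit for part in partial_results for hit in part)


print(parallel_search(list(range(10))))
```

Since no chunk needs anything from another chunk, the only coordination is the final merge of the results.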
Hi Adelene,
You can't match an atom that doesn't exist as a node in the molecular
graph, so if you really want to match a hydrogen, you'll have to add
explicit hydrogens to your molecule:
molh = Chem.AddHs(mol)
molh.HasSubstructMatch(q1)
> True
However, if all you want to know is whether the
Hi Goutam,
The ring atoms reported by RDKit in your example are correct; you just need
to consider that the atom indexes correspond to the position of each atom
in the SMILES string. How could RDKit guess the index that the atom might
have in a PDB file that's not even being read in your example?
Hi Lauren,
SMARTS doesn't have a direct way of saying an atom is non-racemic, but you
can express that idea using recursive SMARTS. For example,
In [46]: racemic = Chem.MolFromSmiles('c12c1cncc2NC(=O)C(CCO2)c1cc(Cl)ccc12')
In [47]: chiral1 = Chem.MolFromSmiles('c12c1cncc2NC(=O)[C@H
Hi Thomas,
I believe what you want can be done using recursive SMARTS and disconnected
SMARTS. For example,
In [7]: mol = Chem.MolFromSmiles('CCC=C')
In [8]: mol.GetSubstructMatches(Chem.MolFromSmarts('[$(C-*)].CC.[$(C=*)]'))
Out[8]: ((0, 1, 2, 3),)
The recursive SMARTS let you match a single
Hi German,
GetNumConjGrps is not a function of the Chem module, but a method of the
ResonanceMolSupplier class. You have to create a resonance mol supplier
object first, for example:
>>> supp = Chem.ResonanceMolSupplier(mol)
>>> supp.GetNumConjGrps()
2
Hope this helps,
Ivan
On Tue, Sep 28,
That does seem like a bug. You can also see it without involving
DeleteSubstructs, by starting from different SMILES representations of the
same molecule:
>>> m1 = Chem.MolFromSmiles('FC12C31C32F')
>>> m2 = Chem.MolFromSmiles('C12C31C32')
>>> m3 = Chem.MolFromSmiles('C1CC2C3C(C1)C23')
A minor correction: [H] by itself *is* valid and means a hydrogen atom. The
Daylight docs say as much in section 4.1. But in other contexts it means a
hydrogen count, so to be safe, always using #1 to mean a hydrogen atom can
be a good practice.
If you are ever in doubt about how RDKit is
How about splitting the file on lines consisting of "", and then
parsing each record? If the parsing fails, you can write out the bad record
for future inspection. (This addresses the basic use case, but not the
"even better" one.)
Here's a proof of concept:
from rdkit import Chem
def
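A self-contained sketch of the same approach, written without RDKit so that it runs anywhere. It assumes the record delimiter is a line consisting of "$$$$" (the usual SD file terminator) and takes the parser as a parameter; both function names are made up for this example.

```python
def split_records(text, delimiter="$$$$"):
    """Split file text into records, each ending with its delimiter line."""
    records, current = [], []
    for line in text.splitlines(keepends=True):
        current.append(line)
        if line.strip() == delimiter:
            records.append("".join(current))
            current = []
    if "".join(current).strip():
        # Trailing record that was not terminated by a delimiter line.
        records.append("".join(current))
    return records


def partition(records, parse):
    """Return (parsed, bad); bad keeps the raw text of records that failed."""
    parsed, bad = [], []
    for record in records:
        try:
            result = parse(record)
        except Exception:
            result = None
        if result is None:
            bad.append(record)
        else:
            parsed.append(result)
    return parsed, bad
```

With RDKit, something like Chem.MolFromMolBlock could be passed as the parse argument, since it returns None when a record cannot be parsed; the bad list can then be written to a separate file for inspection.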
Hi Eduardo,
I believe the problem is that r6 means "in *smallest* SSSR ring of size
<n>", where "smallest" in this context means that, for example, for an atom
at the ring fusion between a 5-member ring and a 6-member ring, r5 would
match that atom but r6 wouldn't.
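A toy model of that rule, with no RDKit involved: each atom is described only by the sizes of the rings it belongs to, and r<n> is taken to match when the smallest of those sizes equals n. The function name is hypothetical.

```python
def matches_r(ring_sizes, n):
    """True if the atom is in a ring and its smallest ring has size n."""
    return bool(ring_sizes) and min(ring_sizes) == n


# An atom at the fusion between a 5-member ring and a 6-member ring.
fusion_atom = (5, 6)

print(matches_r(fusion_atom, 5))
print(matches_r(fusion_atom, 6))
```

Under this model a query written with r6 skips every atom that also sits in a smaller ring, even though the atom is part of a 6-member ring.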
Perhaps using x3 instead (means
On Tue, Jun 7, 2022 at 1:39 PM Ivan Tubert-Brohman <
ivan.tubert-broh...@schrodinger.com> wrote:
> Perhaps using x3 instead (means "number of ring bonds") would work for
> your purposes?
>
Never mind, x3 won't exclude the fused 4-atom rings from your first example.
I'l
Hi Chris,
Please try a more recent version of RDKit. I believe this function was
added in the 2021.09 release.
Hope this helps,
Ivan
On Thu, Jul 14, 2022 at 7:04 AM Chris Swain via Rdkit-discuss <
rdkit-discuss@lists.sourceforge.net> wrote:
> Hi,
>
> If I try
>
> from
Hi Rohit,
Could you attach a complete example? I took the script from the email you
refer to, only edited the line that says mol = Chem.MolFromSmiles('CC')
to make it say mol = Chem.MolFromSmiles('CC'), and when I run it I get nine
torsions:
(2, 0, 1, 5)
(2, 0, 1, 6)
(2, 0, 1, 7)
(3, 0, 1,
Hi Fernando,
What happens is that atoms on the left hand side of the reaction template
get deleted unless they have mapping numbers (and everything else they were
attached to that becomes unreachable from the mapped atoms is gone as
well). Atoms on the right hand side without mapping numbers are
Hi Lauren,
The enhanced stereochemistry is available, not as atom properties, but as
"stereo groups" of the Mol object. For example,
>>> mol = Chem.MolFromSmiles('C[C@H]1CCCNC1 |&1:1,r|')
>>> for group in mol.GetStereoGroups():
...     print([group.GetGroupType(),
...            [atom.GetIdx()
Hi Michal,
A key point to consider is that the default bond order in SMARTS is not
single, but "single or aromatic". If you really want to match single bonds
only, you can specify a single bond with "-".
However, it sounds as if you actually expect aromatic bonds to match as
well, since you
Hi Jarod,
Something like this should work:
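// The header names were lost in the archive; these are the usual RDKit
// headers for the calls used below.
#include <GraphMol/GraphMol.h>
#include <GraphMol/SmilesParse/SmilesParse.h>
#include <GraphMol/Descriptors/MolDescriptors.h>
#include <iostream>
int main()
{
  // Parse the SMILES for ethanol.
  auto mol = RDKit::SmilesToMol("CCO");
  // Average molecular weight.
  auto mw = RDKit::Descriptors::calcAMW(*mol);
  std::cout << mw << "\n";
}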
Hope this helps,
Ivan
On Mon, May 15, 2023 at 3:00 PM Jarod