Re: [Rdkit-discuss] RegistrationHash in C++/Java

2023-09-07 Thread Gianmarco Ghiandoni
Hi Greg,

Thanks for the info. I'll keep an eye on that then.

Giammy

On Thu, 7 Sept 2023 at 13:08, Greg Landrum  wrote:

> Hi Giammy,
>
> We currently only have the Python implementation. Doing a C++ version is
> on my ToDo list, but I'm not sure when we'll get there.
>
> best regards,
> -greg
>
>
> On Thu, Sep 7, 2023 at 1:17 PM Gianmarco Ghiandoni 
> wrote:
>
>> Hello all,
>>
>> I've been testing the Python module from rdkit.Chem import
>> RegistrationHash for some time now and I would like to use it in Java
>> too. I browsed the RDKit repository but I could not find it implemented in
>> C++, and therefore, not available in the Java JARs.
>>
>> Am I missing it from somewhere else or is it just implemented in Python?
>>
>> Thanks,
>>
>> Giammy
>>
>> --
>> *Gianmarco*
>> ___
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>

-- 
*Gianmarco*
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


[Rdkit-discuss] RegistrationHash in C++/Java

2023-09-07 Thread Gianmarco Ghiandoni
Hello all,

I've been testing the Python module from rdkit.Chem import RegistrationHash for
some time now and I would like to use it in Java too. I browsed the RDKit
repository but I could not find it implemented in C++, and therefore, not
available in the Java JARs.

Am I missing it from somewhere else or is it just implemented in Python?
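
For reference, a minimal sketch of the Python-side usage being discussed. It assumes an RDKit release that ships rdkit.Chem.RegistrationHash and uses its GetMolLayers and GetMolHash functions with default arguments; the helper name registration_hash is hypothetical.

```python
from rdkit import Chem
from rdkit.Chem import RegistrationHash


def registration_hash(smiles: str) -> str:
    """Return the default (all-layers) registration hash for a SMILES string.

    registration_hash is a hypothetical helper name, not part of RDKit.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles!r}")
    # GetMolLayers builds the individual layers (formula, canonical SMILES, ...)
    layers = RegistrationHash.GetMolLayers(mol)
    # GetMolHash combines the layers into a single hash string
    return RegistrationHash.GetMolHash(layers)
```

Two different SMILES for the same molecule, e.g. "OCC" and "CCO", should give the same hash, which is what any C++/Java port would eventually have to reproduce.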

Thanks,

Giammy

-- 
*Gianmarco*


[Rdkit-discuss] Fast calculation of hydrogen-bond strengths and free energy of hydration of small molecules

2023-03-21 Thread Gianmarco Ghiandoni
Hello RDKit community,

I normally do not use this channel to promote my own research; however, I
thought it was a good idea to share a piece of my work with you this time.

A colleague and I recently published a paper on calculating
hydrogen-bond strengths and free energies of hydration from compound
structures (https://www.nature.com/articles/s41598-023-30089-x). The method
was implemented in an open source library called Jazzy (
https://github.com/AstraZeneca/jazzy) which relies mainly on two
dependencies: RDKit and kallisto.

The library also generates visualisations of atomistic
hydrogen-bond strengths. Most of this logic is implemented using RDKit.
Therefore, I also want to thank the contributors of RDKit for often helping
me find my way around the issues I encountered while implementing some of
the code in Jazzy.

Inspired by RDKit, we also created a cookbook (
https://jazzy.readthedocs.io/en/latest/cookbook.html). Feel free to take a
look at the code and paper.

Cheers,

Giammy


[Rdkit-discuss] V2000/V3000 format inconsistency in RDKit

2023-01-27 Thread Gianmarco Ghiandoni
Hello all,

I have come across an unexpected behaviour by RDKit when reading MOL blocks
of the same compound in V2000 and V3000 formats. In particular, RDKit seems
to perceive the stereochemistry of the compound differently depending on
the format.

The original compound is a V3000 CTAB:

  ACCLDraw01272318022D

  0  0  0     0  0            999 V3000
M  V30 BEGIN CTAB
M  V30 COUNTS 16 18 0 0 1
M  V30 BEGIN ATOM
M  V30 1 C 12.5458 -11.8979 0 0
M  V30 2 C 13.2916 -12.2506 0 0
M  V30 3 C 13.9706 -11.7808 0 0
M  V30 4 C 13.9024 -10.9587 0 0
M  V30 5 C 13.1566 -10.6061 0 0
M  V30 6 N 12.4789 -11.0754 0 0
M  V30 7 N 12.985 -9.2285 0 0 CFG=3
M  V30 8 C 12.1695 -9.1051 0 0
M  V30 9 C 12.0846 -7.9654 0 0 CFG=2
M  V30 10 C 12.6712 -8.5447 0 0
M  V30 11 C 13.4045 -8.1659 0 0 CFG=1
M  V30 12 C 13.2699 -7.3511 0 0
M  V30 13 N 12.4544 -7.2277 0 0 CFG=3
M  V30 14 C 12.0751 -6.4952 0 0
M  V30 15 H 14.2246 -8.2516 0 0
M  V30 16 H 11.2754 -7.8051 0 0
M  V30 END ATOM
M  V30 BEGIN BOND
M  V30 1 2 1 2
M  V30 2 1 2 3
M  V30 3 2 3 4
M  V30 4 1 4 5
M  V30 5 2 5 6
M  V30 6 1 6 1
M  V30 7 1 5 7
M  V30 8 1 8 7
M  V30 9 1 9 8
M  V30 10 1 9 10
M  V30 11 1 10 11
M  V30 12 1 11 7
M  V30 13 1 11 12
M  V30 14 1 9 13
M  V30 15 1 13 12
M  V30 16 1 13 14
M  V30 17 1 11 15 CFG=3
M  V30 18 1 9 16 CFG=3
M  V30 END BOND
M  V30 BEGIN COLLECTION
M  V30 MDLV30/STEABS ATOMS=(2 9 11)
M  V30 END COLLECTION
M  V30 END CTAB
M  END

Which can also be represented (in Biovia Draw) as:

[image: image.png]
The same compound converted into V2000:
1
  -OEChem-01262310192D

 16 18  0     1  0  0  0  0  0999 V2000
   15.8326   -5.9013    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   16.5785   -6.2540    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   17.2575   -5.7842    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   17.1893   -4.9622    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   16.4435   -4.6095    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   15.7658   -5.0788    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
   16.2719   -3.2319    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
   16.6914   -2.1693    0.0000 C   0  0  2  0  0  0  0  0  0  0  0  0
   16.5568   -1.3545    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   15.7413   -1.2311    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
   15.3714   -1.9688    0.0000 C   0  0  2  0  0  0  0  0  0  0  0  0
   15.4564   -3.1085    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   15.9581   -2.5482    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   15.3620   -0.4986    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   16.6914   -2.1693    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
   15.3714   -1.9688    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  2  0  0  0  0
  2  3  1  0  0  0  0
  3  4  2  0  0  0  0
  4  5  1  0  0  0  0
  5  6  2  0  0  0  0
  1  6  1  0  0  0  0
 11 13  1  0  0  0  0
 10 11  1  0  0  0  0
 11 12  1  0  0  0  0
  8 13  1  0  0  0  0
  9 10  1  0  0  0  0
 10 14  1  0  0  0  0
  7 12  1  0  0  0  0
  8  9  1  0  0  0  0
  7  8  1  0  0  0  0
  5  7  1  0  0  0  0
  8 15  1  6  0  0  0
 11 16  1  6  0  0  0
M  END

Which is also represented as:

[image: image.png]
However, if I read the two compounds using RDKit and convert them into
SMILES, I get two compounds with different stereochemistry:
from rdkit import Chem

# v3000 and v2000 hold the two MOL blocks shown above as strings
mols = [v3000, v2000]
for mol in mols:
    m = Chem.MolFromMolBlock(mol)
    print(Chem.MolToSmiles(m))

CN1C[C@@H]2C[C@H]1CN2c1ccccn1
CN1C[C@@H]2C[C@@H]1CN2c1ccccn1

I have inspected the CTABs but I could not figure out why the two formats
behave differently, given that they are rendered in the same way in
Biovia.

Any hints? Is this a bug in RDKit?
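
One detail visible in the V2000 block above is that atoms 15 and 16 (the explicit hydrogens) have exactly the same coordinates as atoms 8 and 11, so the two wedged bonds have zero length in that file. Whether this explains the discrepancy is an assumption, not something confirmed in this thread, but it is cheap to check. The sketch below lists bonds whose two atoms coincide; zero_length_bonds and its tol parameter are hypothetical names.

```python
from rdkit import Chem


def zero_length_bonds(mol, tol=1e-4):
    """Return (begin_idx, end_idx) for every bond whose atoms share coordinates.

    zero_length_bonds is a hypothetical helper; tol is an arbitrary
    distance threshold in the units of the conformer.
    """
    conf = mol.GetConformer()
    hits = []
    for bond in mol.GetBonds():
        i, j = bond.GetBeginAtomIdx(), bond.GetEndAtomIdx()
        # Distance between the two bonded atoms in the stored conformer
        if conf.GetAtomPosition(i).Distance(conf.GetAtomPosition(j)) < tol:
            hits.append((i, j))
    return hits
```

To apply it to a MOL block, the block would need to be read with Chem.MolFromMolBlock(block, removeHs=False) so that the explicit hydrogens are still present when the bonds are inspected.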

Thanks,

-- 
*Gianmarco*


Re: [Rdkit-discuss] Embedding of molecules with incorrect stereochemistry assignment

2023-01-20 Thread Gianmarco Ghiandoni
Hi all,

Thanks for all your replies. I wonder what kind of logic could be put in
place to fix these mistakes algorithmically. Is there a way to configure the
correct stereochemistry before embedding the structure?
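
A minimal sketch of the fallback suggested in the replies: try to embed, and if that fails, look for stereoisomers of the same skeleton that can be embedded. It assumes RDKit's EnumerateStereoisomers with tryEmbedding=True and onlyUnassigned=False (so that already-assigned centres are also varied); embed_or_suggest is a hypothetical name, and which alternative is chemically the right one still has to be decided by whoever owns the data.

```python
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem.EnumerateStereoisomers import (
    EnumerateStereoisomers,
    StereoEnumerationOptions,
)


def embed_or_suggest(smiles, seed=11):
    """Return (embedded_mol, []) on success, or (None, alternatives) on failure.

    alternatives is a list of SMILES for stereoisomers that RDKit could embed.
    embed_or_suggest is a hypothetical helper, not an RDKit function.
    """
    mol = Chem.MolFromSmiles(smiles)
    mh = Chem.AddHs(mol)
    # EmbedMolecule returns the new conformer id, or -1 if embedding failed
    if AllChem.EmbedMolecule(mh, randomSeed=seed) >= 0:
        return mh, []
    # Enumerate all stereo assignments and keep only those that embed
    opts = StereoEnumerationOptions(tryEmbedding=True, onlyUnassigned=False)
    alternatives = [
        Chem.MolToSmiles(iso) for iso in EnumerateStereoisomers(mol, options=opts)
    ]
    return None, alternatives
```

For the two SMILES in this thread, the first is expected to embed directly, while the second should come back with a list of embeddable alternatives instead of a conformer.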

Thanks,

On Thu, 19 Jan 2023 at 17:49, Giovanni Tricarico <
giovanni.tricar...@glpg.com> wrote:

> Indeed, the only two chemically valid configurations for this molecule
> seem to be:
>
>
>
>
>
> Impressive that rdkit can detect this kind of contradiction.
>
>
>
> G
>
>
>
> *From:* Kangway Chuang (CHUANGK4) via Rdkit-discuss <
> rdkit-discuss@lists.sourceforge.net>
> *Sent:* 19 January 2023 17:54
> *To:* Ling Chan ; Gianmarco Ghiandoni <
> ghiandon...@gmail.com>
> *Cc:* RDKit 
> *Subject:* Re: [Rdkit-discuss] Embedding of molecules with incorrect
> stereochemistry assignment
>
>
>
> Agree with Hao above. For the molecules provided, the second example is
> geometrically inaccessible given the stereochemical constraints. In this
> case, the expected behavior should be an unsuccessful embedding.
>
>
>
> Kangway
>
>
>
> On Thu, Jan 19, 2023 at 8:49 AM Ling Chan  wrote:
>
> Keep trying with more random seeds?
>
>
>
> On Thu., Jan. 19, 2023, 07:38 Hao,  wrote:
>
> Hi Gianmarco,
>
>
>
> In my experience, this just means that you have an impossible molecule. I
> haven't found any ways around it besides trying to embed. If it fails, try
> to swap the other stereoisomer. I find this particularly prevalent in large
> scale datasets where data quality is not very good.
>
>
>
> Best,
>
> Hao
>
>
>
> On Thu, Jan 19, 2023 at 7:09 AM Gianmarco Ghiandoni 
> wrote:
>
> Hi all,
>
>
>
> Can anyone help with this matter?
>
>
>
> Thanks,
>
>
>
> On Tue, 17 Jan 2023 at 13:03, Gianmarco Ghiandoni 
> wrote:
>
> Hi all,
>
>
>
> I have come across an issue while embedding structures with
> stereochemistry configurations that presumably lead to clashes between
> atoms:
>
>
>
> from rdkit import Chem
>
> from rdkit.Chem import AllChem
>
>
>
> smiles="C1N[C@@H]2CO[C@H]1C2"
>
> m = Chem.MolFromSmiles(smiles)
>
> mh = Chem.AddHs(m)
>
> print(AllChem.EmbedMolecule(mh, randomSeed=11))
>
>
>
> smiles="C1N[C@@H]2CO[C@@H]1C2"
>
> m = Chem.MolFromSmiles(smiles)
>
> mh = Chem.AddHs(m)
>
> print(AllChem.EmbedMolecule(mh, randomSeed=11))
>
>
>
>
>
> Produces:
>
> 0 (successful embedding)
> -1  (unsuccessful embedding)
>
>
>
> What is in your opinion the best way to deal with this in order to avoid
> failures?
>
>
>
> Thanks,
>
> --
>
> *Gianmarco*
>
>
>
>
> --
>
> *Gianmarco*
>

Re: [Rdkit-discuss] Embedding of molecules with incorrect stereochemistry assignment

2023-01-19 Thread Gianmarco Ghiandoni
Hi all,

Can anyone help with this matter?

Thanks,

On Tue, 17 Jan 2023 at 13:03, Gianmarco Ghiandoni 
wrote:

> Hi all,
>
> I have come across an issue while embedding structures with
> stereochemistry configurations that presumably lead to clashes between
> atoms:
>
> from rdkit import Chem
> from rdkit.Chem import AllChem
>
> smiles="C1N[C@@H]2CO[C@H]1C2"
> m = Chem.MolFromSmiles(smiles)
> mh = Chem.AddHs(m)
> print(AllChem.EmbedMolecule(mh, randomSeed=11))
>
> smiles="C1N[C@@H]2CO[C@@H]1C2"
> m = Chem.MolFromSmiles(smiles)
> mh = Chem.AddHs(m)
> print(AllChem.EmbedMolecule(mh, randomSeed=11))
>
>
> Produces:
> 0 (successful embedding)
> -1  (unsuccessful embedding)
>
> What is in your opinion the best way to deal with this in order to avoid
> failures?
>
> Thanks,
> --
> *Gianmarco*
>


-- 
*Gianmarco*


[Rdkit-discuss] Embedding of molecules with incorrect stereochemistry assignment

2023-01-17 Thread Gianmarco Ghiandoni
Hi all,

I have come across an issue while embedding structures with stereochemistry
configurations that presumably lead to clashes between atoms:

from rdkit import Chem
from rdkit.Chem import AllChem

smiles="C1N[C@@H]2CO[C@H]1C2"
m = Chem.MolFromSmiles(smiles)
mh = Chem.AddHs(m)
print(AllChem.EmbedMolecule(mh, randomSeed=11))

smiles="C1N[C@@H]2CO[C@@H]1C2"
m = Chem.MolFromSmiles(smiles)
mh = Chem.AddHs(m)
print(AllChem.EmbedMolecule(mh, randomSeed=11))


Produces:
0 (successful embedding)
-1  (unsuccessful embedding)

What is in your opinion the best way to deal with this in order to avoid
failures?

Thanks,
-- 
*Gianmarco*


Re: [Rdkit-discuss] Annotations get trimmed on molecule renderings

2022-05-07 Thread Gianmarco Ghiandoni
Looks like the same issue. Glad that you are planning to work on it so I
don't have to enforce a narrow range of RDKit versions in my installation
specs.

Thanks,

Giammy

On Fri, 6 May 2022, 16:21 David Cosgrove, 
wrote:

> Hi Giammy,
> You're right, the new pictures look pretty rubbish.  I assume it's related
> to https://github.com/rdkit/rdkit/discussions/5195.  I'll fix it over the
> weekend, and hopefully it'll show up in the next patch release.
> Dave
>
>
> On Fri, May 6, 2022 at 1:07 PM Gianmarco Ghiandoni 
> wrote:
>
>> Hi Dave,
>>
>> Thanks for your reply. The reason why my library sticks to 2021.09 is
>> because I get even more trouble with later versions of RDKit. These are two
>> examples of rendering with 2021 and 2022:
>>
>> [image: image.png][image: image.png]
>>
>> The good news is that your padding suggestion works, so I set
>> d2d.drawOptions().padding = 0.15 and voilà:
>>
>> [image: image.png]
>>
>> Amazing. Thanks!
>> Giammy
>>
>> On Thu, 5 May 2022 at 10:32, David Cosgrove 
>> wrote:
>>
>>> Hi Giammy,
>>>
>>> On reflection overnight, you might try d2d.drawOptions().padding = 0.2
>>> or something.  That should increase the amount of empty space around the
>>> molecule (the default is 0.05, and it's the fraction of the width/height of
>>> the image) such that there's enough room to show the whole annotation.
>>> It's a bit of a kludge, but it might work.
>>>
>>> Dave
>>>
>>> On Wed, May 4, 2022 at 4:20 PM Gianmarco Ghiandoni <
>>> ghiandon...@gmail.com> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I am using rdkit_pypi==2021.9.4 to generate visualisation of compounds
>>>> with their atomic hydrogen bond strengths. In particular, I am using this
>>>> function to produce an SVG string:
>>>>
>>>> d2d = rdMolDraw2D.MolDraw2DSVG(fig_size[0], fig_size[1])
>>>> d2d.drawOptions().annotationFontScale = 0.7
>>>> d2d.DrawMolecule(
>>>>     rwmol,
>>>>     highlightAtoms=atoms_to_highlight,
>>>>     highlightAtomColors=idx2rgb,
>>>>     highlightBonds=None,
>>>> )
>>>> d2d.FinishDrawing()
>>>> return d2d.GetDrawingText()
>>>>
>>>> Note that I am increasing the font scale from 0.5 to 0.7 and for
>>>> certain molecules that produces renderings where annotations are cut off:
>>>>
>>>> [image: image.png]
>>>>
>>>> Any suggestions on how to fix this?
>>>>
>>>> Thanks,
>>>>
>>>> Giammy
>>>> --
>>>> *Gianmarco*
>>>>
>>>
>>>
>>> --
>>> David Cosgrove
>>> Freelance computational chemistry and chemoinformatics developer
>>> http://cozchemix.co.uk
>>>
>>>
>>
>> --
>> *Gianmarco*
>>
>
>
> --
> David Cosgrove
> Freelance computational chemistry and chemoinformatics developer
> http://cozchemix.co.uk
>
>


Re: [Rdkit-discuss] Annotations get trimmed on molecule renderings

2022-05-06 Thread Gianmarco Ghiandoni
Hi Dave,

Thanks for your reply. The reason why my library sticks to 2021.09 is
because I get even more trouble with later versions of RDKit. These are two
examples of rendering with 2021 and 2022:

[image: image.png][image: image.png]

The good news is that your padding suggestion works, so I set
d2d.drawOptions().padding = 0.15 and voilà:

[image: image.png]
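
A self-contained sketch of the padding workaround, assuming the padding and annotationFontScale draw options named in this thread. draw_with_notes, the placeholder note values and the default image size are hypothetical.

```python
from rdkit import Chem
from rdkit.Chem.Draw import rdMolDraw2D


def draw_with_notes(smiles, padding=0.15, font_scale=0.7, size=(300, 300)):
    """Draw a molecule with per-atom notes and extra padding; return SVG text.

    draw_with_notes is a hypothetical helper; the note values are placeholders.
    """
    mol = Chem.MolFromSmiles(smiles)
    for atom in mol.GetAtoms():
        # Placeholder annotation standing in for a calculated per-atom value
        atom.SetProp("atomNote", f"{0.1 * atom.GetIdx():.2f}")
    d2d = rdMolDraw2D.MolDraw2DSVG(size[0], size[1])
    opts = d2d.drawOptions()
    opts.annotationFontScale = font_scale  # larger notes than the 0.5 default
    opts.padding = padding  # fraction of width/height left empty (default 0.05)
    d2d.DrawMolecule(mol)
    d2d.FinishDrawing()
    return d2d.GetDrawingText()
```

The options have to be set before DrawMolecule is called; the padding value that avoids clipping is likely to depend on the molecule and on the RDKit version, so 0.15 is only a starting point.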

Amazing. Thanks!
Giammy

On Thu, 5 May 2022 at 10:32, David Cosgrove 
wrote:

> Hi Giammy,
>
> On reflection overnight, you might try d2d.drawOptions().padding = 0.2 or
> something.  That should increase the amount of empty space around the
> molecule (the default is 0.05, and it's the fraction of the width/height of
> the image) such that there's enough room to show the whole annotation.
> It's a bit of a kludge, but it might work.
>
> Dave
>
> On Wed, May 4, 2022 at 4:20 PM Gianmarco Ghiandoni 
> wrote:
>
>> Hi all,
>>
>> I am using rdkit_pypi==2021.9.4 to generate visualisation of compounds
>> with their atomic hydrogen bond strengths. In particular, I am using this
>> function to produce an SVG string:
>>
>> d2d = rdMolDraw2D.MolDraw2DSVG(fig_size[0], fig_size[1])
>> d2d.drawOptions().annotationFontScale = 0.7
>> d2d.DrawMolecule(
>>     rwmol,
>>     highlightAtoms=atoms_to_highlight,
>>     highlightAtomColors=idx2rgb,
>>     highlightBonds=None,
>> )
>> d2d.FinishDrawing()
>> return d2d.GetDrawingText()
>>
>> Note that I am increasing the font scale from 0.5 to 0.7 and for
>> certain molecules that produces renderings where annotations are cut off:
>>
>> [image: image.png]
>>
>> Any suggestions on how to fix this?
>>
>> Thanks,
>>
>> Giammy
>> --
>> *Gianmarco*
>>
>
>
> --
> David Cosgrove
> Freelance computational chemistry and chemoinformatics developer
> http://cozchemix.co.uk
>
>

-- 
*Gianmarco*


[Rdkit-discuss] Annotations get trimmed on molecule renderings

2022-05-04 Thread Gianmarco Ghiandoni
Hi all,

I am using rdkit_pypi==2021.9.4 to generate visualisations of compounds with
their atomic hydrogen-bond strengths. In particular, I am using this
function to produce an SVG string:

d2d = rdMolDraw2D.MolDraw2DSVG(fig_size[0], fig_size[1])
d2d.drawOptions().annotationFontScale = 0.7
d2d.DrawMolecule(
    rwmol,
    highlightAtoms=atoms_to_highlight,
    highlightAtomColors=idx2rgb,
    highlightBonds=None,
)
d2d.FinishDrawing()
return d2d.GetDrawingText()

Note that I am increasing the font scale from 0.5 to 0.7, and for certain
molecules that produces renderings where annotations are cut off:

[image: image.png]

Any suggestions on how to fix this?

Thanks,

Giammy
-- 
*Gianmarco*


Re: [Rdkit-discuss] Adjusting/neutralising the formal charges on a molecule

2022-04-09 Thread Gianmarco Ghiandoni
Hello Paolo,

Thanks a lot for your replies. Looks like I have finally managed to achieve
what I wanted to do!

Have a good weekend,

Giammy

On Fri, 8 Apr 2022 at 15:37, Paolo Tosco  wrote:

> Hi Gianmarco,
>
> that's a radical cation, not just a cation, so you'll need to adjust the
> number of radical electrons first, then you may neutralize using
> Chem.MolStandardize.rdMolStandardize.Uncharger as documented in the RDKit
> CookBook:
>
> https://www.rdkit.org/docs/Cookbook.html#neutralizing-molecules
>
> from rdkit import Chem
> from rdkit.Chem.MolStandardize import rdMolStandardize
>
> uc = rdMolStandardize.Uncharger()
>
> Chem.MolToSmiles(uc.uncharge(Chem.MolFromSmiles("[CH+]1C2C2CC2C12")))
>
> 'C1CCC2CC3C3CC2C1'
>
>
> Cheers,
> p.
>
>
> On Fri, Apr 8, 2022 at 4:16 PM Gianmarco Ghiandoni 
> wrote:
>
>> Hi all again,
>>
>> I wonder whether there is a way in RDKit to neutralise the charges of
>> compounds such as "[C+]1C2C2CC2C12". Specifically, in my case I am
>> dealing with only carbon sequences.
>>
>> Thanks,
>> --
>> *Gianmarco*
>>
>

-- 
*Gianmarco*


[Rdkit-discuss] Adjusting/neutralising the formal charges on a molecule

2022-04-08 Thread Gianmarco Ghiandoni
Hi all again,

I wonder whether there is a way in RDKit to neutralise the charges of
compounds such as "[C+]1C2C2CC2C12". Specifically, in my case I am
dealing with only carbon sequences.
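
For carbon-only cations like this, one manual option is to reset the charge and radical count on each charged carbon and let RDKit refill the valence with implicit hydrogens. This is a sketch under that assumption, not the Uncharger route from the RDKit Cookbook; neutralise_carbon_cations is a hypothetical name and the cyclopentyl examples are illustrative.

```python
from rdkit import Chem


def neutralise_carbon_cations(smiles):
    """Return a SMILES in which positively charged carbons are made neutral.

    neutralise_carbon_cations is a hypothetical helper; it only touches
    carbon atoms with a positive formal charge.
    """
    mol = Chem.RWMol(Chem.MolFromSmiles(smiles))
    for atom in mol.GetAtoms():
        if atom.GetAtomicNum() == 6 and atom.GetFormalCharge() > 0:
            atom.SetFormalCharge(0)
            atom.SetNumRadicalElectrons(0)  # drop any radical on the centre
            atom.SetNumExplicitHs(0)
            atom.SetNoImplicit(False)  # let RDKit fill the valence with implicit Hs
    Chem.SanitizeMol(mol)  # recomputes implicit hydrogen counts
    return Chem.MolToSmiles(mol)
```

With this approach both "[CH+]1CCCC1" and the radical cation "[C+]1CCCC1" end up as plain cyclopentane, because the hydrogen count is recomputed from the neutral carbon's default valence.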

Thanks,
-- 
*Gianmarco*


Re: [Rdkit-discuss] Atom removal messes up with the electronic configuration of rings

2022-04-07 Thread Gianmarco Ghiandoni
Hi Paolo,

This is exactly what I tried to do, but unsuccessfully, because I was
just increasing the number of explicit Hs with
nbr.SetNumExplicitHs(nbr.GetNumExplicitHs() + 1).
In fact, my logic was to increase the number of Hs by 1 for each atom
removed, and I am still puzzled as to why that should be increased by 2 for
aromatic bonds (int(GetBondTypeAsDouble) == 2). Could you elaborate on that
for me?
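
On the arithmetic: GetBondTypeAsDouble() returns 1.0 for a single bond, 1.5 for an aromatic bond, 2.0 for a double bond and 3.0 for a triple bond, so int() of it gives 1 for single and aromatic bonds and 2 only for a double bond such as a carbonyl C=O. Removing a doubly bonded oxygen frees two valences on the carbon, hence two hydrogens. The sketch below follows the algorithm quoted further down with the loop indentation made explicit; strip_terminal_atoms is a hypothetical name.

```python
from rdkit import Chem


def strip_terminal_atoms(mol):
    """One pass: remove degree-1 atoms and return the freed valence as Hs.

    strip_terminal_atoms is a hypothetical helper based on the algorithm
    quoted in this thread.
    """
    with Chem.RWMol(mol) as rwmol:  # batch edit: removals are applied on exit
        for bond in rwmol.GetBonds():
            for atom in (bond.GetBeginAtom(), bond.GetEndAtom()):
                other = bond.GetOtherAtom(atom)
                if atom.GetDegree() == 1 and other.GetDegree() > 1:
                    # 1 for single or aromatic (int(1.5)), 2 for double, 3 for triple
                    freed = int(bond.GetBondTypeAsDouble())
                    other.SetNumExplicitHs(other.GetNumExplicitHs() + freed)
                    rwmol.RemoveBond(atom.GetIdx(), other.GetIdx())
                    rwmol.RemoveAtom(atom.GetIdx())
                    break
    Chem.SanitizeMol(rwmol)
    return rwmol.GetMol()
```

For example, one pass over 2-fluorotoluene leaves a six-atom ring with one hydrogen per carbon, and one pass over cyclohexanone leaves a ring with two hydrogens on every carbon, including the former carbonyl carbon.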

Giammy

On Thu, 7 Apr 2022 at 10:38, Paolo Tosco  wrote:

> Hi Gianmarco,
>
> this issue has been discussed before.
> Removing bonds with RWMol.RemoveBond() will not adjust the implicit H
> count of the atom at the two ends of the bond.
> While this is not important for the atom that is going to be removed, the
> count on the atom that stays needs to be adjusted. In particular, you need
> to add as many implicit Hs as the order of the bond that you have removed.
> See below for such an algorithm:
>
> with Chem.RWMol(mol) as rwmol:
>     for b in rwmol.GetBonds():
>         for a in (b.GetBeginAtom(), b.GetEndAtom()):
>             if a.GetDegree() == 1:
>                 oa = b.GetOtherAtom(a)
>                 if oa.GetDegree() > 1:
>                     oa.SetNumExplicitHs(oa.GetNumExplicitHs() +
>                                         int(b.GetBondTypeAsDouble()))
>                     rwmol.RemoveBond(a.GetIdx(), oa.GetIdx())
>                     rwmol.RemoveAtom(a.GetIdx())
>                     break
>
> Chem.SanitizeMol(rwmol)
>
> rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE
>
> rwmol
>
> [image: image.png]
>
> Using Chem.RWMol as a context manager will allow you to commit the changes
> only when you are done, so you won't invalidate atom and bond iterators
> along the way.
>
> Cheers,
> p.
>
> On Thu, Apr 7, 2022 at 10:51 AM Gianmarco Ghiandoni 
> wrote:
>
>> Hi all,
>>
>> I am writing a function that removes atoms with only one bond and can be
>> applied recursively in order to find the scaffold of a molecule. The
>> function works in most cases but I have observed that, when aromatic rings
>> are involved, it produces a loss of information. This is an example:
>>
>> from rdkit import Chem
>> smiles = "CCN1CCC(N(CCc2ccc(C(F)(F)F)cc2)C(=O)Cn2c(CCc3(F)c3F)cc(=O)c3c32)CC1"
>> Chem.MolFromSmiles(smiles.replace('@',''))
>>
>> [image: image.png]
>> mol = Chem.MolFromSmiles(smiles.replace('@', ''))
>> rwmol = Chem.RWMol(mol)
>>
>> # Note that in this case, I am iterating through the atoms only once
>> # but the idea is to repeat this iteration until all single-bonded
>> # atoms have been removed
>> for a in list(rwmol.GetAtoms()):
>>     if len(a.GetBonds()) == 1:
>>         nbr = a.GetNeighbors()[0]
>>         rwmol.RemoveBond(a.GetIdx(), nbr.GetIdx())
>>         rwmol.RemoveAtom(a.GetIdx())
>> rwmol
>>
>> [image: image.png]
>> If you check the top aromatic ring in the starting molecule and the
>> result after the first iteration, you can already see that the removal
>> messes up the aromatic configuration of the ring after removing the two
>> fluorines. If I try to Chem.SanitizeMol, at the end of the loop, it will
>> not fix the problem (KekulizeException: Can't kekulize mol.  Unkekulized
>> atoms: 21 30 31 32 33 34 35 36 37). I have also done some attempts with
>> using different sanitiseOps or by increasing the number of hydrogens on the
>> atoms attached to those I remove but I could not figure out a solution -
>> and clearly, if I carry on removing atoms recursively (i.e., I reapply the
>> removal again), then the molecule gets messed up completely:
>>
>> [image: image.png]
>>
>> Can anyone help me with understanding what's missing?
>>
>> Thanks,
>>
>> Giammy
>>
>

-- 
*Gianmarco*


[Rdkit-discuss] Atom removal messes up with the electronic configuration of rings

2022-04-07 Thread Gianmarco Ghiandoni
Hi all,

I am writing a function that removes atoms with only one bond and can be
applied recursively in order to find the scaffold of a molecule. The
function works in most cases but I have observed that, when aromatic rings
are involved, it produces a loss of information. This is an example:

from rdkit import Chem
smiles = "CCN1CCC(N(CCc2ccc(C(F)(F)F)cc2)C(=O)Cn2c(CCc3(F)c3F)cc(=O)c3c32)CC1"
Chem.MolFromSmiles(smiles.replace('@',''))

[image: image.png]
mol = Chem.MolFromSmiles(smiles.replace('@', ''))
rwmol = Chem.RWMol(mol)

# Note that in this case, I am iterating through the atoms only once
# but the idea is to repeat this iteration until all single-bonded
# atoms have been removed
for a in list(rwmol.GetAtoms()):
    if len(a.GetBonds()) == 1:
        nbr = a.GetNeighbors()[0]
        rwmol.RemoveBond(a.GetIdx(), nbr.GetIdx())
        rwmol.RemoveAtom(a.GetIdx())
rwmol

[image: image.png]
If you check the top aromatic ring in the starting molecule and the result
after the first iteration, you can already see that the removal messes up
the aromatic configuration of the ring after removing the two fluorines. If
I try to Chem.SanitizeMol, at the end of the loop, it will not fix the
problem (KekulizeException: Can't kekulize mol.  Unkekulized atoms: 21 30
31 32 33 34 35 36 37). I have also made some attempts using different
sanitizeOps, or increasing the number of hydrogens on the atoms attached
to those I remove, but I could not figure out a solution - and clearly, if I
carry on removing atoms recursively (i.e., I reapply the removal again),
then the molecule gets messed up completely:

[image: image.png]

Can anyone help me with understanding what's missing?

Thanks,

Giammy


Re: [Rdkit-discuss] Change font size in atom.SetProp("atomNote")

2022-03-28 Thread Gianmarco Ghiandoni
Great. Thanks. I will test this out straight away.

Giammy

On Mon, 28 Mar 2022 at 16:11, Paolo Tosco 
wrote:

> Hi Gianmarco,
>
> sure, e.g.:
>
> d2d = rdMolDraw2D.MolDraw2DSVG(200, 200)
> d2d.drawOptions().annotationFontScale = 0.7
> d2d.drawOptions().addAtomIndices = True
> d2d.DrawMolecule(Chem.MolFromSmiles("c1n1"))
> d2d.FinishDrawing()
> SVG(d2d.GetDrawingText())
> [image: image.png]
>
> Cheers,
> p.
>
> On Mon, Mar 28, 2022 at 5:03 PM Gianmarco Ghiandoni 
> wrote:
>
>> Hello Paolo,
>>
>> Thanks for that. Is it possible to configure that parameter against the
>> rdMolDraw2D? I am using it to get the SVG string for my molecule:
>>
>> d2d = rdMolDraw2D.MolDraw2DSVG(fig_size[0], fig_size[1])
>> d2d.DrawMolecule(
>>     rwmol,
>>     highlightAtoms=atoms_to_highlight,
>>     highlightAtomColors=idx2rgb,
>>     highlightBonds=None,
>> )
>> d2d.FinishDrawing()
>> return d2d.GetDrawingText()
>>
>> Giammy
>>
>> On Mon, 28 Mar 2022 at 14:00, Paolo Tosco 
>> wrote:
>>
>>> Hi Gianmarco,
>>>
>>> the setting that you need to adjust is
>>>
>>> annotationFontScale
>>> <https://www.rdkit.org/docs/cppapi/structRDKit_1_1MolDrawOptions.html#a6cf64fa7c9f2c08870914430f6a46282>
>>>
>>> e.g.
>>> IPythonConsole.drawOptions.annotationFontScale = 0.7
>>>
>>> The default scale is 0.5.
>>>
>>> Cheers,
>>> p.
>>>
>>>
>>> On Mon, Mar 28, 2022 at 2:31 PM Gianmarco Ghiandoni <
>>> ghiandon...@gmail.com> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I am using RDKit to set calculated values to atoms as shown below and I
>>>> would like to know whether it is possible or not to change the font size to
>>>> make it slightly bigger.
>>>>
>>>> # For each atom, set the property "atomNote" to a index+1 of the atom
>>>> atom.SetProp("atomNote", str(atom.GetIdx()+1))
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Giammy
>>>>
>>>> --
>>>> *Gianmarco*
>>>>
>>>
>>
>> --
>> *Gianmarco*
>>
>

-- 
*Gianmarco*


Re: [Rdkit-discuss] Change font size in atom.SetProp("atomNote")

2022-03-28 Thread Gianmarco Ghiandoni
Hello Paolo,

Thanks for that. Is it possible to configure that parameter against the
rdMolDraw2D? I am using it to get the SVG string for my molecule:

d2d = rdMolDraw2D.MolDraw2DSVG(fig_size[0], fig_size[1])
d2d.DrawMolecule(
    rwmol,
    highlightAtoms=atoms_to_highlight,
    highlightAtomColors=idx2rgb,
    highlightBonds=None,
)
d2d.FinishDrawing()
return d2d.GetDrawingText()
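
The snippet above is a function body; a self-contained sketch of it with annotationFontScale set on the drawer's own options could look like the following. The function name mol_to_svg, its default arguments and the example colours are assumptions; the drawing calls mirror the ones in the snippet.

```python
from rdkit import Chem
from rdkit.Chem.Draw import rdMolDraw2D


def mol_to_svg(mol, atoms_to_highlight, idx2rgb, fig_size=(300, 300), font_scale=0.7):
    """Return an SVG string with highlighted atoms and scaled atom notes.

    mol_to_svg is a hypothetical wrapper around the calls shown above.
    """
    d2d = rdMolDraw2D.MolDraw2DSVG(fig_size[0], fig_size[1])
    # Per-drawer equivalent of IPythonConsole.drawOptions.annotationFontScale
    d2d.drawOptions().annotationFontScale = font_scale
    d2d.DrawMolecule(
        mol,
        highlightAtoms=atoms_to_highlight,
        highlightAtomColors=idx2rgb,
        highlightBonds=None,
    )
    d2d.FinishDrawing()
    return d2d.GetDrawingText()


def annotate_with_indices(mol):
    """Set atomNote on every atom to its 1-based index, as in the original post."""
    for atom in mol.GetAtoms():
        atom.SetProp("atomNote", str(atom.GetIdx() + 1))
    return mol
```

A call such as mol_to_svg(annotate_with_indices(Chem.MolFromSmiles("c1ccccn1")), [0, 5], {0: (1.0, 0.6, 0.6), 5: (0.6, 0.6, 1.0)}) would then return the SVG text with the enlarged notes.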

Giammy

On Mon, 28 Mar 2022 at 14:00, Paolo Tosco 
wrote:

> Hi Gianmarco,
>
> the setting that you need to adjust is
>
> annotationFontScale
> <https://www.rdkit.org/docs/cppapi/structRDKit_1_1MolDrawOptions.html#a6cf64fa7c9f2c08870914430f6a46282>
>
> e.g.
> IPythonConsole.drawOptions.annotationFontScale = 0.7
>
> The default scale is 0.5.
>
> Cheers,
> p.
>
>
> On Mon, Mar 28, 2022 at 2:31 PM Gianmarco Ghiandoni 
> wrote:
>
>> Hi all,
>>
>> I am using RDKit to set calculated values to atoms as shown below and I
>> would like to know whether it is possible or not to change the font size to
>> make it slightly bigger.
>>
>> # For each atom, set the property "atomNote" to a index+1 of the atom
>> atom.SetProp("atomNote", str(atom.GetIdx()+1))
>>
>>
>> Thanks,
>>
>> Giammy
>>
>> --
>> *Gianmarco*
>>
>

-- 
*Gianmarco*


[Rdkit-discuss] Change font size in atom.SetProp("atomNote")

2022-03-28 Thread Gianmarco Ghiandoni
Hi all,

I am using RDKit to set calculated values to atoms as shown below and I
would like to know whether it is possible or not to change the font size to
make it slightly bigger.

# For each atom, set the "atomNote" property to the atom's index + 1
# (mol stands for the molecule being annotated)
for atom in mol.GetAtoms():
    atom.SetProp("atomNote", str(atom.GetIdx() + 1))


Thanks,

Giammy

-- 
*Gianmarco*


Re: [Rdkit-discuss] Reading an SDF/Mol without shuffling the original coordinates

2022-01-13 Thread Gianmarco Ghiandoni
Exactly, Paolo. That is what I was looking for, and now I understand what
was going on: all heavy atoms had coordinates but the hydrogens did not. To
add those, I was embedding the whole thing from scratch, and that was
shuffling the coordinates of all atoms.
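
A small sketch that checks this behaviour: hydrogens added with addCoords=True get coordinates while the heavy atoms stay where they were. The helper names are hypothetical, and the embedded ethanol used in the usage note is only a stand-in for a ligand read from an SDF file.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def add_hs_keep_heavy_coords(mol):
    """Add explicit hydrogens with coordinates, leaving heavy atoms untouched."""
    return Chem.AddHs(mol, addCoords=True)


def max_heavy_atom_shift(before, after):
    """Largest displacement of the atoms of `before` in `after`.

    Relies on AddHs appending the new hydrogens after the existing atoms,
    so heavy-atom indices are the same in both molecules.
    """
    c0, c1 = before.GetConformer(), after.GetConformer()
    return max(
        c0.GetAtomPosition(i).Distance(c1.GetAtomPosition(i))
        for i in range(before.GetNumAtoms())
    )


def make_heavy_atom_example(smiles="CCO", seed=42):
    """Stand-in for an SDF ligand: a molecule with 3D heavy-atom coordinates only."""
    mh = Chem.AddHs(Chem.MolFromSmiles(smiles))
    AllChem.EmbedMolecule(mh, randomSeed=seed)
    return Chem.RemoveHs(mh)
```

With heavy = make_heavy_atom_example() and with_h = add_hs_keep_heavy_coords(heavy), max_heavy_atom_shift(heavy, with_h) should be zero to within rounding, whereas calling EmbedMolecule on with_h would replace every coordinate.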

Thanks,

Giammy



On Thu, 13 Jan 2022 at 13:25, Paolo Tosco 
wrote:

> Hi Gianmarco,
>
> you can add hydrogens with coordinates keeping the current heavy atom
> coordinates with
>
> rdkit_mol = rdkit.Chem.AddHs(rdkit_mol, addCoords=True)
>
> so you may avoid having to call EmbedMolecule, which will compute a whole
> set of new coordinates for your molecule.
>
> I hope I interpreted your needs correctly!
>
> Cheers,
> p.
>
>
> On Thu, Jan 13, 2022 at 2:05 PM Gianmarco Ghiandoni 
> wrote:
>
>> Hi all,
>>
>> Quick question: I have noticed that when I read a compound from
>> an SDF/Mol file where all coordinates are already defined, the embedding
>> process changes the conformation of the RDKit mol object, hence coordinates
>> are not preserved:
>>
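>> import os
>> import rdkit.Chem.AllChem  # also makes rdkit.Chem available
>>
>> # data_path is defined elsewhere
>> cdk2_path = os.path.join(data_path, "cdk2_validation")
>> rdkit_mol = rdkit.Chem.MolFromMolFile(os.path.join(cdk2_path, "26Z.sdf"))
>> rdkit_mol = rdkit.Chem.AddHs(rdkit_mol)
>> # EmbedMolecule generates a new conformer, replacing the file's coordinates
>> rdkit.Chem.AllChem.EmbedMolecule(rdkit_mol)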
>>
>> What is the best practice for reading a molecule with coordinates without
>> shuffling them? (e.g. for a ligand from the PDB).
>>
>> Thanks,
>> --
>> *Gianmarco*
>>
>

-- 
*Gianmarco*
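
Paolo's suggestion can be checked with a small self-contained sketch. Since
the 26Z.sdf file from the thread is not available here, the sketch builds a
stand-in: ethanol is embedded once and its hydrogens are stripped, which
mimics a ligand read from a file with heavy-atom coordinates only. The
molecule and the random seed are assumptions made for illustration.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Stand-in for a ligand read from an SDF: heavy atoms with 3D coordinates,
# no hydrogens (built here by embedding and then stripping the hydrogens)
mol = Chem.AddHs(Chem.MolFromSmiles("CCO"))
AllChem.EmbedMolecule(mol, randomSeed=42)
heavy = Chem.RemoveHs(mol)

# Heavy-atom coordinates before adding hydrogens
before = heavy.GetConformer().GetPositions()

# Add hydrogens with coordinates, without re-embedding the molecule
with_hs = Chem.AddHs(heavy, addCoords=True)

# Hydrogens are appended after the existing atoms, so the first
# heavy.GetNumAtoms() rows are the original heavy atoms
after = with_hs.GetConformer().GetPositions()[: heavy.GetNumAtoms()]

# Largest per-coordinate change of any heavy atom
max_shift = float(abs(after - before).max())
```

If the heavy atoms are left untouched, max_shift should be zero to within
floating-point precision, which is the behaviour described in the reply.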


[Rdkit-discuss] Reading an SDF/Mol without shuffling the original coordinates

2022-01-13 Thread Gianmarco Ghiandoni
Hi all,

Quick question: I have noticed that when I read a compound from an SDF/Mol
file where all coordinates are already defined, the embedding process
changes the conformation of the RDKit mol object, hence coordinates are not
preserved:

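import os
import rdkit.Chem.AllChem  # also makes rdkit.Chem available

# data_path is defined elsewhere
cdk2_path = os.path.join(data_path, "cdk2_validation")
rdkit_mol = rdkit.Chem.MolFromMolFile(os.path.join(cdk2_path, "26Z.sdf"))
rdkit_mol = rdkit.Chem.AddHs(rdkit_mol)
# EmbedMolecule generates a new conformer, replacing the file's coordinates
rdkit.Chem.AllChem.EmbedMolecule(rdkit_mol)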

What is the best practice for reading a molecule with coordinates without
shuffling them? (e.g. for a ligand from the PDB).

Thanks,
-- 
*Gianmarco*


Re: [Rdkit-discuss] Hiding/removing specific atoms in a RDKit molecule

2021-12-03 Thread Gianmarco Ghiandoni
Hello Paolo,

I knew about EditableMol objects but not about RWMol. That looks great,
thank you. I have just tested the function and it does exactly what I
needed.

Best,

Giammy

On Thu, 2 Dec 2021 at 20:00, Paolo Tosco  wrote:

> Hi Gianmarco,
>
> I am not aware of a method to simply hide atoms: here's a method to remove
> atoms given a list of indices, which should be what you need.
> Note that remove_selected_hs() replaces the "real" hydrogen with an
> implicit H on the parent atom, which I believe is what you want.
>
> from rdkit import Chem
> from rdkit.Chem import rdDistGeom
>
> mol = Chem.MolFromSmiles("OC[C@H]1OC(O)[C@H](O)[C@@H](O)[C@@H]1O")
> mol_h = Chem.AddHs(mol)
> rdDistGeom.EmbedMolecule(mol_h)
> 0
> mol_h
> [image: image.png]
> hs = [a.GetIdx() for a in mol_h.GetAtoms() if a.GetAtomicNum() == 1]
> hs
> [12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]
> # Let's remove every second H as an example
> hs_to_remove = [hs[i] for i in range(0, len(hs), 2)]
> hs_to_remove
> [12, 14, 16, 18, 20, 22]
>
> def remove_selected_hs(mol, hs_to_remove):
>     def fix_explicit_hs(mol, ai):
>         # Give the heavy-atom neighbor one more explicit H to replace
>         # the hydrogen atom that is being removed
>         oa = mol.GetAtomWithIdx(ai)
>         if oa.GetAtomicNum() > 1:
>             oa.SetNumExplicitHs(oa.GetNumExplicitHs() + 1)
>
>     if not hs_to_remove:
>         return mol
>     rwmol = Chem.RWMol(mol)
>     ai_to_remove = []
>     # Walk the bonds in reverse so removing one does not shift the rest
>     for bond_idx in reversed(range(rwmol.GetNumBonds())):
>         b = rwmol.GetBondWithIdx(bond_idx)
>         bidx = b.GetBeginAtomIdx()
>         remove_bidx = bidx in hs_to_remove
>         eidx = b.GetEndAtomIdx()
>         remove_eidx = eidx in hs_to_remove
>         if (remove_bidx or remove_eidx):
>             if remove_bidx:
>                 ai_to_remove.append(bidx)
>                 fix_explicit_hs(rwmol, eidx)
>             if remove_eidx:
>                 ai_to_remove.append(eidx)
>                 fix_explicit_hs(rwmol, bidx)
>             rwmol.RemoveBond(bidx, eidx)
>     # Remove atoms from the highest index down so pending indices stay valid
>     for atom_idx in sorted(ai_to_remove, reverse=True):
>         rwmol.RemoveAtom(atom_idx)
>     Chem.SanitizeMol(rwmol)
>     return rwmol.GetMol()
>
> remove_selected_hs(mol_h, hs_to_remove)
> [image: image.png]
> Cheers,
> p.
>
> On Thu, Dec 2, 2021 at 1:19 PM Gianmarco Ghiandoni 
> wrote:
>
>> Hi all,
>>
>> I have been working on a library that computes properties for molecules
>> with explicit hydrogens. The properties are saved in a dictionary keyed by
>> atom index, and I use that dictionary to draw each property next to its
>> hydrogen. Labels are only drawn for values that pass certain thresholds,
>> so not all hydrogens are labelled. The result is something like this:
>>
>> [image: image.png]
>>
>> Now my question is: how do I hide/remove the hydrogens that are not
>> labelled? I am storing their indices while iterating to append the labels,
>> but there does not seem to be a straightforward way to get rid of them.
>>
>> Thanks,
>>
>> Giammy
>>
>> --
>> *Gianmarco*
>>
>

-- 
*Gianmarco*
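
To connect the question with the function in Paolo's reply, here is a minimal
sketch of how the list of hydrogens to remove might be derived from a
property dictionary and a threshold. The molecule (ethanol), the props
dictionary, its values and the threshold are all hypothetical and chosen only
for illustration.

```python
from rdkit import Chem

# Molecule with explicit hydrogens (ethanol as an illustrative example)
mol_h = Chem.AddHs(Chem.MolFromSmiles("CCO"))

# Hypothetical per-hydrogen property values, keyed by atom index
props = {3: 0.1, 8: 0.9}
threshold = 0.5  # hypothetical cut-off: only values at or above it are shown

# Label the hydrogens that pass the threshold and remember their indices
labelled = set()
for idx, value in props.items():
    if value >= threshold:
        mol_h.GetAtomWithIdx(idx).SetProp("atomNote", f"{value:.2f}")
        labelled.add(idx)

# Every hydrogen without a label is a candidate for removal
hs_to_remove = [
    atom.GetIdx()
    for atom in mol_h.GetAtoms()
    if atom.GetAtomicNum() == 1 and atom.GetIdx() not in labelled
]
```

The resulting hs_to_remove list has the shape expected by
remove_selected_hs() from the reply, so it could be passed to that function
together with mol_h.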


[Rdkit-discuss] Hiding/removing specific atoms in a RDKit molecule

2021-12-02 Thread Gianmarco Ghiandoni
Hi all,

I have been working on a library that computes properties for molecules
with explicit hydrogens. The properties are saved in a dictionary keyed by
atom index, and I use that dictionary to draw each property next to its
hydrogen. Labels are only drawn for values that pass certain thresholds,
so not all hydrogens are labelled. The result is something like this:

[image: image.png]

Now my question is: how do I hide/remove the hydrogens that are not
labelled? I am storing their indices while iterating to append the labels,
but there does not seem to be a straightforward way to get rid of them.

Thanks,

Giammy

-- 
*Gianmarco*


[Rdkit-discuss] Count-based fingerprints of molecules with and without starred atoms

2021-09-19 Thread Gianmarco Ghiandoni
I have a question for the RDKit community about count-based RDKit
fingerprints. During a test, I noticed that the compound *C1C1 has
a (Euclidean) similarity different from 1 against either [H]C1C1
or C1C1. So I am wondering: how does RDKit encode starred atoms? I
thought that the star would either be substituted with a hydrogen atom or
ignored, but it looks like I was wrong.

Best,

Giammy
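
The observation can be reproduced with a small sketch. It uses ethane-based
SMILES of my own choosing rather than the compounds from the message, the
RDKit fingerprint generator with default settings, and a plain Euclidean
distance (0 for identical vectors) instead of the similarity measure used in
the original test, so all of these are assumptions. As far as I know, a
starred (dummy) atom is kept as an ordinary atom with atomic number 0 and
contributes its own paths, whereas an explicit [H] written in SMILES is
merged back into its parent atom by the default parser settings.

```python
import math

from rdkit import Chem
from rdkit.Chem import rdFingerprintGenerator


def count_vector(smiles, generator):
    """Return the non-zero fingerprint counts of a SMILES as a dict."""
    mol = Chem.MolFromSmiles(smiles)
    return generator.GetSparseCountFingerprint(mol).GetNonzeroElements()


def euclidean_distance(a, b):
    """Euclidean distance between two sparse count vectors stored as dicts."""
    keys = set(a) | set(b)
    return math.sqrt(sum((a.get(k, 0) - b.get(k, 0)) ** 2 for k in keys))


gen = rdFingerprintGenerator.GetRDKitFPGenerator()

ethane = count_vector("CC", gen)
with_h = count_vector("[H]CC", gen)    # explicit H is removed on parsing
with_star = count_vector("*CC", gen)   # dummy atom stays in the graph

d_h = euclidean_distance(ethane, with_h)
d_star = euclidean_distance(ethane, with_star)
```

Under these assumptions d_h is zero while d_star is not, which matches the
behaviour described in the question: the star is neither replaced by a
hydrogen nor ignored.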