Re: [Rdkit-discuss] RDKit molecule standardization/normalization protocol

2021-06-22 Thread Francois Berenger

Dear JP,

To confuse you even more, you can also have a look at the ChEMBL 
open-source molecular standardizer:


https://github.com/chembl/ChEMBL_Structure_Pipeline/blob/master/chembl_structure_pipeline/standardizer.py

No need to thank me. :D
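
On question 5 in the quoted message below (chaining operations), one possible approach is a small composition helper. This is only a sketch: chain() and build_standardizer() are hypothetical names, and the order of the steps is an assumption, not a recommendation. It relies on rdMolStandardize.Cleanup, LargestFragmentChooser.choose and Uncharger.uncharge each taking a mol and returning a new mol.

```python
from functools import reduce


def chain(*steps):
    """Compose single-argument steps, applied left to right, into one callable."""
    def run(value):
        return reduce(lambda acc, step: step(acc), steps, value)
    return run


def build_standardizer():
    """Return one callable that applies three standardization steps in turn.

    The step order here is an assumption. RDKit is imported lazily so that
    chain() itself has no dependencies.
    """
    from rdkit.Chem.MolStandardize import rdMolStandardize

    chooser = rdMolStandardize.LargestFragmentChooser()
    uncharger = rdMolStandardize.Uncharger()
    return chain(
        rdMolStandardize.Cleanup,  # general cleanup of the input mol
        chooser.choose,            # keep only the largest fragment
        uncharger.uncharge,        # neutralize charges where possible
    )
```

Usage would look like standardize = build_standardizer() followed by clean = standardize(mol). The instances are created once and reused for every molecule.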

On 18/06/2021 03:12, JP Ebejer wrote:

Dear all,

I am trying to standardize(/normalize?) some molecules from different
sources, to generate a set of descriptors for them.  I have done this
a number of times, and each time I find the process slightly
confusing.  I have the following questions please, if you don't mind:

1.  What is the relation between molvs and rdkit (I remember there was
an integration project between the two a while back).  When I call
rdMolStandardize does rdkit code or molvs code get called?  The github
repo for molvs hasn't been updated in a while (2 yrs), but
rdMolStandardize has.
2.  What is the difference between standardization and normalization
of a molecule?  Does one automatically imply the other or should these
two processes be both run on a molecule?
3.  Specifically, what is the difference between
rdMolStandardize.Cleanup(mol), Chem.SanitizeMol(mol), and
rdMolStandardize.Normalize(mol)?  Should I call any of these three
manually after I run "standardization/cleaning operations" such as
uncharging, reionizing, etc.?
4.  I understand what uncharge does, but what does reionizer do?
5.  Is there a way to chain operations together
(standardize+ChooseLargestFragment+uncharge+normalize; I am not sure the
order makes sense here), other than creating a class instance for each,
calling the method, returning a new mol and using this mol in the next
operation?

Apologies for the many questions.  Have I missed the documentation
about this?  I have found some excellent examples here:
https://github.com/susanhleung/rdkit/blob/dev/GSOC2018_MolVS_Integration/rdkit/Chem/MolStandardize/tutorial/MolStandardize.ipynb
(thanks!).  This is not exactly a cleaning pipeline, but still quite
helpful to understand these methods.

Many thanks,
JP
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss





Re: [Rdkit-discuss] Autodock Vina

2021-06-22 Thread Gustavo Seabra
Hi Velik,

I do this on a regular basis for our generators here. Basically what you
will need is to:

1. Generate 3D structures for the molecules (RDKit can do that)
2. Save to SDF files (again, RDKit)
3. Convert to PDBQT (I use OpenBabel: "$ obabel -isdf structures.sdf
-opdbqt -Oname-.pdbqt -m")

Then you'll have the files you need. Of course, you will still need to
build the pdbqt file for the target and the vina_config file, but that you
only need to do once.
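
A rough sketch of the steps above, under stated assumptions: the function names are hypothetical, the file names and box values are placeholders, and the config keys (receptor, center_x/y/z, size_x/y/z, exhaustiveness) are the ones Vina reads from its config file. The obabel arguments are the ones from step 3. RDKit is imported inside the function so the other helpers work without it.

```python
def smiles_to_sdf(smiles_list, sdf_path, seed=42):
    """Steps 1-2: embed one 3D conformer per SMILES and write them to an SDF file."""
    from rdkit import Chem
    from rdkit.Chem import AllChem

    writer = Chem.SDWriter(str(sdf_path))
    try:
        for smiles in smiles_list:
            mol = Chem.MolFromSmiles(smiles)
            if mol is None:
                continue  # skip SMILES that fail to parse
            mol = Chem.AddHs(mol)  # explicit hydrogens before embedding
            if AllChem.EmbedMolecule(mol, randomSeed=seed) != 0:
                continue  # embedding failed, no conformer generated
            AllChem.MMFFOptimizeMolecule(mol)
            writer.write(mol)
    finally:
        writer.close()


def obabel_command(sdf_path, prefix):
    """Step 3: argument list that splits an SDF into one PDBQT file per molecule."""
    return ["obabel", "-isdf", str(sdf_path), "-opdbqt", f"-O{prefix}.pdbqt", "-m"]


def vina_config_text(receptor, center, size, exhaustiveness=8):
    """Render the body of a Vina config file for one receptor and search box."""
    cx, cy, cz = center
    sx, sy, sz = size
    lines = [
        f"receptor = {receptor}",
        f"center_x = {cx}",
        f"center_y = {cy}",
        f"center_z = {cz}",
        f"size_x = {sx}",
        f"size_y = {sy}",
        f"size_z = {sz}",
        f"exhaustiveness = {exhaustiveness}",
    ]
    return "\n".join(lines) + "\n"
```

The list returned by obabel_command() can be passed to subprocess.run(). The box center and size have to come from the binding site of your own target.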

All the best,
--
Gustavo Seabra.


On Tue, Jun 22, 2021 at 4:08 AM Velik Velikov wrote:

> Dear all,
>
>
>
> I am constructing new molecules (de novo design) that are drug-like with
> RDKit. I have my molecules in SMILES now and I need to check them with
> AutoDock Vina. I have never used it and I have been trying since last week
> but I kind of don’t know where to go from here.
>
> What is my config file, ligand or receptor? Do I need MGL Tools, PyMOL or
> something else?
>
> Also, I couldn’t run it on my mac - Big Sur, I tried with a VirtualBox but
> it didn’t work out either. I am thinking about installing Autodock Vina on
> my old windows laptop now. Appreciate any help with this tool. Thanks in
> advance.
>
>
> Best,
>
> Velik Velikov


Re: [Rdkit-discuss] [ext] Re: Autodock Vina

2021-06-22 Thread Volkamer, Andrea
Hi Velik,


though unrelated to RDKit ...

in our TeachOpenCADD platform (still living in a PR, #74, to be released
soon) we talk about protein-ligand docking using smina, starting from
SMILES input. It might help with getting started ...

https://github.com/volkamerlab/teachopencadd/blob/t011-base/teachopencadd/talktorials/T015_protein_ligand_docking/talktorial.ipynb


Best, Andrea


From: Greg Landrum
Sent: Tuesday, 22 June 2021 10:28:36
To: Velik Velikov
Cc: RDKit Discuss
Subject: [ext] Re: [Rdkit-discuss] Autodock Vina

Hi Velik,

This is a discussion list for the RDKit, not for Autodock Vina.

Here's the link for getting help about Autodock Vina:
http://vina.scripps.edu/questions.html

Best,
-greg

On Tue, Jun 22, 2021 at 10:08 AM Velik Velikov wrote:
Dear all,

I am constructing new molecules (de novo design) that are drug-like with RDKit. 
I have my molecules in SMILES now and I need to check them with AutoDock Vina. 
I have never used it and I have been trying since last week but I kind of don’t 
know where to go from here.
What is my config file, ligand or receptor? Do I need MGL Tools, PyMOL or 
something else?
Also, I couldn’t run it on my mac - Big Sur, I tried with a VirtualBox but it 
didn’t work out either. I am thinking about installing Autodock Vina on my old 
windows laptop now. Appreciate any help with this tool. Thanks in advance.

Best,
Velik Velikov


Re: [Rdkit-discuss] Autodock Vina

2021-06-22 Thread Greg Landrum
Hi Velik,

This is a discussion list for the RDKit, not for Autodock Vina.

Here's the link for getting help about Autodock Vina:
http://vina.scripps.edu/questions.html

Best,
-greg

On Tue, Jun 22, 2021 at 10:08 AM Velik Velikov wrote:

> Dear all,
>
>
>
> I am constructing new molecules (de novo design) that are drug-like with
> RDKit. I have my molecules in SMILES now and I need to check them with
> AutoDock Vina. I have never used it and I have been trying since last week
> but I kind of don’t know where to go from here.
>
> What is my config file, ligand or receptor? Do I need MGL Tools, PyMOL or
> something else?
>
> Also, I couldn’t run it on my mac - Big Sur, I tried with a VirtualBox but
> it didn’t work out either. I am thinking about installing Autodock Vina on
> my old windows laptop now. Appreciate any help with this tool. Thanks in
> advance.
>
>
> Best,
>
> Velik Velikov


[Rdkit-discuss] Autodock Vina

2021-06-22 Thread Velik Velikov
Dear all,



I am constructing new molecules (de novo design) that are drug-like with
RDKit. I have my molecules in SMILES now and I need to check them with
AutoDock Vina. I have never used it and I have been trying since last week
but I kind of don’t know where to go from here.

What is my config file, ligand or receptor? Do I need MGL Tools, PyMOL or
something else?

Also, I couldn’t run it on my mac - Big Sur, I tried with a VirtualBox but
it didn’t work out either. I am thinking about installing Autodock Vina on
my old windows laptop now. Appreciate any help with this tool. Thanks in
advance.


Best,

Velik Velikov


Re: [Rdkit-discuss] Searching in (Downloaded) Databases

2021-06-22 Thread Paolo Tosco
Hi Philipp,

It looks like the supplier thinks the line index has gone past the end of
file.
1) How large is the SMILES file which leads to this error (ls -l)?
2) Does it consistently happen at the same line number?
You can check this with something like:

suppl = Chem.SmilesMolSupplier(infile, sanitize=False, nameColumn=-1)
i = 0
while 1:
    try:
        mol = next(suppl)
    except StopIteration:
        break
    except Exception:
        print(f"Exception raised after {i} mols")
        raise
    i += 1

To check whether the problem is actually due to file size, you can split
your input file linewise with the coreutils split command:

split -l <N> large_file.smi large_file_ --additional-suffix=.smi

Replace <N> with a number smaller than the one that causes the exception
and check if operating on smaller chunks removes the problem.
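
If coreutils split is not at hand (the --additional-suffix option is a GNU extension), the same linewise split can be sketched in plain Python. This is a minimal sketch; split_linewise() and its chunk naming scheme are made up for illustration.

```python
from itertools import islice
from pathlib import Path


def split_linewise(path, lines_per_chunk, suffix=".smi"):
    """Write consecutive chunks of `path` next to it and return their paths.

    Chunks are named <stem>_0000<suffix>, <stem>_0001<suffix>, and so on.
    """
    src = Path(path)
    chunks = []
    with src.open() as handle:
        index = 0
        while True:
            # islice keeps consuming the same file handle, so every pass
            # picks up where the previous chunk stopped.
            block = list(islice(handle, lines_per_chunk))
            if not block:
                break
            chunk_path = src.with_name(f"{src.stem}_{index:04d}{suffix}")
            chunk_path.write_text("".join(block))
            chunks.append(chunk_path)
            index += 1
    return chunks
```

One thing to keep in mind when reading the chunks back: SmilesMolSupplier treats the first line as a title line by default (titleLine=True), so chunks that do not start with a header need titleLine=False.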

HTH, cheers
p.


On Tue, Jun 22, 2021 at 8:19 AM Philipp Otten 
wrote:

> Hey you lovely people,
> as I am creating a set of building blocks for my in-silico reaction, I
> downloaded various accessible databases (ChEMBL28, GDB13, GDB17, PubChem,
> eMolecules and mcule) and want to just work through them with
> "HasSubstructMatch". Unfortunately I run into a "File parsing error: ran
> out of lines" error.
> I open the .smi files with SmilesMolSupplier and then just loop through
> them with a for loop:
>
> with open(target_file, "w") as outfile:
>     suppl = Chem.SmilesMolSupplier(infile, sanitize=False,
>                                    nameColumn=-1)
>     for mol in suppl:
>         if Descriptors.MolWt(mol) <= mwt:
>             if mol.HasSubstructMatch(pattern1) == True:
>                 mol = Chem.MolToSmiles(mol)
>                 outfile.write(mol + "\n")
>             else:
>                 continue
>         else:
>             continue
>
> I can imagine that it possibly has something to do with the length of the
> files, but I don't know how to actually fix that.
> Thanks for all your help!
> Kind regards
> Philipp


[Rdkit-discuss] Searching in (Downloaded) Databases

2021-06-22 Thread Philipp Otten
Hey you lovely people,
as I am creating a set of building blocks for my in-silico reaction, I
downloaded various accessible databases (ChEMBL28, GDB13, GDB17, PubChem,
eMolecules and mcule) and want to just work through them with
"HasSubstructMatch". Unfortunately I run into a "File parsing error: ran
out of lines" error.
I open the .smi files with SmilesMolSupplier and then just loop through
them with a for loop:

with open(target_file, "w") as outfile:
    suppl = Chem.SmilesMolSupplier(infile, sanitize=False,
                                   nameColumn=-1)
    for mol in suppl:
        if Descriptors.MolWt(mol) <= mwt:
            if mol.HasSubstructMatch(pattern1) == True:
                mol = Chem.MolToSmiles(mol)
                outfile.write(mol + "\n")
            else:
                continue
        else:
            continue

I can imagine that it possibly has something to do with the length of the
files, but I don't know how to actually fix that.
Thanks for all your help!
Kind regards
Philipp