Even better ☺
From: Greg Landrum
Date: Friday 29 June 2018 at 18:04
To: GMCProfile
Cc: "rdkit-discuss@lists.sourceforge.net"
Subject: Re: [Rdkit-discuss] elimination of small fragments
How about just GetLargestFragment()?
On Fri, 29 Jun 2018 at 16:45, Stiefl, Nikolaus wrote:
>
> From: Alfredo Quevedo
> Date: Friday 29 June 2018 at 12:06
> To: Andrew Dalke
> Cc: Stephen Roughley via Rdkit-discuss <rdkit-discuss@lists.sourceforge.net>
> Subject: Re: [Rdkit-discuss] elimination of small fragments
>
rdmolops.GetMolFrags(mol, asMols = True, largestFragmentOnly = True)
? Just a thought …
Cheers
Nik
From: Alfredo Quevedo
Date: Friday 29 June 2018 at 12:06
To: Andrew Dalke
Cc: Stephen Roughley via Rdkit-discuss
Subject: Re: [Rdkit-discuss] elimination of small fragments
Thank you very much
Ed, As always, there are no 'one size fits all' solutions, it all depends
on what you need to do. I was processing tens of millions of screening
compounds into a database and used a desalter/desolvator written using the
RDKit C++ API. That was quite quick enough for my needs - I never tried it
wit
Chris, Absolutely agree with your points - processing the molecules into RDKit
is much more robust, but it depends though on how many you’ve got to process.
If you’re doing millions to billions, then the overhead can become a problem
and doing it in two steps (lexical then graph) can be the pra
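A minimal sketch of the two-step idea (lexical, then graph), assuming the lexical step only needs to spot SMILES that contain no '.' and therefore cannot hold more than one fragment. The names strip_small_fragments and graph_step are hypothetical, introduced here for illustration; in practice graph_step would wrap whatever RDKit-based fragment chooser you use.

```python
def strip_small_fragments(smiles_list, graph_step):
    """Apply a cheap lexical check first, and the costly step only if needed.

    `graph_step` is a hypothetical callable that takes a multi-fragment
    SMILES string and returns the SMILES of the fragment to keep.
    """
    results = []
    for smiles in smiles_list:
        if "." not in smiles:
            # Step 1 (lexical): no '.' means a single fragment, so the
            # record passes through without being parsed at all.
            results.append(smiles)
        else:
            # Step 2 (graph): only records with a '.' pay the parsing cost.
            results.append(graph_step(smiles))
    return results


# Stand-in for the graph step, used here only to keep the sketch runnable.
def longest_piece(smiles):
    return max(smiles.split("."), key=len)


cleaned = strip_small_fragments(["CCO", "CCC.CC"], graph_step=longest_piece)
print(cleaned)
```

The saving comes from the fact that most records in a typical screening file have no '.' at all, so they skip molecule construction entirely.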
I'd say that using RDKit to calculate the numbers of heavy atoms is
significantly more robust than a purely lexical approach - and it's easy to
implement.
It's also dangerous to just discard the smallest fragment. Years ago I
worked on a project where the active molecule had only 11 heavy atoms an
On Jun 29, 2018, at 02:43, 藤秀義 wrote:
> Although not strictly based on the number of atoms, but on the length of
> SMILES string, the simplest way is using Python built-in functions as follows:
>
> smiles = 'CCC.CC'
> fragment = max(smiles.split('.'), key=len)
> print (fragment)
The mmpdb packa
thank you Ed for suggesting this alternative
regards
Alfredo
Sent from BlueMail
On 29 June 2018 at 05:56, Ed Griffen wrote:
>Using the string length to find the number of atoms in a molecule is OK
>- but you need to take account of the additional characters in SMILES
>t
Thank you very much, Andrew, for this detailed explanation
regards
Alfredo
Sent from BlueMail
On 29 June 2018 at 07:02, Andrew Dalke wrote:
>On Jun 28, 2018, at 22:08, Paolo Tosco
>wrote:
>> if you wish to keep only the largest disconnected fragment you may
>try the follo
On Jun 28, 2018, at 22:08, Paolo Tosco wrote:
> if you wish to keep only the largest disconnected fragment you may try the
> following:
>
> mols = list(rdmolops.GetMolFrags(mol, asMols = True))
> if (mols):
>     mols.sort(reverse = True, key = lambda m: m.GetNumAtoms())
>     mol = mols[0]
A s
Using the string length to find the number of atoms in a molecule is OK - but
you need to take account of the additional characters in SMILES that are not
just atoms, for example:
- two-letter elements, like silicon, chlorine etc.
- brackets, ring closures, charges, explicit hydrogens
It’s simple
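One way to account for those extra characters without building a molecule is to count atom tokens instead of characters. The following is a rough sketch only, not a SMILES parser: it assumes that atoms outside brackets come from the organic subset (B, C, N, O, P, S, F, Cl, Br, I and the aromatic lowercase forms) and that every bracket expression is exactly one atom. The names count_heavy_atoms and largest_fragment are made up for this example.

```python
import re

# One atom token: a bracket atom, a two-letter organic-subset element,
# or a one-letter aliphatic / aromatic organic-subset element.
ATOM_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|[BCNOPSFI]|[bcnops]")

# Bracket atoms that are plain hydrogens, e.g. [H], [H+], [2H].
HYDROGEN = re.compile(r"\[\d*H[+-]?\d*\]")


def count_heavy_atoms(fragment):
    """Count non-hydrogen atom tokens in one SMILES fragment."""
    tokens = ATOM_TOKEN.findall(fragment)
    return sum(1 for token in tokens if not HYDROGEN.fullmatch(token))


def largest_fragment(smiles):
    """Return the '.'-separated piece with the most heavy-atom tokens."""
    return max(smiles.split("."), key=count_heavy_atoms)


smiles = "[Na+].CCC"
print(max(smiles.split("."), key=len))  # string length picks the sodium ion
print(largest_fragment(smiles))         # token count picks the propane
```

Ring-closure digits, bond symbols, branches and charges are ignored because they never match an atom token. Splitting on '.' is still purely lexical, so the sketch shares the other limitations of a string-based approach.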
thank you Hideyoshi for your feedback.
regards
Alfredo
Sent from BlueMail
On 28 June 2018 at 21:43, "藤秀義" wrote:
>Dear Alfredo,
>
>Although not strictly based on the number of atoms, but on the length of
>SMILES string, the simplest way is using Python built-in function
Dear Alfredo,
Although it relies on the length of the SMILES string rather than strictly
on the number of atoms, the simplest way is to use Python built-in
functions as follows:
smiles = 'CCC.CC'
fragment = max(smiles.split('.'), key=len)
print(fragment)
Best regards,
Hideyoshi
thank you Paolo for this help, I will study the code and try it,
best regards
Alfredo
Sent from BlueMail
On 28 June 2018 at 17:08, Paolo Tosco wrote:
>Dear Alfredo,
>
>if you wish to keep only the largest disconnected fragment you may try
>the following:
>
>mols = list(
Dear Alfredo,
if you wish to keep only the largest disconnected fragment you may try
the following:
mols = list(rdmolops.GetMolFrags(mol, asMols = True))
if (mols):
    mols.sort(reverse = True, key = lambda m: m.GetNumAtoms())
    mol = mols[0]
Hope that helps, cheers
p.
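For reuse, the same idea can be wrapped in a small function. This is a sketch under two assumptions that go beyond the snippet above: it ranks fragments by heavy atoms (GetNumHeavyAtoms) rather than all atoms, and it uses max() instead of sorting. The name keep_largest_fragment is made up for this example.

```python
from rdkit import Chem


def keep_largest_fragment(mol):
    """Return the disconnected fragment of `mol` with the most heavy atoms."""
    frags = Chem.GetMolFrags(mol, asMols=True)
    if not frags:
        # Nothing to choose from (e.g. an empty molecule): return it as is.
        return mol
    # max() keeps the first fragment when several tie on heavy-atom count.
    return max(frags, key=lambda m: m.GetNumHeavyAtoms())


mol = Chem.MolFromSmiles('CCC.CC')
largest = keep_largest_fragment(mol)
print(Chem.MolToSmiles(largest))
```

As noted elsewhere in this thread, the largest fragment is not always the one of interest, so it may be worth logging what gets discarded rather than dropping it silently.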
On 06/28/18 19:38,
Good afternoon,
I would like to filter out small fragments from a list of molecules
using the below strategy:
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem import SaltRemover
remover = SaltRemover.SaltRemover()
mol = Chem.MolFromSmiles('CCC.CC')
res = remover.StripMol(mol)
p