Re: [Rdkit-discuss] Check for Heavy Isotopes using RdKit

2017-01-18 Thread Milinda Samaraweera
Yes, "most common" is the correct term.

Thanks,
Milinda

On Wed, Jan 18, 2017 at 5:49 PM, Peter S. Shenkin <shen...@gmail.com> wrote:

> You say "most stable", but I think you mean "most common." 2H is as stable
> as 1H, but less common.
>
> -P.
>
> On Wed, Jan 18, 2017 at 5:01 PM, Milinda Samaraweera <
> milindaatw...@gmail.com> wrote:
>
>> Hi Bob,
>>
>> I am trying to filter out any compound that is not in its most stable
>> isotopic form (i.e. anything containing isotopes other than 12C, 1H, 14N,
>> 16O, 31P, 32S), so that only monoisotopic compounds remain.
>>
>> Thanks,
>> Milinda
>>
>>
>> 
>> --
>> Check out the vibrant tech community on one of the world's most
>> engaging tech sites, SlashDot.org! http://sdm.link/slashdot
>> ___
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>>
>


-- 
Milinda Samaraweera, Ph.D.
Postdoctoral Fellow, Department of Pharmacy
University of Connecticut
69 North Eagleville road
Storrs, CT, 06269
milindaatw...@gmail.com
860-617-8046


Re: [Rdkit-discuss] Check for Heavy Isotopes using RdKit

2017-01-18 Thread Milinda Samaraweera
Hi Bob,

I am trying to filter out any compound that is not in its most stable
isotopic form (i.e. anything containing isotopes other than 12C, 1H, 14N,
16O, 31P, 32S), so that only monoisotopic compounds remain.

Thanks,
Milinda
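
The request above (keep only compounds whose labelled atoms are the most common isotope of their element) could be sketched roughly as follows. This is a minimal illustration, not something posted in the thread: the helper name `is_monoisotopic` is hypothetical, and it assumes RDKit's periodic table is an acceptable source for the most common isotope of each element.

```python
from rdkit import Chem


def is_monoisotopic(mol):
    """Return True if every labelled atom carries its element's most common isotope."""
    table = Chem.GetPeriodicTable()
    for atom in mol.GetAtoms():
        isotope = atom.GetIsotope()  # 0 means no isotope label was given
        if isotope and isotope != table.GetMostCommonIsotope(atom.GetAtomicNum()):
            return False
    return True


for smiles in ("CCN", "[12CH3]C", "CC[15NH2]", "[2H]OC"):
    print(smiles, is_monoisotopic(Chem.MolFromSmiles(smiles)))
```

Unlabelled atoms are treated as the common isotope, so an explicit 12C label passes while 13C, 15N or 2H does not.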


Re: [Rdkit-discuss] Check for Heavy Isotopes using RdKit

2017-01-18 Thread Milinda Samaraweera
Nik,

That works too...

Thanks
Milinda

On Wed, Jan 18, 2017 at 3:08 PM, Stiefl, Nikolaus <
nikolaus.sti...@novartis.com> wrote:

> Hi
>
> This may be much less efficient, but if you need it for specific isotopes
> you could use a SMARTS pattern and check for matches:
>
>
>
> In [20]: q = Chem.MolFromSmarts("[13C,14C,2H,3H,15N,24P,46P,33S,34S,36S]")
>
> In [21]: m = Chem.MolFromSmiles('CC[15NH2]')
>
> In [22]: m.HasSubstructMatch(q)
> Out[22]: True
>
>
>
>
>
> So you could loop over your molecules and remove the ones that match the
> SMARTS.
>
> Ciao
>
> Nik
>
>
>
>
>
> From: Milinda Samaraweera <milindaatw...@gmail.com>
> Date: Wednesday 18 January 2017 at 20:47
> To: Greg Landrum <greg.land...@gmail.com>
> Cc: RDKit Discuss <rdkit-discuss@lists.sourceforge.net>
> Subject: Re: [Rdkit-discuss] Check for Heavy Isotopes using RdKit
>
>
>
> Greg,
>
> I am looking to remove entries that contain un-stable isotopes of elements
> CHNOPS (e.g. heavy_isotopes =['13C', '14C', '2H', '3H', '15N', '24P',
> '46P', '33S', '34S', '36S'] ). Is there a way to modify the above code to
> achieve that?
>
> Thanks,
>
> Milinda
>
>
>
>
>
> On Wed, Jan 18, 2017 at 11:16 AM, Greg Landrum <greg.land...@gmail.com>
> wrote:
>
> Hi Milinda,
>
>
>
> Here's an approach that finds all the atoms that have an isotope specified:
>
>
>
> In [1]: from rdkit import Chem
>
> In [2]: from rdkit.Chem import rdqueries
>
> In [3]: q = rdqueries.IsotopeGreaterQueryAtom(1)
>
> In [7]: list(x.GetIdx() for x in Chem.MolFromSmiles('CC[13CH3]').GetAtomsMatchingQuery(q))
> Out[7]: [2]
>
> In [8]: list(x.GetIdx() for x in Chem.MolFromSmiles('[12CH3]CC[13CH3]').GetAtomsMatchingQuery(q))
> Out[8]: [0, 3]
>
>
>
> Does that do what you want it to do?
>
>
>
> -greg
>
>
>
>
>
>
>
> On Wed, Jan 18, 2017 at 3:56 PM, Milinda Samaraweera <
> milindaatw...@gmail.com> wrote:
>
> Dear Experts,
>
> I am trying to figure out a way to exclude entries which contain heavy
> isotopes (13C, 2H, 3H, etc.) from an SD file (which has close to two
> thousand entries) and write an updated file with the remaining entries.
>
> I do understand how to read/write SD files using rdkit.
>
> What I do not understand is how to detect entries with heavy isotopes. Is
> there an efficient and correct way of achieving this using rdkit?
>
>
>
> thanks,
>
> --
>
> Milinda Samaraweera
>
>
>
> 
>
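
Nik's suggestion quoted above (loop over the molecules and drop those that match the SMARTS) might look like the sketch below. The SMARTS string is copied from his message; the function name `drop_heavy` and the test SMILES are illustrative assumptions. How the `2H` and `3H` entries are interpreted inside a bracket list has not been checked here, so deuterium and tritium may need a separate test.

```python
from rdkit import Chem

# SMARTS copied from the quoted message: one query atom listing the unwanted labels.
HEAVY = Chem.MolFromSmarts("[13C,14C,2H,3H,15N,24P,46P,33S,34S,36S]")


def drop_heavy(mols):
    """Keep only the molecules that do not match the heavy-isotope pattern."""
    return [m for m in mols if m is not None and not m.HasSubstructMatch(HEAVY)]


mols = [Chem.MolFromSmiles(s) for s in ("CCO", "CC[15NH2]", "CC[13CH3]")]
kept = drop_heavy(mols)
print([Chem.MolToSmiles(m) for m in kept])
```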





Re: [Rdkit-discuss] Check for Heavy Isotopes using RdKit

2017-01-18 Thread Milinda Samaraweera
Greg,

I am looking to remove entries that contain un-stable isotopes of elements
CHNOPS (e.g. heavy_isotopes =['13C', '14C', '2H', '3H', '15N', '24P',
'46P', '33S', '34S', '36S'] ). Is there a way to modify the above code to
achieve that?

Thanks,
Milinda



On Wed, Jan 18, 2017 at 11:16 AM, Greg Landrum <greg.land...@gmail.com>
wrote:

> Hi Milinda,
>
> Here's an approach that finds all the atoms that have an isotope specified:
>
> In [1]: from rdkit import Chem
>
> In [2]: from rdkit.Chem import rdqueries
>
> In [3]: q = rdqueries.IsotopeGreaterQueryAtom(1)
>
> In [7]: list(x.GetIdx() for x in Chem.MolFromSmiles('CC[13CH3]').GetAtomsMatchingQuery(q))
> Out[7]: [2]
>
> In [8]: list(x.GetIdx() for x in Chem.MolFromSmiles('[12CH3]CC[13CH3]').GetAtomsMatchingQuery(q))
> Out[8]: [0, 3]
>
> Does that do what you want it to do?
>
> -greg
>
>
>
> On Wed, Jan 18, 2017 at 3:56 PM, Milinda Samaraweera <
> milindaatw...@gmail.com> wrote:
>
>> Dear Experts,
>>
>> I am trying to figure out a way to exclude entries which contain heavy
>> isotopes (13C, 2H, 3H, etc.) from an SD file (which has close to two
>> thousand entries) and write an updated file with the remaining entries.
>>
>> I do understand how to read/write SD files using rdkit.
>>
>> What I do not understand is how to detect entries with heavy isotopes. Is
>> there an efficient and correct way of achieving this using rdkit?
>>
>> thanks,
>> --
>> Milinda Samaraweera
>>
>> 
>>
>>
>
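
One possible way to adapt Greg's query, quoted above, so that an explicit 12C label is no longer flagged is to post-filter the matched atoms. This is only a sketch: the `COMMON` table and the helper name `heavy_atom_indices` are assumptions made for illustration, built from the isotopes named in the thread (12C, 1H, 14N, 16O, 31P, 32S). Labelled atoms of any element missing from that table are reported as well.

```python
from rdkit import Chem
from rdkit.Chem import rdqueries

# Most common isotopes as listed in the thread; extend as needed.
COMMON = {"C": 12, "H": 1, "N": 14, "O": 16, "P": 31, "S": 32}

# Same query as in the quoted message: atoms whose isotope label is greater than 1.
LABELLED = rdqueries.IsotopeGreaterQueryAtom(1)


def heavy_atom_indices(mol):
    """Indices of labelled atoms whose isotope is not the common one for that element."""
    return [
        atom.GetIdx()
        for atom in mol.GetAtomsMatchingQuery(LABELLED)
        if COMMON.get(atom.GetSymbol()) != atom.GetIsotope()
    ]


print(heavy_atom_indices(Chem.MolFromSmiles("[12CH3]CC[13CH3]")))
```

A molecule can then be rejected whenever the returned list is non-empty.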




[Rdkit-discuss] Check for Heavy Isotopes using RdKit

2017-01-18 Thread Milinda Samaraweera
Dear Experts,

I am trying to figure out a way to exclude entries which contain heavy
isotopes (13C, 2H, 3H, etc.) from an SD file (which has close to two
thousand entries) and write an updated file with the remaining entries.

I do understand how to read/write SD files using rdkit.

What I do not understand is how to detect entries with heavy isotopes. Is
there an efficient and correct way of achieving this using rdkit?

thanks,
-- 
Milinda Samaraweera
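
For the read, filter and write workflow described above, a minimal sketch could look like this. The function names and the default rule are hypothetical: the default rejects any molecule with an explicit isotope label at all, which would also reject an explicit 12C, so a stricter or looser predicate can be passed as `reject`.

```python
from rdkit import Chem


def has_isotope_label(mol):
    """True if any atom has an explicit isotope (GetIsotope() is 0 when unlabelled)."""
    return any(atom.GetIsotope() != 0 for atom in mol.GetAtoms())


def filter_sdf(in_path, out_path, reject=has_isotope_label):
    """Copy records from in_path to out_path, skipping unreadable or rejected ones."""
    kept = 0
    writer = Chem.SDWriter(out_path)
    for mol in Chem.SDMolSupplier(in_path):
        # The supplier yields None for records it could not parse or sanitize.
        if mol is None or reject(mol):
            continue
        writer.write(mol)
        kept += 1
    writer.close()
    return kept
```

Calling `filter_sdf("in.sdf", "out.sdf")` (both paths are placeholders) returns the number of records written.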


[Rdkit-discuss] SD file read error

2017-01-11 Thread Milinda Samaraweera
Dear Experts,

I am trying to read the attached SD file (downloaded from HMDB) and
calculate the exact mass of each entry:

structures.sdf
<https://drive.google.com/file/d/0B3AmIbK_SzZhdGY3NVgyMDJiQjA/view?usp=drive_web>

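from rdkit import Chem
from rdkit.Chem import Descriptors

# input_file: path to the SD file (e.g. the attached structures.sdf)
suppl = Chem.SDMolSupplier(input_file)

low_mass = 50
high_mass = 1000

ms = []

for mol in suppl:
    # Records that could not be parsed or sanitized come back as None.
    if mol is None:
        continue

    try:
        # Compute the exact mass once and keep entries inside the mass window.
        mass = round(Descriptors.ExactMolWt(mol), 4)
        if low_mass <= mass <= high_mass:
            ms.append(mol)
    except Exception:
        pass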

Running the script produced a barrage of errors such as:

[13:15:14] ERROR: Could not sanitize molecule ending on line 1993855
[13:15:14] ERROR: Explicit valence for atom # 9 O, 3, is greater than
permitted
[13:15:14] Explicit valence for atom # 9 O, 3, is greater than permitted
[13:15:14] ERROR: Could not sanitize molecule ending on line 1994014
[13:15:14] ERROR: Explicit valence for atom # 9 O, 3, is greater than
permitted
[13:15:14] Explicit valence for atom # 9 O, 3, is greater than permitted
[13:15:14] ERROR: Could not sanitize molecule ending on line 1996036
[13:15:14] ERROR: Explicit valence for atom # 9 O, 3, is greater than
permitted
[13:15:16] Explicit valence for atom # 46 N, 4, is greater than permitted
[13:15:16] ERROR: Could not sanitize molecule ending on line 2302532
[13:15:16] ERROR: Explicit valence for atom # 46 N, 4, is greater than
permitte
[13:15:16] Explicit valence for atom # 16 N, 4, is greater than permitted
[13:15:16] ERROR: Could not sanitize molecule ending on line 2302918
[13:15:16] ERROR: Explicit valence for atom # 16 N, 4, is greater than
permitte
[13:15:17] Explicit valence for atom # 11 N, 4, is greater than permitted
[13:15:17] ERROR: Could not sanitize molecule ending on line 2556541
[13:15:17] ERROR: Explicit valence for atom # 11 N, 4, is greater than
permitte
[13:15:18]  S group SUP ignored on line 2836416
[13:15:18] Explicit valence for atom # 1 Cl, 4, is greater than permitted
[13:15:18] ERROR: Could not sanitize molecule ending on line 2841449
[13:15:18] ERROR: Explicit valence for atom # 1 Cl, 4, is greater than
permitte
[13:15:19] Warning: conflicting stereochemistry at atom 10 ignored.
[13:15:19] Warning: conflicting stereochemistry at atom 10 ignored.
[13:15:19] Warning: conflicting stereochemistry at atom 17 ignored.
[13:15:19] Warning: conflicting stereochemistry at atom 17 ignored.
[13:15:19] Explicit valence for atom # 3 B, 4, is greater than permitted
[13:15:19] ERROR: Could not sanitize molecule ending on line 3107498
[13:15:19] ERROR: Explicit valence for atom # 3 B, 4, is greater than
permitted
[13:15:19] Warning: conflicting stereochemistry at atom 6 ignored.
[13:15:19] Warning: conflicting stereochemistry at atom 6 ignored.
[13:15:20]  Unhandled CTAB feature: S group SRU on line: 3205922. Molecule
skip
[13:15:20] Explicit valence for atom # 0 Mg, 4, is greater than permitted
[13:15:20] ERROR: Could not sanitize molecule ending on line 3222378
[13:15:20] ERROR: Explicit valence for atom # 0 Mg, 4, is greater than
permitte
[13:15:20] Explicit valence for atom # 2 N, 4, is greater than permitted
[13:15:20] ERROR: Could not sanitize molecule ending on line 3265386
[13:15:20] ERROR: Explicit valence for atom # 2 N, 4, is greater than
permitted
[13:15:20] Explicit valence for atom # 31 N, 4, is greater than permitted
[13:15:20] ERROR: Could not sanitize molecule ending on line 3305754
[13:15:20] ERROR: Explicit valence for atom # 31 N, 4, is greater than
permitte
[13:15:21] Explicit valence for atom # 45 N, 4, is greater than permitted
[13:15:21] ERROR: Could not sanitize molecule ending on line 3437055
[13:15:21] ERROR: Explicit valence for atom # 45 N, 4, is greater than
permitte
[13:15:56] Explicit valence for atom # 3 C, 5, is greater than permitted
[13:15:56] ERROR: Could not sanitize molecule ending on line 8391489
[13:15:56] ERROR: Explicit valence for atom # 3 C, 5, is greater than
permitted

What causes these errors? Is there a way to suppress or resolve them, or to
stop them being printed to the command prompt?
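
Messages of this kind are emitted when a record fails sanitization, for example when an atom has more bonds than its element and charge allow; the reader then returns None for that record. A minimal sketch of silencing the output follows. The logger names are RDKit's error and warning channels, and the SMILES strings are illustrative stand-ins, not records taken from the HMDB file.

```python
from rdkit import Chem, RDLogger

RDLogger.DisableLog("rdApp.error")    # silence "Could not sanitize molecule" messages
RDLogger.DisableLog("rdApp.warning")  # silence stereochemistry warnings

# A neutral nitrogen with four bonds fails sanitization, so None is returned.
bad = Chem.MolFromSmiles("CN(C)(C)C")
# The same skeleton with a positive charge on nitrogen is accepted.
good = Chem.MolFromSmiles("C[N+](C)(C)C")
print(bad is None, good is not None)
```

Silencing the log does not repair the records; they are still skipped by the `if mol is None` check.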

-- 
Thanks,
Milinda Samaraweera,


Re: [Rdkit-discuss] SDwriter

2016-12-16 Thread Milinda Samaraweera
This SD file is then used as input for another program, and that program has
problems reading the sequence numbers.

Thanks,
MAK

On Fri, Dec 16, 2016 at 10:43 PM, Greg Landrum <greg.land...@gmail.com>
wrote:

> It's easy enough to make this an option, but given that it is part of the
> SDF spec (as Andrew has pointed out) the only reason I can think of to do
> so would be because it causes problems for some other piece of (likely
> commonly used) software.
>
> Are the sequence numbers causing a problem for you?
>
> -greg
>
>
>
>
>
>
> On Sat, Dec 17, 2016 at 1:46 AM +0100, "Milinda Samaraweera" <
> milindaatw...@gmail.com> wrote:
>
> Dear Users,
>>
>> I was using the SDWriter in RDKit to generate an SD file with
>> multiple entries built from SMILES, and then assigning SD tag data (e.g.
>> pubchem_ID, IUPAC_name, etc).
>>
>> However, at the end of each tag header I noticed there is a number
>> (bolded):
>>
>> ...
>> >   * (1) *
>> N1-(2-ethylbutyl)hexane-1,3,6-triamine
>>
>> >*(1) *
>> 118903148
>>
>> ...
>> M  END
>> >   * (2)*
>> N1,N2-dimethyl-N2-[3-(methylamino)propyl]-N1-propylpropane-1,2-diamine
>>
>> >   * (2) *
>> 118883401
>>
>> What is this number, and how can I avoid printing it when SDWriter
>> is used? This number is not found in standard SD files.
>>
>> Thanks,
>> CodeMAK
>>
>>
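
If the downstream program cannot cope with the numbers, a possible workaround (not proposed in the thread) is to strip them from the SD text after it has been written. The sketch below uses only the standard library; the function name, the regular expression and the sample record are assumptions for illustration, and the header layout is assumed to be `>  <TAG>  (n)`.

```python
import re

# Matches a data header such as ">  <pubchem_ID>  (1) " and captures the part
# before the number.
_HEADER_WITH_NUMBER = re.compile(
    r"^(>[ \t]*<[^>\n]+>)[ \t]*\(\d+\)[ \t]*$", re.MULTILINE
)


def strip_record_numbers(sd_text):
    """Remove the trailing "(n)" record number from every SD data header line."""
    return _HEADER_WITH_NUMBER.sub(r"\1", sd_text)


sample = "M  END\n>  <pubchem_ID>  (1) \n118903148\n\n$$$$\n"
print(strip_record_numbers(sample))
```

Only lines that start with `>` and a tag in angle brackets are touched, so data values that happen to contain a number in parentheses are left alone.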


-- 
Milinda Samaraweera, Ph.D.
Postdoctoral Fellow, Department of Pharmacy
University of Connecticut
69 North Eagleville road
Storrs, CT, 06269
milindaatw...@gmail.com
860-617-8594