Re: [Rdkit-discuss] Check for Heavy Isotopes using RdKit
Yes, most common should be the correct term.

Thanks,
Milinda

On Wed, Jan 18, 2017 at 5:49 PM, Peter S. Shenkin <shen...@gmail.com> wrote:
> You say "most stable", but I think you mean "most common." 2H is as stable
> as 1H, but less common.
>
> -P.
>
> On Wed, Jan 18, 2017 at 5:01 PM, Milinda Samaraweera <milindaatw...@gmail.com> wrote:
>
>> Hi Bob,
>>
>> I am trying to filter out any compound that does not have the most stable
>> isotopic form; (anything other than: 12C,1H,14N,16O, 31P, 32S) or to
>> contain only MonoIsotopic compounds.
>>
>> Thanks,
>> Milinda

--
Milinda Samaraweera, Ph.D.
Postdoctoral Fellow, Department of Pharmacy
University of Connecticut
69 North Eagleville road
Storrs, CT, 06269
milindaatw...@gmail.com
860-617-8046

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
Re: [Rdkit-discuss] Check for Heavy Isotopes using RdKit
Hi Bob,

I am trying to filter out any compound that does not have the most stable isotopic form (anything other than 12C, 1H, 14N, 16O, 31P, 32S), or, put differently, to keep only monoisotopic compounds.

Thanks,
Milinda
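One way to express the "only 12C, 1H, 14N, 16O, 31P, 32S" filter described above is to compare each atom's isotope label against a small lookup table. This is a minimal sketch rather than a tested recipe: the table `MOST_COMMON` and the function name `is_monoisotopic` are made up for illustration, and it assumes that an atom whose `GetIsotope()` returns 0 simply has no isotope specified. Labelled atoms of elements outside the table are rejected, which is an arbitrary policy choice.

```python
# Mass number of the isotope to accept for each element, keyed by atomic
# number. The values are the ones listed in the message above.
MOST_COMMON = {1: 1, 6: 12, 7: 14, 8: 16, 15: 31, 16: 32}


def is_monoisotopic(mol, allowed=MOST_COMMON):
    """Return True if every atom is unlabelled or carries an allowed isotope.

    `mol` is expected to behave like an RDKit Mol: GetAtoms() yields atoms
    that provide GetAtomicNum() and GetIsotope().
    """
    for atom in mol.GetAtoms():
        isotope = atom.GetIsotope()
        if isotope == 0:
            # No isotope specified on this atom (assumption: treat as fine).
            continue
        if allowed.get(atom.GetAtomicNum()) != isotope:
            # Either a different isotope of a listed element, or a labelled
            # atom of an element that is not in the table at all.
            return False
    return True
```

With RDKit installed, a call such as `is_monoisotopic(Chem.MolFromSmiles('CC[15NH2]'))` would be the way to try it on the 15N-labelled example used elsewhere in this thread.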
Re: [Rdkit-discuss] Check for Heavy Isotopes using RdKit
Nik,

That works too...

Thanks,
Milinda

On Wed, Jan 18, 2017 at 3:08 PM, Stiefl, Nikolaus <nikolaus.sti...@novartis.com> wrote:
> Hi
>
> Maybe this is much less efficient but I guess if you need it for specific
> isotopes then you could try using a smarts pattern and check for that?
>
> In [20]: q = Chem.MolFromSmarts("[13C,14C,2H,3H,15N,24P,46P,33S,34S,36S]")
>
> In [21]: m = Chem.MolFromSmiles('CC[15NH2]')
>
> In [22]: m.HasSubstructMatch(q)
> Out[22]: True
>
> So you could loop over your molecules and then remove the ones that match
> the smarts.
>
> Ciao
> Nik
>
> From: Milinda Samaraweera <milindaatw...@gmail.com>
> Date: Wednesday 18 January 2017 at 20:47
> To: Greg Landrum <greg.land...@gmail.com>
> Cc: RDKit Discuss <rdkit-discuss@lists.sourceforge.net>
> Subject: Re: [Rdkit-discuss] Check for Heavy Isotopes using RdKit
>
> Greg,
>
> I am looking to remove entries that contain un-stable isotopes of elements
> CHNOPS (e.g. heavy_isotopes = ['13C', '14C', '2H', '3H', '15N', '24P',
> '46P', '33S', '34S', '36S']). Is there a way to modify the above code to
> achieve that?
>
> Thanks,
> Milinda
>
> On Wed, Jan 18, 2017 at 11:16 AM, Greg Landrum <greg.land...@gmail.com> wrote:
>
> Hi Milinda,
>
> Here's an approach that finds all the atoms that have an isotope specified:
>
> In [1]: from rdkit import Chem
>
> In [2]: from rdkit.Chem import rdqueries
>
> In [3]: q = rdqueries.IsotopeGreaterQueryAtom(1)
>
> In [7]: list(x.GetIdx() for x in Chem.MolFromSmiles('CC[13CH3]').GetAtomsMatchingQuery(q))
> Out[7]: [2]
>
> In [8]: list(x.GetIdx() for x in Chem.MolFromSmiles('[12CH3]CC[13CH3]').GetAtomsMatchingQuery(q))
> Out[8]: [0, 3]
>
> Does that do what you want it to do?
>
> -greg
>
> On Wed, Jan 18, 2017 at 3:56 PM, Milinda Samaraweera <milindaatw...@gmail.com> wrote:
>
> Dear Experts,
>
> I am trying to figure out a way to exclude entries which contain heavy
> atoms (13C, 2H, 3H, etc), from a SD file (which has close to two thousand
> entries) and write an updated file with the remaining entries.
>
> I do understand how to read/write SD files using rdkit.
>
> What I do not understand is how to detect entries with heavy isotopes: Is
> there an efficient and correct way of achieving this using rdkit?
>
> thanks,
> --
> Milinda Samaraweera
Re: [Rdkit-discuss] Check for Heavy Isotopes using RdKit
Greg,

I am looking to remove entries that contain un-stable isotopes of elements CHNOPS (e.g. heavy_isotopes = ['13C', '14C', '2H', '3H', '15N', '24P', '46P', '33S', '34S', '36S']). Is there a way to modify the above code to achieve that?

Thanks,
Milinda

On Wed, Jan 18, 2017 at 11:16 AM, Greg Landrum <greg.land...@gmail.com> wrote:
> Hi Milinda,
>
> Here's an approach that finds all the atoms that have an isotope specified:
>
> In [1]: from rdkit import Chem
>
> In [2]: from rdkit.Chem import rdqueries
>
> In [3]: q = rdqueries.IsotopeGreaterQueryAtom(1)
>
> In [7]: list(x.GetIdx() for x in Chem.MolFromSmiles('CC[13CH3]').GetAtomsMatchingQuery(q))
> Out[7]: [2]
>
> In [8]: list(x.GetIdx() for x in Chem.MolFromSmiles('[12CH3]CC[13CH3]').GetAtomsMatchingQuery(q))
> Out[8]: [0, 3]
>
> Does that do what you want it to do?
>
> -greg
>
> On Wed, Jan 18, 2017 at 3:56 PM, Milinda Samaraweera <milindaatw...@gmail.com> wrote:
>
>> Dear Experts,
>>
>> I am trying to figure out a way to exclude entries which contain heavy
>> atoms (13C, 2H, 3H, etc), from a SD file (which has close to two thousand
>> entries) and write an updated file with the remaining entries.
>>
>> I do understand how to read/write SD files using rdkit.
>>
>> What I do not understand is how to detect entries with heavy isotopes: Is
>> there an efficient and correct way of achieving this using rdkit?
>>
>> thanks,
>> --
>> Milinda Samaraweera
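For a specific list of labels like the `heavy_isotopes` list in this message, one option that avoids writing a query at all is to turn each label into a (symbol, mass number) pair and compare it with what each atom reports. The sketch below is only an illustration: `parse_isotope_labels` and `contains_listed_isotope` are hypothetical helper names, and the only atom methods it relies on are `GetSymbol()` and `GetIsotope()`.

```python
import re

# The list from the message above: "<mass number><element symbol>" strings.
HEAVY_ISOTOPES = ['13C', '14C', '2H', '3H', '15N', '24P', '46P',
                  '33S', '34S', '36S']

_LABEL = re.compile(r"(\d+)([A-Z][a-z]?)")


def parse_isotope_labels(labels):
    """Turn strings such as '13C' into a set of (symbol, mass_number) pairs."""
    pairs = set()
    for label in labels:
        match = _LABEL.fullmatch(label)
        if match is None:
            raise ValueError(f"not an isotope label: {label!r}")
        pairs.add((match.group(2), int(match.group(1))))
    return pairs


def contains_listed_isotope(mol, pairs):
    """Return True if any atom's (symbol, isotope) pair is in `pairs`.

    `mol` is expected to behave like an RDKit Mol: GetAtoms() yields atoms
    that provide GetSymbol() and GetIsotope().
    """
    return any((atom.GetSymbol(), atom.GetIsotope()) in pairs
               for atom in mol.GetAtoms())
```

Parsing the labels once with `pairs = parse_isotope_labels(HEAVY_ISOTOPES)` and reusing the set keeps the per-molecule check to a plain set lookup per atom.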
[Rdkit-discuss] Check for Heavy Isotopes using RdKit
Dear Experts,

I am trying to figure out a way to exclude entries which contain heavy atoms (13C, 2H, 3H, etc.) from an SD file (which has close to two thousand entries) and write an updated file with the remaining entries.

I do understand how to read/write SD files using rdkit.

What I do not understand is how to detect entries with heavy isotopes. Is there an efficient and correct way of achieving this using rdkit?

thanks,
--
Milinda Samaraweera
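The read, test, write loop asked about here could look roughly like the following. It is a sketch under assumptions, not a definitive answer: `has_isotope_label`, `keep_molecules` and `filter_sdf` are hypothetical names, the default test rejects any atom with an explicit isotope label (so an explicitly written 12C would be rejected too), and records that fail to parse are silently dropped. RDKit is imported inside `filter_sdf` only so that the two small helpers can be exercised without it.

```python
def has_isotope_label(mol):
    """Return True if any atom carries an explicit isotope label."""
    # GetIsotope() returns 0 when no isotope was specified for the atom.
    return any(atom.GetIsotope() != 0 for atom in mol.GetAtoms())


def keep_molecules(mols, reject=has_isotope_label):
    """Yield the molecules that parsed and are not rejected."""
    for mol in mols:
        if mol is None:
            # A supplier hands back None for records it could not read.
            continue
        if not reject(mol):
            yield mol


def filter_sdf(in_path, out_path, reject=has_isotope_label):
    """Copy the accepted records of in_path to out_path; return the count."""
    from rdkit import Chem  # local import: the helpers above do not need it

    writer = Chem.SDWriter(out_path)
    kept = 0
    try:
        for mol in keep_molecules(Chem.SDMolSupplier(in_path), reject):
            writer.write(mol)
            kept += 1
    finally:
        writer.close()
    return kept
```

Any other predicate that takes a molecule and returns True for "drop this entry" can be passed as `reject`, for example a substructure test against a query.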
[Rdkit-discuss] SD file read error
Dear Experts,

I was trying to read in the attached SD file (downloaded from HMDB) and trying to calculate the exact mass of each entry:

structures.sdf <https://drive.google.com/file/d/0B3AmIbK_SzZhdGY3NVgyMDJiQjA/view?usp=drive_web>

    from rdkit import Chem
    from rdkit.Chem import Descriptors

    suppl = Chem.SDMolSupplier(input_file)
    low_mass = 50
    high_mass = 1000
    ms = []
    for mol in suppl:
        if mol is None:
            continue
        try:
            if mol and round(Descriptors.ExactMolWt(mol), 4) >= low_mass and round(Descriptors.ExactMolWt(mol), 4) <= high_mass:
                ms.append(mol)
        except:
            pass

By running the script, I got a barrage of errors such as:

[13:15:14] ERROR: Could not sanitize molecule ending on line 1993855
[13:15:14] ERROR: Explicit valence for atom # 9 O, 3, is greater than permitted
[13:15:14] Explicit valence for atom # 9 O, 3, is greater than permitted
[13:15:14] ERROR: Could not sanitize molecule ending on line 1994014
[13:15:14] ERROR: Explicit valence for atom # 9 O, 3, is greater than permitted
[13:15:14] Explicit valence for atom # 9 O, 3, is greater than permitted
[13:15:14] ERROR: Could not sanitize molecule ending on line 1996036
[13:15:14] ERROR: Explicit valence for atom # 9 O, 3, is greater than permitted
[13:15:16] Explicit valence for atom # 46 N, 4, is greater than permitted
[13:15:16] ERROR: Could not sanitize molecule ending on line 2302532
[13:15:16] ERROR: Explicit valence for atom # 46 N, 4, is greater than permitte
[13:15:16] Explicit valence for atom # 16 N, 4, is greater than permitted
[13:15:16] ERROR: Could not sanitize molecule ending on line 2302918
[13:15:16] ERROR: Explicit valence for atom # 16 N, 4, is greater than permitte
[13:15:17] Explicit valence for atom # 11 N, 4, is greater than permitted
[13:15:17] ERROR: Could not sanitize molecule ending on line 2556541
[13:15:17] ERROR: Explicit valence for atom # 11 N, 4, is greater than permitte
[13:15:18] S group SUP ignored on line 2836416
[13:15:18] Explicit valence for atom # 1 Cl, 4, is greater than permitted
[13:15:18] ERROR: Could not sanitize molecule ending on line 2841449
[13:15:18] ERROR: Explicit valence for atom # 1 Cl, 4, is greater than permitte
[13:15:19] Warning: conflicting stereochemistry at atom 10 ignored.
[13:15:19] Warning: conflicting stereochemistry at atom 10 ignored.
[13:15:19] Warning: conflicting stereochemistry at atom 17 ignored.
[13:15:19] Warning: conflicting stereochemistry at atom 17 ignored.
[13:15:19] Explicit valence for atom # 3 B, 4, is greater than permitted
[13:15:19] ERROR: Could not sanitize molecule ending on line 3107498
[13:15:19] ERROR: Explicit valence for atom # 3 B, 4, is greater than permitted
[13:15:19] Warning: conflicting stereochemistry at atom 6 ignored.
[13:15:19] Warning: conflicting stereochemistry at atom 6 ignored.
[13:15:20] Unhandled CTAB feature: S group SRU on line: 3205922. Molecule skip
[13:15:20] Explicit valence for atom # 0 Mg, 4, is greater than permitted
[13:15:20] ERROR: Could not sanitize molecule ending on line 3222378
[13:15:20] ERROR: Explicit valence for atom # 0 Mg, 4, is greater than permitte
[13:15:20] Explicit valence for atom # 2 N, 4, is greater than permitted
[13:15:20] ERROR: Could not sanitize molecule ending on line 3265386
[13:15:20] ERROR: Explicit valence for atom # 2 N, 4, is greater than permitted
[13:15:20] Explicit valence for atom # 31 N, 4, is greater than permitted
[13:15:20] ERROR: Could not sanitize molecule ending on line 3305754
[13:15:20] ERROR: Explicit valence for atom # 31 N, 4, is greater than permitte
[13:15:21] Explicit valence for atom # 45 N, 4, is greater than permitted
[13:15:21] ERROR: Could not sanitize molecule ending on line 3437055
[13:15:21] ERROR: Explicit valence for atom # 45 N, 4, is greater than permitte
[13:15:56] Explicit valence for atom # 3 C, 5, is greater than permitted
[13:15:56] ERROR: Could not sanitize molecule ending on line 8391489
[13:15:56] ERROR: Explicit valence for atom # 3 C, 5, is greater than permitted

What causes these errors? Is there a way to suppress or solve the errors, or a way to stop printing them in the command prompt?

--
Thanks,
Milinda Samaraweera
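The thread as archived here has no reply to this question, so what follows is only one possible way to approach the "stop printing them" part, offered as an unverified sketch. It assumes that turning off RDKit's logger with `RDLogger.DisableLog('rdApp.*')` is acceptable, and it counts the records that come back as `None` instead of letting them scroll past. `in_mass_window` and `load_in_window` are hypothetical names; the mass window and the rounding to four decimals are taken from the script above.

```python
def in_mass_window(mass, low=50.0, high=1000.0, ndigits=4):
    """Round once, then test the inclusive window used in the script above."""
    rounded = round(mass, ndigits)
    return low <= rounded <= high


def load_in_window(sdf_path, low=50.0, high=1000.0, quiet=True):
    """Return (molecules inside the mass window, number of unreadable records)."""
    # Local imports so that in_mass_window can be used without RDKit.
    from rdkit import Chem, RDLogger
    from rdkit.Chem import Descriptors

    if quiet:
        # Switches off RDKit's log messages for this process.
        RDLogger.DisableLog('rdApp.*')

    kept, failed = [], 0
    for mol in Chem.SDMolSupplier(sdf_path):
        if mol is None:
            # The record could not be turned into a molecule.
            failed += 1
            continue
        if in_mass_window(Descriptors.ExactMolWt(mol), low, high):
            kept.append(mol)
    return kept, failed
```

Silencing the log does not repair the offending records; the `failed` count is there so that skipped entries are at least visible in the result.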
Re: [Rdkit-discuss] SDwriter
This SD file is then used as an input for another program, and that program is having problems reading the sequence numbers.

Thanks,
MAK

On Fri, Dec 16, 2016 at 10:43 PM, Greg Landrum <greg.land...@gmail.com> wrote:
> It's easy enough to make this an option, but given that it is part of the
> SDF spec (as Andrew has pointed out) the only reason I can think of to do
> so would be because it causes problems for some other piece of (likely
> commonly used) software.
>
> Are the sequence numbers causing a problem for you?
>
> -greg
>
> On Sat, Dec 17, 2016 at 1:46 AM +0100, "Milinda Samaraweera" <milindaatw...@gmail.com> wrote:
>
>> Dear Users,
>>
>> I was using the SDWriter in RDKit to generate an SD file with
>> multiple entries generated from SMILES, and later assigning SD tag data
>> (e.g. pubchem_ID, IUPAC_name, etc.).
>>
>> However, at the end of each tag header I noticed there is a number
>> (bolded):
>>
>> ...
>> > *(1)*
>> N1-(2-ethylbutyl)hexane-1,3,6-triamine
>>
>> > *(1)*
>> 118903148
>>
>> ...
>> M  END
>> > *(2)*
>> N1,N2-dimethyl-N2-[3-(methylamino)propyl]-N1-propylpropane-1,2-diamine
>>
>> > *(2)*
>> 118883401
>>
>> What is this number, and how do you avoid printing it when SDWriter
>> is used? This number is not found in standard SD files.
>>
>> Thanks,
>> CodeMAK
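If the downstream program cannot cope with the sequence numbers, one workaround that does not depend on any writer option is to post-process the file as text and strip the trailing "(n)" from each data header line. The following is a sketch under the assumption that the headers look like `>  <tag_name>  (n)`; `strip_record_numbers` and `strip_record_numbers_in_file` are hypothetical helper names, and nothing here is RDKit-specific.

```python
import re

# A data header line such as ">  <pubchem_ID>  (1) ": a ">" at the start of
# the line, the tag name in angle brackets, then the number in parentheses.
_HEADER_WITH_NUMBER = re.compile(
    r"^(>[ \t]*<[^>\n]+>)[ \t]*\(\d+\)[ \t]*$", re.MULTILINE)


def strip_record_numbers(sdf_text):
    """Return the SD file text with the '(n)' removed from data headers."""
    return _HEADER_WITH_NUMBER.sub(r"\1", sdf_text)


def strip_record_numbers_in_file(in_path, out_path):
    """Read in_path, strip the header numbers, and write the result."""
    with open(in_path) as src:
        text = src.read()
    with open(out_path, "w") as dst:
        dst.write(strip_record_numbers(text))
```

Only lines that start with `>` and contain a bracketed tag name are touched, so the structure blocks and the data values themselves are left as written.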