Re: [Rdkit-discuss] Best practice: which (database) fingerprints to use ?

2011-03-08 Thread Stiefl, Nikolaus
Hi JeanPaul, The difference between featmorganbv and morganbv is that the first one uses pharmacophore features for atom descriptions whereas the other one uses atom types (it essentially corresponds to the ECFP descriptors). I would suggest using featmorganbv_fp only if you want to do more fuzzy

Re: [Rdkit-discuss] Write SDF as a string instead of two file

2011-05-17 Thread Stiefl, Nikolaus
Hi JP, Maybe a comment since I ran into this before - you will lose the properties of a molecule when just using the MolBlock. Cheers Nik -Original Message- From: Greg Landrum [mailto:greg.land...@gmail.com] Sent: Tuesday, May 17, 2011 6:10 AM To: JP Cc:
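A minimal sketch of the point about lost properties, assuming a reasonably recent RDKit: `Chem.MolToMolBlock()` returns only the structure block, whereas a `Chem.SDWriter` pointed at an in-memory `io.StringIO` buffer produces a full SD record including the data fields. The molecule, the property name `activity` and the helper `mol_to_sdf_string` are made up for illustration.

```python
from io import StringIO

from rdkit import Chem


def mol_to_sdf_string(mol):
    """Return one SD record (structure plus data fields) as a string."""
    buffer = StringIO()
    writer = Chem.SDWriter(buffer)  # SDWriter accepts a file-like object
    writer.write(mol)
    writer.close()  # flushes the record into the buffer
    return buffer.getvalue()


# Hypothetical example molecule with one made-up property.
mol = Chem.MolFromSmiles("CCO")
mol.SetProp("activity", "42")

mol_block = Chem.MolToMolBlock(mol)  # structure only, no data fields
sdf_text = mol_to_sdf_string(mol)    # structure plus "> <activity>" field
```

Comparing the two strings shows that only `sdf_text` carries the `activity` field and the `$$$$` record terminator.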

Re: [Rdkit-discuss] calculation of number of chiral centres in a mol

2011-05-19 Thread Stiefl, Nikolaus
If you have 3D molecules you could use AllChem.AssignAtomChiralTagsFromStructure() ... but that's only if you have 3D molecules. Nik -Original Message- From: JP [mailto:jeanpaul.ebe...@inhibox.com] Sent: Thursday, May 19, 2011 1:27 PM To: Greg Landrum Cc:

Re: [Rdkit-discuss] how to come to a good model

2011-10-10 Thread Stiefl, Nikolaus
Hi Paul, I agree with Greg's comment - if this is for a hERG model then you will not have much luck building a reasonable model purely with physchem properties. I guess having information on ionizability could be of help. Another one to test - in case you need to make a 3 class model -

Re: [Rdkit-discuss] multiprocessing rdkit

2011-10-11 Thread Stiefl, Nikolaus
Hi Paul, When I look at your definition below and the one that worked there is a slight difference. In fps_calc you are passing a molecule and then you try to iterate over it (in fps = [GetMorganFingerprint(x,3) for x in m] ). Whereas in generateconformations(m) you also pass a single
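The difference described here can be sketched as follows. This is an illustration under assumptions, not the original poster's code: the worker takes a single molecule and returns a single fingerprint, and the iteration happens outside the worker. Built-in `map` is used as a stand-in because `multiprocessing.Pool.map` follows the same convention of calling the worker once per element. The example SMILES are made up, and newer RDKit releases may emit a deprecation warning for `GetMorganFingerprint`.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def fps_calc(mol):
    """Worker: one molecule in, one Morgan fingerprint (radius 3) out."""
    # No loop here: the caller (map / Pool.map) hands over one molecule at a time.
    return AllChem.GetMorganFingerprint(mol, 3)


# Hypothetical input molecules.
mols = [Chem.MolFromSmiles(smi) for smi in ("CCO", "c1ccccc1")]

# Built-in map as a stand-in for pool.map(fps_calc, mols).
fps = list(map(fps_calc, mols))
```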

Re: [Rdkit-discuss] Mol2 Format problem ? Can't kekulize mol -- but with a twist.

2012-01-12 Thread Stiefl, Nikolaus
fix. Unfortunately, the mol2 file format (documentation) is such a pain in general, and the multiple different implementations don't make it any better. Sorry Nik From: JP [mailto:jeanpaul.ebe...@inhibox.com] Sent: Thursday, January 12, 2012 3:33 PM To: Stiefl, Nikolaus Cc: rdkit-discuss

Re: [Rdkit-discuss] improving substructure search behavior with real molecules

2012-03-01 Thread Stiefl, Nikolaus
Hi Greg, Personally I like your suggestion of the behaviour similar to SMARTS. That way one can decide what level of granularity one wants. Obviously it also means that we have to think a bit more about our queries and database preparations - I am sure though this will only improve our

Re: [Rdkit-discuss] Detecting rings and bond types from PDB HETATM record

2012-07-06 Thread Stiefl, Nikolaus
Hi JP, Not sure if this is of any help. If it's a pdb file from rcsb or an in-house one where you have a corresponding smiles available, maybe you could use this information to properly set up the bond types using bond matches? I know the components.cif file still has quite a few errors - however,

Re: [Rdkit-discuss] matching substructures to molecules

2012-08-07 Thread Stiefl, Nikolaus
Hi Gonzalo, SmilesToMol has a sanitize flag which you can set to False. However - I am not sure how well your molecule fingerprints will work with an unsanitized molecule. I would imagine that you will run into all sorts of funny problems wrt aromaticity detection etc. Not sure if this helps
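A small sketch of the sanitize flag, assuming the Python counterpart of SmilesToMol, `Chem.MolFromSmiles`. The SMILES below is a made-up example of an aromatic ring that cannot be kekulized, so the default (sanitizing) parse returns `None` while the unsanitized parse still yields a molecule object.

```python
from rdkit import Chem
from rdkit import RDLogger

# Silence the expected "can't kekulize" error message for this demo.
RDLogger.DisableLog("rdApp.error")

smiles = "c1cccc1"  # hypothetical input: five aromatic carbons, not kekulizable

sanitized = Chem.MolFromSmiles(smiles)             # sanitization fails -> None
raw = Chem.MolFromSmiles(smiles, sanitize=False)   # parsed, but not sanitized
```

The unsanitized molecule has skipped the sanitization steps, which is why fingerprints computed from it should be treated with caution.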

Re: [Rdkit-discuss] fdef files

2012-10-09 Thread Stiefl, Nikolaus
Hi James, Yes – I just checked the files I generated for the ph4 definitions and they definitely do not have the tautomer recognition (guess I am just not SMART enough for this ;-). I will follow up with Greg and see how quickly we can get the fdef file as a contribution so people can start to

Re: [Rdkit-discuss] non-smallest rings

2013-01-21 Thread Stiefl, Nikolaus
do you just have to check if an atom is in a 6-membered ring? If so then
    In [8]: m = Chem.MolFromSmiles('COc1ccc(cc1O[C@H]1C[C@@H]2CC[C@H]1C2)C1CNC(=O)NC1')
    In [9]: [a.IsInRingSize(6) for a in m.GetAtoms()]
    Out[9]: [False, False, True, True, True, True, True, True, False, False,

Re: [Rdkit-discuss] Volume Overlap using RDKit

2013-01-22 Thread Stiefl, Nikolaus
Hi JP, Do you want to do a shape alignment or just any sort of alignment? There is a MolAlign in AllChem which will give you an RMSD alignment. This works well if you have reasonably similar molecules (do a GetSubstructMatch beforehand to get the atom list). Don't think there is a shape alignment for

Re: [Rdkit-discuss] SmilesWriter

2013-07-03 Thread Stiefl, Nikolaus
Hmm, I don't know if there is a predefined option in RDKit, but if you have a list of properties (say propertyList) you want to pick, you could write directly to a text file, something along these lines: w.write("%s\t%s" % (Chem.MolToSmiles(m), "\t".join([m.GetProp(p) for p in propertyList if
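One way the idea could look in full is sketched below. The truncated condition in the snippet is not recoverable, so the `HasProp` check is an assumption, as are the helper name `smiles_row`, the example molecule and its property names.

```python
from rdkit import Chem


def smiles_row(mol, property_list):
    """Build one tab-separated line: SMILES followed by the selected properties."""
    # Assumption: properties missing on the molecule are simply skipped.
    values = [mol.GetProp(p) for p in property_list if mol.HasProp(p)]
    return "\t".join([Chem.MolToSmiles(mol)] + values)


# Hypothetical molecule with two made-up properties.
mol = Chem.MolFromSmiles("CCO")
mol.SetProp("id", "cmpd-1")
mol.SetProp("source", "demo")

row = smiles_row(mol, ["id", "source", "missing"])
```

Each `row` can then be written to an open text file with `w.write(row + "\n")`.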

Re: [Rdkit-discuss] RDkit user beginner.

2013-07-04 Thread Stiefl, Nikolaus
Hi - welcome to the community. You have a typo there: GenMACCSkeys should actually be GenMACCSKeys (i.e. an uppercase K in Keys). Try ipython (with the ipython notebook), then autocomplete will more or less solve those issues for you. The notebook is cool and there is some info in the mailing
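For reference, a short sketch of the corrected call, assuming a standard RDKit install. Note the asymmetry that makes the typo easy: the module is `MACCSkeys` (lowercase k) while the function is `GenMACCSKeys` (uppercase K). The example molecule is made up.

```python
from rdkit import Chem
from rdkit.Chem import MACCSkeys  # module name: lowercase "k"

mol = Chem.MolFromSmiles("CCO")  # hypothetical example molecule

fp = MACCSkeys.GenMACCSKeys(mol)  # function name: uppercase "K"
n_bits = fp.GetNumBits()

# The misspelled name does not exist on the module.
has_typo_name = hasattr(MACCSkeys, "GenMACCSkeys")
```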

Re: [Rdkit-discuss] Chemistry 101 question...

2013-10-22 Thread Stiefl, Nikolaus
Hi James, Interesting. I wonder if this is also dependent on the transport phase that was used. Do you have any info on that? Was it a typical 10% MeOH or something more like dichloromethane? Cheers Nik From: James Davidson j.david...@vernalis.com Date: Tuesday,

Re: [Rdkit-discuss] Load mol2 file with partial charges

2015-11-20 Thread Stiefl, Nikolaus
Hi Gaetano, The properties of the mol2 file are stored as atom properties. Here is an example (sorry - the only thing I have at hand right now is a benzene mol2 file created with MOE - note the mol2 file parser was tested on Corina mol2 files). Here is the file

Re: [Rdkit-discuss] creating new properties on a molecule

2016-03-14 Thread Stiefl, Nikolaus
Hi Chris, No - that should just work:
    m = Chem.MolFromSmiles("CC")
    list(m.GetPropNames())
    []
    m.SetProp("NewProp","TheNewProp")
    list(m.GetPropNames())
    ['NewProp']
    m.GetProp("NewProp")
    'TheNewProp'
Ciao Nik From: chris dalton Date: Monday 14

Re: [Rdkit-discuss] Corresponding Features to Mol2 file

2016-03-29 Thread Stiefl, Nikolaus
Hi Nick, What you are after is (based on your example below) feat = feats[0] feat.GetAtomIds() (here for the first feature). In addition, for a better understanding of what is what, you might also want to check out: feat.GetFamily() feat.GetType() Ciao Nik From: Nicholas Michelarakis

Re: [Rdkit-discuss] Check for Heavy Isotopes using RdKit

2017-01-19 Thread Stiefl, Nikolaus
Then I guess Greg’s solution is the better suited one since you don’t have to specify a list of isotopes (assuming that your input compounds will not have things like 12C specified). Minor modification:
    In [23]: q = rdqueries.IsotopeGreaterQueryAtom(1)
    In [24]: atomNums = [1,6,7,8,15,16]
    In

Re: [Rdkit-discuss] Check for Heavy Isotopes using RdKit

2017-01-18 Thread Stiefl, Nikolaus
Hi, Maybe this is much less efficient, but I guess if you need it for specific isotopes then you could try using a SMARTS pattern and check for that?
    In [20]: q = Chem.MolFromSmarts("[13C,14C,2H,3H,15N,24P,46P,33S,34S,36S]")
    In [21]: m = Chem.MolFromSmiles('CC[15NH2]')
    In [22]:
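The SMARTS approach can be sketched as a small self-contained check. This is an illustration only: the pattern below is a deliberately shortened list (13C, 14C, 15N) rather than the full one from the message, and the helper name `has_listed_isotope` is hypothetical.

```python
from rdkit import Chem

# Shortened, illustrative isotope list; extend it with the isotopes you care about.
pattern = Chem.MolFromSmarts("[13C,14C,15N]")


def has_listed_isotope(smiles):
    """Return True if the molecule contains any atom matching the isotope pattern."""
    mol = Chem.MolFromSmiles(smiles)
    return mol.HasSubstructMatch(pattern)


labelled = has_listed_isotope("CC[15NH2]")  # 15N-labelled ethylamine
unlabelled = has_listed_isotope("CCN")      # same molecule without a label
```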

Re: [Rdkit-discuss] elimination of small fragments

2018-06-29 Thread Stiefl, Nikolaus
Even better ☺ From: Greg Landrum Date: Friday 29 June 2018 at 18:04 To: GMCProfile Cc: "rdkit-discuss@lists.sourceforge.net" Subject: Re: [Rdkit-discuss] elimination of small fragments How about just GetLargestFragment()? On Fri, 29 Jun 2018 at 16:45, Stiefl, Nikolaus mailto:ni

Re: [Rdkit-discuss] elimination of small fragments

2018-06-29 Thread Stiefl, Nikolaus
Quick question – mostly to the core developers I guess: I just checked and have that kind of thing in my code in at least 5 different places - wouldn’t it make sense to have that kind of functionality as a convenience function as part of the GetMolFrags method? Something along the lines of
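The kind of helper being discussed might look like the sketch below. It is not the code from the message (which is truncated) and not an RDKit convenience function; the name `largest_fragment` and the example salt are assumptions. It relies only on `Chem.GetMolFrags(mol, asMols=True)` and picks the fragment with the most atoms.

```python
from rdkit import Chem


def largest_fragment(mol):
    """Return the fragment with the most atoms (first one wins on ties)."""
    fragments = Chem.GetMolFrags(mol, asMols=True)
    return max(fragments, key=lambda frag: frag.GetNumAtoms())


# Hypothetical example: sodium acetate, where the counter-ion should be dropped.
salt = Chem.MolFromSmiles("CC(=O)[O-].[Na+]")
parent = largest_fragment(salt)
parent_smiles = Chem.MolToSmiles(parent)
```

Atom count is only one possible size criterion; molecular weight or heavy-atom count could be swapped in via the `key` function.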

[Rdkit-discuss] RGroup matching in RGroup decomposition code

2018-12-11 Thread Stiefl, Nikolaus
Hi all, I was playing around with the RGroup decomposition code and must say that I am pretty impressed by it. The fact that one can directly work with an MDL R-group file and that the output is a pandas DataFrame makes analysis really slick - well done! However, one thing that irritates me is

Re: [Rdkit-discuss] RGroup matching in RGroup decomposition code

2018-12-13 Thread Stiefl, Nikolaus
Kelley and he suggested to fix it in the underlying codebase so I hope this will be fixed in the next version :) Cheers Nik From: Paolo Tosco Sent: Thursday, December 13, 2018 11:09 AM To: Stiefl, Nikolaus ; RDKit Discuss Subject: Re: [Rdkit-discuss] RGroup matching in RGroup decomposition code

[Rdkit-discuss] CIx position at NIBR Cambridge, US

2020-06-08 Thread Stiefl, Nikolaus
Dear all, I wanted to bring to your attention that our position for a cheminformatics expert in the CADD group in our global chemistry community at NIBR is open again. https://www.novartis.com/careers/career-search/job-details/288340BR If you feel like you want to apply your skill set to

Re: [Rdkit-discuss] Cheminformatics Graduate School Recommendations?

2021-07-20 Thread Stiefl, Nikolaus
Hi Patrick, Sorry, yet another non-US-based lab, but I will still throw in Sereina Riniker’s group (Riniker, Sereina, Prof. Dr. | ETH Zurich), who recently attracted quite some talent ;-) Ciao Nik From: