Re: [Rdkit-discuss] Pandas to Excel

2024-02-22 Thread Chris Swain via Rdkit-discuss
Hi Both, Many thanks for your rapid response, much appreciated. Cheers, Chris

Re: [Rdkit-discuss] Pandas to Excel

2024-02-22 Thread Taka Seri
Hi Chris, I think you can do it with SaveXlsxFromFrame.
http://rdkit.org/docs/source/rdkit.Chem.PandasTools.html#SaveXlsxFromFrame
rdkit.Chem.PandasTools.SaveXlsxFromFrame(frame, outFile, molCol='ROMol', size=(300, 300), formats=None)
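A minimal sketch of how the call suggested above might be used, assuming the xlsxwriter package is installed alongside RDKit and pandas. The molecules, the column names "name" and "smiles", the image size, and the output file name are made up for illustration; only the SaveXlsxFromFrame signature comes from the message.

```python
import os
import tempfile

import pandas as pd
from rdkit import Chem
from rdkit.Chem import PandasTools

# Illustrative data: the names and SMILES are made up for this sketch.
frame = pd.DataFrame({
    "name": ["ethanol", "benzene", "acetic acid"],
    "smiles": ["CCO", "c1ccccc1", "CC(=O)O"],
})

# Build the molecule column that SaveXlsxFromFrame renders as images.
frame["ROMol"] = [Chem.MolFromSmiles(s) for s in frame["smiles"]]

# Write into a temporary directory; xlsxwriter must be installed for this call.
out_path = os.path.join(tempfile.mkdtemp(), "molecules.xlsx")
PandasTools.SaveXlsxFromFrame(frame, out_path, molCol="ROMol", size=(150, 150))
```

The size argument sets the pixel dimensions of each embedded structure image, so smaller values keep the spreadsheet rows compact.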

Re: [Rdkit-discuss] Pandas

2016-11-26 Thread Chris Swain
Search and add similarity to resulting data frame > On 27 Nov 2016, at 07:55, Greg Landrum wrote: > > > You don't know if what could be done as a single line? > > -greg --

Re: [Rdkit-discuss] Pandas

2016-11-26 Thread Greg Landrum
On Sun, Nov 27, 2016 at 8:45 AM, Chris Swain wrote:
> I added the similarity scores by adding an extra line,
> sdf['sim']=DataStructs.BulkTanimotoSimilarity(ionised_fps,sdf['mfp2'])
Yes, that's what I would have done.
> I don’t know if it could be done in a single line?
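One possible way to fold the extra line into a single expression is pandas' DataFrame.assign, sketched here under stated assumptions: the molecules are made up, the query is simply the first row, and the fingerprints are kept in a plain list rather than in the 'mfp2' column used in the thread. This is an illustration, not the answer given on the list.

```python
import pandas as pd
from rdkit import Chem, DataStructs
from rdkit.Chem import rdMolDescriptors

# Illustrative molecules; the data frame name `sdf` follows the thread.
sdf = pd.DataFrame({"smiles": ["CCO", "CCN", "c1ccccc1O", "CC(=O)O"]})
mols = [Chem.MolFromSmiles(s) for s in sdf["smiles"]]

# Fingerprints are kept in a plain list here rather than in a column.
fps = [rdMolDescriptors.GetMorganFingerprintAsBitVect(m, 2) for m in mols]

# Hypothetical query: the first molecule in the frame.
query_fp = fps[0]

# Two-step form, as in the thread: compute the scores, then assign the column.
sdf["sim"] = DataStructs.BulkTanimotoSimilarity(query_fp, fps)

# Single-expression alternative: assign() returns a new frame with the column
# added, which chains directly into a sort by decreasing similarity.
ranked = sdf.assign(
    sim=DataStructs.BulkTanimotoSimilarity(query_fp, fps)
).sort_values("sim", ascending=False)
```

The assign form leaves the original frame untouched, which can be convenient when several queries are scored against the same set of molecules.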

Re: [Rdkit-discuss] Pandas

2016-11-26 Thread Chris Swain
I added the similarity scores by adding an extra line:
sdf['sim']=DataStructs.BulkTanimotoSimilarity(ionised_fps,sdf['mfp2'])
I don’t know if it could be done in a single line?
Chris
> On 26 Nov 2016, at 04:48, Greg Landrum wrote:
> That's a good question.

Re: [Rdkit-discuss] Pandas

2016-11-26 Thread Chris Swain
This works very nicely; it would be nice to add the similarity scores to the resulting data frame. Cheers, Chris
> On 26 Nov 2016, at 04:48, Greg Landrum wrote:
> That's a good question.
> I'm not a master of pandas indexing, but this seems to work:
> In [5]:

Re: [Rdkit-discuss] Pandas

2016-11-25 Thread Greg Landrum
That's a good question. I'm not a master of pandas indexing, but this seems to work:
In [5]: sdf['mfp2'] = [rdMolDescriptors.GetMorganFingerprintAsBitVect(x,2) for x in sdf['ROMol']]
In [8]: sims = DataStructs.BulkTanimotoSimilarity(qry,sdf['mfp2'])
In [13]: ids = [x for x,y in enumerate(sims) if
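The snippet above is cut off before the filter condition, so the following is only a sketch of the general pattern, not a reconstruction of the original code. The molecules, the phenol query, and the 0.5 cutoff (THRESHOLD) are assumptions chosen for illustration; the fingerprint column is converted to a list before the bulk call as a precaution.

```python
import pandas as pd
from rdkit import Chem, DataStructs
from rdkit.Chem import rdMolDescriptors

# Illustrative molecules; in the thread the frame comes from an SD file.
sdf = pd.DataFrame({"smiles": ["c1ccccc1O", "c1ccccc1N", "CCO", "Cc1ccccc1O"]})
sdf["ROMol"] = [Chem.MolFromSmiles(s) for s in sdf["smiles"]]

# Morgan fingerprints (radius 2), one per row, as in the thread.
sdf["mfp2"] = [
    rdMolDescriptors.GetMorganFingerprintAsBitVect(m, 2) for m in sdf["ROMol"]
]

# Hypothetical query molecule (phenol) and similarity cutoff.
qry = rdMolDescriptors.GetMorganFingerprintAsBitVect(
    Chem.MolFromSmiles("c1ccccc1O"), 2
)
THRESHOLD = 0.5

# One bulk call instead of a Python-level loop over pairwise comparisons.
sims = DataStructs.BulkTanimotoSimilarity(qry, list(sdf["mfp2"]))

# Positions of rows at or above the cutoff, then select those rows by position.
ids = [i for i, s in enumerate(sims) if s >= THRESHOLD]
hits = sdf.iloc[ids]
```

Because enumerate yields positions rather than index labels, iloc is the matching selector; this stays correct even when the frame's index is not a simple 0..n-1 range.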

Re: [Rdkit-discuss] Pandas

2016-11-23 Thread Brian Kelley
Peter, If you have chemfp and can make a chemfp arena, RDKit now supports these structures for reading and searching. This is by far the fastest way I know of to do similarity searching. I believe that Greg's implementation is compatible with chemfp 1.0, which is available on PyPI:

Re: [Rdkit-discuss] Pandas

2016-11-23 Thread Peter Gedeck
Is it possible to use the bulk similarity searching functionality for better performance instead of the list comprehension? Best, Peter On Wed, Nov 23, 2016 at 9:11 AM Greg Landrum wrote: No worries. This, and Anna's question about similarity searching and clustering

Re: [Rdkit-discuss] Pandas

2016-11-23 Thread Greg Landrum
No worries. This, and Anna's question about similarity searching and clustering, illustrate a great opportunity for a tutorial on fingerprints and similarity searching.
-greg
On Wed, Nov 23, 2016 at 3:00 PM +0100, "Chris Swain" wrote:
Thanks for this, As a chemist

Re: [Rdkit-discuss] Pandas dataframe manipulation

2016-03-11 Thread Paul Czodrowski
axis=0) Paul
From: Maciek Wójcikowski [mailto:mac...@wojcikowski.pl]
Sent: Friday, 11 March 2016 12:29
To: Paul Czodrowski <paul.czodrow...@merckgroup.com>
Cc: rdkit <rdkit-discuss@lists.sourceforge.net>
Subject: Re: [Rdkit-discuss] Pandas dataframe manipulation
Hi Paul

Re: [Rdkit-discuss] Pandas dataframe manipulation

2016-03-11 Thread Maciek Wójcikowski
Hi Paul, I would suggest:
- assigning the dtype of the dataframe/column to str/np.object
- cleaning up the IC50s
- casting to float/int with dataframe.astype()
Or alternatively you could use the "converters" argument: pd.read_csv('filename.csv', converters={'ic50_colname': lambda x: x.replace('>',
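Both routes suggested above can be sketched as follows. The CSV content, the column name "ic50", and the helper clean_ic50 are hypothetical, and the truncated lambda in the message may have handled qualifiers differently; this version simply strips '>' and '<' before casting.

```python
import io

import pandas as pd

# Illustrative CSV; the column name "ic50" and the values are made up.
csv_text = """compound,ic50
cpd-1,>10
cpd-2,0.5
cpd-3,<100
"""


def clean_ic50(raw: str) -> float:
    """Strip '>' / '<' qualifiers and whitespace, then cast to float."""
    cleaned = raw.replace(">", "").replace("<", "").strip()
    return float(cleaned) if cleaned else float("nan")


# Route 1: clean while parsing, via the converters argument of read_csv.
df = pd.read_csv(io.StringIO(csv_text), converters={"ic50": clean_ic50})

# Route 2: read the column as strings, clean it up, then cast with astype().
df2 = pd.read_csv(io.StringIO(csv_text), dtype={"ic50": str})
df2["ic50"] = df2["ic50"].str.replace("[<>]", "", regex=True).astype(float)
```

Note that dropping the qualifier discards the information that a value was a bound rather than a measurement; keeping the raw string in a second column is one way to preserve it.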

Re: [Rdkit-discuss] pandas / sd-tags

2013-07-02 Thread Paul . Czodrowski
Dear Niko, I was looking for exactly this functionality, great work! A few follow-up questions:
* frame.set_index('_Name') did not work, but there is a name set in the SD file.
* Is there a way to load in only a specified list of SD tags? (I didn't find a names parameter for LoadSDF)
*

Re: [Rdkit-discuss] pandas / sd-tags

2013-07-02 Thread Nikolas Fechner
Hi Paul, I'll answer directly below. Dear Niko, I was exactly looking for this functionality, great work! A few follow-up questions: * frame.set_index('_Name') did not work, but there is a name set in the SD file. The molecule name is contained in the column specified by the optional

Re: [Rdkit-discuss] pandas / sd-tags

2013-07-01 Thread Nikolas Fechner
Hi Paul, I am not sure whether it is easy to get the pandas read_table function to handle SD files. However, there is some basic functionality for this already built into the PandasTools module. If you check the doctest header there is a small example. Basically, frame =
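A self-contained sketch of loading an SD file with PandasTools.LoadSDF and indexing by molecule name. It assumes the optional parameter referred to in this thread is idName, which LoadSDF does accept; the two molecules, the "Source" tag, and the file name are made up, and the SD file is written on the fly only so the example can run without external data.

```python
import os
import tempfile

from rdkit import Chem
from rdkit.Chem import PandasTools

# Write a tiny SD file so the sketch is self-contained; the names and the
# "Source" tag are made up for illustration.
sdf_path = os.path.join(tempfile.mkdtemp(), "example.sdf")
writer = Chem.SDWriter(sdf_path)
for name, smiles in [("ethanol", "CCO"), ("benzene", "c1ccccc1")]:
    mol = Chem.MolFromSmiles(smiles)
    mol.SetProp("_Name", name)        # molecule name (title line of the record)
    mol.SetProp("Source", "example")  # an ordinary SD tag
    writer.write(mol)
writer.close()

# Load the file: SD tags become columns, the molecule name goes into the
# column named by idName, and the molecule objects into molColName.
frame = PandasTools.LoadSDF(sdf_path, idName="ID", molColName="ROMol")

# Index the frame by molecule name through the idName column.
by_name = frame.set_index("ID")
```

With this layout the name is looked up through the "ID" column rather than through '_Name', which would explain why frame.set_index('_Name') fails on the loaded frame.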