Hi Both,
Many thanks for your rapid response, much appreciated.
Cheers
Chris
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
Hi Chris,
I think you can do it with SaveXlsxFromFrame.
http://rdkit.org/docs/source/rdkit.Chem.PandasTools.html#SaveXlsxFromFrame
rdkit.Chem.PandasTools.SaveXlsxFromFrame(frame, outFile,
molCol='ROMol', size=(300, 300), formats=None)
Search and add similarity to resulting data frame
> On 27 Nov 2016, at 07:55, Greg Landrum wrote:
>
>
> You don't know if what could be done as a single line?
>
> -greg
--
On Sun, Nov 27, 2016 at 8:45 AM, Chris Swain wrote:
> I added the similarity scores
> by adding an extra line,
>
> sdf['sim']=DataStructs.BulkTanimotoSimilarity(ionised_fps,sdf['mfp2'])
>
Yes, that's what I would have done.
> I don’t know if it could be done in a single line?
>
I added the similarity scores
by adding an extra line,
sdf['sim']=DataStructs.BulkTanimotoSimilarity(ionised_fps,sdf['mfp2'])
I don’t know if it could be done in a single line?
Chris
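For anyone following along without RDKit to hand, here is a rough sketch of what that extra line computes. It is not the RDKit implementation: fingerprints are represented as plain Python sets of on-bit indices (an assumption made for illustration), the helper names tanimoto and bulk_tanimoto are hypothetical, and the rows are toy data standing in for the data frame.

```python
from typing import Iterable, List, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    # Two empty fingerprints share nothing to compare; 0.0 is the convention
    # chosen for this sketch only.
    return len(a & b) / union if union else 0.0


def bulk_tanimoto(query: Set[int], fps: Iterable[Set[int]]) -> List[float]:
    """Score one query against many fingerprints, preserving input order."""
    return [tanimoto(query, fp) for fp in fps]


# Toy data: each dict stands in for one row (molecule) of the data frame.
rows = [
    {"name": "mol_a", "fp": {1, 2, 3, 4}},
    {"name": "mol_b", "fp": {3, 4, 5, 6}},
    {"name": "mol_c", "fp": {7, 8}},
]
query_fp = {1, 2, 3, 4}

# Counterpart of the extra line that fills the 'sim' column.
for row, sim in zip(rows, bulk_tanimoto(query_fp, [r["fp"] for r in rows])):
    row["sim"] = sim
```

Because the scores come back in the same order as the fingerprints, they can be assigned straight to a new column, which is why the one-line assignment in the message works.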
> On 26 Nov 2016, at 04:48, Greg Landrum wrote:
>
> That's a good question.
>
>
This works very nicely; it would be nice to add the similarity scores to the
resulting data frame.
Cheers,
Chris
> On 26 Nov 2016, at 04:48, Greg Landrum wrote:
>
> That's a good question.
>
> I'm not a master of pandas indexing, but this seems to work:
> In [5]:
That's a good question.
I'm not a master of pandas indexing, but this seems to work:
In [5]: sdf['mfp2'] = [rdMolDescriptors.GetMorganFingerprintAsBitVect(x,2)
for x in sdf['ROMol']]
In [8]: sims = DataStructs.BulkTanimotoSimilarity(qry,sdf['mfp2'])
In [13]: ids = [x for x,y in enumerate(sims) if
Peter,
If you have chemfp and can make a chemfp arena, RDKit now supports these
structures for reading and searching. This, by far, is the fastest way I
know of similarity searching. I believe that Greg's implementation is
compatible with chemfp 1.0 which is available on pypi:
Is it possible to use the bulk similarity searching functionality for
better performance instead of the list comprehension?
Best,
Peter
On Wed, Nov 23, 2016 at 9:11 AM Greg Landrum wrote:
No worries. This, and Anna's question about similarity searching and clustering
illustrate a great opportunity for a tutorial on fingerprints and similarity
searching.
-greg
On Wed, Nov 23, 2016 at 3:00 PM +0100, "Chris Swain" wrote:
Thanks for this,
As a chemist
axis=0)
Paul
From: Maciek Wójcikowski [mailto:mac...@wojcikowski.pl]
Sent: Friday, 11 March 2016 12:29
To: Paul Czodrowski <paul.czodrow...@merckgroup.com>
Cc: rdkit <rdkit-discuss@lists.sourceforge.net>
Subject: Re: [Rdkit-discuss] Pandas dataframe manipulation
Hi Paul,
I would suggest:
- assigning dtype of dataframe/column to str/object (np.object in older NumPy)
- cleaning up the IC50s
- casting to float/int with dataframe.astype()
Or alternatively you could use "converters" argument:
pd.read_csv('filename.csv', converters={'ic50_colname': lambda x:
x.replace('>',
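A minimal sketch of the kind of cleaning function the converter idea relies on, using only the standard library so it runs without pandas. The function name clean_ic50, the column name 'ic50', and the CSV content are all made up for illustration; the sketch assumes the only debris in the column is a leading '>' or '<' qualifier and surrounding whitespace.

```python
import csv
import io


def clean_ic50(raw: str) -> float:
    """Strip a leading '>' or '<' qualifier and whitespace, then cast to float."""
    return float(raw.strip().lstrip("<>").strip())


# Hypothetical CSV content; the column name 'ic50' is invented for this example.
data = io.StringIO("compound,ic50\nA,>10000\nB,35.2\nC,< 0.5\n")

# Apply the cleaner to every value of the column while reading.
ic50_values = [clean_ic50(row["ic50"]) for row in csv.DictReader(data)]
```

With pandas, the same function could be passed through the converters argument of pd.read_csv, e.g. converters={'ic50': clean_ic50}. Note that dropping the qualifier discards the information that a value was only a bound, so it may be worth keeping the raw column as well.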
Dear Niko,
I was exactly looking for this functionality, great work!
A few follow-up questions:
* frame.set_index('_Name') did not work, but there is a name set in the SD
file.
* Is there a way to load in only a specified list of SD tags? (I didn't
find a names parameter for LoadSDF)
*
Hi Paul,
I'll answer directly below.
Dear Niko,
I was exactly looking for this functionality, great work!
A few follow-up questions:
* frame.set_index('_Name') did not work, but there is a name set in the SD
file.
The molecule name is contained in the column specified by the optional
Hi Paul,
I am not sure it would be easy to get the pandas read_table function to
handle SD files. However, there is some basic functionality for this already
built into the PandasTools module. If you check the doctest header there is a
small example. Basically,
frame =