Hi Markus,
Sorry, but I am running out of ideas. Could you check whether the
structures are rendered if you write the output of "dataframe.to_html()" to a
file and open that as a webpage? If this works, then it probably has something
to do with the ipython environment (btw, which version are you using?).
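
A minimal sketch of that check, using only the standard library. The helper
name write_html_page and the file name table_check.html are made up for
illustration, and the sample string only stands in for whatever
dataframe.to_html() returns in your notebook:

```python
import tempfile
from pathlib import Path


def write_html_page(table_html, path):
    """Wrap an HTML fragment in a minimal page and write it to disk.

    Hypothetical helper for this check; not part of RDKit or pandas.
    """
    page = (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8"></head>\n'
        "<body>\n" + table_html + "\n</body></html>\n"
    )
    out = Path(path)
    out.write_text(page, encoding="utf-8")
    # Return an absolute path so it can be opened directly in a browser.
    return out.resolve()


# Stand-in fragment; in the notebook this would be dataframe.to_html().
sample = "<table><tr><td>placeholder cell</td></tr></table>"
page_path = write_html_page(sample, Path(tempfile.gettempdir()) / "table_check.html")
```

Opening the returned path in the same browser, for example with
webbrowser.open(page_path.as_uri()), should show whether the img tags render
outside the notebook.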
Best,
Niko
On May 8, 2013 at 9:51 AM Markus Hartenfeller
<markus.hartenfel...@molecularhealth.com> wrote:
> Hi Niko,
>
> I tried this piece of code adapted from the doctest and got the same result
> (table is fine, but no rendering of molecules):
>
> from rdkit.Chem import PandasTools
> import pandas as pd
> import os
> from rdkit import RDConfig
> from rdkit.Chem.Draw import IPythonConsole
> from IPython.core.display import HTML
> antibiotics = pd.DataFrame(columns=['Name','Smiles'])
> antibiotics = antibiotics.append({'Smiles':'CC1(C(N2C(S1)C(C2=O)NC(=O)CC3=CC=CC=C3)C(=O)O)C','Name':'Penicilline G'}, ignore_index=True)#Penicilline G
> antibiotics = antibiotics.append({'Smiles':'CC1(C2CC3C(C(=O)C(=C(C3(C(=O)C2=C(C4=C1C=CC=C4O)O)O)O)C(=O)N)N(C)C)O','Name':'Tetracycline'}, ignore_index=True)#Tetracycline
> antibiotics = antibiotics.append({'Smiles':'CC1(C(N2C(S1)C(C2=O)NC(=O)C(C3=CC=CC=C3)N)C(=O)O)C','Name':'Ampicilline'}, ignore_index=True)#Ampicilline
>
> PandasTools.AddMoleculeColumnToFrame(antibiotics,'Smiles','Molecule',includeFingerprints=True)
> display(HTML(antibiotics.to_html()))
>
>
> The img tag and the png encoding themselves are fine. If I paste one in a
> simple html page and open it with the same browser the molecule is rendered.
>
> Best,
> Markus
>
>
>
> On 05/08/2013 09:03 AM, Fechner, Nikolas wrote:
>
> > Hi Markus,
> > Could you try the examples that are included as doctests in the
> > PandasTools.py module? These should definitely work and show rendered
> > molecules in the tables.
> >
> > Best,
> > Niko
> >
> > From: Markus Hartenfeller <markus.hartenfel...@molecularhealth.com>
> > Date: Tuesday, May 7, 2013 1:40 PM
> > To: "rdkit-discuss@lists.sourceforge.net" <rdkit-discuss@lists.sourceforge.net>
> > Subject: Re: [Rdkit-discuss] New module for RDKit - PANDAS integration
> >
> > Sorry for the confusion, I truncated the string myself in the mail
> > because I did not want to paste the whole beast. The fields contain the full
> > strings and the tag is closed.
> >
> > Best,
> > Markus
> >
> > On 05/07/2013 01:25 PM, Nikolas Fechner wrote:
> >
> > > When developing the module I occasionally had problems with *very* long
> > > png strings, because the pandas maximal column width applies to the
> > > string, which is what is stored in the dataframe, before the image
> > > rendering. As a result, the truncated png string was shown in the table
> > > (exactly the "..." ending shown in your example).
> > > You could try manually setting the maximal width very high (e.g.
> > > pandas.set_option("display.max_colwidth",100000)). This should be done
> > > automatically by PandasTools, which sets it to len(PNG)+100 for the
> > > longest string found during rendering, but because this rarely had an
> > > impact I could very well have overlooked some problems with this strategy.
> > >
> > > Best,
> > > Niko
> > >
> > > On May 7, 2013 at 1:13 PM Markus Hartenfeller
> > > <markus.hartenfel...@molecularhealth.com> wrote:
> > >
> > > > Thanks again for your reply. This is what I have tried:
> > > >
> > > > from rdkit import Chem
> > > > from rdkit.Chem import AllChem
> > > > import pandas as pd
> > > > from rdkit.Chem import PandasTools
> > > > from rdkit.Chem.Draw import IPythonConsole
> > > > from IPython.core.display import HTML
> > > > df = PandasTools.LoadSDF('test.sdf',
> > > > includeFingerprints=False)
> > > > display(HTML(df.to_html()))
> > > >
> > > > So it is a dataframe and .to_html() works fine in general. I
> > > > see all sdf fields. It's just that the molecule column contains string
> > > > values of this kind:
> > > >
> > > > <img
> > > > src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAASwAAAEsCAYAAAB
> > > > ...
> > > >
> > > >
> > > > The notebook somehow does not realize that it is an html tag
> > > > with an image, but instead renders it as a normal string (just like
> > > > before with the single molecule).
> > > >
> > > > Best wishes,
> > > > Markus
> > > >
> > > >
> > > > On 05/07/2013 12:57 PM, Nikolas Fechner wrote:
> > > >
> > > > > > Just for clarification, are you trying to render a dataframe or a
> > > > > > series/single column? The pandas series object has no to_html()
> > > > > > method and is therefore rendered as a string only. Moreover, if you
> > > > > > select a single column, e.g. 'ROMol', from a dataframe by
> > > > > > df['ROMol'] you will get a series object that is rendered as a
> > > > > > string. If you select a set of columns you get a dataframe, for
> > > > > > which the HTML rendering should work. The latter also works for a
> > > > > > single column if you enclose it in double brackets, df[['ROMol']],
> > > > > > which will give a single-column dataframe. This took me some time
> > > > > > to figure out, and the silent conversion that sometimes occurs can
> > > > > > be quite confusing.
> > > > >
> > > > > Best,
> > > > > Niko
> > > > >
> > > > > On May 7, 2013 at 11:33 AM Markus Hartenfeller
> > > > > > <markus.hartenfel...@molecularhealth.com> wrote:
> > > > >
> > > > > > > Thanks for your help, Niko. Importing the IPythonConsole from
> > > > > > > rdkit + removing the 'print' command did the trick for a single
> > > > > > > molecule :)
> > > > > >
> > > > > > Unfortunately, molecules in data frames are still
> > > > > > shown as strings, even when forcing html rendering. I will try to
> > > > > > get this working and report here if I make any progress. In case
> > > > > > somebody has already faced the same problem please let me know.
> > > > > >
> > > > > > Best,
> > > > > > Markus
> > > > > >
> > > > > >
> > > > > > On 05/07/2013 10:27 AM, Nikolas Fechner wrote:
> > > > > >
> > > > > > > > Hi Markus,
> > > > > > > > glad you think it could be useful :). Regarding
> > > > > > > > the problem, there are two things: You have to import the RDKit
> > > > > > > > IPythonConsole to enable the molecule rendering (from
> > > > > > > > rdkit.Chem.Draw import IPythonConsole) and if you trigger the
> > > > > > > > output using 'print' the notebook will always use string rendering
> > > > > > > > (AFAIK). Just try 'm' alone (instead of 'print m'). Alternatively,
> > > > > > > > you can always force the notebook to do an HTML rendering (useful
> > > > > > > > for large dataframes):
> > > > > > >
> > > > > > > from IPython.core.display import HTML
> > > > > > > display(HTML('''any HTML string e.g.
> > > > > > > dataframe.to_html()'''))
> > > > > > >
> > > > > > > I hope that helps.
> > > > > > >
> > > > > > > Best,
> > > > > > > Niko
> > > > > > >
> > > > > > > On May 7, 2013 at 10:02 AM Markus Hartenfeller
> > > > > > > > <markus.hartenfel...@molecularhealth.com> wrote:
> > > > > > >
> > > > > > > > > Hi Nikolas,
> > > > > > > >
> > > > > > > > I had a first look at the PandasTools
> > > > > > > > package: very cool! I think this is going to be useful for many
> > > > > > > > rdkit users. I'm looking forward to using it in the future.
> > > > > > > > Thanks for sharing this module.
> > > > > > > >
> > > > > > > > > I'm having trouble seeing the molecule
> > > > > > > > > depictions in the ipython notebook though (both in tables and by
> > > > > > > > > just printing out a single molecule).
> > > > > > > > >
> > > > > > > > > This code in an ipython notebook
> > > > > > > >
> > > > > > > > from rdkit import Chem
> > > > > > > > from rdkit.Chem import PandasTools
> > > > > > > > m=Chem.MolFromSmiles('N1CCNCC1')
> > > > > > > > print m
> > > > > > > >
> > > > > > > > gives me
> > > > > > > >
> > > > > > > > <img
> > > > > > > > src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAASwAAAEsCAYAAAB
> > > > > > > > ...
> > > > > > > >
> > > > > > > > a very long string with the base64 encoding
> > > > > > > > of the image, but not the image itself. Plotting from matplotlib
> > > > > > > > works fine. Did I forget to import something, or could it be a
> > > > > > > > > browser issue? I am using CentOS 6 and Firefox.
> > > > > > > >
> > > > > > > > Thanks in advance.
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Markus
> > > > > > > >
> > > > > > > >
> > > > > > > > > On 04/19/2013 11:56 AM, Nikolas Fechner wrote:
> > > > > > > > >
> > > > > > > > > > Dear all,
> > > > > > > > > > We developed a new module
> > > > > > > > > > (rdkit.Chem.PandasTools.py) that allows for using RDKit
> > > > > > > > > > molecule objects directly in pandas dataframes. Pandas
> > > > > > > > > > (http://pandas.pydata.org/) is a python library that
> > > > > > > > > > offers table-like data containers, which are incredibly
> > > > > > > > > > useful for anything related to data mining. Moreover, it
> > > > > > > > > > integrates nicely with the ipython notebook, producing
> > > > > > > > > > rendered HTML tables for the dataframes. The RDKit
> > > > > > > > > > integration provides molecule-type columns and
> > > > > > > > > > functionality to perform substructure-based row filtering
> > > > > > > > > > directly on the pandas table. Additionally, if a dataframe is
> > > > > > > > > > exported as HTML or shown within an ipython notebook, the
> > > > > > > > > > molecules in the table are rendered as 2D structures.
> > > > > > > > >
> > > > > > > > > The new module is available in the
> > > > > > > > > current SF trunk and contains a doctest header that provides
> > > > > > > > > examples of how to use it.
> > > > > > > > >
> > > > > > > > > I hope some of you find that
> > > > > > > > > interesting. As always, bug reports, comments, ideas... are
> > > > > > > > > very much appreciated.
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Nikolas
> > > > > > > > >
> > > > > > > > > > _______________________________________________
> > > > > > > > > > Rdkit-discuss mailing list
> > > > > > > > > > Rdkit-discuss@lists.sourceforge.net
> > > > > > > > > > https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
> > > --
> Markus Hartenfeller
> Chemoinformatics Specialist
> Molecular Health GmbH
> Belfortstr. 2
> 69115 Heidelberg
> Germany
> Tel: +49 6221 43851 209
> Fax: +49 6221 43851 100
> Email: markus.hartenfel...@molecularhealth.com
> www.molecularhealth.com
>
> ----------------------------------------------------------
> Molecular Health GmbH
>
> Geschaeftsfuehrer: Dr. Stephan Brock/
> Dr. Friedrich von Bohlen und Halbach
>
> Sitz der Gesellschaft: Heidelberg
> Handelsregister: Amtsgericht Mannheim - HRB 338037
> ----------------------------------------------------------
>
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss