When developing the module I occasionally had problems with *very* long PNG
strings, because the pandas maximum column width is applied to the string
stored in the dataframe before the image is rendered. As a result, the
truncated PNG string was shown in the table (exactly the "..." ending shown in
your example).
You could try manually setting the maximum width very high (e.g.
pandas.set_option("display.max_colwidth",100000)). PandasTools should do this
automatically: it sets the width to len(PNG)+100 for the longest string found
during rendering. Because this rarely had an impact, though, I could very well
have overlooked some problems with this strategy.
Best,
Niko
On May 7, 2013 at 1:13 PM Markus Hartenfeller
<markus.hartenfel...@molecularhealth.com> wrote:
> Thanks again for your reply. That's what I have tried:
>
> from rdkit import Chem
> from rdkit.Chem import AllChem
> import pandas as pd
> from rdkit.Chem import PandasTools
> from rdkit.Chem.Draw import IPythonConsole
> from IPython.core.display import HTML
> df = PandasTools.LoadSDF('test.sdf', includeFingerprints=False)
> display(HTML(df.to_html()))
>
> So it is a dataframe and .to_html() works fine in general. I see all SDF
> fields. It's just that the molecule column contains string values of this kind:
>
> <img src="
> <> ...
>
>
> The notebook somehow does not realize that it is an html tag with an image,
> but instead renders it as a normal string (just like before with the single
> molecule).
>
> Best wishes,
> Markus
>
>
> On 05/07/2013 12:57 PM, Nikolas Fechner wrote:
>
> > Just for clarification, are you trying to render a dataframe or a
> > series/single column? The pandas Series object has no to_html() method
> > and is therefore rendered as a string only. Moreover, if you select a
> > single column, e.g. 'ROMol', from a dataframe with df['ROMol'], you get
> > a Series object that is rendered as a string. If you select a set of
> > columns you get a dataframe, for which the HTML rendering should work.
> > The latter also works for a single column if you enclose its name in
> > double brackets, df[['ROMol']], which gives a single-column dataframe.
> > This took me some time to figure out, and the silent conversion that
> > sometimes occurs can be quite confusing.
> >
> > Best,
> > Niko
> >
> > On May 7, 2013 at 11:33 AM Markus Hartenfeller
> > <markus.hartenfel...@molecularhealth.com> wrote:
> >
> > > Thanks for your help, Niko. Importing the IPythonConsole from
> > > rdkit + removing the 'print' command did the trick for a single
> > > molecule :)
> > >
> > > Unfortunately, molecules in data frames are still shown as strings,
> > > even when forcing html rendering. I will try to get this working and
> > > report here if I make any progress. In case somebody has already faced the
> > > same problem please let me know.
> > >
> > > Best,
> > > Markus
> > >
> > >
> > > On 05/07/2013 10:27 AM, Nikolas Fechner wrote:
> > >
> > > > > Hi Markus,
> > > > > glad you think it could be useful :). Regarding the problem,
> > > > > there are two things: you have to import the RDKit IPythonConsole to
> > > > > enable the molecule rendering (from rdkit.Chem.Draw import
> > > > > IPythonConsole), and if you trigger the output using 'print', the
> > > > > notebook will always use string rendering (AFAIK). Just try 'm' alone
> > > > > (instead of 'print m'). Alternatively, you can always force the
> > > > > notebook to do an HTML rendering (useful for large dataframes):
> > > >
> > > > from IPython.core.display import HTML
> > > > display(HTML('''any HTML string e.g. dataframe.to_html()'''))
> > > >
> > > > I hope that helps.
> > > >
> > > > Best,
> > > > Niko
> > > >
> > > > > On May 7, 2013 at 10:02 AM Markus Hartenfeller
> > > > > <markus.hartenfel...@molecularhealth.com> wrote:
> > > >
> > > > > > Hi Nikolas,
> > > > > >
> > > > > > I had a first look at the PandasTools package: very cool! I
> > > > > > think this is going to be useful for many RDKit users. I'm looking
> > > > > > forward to using it in the future. Thanks for sharing this module.
> > > > > >
> > > > > > I'm having trouble seeing the molecule depictions in the
> > > > > > IPython notebook though (both in tables and when just printing out a
> > > > > > single molecule).
> > > > > >
> > > > > > This code in an IPython notebook
> > > > >
> > > > > from rdkit import Chem
> > > > > from rdkit.Chem import PandasTools
> > > > > m=Chem.MolFromSmiles('N1CCNCC1')
> > > > > print m
> > > > >
> > > > > gives me
> > > > >
> > > > > <img
> > > > > src="
> > > > > <> ...
> > > > >
> > > > > > a very long string with the base64 encoding of the image,
> > > > > > but not the image itself. Plotting from matplotlib works fine. Did I
> > > > > > forget to import something, or could it be a browser issue? I am using
> > > > > > CentOS 6 and Firefox.
> > > > >
> > > > > Thanks in advance.
> > > > >
> > > > > Best,
> > > > > Markus
> > > > >
> > > > >
> > > > > On 04/19/2013 11:56 AM, Nikolas Fechner wrote:
> > > > >
> > > > > > > Dear all,
> > > > > > > We developed a new module ( rdkit.Chem.PandasTools.py )
> > > > > > > that allows RDKit molecule objects to be used directly in
> > > > > > > pandas dataframes. Pandas ( http://pandas.pydata.org/ ) is a
> > > > > > > Python library that offers table-like data containers, which
> > > > > > > are incredibly useful for anything related to data mining.
> > > > > > > Moreover, it integrates nicely with the IPython notebook,
> > > > > > > producing rendered HTML tables for the dataframes. The RDKit
> > > > > > > integration adds molecule-type columns and functionality to
> > > > > > > perform substructure-based row filtering directly on the
> > > > > > > pandas table. Additionally, if a dataframe is exported as
> > > > > > > HTML or shown within an IPython notebook, the molecules in
> > > > > > > the table are rendered as 2D structures.
> > > > > >
> > > > > > The new module is available in the current SF trunk
> > > > > > and contains a doctest header that provides examples of how to use
> > > > > > it.
> > > > > >
> > > > > > I hope some of you find that interesting. As always,
> > > > > > bug reports, comments, ideas... are very much appreciated.
> > > > > >
> > > > > > Best,
> > > > > > Nikolas
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > ------------------------------------------------------------------------------
> > > > > > Precog is a next-generation analytics platform
> > > > > > capable of advanced
> > > > > > analytics on semi-structured data. The platform
> > > > > > includes APIs for building
> > > > > > apps and a phenomenal toolset for data science.
> > > > > > Developers can use
> > > > > > our toolset for easy data analysis & visualization.
> > > > > > Get a free account!
> > > > > >
> > > > > >
> > > > > > http://www2.precog.com/precogplatform/slashdotnewsletter
> > > > > > <http://www2.precog.com/precogplatform/slashdotnewsletter>
> > > > > >
> > > > > >
> > > > > >
> > > > > > _______________________________________________
> > > > > > Rdkit-discuss mailing list
> > > > > > Rdkit-discuss@lists.sourceforge.net
> > > > > > <mailto:Rdkit-discuss@lists.sourceforge.net>
> > > > > >
> > > > > >
> > > > > > https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
> > > > > > <https://lists.sourceforge.net/lists/listinfo/rdkit-discuss>
> > > > > >
> > > > > > > > > > >
> > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > ------------------------------------------------------------------------------
> > > > Learn Graph Databases - Download FREE O'Reilly Book
> > > > "Graph Databases" is the definitive new guide to graph
> > > > databases and
> > > > their applications. This 200-page book is written by three
> > > > acclaimed
> > > > leaders in the field. The early access version is available
> > > > now.
> > > > Download your free book today!
> > > > http://p.sf.net/sfu/neotech_d2d_may_______________________________________________
> > > > <http://p.sf.net/sfu/neotech_d2d_may_______________________________________________>
> > > > Rdkit-discuss mailing list
> > > > Rdkit-discuss@lists.sourceforge.net
> > > > <mailto:Rdkit-discuss@lists.sourceforge.net>
> > > > https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
> > > > <https://lists.sourceforge.net/lists/listinfo/rdkit-discuss>
> > > >
> > > > > > >
> > > > >
> > > > >
> --
> Markus Hartenfeller
> Chemoinformatics Specialist
> Molecular Health GmbH
> Belfortstr. 2
> 69115 Heidelberg
> Germany
> Tel: +49 6221 43851 209
> Fax: +49 6221 43851 100
> Email: markus.hartenfel...@molecularhealth.com
> <mailto:markus.hartenfel...@molecularhealth.com>
> www.molecularhealth.com <http://www.molecularhealth.com>
>
------------------------------------------------------------------------------
Learn Graph Databases - Download FREE O'Reilly Book
"Graph Databases" is the definitive new guide to graph databases and
their applications. This 200-page book is written by three acclaimed
leaders in the field. The early access version is available now.
Download your free book today! http://p.sf.net/sfu/neotech_d2d_may
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss