Sorry for the confusion, I truncated the string myself in the mail
because I did not want to paste the whole beast. The fields contain the
full strings and the tag is closed.
Best,
Markus
On 05/07/2013 01:25 PM, Nikolas Fechner wrote:
When developing the module I occasionally had problems with *very*
long png strings, because the pandas maximal column width applies to
the string, which is what is stored in the dataframe, before the image
rendering. As a result, the truncated png string was shown in the
table (exactly the "..." ending shown in your example).
You could try manually setting the maximal width very high (e.g.
pandas.set_option("display.max_colwidth",100000)). This should be done
automatically by PandasTools, which sets it to len(PNG)+100 for
the longest string found during rendering, but because this rarely had
an impact I could very well have overlooked some problems with this
strategy.
Best,
Niko
On May 7, 2013 at 1:13 PM Markus Hartenfeller
<[email protected]> wrote:
Thanks again for your reply. That's what I have tried:
from rdkit import Chem
from rdkit.Chem import AllChem
import pandas as pd
from rdkit.Chem import PandasTools
from rdkit.Chem.Draw import IPythonConsole
from IPython.core.display import HTML
df = PandasTools.LoadSDF('test.sdf', includeFingerprints=False)
display(HTML(df.to_html()))
So it is a dataframe and .to_html() works fine in general. I see all
sdf fields. It's just that the molecule column contains string values
of this kind:
<img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAASwAAAEsCAYAAAB ...
The notebook somehow does not realize that it is an html tag with an
image, but instead renders it as a normal string (just like before
with the single molecule).
Best wishes,
Markus
On 05/07/2013 12:57 PM, Nikolas Fechner wrote:
Just for clarification, are you trying to render a dataframe or a
series/single column? The pandas series object has no to_html()
method and is therefore rendered as string only. Moreover, if you
select a single column, e.g. 'ROMol' from a dataframe by df['ROMol']
you will get a series object that is rendered as string. If you
select a set of columns you get a dataframe, for which the HTML
rendering should work. The latter also works for a single column if
you enclose it in double brackets, df[['ROMol']], which will give
a single-column dataframe. This took me some time to figure out and
the silent conversion that sometimes occurs can be quite confusing.
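The single-bracket versus double-bracket behaviour can be checked directly in pandas. A small sketch, in which the table contents are hypothetical and plain strings stand in for RDKit molecule objects:

```python
import pandas as pd

# Hypothetical table; plain strings stand in for RDKit molecule objects.
df = pd.DataFrame({"ROMol": ["mol_a", "mol_b"], "ID": [1, 2]})

single_brackets = df["ROMol"]    # a pandas Series
double_brackets = df[["ROMol"]]  # a one-column DataFrame

print(type(single_brackets).__name__)
print(type(double_brackets).__name__)

# The one-column DataFrame can be exported to HTML like any other DataFrame.
html = double_brackets.to_html()
```

If a Series is already at hand, its to_frame() method should give the same one-column dataframe.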
Best,
Niko
On May 7, 2013 at 11:33 AM Markus Hartenfeller
<[email protected]> wrote:
Thanks for your help, Niko. Importing the IPythonConsole from rdkit
+ removing the 'print' command did the trick for a single molecule :)
Unfortunately, molecules in data frames are still shown as strings,
even when forcing html rendering. I will try to get this working
and report here if I make any progress. In case somebody has
already faced the same problem please let me know.
Best,
Markus
On 05/07/2013 10:27 AM, Nikolas Fechner wrote:
Hi Markus,
glad you think it could be useful :). Regarding the problem, there
are two things: You have to import the RDKit IPythonConsole to
enable the molecule rendering (from rdkit.Chem.Draw import
IPythonConsole) and if you trigger the output using 'print' the
notebook will always use string rendering (AFAIK). Just try 'm'
alone (instead of 'print m'). Alternatively, you can always force
the notebook to do an HTML rendering (useful for large dataframes):
from IPython.core.display import HTML
display(HTML('''any HTML string e.g. dataframe.to_html()'''))
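The difference between 'print' and the notebook's own rendering can be illustrated without RDKit. The sketch below uses a made-up class, FakeMol, which is not part of RDKit and does not show how RDKit implements its rendering; it only demonstrates the _repr_html_ hook that IPython looks for on displayed objects:

```python
class FakeMol:
    """Hypothetical stand-in for an object with a rich notebook display."""

    def __str__(self):
        # print() only ever sees this plain-text form.
        return "<FakeMol as plain text>"

    def _repr_html_(self):
        # IPython calls this hook when the object is the last expression
        # in a cell or is passed to display().
        return "<b>FakeMol rendered as HTML</b>"


m = FakeMol()
text_output = str(m)           # what print would write
html_output = m._repr_html_()  # what the notebook would render
```

In a notebook cell, a bare 'm' on the last line would go through the hook and show the HTML, while printing 'm' would only show the plain string.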
I hope that helps.
Best,
Niko
On May 7, 2013 at 10:02 AM Markus Hartenfeller
<[email protected]> wrote:
Hi Nikolas,
I had a first look at the PandasTools package: very cool! I think
this is going to be useful for many rdkit users. I'm looking
forward to using it in the future. Thanks for sharing this module.
I'm having trouble seeing the molecule depictions in the ipython
notebook though (both in tables and by just printing out a single
molecule).
This code in an ipython notebook
from rdkit import Chem
from rdkit.Chem import PandasTools
m=Chem.MolFromSmiles('N1CCNCC1')
print m
gives me
<img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAASwAAAEsCAYAAAB ...
a very long string with the base64 encoding of the image, but not
the image itself. Plotting from matplotlib works fine. Did I
forget to import something, or could it be a browser issue? I am
using CentOS 6 and Firefox.
Thanks in advance.
Best,
Markus
On 04/19/2013 11:56 AM, Nikolas Fechner wrote:
Dear all,
We developed a new module (rdkit.Chem.PandasTools.py) that
allows for using RDKit molecule objects directly in pandas
dataframes. Pandas (http://pandas.pydata.org/) is a python
library that offers table-like data containers, which are
incredibly useful for anything related to data mining. Moreover,
it integrates nicely with the ipython notebook, producing
rendered HTML tables for the dataframes. The RDKit integration
allows molecule-type columns and provides functionality to
perform substructure-based row filtering directly on the pandas
table. Additionally, if a dataframe is exported as HTML or shown
within an ipython notebook, the molecules in the table are
rendered as 2D structures.
The new module is available in the current SF trunk and contains
a doctest header that provides examples of how to use it.
I hope some of you find that interesting. As always, bug
reports, comments, ideas... are very much appreciated.
Best,
Nikolas
_______________________________________________
Rdkit-discuss mailing list
[email protected]
<mailto:[email protected]>
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss