Hi rdkiters,

By popular demand, I have started working on a function that exports a
pandas DataFrame to xlsx with molecule images embedded.
Because of how xlsx files are written, the code is not optimal. The most
annoying thing about this implementation is that it has to write all images
to the hard drive before it packs them into the xlsx file (and deletes them
at the end). I checked two Python xlsx libraries and both save images that
way. If someone finds a better solution, please share it.
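
For readers who want to picture the write-then-delete pattern described above, here is a minimal sketch using only the standard library. It is not the code from the pull request: the names `with_temp_images` and `pack` are hypothetical, and `pack` stands in for whatever step actually builds the workbook from the image files.

```python
import os
import tempfile


def with_temp_images(images, pack):
    """Write each image to a temporary directory, call pack(paths),
    and let the directory (and all images) be deleted afterwards.

    images: iterable of PNG data as bytes (assumed to be rendered already)
    pack:   hypothetical callback that receives the list of file paths
    """
    with tempfile.TemporaryDirectory(prefix="mol_images_") as tmpdir:
        paths = []
        for index, png_bytes in enumerate(images):
            path = os.path.join(tmpdir, f"mol_{index}.png")
            with open(path, "wb") as handle:
                handle.write(png_bytes)
            paths.append(path)
        # pack() runs while the files still exist; cleanup happens on exit.
        return pack(paths)


# Usage with a stand-in packing step that only records what it saw.
seen_paths = []


def fake_pack(paths):
    seen_paths.extend(paths)
    return [os.path.exists(p) for p in paths]


existed_during_pack = with_temp_images([b"first", b"second"], fake_pack)
```

The context manager removes the files even if the packing step raises, so no explicit cleanup code is needed.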

The dimensions of cells with images are not optimal because Excel is weird.
:) From the xlsxwriter docs: "The width corresponds to the column width
value that is specified in Excel. It is approximately equal to the length
of a string in the default font of Calibri 11. Unfortunately, there is no
way to specify “AutoFit” for a column in the Excel file format."
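
To make the sizing problem concrete, here is a rough sketch of converting an image size in pixels into Excel column-width and row-height units. The constants (about 7 pixels per character plus 5 pixels of padding for Calibri 11, and 0.75 points per pixel) are an approximation I am assuming here, not something taken from the pull request, and the function names are hypothetical.

```python
def pixels_to_col_width(pixels):
    """Approximate Excel column width (character units) for a pixel width.

    Assumes the default Calibri 11 font: roughly 7 pixels per character
    plus 5 pixels of cell padding. Very narrow columns use a simpler ratio.
    """
    if pixels <= 12:
        return pixels / 12.0
    return (pixels - 5) / 7.0


def pixels_to_row_height(pixels):
    """Approximate Excel row height in points (assumes 96 DPI, 72 pt/inch)."""
    return pixels * 0.75


# Cell size for a 200x200 pixel molecule image.
col_width = pixels_to_col_width(200)
row_height = pixels_to_row_height(200)
# With xlsxwriter these values would be passed to worksheet.set_column()
# and worksheet.set_row() respectively.
```

Because the result depends on the font actually used when the file is opened, the cells may still come out slightly too large or too small.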

It crashes if the value of a cell is of the wrong type, so use
df['value'].astype() to fix incorrectly assigned types.
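
A small illustration of that type fix, assuming a column named 'value' that was read in as strings (the DataFrame contents are made up for the example):

```python
import pandas as pd

# Hypothetical data: the numeric column arrived as strings.
df = pd.DataFrame({
    "name": ["aspirin", "caffeine"],
    "value": ["1.5", "42"],
})

# Convert the column to floats before exporting.
df["value"] = df["value"].astype(float)
```

If a column contains entries that cannot be converted, astype() raises an error, so such values need to be cleaned up first.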

The resulting files work nicely in Office 365 (standalone and web app), but
for some reason they don't display correctly in LibreOffice (after row ~125
it stacks all the images).

I made a pull request on GitHub: https://github.com/rdkit/rdkit/pull/371
Demo:
http://nbviewer.ipython.org/github/Team-SKI/snippets/blob/master/IPython/rdkit_hackaton/XLSX%20export.ipynb
Demo xlsx file:
https://github.com/Team-SKI/snippets/blob/master/IPython/rdkit_hackaton/demo.xlsx

Regards,
Samo