Dear Niko,

I was exactly looking for this functionality, great work!

A few follow-up questions:
* frame.set_index('_Name') did not work, but there is a name set in the SD 
file.
* Is there a way to load in only a specified list of SD tags? (I didn't 
find a "names" parameter for LoadSDF)
* frame.head() and frame.describe() show a property "ID" that is not present 
in my SD file. Where does it come from?
* frame.describe() does not show the basic statistics of the SD file.

Are the last three points due to the fact that PandasTools.LoadSDF offers 
less functionality than pandas' read_table?
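
To make the second and fourth points concrete, here is a minimal plain-Python sketch of what I have in mind. It is not how RDKit or pandas do it; the names read_sd_tags, to_number and keep are hypothetical and made up for illustration. It collects the SD tags of each record, optionally restricted to a given list, and converts numeric-looking values to floats. The conversion matters if the SD properties arrive as strings (an assumption about LoadSDF on my side), because describe() only reports mean, std, etc. for numeric columns.

```python
import io


def read_sd_tags(handle, keep=None):
    """Collect SD data items per record.

    keep: optional collection of tag names to retain (None keeps every tag).
    The molecule title (first line of each record) is stored under "_Name".
    """
    records, current, tag = [], {}, None
    first_line = True
    for raw in handle:
        line = raw.rstrip("\n")
        if first_line:
            current["_Name"] = line.strip()
            first_line = False
        elif line.startswith("$$$$"):
            # Record separator: store the record and start a new one.
            records.append(current)
            current, tag, first_line = {}, None, True
        elif line.startswith(">") and "<" in line:
            # Data header such as ">  <cas>": extract the tag name.
            name = line[line.index("<") + 1:line.rindex(">")]
            tag = name if keep is None or name in keep else None
            if tag is not None:
                current[tag] = ""
        elif tag is not None:
            if not line.strip():
                tag = None  # a blank line ends the current data item
            else:
                current[tag] = current[tag] + "\n" + line if current[tag] else line
    return records


def to_number(value):
    """Return value as float when it parses as one, else return it unchanged."""
    try:
        return float(value)
    except ValueError:
        return value


# Toy two-record SD file, invented for this example.
SAMPLE = """mol1
  toy

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
>  <cas>
50-00-0

>  <mutagenic>
1

$$$$
mol2
  toy

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
>  <cas>
64-17-5

>  <logp>
-0.31

$$$$
"""

all_records = read_sd_tags(io.StringIO(SAMPLE))
cas_only = read_sd_tags(io.StringIO(SAMPLE), keep=["cas"])
```

A list of dicts like this could be handed to pd.DataFrame(records), which would name the columns after the SD tags; applying to_number per column before calling describe() should then give the usual statistics for the numeric tags.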


Cheers & big thanks again,
Paul


> 
> Hi Paul,
> I am not sure whether it is easy to get the pandas read_table 
> function to handle SD files. However, some basic functionality 
> for this is already built into the PandasTools module. 
> If you check the doctest header, there is a small example. Basically, 
> 
> frame = PandasTools.LoadSDF(sdfFile, smilesName='SMILES', molColName='Molecule', includeFingerprints=True)
> 
> loads the data from an sd-file into a dataframe, such that every 
> molecule entry corresponds to a row with the molecule in the column 
> 'Molecule'. The specified smiles column is generated automatically 
> and every sd-property ends up in a column with the respective 
> property name. Additionally, if a property "_Name" is set for 
> the molecule, it is used as the row identifier. I assume this could 
> be made customisable in the future.
> Is this something you could use? 
> 
> Kind regards,
> Niko
> 
> On Jun 30, 2013, at 5:10 PM, paul.czodrow...@merckgroup.com wrote:
> 
> Dear RDKitters,
> 
> I was wondering if anyone has looked into reading an SD file into a 
> Pandas data frame, with syntax similar to this:
> 
> data = pd.read_table(open('whatever.smi','r'), header=None,
>                      names=['smiles','cas','mutagenic'])
> 
> Ideally, "names" would be automatically set according to the SD tags.
> 
> 
> Cheers & Thanks,
> Paul


_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
