Hi all,
Yesterday I tried out the Pubmed modules.
First, some background:
I want to use the RDKit Pubmed modules to query the PubChem
Compound, Substance, BioAssay, etc. databases
in order to answer the following question:
In which assays was 'aspirin' tested, and what are the names of the
protein targets in these assays?
Here's what I did:
import Dbase.Pubmed.Searches, Dbase.Pubmed.QueryParams
query=Dbase.Pubmed.QueryParams.details()
query['db']='pccompound'
query['term']='aspirin'
print '...searching for term %s' %(query['term'])
res1=Dbase.Pubmed.Searches.GetSearchIds(query,url='http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi')
# This query returns the compound IDs of aspirin. For the sake of
# brevity I'll just take the first ID and query elink for the assays
# that tested this compound:
query=Dbase.Pubmed.QueryParams.details()
query['db']='pcassay'
query['dbfrom']='pccompound'
query['linkname']='pccompound_pcassay'
query['Id']=res1[0]
res2=Dbase.Pubmed.Searches.GetSearchIds(query,url='http://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi')
print res2
# Now res2 contains the IDs of every assay that my beloved aspirin
# was tested in (I'll just use assay ID 1490).
# What I tried next is to use the 'GetRecords' method to retrieve
# assay info from the pcassay database.
# (By default the 'GetRecords' method looks up PubMed, but how do I
# change that to pcassay, and what is the bioassay analogue of
# a PubMed SummaryRecord? And by the way: what is the
# structure/common thread of the eutils documentation? :-) )
# Anyway, here's an unsuccessful attempt to get the name of the protein
# (it should be 'phosphopantetheinyl transferase' or so):
query=Dbase.Pubmed.QueryParams.details()
query['db']='pcassay'
res3=Dbase.Pubmed.Searches.GetRecords(['1490'],query,url='http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi')
print res3
# Obviously, setting the database key in the query dictionary is not
# enough to get an assay record :-(
# So how do I make the last step, i.e. getting the name of the protein
# target of AID 1490?
# The Python documentation for the GetRecords method mentions a 'conn'
# parameter that I did not understand. Do I have to change it?
# It may rather be a problem of my understanding of the pcassay record
# formats and the eutils functionality,
# but I did not find the relevant sections in the documentation.
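To make clearer what I am after, here is a minimal sketch of the three raw requests as I understand them, bypassing the Dbase.Pubmed wrappers entirely. It only builds the request URLs and parses an ID list; it does not contact the server. The helper names (build_url, linked_ids) are my own and not part of RDKit, the sample XML is hand-written to illustrate the elink layout rather than copied from a real response, and using esummary with db=pcassay for the last step is an assumption on my part.

```python
# Sketch only: builds E-utilities request URLs and parses an elink reply.
# Nothing here touches the network or uses the RDKit Dbase.Pubmed modules.
try:
    from urllib.parse import urlencode   # Python 3
except ImportError:
    from urllib import urlencode         # Python 2
import xml.etree.ElementTree as ET

EUTILS_BASE = 'http://eutils.ncbi.nlm.nih.gov/entrez/eutils/'


def build_url(tool, params):
    """Return the request URL for one E-utility (hypothetical helper)."""
    # Sort the parameters so the resulting URL is reproducible.
    return EUTILS_BASE + tool + '.fcgi?' + urlencode(sorted(params.items()))


def linked_ids(xml_text):
    """Return the IDs found under LinkSetDb/Link/Id in an elink reply."""
    root = ET.fromstring(xml_text)
    return [node.text for node in root.findall('.//LinkSetDb/Link/Id')]


# Hand-written sample, NOT a real server response. 2244 is a placeholder
# compound ID; 1490 is the assay ID used in the script above.
SAMPLE_ELINK = """<eLinkResult><LinkSet>
  <DbFrom>pccompound</DbFrom>
  <IdList><Id>2244</Id></IdList>
  <LinkSetDb>
    <DbTo>pcassay</DbTo>
    <LinkName>pccompound_pcassay</LinkName>
    <Link><Id>1490</Id></Link>
  </LinkSetDb>
</LinkSet></eLinkResult>"""

# Step 1: find the compound IDs for the search term.
search_url = build_url('esearch', {'db': 'pccompound', 'term': 'aspirin'})

# Step 2: follow the compound -> assay links for one compound ID.
link_url = build_url('elink', {'dbfrom': 'pccompound', 'db': 'pcassay',
                               'linkname': 'pccompound_pcassay',
                               'id': '2244'})

# Step 3 (assumption): ask for a summary of one assay record.
assay_ids = linked_ids(SAMPLE_ELINK)
summary_url = build_url('esummary', {'db': 'pcassay', 'id': assay_ids[0]})

print(search_url)
print(link_url)
print(summary_url)
```

Note that in this layout the linked assay IDs sit under LinkSetDb/Link/Id, while the IdList element only repeats the compound ID that was sent. What I still do not know is which field of the pcassay record holds the protein target name.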
Thanks in advance for any help,
Markus