Dear Daniel,
Regarding publications (and your off-list inquiry about bioGUID), I've
tweaked bioGUID to serve some simple RDF, given a publication
identifier such as a DOI or a PubMed id. I like simplicity, so I use
Dublin Core and PRISM as vocabularies to describe a journal article.
Example URIs are:
http://bioguid.info/pmid:9628005
http://bioguid.info/doi:10.1097/MPH.0b013e318186533f
In a web browser these display HTML, but in a tool like http://dataviewer.zitgist.com/
you get RDF.
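As a rough sketch, a script could ask for the RDF rather than the HTML like this. It assumes the switch between the two is driven by the HTTP Accept header, which is a guess based on the behaviour described above rather than something confirmed here. The helper name rdf_request is made up for illustration, and the request is only built, not sent.

```python
import urllib.request


def rdf_request(uri):
    """Build (but do not send) a request that asks for RDF/XML.

    Assumption: the server chooses HTML or RDF from the Accept header.
    """
    return urllib.request.Request(uri, headers={"Accept": "application/rdf+xml"})


req = rdf_request("http://bioguid.info/pmid:9628005")
print(req.full_url, req.get_header("Accept"))
# To fetch for real: urllib.request.urlopen(req).read()
```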
Under the hood bioGUID is an OpenURL resolver that wraps CrossRef and
PubMed, but also has access to JSTOR, and a database of other
literature that I'm building (mainly taxonomic -- I developed this as
part of a project on biological taxonomy).
To do a search for an article you can do an OpenURL query (and you can
ask for JSON or RDF to be returned), e.g.
http://bioguid.info/openurl?genre=article&title=Molecular
Phylogenetics and Evolution&volume=42&spage=157&display=rdf
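Because the journal title contains spaces, it is safer to build the query programmatically so the parameters are percent-encoded. A minimal sketch using only the parameters shown in the example above (the function name openurl_query is hypothetical, and the display value for JSON is not shown here, so it is not guessed):

```python
from urllib.parse import urlencode, quote

BASE = "http://bioguid.info/openurl"


def openurl_query(title, volume, spage, display="rdf"):
    """Return an OpenURL query URL with the parameters percent-encoded."""
    params = {
        "genre": "article",
        "title": title,
        "volume": volume,
        "spage": spage,
        "display": display,
    }
    # quote_via=quote encodes spaces as %20 rather than "+"
    return BASE + "?" + urlencode(params, quote_via=quote)


url = openurl_query("Molecular Phylogenetics and Evolution", 42, 157)
print(url)
```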
This is all a bit crude (I hacked the RDF stuff together this
afternoon), but it may be of use.
Regards
Rod
On 10 Jan 2009, at 10:14, Daniel O'Connor wrote:
Hey all,
I'm Daniel O'Connor, a software engineer from Australia.
At the moment I'm trying to get a lot of food nutrition data
together from a whole bunch of different sources and create a bit of
an ontology; publish it as RDF; and make sure it's chock-full of
linked data goodness; and I could use your help, advice, pointers
and encouragement.
Use cases include things like shopping, diet / fitness applications,
cooking, and much more.
what did you eat today? -> hey, that's only 75% of your recommended
daily energy intake
what is the approximate food energy in this recipe?
tell me the fattiest food I'm eating and replace it with one with
more protein (but the same energy content)
The data sources I've got on my list so far are:
USDA's SR21 food nutrients data (public domain)
Australia's NUTTAB 06 data (not so public domain)
Canada's CNF data (haven't delved into it in depth)
The typical format provided is CSV, so I'm going through and mapping
those CSV exports back into a RDBMS (php + mysql / pgsql / etc),
then providing tools to generate RDF out, and publishing the static
results.
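To make the CSV-to-RDF step concrete, here is a minimal sketch in Python rather than the PHP + database pipeline described above, and it skips the RDBMS stage entirely. The column names (food_id, name, energy_kcal), the example.org namespace, the energyKcal property and the sample row are all invented for illustration; only dc:title is a real vocabulary term.

```python
import csv
import io

BASE = "http://example.org/nutrition/usda/"  # placeholder namespace
DC_TITLE = "http://purl.org/dc/elements/1.1/title"
ENERGY = "http://example.org/nutrition/schema#energyKcal"  # hypothetical property
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"


def escape(text):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def rows_to_ntriples(csv_text):
    """Turn CSV rows (food_id, name, energy_kcal) into N-Triples lines."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}{row['food_id']}>"
        lines.append(f'{subject} <{DC_TITLE}> "{escape(row["name"])}" .')
        lines.append(f'{subject} <{ENERGY}> "{row["energy_kcal"]}"^^<{XSD_DECIMAL}> .')
    return lines


# Made-up sample row, not real USDA data.
sample = 'food_id,name,energy_kcal\nX1,"Cheese, blue",123.4\n'
triples = rows_to_ntriples(sample)
print("\n".join(triples))
```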
You can see (and get) the code from:
http://freebase-owl.googlecode.com/svn/trunk/nutrition/
and read a bit more about installing from:
http://clockwerx.blogspot.com/2009/01/generating-nutritional-data-rdf-from.html
and view samples of the output:
USDA:
http://lauken.com/doconnor/nutrition/usda/1006.rdf
NUTTAB:
http://lauken.com/doconnor/nutrition/nuttab/01A10027.rdf
Ontology (draft!):
http://www.lauken.com/doconnor/nutrition/0.1/schema.rdf
There's a lot of work for me here, and if anyone here has knowledge
or a helping hand, I'd love to hear from you, especially regarding
the ones in bold.
Resolve licensing agreements with Aust. government for rights to
reproduce data (in progress)
Model Canadian data
Find or create a suitable ontology for Nutrition data (I would have
expected some common terms from the bio-rdf community, but I don't
have the background to know what I'm looking for)
Model the USDA, NUTTAB and Canadian extensions as appropriate
Find or create (ick, hope not) an ontology for measurements in
relation to typical nutrition measurements (again, there are no
semantic web concepts for milligrams, kilocalories, etc - not even
in DBpedia. timbl did some very high level concepts of what a Gram /
etc is; but it's not quite the same)
Find or create a list of terms used in nutrition data (shorthand/
abbreviations) - e.g. CBODF = "Carbohydrate by difference", but I
can't seem to find a good list of these outside of the USDA data
itself.
Find or create a journal publications ontology (Dublin Core might do
it though; or some other bibliographic ontology) - suggestions?
Find or create science terms ontology (Paper, Subject, Experiment,
Samples, etc) - anyone?
Create owl:sameAs links to DBpedia topics in some automated fashion
- this is tricky, because a lot of the data is written as "Cheese,
blue" and is much more granular than Wikipedia articles about Cheese.
Create owl:sameAs links to Freebase topics in some automated fashion
- ditto
Interlink Canadian, NUTTAB, USDA data in some automated fashion -
similar - different naming schemes make using dc:title as an IFP a
bit annoying.
Render full sets of RDF for each
Publish these somewhere - http://lauken.com/doconnor/ is not
suitable for anything more than a sandbox
Provide human interfaces as appropriate - if anyone wanted to create
shiny XSLT -> XHTML perhaps; or PHP glue...
Set up a SPARQL endpoint (I have a hell of a time doing this in my
development environment, so this might not happen) - HELP!
Provide unit test coverage for all generator tools
Refactor lots
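For the linking items in the list above (DBpedia, Freebase, and interlinking the three datasets), one naive starting point is to normalise the inverted names before comparing them. The sketch below is only a heuristic under that assumption: both function names are made up, and anything it pairs up would still need checking before an owl:sameAs link is asserted.

```python
import re


def candidate_label(name):
    """'Cheese, blue' -> 'blue cheese': reverse the comma-separated parts."""
    parts = [p.strip().lower() for p in name.split(",") if p.strip()]
    return " ".join(reversed(parts))


def match_key(name):
    """Order-insensitive key: sorted, de-duplicated lowercase word tokens."""
    return " ".join(sorted(set(re.findall(r"[a-z0-9]+", name.lower()))))


print(candidate_label("Cheese, blue"))
print(match_key("Cheese, blue") == match_key("Blue cheese"))
```

candidate_label gives a string to look up against article titles, while match_key lets two datasets that order the words differently land on the same key.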
---------------------------------------------------------
Roderic Page
Professor of Taxonomy
DEEB, FBLS
Graham Kerr Building
University of Glasgow
Glasgow G12 8QQ, UK
Email: [email protected]
Tel: +44 141 330 4778
Fax: +44 141 330 2792
AIM: [email protected]
Facebook: http://www.facebook.com/profile.php?id=1112517192
Twitter: http://twitter.com/rdmpage
Blog: http://iphylo.blogspot.com
Home page: http://taxonomy.zoology.gla.ac.uk/rod/rod.html