Version 0.5.0 of server package released

Peter Ansell Fri, 08 May 2009 01:11:50 -0700

Hi all,

The next version of the server software has been released on SourceForge. [1]


It contains a number of changes that will hopefully make it more useful for the 
tasks we want to do with linked RDF queries.

One major change is the introduction of content negotiation, which has been 
tested for N3 (using text/rdf+n3) and RDF/XML (using application/rdf+xml). It 
was made possible this quickly after the last release by reusing the content 
negotiation code from Pubby, the driver behind the DBpedia web interface and 
URI resolution mechanism. It is also currently possible to request the N3 
format explicitly by prefixing the URL path with /n3/. See [2] for an example. 
The ability to request RDF/XML explicitly will be added in future.
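
As a rough client-side illustration of the two ways of asking for N3, here is 
a minimal Python sketch. The helper names are made up for this example and are 
not part of the server package; the host and identifier are taken from [2]. The 
request is only built here, not sent.

```python
from urllib.request import Request

BASE = "http://qut.bio2rdf.org"  # public mirror used in the links of this email

# Media types named in the announcement.
MEDIA_TYPES = {
    "n3": "text/rdf+n3",
    "rdfxml": "application/rdf+xml",
}


def negotiated_request(identifier, fmt):
    """Build (but do not send) a GET request that asks for fmt via Accept."""
    return Request(f"{BASE}/{identifier}", headers={"Accept": MEDIA_TYPES[fmt]})


def explicit_n3_url(identifier):
    """URL that selects N3 through the /n3/ path prefix instead of Accept."""
    return f"{BASE}/n3/{identifier}"


req = negotiated_request("geneid:14456", "n3")
print(req.full_url, req.get_header("Accept"))
print(explicit_n3_url("geneid:14456"))
```

Passing the request object to urllib.request.urlopen would perform the actual 
fetch.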

Another change that will hopefully be useful is the introduction of clear 
RDF-level error messages when either the syntax of a URI is not recognised, or 
the syntax was recognised but no providers were relevant to the URI. See [3], 
[4] and [5] for a demonstration of the error messages.

There is also the ability to page through the results, which is necessary when 
there are more than 2000 results for a query from a particular endpoint. To use 
the paging facility the URI needs to be prefixed by /pageoffsetNN/, where NN is 
a number indicating which page you would like to look at. Results are not 
currently ordered, but in the short term it is reasonable to expect them to be 
consistent enough between requests to page through all of the results. Ordered 
queries take a lot longer than unordered queries, so it is unlikely that the 
public mirrors will ever introduce ordered queries. Examples of paging URLs 
are [6] and [7].
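
A small sketch of how a client might build such paging URLs, assuming the 
prefix layout shown in [6] and [7]. The function name is hypothetical, and 
whether page numbering starts at 0 or 1 is not stated here, so the sketch only 
rejects negative numbers.

```python
BASE = "http://qut.bio2rdf.org"  # public mirror used in the links of this email


def paged_url(identifier, page, n3=False):
    """Return the URL for one page of results, optionally forcing N3."""
    if page < 0:
        raise ValueError("page must not be negative")
    parts = []
    if n3:
        parts.append("n3")  # /n3/ comes before /pageoffsetNN/, as in [6]
    parts.append(f"pageoffset{page}")
    parts.append(identifier)
    return BASE + "/" + "/".join(parts)


print(paged_url("chr:10090-chr14", 2))
print(paged_url("chr:10090-chr14", 2, n3=True))
```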

There is also the ability to get an RDF document describing what actions would 
be taken for a particular query. It is interoperable with the /n3/ and 
/pageoffsetNN/ URI manipulations, so URIs like [8] can be made up and resolved. 
This RDF document is set up to contain all of the information the client needs 
to then complete the query with their own network resources if necessary. In 
future, clients should be able to use this functionality without having to 
keep a local copy of the configuration on hand, although a distributed 
configuration idea is also in the works for sometime in the future. Currently 
the configuration is distributed read-only from [9]. The [9] URL has also been 
made content negotiable for HTML, RDF/XML and N3 content types, defaulting to 
HTML if the content type is not recognised by the Sesame Rio library. It can 
still be accessed in a particular format without content negotiation by 
appending /html, /n3 or /rdfxml.
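
To make the URL shapes concrete, here is a hedged Python sketch that composes 
a query plan URL in the prefix order seen in [8] (n3, then queryplan, then 
pageoffsetNN) and a configuration URL with an explicit format suffix. The 
function names are illustrative only, and whether other prefix orders are 
accepted by the server is not something this email confirms.

```python
BASE = "http://qut.bio2rdf.org"  # public mirror used in the links of this email
CONFIG_URL = BASE + "/admin/configuration"
CONFIG_FORMATS = ("html", "n3", "rdfxml")  # suffixes listed in the announcement


def query_plan_url(identifier, page=None, n3=False):
    """URL of the RDF document describing the actions taken for a query."""
    parts = []
    if n3:
        parts.append("n3")
    parts.append("queryplan")
    if page is not None:
        parts.append(f"pageoffset{page}")
    parts.append(identifier)
    return BASE + "/" + "/".join(parts)


def configuration_url(fmt=None):
    """Configuration URL; fmt=None leaves the format to content negotiation."""
    if fmt is None:
        return CONFIG_URL
    if fmt not in CONFIG_FORMATS:
        raise ValueError(f"unknown format: {fmt}")
    return f"{CONFIG_URL}/{fmt}"


print(query_plan_url("chr:10090-chr14", page=2, n3=True))
print(configuration_url("rdfxml"))
```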

Since the last release the GeoSpecies dataset has also been partially 
integrated, although it doesn't seem to have a SPARQL endpoint so currently it 
is only available for basic construct queries. [10] Not all of the namespaces 
inside the GeoSpecies dataset have rules for normalisation to Bio2RDF URI 
syntax, but the rest will be integrated eventually.

The order of normalisation rules is now respected when applying them, with 
lower numbers being applied before higher numbers. Rules with the same order 
number cannot be relied on to be applied in a consistent sequence if they 
overlap syntactically.
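
The ordering behaviour can be pictured with the following minimal Python 
sketch. The Rule class, the regular expressions and the example.org URI are 
all invented for illustration and do not reflect the actual configuration file 
syntax. Python's sort happens to keep ties in input order, which is exactly 
the kind of incidental behaviour that should not be relied on for rules 
sharing an order number.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    order: int         # lower numbers are applied first
    pattern: str       # regular expression matched against the URI
    replacement: str   # text substituted for the match


def normalise(uri, rules):
    """Apply every rule to the URI in ascending order of rule.order."""
    for rule in sorted(rules, key=lambda r: r.order):
        uri = re.sub(rule.pattern, rule.replacement, uri)
    return uri


# Hypothetical rules, listed out of order on purpose.
rules = [
    Rule(200, r"^http://bio2rdf\.org/GENEID:", "http://bio2rdf.org/geneid:"),
    Rule(100, r"^http://example\.org/gene/", "http://bio2rdf.org/GENEID:"),
]

print(normalise("http://example.org/gene/14456", rules))
```

Here the rule numbered 200 only has something to match after the rule numbered 
100 has run, so swapping the order numbers would give a different result.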

The MyExperiment SPARQL endpoint [11] has also been integrated into Bio2RDF 
since the last release, so for instance, a user in the MyExperiment system can 
be resolved using [12], but there are also other things like workflows which 
could in the future provide valuable interconnections for the linked RDF web. 
I think further integration with MyExperiment would be invaluable to the 
future of the Bio2RDF network.

Partial support for InChI resolution has also made it into this release, 
although there are some syntax bugs with rdf.openmolecules.net that stop Sesame 
from parsing the resulting RDF/XML, so InChIs are only being resolved using 
PubChem so far. Some InChIs, particularly those which contain + signs, will 
also be unresolvable for the time being because the Apache HTTPD, Apache 
Tomcat and URLRewrite stack we are using URL-decodes the plus signs to spaces 
somewhere along the line, and it is hard to figure out what configuration is 
needed to avoid it happening. It was hard enough figuring out how to make 
encoded slashes (%2F) usable inside identifiers (they need to be double 
encoded as %252F to avoid detection by the HTTPD/Tomcat/URLRewrite 
algorithms), so I am not sure what progress will be made with the plus signs 
in the near future.
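
The two encoding issues can be reproduced in isolation with Python's standard 
library. This is only a sketch of the general percent-encoding behaviour, not 
of what the HTTPD/Tomcat/URLRewrite stack does internally; the helper name and 
the sample identifiers are chosen purely for illustration.

```python
from urllib.parse import unquote, unquote_plus


def double_encode_slashes(identifier):
    """Replace each slash with %252F, the double-encoded form of %2F."""
    return identifier.replace("/", "%252F")


encoded = double_encode_slashes("InChI=1S/H2O/h1H2")
print(encoded)

# One decoding pass still leaves %2F in place; a second pass restores "/".
print(unquote(encoded))
print(unquote(unquote(encoded)))

# Form-style decoding turns "+" into a space, which is the plus sign problem.
print(unquote_plus("InChI=1S/p+1"))
```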

DOI resolution has also been integrated from both the UniProt Citations 
database and BioGuid.info, but I think it will likely only be fully useful for 
science-related DOIs.

There are currently 368 namespaces known by the server software for Bio2RDF, 
with 231 information provider configurations (although the real number of 
providers is less than this, because a few providers are duplicated to enable 
reverseconstruct and unpercentencoded queries where necessary). The number of 
combinations that are currently encapsulated by the server configuration can be 
found at [13].

It is hard to believe so much could be packed into a new release two weeks 
after the last release! 

See the complete list of changes at [14].

If anyone has alternative configurations that they have made up using the 
software, I am more than willing to include them in the distribution so others 
can utilise them. The configuration file syntax is still in flux, and is 
unlikely to become stable until the 1.0 release, but the changes are mostly 
additions to support new features, so configurations based on older software 
versions are still useful and can be migrated to the new scheme.

Cheers,

Peter

[1] https://sourceforge.net/project/platformdownload.php?group_id=142631
[2] http://qut.bio2rdf.org/n3/geneid:14456
[3] http://qut.bio2rdf.org/dummyquery/go:0004535
[4] http://qut.bio2rdf.org/image/geneid:14936
[5] http://qut.bio2rdf.org/GO:0004535
[6] http://qut.bio2rdf.org/n3/pageoffset2/chr:10090-chr14
[7] http://qut.bio2rdf.org/pageoffset2/chr:10090-chr14
[8] http://qut.bio2rdf.org/n3/queryplan/pageoffset2/chr:10090-chr14
[9] http://qut.bio2rdf.org/admin/configuration
[10] http://qut.bio2rdf.org/geospecies_bioclass:13
[11] http://rdf.myexperiment.org/sparql
[12] http://qut.bio2rdf.org/myexp_user:1177
[13] http://qut.bio2rdf.org/admin/namespaceproviders
[14] http://bio2rdf.wiki.sourceforge.net/Road+map


