Re: Owning URIs (Was: Yet Another LOD cloud browser)

Daniel Schwabe Tue, 02 Jun 2009 08:35:00 -0700

Sherman Monroe wrote:

Daniel,
I see some interesting concepts worth exploring here, e.g. using windows (with paging inside the window). But as I refine my query, there isn't any apparent context that orients me in the data. E.g. how does one box/set relate to the others.

The dependency between the boxes is recorded, but it is not a simple matter to expose it in a simple way to the user. Each box (set) really depends on a chain of previous operations, so in general it may be a very long list of function compositions.

I think the biggest contribution is not so much the interface aspects that you refer to, but the way you can form the various sets (boxes) through various operations: the SPO, which allows you to do arbitrary matches for <s,p,o> triples, plus union/intersection/difference, plus de-referencing, plus a faceted interface on either an arbitrary set of chosen properties (applied to any set) or automatically generated facets.
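For readers who prefer code, here is a minimal sketch of what an arbitrary <s,p,o> match could look like. It is not Explorator's implementation: the `Triple` type, the `match` function, and the `ex:` data are all hypothetical and only illustrate leaving any position of the pattern open as a wildcard.

```python
from typing import Iterable, NamedTuple, Optional, Set


class Triple(NamedTuple):
    s: str
    p: str
    o: str


def match(triples: Iterable[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> Set[Triple]:
    """Return every triple that fits the pattern; None means 'any value'."""
    return {
        t for t in triples
        if (s is None or t.s == s)
        and (p is None or t.p == p)
        and (o is None or t.o == o)
    }


# Hypothetical toy data, not taken from any real repository.
TRIPLES = [
    Triple("ex:d1", "ex:category", "hypoglycemic"),
    Triple("ex:d2", "ex:category", "hypoglycemic"),
    Triple("ex:d3", "ex:category", "antidepressant"),
    Triple("ex:d2", "ex:foodInteraction", "avoid alcohol"),
]

# Fix the predicate and object, leave the subject open.
hypoglycemic = match(TRIPLES, p="ex:category", o="hypoglycemic")

# Fix only the subject: everything known about ex:d2.
about_d2 = match(TRIPLES, s="ex:d2")

# The subjects of a match form a set (a "box") that later operations can use.
hypoglycemic_subjects = {t.s for t in hypoglycemic}
```

Each result is an ordinary set, so union, intersection and difference can be applied to it directly.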

Here is a simple interesting scenario:

Find a drug for hypoglycemia that can be prescribed to a known alcohol abuser.

Click on menu->repositories, add the drugbank sparql endpoint (http://www4.wiwiss.fu-berlin.de/drugbank/sparql), limit 50. (Sometimes we've been getting timeouts; just try again and eventually it works. We have a locally loaded version of these repositories, but we haven't finished building the index for the full text search yet; we are still figuring out how to build this index in Virtuoso.)


search for hypoglycemic (call it Set A)
search for avoid alcohol (call it Set B)

Click on A, click on the intersection symbol, click on set B, click on "=" (call it set C).

Click on A, click on S, click on "-".

You've computed the set of drugs associated with hypoglycemic, intersected it with the set of drugs which should not be taken with alcohol, and then subtracted that intersection from the set of drugs associated with hypoglycemic, leaving the hypoglycemic drugs that may be taken with alcohol.
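The same steps can be written as plain set algebra. This is only a sketch: the drug names are placeholders, not real DrugBank entries, and the sets stand in for whatever the two searches would return.

```python
# Hypothetical search results (placeholder names, not real data).
set_a = {"drug1", "drug2", "drug3"}   # search for "hypoglycemic"
set_b = {"drug2", "drug3", "drug4"}   # search for "avoid alcohol"

# Set C: hypoglycemic drugs that should not be taken with alcohol.
set_c = set_a & set_b

# Final set: hypoglycemic drugs not flagged "avoid alcohol".
result = set_a - set_c

# Subtracting the intersection gives the same answer as subtracting B directly.
assert result == set_a - set_b

print(sorted(result))  # prints ['drug1']
```

The result is only as reliable as the text search behind Set B: a drug missing from B has no "avoid alcohol" note in the data, which is weaker than a statement that it is safe.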

If you refine the scenario a bit, you can repeat the same reasoning for "antidepressant", to get the set of drugs which are antidepressants and may be taken with alcohol.

Refining further (but here I don't have the medical knowledge to formulate it properly), I could try to determine which diabetes and antidepressant drugs could be prescribed together (I'd need to determine dangerous interactions between candidates obtained in the previous steps).

and so on...

I notice you're using Sesame, do you think it can scale? I tried selecting several repositories at once, but the system seems to hang awhile (couple of minutes) before returning results.

We use both Sesame (through its Java interface) and Virtuoso (regular http SPARQL interface), depending on the size of the dataset (e.g., dbpedia is on Virtuoso). You may have also realized you can add any arbitrary external endpoint as well.

The problems you report are not really due to Explorator, but rather to the engines themselves and the particular repositories. If you try to issue the same queries (notice there are many queries necessary to present the information in the form it appears on the screen), you will see they also take a while to respond. In fact, we'd be very interested in seeing how to optimize such queries. Samur, my former student, will elaborate on this in a separate message, for those interested.

(We might take this offline if it becomes too specific, although I feel the problems we face are the same ones anyone who wishes to build "user friendly" interfaces to RDF data would face...)
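For anyone who wants to time a query directly against an endpoint, here is a rough sketch using only the Python standard library. The query text is an assumption (a simple regex filter with LIMIT 50), not one of the queries Explorator actually issues, and the helper names are made up for illustration. The request is sent as an HTTP GET with a `query` parameter, as the SPARQL protocol defines.

```python
import time
from urllib.parse import parse_qs, urlencode, urlsplit
from urllib.request import Request, urlopen

ENDPOINT = "http://www4.wiwiss.fu-berlin.de/drugbank/sparql"

# Assumed example query; Explorator's real queries are not shown in this thread.
QUERY = (
    'SELECT DISTINCT ?s WHERE { ?s ?p ?o . '
    'FILTER regex(str(?o), "hypoglycemic", "i") } LIMIT 50'
)


def build_sparql_url(endpoint: str, query: str) -> str:
    """Encode the query as the 'query' parameter of a GET request."""
    return endpoint + "?" + urlencode({"query": query})


def run_query(endpoint: str, query: str, timeout: float = 60.0) -> bytes:
    """Send the query and return the raw response body."""
    request = Request(
        build_sparql_url(endpoint, query),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urlopen(request, timeout=timeout) as response:
        return response.read()


def timed(fn, *args):
    """Call fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


# Usage (needs network access, and the endpoint may time out):
#   body, seconds = timed(run_query, ENDPOINT, QUERY)
#   print(len(body), "bytes in", round(seconds, 2), "s")
```

Running the same query several times gives a first impression of how much of the delay comes from the endpoint rather than from the interface on top of it.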


Cheers
D
