Re: The king is dressed in void

Andreas Langegger Thu, 12 Jun 2008 09:51:01 -0700


Hi there,

I've done data integration based on SPARQL in a "restricted" domain, not web-scale (see SemWIQ presentation at ESWC08 [1]). But the issues are similar. We need some descriptions about sites, owner, license, etc. In our case this is provided upon registration of data sources at the mediator which is maintaining a site catalog.


For me there are two points that count for voiD:
1. we just need those meta-data about the maintainer of endpoints

2. we need some simple "pre-compiled" statistical information just because of performance (I think so). Of course, data should be self-describing and you could fetch all data and collect your own stats using SPARQL, but this will produce unnecessary load on servers. Additionally, SPARQL currently does not support aggregate functions (at least not in the spec), which would let you retrieve already aggregated stat data.
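
To make point 2 concrete, here is a minimal sketch of what a client has to do today without aggregates: ask for every predicate occurrence and tally them locally. It assumes the endpoint returns the SPARQL JSON results format for a query like SELECT ?p WHERE { ?s ?p ?o }; the function name and the sample data are made up for illustration.

```python
import json
from collections import Counter


def predicate_counts(sparql_json: str) -> Counter:
    """Tally predicate usage from a SPARQL JSON result.

    Assumes the result came from: SELECT ?p WHERE { ?s ?p ?o }
    so every matching triple is shipped to the client as one binding.
    """
    bindings = json.loads(sparql_json)["results"]["bindings"]
    return Counter(row["p"]["value"] for row in bindings)


# Made-up sample response, only to show the shape of the data.
SAMPLE = json.dumps({
    "head": {"vars": ["p"]},
    "results": {"bindings": [
        {"p": {"type": "uri", "value": "http://xmlns.com/foaf/0.1/knows"}},
        {"p": {"type": "uri", "value": "http://xmlns.com/foaf/0.1/knows"}},
        {"p": {"type": "uri", "value": "http://xmlns.com/foaf/0.1/name"}},
    ]},
})

if __name__ == "__main__":
    for predicate, count in predicate_counts(SAMPLE).most_common():
        print(count, predicate)
```

The point is that the server has to serialize one row per triple just so the client can throw most of it away, which is the load a pre-compiled summary would avoid.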

One possible way to achieve this is to provide voiD data as part of the actual graph exposed by the SPARQL endpoint. SPARQL can be used to retrieve meta data without the need for an additional meta-description layer. However, I think the real problem is that sometimes this is not (easily) possible. In such cases a simple file resource could add the required information. But how should the search engine, client, etc. know where to find meta data? First try using SPARQL, then file location by convention? I don't know...
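
Just to illustrate the "first SPARQL, then file by convention" idea, a rough sketch follows. Everything in it is an assumption: the path "/void.rdf" is a hypothetical convention (nothing has been agreed), and the two fetchers are passed in as plain callables so the lookup order can be shown without committing to a query or a file format.

```python
from typing import Callable, Optional, Tuple
from urllib.parse import urljoin

# Hypothetical well-known location, in the spirit of /robots.txt.
CONVENTIONAL_PATH = "/void.rdf"

# A fetcher takes a URL and returns a description, or None if nothing is there.
Fetcher = Callable[[str], Optional[str]]


def find_metadata(endpoint_url: str,
                  ask_endpoint: Fetcher,
                  get_file: Fetcher) -> Optional[Tuple[str, str]]:
    """Return (where it was found, description), or None if both steps fail."""
    # Step 1: ask the SPARQL endpoint itself for its description.
    description = ask_endpoint(endpoint_url)
    if description:
        return (endpoint_url, description)

    # Step 2: fall back to a file at a conventional location on the same host.
    file_url = urljoin(endpoint_url, CONVENTIONAL_PATH)
    description = get_file(file_url)
    if description:
        return (file_url, description)

    return None
```

A client would call it as find_metadata("http://example.org/sparql", my_sparql_lookup, my_http_get), where both helpers are whatever the client already uses to talk to endpoints and fetch files.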


For voiD there are two things:
1. definition of a metaLOD vocabulary

2. specifying a convention of "where to find meta data" (like "/robots.txt" or "/sitemap.xml")


1. is easier than 2.

Regarding statistics: I'm working on a statistics monitor which can be attached to a SPARQL endpoint (at the same host or at least in the subnet). It will periodically generate stats for the data stored behind the SPARQL endpoint. Because it works via SPARQL, it can be used regardless of the implementation (which may actually be a wrapper like D2R-Server). I basically need this for query optimization in SemWIQ.
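
Roughly, one round of such a monitor could look like the sketch below. This is not the actual SemWIQ code, only an illustration under assumptions: run_query is a hypothetical callable that sends a query to the endpoint and returns the ?s values of one result page, and the two statistics are picked just as examples.

```python
from typing import Callable, Dict, List

# Page through all triples using plain SPARQL LIMIT/OFFSET, so no
# implementation-specific features are needed on the endpoint side.
PAGE_QUERY = (
    "SELECT ?s WHERE {{ ?s ?p ?o }} "
    "ORDER BY ?s LIMIT {limit} OFFSET {offset}"
)


def snapshot(run_query: Callable[[str], List[str]],
             page_size: int = 1000) -> Dict[str, int]:
    """Collect one round of statistics by paging over the endpoint."""
    triples = 0
    subjects = set()
    offset = 0
    while True:
        page = run_query(PAGE_QUERY.format(limit=page_size, offset=offset))
        triples += len(page)       # one row per matching triple
        subjects.update(page)      # distinct subjects seen so far
        if len(page) < page_size:  # a short page means we reached the end
            break
        offset += page_size
    return {"triples": triples, "distinct_subjects": len(subjects)}
```

Running snapshot on a timer near the endpoint and publishing the resulting numbers is what would spare remote clients from doing the same paging over the network.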

It would be great if I could use results from the voiD approach. That's why I'd like to get involved.


Regards
AndyL

[1] http://semwiq.faw.uni-linz.ac.at


On Jun 12, 2008, at 8:49 AM, Hausenblas, Michael wrote:



Giovanni,

I think I see your argument here and I tend to agree up to a certain
point. What makes me wonder is that it is *you* stating this ;)

Seriously, I very much believe in self-descriptive documents, etc. I do
prefer simple things that work. However, voiD is just the next logical
step after semantic sitemaps (it actually is thought to extend it in
terms of using the sc:datasetURI as the entry point, see also [1]). So,
just in case you want to argue against your own proposal, please tell
me so ;)

I guess you're right that many things can be done already and I'm
positive that we should use the current layer, then advance to the
next. But what if, say, the current layer is missing something? To whom
is it up to decide when we are done? I guess it is up to the people
using it.

So, let's not judge a book by its cover, please.

voiD intends to formalise what is already used in practice. I myself
have built some applications that exploit the LOD datasets and others
certainly have done as well. As it seems, there is a certain need to do
what we have done up to now mainly in our brains, in a more automated
way. There we are: a clear demand for something, a proposal to solve it.

It is as simple as it is. If it turns out that LOD dataset providers
don't use it - fine. They might use other methods, then, or nothing at
all.

I see two issues with what you propose, however - granularity &
scalability. Currently we have identified two use cases for voiD:

1. automatic creation of a map (such as http://sindice.com/map)
2. topic-based selection of LOD datasets

I guess you're kinda familiar with (1). Now, think about scalability.
Today we have a bunch of LOD data sets or other sources - tomorrow we
may have 10k and next year maybe a million. Next, when looking at (2),
I'd like to have a reliable, simple method to determine a 'good' entry
point into the LOD cloud. As soon as I'm in, I can follow my nose using
basically what you propose.

Finally, the reactions so far tell us that voiD seems to be what people
were waiting for in terms of easy to use and powerful enough to have an
added value.

Concluding, it is not 'Giovanni vs. voiD', it is Giovanni + voiD for a
better, finally a *real* Semantic Web.

Cheers,
        Michael


[1] http://sw.joanneum.at/voiD/img/void_discovery.png

----------------------------------------------------------
Michael Hausenblas, MSc.
Institute of Information Systems & Information Management
JOANNEUM RESEARCH Forschungsgesellschaft mbH

http://www.joanneum.at/iis/
----------------------------------------------------------

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of Giovanni Tummarello
Sent: Thursday, June 12, 2008 12:08 AM
To: Hausenblas, Michael
Cc: public-lod@w3.org; Semantic Web
Subject: The king is dressed in void

Wasn't RDF all about being self describing?

if i say "giovanni works in research" .. do i really need a
vocabulary that says "this rdf contains information that describes
what people claim to be working on"? that's a suicide. If this is the
case (which i totally don't believe) then the king is seriously naked
and there is no hope whatsoever that RDF is going to have any
relevance (and there i say it)

to find one such file, instead of having to invent, agree and mark up,
i'd say it's much easier to do something like [1] or [2].
this is not marketing. it's a plea to NOT jump on more layers of stuff
when the previous layers still really have to show their value and
adoptability. Solve some simple use cases first, then jump to the
more complex ones.

Giovanni

[1] http://demo.sindice.com/search?q=*+%3Chttp%3A%2F%2Fwww.w3.org%2F2006%2Fvcard%2Fns%23title%3E+%27research%27&qt=advanced

or
http://sindice.com/search?q=http%3A%2F%2Fxmlns.com%2Ffoaf%2F0.1%2Fknows&qv=http%3A%2F%2Frichard.cyganiak.de%2Ffoaf.rdf%23cygri&qt=ifp

(documents which contain statements in which someone claims to
know richard)

[2] http://forum.sindice.com/showthread.php?t=10


On Wed, Jun 11, 2008 at 8:54 AM, Hausenblas, Michael
<[EMAIL PROTECTED]> wrote:



Dear interested people in linked datasets,

As you may have gathered, we have recently initiated a discussion on how
to discover the linked dataset cloud [1]. The result of our impromptu
kick-off meeting at the ESWC08 is literally voiD - the 'vocabulary of
interlinked datasets' (see notes at [2]). This is a proposal for a
vocabulary and a mechanism how it should be deployed and used. We have
some first slides available at [3] as well.

Please consider commenting on it either by replying to this message
and/or sharing your thoughts with us at the Wiki [2].

Cheers,
     Michael

[1] http://richard.cyganiak.de/2007/10/lod/
[2] http://community.linkeddata.org/MediaWiki/index.php?MetaLOD#Kick-off_meeting_at_ESWC08
[3] http://www.slideshare.net/mediasemanticweb/full-eswc08-lightning-talk


----------------------------------------------------------
Michael Hausenblas, MSc.
Institute of Information Systems & Information Management
JOANNEUM RESEARCH Forschungsgesellschaft mbH
Steyrergasse 17, A-8010 Graz, AUSTRIA

<office>
phone: +43-316-876-1193 (fax:-1191)
mobile: +43-699-1876-1165
e-mail: [EMAIL PROTECTED]
skype: mhausenblas
  web: http://www.joanneum.at/iis/

<see also>
       http://sw-app.org/about.html
       http://riese.joanneum.at
----------------------------------------------------------



----------------------------------------------------------------------
Dipl.-Ing.(FH) Andreas Langegger
Institute for Applied Knowledge Processing
Johannes Kepler University Linz
A-4040 Linz, Altenberger Straße 69
http://www.langegger.at
