Re: Making human-friendly linked data pages more human-friendly

2009-09-17 Thread Paul A Houle
   I think there are a few scenarios here.

   In my mind,  dbpedia.org is a site for tripleheads.  I use it all the
time when I'm trying to understand how my systems interact with data from
dbpedia -- for that purpose,  it's useful to see a reasonably formatted list
of triples associated with an item.  A view that's isomorphic to the triples
is useful for me there.

   Yes, better interfaces for browsing dbpedia/wikipedia ought to be built
-- navigation along axes of type, time, and space would be obviously
interesting -- but making a usable interface for this involves challenges
that are outside the scope of dbpedia.org. The point of linked data is that
anybody who wants to can build a better browsing interface for dbpedia.

   Another scenario is a site that's ~primarily~ a site for humans and
secondarily a site for tripleheads and machines, for instance,

http://carpictures.cc/

   That particular site is built on an object-relational system which has
some (internal) RDF features.  The site was created by merging dbpedia,
freebase and other information sources,  so it exports linked data that
links dbpedia concepts to images with very high precision.  The primary
vocabulary is SIOC,  and the RDF content for a page is ~nearly~ isomorphic
to the content of the main part of the page (excluding the sidebar.)

   However, there is content that's currently exclusive to the human
interface. For instance, the UI is highly visual: for every automobile
make and model, heuristics try to pick an image that is both striking and
representative of the brand. This selection is materialized in the
database. There's also information designed to give humans an information
scent to help them navigate, a concept that isn't so well-defined for
webcrawlers. Then there's the sidebar, which has several purposes, one of
them being a navigational system for humans that just isn't so relevant
for machines.

   There really are two scenarios I see for linked data users relative to
this system at the moment:  (i) a webcrawler crawls the whole site,  or (ii)
I provide a service that,  given a linked data URL,  returns information
about what ontology2 knows about the URL.  For instance,  this could be used
by a system that's looking for multimedia connected with anything in dbpedia
or freebase.  Perhaps I should be offering an NT dump of the whole site,
but I've got no interest in offering a SPARQL endpoint.

   As for friendly interfaces,  I'd say take a look analytically at a page
like

http://carpictures.cc/cars/photo/car_make/21/Chevrolet

   What's going on here?  This is being done on a SQL-derivative system that
has a query builder, but you could do the same thing with SPARQL. We'd
imagine that there are some predicates like

hasCarModel
hasPhotograph
hasPreferredThumb

   starting with a URL that represents a make of car (a nameplate, like
Chevrolet), we'd traverse the hasCarModel relationship to enumerate the
models, and then do a COUNT(*) over hasPhotograph relationships to get a
picture count for each model. Generically, constructing a page like this
involves doing joins: showing not just the triples directly linked to a
named entity, but information that can be found by traversing the graph.
People shouldn't be shy about introducing their own predicates;  the very
nature of inference in RDF points to creating a new predicate as the basic
solution to most problems.  In this case,  hasPreferredThumb is a perfectly
good way to materialize the result of a complex heuristic.
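To make the traversal concrete, here is a minimal sketch in Python over a toy in-memory triple list. The predicate names (hasCarModel, hasPhotograph) come from the discussion above; the subject/object identifiers and the data are invented for illustration.

```python
from collections import Counter

# Toy triple store: a list of (subject, predicate, object) tuples.
triples = [
    ("make/Chevrolet", "hasCarModel", "model/Corvette"),
    ("make/Chevrolet", "hasCarModel", "model/Impala"),
    ("model/Corvette", "hasPhotograph", "photo/1"),
    ("model/Corvette", "hasPhotograph", "photo/2"),
    ("model/Impala", "hasPhotograph", "photo/3"),
]

def objects(s, p):
    """Enumerate objects linked from subject s via predicate p."""
    return [o for (s2, p2, o) in triples if s2 == s and p2 == p]

def photo_counts(make):
    """Traverse make -> models -> photographs; count photos per model."""
    return Counter({m: len(objects(m, "hasPhotograph"))
                    for m in objects(make, "hasCarModel")})

print(photo_counts("make/Chevrolet"))

# The equivalent SPARQL (sketched, with a hypothetical vocabulary) would
# group hasPhotograph triples per model:
#   SELECT ?model (COUNT(?photo) AS ?n) WHERE {
#     :Chevrolet :hasCarModel ?model . ?model :hasPhotograph ?photo .
#   } GROUP BY ?model
```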

(One reason I'm sour about public SPARQL endpoints is that I don't want to
damage my brand by encouraging amnesic mashups of my content; a quality
site really needs a copy of its own data so it can make additions,
corrections, etc. One major shortcoming of Web 2.0 has been self-serving
API terms of service that forbid systems from keeping a memory -- for
instance, eBay doesn't let you make a price tracker or a system that keeps
dossiers on sellers. Del.icio.us makes it easy to put data in, but you
can't get anything interesting out. Web 3.0 has to make a clean break from
this.)

Database-backed sites traditionally do this with a mixture of declarative
SQL code and procedural code to create a view... It would be interesting to
see RDF systems where the graph traversal is specified and transformed into
a website declaratively.


Re: Making human-friendly linked data pages more human-friendly

2009-09-17 Thread Kingsley Idehen

Paul A Houle wrote:

[...]

Paul,

A summary for the ages.

This is basically an aspect of the whole Linked Data meme that is lost 
on too many.


Thank you very much!!

--


Regards,

Kingsley Idehen   Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO 
OpenLink Software Web: http://www.openlinksw.com








Re: Making human-friendly linked data pages more human-friendly

2009-09-17 Thread Paul A Houle
On Thu, Sep 17, 2009 at 7:23 AM, Kingsley Idehen kide...@openlinksw.com wrote:



 This is basically an aspect of the whole Linked Data meme that is lost on
 too many.


I've got to thank the book by Allemang and Hendler

http://www.amazon.com/Semantic-Web-Working-Ontologist-Effective/dp/0123735564

for setting me straight about data modeling in RDF.  RDFS and OWL are based
on a system of duck typing that turns conventional object or
object-relational thinking inside out.  It's not necessarily good or bad,
but it's really different.  Even though types matter, predicates come
before types, because using predicate A can make object B become a member
of type C even if B is never explicitly asserted to be in class C.
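This is the rdfs:domain style of entailment. A minimal sketch of the mechanism, with an invented predicate and class for illustration:

```python
# RDFS-style domain inference: asserting a triple with predicate A makes
# its subject a member of class C, without the subject ever being typed
# explicitly. The declarations and data below are hypothetical.
RDF_TYPE = "rdf:type"

domains = {"hasCarModel": "CarMake"}   # rdfs:domain declarations

def infer_types(triples):
    """Return the rdf:type triples implied by the domain declarations."""
    inferred = set()
    for s, p, o in triples:
        if p in domains:
            inferred.add((s, RDF_TYPE, domains[p]))
    return inferred

facts = [("Chevrolet", "hasCarModel", "Corvette")]
print(infer_types(facts))  # Chevrolet becomes a CarMake by inference
```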

Looking at the predicates in RDFS or OWL without understanding the whole,
it's pretty easy to think "oh, this isn't too different from a relational
database" and miss the point that RDFS/OWL is much more about inference
(creating new triples) than it is about constraints or the physical layout
of the data.

One consequence of this is that using an existing predicate can drag in a
lot more baggage than you might want;  it's pretty easy to get the inference
engine to infer too much,  and false inferences can snowball like a
katamari.

A lot of people are in the habit of reusing vocabularies and seem to forget
that the natural answer to most RDF modeling problems is to create a new
predicate.  OWL has a rich set of mechanisms that can tell systems that

x A y → x B y

where A is your new predicate and B is a well-known predicate.  Once you
merge two almost-but-not-the-same things by actually using the same
predicate,  it's very hard to fix the damage.  If you use inference,  it's
easy to change your mind.

--

It may be different with other data sets,  but data cleaning is absolutely
essential working with dbpedia if you want to make production-quality
systems.

For instance,  all of the time people build bizapps and they need a list of
US states...  Usually we go and cut and paste one from somewhere...  But now
I've got dbpedia and I should be able to do this systematically.  There's a
category in wikipedia for that...

http://en.wikipedia.org/wiki/Category:States_of_the_United_States

if you ignore the subcategories and just take the actual pages,  it's
(almost) what you need,  except for some weirdos like

User:Beebarose/Alabama http://en.wikipedia.org/wiki/User:Beebarose/Alabama

and one state that's got a disambiguator in the name:

Georgia (U.S. state) http://en.wikipedia.org/wiki/Georgia_%28U.S._state%29

It's not hard to clean up this list,  but it takes some effort,  and
ultimately you're probably going to materialize something new.
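The cleanup described above can be sketched in a few lines; the filtering rules (drop namespaced pages like User:..., strip a trailing parenthesized disambiguator) are hypothetical simplifications of what a production job would need:

```python
import re

# Cleaning the Wikipedia category members mentioned above: drop User:
# pages and other non-article namespaces, strip disambiguators like
# "Georgia (U.S. state)". Sample input is illustrative.
raw = [
    "Alabama",
    "User:Beebarose/Alabama",   # a user draft, not a state
    "Georgia (U.S. state)",     # disambiguated title
]

def clean_state_names(titles):
    states = []
    for t in titles:
        if ":" in t:            # skip User:, Talk:, etc. namespaces
            continue
        # drop a trailing "(...)" disambiguator
        states.append(re.sub(r"\s*\(.*\)$", "", t))
    return states

print(clean_state_names(raw))
```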

These sorts of issues even turn up in highly clean data sets.  Once I built
a webapp that had a list of countries in it,  this was used to draw a
dropdown list,  but the dropdown list was excessively wide,  busting the
layout of the site.  Now,  the list was really long because there were a few
authoritarian countries with long and flowery names.  The transformation
from

Democratic People's Republic of Korea → North Korea

improved the usability of the site while eliminating Orwellian language.
This kind of fit and finish is needed to make quality sites,  and semweb
systems are going to need automated and manual ways of fixing this so that
Web 3.0 looks like a step forward,  not a step back.
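One simple way to materialize that fit-and-finish fix is a table of display-name overrides consulted at render time; the mapping entries below are illustrative:

```python
# Materialized display-name overrides for the dropdown described above.
display_name = {
    "Democratic People's Republic of Korea": "North Korea",
    "Lao People's Democratic Republic": "Laos",
}

def label_for(country):
    """Prefer the short display name; fall back to the official one."""
    return display_name.get(country, country)

print(label_for("Democratic People's Republic of Korea"))
print(label_for("France"))
```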


Re: Making human-friendly linked data pages more human-friendly

2009-09-17 Thread Kingsley Idehen

Paul A Houle wrote:

> On Thu, Sep 17, 2009 at 7:23 AM, Kingsley Idehen kide...@openlinksw.com wrote:
>
>> This is basically an aspect of the whole Linked Data meme that is
>> lost on too many.
>
> I've got to thank the book by Allemang and Hendler
>
> http://www.amazon.com/Semantic-Web-Working-Ontologist-Effective/dp/0123735564
>
> for setting me straight about data modeling in RDF.  RDFS and OWL are
> based on a system of duck typing that turns conventional object or
> object-relational thinking inside out.  It's not necessarily good or
> bad, but it's really different.  Even though types matter, predicates
> come before types, because using predicate A can make object B become
> a member of type C even if B is never explicitly asserted to be in
> class C.

Schema Last vs. Schema First :-) An RDF virtue that once broadly 
understood, across the more traditional DBMS realms, will work wonders 
for RDF based Linked Data appreciation.


> Looking at the predicates in RDFS or OWL without understanding the
> whole, it's pretty easy to think "oh, this isn't too different from a
> relational database" and miss the point that RDFS/OWL is much more
> about inference (creating new triples) than it is about constraints
> or the physical layout of the data.

It's about a concrete conceptual layer that isn't blind to context. In
some quarters this is actually called a Context Model Database [1].


> One consequence of this is that using an existing predicate can drag
> in a lot more baggage than you might want; it's pretty easy to get
> the inference engine to infer too much, and false inferences can
> snowball like a katamari.

Yes, but the katamari can be confined to a specific data space that is 
owned and controlled by a particular person, who has a specific world 
view. As long as the axioms are partitioned across data spaces, and the 
RDF store is capable of processing within said confines, everyone is 
happy. Trouble starts when the claims become global facts imposed on 
everyone else that has access to the data space.


> A lot of people are in the habit of reusing vocabularies and seem to
> forget that the natural answer to most RDF modeling problems is to
> create a new predicate.  OWL has a rich set of mechanisms that can
> tell systems that
>
> x A y → x B y
>
> where A is your new predicate and B is a well-known predicate.  Once
> you merge two almost-but-not-the-same things by actually using the
> same predicate, it's very hard to fix the damage.  If you use
> inference, it's easy to change your mind.

Yep!  The trouble is that OWL-appreciation is low, but ultimately, this 
is where the magic really lies. This is how URIs (Data Source Names) 
will be distinguished based on the data highway smarts they expose etc.. 
Basically, I am traveling from Boston to Detroit, which route (amongst 
many) gets me there quickest, based on my specific preferences etc..


--

> It may be different with other data sets, but data cleaning is
> absolutely essential working with dbpedia if you want to make
> production-quality systems.

Data cleansing is required because there are no absolute truths and we
all see the same thing differently. What RDF facilitates, above all
else, is the ability to protect our natural tendencies (seeing the same
things differently) by inverting the traditional model, where inertia is
introduced as a result of different views or perspectives.


Heterogeneity is the spice of life for a reason. Even our DNA rewards us 
when we fuse afar (rather than inbreed) etc. :-)


> For instance, all of the time people build bizapps and they need a
> list of US states...  Usually we go and cut and paste one from
> somewhere...  But now I've got dbpedia and I should be able to do
> this systematically.  There's a category in wikipedia for that...
>
> http://en.wikipedia.org/wiki/Category:States_of_the_United_States
>
> if you ignore the subcategories and just take the actual pages, it's
> (almost) what you need, except for some weirdos like
>
> User:Beebarose/Alabama
> http://en.wikipedia.org/wiki/User:Beebarose/Alabama
>
> and one state that's got a disambiguator in the name:
>
> Georgia (U.S. state)
> http://en.wikipedia.org/wiki/Georgia_%28U.S._state%29


> It's not hard to clean up this list, but it takes some effort, and
> ultimately you're probably going to materialize something new.

Yes, something new, in a new data space that is still plugged into the Web.


Re: Making human-friendly linked data pages more human-friendly

2009-09-17 Thread Paul A Houle
On Thu, Sep 17, 2009 at 12:19 PM, Kingsley Idehen kide...@openlinksw.com wrote:

 Schema Last vs. Schema First :-) An RDF virtue that once broadly
 understood, across the more traditional DBMS realms, will work wonders
 for RDF based Linked Data appreciation.


That's the conclusion that I'm coming to.

I've been thinking about the question: what would Cyc look like if it were
started today?

Cyc took the Schema First approach to the human memome project: as a
result it put a lot of work into upper and middle ontologies which don't
seem all that useful to many observers.  Despite a great deal of effort put
into avoiding 'representational thorns', it got caught up in them anyway.

A modern approach would be to start with a huge amount of data over various
domains and to construct schemas using a mix of statistical inference and
human input.  The role of the upper ontology would be reduced here because,
in general, it isn't always necessary to mesh up two randomly chosen
domains, say: bus schedules, anime, psychoanalysis, particle physics...

Now,  somebody might want to apply the system to study the relationship of
anime with psychoanalysis;  that could be approached by constructing a
metatheory (i) based on those particular domains,  and (ii) conditioned by
the application that the system is being put to,  that is,  on the bit,
connected via a feedback loop to some means of evaluating the system's
motion towards a goal.

Representational Thorns get bypassed here because the system is free to
develop a new representation if an old one fails for a particular task.


Re: Making human-friendly linked data pages more human-friendly

2009-09-17 Thread Kingsley Idehen

Paul A Houle wrote:

[...]

> Representational Thorns get bypassed here because the system is free
> to develop a new representation if an old one fails for a particular
> task.

Yes!
The recent SUMO, WordNet, YAGO, and DBpedia mappings also provide another
mechanism for demonstrating all of this.


Basically, SUMO, OpenCyc, UMBEL, YAGO etc. remain strange second-class
citizens of interest re. LOD, but in due course this will change, since
without these complements, the real power of Linked Data will remain
elusive across all practical dimensions.


Linked Data does provide both substrate and exploitation mechanism for 
ubiquitous smart data.


Note to all: you cannot have ubiquitous smart data without an ability to 
deal with context fluidity; someone will always have a different point 
of view, and that should be an acceptable part of the deal re. Linked 
Data driven metadata :-)


--


Regards,

Kingsley Idehen   Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO 
OpenLink Software Web: http://www.openlinksw.com








Re: Making human-friendly linked data pages more human-friendly (was: dbpedia not very visible, nor fun)

2009-09-15 Thread Peter Ansell
It is nice to have pages that appeal to people, but projects that
start from assorted RDF, and do not have a typical single-resource
layout, might find it hard to get a meaningful sort arrangement in
general. I will be thinking about this in the future though, as the
current layout of Bio2RDF HTML pages is based on being easy to debug
at an RDF level, as opposed to easy to follow for a non-involved user.
One of the main troubles with Bio2RDF is that not every page is
designed to contain only one resource, as it is also an interface for
RESTful query results; hence it requires at least three sorts of
information to match the RDF triples that make up the information
content. There is the idea of namespaces, though, for pages that are
only made up to represent single resources, so if there are
namespace-specific rendering options they could be used in those cases
without much trouble at all, given that Bio2RDF borrowed its interface
style from Pubby and is run quite heavily using Velocity templates that
are easy to mix and match.

I definitely don't want Bio2RDF pages that can't be displayed without
JavaScript turned on, like OpenCalais, though.

Cheers,

Peter

2009/9/15 Matthias Samwald samw...@gmx.at:
 A central idea of linked data is, in my understanding, that every resource
 has not only an HTTP-resolvable RDF description of itself, but also a
 human-friendly rendering that can be viewed in a web browser. With the
 increasing popularity of RDFa, the URIs of these resources are not only
 hidden away in triplestores, but become increasingly exposed on web pages.
 People want to click on them, and, hopefully, not all of these people come
 from the core community of RDF enthusiasts.

 This means that the HTML rendering of linked data resources might need to
 look a bit sexier than it does today. I dare to say that the Pubby-esque
 rendering of DBpedia pages such as
 http://dbpedia.org/page/Primary_motor_cortex
 is helpful to get a quick overview of the RDF triples about this resource,
 but non-RDF-enthusiasts would not find it very inviting.

 This could be improved by changes in the layout, and possibly a manually
 curated ordering of properties. For example,
 http://d.opencalais.com/er/company/ralg-tr1r/f8a13a13-8dbc-3d7e-82b6-1d7968476cae.html
 definitely looks more inviting than the typical DBpedia page (albeit still a
 bit sterile).

 In the case of DBpedia, it might be better to expose the excellent
 human-readable Wikipedia page for each resource, plus a prominently
 positioned 'show raw data' tab at the top. For other linked data resources
 that are not derived from existing human-friendly web pages, a few stylistic
 changes (à la OpenCalais) already might improve the situation a lot.

 Note that this comment is not intended to be a criticism of DBpedia, but of
 all Linked Data resources that expose HTML descriptions of resources.
 DBpedia is just the most popular example.

 Cheers,
 Matthias Samwald

 DERI Galway, Ireland
 http://deri.ie/

 Konrad Lorenz Institute for Evolution & Cognition Research, Austria
 http://kli.ac.at/



 --
 From: Danny Ayers danny.ay...@gmail.com
 Sent: Tuesday, September 15, 2009 4:03 AM
 To: public-lod@w3.org
 Subject: dbpedia not very visible, nor fun

 It seems I have a Wikipedia page in my name (ok, I only did fact-check
 edits, ok!?). So tonight I went looking for the corresponding triples,
 looking for my ultimate URI...

 Google dbpedia = front page, with news

 on the list on the left is Online Access

 what do you get?

 [[
 The DBpedia data set can be accessed online via a SPARQL query
 endpoint and as Linked Data.

 Contents
 1. Querying DBpedia
 1.1. Public SPARQL Endpoint
 1.2. Public Faceted Web Service Interface
 1.3. Example queries displayed with the Berlin SNORQL query explorer
 1.4. Examples rendering DBpedia Data with Google Map
 1.5. Example displaying DBpedia Data with Exhibit
 1.6. Example displaying DBpedia Data with gFacet
 2. Linked Data
 2.1. Background
 2.2. The DBpedia Linked Data Interface
 2.3. Sample Resources
 2.4. Sample Views of 2 Sample DBpedia Resources
 3. Semantic Web Crawling Sitemap
 ]]

 Yeah. Unless you're a triplehead none of these will mean a thing. Even
 then it's not obvious.

 Could someone please stick something more rewarding near the top! I
 don't know, maybe a Google-esque text entry form field for a regex on
 the SPARQL. Anything but blurb.

 Even being relatively familiar with the tech, I still haven't a clue
 how to take my little query (do I have a URI here?) forward.

 Presentation please.

 Cheers,
 Danny.

 --
 http://danny.ayers.name






Re: Making human-friendly linked data pages more human-friendly (was: dbpedia not very visible, nor fun)

2009-09-15 Thread Richard Cyganiak

Hi Matthias,

Please allow me to present a contrarian argument.

First, there are some datasets that combine linked data output with a  
traditional website, e.g., by embedding some RDFa markup. Of course,  
in that case, all the rules of good web design and information  
presentation still apply, and the site has to first and foremost  
fulfill the visitor's information needs in order to be successful.  
That's self-evident and not what we are talking about here.


Most linked data is different. The main purpose is not to create a web  
site where visitors go to look up stuff. The main purpose is to  
publish data in a re-usable way, in order to allow repurposing of the  
data in new applications.


In that case, the audience for the human-readable versions of the  
RDF data is *not* a visitor that came to the site while googling for  
some bit of information. It's more likely to be a data analyst, mashup  
developer, or integration engineer. So what I suggest is to think of  
these pages not as something that end users see, but rather as  
something akin to Javadoc. Javadoc pages are auto-generated pages that  
describe a public interface of your system. Linked data pages are the  
same, but rather than a Java API, they describe your URI space. And  
unlike Javadoc, they are directly connected to the documented  
artifacts (URIs).


I think that the pages should mostly answer the following questions:  
What concept is identified? What *exactly* is the URI of this concept  
(careful with /html or #this at the end)? Who curates this identifier?  
Can I trust it to be stable? Most linked data pages actually do a  
fairly decent job at answering these.


Every data publisher has limited resources, and spending them on  
prettifying the HTML views is very low-impact. It's much more  
important to increase data quality, publish more data,  improve other  
documentation, and create compelling demos/apps on top of the data.  
The namespace documentation is usually good enough, and the  
geekiness of the pages actually helps to drive home the point that  
it's about *re-using this data elsewhere*, rather than looking at the  
data in the boring old web browser.


That being said, nicer-looking pages that present information in a more  
useful way are of course always better, but that's a somewhat secondary  
problem in the linked data context.


Best,
Richard


On 15 Sep 2009, at 10:08, Matthias Samwald wrote:

[...]

Re: Making human-friendly linked data pages more human-friendly (was: dbpedia not very visible, nor fun)

2009-09-15 Thread Wolfgang Orthuber

Linking of data can be very successful if it is not restricted to RDF
enthusiasts, and in that case the vocabulary can grow enormously. Consider, e.g., the integration of healthcare data.
Existing vocabularies like SNOMED CT
http://www.ihtsdo.org/news/article/view/snomed-ct-and-interoperable-healthcare-conference-tutorials-tuesday-1st-july-2008/
contain several hundred thousand concepts, with increasing tendency.

So if the vocabulary is huge, it is not feasible for the browser software
to know the information needed for a human-readable representation; instead, it could know how to download this
information from the web using the linked data concept.

If HTTP URIs are used as identifiers, the information for a human-readable
representation can be stored at the location the HTTP URI points to.

Are there any rules for this so far?

Best

Wolfgang
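[Editorial aside: the rules Wolfgang asks about are essentially the Linked Data publishing pattern itself: make the HTTP URI dereferenceable and serve, at that location, a description that includes human-readable label properties such as rdfs:label. As a toy illustration only (a real client should use a proper RDF parser), labels could be pulled out of an N-Triples description like this:]

```python
import re

# Naive rdfs:label extractor for N-Triples text. Illustration only;
# real code should use a proper RDF parser.
LABEL_RE = re.compile(
    r'<([^>]+)>\s+'                                       # subject URI
    r'<http://www\.w3\.org/2000/01/rdf-schema#label>\s+'  # rdfs:label predicate
    r'"((?:[^"\\]|\\.)*)"(?:@([A-Za-z-]+))?'              # literal + language tag
)

def extract_labels(ntriples):
    """Return {(subject, language): label} for every rdfs:label triple."""
    return {
        (m.group(1), m.group(3)): m.group(2)
        for m in LABEL_RE.finditer(ntriples)
    }

sample = (
    '<http://dbpedia.org/resource/Primary_motor_cortex> '
    '<http://www.w3.org/2000/01/rdf-schema#label> '
    '"Primary motor cortex"@en .\n'
)
print(extract_labels(sample))
```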

- Original Message - 
From: Richard Cyganiak rich...@cyganiak.de

To: Matthias Samwald samw...@gmx.at
Cc: public-lod@w3.org
Sent: Tuesday, September 15, 2009 12:46 PM
Subject: Re: Making human-friendly linked data pages more human-friendly (was: 
dbpedia not very visible, nor
fun)



Hi Matthias,

Please allow me to present a contrarian argument.

First, there are some datasets that combine linked data output with a  
traditional website, e.g., by
embedding some RDFa markup. Of course,  in that case, all the rules of good web 
design and information
presentation still apply, and the site has to first and foremost  fulfill the 
visitor's information needs in
order to be successful.  That's self-evident and not what we are talking about 
here.

Most linked data is different. The main purpose is not to create a web  site 
where visitors go to look up
stuff. The main purpose is to  publish data in a re-usable way, in order to 
allow repurposing of the  data
in new applications.

In that case, the audience for the human-readable versions of the  RDF data 
is *not* a visitor that came
to the site while googling for  some bit of information. It's more likely to be 
a data analyst, mashup
developer, or integration engineer. So what I suggest is to think of  these 
pages not as something that end
users see, but rather as  something akin to Javadoc. Javadoc pages are 
auto-generated pages that  describe a
public interface of your system. Linked data pages are the  same, but rather 
than a Java API, they describe
your URI space. And  unlike Javadoc, they are directly connected to the 
documented  artifacts (URIs).

I think that the pages should mostly answer the following questions:  What 
concept is identified? What
*exactly* is the URI of this concept  (careful with /html or #this at the end)? 
Who curates this identifier?
Can I trust it to be stable? Most linked data pages actually do a  fairly 
decent job at answering these.

Every data publisher has limited resources, and spending them on  prettifying 
the HTML views is very
low-impact. It's much more  important to increase data quality, publish more 
data,  improve other
documentation, and create compelling demos/apps on top of the data.  The namespace 
documentation is
usually good enough, and the  geekiness of the pages actually helps to drive 
home the point that  it's about
*re-using this data elsewhere*, rather than looking at the  data in the boring 
old web browser.

That being said, of course nicer-looking pages that present  information in a 
more useful way are of course
always better, but  that's a somewhat secondary problem in the linked data 
context.

Best,
Richard


On 15 Sep 2009, at 10:08, Matthias Samwald wrote:


A central idea of linked data is, in my understanding, that every resource has
not only an HTTP-resolvable RDF description of itself, but also a human-friendly rendering that
can be viewed in a web
browser. With the increasing popularity of RDFa, the URIs of these  resources 
are not only hidden away in
triplestores, but become  increasingly exposed on web pages. People want to 
click on them,  and, hopefully,
not all of these people come from the core community  of RDF enthusiasts.

This means that the HTML rendering of linked data resources might  need to look 
a bit sexier than it does
today. I dare to say that the  Pubby-esque rendering of DBpedia pages such as
http://dbpedia.org/page/Primary_motor_cortex
is helpful to get a quick overview of the RDF triples about this  resource, but 
non-RDF-enthusiasts would
not find it very inviting.

This could be improved by changes in the layout, and possibly a  manually 
curated ordering of properties.
For example,
http://d.opencalais.com/er/company/ralg-tr1r/f8a13a13-8dbc-3d7e-82b6-1d7968476cae.html
definitely looks more inviting than the typical DBpedia page (albeit  still a 
bit sterile).

In the case of DBpedia, it might be better to expose the excellent  
human-readable Wikipedia page for each
resource, plus a prominently  positioned 'show raw data' tab at the top. For 
other linked data  resources
that are not derived from existing human

Re: Making human-friendly linked data pages more human-friendly

2009-09-15 Thread Kingsley Idehen

Wolfgang Orthuber wrote:
Linking of data can be very successful, if it is not restricted to RDF 
enthusiasts. In this case the
vocabulary can grow extremely. Consider e.g. integration of healthcare 
data. Existing vocabularies like SNOMED

CT
http://www.ihtsdo.org/news/article/view/snomed-ct-and-interoperable-healthcare-conference-tutorials-tuesday-1st-july-2008/ 


contain several hundred thousand concepts, with increasing tendency.

So if the vocabulary is huge, it is not adequate, that the browser 
software knows about the information for
human readable representation, but it could know how to download this 
information from the web using the

linked data concept.

If http URIs are used as identifier, it is possible to store the 
information for human readable representation

at the location where the http URI points to.

Are there up to now rules for this?

HTML+RDFa in the current pages is coming. Ditto some additional links 
via <link/>.


This satisfies the Linked Data component, the rest is down to the 
principle of: leveraging the ability to explore data about the same 
thing in different ways, via your own context lenses, in line with your 
own world view and data analysis objectives etc..


Linked Data is really about a multi-dimensional data interaction 
experience that is cognizant of the beholder's contextual fluidity, 
above everything else.


Re. the Document Web: Content is King. Whereas in the Web of Linked Data 
case: Context is King :-)



Kingsley

Best

Wolfgang

- Original Message - From: Richard Cyganiak 
rich...@cyganiak.de

To: Matthias Samwald samw...@gmx.at
Cc: public-lod@w3.org
Sent: Tuesday, September 15, 2009 12:46 PM
Subject: Re: Making human-friendly linked data pages more 
human-friendly (was: dbpedia not very visible, nor

fun)



Hi Matthias,

Please allow me to present a contrarian argument.

First, there are some datasets that combine linked data output with 
a  traditional website, e.g., by
embedding some RDFa markup. Of course,  in that case, all the rules 
of good web design and information
presentation still apply, and the site has to first and foremost  
fulfill the visitor's information needs in
order to be successful.  That's self-evident and not what we are 
talking about here.


Most linked data is different. The main purpose is not to create a 
web  site where visitors go to look up
stuff. The main purpose is to  publish data in a re-usable way, in 
order to allow repurposing of the  data

in new applications.

In that case, the audience for the human-readable versions of the  
RDF data is *not* a visitor that came
to the site while googling for  some bit of information. It's more 
likely to be a data analyst, mashup
developer, or integration engineer. So what I suggest is to think of  
these pages not as something that end
users see, but rather as  something akin to Javadoc. Javadoc pages 
are auto-generated pages that  describe a
public interface of your system. Linked data pages are the  same, but 
rather than a Java API, they describe
your URI space. And  unlike Javadoc, they are directly connected to 
the documented  artifacts (URIs).


I think that the pages should mostly answer the following questions:  
What concept is identified? What
*exactly* is the URI of this concept  (careful with /html or #this at 
the end)? Who curates this identifier?
Can I trust it to be stable? Most linked data pages actually do a  
fairly decent job at answering these.


Every data publisher has limited resources, and spending them on  
prettifying the HTML views is very
low-impact. It's much more  important to increase data quality, 
publish more data,  improve other
documentation, and create compelling demos/apps on top of the data.  
The namespace documentation is
usually good enough, and the  geekiness of the pages actually helps 
to drive home the point that  it's about
*re-using this data elsewhere*, rather than looking at the  data in 
the boring old web browser.


That being said, of course nicer-looking pages that present  
information in a more useful way are of course
always better, but  that's a somewhat secondary problem in the linked 
data context.


Best,
Richard


On 15 Sep 2009, at 10:08, Matthias Samwald wrote:

A central idea of linked data is, in my understanding, that every  
resource has not only a HTTP -
resolvable RDF description of itself,  but also a human-friendly 
rendering that can be viewed in a web
browser. With the increasing popularity of RDFa, the URIs of these  
resources are not only hidden away in
triplestores, but become  increasingly exposed on web pages. People 
want to click on them,  and, hopefully,
not all of these people come from the core community  of RDF 
enthusiasts.


This means that the HTML rendering of linked data resources might  
need to look a bit sexier than it does
today. I dare to say that the  Pubby-esque rendering of DBpedia 
pages such as

http://dbpedia.org/page/Primary_motor_cortex
is helpful to get a quick overview

Re: Making human-friendly linked data pages more human-friendly (was: dbpedia not very visible, nor fun)

2009-09-15 Thread Matthias Samwald

Richard wrote:
First, there are some datasets that combine linked data output with a 
traditional website, e.g., by embedding some RDFa markup. Of course,  in 
that case, all the rules of good web design and information  presentation 
still apply, and the site has to first and foremost  fulfill the visitor's 
information needs in order to be successful.  That's self-evident and not 
what we are talking about here.


Indeed. But a typical web page is full of hyperlinks to external sites. In 
the case of pages based on RDFa, this will often mean linking to external 
linked data resources. Many normal web pages, such as blogs, contain several 
links to wikipedia.org. Many linked data resources contain links to 
dbpedia.org. RDFa uses the <a> HTML element to represent object properties, 
and I think there will be an increasing desire to create links to, say, 
DBpedia that are both useful for the human reader of the RDFa page and 
for RDF-aware software. It would be very nice to just say


... blahblah <a href="http://dbpedia.org/page/Primary_motor_cortex" 
rel="sioc:topic">Primary motor cortex</a> blahblah ...


instead of both linking to Wikipedia for the human plus an added hidden link 
to DBpedia for the machine, like this:


 ... blahblah <a 
href="http://en.wikipedia.org/wiki/Primary_motor_cortex">Primary motor 
cortex</a><a href="http://dbpedia.org/page/Primary_motor_cortex" 
rel="sioc:topic"></a> blahblah ...


I think that with the increasing popularity of RDFa, linked data resources 
that have a good looking HTML representation will become more popular for 
re-use in RDF statements than those that do not have good looking HTML 
representations. In other words, the RDFa pages will gradually change the 
ecosystem for linked data resources, also impacting the resource providers 
that do not primarily intend to have their pages found via Google and Yahoo.
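[Editorial aside: a complete, well-formed version of the markup Matthias has in mind might look like the following RDFa 1.0 fragment. The surrounding element and prefix declaration are illustrative; note also that, strictly, http://dbpedia.org/resource/Primary_motor_cortex is the identifier of the concept, while the /page/ URI identifies its HTML description.]

```html
<!-- Hypothetical fragment; RDFa 1.0 syntax. The sioc prefix must be
     declared for the rel="sioc:topic" shorthand to resolve. -->
<div xmlns:sioc="http://rdfs.org/sioc/ns#">
  <p>... blahblah
    <a rel="sioc:topic"
       href="http://dbpedia.org/resource/Primary_motor_cortex">Primary motor cortex</a>
    blahblah ...</p>
</div>
```

A browser following the link is 303-redirected to the human-readable page, while an RDFa processor records the page's sioc:topic triple, so the one link serves both readers.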


Hence, the fact that
the audience for the human-readable versions of the  RDF data is *not* a 
visitor that came to the site while googling for  some bit of information. 
It's more likely to be a data analyst, mashup  developer, or integration 
engineer.


might become less and less true.



Cheers,
Matthias Samwald

DERI Galway, Ireland
http://deri.ie/

Konrad Lorenz Institute for Evolution & Cognition Research, Austria
http://kli.ac.at/





Re: Making human-friendly linked data pages more human-friendly

2009-09-15 Thread Adrian Walker
Hi Kingsley & All --

Good to see that the top layers of the cake are getting some attention.
After all that's where the icing is (:-)

We have an approach to making the results from RDF and other queries more
friendly.  It's online at the site below [1,2].

However, the more you think about this, the more you realize that user
friendly answer displays are  necessary, but not sufficient for the general
population of users.

 A big advantage of RDF is that it should enable ordinary users to ask
things no-one has thought of asking before.  Using their own words and
phrases.  Handing them a SPARQL manual definitely falls short.

We approach this by supporting the writing of rules in English into a
browser.  Then users can run the rules, again in the browser.  When
necessary, SQL is generated automatically from the rules.

That's still not the whole story though.  Reasoning over RDF gets
complicated, arguably much more so than over SQL databases.  This raises a
question of trust.  How do I know what the system did when it suggested that
I invest everything in Lehman Brothers?

The system [1] produces English explanations, based on underlying proof
trees, showing what inferences and data were used in answering a
question.  You can see a simple example of this by running [2] in a
browser, and asking for explanations.

Apologies to folks who have seen this before, and thanks for comments.

 -- Adrian

[1]  Internet Business Logic
A Wiki and SOA Endpoint for Executable *Open *Vocabulary English over SQL
and RDF
Online at www.reengineeringllc.com -- Shared use is *free*

[2]  www.reengineeringllc.com/demo_agents/RDFQueryLangComparison1.agent



On Tue, Sep 15, 2009 at 7:34 AM, Kingsley Idehen kide...@openlinksw.com wrote:

 Matthias Samwald wrote:

 A central idea of linked data is, in my understanding, that every resource
 has not only a HTTP - resolvable RDF description of itself, but also a
 human-friendly rendering that can be viewed in a web browser. With the
 increasing popularity of RDFa, the URIs of these resources are not only
 hidden away in triplestores, but become increasingly exposed on web pages.
 People want to click on them, and, hopefully, not all of these people come
 from the core community of RDF enthusiasts.

 This means that the HTML rendering of linked data resources might need to
 look a bit sexier than it does today. I dare to say that the Pubby-esque
 rendering of DBpedia pages such as
 http://dbpedia.org/page/Primary_motor_cortex
 is helpful to get a quick overview of the RDF triples about this resource,
 but non-RDF-enthusiasts would not find it very inviting.

 Pubby isn't how DBpedia is published today. It is done via Virtuoso (been
 so for quite a long time now), which has in-built Linked Data
 Publishing/Deployment functionality [1].


 This could be improved by changes in the layout, and possibly a manually
 curated ordering of properties. For example,

 http://d.opencalais.com/er/company/ralg-tr1r/f8a13a13-8dbc-3d7e-82b6-1d7968476cae.html
 definitely looks more inviting than the typical DBpedia page (albeit still
 a bit sterile).

 You can tweak the HTML template and just send it to us. BTW, the URIBurner
 [2] pages, which use exactly the same Linked Data Deployment functionality
 as DBpedia, have a slightly different look and feel. That can be applied
 to DBpedia in nanoseconds.


 In the case of DBpedia, it might be better to expose the excellent
 human-readable Wikipedia page for each resource, plus a prominently
 positioned 'show raw data' tab at the top. For other linked data resources
 that are not derived from existing human-friendly web pages, a few stylistic
 changes (ala OpenCalais) already might improve the situation a lot.

 Note that this comment is not intended to be a criticism of DBpedia, but
 of all Linked Data resources that expose HTML descriptions of resources.
 DBpedia is just the most popular example.

 Not seen as criticism, just a wake up call. On our part (OpenLink) we've
 always sought to draw a small line between OpenLink branding and the more
 community oriented DBpedia project. Thus, our preference has been to wait
 for community preferences, and then within that context apply updates to the
 project, especially re. aesthetics.

 Links:

 1.
 http://virtuoso.openlinksw.com/Whitepapers/html/vdld_html/VirtDeployingLinkedDataGuide.html
 -- Virtuoso Linked Data Deployment Guide
 2. http://www.uriburner.com/wiki/URIBurner/

 Kingsley


 Cheers,
 Matthias Samwald

 DERI Galway, Ireland
 http://deri.ie/

 Konrad Lorenz Institute for Evolution & Cognition Research, Austria
 http://kli.ac.at/



 --
 From: Danny Ayers danny.ay...@gmail.com
 Sent: Tuesday, September 15, 2009 4:03 AM
 To: public-lod@w3.org
 Subject: dbpedia not very visible, nor fun

  It seems I have a Wikipedia page in my name (ok, I only did fact-check
 edits, ok!?). So tonight I went 

Re: Making human-friendly linked data pages more human-friendly (was: dbpedia not very visible, nor fun)

2009-09-15 Thread Richard Cyganiak

Matthias,

On 15 Sep 2009, at 14:30, Matthias Samwald wrote:

Richard wrote:
First, there are some datasets that combine linked data output with  
a traditional website, e.g., by embedding some RDFa markup. Of  
course,  in that case, all the rules of good web design and  
information  presentation still apply, and the site has to first  
and foremost  fulfill the visitor's information needs in order to  
be successful.  That's self-evident and not what we are talking  
about here.


Indeed. But a typical web page is full of hyperlinks to external  
sites. In the case of pages based on RDFa, this will often mean  
linking to external linked data resources. Many normal web pages,  
such as blogs, contain several links to wikipedia.org. Many linked  
data resources contain links to dbpedia.org. RDFa uses the a HTML  
element to represent object properties, and I think there will be an  
increasing desire to create links to, say, DBpedia that are both  
useful for the human reader of the RDFa page, as well as for RDF- 
aware software. It would be very nice to just say


... blahblah <a href="http://dbpedia.org/page/Primary_motor_cortex" 
rel="sioc:topic">Primary motor cortex</a> blahblah ...


instead of both linking to Wikipedia for the human plus an added  
hidden link to DBpedia for the machine, like this:


 ... blahblah <a href="http://en.wikipedia.org/wiki/Primary_motor_cortex">Primary 
motor cortex</a><a href="http://dbpedia.org/page/Primary_motor_cortex" 
rel="sioc:topic"></a> blahblah ...


Aside: Here's a less ugly version, @resource overrides @href if both  
are present:


<a href="http://en.wikipedia.org/wiki/Primary_motor_cortex" 
resource="http://dbpedia.org/page/Primary_motor_cortex" 
rel="sioc:topic">Primary motor cortex</a>


I think that with the increasing popularity of RDFa, linked data  
resources that have a good looking HTML representation will become  
more popular for re-use in RDF statements than those that do not  
have good looking HTML representations. In other words, the RDFa  
pages will gradually change the ecosystem for linked data resources,  
also impacting the resource providers that do not primarily intend  
to have their pages found via Google and Yahoo.


Hence, the fact that
the audience for the human-readable versions of the  RDF data is  
*not* a visitor that came to the site while googling for  some bit  
of information. It's more likely to be a data analyst, mashup   
developer, or integration engineer.


might become less and less true.


These are good points, and I don't really disagree with any of them,  
except perhaps in that I think that at the moment, data quality, a  
sensible URI scheme, wide coverage, perception of stability, and  
proper marketing are more likely to determine the success of a dataset  
in attracting links. It is true that at some point in the future, we  
will take it for granted that any serious dataset has all those  
attributes, and then the quality of the visual representation would  
perhaps close or break the deal.


Best,
Richard






Cheers,
Matthias Samwald

DERI Galway, Ireland
http://deri.ie/

Konrad Lorenz Institute for Evolution & Cognition Research, Austria
http://kli.ac.at/






Re: Making human-friendly linked data pages more human-friendly

2009-09-15 Thread Kingsley Idehen

Adrian Walker wrote:

Hi Kingsley  All --

Good to see that the top layers of the cake are getting some 
attention.  After all that's where the icing is (:-)


We have an approach to making the results from RDF and other queries 
more friendly.  It's online at the site below [1,2].


However, the more you think about this, the more you realize that user 
friendly answer displays are  necessary, but not sufficient for the 
general population of users.


 A big advantage of RDF is that it should enable ordinary users to ask 
things no-one has thought of asking before.  Using their own words and 
phrases.  Handing them a SPARQL manual definitely falls short.


We approach this by supporting the writing of rules in English into a 
browser.  Then users can run the rules, again in the browser.  When 
necessary, SQL is generated automatically from the rules.


That's still not the whole story though.  Reasoning over RDF gets 
complicated, arguably much more so than over SQL databases.  This 
raises a question of trust.  How do I know what the system did when it 
suggested that I invest everything in Lehman Brothers?


The system [1] produces English explanations, based on underlying 
proof trees, showing the what inferences and data were used in 
answering a question.  You can see a simple of example of this by 
running [2] in a browser, and asking for explanations.


Apologies to folks who have seen this before, and thanks for comments.

 -- Adrian

[1]  Internet Business Logic
A Wiki and SOA Endpoint for Executable /Open/ Vocabulary English over 
SQL and RDF
Online at www.reengineeringllc.com -- Shared use is /free/


[2]  
www.reengineeringllc.com/demo_agents/RDFQueryLangComparison1.agent 
http://www.reengineeringllc.com/demo_agents/RDFQueryLangComparison1.agent


Adrian,

Cool!

Also note the LOD Cloud Cache instance at http://lod.openlinksw.com, which 
also exposes Web Services at the endpoint http://lod.openlinksw.com/fct. 
Naturally, the same applies to DBpedia via http://dbpedia.org/fct.


The more services, views, etc. around DBpedia and the rest of the Linked 
Data Cloud, the better :-)


Context Fluidity, Data Heterogeneity, and Powerful Lookup capabilities 
are age-old problem areas that the Linked Data meme addresses very well.


Kingsley





On Tue, Sep 15, 2009 at 7:34 AM, Kingsley Idehen 
kide...@openlinksw.com mailto:kide...@openlinksw.com wrote:


Matthias Samwald wrote:

A central idea of linked data is, in my understanding, that
every resource has not only a HTTP - resolvable RDF
description of itself, but also a human-friendly rendering
that can be viewed in a web browser. With the increasing
popularity of RDFa, the URIs of these resources are not only
hidden away in triplestores, but become increasingly exposed
on web pages. People want to click on them, and, hopefully,
not all of these people come from the core community of RDF
enthusiasts.

This means that the HTML rendering of linked data resources
might need to look a bit sexier than it does today. I dare to
say that the Pubby-esque rendering of DBpedia pages such as
http://dbpedia.org/page/Primary_motor_cortex
is helpful to get a quick overview of the RDF triples about
this resource, but non-RDF-enthusiasts would not find it very
inviting.

Pubby isn't how DBpedia is published today. It is done via
Virtuoso (been so for quite a long time now), which has in-built
Linked Data Publishing/Deployment functionality [1].


This could be improved by changes in the layout, and possibly
a manually curated ordering of properties. For example,

http://d.opencalais.com/er/company/ralg-tr1r/f8a13a13-8dbc-3d7e-82b6-1d7968476cae.html

definitely looks more inviting than the typical DBpedia page
(albeit still a bit sterile).

You can tweak the HTML template and just send it to us. BTW, the
URIBurner [2] pages which also use exactly the same Linked Data
Deployment functionality behind DBpedia also have a slightly
different look and feel. That can be applied to DBpedia in nano
seconds.


In the case of DBpedia, it might be better to expose the
excellent human-readable Wikipedia page for each resource,
plus a prominently positioned 'show raw data' tab at the top.
For other linked data resources that are not derived from
existing human-friendly web pages, a few stylistic changes
(ala OpenCalais) already might improve the situation a lot.

Note that this comment is not intended to be a criticism of
DBpedia, but of all Linked Data resources that expose HTML
descriptions of resources. DBpedia is just the most popular
example.

Not seen as criticism, just a wake up call. On our part 

Re: Making human-friendly linked data pages more human-friendly (was: dbpedia not very visible, nor fun)

2009-09-15 Thread Matthias Samwald

Richard wrote:
These are good points, and I don't really disagree with any of them, 
except perhaps in that I think that at the moment, data quality, a 
sensible URI scheme, wide coverage, perception of stability, and


Those are, in part, very difficult problems that the community has been 
working on for the last decade, and will be working on in the next few 
years. Of course, they are of primary importance. However, this does not 
mean that working on improving the human-readable interfaces for linked data 
resources will slow down any of these efforts. Indeed, I think improvements 
could be made with very little effort (compared to the hundreds of 
man-years devoted to these other problems). DBpedia 
could be quickly improved by making use of the Wikipedia source pages. 
Pubby-derivate sites could be quickly improved by modifying the style sheet 
and perhaps adding some additional functionality. The investment of work is 
minuscule compared to the work devoted to some of the bigger 
problems.


proper marketing are more likely to determine the success of a dataset  in 
attracting links


I think that having good-looking web pages to show to end users and 
investors is a very important part of proper marketing.


Cheers,
Matthias 





Re: Making human-friendly linked data pages more human-friendly (was: dbpedia not very visible, nor fun)

2009-09-15 Thread Hugh Glaser
I agree with Richard and think I also agree with Matthias.
We do need to have nice publishing of our Linked Data.
But I really don't see why it should be the publisher of the Linked Data
that (yet again) bears the brunt of the work.
And of course they are likely to do a less than great job, not being UI
specialists.
They are better spending their time doing what they are good at - publishing
LD.

I almost use zitgist as the standard html publisher of my LD (and certainly
link to it), and I am sure that a greater effort to provide top quality
generic browsers that do the simple things well would obviate the need for
me to provide anything of my own (other than the metametadata to inform the
rendering).
There are related technologies such as Fresnel that could inform the process
(we publish Fresnel descriptions for many of our Things).

It's time we had more specialists in LD, e.g. people who specialise in:
publishing LD;
defining ontologies;
identifying linkage;
consuming LD.

Best
Hugh

On 15/09/2009 12:46, Richard Cyganiak rich...@cyganiak.de wrote:

 Hi Matthias,
 
 Please allow me to present a contrarian argument.
 
 First, there are some datasets that combine linked data output with a
 traditional website, e.g., by embedding some RDFa markup. Of course,
 in that case, all the rules of good web design and information
 presentation still apply, and the site has to first and foremost
 fulfill the visitor's information needs in order to be successful.
 That's self-evident and not what we are talking about here.
 
 Most linked data is different. The main purpose is not to create a web
 site where visitors go to look up stuff. The main purpose is to
 publish data in a re-usable way, in order to allow repurposing of the
 data in new applications.
 
 In that case, the audience for the human-readable versions of the
 RDF data is *not* a visitor that came to the site while googling for
 some bit of information. It's more likely to be a data analyst, mashup
 developer, or integration engineer. So what I suggest is to think of
 these pages not as something that end users see, but rather as
 something akin to Javadoc. Javadoc pages are auto-generated pages that
 describe a public interface of your system. Linked data pages are the
 same, but rather than a Java API, they describe your URI space. And
 unlike Javadoc, they are directly connected to the documented
 artifacts (URIs).
 
 I think that the pages should mostly answer the following questions:
 What concept is identified? What *exactly* is the URI of this concept
 (careful with /html or #this at the end)? Who curates this identifier?
 Can I trust it to be stable? Most linked data pages actually do a
 fairly decent job at answering these.
 
 Every data publisher has limited resources, and spending them on
 prettifying the HTML views is very low-impact. It's much more
 important to increase data quality, publish more data,  improve other
 documentation, and create compelling demos/apps on top of the data.
 The namespace documentation is usually good enough, and the
 geekiness of the pages actually helps to drive home the point that
 it's about *re-using this data elsewhere*, rather than looking at the
 data in the boring old web browser.
 
 That being said, of course nicer-looking pages that present
 information in a more useful way are of course always better, but
 that's a somewhat secondary problem in the linked data context.
 
 Best,
 Richard
 
 
 On 15 Sep 2009, at 10:08, Matthias Samwald wrote:
 
 A central idea of linked data is, in my understanding, that every
 resource has not only a HTTP - resolvable RDF description of itself,
 but also a human-friendly rendering that can be viewed in a web
 browser. With the increasing popularity of RDFa, the URIs of these
 resources are not only hidden away in triplestores, but become
 increasingly exposed on web pages. People want to click on them,
 and, hopefully, not all of these people come from the core community
 of RDF enthusiasts.
 
 This means that the HTML rendering of linked data resources might
 need to look a bit sexier than it does today. I dare to say that the
 Pubby-esque rendering of DBpedia pages such as
 http://dbpedia.org/page/Primary_motor_cortex
 is helpful to get a quick overview of the RDF triples about this
 resource, but non-RDF-enthusiasts would not find it very inviting.
 
 This could be improved by changes in the layout, and possibly a
 manually curated ordering of properties. For example,
http://d.opencalais.com/er/company/ralg-tr1r/f8a13a13-8dbc-3d7e-82b6-1d7968476cae.html
 definitely looks more inviting than the typical DBpedia page (albeit
 still a bit sterile).
 
 In the case of DBpedia, it might be better to expose the excellent
 human-readable Wikipedia page for each resource, plus a prominently
 positioned 'show raw data' tab at the top. For other linked data
 resources that are not derived from existing human-friendly web
pages, a few stylistic changes (à la OpenCalais) already 

Re: Making human-friendly linked data pages more human-friendly

2009-09-15 Thread Kingsley Idehen

Hugh Glaser wrote:

I agree with Richard and think I also agree with Matthias.
We do need to have nice publishing of our Linked Data.
But I really don't see why it should be the publisher of the Linked Data
that (yet again) bears the brunt of the work.
  

Exactly!

Linked Data is about separation of:
- identity
- storage
- representation
- access
- presentation

This is the heart of the matter, and the sooner we internalize it the 
sooner we get more done, across the community at large.



And of course they are likely to do a less than great job, not being UI
specialists.
  

Not being specialists, or even when capable, it's simply not a top priority.


They are better spending their time doing what they are good at - publishing
LD.
  


What they are good at, or what works best for their strategic priorities,
etc. Again, we have heterogeneity here too :-)

I almost use zitgist as the standard HTML publisher of my LD (and certainly
link to it), and I am sure that a greater effort to provide top quality
generic browsers that do the simple things well would obviate the need for
me to provide anything of my own (other than the metametadata to inform the
rendering).
There are related technologies such as Fresnel that could inform the process
(we publish Fresnel descriptions for many of our Things).

It's time we had more specialists in LD, e.g. people who specialise in:

publishing LD;
defining ontologies;
identifying linkage;
consuming LD.
  


Yes.

Heterogeneity of interests, world views, and skill sets are data-access
realities that Linked Data brings to the fore :-)



Kingsley

Best
Hugh

On 15/09/2009 12:46, Richard Cyganiak rich...@cyganiak.de wrote:

  

Hi Matthias,

Please allow me to present a contrarian argument.

First, there are some datasets that combine linked data output with a
traditional website, e.g., by embedding some RDFa markup. Of course,
in that case, all the rules of good web design and information
presentation still apply, and the site has to first and foremost
fulfill the visitor's information needs in order to be successful.
That's self-evident and not what we are talking about here.

Most linked data is different. The main purpose is not to create a web
site where visitors go to look up stuff. The main purpose is to
publish data in a re-usable way, in order to allow repurposing of the
data in new applications.

In that case, the audience for the human-readable versions of the
RDF data is *not* a visitor that came to the site while googling for
some bit of information. It's more likely to be a data analyst, mashup
developer, or integration engineer. So what I suggest is to think of
these pages not as something that end users see, but rather as
something akin to Javadoc. Javadoc pages are auto-generated pages that
describe a public interface of your system. Linked data pages are the
same, but rather than a Java API, they describe your URI space. And
unlike Javadoc, they are directly connected to the documented
artifacts (URIs).
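The "describe your URI space" idea is usually wired up with content negotiation: one concept URI, with the server steering humans and machines to different documents. A deliberately simplified dispatcher follows; real servers such as Pubby parse full Accept headers with q-values, and the /page and /data paths here follow DBpedia's convention:

```python
def negotiate(accept_header, concept="Primary_motor_cortex"):
    """Pick the redirect target for a concept URI, Pubby-style:
    RDF-capable clients get the data view, browsers the HTML page."""
    if "application/rdf+xml" in accept_header:
        return f"/data/{concept}"   # machine-readable description
    return f"/page/{concept}"       # human-readable rendering

print(negotiate("application/rdf+xml"))              # /data/Primary_motor_cortex
print(negotiate("text/html,application/xhtml+xml"))  # /page/Primary_motor_cortex
```

Seen this way, the human-readable page really is Javadoc-like output of the same machinery, not a separate hand-built site.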

I think that the pages should mostly answer the following questions:
What concept is identified? What *exactly* is the URI of this concept
(careful with /html or #this at the end)? Who curates this identifier?
Can I trust it to be stable? Most linked data pages actually do a
fairly decent job at answering these.

Every data publisher has limited resources, and spending them on
prettifying the HTML views is very low-impact. It's much more
important to increase data quality, publish more data, improve other
documentation, and create compelling demos/apps on top of the data.
The namespace documentation is usually good enough, and the
geekiness of the pages actually helps to drive home the point that
it's about *re-using this data elsewhere*, rather than looking at the
data in the boring old web browser.

That being said, nicer-looking pages that present information in a
more useful way are of course always better, but that's a somewhat
secondary problem in the linked data context.

Best,
Richard


On 15 Sep 2009, at 10:08, Matthias Samwald wrote:



A central idea of linked data is, in my understanding, that every
resource has not only an HTTP-resolvable RDF description of itself,
but also a human-friendly rendering that can be viewed in a web
browser. With the increasing popularity of RDFa, the URIs of these
resources are not only hidden away in triplestores, but become
increasingly exposed on web pages. People want to click on them,
and, hopefully, not all of these people come from the core community
of RDF enthusiasts.

This means that the HTML rendering of linked data resources might
need to look a bit sexier than it does today. I dare to say that the
Pubby-esque rendering of DBpedia pages such as
http://dbpedia.org/page/Primary_motor_cortex
is helpful to get a quick overview of the RDF triples about this
resource, but non-RDF-enthusiasts would not find it very inviting.

This could be improved by changes in the layout, and possibly a