It's like we have an International Relations list, filled with people who
seem like they are involved or interested in International Relations, and
yet all conversations turn into debates about what fonts to use for
Esperanto, or meta-debates about whether something counts as International
I agree with this entirely, and it's why I keep insisting that for most
purposes datasets should be expressed using local identifiers, with all
external linkages called out explicitly and/or externally. owl:sameAs and
the use of other people's identifiers for your own nodes are equally
dangerous.
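A minimal Python sketch of what this might look like in practice, assuming integer local IDs and a made-up base URI. The `BASE_URI`, the records, and the `asserted_by` field are all hypothetical illustrations, not anything from an actual dataset discussed here. External correspondences live in a separate list, each one attributed to whoever asserts it.

```python
# Hypothetical base URI and records; none of these names come from a real dataset.
BASE_URI = "http://example.org/place/"

# Nodes are keyed by local integer IDs only.
places = {
    1: {"name": "Brussels"},
    2: {"name": "Antwerp"},
}

# External linkages are kept apart from the data, each with an explicit asserter.
linkages = [
    {
        "local_id": 1,
        "external": "http://dbpedia.org/resource/Brussels",
        "asserted_by": "integrator",  # hypothetical label for who made the claim
    },
]


def local_uri(local_id):
    """Build a URI for a local node from the base URI and its integer ID."""
    return BASE_URI + str(local_id)


def external_links(local_id):
    """Return the external URIs explicitly linked to a local node, if any."""
    return [link["external"] for link in linkages if link["local_id"] == local_id]
```

Dropping or replacing the `linkages` list leaves the dataset itself untouched, which is the point of keeping the correspondences external.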
Who else would be able to make assertions about your notion of Brussels
vis-a-vis some other notion of Brussels with any more authority than your
own?
The person doing the integration of your dataset with some other dataset for
some purpose of *theirs*. The kind of correspondence needed is a
Glenn:
Now, if you go through the archives of this mailing list, you'll see
earlier posts where I pointed out this pattern to Hugh (maybe a year or
two ago). As is really the case most of the time, your concerns are factored
into what we do, I just need to find the right language for
From my perspective as the designer of a system that both consumes and
publishes data, the load/burden issue here is not at all particular to the
semantic web. Needle obeys robots.txt rules, but that's a small deal
compared to the difficulty of extracting whole data from sites set up to
deliver it
{
  "id": 605980750,
  "name": "Kingsley Uyi Idehen",
  "first_name": "Kingsley",
  "middle_name": "Uyi",
  "last_name": "Idehen",
  "link": "https://www.facebook.com/kidehen",
  "username": "kidehen",
  "gender": "male",
  "locale": "en_US"
}
Some observations:
id attribute has value 605980750, this value means
It seems pretty clear to me that schema.org is a good step for data and
humans. It's unlikely to be the end of anything, and I have my own set of
particular issues and regrets** about it, but it's a potentially huge
visibility/credibility boost for the ideas of structured data and common
Here's why. In library world, perhaps more than elsewhere, it is
common to do things like this,
<http://example.org/issn/1234-5678> a bibo:Journal;
blah blah blah some descriptions;
owl:sameAs <urn:issn:1234-5678>.
This is because there are standard identifiers for lots of things that
That may be so but it misses the point. The point is there is a field,
be it a URI or a literal however modelled, that can be used to join
between two datasets. This join field is hidden in that there exists
no (known) dataset that contains all possible values it can take on.
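As a rough illustration of such a join field, here is a small Python sketch. The two datasets, their titles, library names, and ISSN values are invented for the example; the only assumption is that both sides carry the same field. Note that the join works without any master list of all possible ISSN values.

```python
# Two hypothetical datasets that share an ISSN field (all values are made up).
journals = [
    {"issn": "1234-5678", "title": "Journal A"},
    {"issn": "8765-4321", "title": "Journal B"},
]
holdings = [
    {"issn": "1234-5678", "library": "Library X"},
    {"issn": "1111-2222", "library": "Library Y"},
]


def join_on(field, left, right):
    """Inner-join two lists of records on a shared field."""
    index = {}
    for row in right:
        index.setdefault(row[field], []).append(row)
    # Merge each left record with every right record that has the same value.
    return [{**l, **r} for l in left for r in index.get(l[field], [])]


joined = join_on("issn", journals, holdings)
```

Only the record whose ISSN appears in both datasets survives the join; the other values stay invisible to anyone holding just one side.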
Hmm. I'm still
If one has one dataset (say) and wants to find other datasets that
might be usefully combined with it to do some analysis, it would (I
think) be useful to have something like this to help with the discovery.
OK, but what I'm not seeing is how this extra imaginary dataset helps with
discovery,
it's not feasible, nor enforceable, nor desirable to develop ontologies
entirely with random URIs as identifiers.
Just to be clear, I said pure identifiers, not random URIs. I like
integers as local IDs. Add a base URI and you've got perfectly good URIs for
everything. Or a default prefix, if
This reminds me to come back to the point about what I initially
called Directionality, and Dave improved to Modeling Consistency.
Dave is right, I think, that in terms of data quality, it is
consistency that matters, not directionality. That is, as long as we
know that a president was involved
Using Safari, when I click on the link above I hit an authentication
challenge [1].
Oh, I see. Clicking on the export links in the browser works without
authentication, but issuing the queries directly doesn't yet. No good reason
for that, so I'll get it changed.
1. Do you have a Data Object
I don't think your detailed questions about Needle have any relevance to
this conversation about data quality, and I'm not sure they're of much
relevance to this mailing list at all, so I'm not going to follow up on them
any further in this context unless somebody else expresses interest. As Hugh
I've updated this data to model the individual chapter-joining events.
On Wed, Apr 13, 2011 at 12:16 PM, Kingsley Idehen kide...@openlinksw.com wrote:
On 4/13/11 11:42 AM, Marco Neumann wrote:
I am currently looking at Chapter vs Years and I think you do some very
nice data presentation here.
As part of conversations about data, you do need to be able to see the
subjectively bad to make it subjectively good. What you can't do (which
is what Glenn does repeatedly) is conflate the tools that actually enable
you to see the subjectively good, bad, or ugly with said data.
I'm a tool
A minor quibble, not sure about Directionality. You can follow an RDF
link in both directions (at least in SPARQL and any RDF API I've worked
with). I would be inclined to generalize and rephrase this as ...
Consistency of modelling: whichever way you make modelling decisions
such as
But whoever told you, or implied to you, that any LOD demo is about the
Complete Linked Data Experience, let alone the Complete Data Experience?
I didn't capitalize those. A human's experience of data is the product of
the underlying data and the tool/experience/interface through which they
BIND(URI(CONCAT("http://dbpedia.org/resource/", ?label)) AS ?dbpResource)
The 1.0/1.1 clunkiness is just temporary, but I feel obliged to point out
the hand-waving in this join-via-URI-concatenation...
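To make the concern concrete, here is a hedged Python sketch that mirrors the concatenation outside SPARQL. The function names are hypothetical, and the normalization shown (spaces to underscores, then percent-encoding) only approximates DBpedia's naming convention. Neither version checks that the resulting resource exists or that it denotes the same thing as the label.

```python
from urllib.parse import quote

DBPEDIA_BASE = "http://dbpedia.org/resource/"


def naive_uri(label):
    """Mirror the CONCAT in the query: base plus raw label, nothing else."""
    return DBPEDIA_BASE + label


def normalized_uri(label):
    """Replace spaces with underscores, then percent-encode what remains.

    Assumption: this approximates DBpedia resource naming; it says nothing
    about whether the resource exists or matches the intended entity.
    """
    return DBPEDIA_BASE + quote(label.strip().replace(" ", "_"))
```

For a label such as "Michael Jackson", `naive_uri` yields a string containing a raw space, while `normalized_uri` yields `http://dbpedia.org/resource/Michael_Jackson`. Either way the join rests on the label happening to line up with the target's URI scheme.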
BIND(URI(CONCAT("http://dbpedia.org/resource/", ?label)) AS ?dbpResource)
The 1.0/1.1 clunkiness is just temporary, but I feel obliged to point out
the hand-waving in this join-via-URI-concatenation...
What now? You don't like the manner in which a solution has been
constructed? What are
Stop quibbling, contribute a solution.
As you know, but others might not, I work on www.needlebase.com, a
graph-database project incubated at ITA and due to become part of Google any
hour now. It takes a somewhat different approach to data representation and
data curation than the
If you have a dataset fix for Danny's problems (or any others you've
stumbled across along the way) do share via a URL.
Well, the problems in Danny's case were these:
- the required query path to connect gods to planets was non-obvious and not
trivial to figure out by exploring
- doing
You continue to imply that seeing subjectively imperfect data projected via
a data oriented tool is problematic re., your total data experience world
view.
I continue to think it's hilarious that you consider it subjectively
imperfect that your dataset says Michael Jackson and Michael
So yes, I think you should feel a little embarrassed about broadcasting
links to a demo in which the very first piece of data one sees is obviously
wrong.
To you the first piece of that is an owl:sameAs assertion. That's 100% fine
for you, but that isn't true for everyone else. It just
Nothing about the DBMS hosting the datasets (where each has a Named Graph
IRI) prevents the beholder or consumer from achieving the following via the
available data access endpoints:
1. Accessing and altering the source query or SPARQL protocol URL
I tried clicking your OpenLink Data
Please post the URL in question so I can double check what's happening.
Remember, I am sharing URLs across the Web; there are many factors in play
re. the time-variant nature of resources, etc.
Anyway, give me a URL and I can look into what might be happening.
http://linkeddata.uriburner.com/ode/?uri=http%3A%2F%2Fdbpedia.org%2Fresource%2FMichael_Jackson
The link above doesn't correspond to any link I've sent to you in the owl:sameAs
inference context. Basically, that's ODE, one of many browsers we offer. Its
forte isn't showcasing owl:sameAs expansion.
Are 'Michael Jackson' Object ID and Object Representation Access Address
distinct?
Yes.
BTW - can I assume this is the actual URL that you intended above re. access
to JSON based graph representation:
I didn't start this thread with polishing datasets in mind.
Exactly. You started the thread to brag about quantity. I suggest that if
you're claiming to talk about real data, quantity and quality are
inextricable.
I am demonstrating and talking about what Virtuoso infrastructure enables.
You're talking about it, and you're *trying* to demonstrate it. But your
demonstrations are consistently undermined by other factors you consider
irrelevant.
Nonsense. See
Are you not able to use the public instance for intelligent faceting across
the massive datasets that it hosts?
I think it's fair to say Yes, I am not able to use the public instance for
anything I would consider 'intelligent faceting' of the dbpedia dataset. As
I said before, I don't mean this
On Wed, Apr 6, 2011 at 3:59 PM, Kingsley Idehen kide...@openlinksw.com wrote:
On 4/6/11 2:16 PM, glenn mcdonald wrote:
I didn't start this thread with polishing datasets in mind.
Exactly. You started the thread to brag about quantity. I suggest that if
you're claiming to talk about real
My target audience is interested in DBMS scalability with regards to RDF
data ingestion, indexing, and publication. You, as far as I can gather, are
more interested in idealism
I'm interested in making computers do a better job of helping humans make
sense of data. So yes, I care about data