Re: RDF, Linked Data etc : please ping me when it's over ...

2013-06-19 Thread glenn mcdonald
It's like we have an International Relations list, filled with people who seem like they are involved or interested in International Relations, and yet all conversations turn into debates about what fonts to use for Esperanto, or meta-debates about whether something counts as International

Re: Where to put the knowledge you add

2011-10-12 Thread glenn mcdonald
I agree with this entirely, and it's why I keep insisting that for most purposes datasets should be expressed using local identifiers, with all external linkages called out explicitly and/or externally. owl:sameAs and the use of other people's identifiers for your own nodes are equally dangerous.

Re: Where to put the knowledge you add

2011-10-12 Thread glenn mcdonald
Who else would be able to make assertions about your notion of Brussels vis-a-vis some other notion of Brussels with any more authority than your own? The person doing the integration of your dataset with some other dataset for some purpose of *theirs*. The kind of correspondence needed is a

Re: Where to put the knowledge you add

2011-10-12 Thread glenn mcdonald
Glenn: Now, if you go through the archives of this mailing list, you'll see earlier posts where I pointed out this pattern to Hugh (maybe a year or two ago). As is really the case most of the time, your concerns are factored into what we do, I just need to find the right language for

Re: Think before you write Semantic Web crawlers

2011-06-22 Thread glenn mcdonald
From my perspective as the designer of a system that both consumes and publishes data, the load/burden issue here is not at all particular to the semantic web. Needle obeys robots.txt rules, but that's a small deal compared to the difficulty of extracting whole data from sites set up to deliver it

Re: Using Facebook Data Objects to illuminate Linked Data add-on re. structured data

2011-06-12 Thread glenn mcdonald
{ id: 605980750, name: Kingsley Uyi Idehen, first_name: Kingsley, middle_name: Uyi, last_name: Idehen, link:;, username: kidehen, gender: male, locale: en_US } Some observations: id attribute has value 605980750, this value means

Re: in RDF ... expected Types in RDFS

2011-06-06 Thread glenn mcdonald
It seems pretty clear to me that is a good step for data and humans. It's unlikely to be the end of anything, and I have my own set of particular issues and regrets** about it, but it's a potentially huge visibility/credibility boost for the ideas of structured data and common

Re: implied datasets

2011-05-23 Thread glenn mcdonald
Here's why. In library world, perhaps more than elsewhere, it is common to do things like this, a bibo:Jornal; blah blah blah some descriptions; owl:sameAs urn:issn:1234-5678. This is because there are standard identifiers for lots of things that

Re: implied datasets

2011-05-23 Thread glenn mcdonald
That may be so but it misses the point. The point is there is a field, be it a URI or a literal however modelled, that can be used to join between two datasets. This join field is hidden in that there exists no (known) dataset that contains all possible values it can take on. Hmm. I'm still

Re: implied datasets

2011-05-23 Thread glenn mcdonald
If one has one dataset (say) and wants to find other datasets that might be usefully combined with it to do some analysis, it would (I think) be useful to have something like this to help with the discovery. OK, but what I'm not seeing is how this extra imaginary dataset helps with discovery,

Re: Best Practice for Renaming OWL Vocabulary Elements

2011-05-18 Thread glenn mcdonald
it's not feasible, nor enforceable, nor desirable to develop ontologies entirely with random URIs as identifiers. Just to be clear, I said pure identifiers, not random URIs. I like integers as local IDs. Add a base URI and you've got perfectly good URIs for everything. Or a default prefix, if
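
The integers-plus-base-URI idea in the snippet above can be sketched roughly as follows. This is only an illustration, not code from the thread: the base URI, the "ex" prefix, and the function names mint_uri and prefixed_name are all assumptions made up for the example.

```python
# Hypothetical base URI and prefix, chosen only for illustration.
BASE_URI = "http://example.org/data/"
DEFAULT_PREFIX = "ex"


def mint_uri(local_id: int, base: str = BASE_URI) -> str:
    """Turn an integer local ID into a full URI by appending it to a base URI."""
    if local_id < 0:
        raise ValueError("local IDs are assumed to be non-negative integers")
    return f"{base}{local_id}"


def prefixed_name(local_id: int, prefix: str = DEFAULT_PREFIX) -> str:
    """Render the same local ID as a prefixed name instead of a full URI."""
    if local_id < 0:
        raise ValueError("local IDs are assumed to be non-negative integers")
    return f"{prefix}:{local_id}"


if __name__ == "__main__":
    print(mint_uri(42))
    print(prefixed_name(42))
```

The identifier itself carries no meaning in this sketch; anything descriptive would live in labels or other properties attached to the node.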

Re: Take2: 15 Ways to Think About Data Quality (Just for a Start)

2011-04-15 Thread glenn mcdonald
This reminds me to come back to the point about what I initially called Directionality, and Dave improved to Modeling Consistency. Dave is right, I think, that in terms of data quality, it is consistency that matters, not directionality. That is, as long as we know that a president was involved

Re: 15 Ways to Think About Data Quality (Just for a Start)

2011-04-13 Thread glenn mcdonald
Using Safari, when I click on the link above I hit an authentication challenge [1]. Oh, I see. Clicking on the export links in the browser works without authentication, but issuing the queries directly doesn't yet. No good reason for that, so I'll get it changed. 1. Do you have a Data Object

Re: 15 Ways to Think About Data Quality (Just for a Start)

2011-04-13 Thread glenn mcdonald
I don't think your detailed questions about Needle have any relevance to this conversation about data quality, and I'm not sure they're of much relevance to this mailing list at all, so I'm not going to follow up on them any further in this context unless somebody else expresses interest. As Hugh

Re: Subjectivity and Fluidity of Data Quality

2011-04-13 Thread glenn mcdonald
I've updated this data to model the individual chapter-joining events. On Wed, Apr 13, 2011 at 12:16 PM, Kingsley Idehen kide...@openlinksw.com wrote: On 4/13/11 11:42 AM, Marco Neumann wrote: I am currently looking at Chapter vs Years and I think you do some very nice data presentation here.

Re: 15 Ways to Think About Data Quality (Just for a Start)

2011-04-12 Thread glenn mcdonald
As part of conversations about data, you do need to be able to see the subjectively bad to make it subjectively good. What you can't do (which is what Glenn does repeatedly) is conflate the tools that actually enable you to see the subjectively good, bad, or ugly with said data. I'm a tool

Re: 15 Ways to Think About Data Quality (Just for a Start)

2011-04-12 Thread glenn mcdonald
A minor quibble, not sure about Directionality. You can follow an RDF link in both directions (at least in SPARQL and any RDF API I've worked with). I would be inclined to generalize and rephrase this as ... Consistency of modelling: whichever way you make modelling decisions such as

Re: 15 Ways to Think About Data Quality (Just for a Start)

2011-04-12 Thread glenn mcdonald
But who ever told you, or inferred to you, that any LOD demo is about the Complete Linked Data Experience let alone the Complete Data Experience. I didn't capitalize those. A human's experience of data is the product of the underlying data and the tool/experience/interface through which they

Re: Wordnet Planets SPARQL Puzzle

2011-04-12 Thread glenn mcdonald
BIND(URI(CONCAT(;, ?label)) AS ?dbpResource) The 1.0/1.1 clunkiness is just temporary, but I feel obliged to point out the hand-waving in this join-via-URI-concatenation...

Re: Wordnet Planets SPARQL Puzzle

2011-04-12 Thread glenn mcdonald
BIND(URI(CONCAT(;, ?label)) AS ?dbpResource) The 1.0/1.1 clunkiness is just temporary, but I feel obliged to point out the hand-waving in this join-via-URI-concatenation... What now? You don't like the manner in which a solution has been constructed? What are

Re: Wordnet Planets SPARQL Puzzle

2011-04-12 Thread glenn mcdonald
Stop quibbling, contribute a solution. As you know, but others might not, I work on, a graph-database project incubated at ITA and due to become part of Google any hour now. It takes a somewhat different approach to data representation and data curation than the

Re: Wordnet Planets SPARQL Puzzle

2011-04-12 Thread glenn mcdonald
If you have a dataset fix for Danny's problems (or any others you've stumbled across along the way) do share via a URL. Well, the problems in Danny's case were these: - the required query path to connect gods to planets was non-obvious and not trivial to figure out by exploring - doing

Re: 15 Ways to Think About Data Quality (Just for a Start)

2011-04-12 Thread glenn mcdonald
You continue to imply that seeing subjectively imperfect data projected via a data oriented tool is problematic re., your total data experience world view. I continue to think it's hilarious that you consider it subjectively imperfect that your dataset says Michael Jackson and Michael

Re: 15 Ways to Think About Data Quality (Just for a Start)

2011-04-12 Thread glenn mcdonald
So yes, I think you should feel a little embarrassed about broadcasting links to a demo in which the very first piece of data one sees is obviously wrong. To you the first piece of that is an owl:sameAs assertion. That's 100% fine for you, but that isn't true for everyone else. It just

Re: 15 Ways to Think About Data Quality (Just for a Start)

2011-04-12 Thread glenn mcdonald
Nothing about the DBMS hosting the datasets (where each has a Named Graph IRI) prevents the beholder or consumer from achieving the following via the available data access endpoints: 1. Accessing and altering the source query or SPARQL protocol URL I tried clicking your OpenLink Data

Re: 15 Ways to Think About Data Quality (Just for a Start)

2011-04-12 Thread glenn mcdonald
Please post the URL in question so I can double check what's happening. Remember, I am sharing URLs across the Web, there are many factors in play re. time variant nature of resources, etc. Anyway, give me a URL and I can look into what might be happening.

Re: 15 Ways to Think About Data Quality (Just for a Start)

2011-04-12 Thread glenn mcdonald
The link above doesn't correspond to any link I've sent to you owl:sameAs inference context. Basically, that's ODE one of many browsers we offer. Its forte isn't showcasing owl:sameAs expansion.

Re: 15 Ways to Think About Data Quality (Just for a Start)

2011-04-12 Thread glenn mcdonald
Are 'Michael Jackson' Object ID and Object Representation Access Address distinct? Yes. BTW - can I assume this is the actual URL that you intended above re. access to JSON based graph representation:

Re: LOD Cloud Cache Stats

2011-04-06 Thread glenn mcdonald
I didn't start this thread with polishing datasets in mind. Exactly. You started the thread to brag about quantity. I suggest that if you're claiming to talk about real data, quantity and quality are inextricable.

Re: LOD Cloud Cache Stats

2011-04-06 Thread glenn mcdonald
I am demonstrating and talking about what Virtuoso infrastructure enables. You're talking about it, and you're *trying* to demonstrate it. But your demonstrations are consistently undermined by other factors you consider irrelevant. Nonsense. See

Re: LOD Cloud Cache Stats

2011-04-06 Thread glenn mcdonald
Are you not able to use the public instance for intelligent faceting across the massive datasets that it hosts? I think it's fair to say Yes, I am not able to use the public instance for anything I would consider 'intelligent faceting' of the dbpedia dataset. As I said before, I don't mean this

Re: LOD Cloud Cache Stats

2011-04-06 Thread glenn mcdonald
On Wed, Apr 6, 2011 at 3:59 PM, Kingsley Idehen kide...@openlinksw.com wrote: On 4/6/11 2:16 PM, glenn mcdonald wrote: I didn't start this thread with polishing datasets in mind. Exactly. You started the thread to brag about quantity. I suggest that if you're claiming to talk about real

Re: LOD Cloud Cache Stats

2011-04-06 Thread glenn mcdonald
My target audience is interested in DBMS scalability with regards to RDF data ingestion, indexing, and publication. You, as far as I can gather, are more interested in idealism. I'm interested in making computers do a better job of helping humans make sense of data. So yes, I care about data