Re: voiD 1.0 guide comments

Richard Cyganiak Sun, 01 Feb 2009 16:45:26 -0800


Jiri,


Thanks for the feedback.

On 1 Feb 2009, at 21:03, Jiri Prochazka wrote:

In the article I haven't found a solid definition of what is a dataset
and when to use another dataset/subset. I think this has to be clearly
defined.


“A dataset in voiD (void:Dataset) is a collection of data, which is:
- published and maintained by a single provider, and
- available as RDF, and
- accessible, for example, through dereferenceable HTTP URIs or a SPARQL endpoint.”

I think this is as clear as possible without becoming overly constraining.

From what I understood, the publisher is the "primary key" of
datasets.


It's three points, see above.

I think that it should be emphasized that categorizing datasets should
only be used, if the data in it are somewhat homogeneous - the
categorization applies to all of it.

Categorization is an art that is way older than voiD, and we don't want to tell people how to do it properly! And I definitely don't agree with you when you say that “a categorization must apply to all of the dataset”. For example, I think it would be absolutely adequate to say that DBpedia is about people and geography, because it is a sizable and valuable resource for both those areas, even though it also contains data about lots of other things.
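
A minimal sketch of what such a multi-subject description might look like, assuming dcterms:subject is the property used for categorization. The dataset URI and the two DBpedia subject resources are made up for illustration; the script only assembles Turtle text and does not validate it.

```python
# Sketch: emit a voiD description (Turtle) that tags one dataset with
# several subjects. The dataset URI and the subject resources below are
# illustrative assumptions, not taken from any published description.

PREFIXES = (
    "@prefix void: <http://rdfs.org/ns/void#> .\n"
    "@prefix dcterms: <http://purl.org/dc/terms/> .\n"
)


def describe_dataset(dataset_uri, subject_uris):
    """Return Turtle declaring dataset_uri a void:Dataset with the given subjects."""
    lines = [f"<{dataset_uri}> a void:Dataset"]
    for uri in subject_uris:
        # One dcterms:subject statement per category; none has to cover
        # the whole dataset.
        lines.append(f"    dcterms:subject <{uri}>")
    return PREFIXES + "\n" + " ;\n".join(lines) + " .\n"


turtle = describe_dataset(
    "http://example.org/void#DBpedia",  # hypothetical dataset URI
    [
        "http://dbpedia.org/resource/Person",
        "http://dbpedia.org/resource/Geography",
    ],
)
print(turtle)
```

Listing two subjects here says only that the dataset is a useful resource for both areas, not that every triple in it is about one of them.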

I guess the categorization is fairly unusable in use cases like a
personal website, because the information is so varied...

Well, http://dbpedia.org/resource/Personal_web_page might be a nice subject here. (Assuming that you do have some interesting RDF on your site!)

(I note with regret that the Wikipedia article on “Random stuff” has been deleted, it would make for another nice DBpedia resource...)

Another thing - dataset partitioning. Combination of dataset
categorization and partitioning led me to great confusion - I have
thought voiD also wanted to categorize the data in the dataset.
Better to put a notice that partitioning should be used carefully and
that it was designed for mirroring of datasets.

I don't understand. “I have thought voiD also wanted to categorize the data in the dataset” -- yes, that IS what we want. “partitioning was designed for mirroring of datasets” -- no, it was designed for cases where voiD authors want to say something about just a part of the dataset, and not about the entire dataset, for whatever reason.
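
A rough sketch of that use of partitioning, assuming void:subset links a dataset to one of its parts and dcterms:subject carries the category. All three URIs are hypothetical; the point is only that the subject is attached to the subset rather than to the parent dataset.

```python
# Sketch: describe one part of a dataset via void:subset so that a
# statement (here a subject) applies to that part only.
# All URIs passed in below are made up for illustration.


def describe_subset(parent_uri, subset_uri, subject_uri):
    """Return Turtle linking parent to subset and categorizing only the subset."""
    return (
        "@prefix void: <http://rdfs.org/ns/void#> .\n"
        "@prefix dcterms: <http://purl.org/dc/terms/> .\n"
        "\n"
        f"<{parent_uri}> a void:Dataset ;\n"
        f"    void:subset <{subset_uri}> .\n"
        "\n"
        # The subject is stated about the subset, not about the parent.
        f"<{subset_uri}> a void:Dataset ;\n"
        f"    dcterms:subject <{subject_uri}> .\n"
    )


turtle_subset = describe_subset(
    "http://example.org/void#MySite",        # hypothetical parent dataset
    "http://example.org/void#MySitePlaces",  # hypothetical subset
    "http://dbpedia.org/resource/Geography",
)
print(turtle_subset)
```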


Best,
Richard

Best regards,
Jiri Prochazka
PS: Please send the replies also directly to me, as I am not subscribed
to this list.
