Ian Davis wrote:
On Wed, Jun 24, 2009 at 9:56 PM, Kingsley Idehen
<[email protected]> wrote:
The NYT, London Times, and others of this ilk are more likely to
contribute their quality data to the LOD cloud if they know there
is a vehicle (e.g., a license scheme) that ensures their HTTP URIs
are protected, i.e., always accessible to user agents at the data
representation (HTML, XML, N3, RDF/XML, Turtle, etc.) level,
thereby ensuring citation and attribution requirements are honored.
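[To illustrate the point being quoted here: "accessible at the data representation level" refers to HTTP content negotiation, where one URI serves each format. A sketch, with a hypothetical host and path not taken from this thread:]

```http
GET /articles/2009/linked-data HTTP/1.1
Host: data.example.org
Accept: text/turtle

HTTP/1.1 200 OK
Content-Type: text/turtle
```

[A client asking for application/rdf+xml or text/html at the same URI would receive those representations instead, so the publisher's URI remains the stable citation point regardless of format.]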
I agree with that, but it only covers a small portion of what is
needed. You fail to consider the situations where people publish data
about other people's URIs, such as reviews or annotations.
I am not; far from it.
The foaf:primaryTopic mechanism isn't strong enough if the publisher
requires full attribution for use of their data. If I use SPARQL to
extract a subset of reviews to display on my site then in all
likelihood I have lost that linkage with the publishing document.
Only if you choose to construct your result document using literal
values, i.e., a SPARQL solution that has URIs filtered out; anyway, if
that's what you end up doing, then you do have <link/> and @rel at your
disposal for identifying your data sources, worst case.
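[One way to avoid losing the linkage in the first place, sketched here as a hedged illustration (the vocabulary choices and the use of dct:source for provenance are illustrative, not something either correspondent specified): a SPARQL CONSTRUCT over named graphs can carry the source graph URI along with each extracted review.]

```sparql
PREFIX rev: <http://purl.org/stuff/rev#>
PREFIX dct: <http://purl.org/dc/terms/>

# Copy each review into the result, and record which named graph
# (i.e., which publishing document) it was extracted from.
CONSTRUCT {
  ?review a rev:Review ;
          rev:text ?text ;
          dct:source ?src .
}
WHERE {
  GRAPH ?src {
    ?review a rev:Review ;
            rev:text ?text .
  }
}
```

[The resulting document then keeps a machine-readable pointer back to each source, so attribution can survive re-publication of the subset.]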
Attribution is the kind of thing one gives as the result of a
license requirement in exchange for permission to copy. In the
academic world for journal articles this doesn't come into play at
all, since there is no copying (in the usual case). Instead people
cite articles because the norms of their community demand it.
Yes, and the HTTP URI ultimately delivers the kind of mechanism I
believe most traditional media companies seek (as stated above).
They ultimately want people to use their data with low cost
citation and attribution intrinsic to the medium of value exchange.
The BBC is a traditional media company. Its data is licensed only for
personal, non-commercial use: http://www.bbc.co.uk/terms/#3
I used the New York Times and London Times for specific reasons: their
business models differ from the BBC's; they are traditional
*commercial* media companies.
btw - how are you dealing with this matter re. the
neurocommons.org linked data space? How
do you ensure your valuable work is fully credited as it bubbles
up the value chain?
I found this linked from the RDF Distribution page on neurocommons.org:
http://svn.neurocommons.org/svn/trunk/product/bundles/frontend/nsparql/NOTICES.txt
Everyone should read it right now to appreciate the complexity of
aggregating data from many sources when they all have idiosyncratic
requirements of attribution.
Then read
http://sciencecommons.org/projects/publishing/open-access-data-protocol/
to see how we should be approaching the licensing of data. It explains
in detail the motivations for things like CC-0 and PDDL which seek to
promote open access for all by removing restrictions:
"Thus, to facilitate data integration and open access data sharing,
any implementation of this protocol MUST waive all rights necessary
for data extraction and re-use (including copyright, sui generis
database rights, claims of unfair competition, implied contracts, and
other legal rights), and MUST NOT apply any obligations on the user of
the data or database such as “copyleft” or “share alike”, or even the
legal requirement to provide attribution. Any implementation SHOULD
define a non-legally binding set of citation norms in clear,
lay-readable language."
Science Commons have spent a lot of time and resources to come to this
conclusion, and they tried all kinds of alternatives such as
attribution and share alike licences (as did Talis). The final
consensus was that the public domain was the only mechanism that could
scale for the future. Without this kind of approach, aggregating,
querying and reusing the web of data will become impossibly complex.
This is a key motivation for Talis starting the Connected Commons
programme ( http://www.talis.com/platform/cc/ ). We want to see more
data that is unambiguously reusable because it has been placed in the
public domain using CC-0 or the Open Data Commons PDDL.
So, I urge everyone publishing data onto the linked data web to
consider waiving all rights over it using one of the licenses above.
I don't think "waiving all rights" is a practical option for the likes
of the New York Times or the Times of London, nor for other traditional
commercial media companies.
As Kingsley points out, you will always be attributed via the URIs you
mint.
This part I totally agree with :-)
Ian
PS. This was the subject of my keynote at code4lib 2009 "If you love
something, set it free", which you can view here
http://www.slideshare.net/iandavis/code4lib2009-keynote-1073812
The thing about "Free" is that we'll always end up having to
disambiguate: "Free Speech" versus "Free Beer". That's the sad nature
of the overloaded "Free" moniker, an ambiguity that also dogs the
"Open Source" moniker.
--
Regards,
Kingsley Idehen Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO
OpenLink Software Web: http://www.openlinksw.com