Ian Davis wrote:
On Wed, Jun 24, 2009 at 9:56 PM, Kingsley Idehen <[email protected]> wrote:

    The NYT, London Times, and others of this ilk are more likely to
    contribute their quality data to the LOD cloud if they know there
    is a vehicle (e.g., a license scheme) that ensures their HTTP URIs
    are protected, i.e., always accessible to user agents at the data
    representation level (HTML, XML, N3, RDF/XML, Turtle, etc.),
    thereby ensuring citation and attribution requirements are honored.


I agree with that, but it only covers a small portion of what is needed. You fail to consider the situations where people publish data about other people's URIs, such as reviews or annotations.
I am not failing to consider them; far from it.
The foaf:primaryTopic mechanism isn't strong enough if the publisher requires full attribution for use of their data. If I use SPARQL to extract a subset of reviews to display on my site, then in all likelihood I have lost that linkage with the publishing document.
Only if you choose to construct your result document using literal values, i.e., a SPARQL solution that has the URIs filtered out. Even if that's what you end up doing, you still have <link/> and @rel at your disposal for identifying your data sources, worst case.
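For example, here is a minimal sketch of a CONSTRUCT query that keeps that linkage while extracting the subset. It assumes each publisher's data sits in a named graph identified by its document URI; the rev: review vocabulary and dct:source are used purely for illustration, not as anything these publishers actually emit:

  # Illustrative only: extract review text but carry along the URI of
  # the graph/document that published it, so attribution survives.
  PREFIX rev: <http://purl.org/stuff/rev#>
  PREFIX dct: <http://purl.org/dc/terms/>

  CONSTRUCT {
    ?review rev:text   ?text ;
            dct:source ?sourceDoc .   # back-pointer to the publisher
  }
  WHERE {
    GRAPH ?sourceDoc {                # named graph = publishing document
      ?review a rev:Review ;
              rev:text ?text .
    }
  }

And if the extracted subset is rendered as HTML, the <link/> and @rel fallback mentioned above can carry the same back-pointer in the document head.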


    Attribution is the kind of thing one gives as the result of a
    license requirement in exchange for permission to copy. In the
    academic world, for journal articles, this doesn't come into play
    at all, since there is no copying (in the usual case). Instead,
    people cite articles because the norms of their community demand it.

    Yes, and the HTTP URI ultimately delivers the kind of mechanism I
    believe most traditional media companies seek (as stated above).
    They want people to use their data with low-cost citation and
    attribution intrinsic to the medium of value exchange.


The BBC is a traditional media company. Its data is licensed only for personal, non-commercial use: http://www.bbc.co.uk/terms/#3
I used the New York Times and London Times for specific reasons: their business models differ from the BBC's; they are traditional *commercial* media companies.
    btw - how are you dealing with this matter re. the neurocommons.org
    linked data space? How do you ensure your valuable work is fully
    credited as it bubbles up the value chain?


I found this linked from the RDF Distribution page on neurocommons.org: http://svn.neurocommons.org/svn/trunk/product/bundles/frontend/nsparql/NOTICES.txt

Everyone should read it right now to appreciate the complexity of aggregating data from many sources when each has its own idiosyncratic attribution requirements.

Then read http://sciencecommons.org/projects/publishing/open-access-data-protocol/ to see how we should be approaching the licensing of data. It explains in detail the motivations for things like CC-0 and the PDDL, which seek to promote open access for all by removing restrictions:

"Thus, to facilitate data integration and open access data sharing, any implementation of this protocol MUST waive all rights necessary for data extraction and re-use (including copyright, sui generis database rights, claims of unfair competition, implied contracts, and other legal rights), and MUST NOT apply any obligations on the user of the data or database such as “copyleft” or “share alike”, or even the legal requirement to provide attribution. Any implementation SHOULD define a non-legally binding set of citation norms in clear, lay-readable language."

Science Commons have spent a lot of time and resources coming to this conclusion, having tried all kinds of alternatives such as attribution and share-alike licences (as did Talis). The final consensus was that the public domain was the only mechanism that could scale for the future. Without this kind of approach, aggregating, querying and reusing the web of data will become impossibly complex. This is a key motivation for Talis starting the Connected Commons programme ( http://www.talis.com/platform/cc/ ). We want to see more data that is unambiguously reusable because it has been placed in the public domain using CC-0 or the Open Data Commons PDDL.

So, I urge everyone publishing data onto the linked data web to consider waiving all rights over it using one of the licenses above.
I don't think "waiving all rights" is a practical option for the likes of the New York Times or the Times of London, or for traditional commercial media companies generally.
As Kingsley points out, you will always be attributed via the URIs you mint.
This part I totally agree with :-)


Ian

PS. This was the subject of my keynote at code4lib 2009, "If you love something, set it free", which you can view here: http://www.slideshare.net/iandavis/code4lib2009-keynote-1073812


The thing about "Free" is that we'll always end up having to disambiguate "Free Speech" from "Free Beer". That's the sad nature of the overloaded "Free" moniker, which also plagues the Open Source moniker.

--


Regards,

Kingsley Idehen       Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO, OpenLink Software    Web: http://www.openlinksw.com




