Re: [ol-discuss] The (necessary) License/Licence Question

2013-03-04 Thread Ben Companjen
Alexis, thanks for your IA answer.
I do agree with Tom and Lee though, that the current statement of
usage responsibility is not really, err, usable.

I wasn't there from the beginning and haven't read all logs and emails
in the archive, so I don't really know what happened then. It is clear
that there were several imports of MARC record sets, some of better
quality than others, and that a bot 'took' information from the
Library of Congress at regular intervals or semi-continuously. For
each of these sources, there must have been at least a decision to
'take'/start taking the data and I assume something gave the
impression (or better: explicit confirmation) that that 'taking' was
allowed.
Can it be determined for these sources? Talis, sets from several
state(?) libraries, the Library of Congress, Amazon?

Indeed, in the worst case data may need to be removed to make the
whole OL dumps/live content shareable under an Open Data licence. The
OpenStreetMap project changed its licence from CC-BY-SA to ODbL
because the latter was apparently better suited to sharing data(bases). The
change involved lots of discussion and required each and every
user to agree to the new licence or have his/her contributions removed
on a certain date. Perhaps objects were reverted to the last edit n in
a chain of edits (1, 2, ..., n, n+1, n+2, ...) in which all
contributors agreed to the new licence.
Contributions from companies (like a map producer who provided
complete data for some countries) also had to be relicenced which
required a bit of lobbying, but didn't pose a lot of trouble.
It is important to note that the OSM community has been pretty strict
on only allowing original data or donations of data by the
rightsholder under the right licence terms.
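
To make the reversion rule I guessed at above concrete, here is a minimal
Python sketch: keep each object at the last edit n for which every
contributor in edits 1..n agreed to the new licence. The Edit type, the
function name and the sample data are made up for illustration; I am not
claiming this is how OSM actually implemented the change.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Set


@dataclass(frozen=True)
class Edit:
    version: int       # position in the object's edit chain (1, 2, ...)
    contributor: str   # hypothetical user name


def last_clean_version(edits: Sequence[Edit], agreed: Set[str]) -> Optional[int]:
    """Return the last version n such that every edit 1..n was made by a
    contributor who agreed to the new licence, or None if even the first
    edit was made by someone who did not agree."""
    clean = None
    for edit in sorted(edits, key=lambda e: e.version):
        if edit.contributor not in agreed:
            break  # later edits build on an edit that was not relicensed
        clean = edit.version
    return clean


history = [Edit(1, "alice"), Edit(2, "bob"), Edit(3, "carol"), Edit(4, "alice")]
print(last_clean_version(history, agreed={"alice", "bob"}))  # 2
```

Note that edit 4 is dropped even though its author agreed, because it
builds on edit 3, whose author did not.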

About the cover images: isn't there a fair use argument for allowing
low-resolution images for book identification purposes? I believe
Wikipedia or Wikimedia Commons has some rules or guidelines for using
copyrighted images, along the lines of: "I believe that although this
image is copyrighted, its display here is fair use, because ... no open
alternative, small enough to discourage reuse ..." etc.

Ben
___
Ol-discuss mailing list
Ol-discuss@archive.org
http://mail.archive.org/cgi-bin/mailman/listinfo/ol-discuss
To unsubscribe from this mailing list, send email to 
ol-discuss-unsubscr...@archive.org


Re: [ol-discuss] The (necessary) License/Licence Question

2013-03-02 Thread Tom Morris
On Fri, Mar 1, 2013 at 9:04 PM, Lee Passey l...@passkeysoft.com wrote:

 On Fri, March 1, 2013 11:07 am, Tom Morris wrote:
  On Wed, Feb 27, 2013 at 1:07 AM, Alexis Rossi ale...@archive.org wrote:
 
  The data in Open Library comes from various libraries and other sources.
  We don't know whether those other parties have asserted any rights over
  that data, or whether they are legally able to do so. As our Terms of
  Service states, we ourselves do not assert any rights over the data in OL,
  but that doesn't mean that someone else won't.  I understand that you'd
  like us to put some clarifying license on the data, but we don't have the
  information that would allow us to do that.
 
 
  That, of course, makes it completely impossible to reuse the data in any
  legal manner.

 Yes. What's your point?


Sorry I wasn't clearer, Lee.  I take it as a given that reuse of the
data is an important goal for Open Library, and the current state of affairs
doesn't make this possible.


  Depending on how many sources of data were used, it may be a lot of work,
  but the only sane way forward that allows people to actually use this data
  is to start the process of vetting and scrubbing what people contributed.
  Otherwise, all this work is going into something that no one will ever be
  able to reuse.

 At this point I would say that it is virtually impossible to scrub
 the data that is in OL's data store; there's just too much of it. Had OL
 been aware of the issue at the outset, and retained both provenance
 information and license claims maybe it would have been possible, but
 they did not and now it is not. The best approach is to probably treat
 the current OL as a beta test, discard what's there, and start over. You
 probably wouldn't even need to discard the entire data store, only that
 part that is not simple bibliographic data.


I guess our analyses differ, although I admit I've only given it a cursory
look.  There *is* a lot of data, but much of it was imported in big chunks and
the analysis only needs to be done once per source/import.  There's also a
lot of provenance information -- you've probably seen the page edit
histories and the links to the original MARC records at the bottom of pages.
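
As a rough sketch of what "once per source/import" could look like in
practice, the Python below tallies records by import source so that each
source can be reviewed once rather than record by record. The record
layout, the "source" field and the sample values are assumptions for
illustration only, not the actual Open Library dump format.

```python
from collections import Counter
from typing import Iterable, Mapping


def records_per_source(records: Iterable[Mapping[str, str]]) -> Counter:
    """Count records by their import source, so each source can be
    reviewed once instead of reviewing every record."""
    return Counter(rec.get("source", "unknown") for rec in records)


# Hypothetical sample data; keys and source labels are invented.
sample = [
    {"key": "/books/1", "source": "marc:loc"},
    {"key": "/books/2", "source": "marc:loc"},
    {"key": "/books/3", "source": "amazon"},
    {"key": "/books/4"},  # no recorded provenance
]

for source, count in records_per_source(sample).most_common():
    print(source, count)
```

Records with no recorded provenance end up in their own "unknown" bucket,
which is where the real vetting work would be.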

In the worst case, starting from scratch is an option.  In either case
though, the longer we delay, the worse the problem becomes.

My advice for those who want to ...


I wouldn't consider giving legal advice to people who aren't clients, but I
do invite anyone who's interested in helping to solve the problem to work
with me.

Tom


Re: [ol-discuss] The (necessary) License/Licence Question

2013-03-01 Thread Tom Morris
Alexis - I appreciate Internet Archive providing an official response,
since it's an important question.

On Wed, Feb 27, 2013 at 1:07 AM, Alexis Rossi ale...@archive.org wrote:


 The data in Open Library comes from various libraries and other sources.
  We don't know whether those other parties have asserted any rights over
 that data, or whether they are legally able to do so.  As our Terms of
 Service states, we ourselves do not assert any rights over the data in OL,
 but that doesn't mean that someone else won't.  I understand that you'd
 like us to put some clarifying license on the data, but we don't have the
 information that would allow us to do that.


That, of course, makes it completely impossible to reuse the data in any
legal manner.  The way that one controls the provenance of the data is by
requiring contributors to license their contributions in a manner which is
compatible with your intended license for downstream consumers.  For
example, the DPLA says that they don't think most bibliographic data is
copyrightable, but to the extent that it is, all their contributors will be
required to license contributions under CC0.  Of course, part of that is
requiring that contributors have the right to grant such a license.

Allowing anyone to contribute anything, no matter how shady the
provenance, and telling the consumer that they bear all risk of using the
data is the modus operandi of the Pirate Bay and its ilk.  It shouldn't be
the way Open Library operates.

A little banner popped up the other day when I used Open Library to tell me
that the Boston Public Library uses Open Library data.  I wonder if
venerable institutions like that are aware of what a shady operation
they're dealing with.

Depending on how many sources of data were used, it may be a lot of work,
but the only sane way forward that allows people to actually use this data
is to start the process of vetting and scrubbing what people contributed.
Otherwise, all this work is going into something that no one will ever be
able to reuse.

Tom



 Thanks,
 Alexis Rossi
 Internet Archive

 On 26-Feb-2013, at 10:16 AM, Tom Morris wrote:

 Happy Open Data Day!  Thanks for bringing this up.  I think one of the
 best things that people can do to be open is to be explicit and
 transparent about the terms that they license their information under and,
 if they accept  remix content, the terms under which they accept data.

 One of the key things that Creative Commons licenses were designed to
 address is the friction caused by everyone having to read, understand, and
 approve of lots of different unique licenses.  Refusing to declare a
 license and making cloudy statements about the provenance of the data is
 the ultimate in anti-openness.

 I eagerly await clarification from the Open Library and/or Internet
 Archive staff.

 Tom

 On Sat, Feb 23, 2013 at 5:10 PM, Ben Companjen bencompan...@gmail.com
  wrote:

 I should have added the most clearly confusing license statement,
 buried at http://openlibrary.org/developers/licensing

 The Internet Archive does not assert any new copyright or other
 proprietary rights over any of the material in the Open Library
 database. There may be existing rights issues on some contributions
 and in some jurisdictions. When it comes to community projects, the
 legal issues are, frankly, very confusing, but we are attempting to
 make a database that can be openly used for a wide variety of
 purposes. We appreciate all that have contributed.

 On 23 February 2013 20:40, Ben Companjen bencompan...@gmail.com wrote:
  On this Open Data Day [0], for some already over, for others just
  starting, to celebrate, promote and use Open Data (with open as in
  the Open Definition [1]), I would really like to know: under what
  terms can data from Open Library be used?
 
  This question recently surfaced (on the ol-tech list) when John Shutt
  proposed to add first paragraphs from Wikipedia to work descriptions.
  Wikipedia's licence requires attribution (which is easily added), but
  also that the derived work is shared under the same conditions. When
  someone edits a work or book, she agrees to waive all rights by
  sharing the content of the edit under CC0 [2]. CC0 is incompatible
  with CC-BY and CC-BY-SA, because it requires no attribution and
  certainly no share alike.
 
  In the discussion, Karen pointed to the Terms and conditions of the
  Internet Archive [3] (which, as you know, hosts/pays for OL) and noted
  that they apply to OL content. They state that IA respects others'
  copyright (and not much more is said). Tom Morris replied that this
  makes it really difficult to know what usage rights are granted for OL
  dumps, web data, API data etc.
 
  I believe it is important to show (limitations to) what can be done
  with the OL data and if necessary, clearly state that text with
  some/all rights restricted (e.g. from Wikipedia) should not be
  included in OL.
 
  Related question: as lots of information was 

Re: [ol-discuss] The (necessary) License/Licence Question

2013-03-01 Thread Lee Passey
On Fri, March 1, 2013 11:07 am, Tom Morris wrote:

 Alexis - I appreciate Internet Archive providing an official response,
 since it's an important question.

Based on recent postings I'm not certain anymore that there /is/ anyone
at Internet Archive who can provide an /official/ response -- other than
Brewster Kahle, of course. Expecting anything more than what is already
in the TOS is, IMO, a vain hope.

 On Wed, Feb 27, 2013 at 1:07 AM, Alexis Rossi ale...@archive.org wrote:

 The data in Open Library comes from various libraries and other sources.
 We don't know whether those other parties have asserted any rights over
 that data, or whether they are legally able to do so. As our Terms of
 Service states, we ourselves do not assert any rights over the data in OL,
 but that doesn't mean that someone else won't.  I understand that you'd
 like us to put some clarifying license on the data, but we don't have the
 information that would allow us to do that.


 That, of course, makes it completely impossible to reuse the data in any
 legal manner.

Yes. What's your point?

Reusing data from OL is a lot like dumpster divers scavenging food from
restaurant dumpsters: it's probably nutritious, but you really never 
know where it's been.

<!-- snip -->

 Allowing anyone to contribute anything, no matter how shady the
 provenance, and telling the consumer that they bear all risk of using the
 data is the modus operandi of the Pirate Bay and its ilk. It shouldn't be
 the way Open Library operates.

Perhaps not, but that's the way OL /does/ operate, and you simply have 
to decide how that impacts your little corner of the world, and given 
that, how much time you want to spend contributing.

<!-- snip -->

 Depending on how many sources of data were used, it may be a lot of work,
 but the only sane way forward that allows people to actually use this data
 is to start the process of vetting and scrubbing what people contributed.
 Otherwise, all this work is going into something that no one will ever be
 able to reuse.

At this point I would say that it is virtually impossible to scrub
the data that is in OL's data store; there's just too much of it. Had OL 
been aware of the issue at the outset, and retained both provenance 
information and license claims maybe it would have been possible, but 
they did not and now it is not. The best approach is to probably treat 
the current OL as a beta test, discard what's there, and start over. You 
probably wouldn't even need to discard the entire data store, only that 
part that is not simple bibliographic data.

My advice for those who want to consume OL data is as follows:

1. If it's unadorned data, e.g. book titles, author names, etc., use it
without reservation (if you're in the U.S.--I understand that some 
European countries allow for copyrights on collections of data, but the 
U.S. does not). It may not be accurate, but at least it's not copyrighted.

2. Never consume cover images, as these are almost certainly under 
copyright and uploaded without permission. I frequently check e-books 
out from my public library and about half the time the cover image in 
the e-book is just a generic image with the book title printed on it. 
That's because the publisher who has rights to republish the text may 
not have rights to publish the cover image on anything but a paper book; 
two creators, two copyrights.

3. The origin of any text that displays even a modicum of creativity 
cannot be trusted. Feel free to read it in the OL user interface, but 
whether you republish it depends on your own risk tolerance. Would you 
eat food from a dumpster? If so, republishing OL data may be acceptable 
to you.

For those who want to contribute to OL, I would advise:

1. Add all the unadorned data you want. A note telling people where you 
obtained the data from would be appreciated, but is not necessary.

2. Don't upload cover images; you are almost certainly violating someone 
else's copyright.

3. For everything else, do whatever you want. If OL doesn't care about 
rights why should you? It's the one doing the publishing, so it's the 
one that will get sued. If you're nervous, use a phony e-mail address 
and post from behind an IP anonymiser.



Re: [ol-discuss] The (necessary) License/Licence Question

2013-03-01 Thread peter francis
unsubscribe.

 
Peter Francis




Re: [ol-discuss] The (necessary) License/Licence Question

2013-02-26 Thread Alex Stinson
+1 This is the kind of roundabout answer that you get from many
institutions, such as the Smithsonian, but it basically puts a stop to all
forms of reuse, defeating the purpose of an open environment.

Alex Stinson

On Mon, Feb 25, 2013 at 10:46 PM, Tom Morris tfmor...@gmail.com wrote:

 Happy Open Data Day!  Thanks for bringing this up.  I think one of the
 best things that people can do to be open is to be explicit and
 transparent about the terms that they license their information under and,
 if they accept  remix content, the terms under which they accept data.

 One of the key things that Creative Commons licenses were designed to
 address is the friction caused by everyone having to read, understand, and
 approve of lots of different unique licenses.  Refusing to declare a
 license and making cloudy statements about the provenance of the data is
 the ultimate in anti-openness.

 I eagerly await clarification from the Open Library and/or Internet
 Archive staff.

 Tom


 On Sat, Feb 23, 2013 at 5:10 PM, Ben Companjen bencompan...@gmail.com wrote:

 I should have added the most clearly confusing license statement,
 buried at http://openlibrary.org/developers/licensing

 The Internet Archive does not assert any new copyright or other
 proprietary rights over any of the material in the Open Library
 database. There may be existing rights issues on some contributions
 and in some jurisdictions. When it comes to community projects, the
 legal issues are, frankly, very confusing, but we are attempting to
 make a database that can be openly used for a wide variety of
 purposes. We appreciate all that have contributed.

 On 23 February 2013 20:40, Ben Companjen bencompan...@gmail.com wrote:
  On this Open Data Day [0], for some already over, for others just
  starting, to celebrate, promote and use Open Data (with open as in
  the Open Definition [1]), I would really like to know: under what
  terms can data from Open Library be used?
 
  This question recently surfaced (on the ol-tech list) when John Shutt
  proposed to add first paragraphs from Wikipedia to work descriptions.
  Wikipedia's licence requires attribution (which is easily added), but
  also that the derived work is shared under the same conditions. When
  someone edits a work or book, she agrees to waive all rights by
  sharing the content of the edit under CC0 [2]. CC0 is incompatible
  with CC-BY and CC-BY-SA, because it requires no attribution and
  certainly no share alike.
 
  In the discussion, Karen pointed to the Terms and conditions of the
  Internet Archive [3] (which, as you know, hosts/pays for OL) and noted
  that they apply to OL content. They state that IA respects others'
  copyright (and not much more is said). Tom Morris replied that this
  makes it really difficult to know what usage rights are granted for OL
  dumps, web data, API data etc.
 
  I believe it is important to show (limitations to) what can be done
  with the OL data and if necessary, clearly state that text with
  some/all rights restricted (e.g. from Wikipedia) should not be
  included in OL.
 
  Related question: as lots of information was ingested from the Library
  of Congress and other libraries and Amazon, were there special or
  general (non-exclusive) agreements that allowed OL to take this data?
 
  There may be arguments for fair use, facts can't be copyrighted,
  LC data is in the public domain, but these are partial answers at
  best.
 
  Enjoy Open Data Day :)
 
  Regards,
 
  Ben
 
  [0] http://opendataday.org
  [1] http://opendefinition.org
  [2] http://creativecommons.org/about/cc0
  [3] http://archive.org/about/terms.php


Re: [ol-discuss] The (necessary) License/Licence Question

2013-02-25 Thread Tom Morris
Happy Open Data Day!  Thanks for bringing this up.  I think one of the best
things that people can do to be open is to be explicit and transparent
about the terms that they license their information under and, if they
accept  remix content, the terms under which they accept data.

One of the key things that Creative Commons licenses were designed to
address is the friction caused by everyone having to read, understand, and
approve of lots of different unique licenses.  Refusing to declare a
license and making cloudy statements about the provenance of the data is
the ultimate in anti-openness.

I eagerly await clarification from the Open Library and/or Internet Archive
staff.

Tom

On Sat, Feb 23, 2013 at 5:10 PM, Ben Companjen bencompan...@gmail.com wrote:

 I should have added the most clearly confusing license statement,
 buried at http://openlibrary.org/developers/licensing

 The Internet Archive does not assert any new copyright or other
 proprietary rights over any of the material in the Open Library
 database. There may be existing rights issues on some contributions
 and in some jurisdictions. When it comes to community projects, the
 legal issues are, frankly, very confusing, but we are attempting to
 make a database that can be openly used for a wide variety of
 purposes. We appreciate all that have contributed.

 On 23 February 2013 20:40, Ben Companjen bencompan...@gmail.com wrote:
  On this Open Data Day [0], for some already over, for others just
  starting, to celebrate, promote and use Open Data (with open as in
  the Open Definition [1]), I would really like to know: under what
  terms can data from Open Library be used?
 
  This question recently surfaced (on the ol-tech list) when John Shutt
  proposed to add first paragraphs from Wikipedia to work descriptions.
  Wikipedia's licence requires attribution (which is easily added), but
  also that the derived work is shared under the same conditions. When
  someone edits a work or book, she agrees to waive all rights by
  sharing the content of the edit under CC0 [2]. CC0 is incompatible
  with CC-BY and CC-BY-SA, because it requires no attribution and
  certainly no share alike.
 
  In the discussion, Karen pointed to the Terms and conditions of the
  Internet Archive [3] (which, as you know, hosts/pays for OL) and noted
  that they apply to OL content. They state that IA respects others'
  copyright (and not much more is said). Tom Morris replied that this
  makes it really difficult to know what usage rights are granted for OL
  dumps, web data, API data etc.
 
  I believe it is important to show (limitations to) what can be done
  with the OL data and if necessary, clearly state that text with
  some/all rights restricted (e.g. from Wikipedia) should not be
  included in OL.
 
  Related question: as lots of information was ingested from the Library
  of Congress and other libraries and Amazon, were there special or
  general (non-exclusive) agreements that allowed OL to take this data?
 
  There may be arguments for fair use, facts can't be copyrighted,
  LC data is in the public domain, but these are partial answers at
  best.
 
  Enjoy Open Data Day :)
 
  Regards,
 
  Ben
 
  [0] http://opendataday.org
  [1] http://opendefinition.org
  [2] http://creativecommons.org/about/cc0
  [3] http://archive.org/about/terms.php