Re: [ol-discuss] The (necessary) License/Licence Question
Alexis, thanks for your IA answer. I do agree with Tom and Lee, though, that the current statement of usage responsibility is not really, err, usable.

I wasn't there from the beginning and haven't read all logs and emails in the archive, so I don't really know what happened then. It is clear that there were several imports of MARC record sets, some of better quality than others, and that a bot 'took' information from the Library of Congress at regular intervals or semi-continuously. For each of these sources, there must have been at least a decision to 'take'/start taking the data, and I assume something gave the impression (or better: explicit confirmation) that that 'taking' was allowed. Can it be determined for these sources? Talis, sets from several state(?) libraries, the Library of Congress, Amazon?

Indeed, in the worst case data may need to be removed to make the whole OL dumps/live content shareable under an Open Data licence.

The OpenStreetMap project changed their licence from CC-BY-SA to ODbL because the latter was apparently better for sharing data(bases). The change involved lots of discussion, and each and every user had to agree to the new licence or have his/her contributions removed on a certain date. Perhaps objects were reverted to the last edit n in a chain of edits (1, 2, ..., n, n+1, n+2, ...) for which all contributors up to and including n had agreed to the new licence. Contributions from companies (like a map producer who provided complete data for some countries) also had to be relicenced, which required a bit of lobbying but didn't pose a lot of trouble. It is important to note that the OSM community has been pretty strict on only allowing original data or donations of data by the rightsholder under the right licence terms.

About the cover images: isn't there a fair use argument for allowing low-resolution images for book identification purposes? I believe Wikipedia or Wikimedia Commons has some rules or guidelines for using copyrighted images, along the lines of: "I believe that although this image is copyrighted, its display here is fair use, because .. no open alternative, small enough to discourage reuse... etc."

Ben

___
Ol-discuss mailing list
Ol-discuss@archive.org
http://mail.archive.org/cgi-bin/mailman/listinfo/ol-discuss
To unsubscribe from this mailing list, send email to ol-discuss-unsubscr...@archive.org
Re: [ol-discuss] The (necessary) License/Licence Question
On Fri, Mar 1, 2013 at 9:04 PM, Lee Passey l...@passkeysoft.com wrote:

> On Fri, March 1, 2013 11:07 am, Tom Morris wrote:
>
>> On Wed, Feb 27, 2013 at 1:07 AM, Alexis Rossi ale...@archive.org wrote:
>>
>>> The data in Open Library comes from various libraries and other sources. We don't know whether those other parties have asserted any rights over that data, or whether they are legally able to do so. As our Terms of Service states, we ourselves do not assert any rights over the data in OL, but that doesn't mean that someone else won't. I understand that you'd like us to put some clarifying license on the data, but we don't have the information that would allow us to do that.
>>
>> That, of course, makes it completely impossible to reuse the data in any legal manner.
>
> Yes. What's your point?

Sorry I wasn't clearer, Lee. I take it as an implicit given that reuse of the data is an important goal for Open Library, and the current state of affairs doesn't make this possible.

>> Depending on how many sources of data were used, it may be a lot of work, but the only sane way forward that allows people to actually use this data is to start the process of vetting and scrubbing what people contributed. Otherwise, all this work is going into something that no one will ever be able to reuse.
>
> At this point I would say that it is virtually impossible to scrub the data that is in OL's data store; there's just too much of it. Had OL been aware of the issue at the outset, and retained both provenance information and license claims, maybe it would have been possible, but they did not and now it is not. The best approach is probably to treat the current OL as a beta test, discard what's there, and start over. You probably wouldn't even need to discard the entire data store, only that part that is not simple bibliographic data.

I guess our analyses differ, although I admit I've only given it a cursory look. There *is* a lot of data, but much of it was imported in big chunks, and the analysis only needs to be done once per source/import. There's also a lot of provenance information -- you've probably seen the page edit histories and the links to the original MARC records at the bottom of pages. In the worst case, starting from scratch is an option. In either case, though, the longer we delay, the worse the problem becomes.

> My advice for those who want to ...

I wouldn't consider giving legal advice to people who aren't clients, but I do invite anyone who's interested in helping to solve the problem to work with me.

Tom
Re: [ol-discuss] The (necessary) License/Licence Question
Alexis - I appreciate Internet Archive providing an official response, since it's an important question.

On Wed, Feb 27, 2013 at 1:07 AM, Alexis Rossi ale...@archive.org wrote:

> The data in Open Library comes from various libraries and other sources. We don't know whether those other parties have asserted any rights over that data, or whether they are legally able to do so. As our Terms of Service states, we ourselves do not assert any rights over the data in OL, but that doesn't mean that someone else won't. I understand that you'd like us to put some clarifying license on the data, but we don't have the information that would allow us to do that.

That, of course, makes it completely impossible to reuse the data in any legal manner.

The way that one controls the provenance of the data is by requiring contributors to license their contributions in a manner which is compatible with your intended license for downstream consumers. For example, the DPLA says that they don't think most bibliographic data is copyrightable, but to the extent that it is, all their contributors will be required to license contributions under CC0. Of course, part of that is requiring that contributors have the right to grant such a license.

Allowing anyone to contribute anything, no matter how shady the provenance, and telling the consumer that they bear all risk of using the data is the modus operandi of the Pirate Bay and its ilk. It shouldn't be the way Open Library operates.

A little banner popped up the other day when I used Open Library to tell me that the Boston Public Library uses Open Library data. I wonder if venerable institutions like that are aware of what a shady operation they're dealing with.

Depending on how many sources of data were used, it may be a lot of work, but the only sane way forward that allows people to actually use this data is to start the process of vetting and scrubbing what people contributed. Otherwise, all this work is going into something that no one will ever be able to reuse.

Tom

> Thanks,
> Alexis Rossi
> Internet Archive
>
> On 26-Feb-2013, at 10:16 AM, Tom Morris wrote:
>
>> Happy Open Data Day! Thanks for bringing this up.
>>
>> I think one of the best things that people can do to be open is to be explicit and transparent about the terms that they license their information under and, if they accept remix content, the terms under which they accept data. One of the key things that Creative Commons licenses were designed to address is the friction caused by everyone having to read, understand, and approve of lots of different unique licenses.
>>
>> Refusing to declare a license and making cloudy statements about the provenance of the data is the ultimate in anti-openness.
>>
>> I eagerly await clarification from the Open Library and/or Internet Archive staff.
>>
>> Tom
>>
>> On Sat, Feb 23, 2013 at 5:10 PM, Ben Companjen bencompan...@gmail.com wrote:
>>
>>> I should have added the most clearly confusing license statement, buried at http://openlibrary.org/developers/licensing
>>>
>>> "The Internet Archive does not assert any new copyright or other proprietary rights over any of the material in the Open Library database. There may be existing rights issues on some contributions and in some jurisdictions. When it comes to community projects, the legal issues are, frankly, very confusing, but we are attempting to make a database that can be openly used for a wide variety of purposes. We appreciate all that have contributed."
>>>
>>> On 23 February 2013 20:40, Ben Companjen bencompan...@gmail.com wrote:
>>>
>>>> On this Open Data Day [0], for some already over, for others just starting, to celebrate, promote and use Open Data (with open as in the Open Definition [1]), I would really like to know: under what terms can data from Open Library be used?
>>>>
>>>> This question recently surfaced (on the ol-tech list) when John Shutt proposed to add first paragraphs from Wikipedia to work descriptions. Wikipedia's licence requires attribution (which is easily added), but also that the derived work is shared under the same conditions. When someone edits a work or book, she agrees to waive all rights by sharing the content of the edit under CC0 [2]. CC0 is incompatible with CC-BY and CC-BY-SA, because it requires no attribution and certainly no share alike.
>>>>
>>>> In the discussion, Karen pointed at the Terms and conditions of the Internet Archive [3] (that as you know hosts/pays for OL) and that they apply to OL content. They state that IA respects others' copyright (and not much more is said). Tom Morris replied that makes it really difficult to know what usage rights are granted for OL dumps, web data, API data etc.
>>>>
>>>> I believe it is important to show (limitations to) what can be done with the OL data and if necessary, clearly state that text with some/all rights restricted (e.g. from Wikipedia) should not be included in OL.
>>>>
>>>> Related question: as lots of information was
Re: [ol-discuss] The (necessary) License/Licence Question
On Fri, March 1, 2013 11:07 am, Tom Morris wrote:

> Alexis - I appreciate Internet Archive providing an official response, since it's an important question.

Based on recent postings I'm not certain anymore that there /is/ anyone at Internet Archive who can provide an /official/ response -- other than Brewster Kahle, of course. Expecting anything more than what is already in the TOS is, IMO, a vain hope.

> On Wed, Feb 27, 2013 at 1:07 AM, Alexis Rossi ale...@archive.org wrote:
>
>> The data in Open Library comes from various libraries and other sources. We don't know whether those other parties have asserted any rights over that data, or whether they are legally able to do so. As our Terms of Service states, we ourselves do not assert any rights over the data in OL, but that doesn't mean that someone else won't. I understand that you'd like us to put some clarifying license on the data, but we don't have the information that would allow us to do that.
>
> That, of course, makes it completely impossible to reuse the data in any legal manner.

Yes. What's your point? Reusing data from OL is a lot like dumpster divers scavenging food from restaurant dumpsters: it's probably nutritious, but you really never know where it's been.

<!-- snip -->

> Allowing anyone to contribute anything, no matter how shady the provenance, and telling the consumer that they bear all risk of using the data is the modus operandi of the Pirate Bay and its ilk. It shouldn't be the way Open Library operates.

Perhaps not, but that's the way OL /does/ operate, and you simply have to decide how that impacts your little corner of the world, and given that, how much time you want to spend contributing.

<!-- snip -->

> Depending on how many sources of data were used, it may be a lot of work, but the only sane way forward that allows people to actually use this data is to start the process of vetting and scrubbing what people contributed. Otherwise, all this work is going into something that no one will ever be able to reuse.

At this point I would say that it is virtually impossible to scrub the data that is in OL's data store; there's just too much of it. Had OL been aware of the issue at the outset, and retained both provenance information and license claims, maybe it would have been possible, but they did not and now it is not. The best approach is probably to treat the current OL as a beta test, discard what's there, and start over. You probably wouldn't even need to discard the entire data store, only that part that is not simple bibliographic data.

My advice for those who want to consume OL data is as follows:

1. If it's unadorned data, e.g. book titles, author names, etc., use it without reservation (if you're in the U.S. -- I understand that some European countries allow for copyrights on collections of data, but the U.S. does not). It may not be accurate, but at least it's not copyrighted.

2. Never consume cover images, as these are almost certainly under copyright and uploaded without permission. I frequently check e-books out from my public library, and about half the time the cover image in the e-book is just a generic image with the book title printed on it. That's because the publisher who has rights to republish the text may not have rights to publish the cover image on anything but a paper book; two creators, two copyrights.

3. The origin of any text that displays even a modicum of creativity cannot be trusted. Feel free to read it in the OL user interface, but whether you republish it depends on your own risk tolerance. Would you eat food from a dumpster? If so, republishing OL data may be acceptable to you.

For those who want to contribute to OL, I would advise:

1. Add all the unadorned data you want. A note telling people where you obtained the data from would be appreciated, but is not necessary.

2. Don't upload cover images; you are almost certainly violating someone else's copyright.

3. For everything else, do whatever you want. If OL doesn't care about rights, why should you? It's the one doing the publishing, so it's the one that will get sued. If you're nervous, use a phony e-mail address and post from behind an IP anonymiser.
Re: [ol-discuss] The (necessary) License/Licence Question
unsubscribe.

Peter Francis

From: Lee Passey l...@passkeysoft.com
To: Open Library -- general discussion ol-discuss@archive.org
Sent: Saturday, 2 March 2013, 4:04
Subject: Re: [ol-discuss] The (necessary) License/Licence Question
Re: [ol-discuss] The (necessary) License/Licence Question
+1

This is the kind of roundabout answer that you get from many institutions, such as the Smithsonian, but it basically puts a stop to all forms of reuse, defeating the purpose of an open environment.

Alex Stinson

On Mon, Feb 25, 2013 at 10:46 PM, Tom Morris tfmor...@gmail.com wrote:

> Happy Open Data Day! Thanks for bringing this up.
>
> I think one of the best things that people can do to be open is to be explicit and transparent about the terms that they license their information under and, if they accept remix content, the terms under which they accept data. One of the key things that Creative Commons licenses were designed to address is the friction caused by everyone having to read, understand, and approve of lots of different unique licenses.
>
> Refusing to declare a license and making cloudy statements about the provenance of the data is the ultimate in anti-openness.
>
> I eagerly await clarification from the Open Library and/or Internet Archive staff.
>
> Tom
>
> On Sat, Feb 23, 2013 at 5:10 PM, Ben Companjen bencompan...@gmail.com wrote:
>
>> I should have added the most clearly confusing license statement, buried at http://openlibrary.org/developers/licensing
>>
>> "The Internet Archive does not assert any new copyright or other proprietary rights over any of the material in the Open Library database. There may be existing rights issues on some contributions and in some jurisdictions. When it comes to community projects, the legal issues are, frankly, very confusing, but we are attempting to make a database that can be openly used for a wide variety of purposes. We appreciate all that have contributed."
>>
>> On 23 February 2013 20:40, Ben Companjen bencompan...@gmail.com wrote:
>>
>>> On this Open Data Day [0], for some already over, for others just starting, to celebrate, promote and use Open Data (with open as in the Open Definition [1]), I would really like to know: under what terms can data from Open Library be used?
>>>
>>> This question recently surfaced (on the ol-tech list) when John Shutt proposed to add first paragraphs from Wikipedia to work descriptions. Wikipedia's licence requires attribution (which is easily added), but also that the derived work is shared under the same conditions. When someone edits a work or book, she agrees to waive all rights by sharing the content of the edit under CC0 [2]. CC0 is incompatible with CC-BY and CC-BY-SA, because it requires no attribution and certainly no share alike.
>>>
>>> In the discussion, Karen pointed at the Terms and conditions of the Internet Archive [3] (that as you know hosts/pays for OL) and that they apply to OL content. They state that IA respects others' copyright (and not much more is said). Tom Morris replied that makes it really difficult to know what usage rights are granted for OL dumps, web data, API data etc.
>>>
>>> I believe it is important to show (limitations to) what can be done with the OL data and if necessary, clearly state that text with some/all rights restricted (e.g. from Wikipedia) should not be included in OL.
>>>
>>> Related question: as lots of information was ingested from the Library of Congress and other libraries and Amazon, were there special or general (non-exclusive) agreements that allowed OL to take this data? There may be arguments for fair use, facts can't be copyrighted, LC data is in the public domain, but these are partial answers at best.
>>>
>>> Enjoy Open Data Day :)
>>>
>>> Regards,
>>> Ben
>>>
>>> [0] http://opendataday.org
>>> [1] http://opendefinition.org
>>> [2] http://creativecommons.org/about/cc0
>>> [3] http://archive.org/about/terms.php
Re: [ol-discuss] The (necessary) License/Licence Question
Happy Open Data Day! Thanks for bringing this up.

I think one of the best things that people can do to be open is to be explicit and transparent about the terms that they license their information under and, if they accept remix content, the terms under which they accept data. One of the key things that Creative Commons licenses were designed to address is the friction caused by everyone having to read, understand, and approve of lots of different unique licenses.

Refusing to declare a license and making cloudy statements about the provenance of the data is the ultimate in anti-openness.

I eagerly await clarification from the Open Library and/or Internet Archive staff.

Tom

On Sat, Feb 23, 2013 at 5:10 PM, Ben Companjen bencompan...@gmail.com wrote:

> I should have added the most clearly confusing license statement, buried at http://openlibrary.org/developers/licensing
>
> "The Internet Archive does not assert any new copyright or other proprietary rights over any of the material in the Open Library database. There may be existing rights issues on some contributions and in some jurisdictions. When it comes to community projects, the legal issues are, frankly, very confusing, but we are attempting to make a database that can be openly used for a wide variety of purposes. We appreciate all that have contributed."
>
> On 23 February 2013 20:40, Ben Companjen bencompan...@gmail.com wrote:
>
>> On this Open Data Day [0], for some already over, for others just starting, to celebrate, promote and use Open Data (with open as in the Open Definition [1]), I would really like to know: under what terms can data from Open Library be used?
>>
>> This question recently surfaced (on the ol-tech list) when John Shutt proposed to add first paragraphs from Wikipedia to work descriptions. Wikipedia's licence requires attribution (which is easily added), but also that the derived work is shared under the same conditions. When someone edits a work or book, she agrees to waive all rights by sharing the content of the edit under CC0 [2]. CC0 is incompatible with CC-BY and CC-BY-SA, because it requires no attribution and certainly no share alike.
>>
>> In the discussion, Karen pointed at the Terms and conditions of the Internet Archive [3] (that as you know hosts/pays for OL) and that they apply to OL content. They state that IA respects others' copyright (and not much more is said). Tom Morris replied that makes it really difficult to know what usage rights are granted for OL dumps, web data, API data etc.
>>
>> I believe it is important to show (limitations to) what can be done with the OL data and if necessary, clearly state that text with some/all rights restricted (e.g. from Wikipedia) should not be included in OL.
>>
>> Related question: as lots of information was ingested from the Library of Congress and other libraries and Amazon, were there special or general (non-exclusive) agreements that allowed OL to take this data? There may be arguments for fair use, facts can't be copyrighted, LC data is in the public domain, but these are partial answers at best.
>>
>> Enjoy Open Data Day :)
>>
>> Regards,
>> Ben
>>
>> [0] http://opendataday.org
>> [1] http://opendefinition.org
>> [2] http://creativecommons.org/about/cc0
>> [3] http://archive.org/about/terms.php