Re: [OSM-talk] Implications of using aggregated/statistical data from both licenses (ODbL and CC-by-SA) for OSMdoc?
Hi, ...

> I guess numbers 3) and 4) are problematic. I know they do much more
> than the current feature set of OSMdoc but I implemented them because
> those were two fairly often requested features. I guess their priority
> just dropped a lot :)
>
> So I guess my current solution would be to offer access to the
> aggregated statistical data and then offer two views into the other
> data. One for ODbL data and one for CC by-SA data. How would those two
> datasets have to be separated?
> Or to phrase it in another way: When can I combine ODbL data with CC
> by-SA data, how do I do that and what do I have to do to comply to
> (both) licenses?

Independent of the fact that this remains an interesting discussion about the effects of both licenses, I am wondering whether some of the practical issues here for OSMdoc come from a different understanding of what is going to happen at the changeover than what I thought was planned.

My understanding is the following (assuming everything goes ahead):

At some point there will be a cut-off when the actual switch happens. At that point there will be a planet dump with the full history of everything in OSM, licensed under CC-BY-SA. Afterwards, all the data that cannot be relicensed will be taken out of the main OSM database, after which another planet dump, including the full history of everything that remains, will be produced and licensed under the ODbL.

It is not the case that the history data will only be available under CC-BY-SA and that the ODbL only covers current and future data. The entire history of all OSM data available at that point will be licensed under the ODbL and can be used by OSMdoc. The only history missing from the ODbL dump is that of the data that has been removed from the ODbL version of OSM.

So if that amount of data was deemed small enough that it won't harm OSM as a project, and thus the relicensing can go ahead, I would have thought that it is also small enough not to affect OSMdoc particularly either, and that it can be dropped from the OSMdoc statistics. At that point you have no worries about mixing the two licenses anymore.

Not sure if that was in any way unclear to start with, but I thought it was worth pointing out, just in case.

Kai

> Cheers,
> Lars
>
> [1] http://www.openstreetmap.org/browse/way/45724946

___
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk
Re: [OSM-talk] Implications of using aggregated/statistical data from both licenses (ODbL and CC-by-SA) for OSMdoc?
On Wed, Dec 9, 2009 at 15:43, Richard Weait wrote:
> On Wed, Dec 9, 2009 at 6:47 AM, Richard Fairhurst wrote:
>>
>> Lars Francke wrote:
>>> At the moment I'm displaying statistical data about a snapshot of
>>> the OSM data.
>>
>> Hooee, this is probably the single trickiest question I've seen.
>>
>> I can't find any precedents for whether aggregate statistics such as
>> OSMdoc's are considered derivative works. My gut feeling, and no more than
>> that, is that they probably aren't. You are not really deriving any
>> information that's in the OSM database and offering it up for reuse.
>
> Lars' use case is a produced work[*]. How do we make this clear to
> the community and clearly permitted in the license?

Thanks for the answer! I think it'd be great if something like this could be added to the Use cases page (or other documents). If you need any more details on what I'm doing, I'll gladly provide the information. I could make the source code available, but that would probably do more harm than good at this time.

> Lars' Produced Work is a database, but his database is a list of keys
> and values with use totals. The Produced Work drops the geo location
> data and the connectivity data from OSM. This irreversibly prevents
> recreating or extracting the original data. So by the community
> guideline above OSMDoc is a Produced Work.

I don't drop all the geo data and I don't drop the connectivity data. In fact, I'm producing more of this data than there was in the original database. Four examples:

1) For every key and value I record which elements this key and value are _currently_ used on.

2) Every key and value has a bounding box so one can see where the tag has been used.

3) Way 1 in version 1 has four nodes. Two of those nodes are moved. The original OSM data doesn't reflect this move in a new version of the way. I record those "minor" version changes in nodes, ways and relations and make them available on a page similar to OpenStreetMap's current interface [1].

4) I have an experimental historical API that answers timestamp and time-range queries: How did London look on 23.12.2006 at 12:33? Show me all changes for Hamburg in the year 2007.

1) and 2) don't pose a problem, I hope. For 1) I can just drop all references to pre-ODbL elements, as this information is useless anyway (those referenced elements can't be retrieved via the main API), but 2) is geo data aggregated from both data sets.

I guess numbers 3) and 4) are problematic. I know they do much more than the current feature set of OSMdoc, but I implemented them because those were two fairly often requested features. I guess their priority just dropped a lot :)

So I guess my current solution would be to offer access to the aggregated statistical data and then offer two views into the other data: one for ODbL data and one for CC by-SA data. How would those two datasets have to be separated?

Or to phrase it another way: When can I combine ODbL data with CC by-SA data, how do I do that, and what do I have to do to comply with (both) licenses?

Cheers,
Lars

[1] http://www.openstreetmap.org/browse/way/45724946
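For readers unfamiliar with the kind of data under discussion, items 1) and 2) in the message above (per-tag use totals plus a bounding box) can be illustrated roughly as follows. This is only a sketch and not OSMdoc's actual code, which was not published in this thread. The names `TagStats` and `aggregate_node_tags` and the sample data are hypothetical, and the sketch only looks at tagged nodes in an OSM XML extract.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class TagStats:
    """Use total and bounding box for one (key, value) pair (hypothetical type)."""
    count: int = 0
    min_lat: float = 90.0
    min_lon: float = 180.0
    max_lat: float = -90.0
    max_lon: float = -180.0

    def add(self, lat: float, lon: float) -> None:
        # Count one more use and grow the bounding box to include this point.
        self.count += 1
        self.min_lat = min(self.min_lat, lat)
        self.min_lon = min(self.min_lon, lon)
        self.max_lat = max(self.max_lat, lat)
        self.max_lon = max(self.max_lon, lon)


def aggregate_node_tags(osm_xml: str) -> dict[tuple[str, str], TagStats]:
    """Aggregate tag statistics over all tagged nodes in an OSM XML string."""
    stats: dict[tuple[str, str], TagStats] = {}
    root = ET.fromstring(osm_xml)
    for node in root.iter("node"):
        lat = float(node.get("lat"))
        lon = float(node.get("lon"))
        for tag in node.findall("tag"):
            key = (tag.get("k"), tag.get("v"))
            stats.setdefault(key, TagStats()).add(lat, lon)
    return stats


# Made-up sample data, for illustration only.
SAMPLE = """<osm version="0.6">
  <node id="1" lat="51.50" lon="-0.12"><tag k="amenity" v="pub"/></node>
  <node id="2" lat="53.55" lon="9.99"><tag k="amenity" v="pub"/></node>
  <node id="3" lat="53.56" lon="10.00"><tag k="shop" v="bakery"/></node>
</osm>"""

stats = aggregate_node_tags(SAMPLE)
pub = stats[("amenity", "pub")]
print(pub.count, (pub.min_lat, pub.min_lon, pub.max_lat, pub.max_lon))
```

The result keeps only totals and an extent per tag, which is the kind of aggregate that Richard Weait's reply assumes. Items 3) and 4) in the message keep per-element history and geometry, which is why Lars treats them as a separate case.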
Re: [OSM-talk] Implications of using aggregated/statistical data from both licenses (ODbL and CC-by-SA) for OSMdoc?
On Wed, Dec 9, 2009 at 6:47 AM, Richard Fairhurst wrote:
>
> Lars Francke wrote:
>> At the moment I'm displaying statistical data about a snapshot of
>> the OSM data.
>
> Hooee, this is probably the single trickiest question I've seen.
>
> I can't find any precedents for whether aggregate statistics such as
> OSMdoc's are considered derivative works. My gut feeling, and no more than
> that, is that they probably aren't. You are not really deriving any
> information that's in the OSM database and offering it up for reuse.

Lars' use case is a produced work[*]. How do we make this clear to the community and clearly permitted in the license?

Our draft community guidelines on Produced Work[1] say:

"If it was intended for the extraction of the original data, then it is a database and not a Produced Work. Otherwise it is a Produced Work."

Lars' Produced Work is a database, but his database is a list of keys and values with use totals. The Produced Work drops the geo location data and the connectivity data from OSM. This irreversibly prevents recreating or extracting the original data. So by the community guideline above OSMDoc is a Produced Work.

Best regards,
Richard

[*] I've made some assumptions about how Lars makes OSMDoc. Corrections welcome.

[1] http://wiki.openstreetmap.org/wiki/Open_Data_License/Produced_Work_-_Guideline
Re: [OSM-talk] Implications of using aggregated/statistical data from both licenses (ODbL and CC-by-SA) for OSMdoc?
Lars Francke wrote:
> At the moment I'm displaying statistical data about a snapshot of
> the OSM data. If it'd stay that way it would be very easy for me to
> switch from one license to the other as the data wouldn't depend on
> data from the CC by-SA set. But I'm currently rewriting the tool to
> account for historical statistics. One example would be a feature
> that has been requested quite often: How many users have used a tag.
> This means I have to incorporate the history of all elements into my
> numbers. I wouldn't want to lose the data if we switch but the number
> is clearly derived from both databases (the ODbL database and the CC
> by-SA dump). This is only one example. The new version uses historic
> data all over the place and I've been working very hard these last
> few weeks/months to get this far and to get the data so I wouldn't
> want to throw everything "pre ODbL" away as it would alter the
> meaning of the statistics.

Hooee, this is probably the single trickiest question I've seen.

I can't find any precedents for whether aggregate statistics such as OSMdoc's are considered derivative works. My gut feeling, and no more than that, is that they probably aren't. You are not really deriving any information that's in the OSM database and offering it up for reuse. The only meaningful content that you're reproducing is the actual text of the tags and values - yet even then, these are divorced from the objects to which they apply.

If your statistics aren't a derivative work, they don't inherit the share-alike provisions of either licence, so no conflict arises.

It isn't black and white, of course. If you extract a list of "Most popular tags in the UK, ordered by popularity", I doubt it's derivative. If you extract a list of "Most popular values for the name= tag applied to streets in Charlbury, ordered by popularity", it would be. There's clearly a spectrum.

I believe OSMF has received legal advice that "community guidelines" (informal Terms of Use) can help to influence edge cases such as this. I would therefore suggest that, as a community supported by OSMF, we add an explicit clarification to the relevant pages on the wiki that we do not consider aggregate statistics of this sort to form a derivative work.

cheers
Richard
[OSM-talk] Implications of using aggregated/statistical data from both licenses (ODbL and CC-by-SA) for OSMdoc?
I've just listened to the podcast, I've read a lot of the mails on the mailing lists in the last few days, I've read quite a few discussions about it on IRC, the proposal document, parts of the license, the human-readable form of the license and a lot of the Wiki pages. The more I read, the less I know.

So I'd like to take the LWG up on the offer (the one I just heard in the podcast) and ask what the implications would be for me as the developer of OSMdoc if OSM were to switch to ODbL (I'm assuming that at least parts of the OSM data would have to be made unavailable from the ODbL dataset after the switch).

At the moment I'm displaying statistical data about a snapshot of the OSM data. If it stayed that way, it would be very easy for me to switch from one license to the other, as the data wouldn't depend on data from the CC by-SA set. But I'm currently rewriting the tool to account for historical statistics. One example would be a feature that has been requested quite often: How many users have used a tag. This means I have to incorporate the history of all elements into my numbers. I wouldn't want to lose the data if we switch, but the number is clearly derived from both databases (the ODbL database and the CC by-SA dump). This is only one example. The new version uses historic data all over the place, and I've been working very hard these last few weeks/months to get this far and to get the data, so I wouldn't want to throw everything "pre ODbL" away, as it would alter the meaning of the statistics.

What would the license change mean for me? What do I have to do to comply? I don't even know which of these categories I belong to (taken from the ODbL text):

“Collective Database” – Means this Database in unmodified form as part of a collection of independent databases in themselves that together are assembled into a collective whole. A work that constitutes a Collective Database will not be considered a Derivative Database.

“Produced Work” – a work (such as an image, audiovisual material, text, or sounds) resulting from using the whole or a Substantial part of the Contents (via a search or other query) from this Database, a Derivative Database, or this Database as part of a Collective Database.

“Derivative Database” – Means a database based upon the Database, and includes any translation, adaptation, arrangement, modification, or any other alteration of the Database or of a Substantial part of the Contents. This includes, but is not limited to, Extracting or Re-utilising the whole or a Substantial part of the Contents in a new Database.

1) Collective Database? What does "modify" mean? Again from the ODbL:

“Database” – A collection of material (the Contents) arranged in a systematic or methodical way and individually accessible by electronic or other means offered under the terms of this License.

I don't change any of the content of the database. I just parse the provided XML and write the content into my own database (but I parse the timestamp strings to longs, leave out usernames, etc. - modification?).

2) Produced Work? Certainly. At least I think so, but... I don't know. I provide a viewable version of the derivative database I produced, and in all probability there will be charts/graphs/etc. based on this database.

3) Derivative Database? I think so.

As a short personal opinion about the license debate, I'd like to add that I've pretty much given up on understanding the license (and its implications) despite the continued efforts by all those involved. Please understand that this is not criticism of ODbL, CC by-SA, the LWG or anyone else involved in this license change. I know that a lot of people are working hard on this (on the "Yes" and on the "No" side). But I have the feeling that the "normal" user can't really understand or follow the details of the discussion anymore. This is even more true for those of us who don't speak English as a native language.

I know that mine is probably an uncommon case, but I couldn't find anything on the Use Cases site that deals with the combination of CC by-SA and ODbL data that would be applicable to my use case. So any help or insights would be greatly appreciated.

Cheers,
Lars