I will add my 2 cents in the same pot as Kathleen. A typical "learned" model, based on an ML algorithm and a substantial extract of OSM data: that seems like a Produced Work to me.
Hence...

- licence for the training inputs (underlying database, data structures built before learning): release under ODbL (Derivative Database: publish the entire database, or the alterations, or the algorithm)
- licence for the model (weights, internal data structures built during learning): Produced Work, release under any license that you like (Share Alike: no), required to credit OpenStreetMap (Attribution: yes)
- licence for the results (outputs): provided they are an insubstantial extract or contain no OSM data, release under any license that you like (Share Alike: no), not required to credit OpenStreetMap (Attribution: no)

If the results (outputs) are used to create a new database that contains the whole or a substantial part of the contents of the OSM database, this new database would be considered a Derivative Database and would trigger share-alike obligations under section 4.4.b of the ODbL.

[shameless plug of Geocoding guideline]
In fact, I think the Geocoding guideline is a very good starting point and could be extended to cover other applications (ML-based or not).

Geocoder underlying database ~equivalent~ training inputs
Geocoder application ~equivalent~ ML-based model
Geocoding results ~equivalent~ model outputs

This is my understanding or interpretation of the current materials:
https://opendatacommons.org/licenses/odbl/1.0/
https://wiki.osmfoundation.org/wiki/Licence/Community_Guidelines/Produced_Work_-_Guideline
https://wiki.openstreetmap.org/wiki/Open_Data_License/Produced_Work_-_Guideline
https://wiki.osmfoundation.org/wiki/Licence/Community_Guidelines/Geocoding_-_Guideline

--
althio

On Tue, 9 Apr 2019 at 15:35, Kathleen Lu via legal-talk <legal-talk@openstreetmap.org> wrote:
>
> My two cents:
> I'm not sure what you mean by internal data structures. If OSM data is used
> to train a ML algorithm, then I would think that the training inputs could be
> a substantial extract (possibly a trivial transformation of an extract).
> But what is trained would be an algorithm/weights, which I generally do not
> think of as a database at all? But since it uses an OSM database, a Produced
> Work seems the right concept:
> "a work (such as an image, audiovisual material, text,
> or sounds) resulting from using the whole or a Substantial part of the
> Contents (via a search or other query) from this Database, a Derivative
> Database, or this Database as part of a Collective Database."
> -Kathleen
>
> On Tue, Apr 9, 2019 at 5:06 AM Frederik Ramm <frede...@remote.org> wrote:
>>
>> Hi,
>>
>> is it a community consensus that, when someone uses OSM to train their
>> machine learning "black box", the internal data structures built during
>> learning constitute a derivative database? Or are there people who argue
>> that somehow the "black box" can ingest OSM data at will and still
>> remain 100% intellectual property of its operator?
>>
>> Further, assuming that we have a system that has ingested OSM by deep
>> learning and we say that this means its internal database is ODbL, what
>> would this mean for the output later produced by the same machine?
>>
>> Bye
>> Frederik
>>
>> --
>> Frederik Ramm ## eMail frede...@remote.org ## N49°00'09" E008°23'33"
>>
>> _______________________________________________
>> legal-talk mailing list
>> legal-talk@openstreetmap.org
>> https://lists.openstreetmap.org/listinfo/legal-talk