Re: [OSM-dev] 0.6 bulk uploader
2009/1/22 Frederik Ramm <frede...@remote.org>:
> BTW: It seems that we're not currently imposing an upper limit on the
> number of changes in a diff upload, is that true? If so, we should
> perhaps add such a limit, because the transactionality of diff uploads
> would otherwise make it too easy for the thoughtless script writer to
> mess up our database...

Don't forget changeset size limitations. As I remember it, we decided on something like a 50,000-edit limit to keep changesets from becoming land mines that take out poor innocent passers-by as they suddenly find themselves trying to view a 1 GB city upload.

Last I saw, that limit was being enforced by the API, so any diff upload bigger than 50,000 changes will fail automatically -- just not until Rails runs the validation, probably after the whole diff has been processed.

So the bulk uploader needs to split the data into useful changesets, not just multiple uploads.

Dave

___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev
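Dave's splitting requirement can be sketched in a few lines. This is a minimal, hypothetical helper (not part of `bulk_import.py`): it slices a flat list of osmChange edits into chunks that each stay under the 50,000-edit changeset limit mentioned above, so the uploader can open one changeset per chunk.

```python
# Hypothetical sketch: split a flat list of osmChange edits into chunks
# that each fit under the 50,000-edit changeset limit, so the bulk
# uploader can open one changeset per chunk instead of one giant one.

MAX_EDITS_PER_CHANGESET = 50000  # limit mentioned in the thread

def split_into_changesets(edits, limit=MAX_EDITS_PER_CHANGESET):
    """Yield successive slices of `edits`, each at most `limit` long."""
    for start in range(0, len(edits), limit):
        yield edits[start:start + limit]
```

Each yielded chunk would then be wrapped in its own changeset and uploaded separately.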
[OSM-dev] 0.6 bulk uploader
Hi all,

I have a homebrew OSM 0.6 test server, and about 4000 JOSM-compatible .osm files. JOSM is able to upload those files nicely with no hassles. I would like to automatically upload all those files to my server, but from the information I've read on the wiki, bulk_import.py is not ready for 0.6.

Is there any similar bulk upload script compatible with 0.6? Should I use osmosis instead? Any other ideas?

Cheers,
--
Iván Sánchez Ortega <i...@sanchezortega.es>
Re: [OSM-dev] 0.6 bulk uploader
Iván Sánchez Ortega wrote:
> Any other ideas?

It seems 0.6 supports uploading diffs: http://wiki.openstreetmap.org/wiki/OSM_Protocol_Version_0.6#Diff_uploads

Stefan
Re: [OSM-dev] 0.6 bulk uploader
Iván Sánchez Ortega wrote:
> Is there any similar bulk upload script compatible with 0.6? Should I
> use osmosis instead? Any other ideas?

Currently osmosis doesn't support uploading either.
Re: [OSM-dev] 0.6 bulk uploader
> It seems 0.6 supports uploading diffs:
> http://wiki.openstreetmap.org/wiki/OSM_Protocol_Version_0.6#Diff_uploads

Yummy... Transactions!
Re: [OSM-dev] 0.6 bulk uploader
On 21 Jan 2009, at 23:40, Iván Sánchez Ortega wrote:
> Is there any similar bulk upload script compatible with 0.6? Should I
> use osmosis instead? Any other ideas?

It would be best if the bulk_import.py script was updated for 0.6. As everything needs to be wrapped into a changeset, it makes the bulk upload more complex than before.

Shaun
Re: [OSM-dev] 0.6 bulk uploader
Shaun McDonald wrote:
> It would be best if the bulk_import.py script was updated for 0.6. As
> everything needs to be wrapped into a changeset, it makes the bulk
> upload more complex than before.

More complex? How is that possible? It would be one changeset: put the entire file in it, done.

Stefan
Re: [OSM-dev] 0.6 bulk uploader
Hi,

Shaun McDonald wrote:
> It would be best if the bulk_import.py script was updated for 0.6. As
> everything needs to be wrapped into a changeset, it makes the bulk
> upload more complex than before.

Yes and no... if you're talking about uploads that are small enough to fit into one diff upload (i.e. not something like a TIGER county ;-) then bulk uploading should become trivial, because you don't even have to keep track of the object IDs; you just throw your diff at the server and that's it. Such a bulk upload could basically be handled by a shell script that makes three lwp-request calls.

Hm, I see that each object in the diff must explicitly reference the changeset ID... so that would probably add one sed call to the shell script ;-)

BTW: It seems that we're not currently imposing an upper limit on the number of changes in a diff upload, is that true? If so, we should perhaps add such a limit, because the transactionality of diff uploads would otherwise make it too easy for the thoughtless script writer to mess up our database... The only thing I'm unsure about is whether we should simply abort after n cycles in the DiffReader.commit method (easy to implement, but by the time we abort, the database has already been unnecessarily loaded), or whether there is perhaps a way to make this depend on the size (in bytes) of the upload, which could easily be checked before even starting to process it.

Bye
Frederik
--
Frederik Ramm ## eMail frede...@remote.org ## N49°00'09 E008°23'33
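Frederik's "three lwp-request calls plus one sed" can be sketched in Python as well. This is a hypothetical illustration, not a tested client: the server URL is an assumed local 0.6 test instance, authentication is omitted for brevity, and the function names are made up. The three HTTP calls (create changeset, upload diff, close changeset) follow the 0.6 diff-upload flow; `stamp_changeset` plays the role of the sed call.

```python
# Hypothetical Python equivalent of the "three lwp-request calls plus
# one sed" sketch: open a changeset, stamp its id into the diff, upload
# the diff, close the changeset. The server URL is an assumption, and
# authentication is omitted for brevity.

import re
import urllib.request

API = "http://localhost:3000/api/0.6"  # assumed local 0.6 test server

def stamp_changeset(osc_xml, changeset_id):
    """The 'sed call': rewrite every changeset="..." attribute in the
    osmChange document to reference the freshly created changeset."""
    return re.sub(r'changeset="[^"]*"',
                  'changeset="%d"' % changeset_id, osc_xml)

def upload_diff(osc_xml, opener=urllib.request.urlopen):
    """The three HTTP calls: create, upload, close."""
    create = urllib.request.Request(
        API + "/changeset/create", method="PUT",
        data=b"<osm><changeset/></osm>")
    changeset_id = int(opener(create).read())  # server returns the new id

    diff = urllib.request.Request(
        API + "/changeset/%d/upload" % changeset_id, method="POST",
        data=stamp_changeset(osc_xml, changeset_id).encode())
    opener(diff)

    close = urllib.request.Request(
        API + "/changeset/%d/close" % changeset_id, method="PUT")
    opener(close)
    return changeset_id
```

The `opener` parameter exists only so the HTTP layer can be swapped out; in real use the default `urllib.request.urlopen` (plus credentials) would be used.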
Re: [OSM-dev] 0.6 bulk uploader
Frederik Ramm wrote:
> BTW: It seems that we're not currently imposing an upper limit on the
> number of changes in a diff upload, is that true? If so, we should
> perhaps add such a limit...

So 0.6 brought us atomic transactions, and you are proposing to break too-big transactions into pieces? What is the use case for breaking them up? Someone loads 'a city' in South Africa, this is an upload of 1 GB, a transaction is started, everything is inserted, the transaction is committed, profit? Where do we need a limit?

We will create a transaction log in the *SQL server; until the request is actually committed, we do not return that data in any query anyway. The only significant problem we get is if we want to return after a commit and the processing time is longer than our HTTP timeout. Then again, if a user is able to query the changeset, he must also be able to query the actual processing of it; hence the return value after the request could just be 'queued'.

In a simpler, less-code approach:

- Upload a file to OSM (by API/FTP/DAV) to changeset.osm
- This returns an 'in queue' response on successful upload
- The files are processed in the order of upload:
  - Start transaction
  - Create new
  - Delete from diff
  - Update existing
  - Commit/rollback
- Update the status of the changeset

An editor that uploads in the diff way would have to poll the OSM server for the status of the changeset.

Stefan
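The client side of the queued-processing flow Stefan proposes amounts to a polling loop. The sketch below is purely illustrative: the status values ("queued", "processing", "committed", "rolled_back") are hypothetical and not part of the real 0.6 API; `fetch_status` stands in for whatever status query such a server would offer.

```python
# Sketch of the polling an editor would do under Stefan's proposed
# queued-processing flow. The status values and the fetch_status hook
# are hypothetical -- this is not part of the real 0.6 API.

import time

def wait_for_changeset(fetch_status, poll_interval=1.0, timeout=60.0):
    """Poll the (hypothetical) changeset status until the server has
    committed or rolled back the queued diff, or until we give up."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status()
        if status in ("committed", "rolled_back"):
            return status
        time.sleep(poll_interval)
    raise TimeoutError("changeset still being processed")
```

This is exactly the extra client complexity the asynchronous approach trades for avoiding long-held server-side transactions.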
Re: [OSM-dev] 0.6 bulk uploader
Hi,

Stefan de Konink wrote:
> Where do we need a limit?

I assume that while doing all the inserts, the Ruby code has to keep track of all the IDs involved in order to be able to adjust the references in other objects. This consumes memory, which is a limited resource. Also, it is my assumption that any transaction that is open over a prolonged period of time will create extra complexity on the database side (locks etc.), making very long-lasting transactions unwanted/difficult.

I'd rather impose an arbitrary limit which is much smaller than what the servers can theoretically handle and publish it, so that whoever uploads something has a chance to be relatively sure that his query will work -- instead of letting anyone upload an arbitrarily big diff that will give a Rails out-of-memory exception somewhere down the line.

Bye
Frederik
--
Frederik Ramm ## eMail frede...@remote.org ## N49°00'09 E008°23'33
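The ID bookkeeping Frederik refers to looks roughly like this: each newly created object arrives in the diff with a placeholder (negative) ID, the server assigns real IDs as it inserts, and every later reference must be rewritten through a map that grows with the number of created objects. A minimal illustrative sketch (the function and the shape of the map are assumptions, not the actual Rails implementation):

```python
# Illustrative sketch of the ID bookkeeping Frederik describes: newly
# created objects carry placeholder (negative) ids, and every later
# reference to them must be rewritten once the server assigns real ids.
# The placeholder->real map grows with the number of created objects --
# the memory cost that motivates a size limit on diff uploads.

def remap_way_refs(way_node_refs, id_map):
    """Rewrite a way's node references through the placeholder->real map;
    ids the server already knew about pass through unchanged."""
    return [id_map.get(ref, ref) for ref in way_node_refs]
```

For a million newly created nodes, `id_map` alone holds a million entries for the lifetime of the upload, which is why a byte- or count-based cap checked up front is attractive.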