Re: [OSM-dev] 0.6 bulk uploader

2009-01-22 Thread Dave Stubbs
2009/1/22 Frederik Ramm frede...@remote.org:
 Hi,

 Shaun McDonald wrote:
 It would be best if the bulk_import.py script was updated for 0.6. As
 everything needs to be wrapped into a changeset, it makes the bulk
 upload more complex than before.

 Yes and no... if you're talking uploads that are small enough to fit
 into one diff upload (i.e. not something like a TIGER county ;-) then
 bulk uploading should become trivial because you don't even have to keep
 track of the object IDs, you just throw your diff at the server and
 that's it. Such a bulk upload could basically be handled by a shell
 script that has three lwp-request calls.

 Hm, I see that each object in the diff must explicitly reference the
 changeset ID... so that would probably add one sed call to the shell
 script ;-)

 BTW: It seems that we're not currently imposing an upper limit for the
 number of changes in a diff upload, is that true? If so, we should
 perhaps add such a limit because the transactionality of diff uploads
 would otherwise make it too easy for the thoughtless script writer to
 mess up our database... only thing I'm unsure about is whether we should
 simply abort after n cycles in the DiffReader.commit method (easy to
 implement, but by the time we abort the database has already been
 unnecessarily loaded), or whether there is perhaps a way to make this
 depend on the size (in bytes) of the upload and it could easily be
 checked before even starting to process it?


Don't forget changeset size limitations.
As I remember it, we decided on something like a 50,000 edit limit to
keep changesets from becoming land mines that take out poor innocent
passers-by who suddenly find themselves trying to view a 1GB city
upload.

Last I saw, that limit was being enforced by the API, so any diff
upload that's bigger than 50,000 changes will fail automatically --
just not until Rails runs the validation, probably after the whole diff
has been processed. So the bulk uploader needs to split the data into
useful changesets, not just multiple uploads.
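
A minimal sketch of that splitting step, in Python since bulk_import.py
is a Python script. The function name and the flat list of changes are
assumptions made for illustration, and the 50,000 figure is only the
number recalled above, not a confirmed server setting.

```python
from typing import Iterator, List, Sequence

# Figure recalled in this thread; the value enforced by a real server may differ.
CHANGESET_LIMIT = 50_000


def split_into_changesets(changes: Sequence, limit: int = CHANGESET_LIMIT) -> Iterator[List]:
    """Yield consecutive batches of at most `limit` changes, one batch per changeset."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    for start in range(0, len(changes), limit):
        yield list(changes[start:start + limit])
```

This is a purely positional split. A real uploader would also have to
carry the server-assigned IDs from one changeset to the next whenever a
later batch refers to objects created in an earlier one.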

Dave

___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev


[OSM-dev] 0.6 bulk uploader

2009-01-21 Thread Iván Sánchez Ortega
Hi all,

I have a homebrew OSM 0.6 test server, and about 4000 JOSM-compatible .osm 
files. JOSM is able to upload those files nicely with no hassles.

I would like to automatically upload all those files to my server, but from 
what I've read on the wiki, bulk_import.py is not ready for 0.6. 

Is there any similar bulk upload script compatible with 0.6? Should I use 
osmosis instead? Any other ideas?

Cheers,
-- 
--
Iván Sánchez Ortega i...@sanchezortega.es

Proudly running Debian Linux with 2.6.26-1-amd64 kernel, KDE 3.5.10, and PHP 
5.2.6-0.1+b1 generating this signature.
Uptime: 00:33:56 up 20 days,  7:15,  2 users,  load average: 0.17, 0.38, 0.41




Re: [OSM-dev] 0.6 bulk uploader

2009-01-21 Thread Stefan de Konink
Iván Sánchez Ortega wrote:
 Any other ideas?

It seems 0.6 supports uploading diffs:

http://wiki.openstreetmap.org/wiki/OSM_Protocol_Version_0.6#Diff_uploads


Stefan



Re: [OSM-dev] 0.6 bulk uploader

2009-01-21 Thread Brett Henderson
Iván Sánchez Ortega wrote:
 Hi all,

 I have a homebrew OSM 0.6 test server, and about 4000 JOSM-compatible .osm 
 files. JOSM is able to upload those files nicely with no hassles.

 I would like to automatically upload all those files to my server, but from 
 what I've read on the wiki, bulk_import.py is not ready for 0.6. 

 Is there any similar bulk upload script compatible with 0.6? Should I use 
 osmosis instead? Any other ideas?
   
Currently osmosis doesn't support uploading either.




Re: [OSM-dev] 0.6 bulk uploader

2009-01-21 Thread Chris Browet


 It seems 0.6 supports uploading diffs:

 http://wiki.openstreetmap.org/wiki/OSM_Protocol_Version_0.6#Diff_uploads


Yummy... Transactions!


Re: [OSM-dev] 0.6 bulk uploader

2009-01-21 Thread Shaun McDonald

On 21 Jan 2009, at 23:40, Iván Sánchez Ortega wrote:

 Hi all,

 I have a homebrew OSM 0.6 test server, and about 4000 JOSM-compatible .osm
 files. JOSM is able to upload those files nicely with no hassles.

 I would like to automatically upload all those files to my server, but from
 what I've read on the wiki, bulk_import.py is not ready for 0.6.

 Is there any similar bulk upload script compatible with 0.6? Should I use
 osmosis instead? Any other ideas?


It would be best if the bulk_import.py script was updated for 0.6. As  
everything needs to be wrapped into a changeset, it makes the bulk  
upload more complex than before.

Shaun


Re: [OSM-dev] 0.6 bulk uploader

2009-01-21 Thread Stefan de Konink
Shaun McDonald wrote:
 It would be best if the bulk_import.py script was updated for 0.6. As  
 everything needs to be wrapped into a changeset, it makes the bulk  
 upload more complex than before.

More? How is this possible? It would be one changeset, put entire file. 
Done.


Stefan



Re: [OSM-dev] 0.6 bulk uploader

2009-01-21 Thread Frederik Ramm
Hi,

Shaun McDonald wrote:
 It would be best if the bulk_import.py script was updated for 0.6. As  
 everything needs to be wrapped into a changeset, it makes the bulk  
 upload more complex than before.

Yes and no... if you're talking uploads that are small enough to fit 
into one diff upload (i.e. not something like a TIGER county ;-) then 
bulk uploading should become trivial because you don't even have to keep 
track of the object IDs, you just throw your diff at the server and 
that's it. Such a bulk upload could basically be handled by a shell 
script that has three lwp-request calls.

Hm, I see that each object in the diff must explicitly reference the 
changeset ID... so that would probably add one sed call to the shell 
script ;-)
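
As a rough Python equivalent of that shell script (a sketch, not a
tested uploader): the three requests are create changeset, upload diff,
close changeset, and the "sed" step stamps the changeset ID onto every
object in the osmChange document. The endpoint paths follow the 0.6
protocol description on the wiki; the base URL and both function names
are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: a local test server; substitute your own API root.
API_BASE = "http://localhost:3000/api/0.6"


def stamp_changeset(osmchange_xml: str, changeset_id: int) -> str:
    """Set the changeset attribute on every object in an osmChange document."""
    root = ET.fromstring(osmchange_xml)
    for action in root:          # <create>, <modify> and <delete> blocks
        for element in action:   # <node>, <way> and <relation> objects
            element.set("changeset", str(changeset_id))
    return ET.tostring(root, encoding="unicode")


def plan_requests(changeset_id: int, base: str = API_BASE):
    """List the three calls as (method, URL); the ID comes from the first response."""
    return [
        ("PUT", f"{base}/changeset/create"),
        ("POST", f"{base}/changeset/{changeset_id}/upload"),
        ("PUT", f"{base}/changeset/{changeset_id}/close"),
    ]
```

The body of the create call returns the new changeset ID, which is then
passed to stamp_changeset() before the diff is sent with the second
request.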

BTW: It seems that we're not currently imposing an upper limit for the 
number of changes in a diff upload, is that true? If so, we should 
perhaps add such a limit because the transactionality of diff uploads 
would otherwise make it too easy for the thoughtless script writer to 
mess up our database... only thing I'm unsure about is whether we should 
simply abort after n cycles in the DiffReader.commit method (easy to 
implement, but by the time we abort the database has already been 
unnecessarily loaded), or whether there is perhaps a way to make this 
depend on the size (in bytes) of the upload and it could easily be 
checked before even starting to process it?
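
To make the two options concrete, here is a small Python sketch (the
server itself is Rails, so this only illustrates the trade-off). Both
limits, the exception class and the function names are hypothetical.

```python
MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # hypothetical byte limit
MAX_CHANGES = 50_000                 # hypothetical change-count limit


class UploadTooLarge(Exception):
    """Raised when an upload exceeds one of the limits."""


def check_size_up_front(content_length: int, max_bytes: int = MAX_UPLOAD_BYTES) -> None:
    """Byte-size option: reject on the declared body size, before any parsing."""
    if content_length > max_bytes:
        raise UploadTooLarge(f"{content_length} bytes exceeds {max_bytes}")


def process_with_counter(changes, apply, max_changes: int = MAX_CHANGES) -> int:
    """Counting option: abort once more than max_changes changes have been seen.

    Everything applied before the abort still has to be rolled back by the caller.
    """
    count = 0
    for change in changes:
        if count >= max_changes:
            raise UploadTooLarge(f"more than {max_changes} changes")
        apply(change)
        count += 1
    return count
```

The counter only fires after max_changes objects have already been
written inside the transaction, which is the wasted load mentioned
above; the byte check costs nothing but is only a proxy for the number
of changes.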

Bye
Frederik

-- 
Frederik Ramm  ##  eMail frede...@remote.org  ##  N49°00'09 E008°23'33



Re: [OSM-dev] 0.6 bulk uploader

2009-01-21 Thread Stefan de Konink
Frederik Ramm wrote:
 BTW: It seems that we're not currently imposing an upper limit for the 
 number of changes in a diff upload, is that true? If so, we should 
 perhaps add such a limit because the transactionality of diff uploads 
 would otherwise make it too easy for the thoughtless script writer to 
 mess up our database... only thing I'm unsure about is whether we should 
 simply abort after n cycles in the DiffReader.commit method (easy to 
 implement, but by the time we abort the database has already been 
 unnecessarily loaded), or whether there is perhaps a way to make this 
 depend on the size (in bytes) of the upload and it could easily be 
 checked before even starting to process it?

So 0.6 brought us atomic transactions and you are proposing to break 
transactions that are too big into pieces? What is the use case for that 
breakage?

Someone loads 'a city' in South-Africa, this is an upload of 1GB, a 
transaction is started, everything is inserted, a transaction is 
committed, profit?

Where do we need a limit? We will create a transaction log in the *SQL 
server; until the request is actually committed we do not return that 
data in any query anyway. The only significant problem arises if we want 
to return after the commit and the processing time is longer than our 
HTTP timeout. Then again, if a user is able to query the changeset, he 
must also be able to query its processing status, so the return value 
after the request could simply be 'queued'.


A simpler approach that needs less code:

- Upload a file to OSM (By API/FTP/DAV) to changeset.osm
- This returns a 'queued' response on successful upload
- The files are processed in the order of upload
- Start Transaction
  - Create new
  - Delete from diff
  - Update existing
- Commit/Rollback
- Update status of the changeset


An editor that uploads in the diff way would have to poll the OSM server 
for the status of the changeset.
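
The flow above could be sketched roughly as follows. This is a toy
in-memory model in Python, not a proposal for the actual server code:
the class name, the dict standing in for the database and the status
strings are all assumptions.

```python
from collections import deque


class UploadQueue:
    """Toy model: uploads are queued, then applied one at a time, all or nothing."""

    def __init__(self, database: dict):
        self.database = database  # stands in for the real tables
        self.pending = deque()    # uploads in order of arrival
        self.status = {}          # changeset id -> "queued", "committed" or "rolled back"

    def submit(self, changeset_id, diff) -> str:
        """Accept an upload and answer immediately with its queue status."""
        self.pending.append((changeset_id, diff))
        self.status[changeset_id] = "queued"
        return "queued"

    def process_next(self) -> None:
        """Apply the oldest upload: create, then delete, then update."""
        changeset_id, diff = self.pending.popleft()
        working = dict(self.database)  # "start transaction" on a copy
        try:
            for key, value in diff.get("create", {}).items():
                if key in working:
                    raise KeyError(f"{key} already exists")
                working[key] = value
            for key in diff.get("delete", []):
                del working[key]  # raises KeyError if the object is missing
            for key, value in diff.get("modify", {}).items():
                if key not in working:
                    raise KeyError(f"{key} does not exist")
                working[key] = value
        except KeyError:
            self.status[changeset_id] = "rolled back"  # the copy is discarded
            return
        self.database.clear()
        self.database.update(working)  # "commit"
        self.status[changeset_id] = "committed"
```

An editor would call submit(), receive "queued", and then poll the
status entry until it changes to "committed" or "rolled back".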


Stefan



Re: [OSM-dev] 0.6 bulk uploader

2009-01-21 Thread Frederik Ramm
Hi,

Stefan de Konink wrote:
 Where do we need a limit? 

I assume that while doing all the inserts, the Ruby code has to keep 
track of all the IDs involved in order to be able to adjust the 
references in other objects. This will consume memory which is a limited 
resource. Also, it is my assumption that any transaction that is open 
over a prolonged period of time will create extra complexities on the 
database side (locks etc.), making very long-lasting transactions 
unwanted/difficult.
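
To illustrate the bookkeeping I mean (in Python, and only as an
assumption about what any implementation has to do, not a description
of the actual Rails code): every newly created object needs an entry
mapping its placeholder ID to the real one, so the table grows with the
size of the diff. The function name and the data layout are made up for
this example.

```python
def assign_ids(created_nodes, ways, next_id):
    """Map placeholder node IDs to new IDs and rewrite the way references.

    created_nodes: placeholder IDs from the diff (negative by convention)
    ways: dict of way ID -> list of node references
    next_id: first free real node ID
    """
    id_map = {}  # one entry per created object, kept until the diff is done
    for placeholder in created_nodes:
        id_map[placeholder] = next_id
        next_id += 1
    # References to existing nodes are left untouched.
    rewritten = {
        way_id: [id_map.get(ref, ref) for ref in refs]
        for way_id, refs in ways.items()
    }
    return id_map, rewritten
```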

I'd rather impose an arbitrary limit which is much smaller than what the 
servers can theoretically handle and publish this so that whoever 
uploads something has a chance to be relatively sure that his query will 
work - instead of letting anyone upload an arbitrarily big diff that 
will give a Rails out of memory exception somewhere down the line.

Bye
Frederik

-- 
Frederik Ramm  ##  eMail frede...@remote.org  ##  N49°00'09 E008°23'33
