Re: [OSM-dev] minute diff - max delay

2010-09-12 Thread Brett Henderson
On Sat, Aug 14, 2010 at 11:38 AM, Brett Henderson br...@bretth.com wrote:

 On Sat, Aug 14, 2010 at 9:30 AM, Tom Hughes t...@compton.nu wrote:

 On 14/08/10 00:19, Grant Slater wrote:

 On 14 August 2010 00:10, Brett Henderson br...@bretth.com wrote:


 Is anybody aware of anything that happened on that day *other* than the
 database upgrade?  Any new imports, etc.


 The database was fully re-imported (planned and triple backed up) and
 the transaction IDs were reset due to this.
 zere was able to set the transaction id used by osmosis diff export
 because I believe you were not around or weren't available at the
 time.

 Also: PostgreSQL 8.3 -> 8.4. RAID 10 on 10 disks to RAID 10 on 16 disks.
 RAID stripe size changed from 256KB to 64KB.


 There's not really any great mystery here, we know it was the upgrade to
 postgres 8.4 (or just as likely the reimport of the db) that triggered it.


 Okay.  I didn't realise that a database upgrade had occurred, I thought it
 was only disk/RAID changes.



 We just need to get to the bottom of what is making some of the queries
 run slowly, but it's not a very easy thing to do.


 Is it only Osmosis queries that are running slowly?


 My assumption was that it was choosing a bad execution plan as the way our
 schema works tends to confuse Postgres's statistics, but the plan I looked
 at didn't show any sign of that.

 Equally it doesn't seem to be a lock contention issue.


 Is there anything I can add that might make it easier to investigate such
 as additional query options, log query timings, etc?  I'm not sure what to
 try at this point.  About the only thing I can think to do is to set up a
 local database and try to replicate the problem.  I've been meaning to do
 that but it's not a quick task and I haven't had much time to spend on it.


I've just upgraded Osmosis from the 0.35 release to the current 0.37
snapshot.  I've introduced a relatively minor change that on initial testing
appears to have fixed the problem.  I create a number of temp tables during
replication processing to hold identifiers (actually id and version) of each
of nodes, ways and relations.  I am now adding a primary key to those tables
which should help the query planner come up with a more effective query
plan.  I'm not sure why I didn't do that originally ... perhaps I just
missed it.
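
For illustration, the kind of change described above can be sketched as
follows. This is not the Osmosis code, and it uses SQLite from Python rather
than PostgreSQL purely so that it runs standalone. The table names (nodes,
tmp_nodes) and the helper names are made up for the example. It shows a temp
table of (id, version) pairs declared with a primary key and then joined
against a history table.

```python
import sqlite3


def make_db(with_primary_key):
    """Build an in-memory database with a hypothetical history table and temp table."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical stand-in for a history table holding every version of a node.
    conn.execute(
        "CREATE TABLE nodes (id INTEGER, version INTEGER, tags TEXT, "
        "PRIMARY KEY (id, version))"
    )
    conn.executemany(
        "INSERT INTO nodes VALUES (?, ?, ?)",
        [(i, v, "node %d v%d" % (i, v)) for i in range(1, 101) for v in (1, 2)],
    )
    # Temp table holding the (id, version) pairs touched in one replication run.
    key = ", PRIMARY KEY (id, version)" if with_primary_key else ""
    conn.execute("CREATE TEMP TABLE tmp_nodes (id INTEGER, version INTEGER%s)" % key)
    conn.executemany("INSERT INTO tmp_nodes VALUES (?, ?)", [(5, 2), (17, 1), (42, 2)])
    return conn


def changed_nodes(conn):
    """Join the temp table against the history table to fetch the changed rows."""
    return conn.execute(
        "SELECT n.id, n.version, n.tags FROM nodes n "
        "JOIN tmp_nodes t ON n.id = t.id AND n.version = t.version "
        "ORDER BY n.id"
    ).fetchall()


def temp_indexes(conn):
    """List indexes on the temp table; a primary key creates one implicitly."""
    rows = conn.execute(
        "SELECT name FROM sqlite_temp_master "
        "WHERE type = 'index' AND tbl_name = 'tmp_nodes'"
    ).fetchall()
    return [name for (name,) in rows]
```

Both variants return the same rows; the only difference is that the keyed
temp table carries an index the planner can use. Whether that helps in
PostgreSQL depends on the actual queries and statistics, which this sketch
does not model.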

I'm a bit surprised that it has fixed it given that the amount of data in
the temp tables is relatively small and query analysis wasn't pointing at
poor query plans, but it seems to be running *much* faster now.

The new version took effect from replication number 906 onwards, so if
anybody sees any issues please let me know.

Brett
___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] minute diff - max delay

2010-09-12 Thread Brett Henderson
On Sun, Sep 12, 2010 at 9:28 PM, Brett Henderson br...@bretth.com wrote:

 I've just upgraded Osmosis from the 0.35 release to the current 0.37
 snapshot.  I've introduced a relatively minor change that on initial testing
 appears to have fixed the problem.  I create a number of temp tables during
 replication processing to hold identifiers (actually id and version) of each
 of nodes, ways and relations.  I am now adding a primary key to those tables
 which should help the query planner come up with a more effective query
 plan.  I'm not sure why I didn't do that originally ... perhaps I just
 missed it.

 I'm a bit surprised that it has fixed it given that the amount of data in
 the temp tables is relatively small and query analysis wasn't pointing at
 poor query plans, but it seems to be running *much* faster now.

 The new version took effect from replication number 906 onwards, so if
 anybody sees any issues please let me know.


I haven't seen a single cron failure since the new version was deployed so
it's looking good so far.  Previously almost 2 out of 3 minute jobs were
failing due to the previous job not completing in time.  Even before the db
upgrade I used to get occasional failures.  Such a simple change ... if only
I'd discovered it sooner.


Re: [OSM-dev] minute diff - max delay

2010-08-14 Thread Brett Henderson
They're okay.  There were two produced at the same time, but for different
hourly periods.  You need to open the corresponding state file to know what
time each one is for.  324.state.txt contains

timestamp=2010-06-29T18\:00\:00Z

but 325.state.txt contains

timestamp=2010-06-29T19\:00\:00Z
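
As a rough sketch, the timestamp line can be read programmatically like this.
The backslash-escaped colons suggest a properties-style file format, which is
an assumption here, and the function name is hypothetical.

```python
from datetime import datetime, timezone


def read_state_timestamp(text):
    """Return the timestamp from the contents of a state file as a UTC datetime."""
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("timestamp="):
            value = line[len("timestamp="):]
            # Undo the backslash escaping of colons (assumed properties-style).
            value = value.replace("\\:", ":")
            parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
            return parsed.replace(tzinfo=timezone.utc)
    raise ValueError("no timestamp line found")
```

Applied to the two state files above, the results are one hour apart, even
though both .osc.gz files carry the same modification time in the listing.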



On Sat, Aug 14, 2010 at 7:17 PM, bernhard zwischenbrugger 
b...@datenkueche.com wrote:

  Hi

 It looks like the hour diffs also have a problem.
 They are on time but sometimes there are two files.

 example:
 324.osc.gz 29-Jun-2010 20:02 6.0M
 325.osc.gz 29-Jun-2010 20:02 46K

 Bernhard


 On 14.08.10 03:38, Brett Henderson wrote:

 On Sat, Aug 14, 2010 at 9:30 AM, Tom Hughes t...@compton.nu wrote:

  On 14/08/10 00:19, Grant Slater wrote:

 On 14 August 2010 00:10, Brett Henderson br...@bretth.com wrote:


 Is anybody aware of anything that happened on that day *other* than the
 database upgrade?  Any new imports, etc.


 The database was fully re-imported (planned and triple backed up) and
 the transaction IDs were reset due to this.
 zere was able to set the transaction id used by osmosis diff export
 because I believe you were not around or weren't available at the
 time.

 Also: PostgreSQL 8.3 -> 8.4. RAID 10 on 10 disks to RAID 10 on 16 disks.
 RAID stripe size changed from 256KB to 64KB.


  There's not really any great mystery here, we know it was the upgrade to
 postgres 8.4 (or just as likely the reimport of the db) that triggered it.


 Okay.  I didn't realise that a database upgrade had occurred, I thought it
 was only disk/RAID changes.



 We just need to get to the bottom of what is making some of the queries
 run slowly, but it's not a very easy thing to do.


 Is it only Osmosis queries that are running slowly?


 My assumption was that it was choosing a bad execution plan as the way our
 schema works tends to confuse Postgres's statistics, but the plan I looked
 at didn't show any sign of that.

 Equally it doesn't seem to be a lock contention issue.


 Is there anything I can add that might make it easier to investigate such
 as additional query options, log query timings, etc?  I'm not sure what to
 try at this point.  About the only thing I can think to do is to set up a
 local database and try to replicate the problem.  I've been meaning to do
 that but it's not a quick task and I haven't had much time to spend on it.

 Brett







Re: [OSM-dev] minute diff - max delay

2010-08-13 Thread Patrick Petschge
Hi,

 If anybody has time it would be interesting to look at the minute diffs
 over the last few months and summarise how many diffs are produced per
 hour or day and plot that on a graph.  It would show whether or not this
 problem has gradually gotten worse over a period of time, or if it occurred
 suddenly.

Looking at the attached graphs for the number of minutely replication diffs
per day, it looks like it appeared very suddenly on 2010-07-03 but
continues to get worse.
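
A count like this can be reproduced roughly as follows. This is a sketch, not
the script used for the graph. It assumes directory-listing lines of the form
"name DD-Mon-YYYY HH:MM size" (for example "324.osc.gz 29-Jun-2010 20:02
6.0M"), and the function name is made up.

```python
from collections import Counter


def diffs_per_day(listing_lines):
    """Count .osc.gz files per modification date in a directory listing."""
    counts = Counter()
    for line in listing_lines:
        fields = line.split()
        # Expect: file name, date (DD-Mon-YYYY), time (HH:MM), size.
        if len(fields) >= 3 and fields[0].endswith(".osc.gz"):
            counts[fields[1]] += 1
    return counts
```

A day on which every minute diff arrives on time would yield 1440 files
(24 x 60), so lower counts point to skipped or merged minutes.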


HTH,
Patrick Petschge Kilian

attachment: reps_per_day.png

18-Sep-2009 462
19-Sep-2009 1397
20-Sep-2009 1438
21-Sep-2009 1401
22-Sep-2009 1354
23-Sep-2009 1399
24-Sep-2009 1438
25-Sep-2009 1277
26-Sep-2009 724
27-Sep-2009 1324
28-Sep-2009 1362
29-Sep-2009 1440
30-Sep-2009 1439
01-Oct-2009 1349
02-Oct-2009 1340
03-Oct-2009 1437
04-Oct-2009 1431
05-Oct-2009 1440
06-Oct-2009 1424
07-Oct-2009 1435
08-Oct-2009 1434
09-Oct-2009 1434
10-Oct-2009 1433
11-Oct-2009 1438
12-Oct-2009 1433
13-Oct-2009 1437
14-Oct-2009 1438
15-Oct-2009 1435
16-Oct-2009 1437
17-Oct-2009 1439
18-Oct-2009 1440
19-Oct-2009 1423
20-Oct-2009 1402
21-Oct-2009 1304
22-Oct-2009 1440
23-Oct-2009 1431
24-Oct-2009 1430
25-Oct-2009 1495
26-Oct-2009 1437
27-Oct-2009 1303
28-Oct-2009 1336
29-Oct-2009 1436
30-Oct-2009 1327
31-Oct-2009 537
01-Nov-2009 833
02-Nov-2009 913
03-Nov-2009 1367
04-Nov-2009 1343
05-Nov-2009 1370
06-Nov-2009 1440
07-Nov-2009 1398
08-Nov-2009 1440
09-Nov-2009 1440
10-Nov-2009 1431
11-Nov-2009 1439
12-Nov-2009 1439
13-Nov-2009 814
14-Nov-2009 1429
15-Nov-2009 1422
16-Nov-2009 1427
17-Nov-2009 1415
18-Nov-2009 1412
19-Nov-2009 1434
20-Nov-2009 1433
21-Nov-2009 1440
22-Nov-2009 1426
23-Nov-2009 1402
24-Nov-2009 1427
25-Nov-2009 1397
26-Nov-2009 1440
27-Nov-2009 1403
28-Nov-2009 1412
29-Nov-2009 1433
30-Nov-2009 1427
01-Dec-2009 1430
02-Dec-2009 1387
03-Dec-2009 1426
04-Dec-2009 1431
05-Dec-2009 1313
06-Dec-2009 1432
07-Dec-2009 1419
08-Dec-2009 1426
09-Dec-2009 1435
10-Dec-2009 1438
11-Dec-2009 1440
12-Dec-2009 1413
13-Dec-2009 1432
14-Dec-2009 1411
15-Dec-2009 1431
16-Dec-2009 1440
17-Dec-2009 1428
18-Dec-2009 1439
19-Dec-2009 1438
20-Dec-2009 1439
21-Dec-2009 1432
22-Dec-2009 1434
23-Dec-2009 1438
24-Dec-2009 1439
25-Dec-2009 1439
26-Dec-2009 1439
27-Dec-2009 1440
28-Dec-2009 1440
29-Dec-2009 1440
30-Dec-2009 1439
31-Dec-2009 1440
01-Jan-2010 1439
02-Jan-2010 1440
03-Jan-2010 1438
04-Jan-2010 1419
05-Jan-2010 1413
06-Jan-2010 1397
07-Jan-2010 1415
08-Jan-2010 1393
09-Jan-2010 1432
10-Jan-2010 1440
11-Jan-2010 1440
12-Jan-2010 1436
13-Jan-2010 1439
14-Jan-2010 1437
15-Jan-2010 1440
16-Jan-2010 1439
17-Jan-2010 1414
18-Jan-2010 1440
19-Jan-2010 1440
20-Jan-2010 1440
21-Jan-2010 1440
22-Jan-2010 1440
23-Jan-2010 1440
24-Jan-2010 1440
25-Jan-2010 1440
26-Jan-2010 1438
27-Jan-2010 1439
28-Jan-2010 1437
29-Jan-2010 1440
30-Jan-2010 1439
31-Jan-2010 1434
01-Feb-2010 1434
02-Feb-2010 1435
03-Feb-2010 1422
04-Feb-2010 1427
05-Feb-2010 1437
06-Feb-2010 1430
07-Feb-2010 1405
08-Feb-2010 1437
09-Feb-2010 1427
10-Feb-2010 1436
11-Feb-2010 1434
12-Feb-2010 1440
13-Feb-2010 1439
14-Feb-2010 1439
15-Feb-2010 1439
16-Feb-2010 1437
17-Feb-2010 1433
18-Feb-2010 1438
19-Feb-2010 1439
20-Feb-2010 1435
21-Feb-2010 1440
22-Feb-2010 1432
23-Feb-2010 1436
24-Feb-2010 1432
25-Feb-2010 1436
26-Feb-2010 1440
27-Feb-2010 1438
28-Feb-2010 1432
01-Mar-2010 1438
02-Mar-2010 1437
03-Mar-2010 1440
04-Mar-2010 1439
05-Mar-2010 1440
06-Mar-2010 1440
07-Mar-2010 1439
08-Mar-2010 1431
09-Mar-2010 1440
10-Mar-2010 1439
11-Mar-2010 1440
12-Mar-2010 1092
13-Mar-2010 1436
14-Mar-2010 1421
15-Mar-2010 1433
16-Mar-2010 1438
17-Mar-2010 1440
18-Mar-2010 1440
19-Mar-2010 1439
20-Mar-2010 1440
21-Mar-2010 1437
22-Mar-2010 1433
23-Mar-2010 1440
24-Mar-2010 1438
25-Mar-2010 1439
26-Mar-2010 1439
27-Mar-2010 1430
28-Mar-2010 1373
29-Mar-2010 1440
30-Mar-2010 1440
31-Mar-2010 1437
01-Apr-2010 1426
02-Apr-2010 1385
03-Apr-2010 1436
04-Apr-2010 1438
05-Apr-2010 1381
06-Apr-2010 1439
07-Apr-2010 1438
08-Apr-2010 1431
09-Apr-2010 1440
10-Apr-2010 1424
11-Apr-2010 1417
12-Apr-2010 1431
13-Apr-2010 1424
14-Apr-2010 1409
15-Apr-2010 1439
16-Apr-2010 1423
17-Apr-2010 1438
18-Apr-2010 1438
19-Apr-2010 1403
20-Apr-2010 1428
21-Apr-2010 1439
22-Apr-2010 1431
23-Apr-2010 1437
24-Apr-2010 1433
25-Apr-2010 1424
26-Apr-2010 1430
27-Apr-2010 1437
28-Apr-2010 1438
29-Apr-2010 1429
30-Apr-2010 1438
01-May-2010 1434
02-May-2010 1439
03-May-2010 1419
04-May-2010 1437
05-May-2010 1407
06-May-2010 1421
07-May-2010 1424
08-May-2010 1418
09-May-2010 1429
10-May-2010 1437
11-May-2010 1420
12-May-2010 1407
13-May-2010 1406
14-May-2010 1434
15-May-2010 1423
16-May-2010 1434
17-May-2010 1431
18-May-2010 1427
19-May-2010 1291
20-May-2010 1358
21-May-2010 1368
22-May-2010 1428
23-May-2010 1427
24-May-2010 1418
25-May-2010 1436
26-May-2010 1410
27-May-2010 1436
28-May-2010 1432
29-May-2010 1415
30-May-2010 1424
31-May-2010 1432
01-Jun-2010 1426
02-Jun-2010 1400
03-Jun-2010 1408
04-Jun-2010 1433
05-Jun-2010 1431
06-Jun-2010 1426
07-Jun-2010 

Re: [OSM-dev] minute diff - max delay

2010-08-13 Thread Brett Henderson
Thanks Patrick, that's much appreciated.

Is anybody aware of anything that happened on that day *other* than the
database upgrade?  Any new imports, etc.

On Fri, Aug 13, 2010 at 4:33 PM, Patrick Petschge o...@petschge.de wrote:

 Hi,

  If anybody has time it would be interesting to look at the minute diffs
  over the last few months and summarise how many diffs are produced per
  hour or day and plot that on a graph.  It would show whether or not this
  problem has gradually gotten worse over a period of time, or if it occurred
  suddenly.

 Looking at the attached graphs for the number of minutely replication diffs
 per day, it looks like it appeared very suddenly on 2010-07-03 but
 continues to get worse.


 HTH,
 Patrick Petschge Kilian


Re: [OSM-dev] minute diff - max delay

2010-08-13 Thread Grant Slater
On 14 August 2010 00:10, Brett Henderson br...@bretth.com wrote:

 Is anybody aware of anything that happened on that day *other* than the
 database upgrade?  Any new imports, etc.


The database was fully re-imported (planned and triple backed up) and
the transaction IDs were reset due to this.
zere was able to set the transaction id used by osmosis diff export
because I believe you were not around or weren't available at the
time.

Also: PostgreSQL 8.3 -> 8.4. RAID 10 on 10 disks to RAID 10 on 16 disks.
RAID stripe size changed from 256KB to 64KB.

Regards
 Grant



Re: [OSM-dev] minute diff - max delay

2010-08-13 Thread Tom Hughes

On 14/08/10 00:19, Grant Slater wrote:

On 14 August 2010 00:10, Brett Henderson br...@bretth.com wrote:


Is anybody aware of anything that happened on that day *other* than the
database upgrade?  Any new imports, etc.



The database was fully re-imported (planned and triple backed up) and
the transaction IDs were reset due to this.
zere was able to set the transaction id used by osmosis diff export
because I believe you were not around or weren't available at the
time.

Also: PostgreSQL 8.3 -> 8.4. RAID 10 on 10 disks to RAID 10 on 16 disks.
RAID stripe size changed from 256KB to 64KB.


There's not really any great mystery here, we know it was the upgrade to 
postgres 8.4 (or just as likely the reimport of the db) that triggered it.


We just need to get to the bottom of what is making some of the queries 
run slowly, but it's not a very easy thing to do.


My assumption was that it was choosing a bad execution plan as the way 
our schema works tends to confuse Postgres's statistics, but the plan I 
looked at didn't show any sign of that.


Equally it doesn't seem to be a lock contention issue.

Tom

--
Tom Hughes (t...@compton.nu)
http://compton.nu/



Re: [OSM-dev] minute diff - max delay

2010-08-13 Thread Brett Henderson
On Sat, Aug 14, 2010 at 9:30 AM, Tom Hughes t...@compton.nu wrote:

 On 14/08/10 00:19, Grant Slater wrote:

 On 14 August 2010 00:10, Brett Henderson br...@bretth.com wrote:


 Is anybody aware of anything that happened on that day *other* than the
 database upgrade?  Any new imports, etc.


 The database was fully re-imported (planned and triple backed up) and
 the transaction IDs were reset due to this.
 zere was able to set the transaction id used by osmosis diff export
 because I believe you were not around or weren't available at the
 time.

 Also: PostgreSQL 8.3 -> 8.4. RAID 10 on 10 disks to RAID 10 on 16 disks.
 RAID stripe size changed from 256KB to 64KB.


 There's not really any great mystery here, we know it was the upgrade to
 postgres 8.4 (or just as likely the reimport of the db) that triggered it.


Okay.  I didn't realise that a database upgrade had occurred, I thought it
was only disk/RAID changes.



 We just need to get to the bottom of what is making some of the queries run
 slowly, but it's not a very easy thing to do.


Is it only Osmosis queries that are running slowly?


 My assumption was that it was choosing a bad execution plan as the way our
 schema works tends to confuse Postgres's statistics, but the plan I looked
 at didn't show any sign of that.

 Equally it doesn't seem to be a lock contention issue.


Is there anything I can add that might make it easier to investigate such as
additional query options, log query timings, etc?  I'm not sure what to try
at this point.  About the only thing I can think to do is to set up a local
database and try to replicate the problem.  I've been meaning to do that but
it's not a quick task and I haven't had much time to spend on it.

Brett


[OSM-dev] minute diff - max delay

2010-08-12 Thread bernhard zwischenbrugger

Hi all

Is there a maximum delay time the minute diffs can have?

Here for example:
http://planet.openstreetmap.org/minute-replicate/000/439/
At 11:54 there is a 15 minute delay.
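
One way to look for such delays is to compare the modification times of
consecutive files. The following is only a sketch: it assumes listing lines
of the form "name DD-Mon-YYYY HH:MM size", and the function name and the
5-minute threshold are arbitrary choices for the example.

```python
from datetime import datetime, timedelta


def find_gaps(listing_lines, threshold=timedelta(minutes=5)):
    """Return (previous_time, next_time) pairs whose spacing exceeds threshold."""
    times = []
    for line in listing_lines:
        fields = line.split()
        if len(fields) >= 3 and fields[0].endswith(".osc.gz"):
            stamp = fields[1] + " " + fields[2]
            times.append(datetime.strptime(stamp, "%d-%b-%Y %H:%M"))
    times.sort()
    return [(a, b) for a, b in zip(times, times[1:]) if b - a > threshold]
```

This measures the spacing between published files, not how old the data in
each file is. For the latter, the timestamp in a matching state file (if one
is present) would be needed.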

Regards, Bernhard



Re: [OSM-dev] minute diff - max delay

2010-08-12 Thread Norbert Hoffmann
bernhard zwischenbrugger wrote:

Is there a maximum delay time the minute diffs can have?

Here for example:
http://planet.openstreetmap.org/minute-replicate/000/439/
At 11:54 there is a 15 minute delay.

Files with data for 10 minutes or more have been quite frequent for some time
now. This causes e.g. the TRAPI servers to stop serving data because they
have no current data. http://datenkueche.com/osmlive/ is quite useless
because it replays the same data again and again.

Norbert

