Re: [OSM-dev] minute diff - max delay
On Sat, Aug 14, 2010 at 11:38 AM, Brett Henderson br...@bretth.com wrote:
> On Sat, Aug 14, 2010 at 9:30 AM, Tom Hughes t...@compton.nu wrote:
>> On 14/08/10 00:19, Grant Slater wrote:
>>> On 14 August 2010 00:10, Brett Henderson br...@bretth.com wrote:
>>>> Is anybody aware of anything that happened on that day *other* than the database upgrade? Any new imports, etc.
>>>
>>> The database was fully re-imported (planned and triple backed up) and the transaction IDs were reset due to this. zere was able to set the transaction id used by osmosis diff export because I believe you were not around or weren't available at the time.
>>>
>>> Also: PostgreSQL 8.3 to 8.4. RAID 10 on 10 disks to RAID 10 on 16 disks. RAID stripe size changed from 256KB to 64KB.
>>
>> There's not really any great mystery here, we know it was the upgrade to postgres 8.4 (or just as likely the reimport of the db) that triggered it.
>
> Okay. I didn't realise that a database upgrade had occurred, I thought it was only disk/RAID changes.
>
>> We just need to get to the bottom of what is making some of the queries run slowly, but it's not a very easy thing to do.
>
> Is it only Osmosis queries that are running slowly?
>
>> My assumption was that it was choosing a bad execution plan as the way our schema works tends to confuse Postgres's statistics, but the plan I looked at didn't show any sign of that. Equally it doesn't seem to be a lock contention issue.
>
> Is there anything I can add that might make it easier to investigate such as additional query options, log query timings, etc? I'm not sure what to try at this point. About the only thing I can think to do is to set up a local database and try to replicate the problem. I've been meaning to do that but it's not a quick task and I haven't had much time to spend on it.

I've just upgraded Osmosis from the 0.35 release to the current 0.37 snapshot. I've introduced a relatively minor change that on initial testing appears to have fixed the problem.

I create a number of temp tables during replication processing to hold identifiers (actually id and version) of each of nodes, ways and relations. I am now adding a primary key to those tables, which should help the query planner come up with a more effective query plan. I'm not sure why I didn't do that originally ... perhaps I just missed it. I'm a bit surprised that it has fixed it, given that the amount of data in the temp tables is relatively small and query analysis wasn't pointing at poor query plans, but it seems to be running *much* faster now.

The new version took effect from replication number 906 onwards, so if anybody sees any issues please let me know.

Brett

___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev
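
For readers who want to see the effect of a primary key on a small temp table, here is a minimal sketch. It is not the Osmosis code and it does not use PostgreSQL: it uses SQLite through Python's sqlite3 module purely as a stand-in, so the planner behaviour is only an analogy. The table name tmp_nodes and the lookup query are assumptions; the thread only says the tables hold id and version.

```python
import sqlite3


def lookup_plan(with_primary_key: bool) -> str:
    """Return SQLite's query plan for an (id, version) lookup on a temp table."""
    conn = sqlite3.connect(":memory:")
    # tmp_nodes is a hypothetical name chosen for this illustration.
    key = ", PRIMARY KEY (id, version)" if with_primary_key else ""
    conn.execute(
        "CREATE TEMP TABLE tmp_nodes "
        f"(id INTEGER NOT NULL, version INTEGER NOT NULL{key})"
    )
    conn.executemany(
        "INSERT INTO tmp_nodes VALUES (?, ?)", [(i, 1) for i in range(1000)]
    )
    rows = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT id, version FROM tmp_nodes WHERE id = ? AND version = ?",
        (42, 1),
    ).fetchall()
    conn.close()
    # The last column of each plan row is the human-readable description.
    return "; ".join(row[-1] for row in rows)


if __name__ == "__main__":
    print("without key:", lookup_plan(False))
    print("with key:   ", lookup_plan(True))
```

With the key in place SQLite reports an index search instead of a full table scan. Whether PostgreSQL 8.4 changed its plan for the same reason is not established in this thread.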
Re: [OSM-dev] minute diff - max delay
On Sun, Sep 12, 2010 at 9:28 PM, Brett Henderson br...@bretth.com wrote:
> I've just upgraded Osmosis from the 0.35 release to the current 0.37 snapshot. I've introduced a relatively minor change that on initial testing appears to have fixed the problem.
>
> I create a number of temp tables during replication processing to hold identifiers (actually id and version) of each of nodes, ways and relations. I am now adding a primary key to those tables, which should help the query planner come up with a more effective query plan. I'm not sure why I didn't do that originally ... perhaps I just missed it. I'm a bit surprised that it has fixed it, given that the amount of data in the temp tables is relatively small and query analysis wasn't pointing at poor query plans, but it seems to be running *much* faster now.
>
> The new version took effect from replication number 906 onwards, so if anybody sees any issues please let me know.

I haven't seen a single cron failure since the new version was deployed, so it's looking good so far. Previously almost 2 out of 3 minute jobs were failing because the previous job had not completed in time. Even before the db upgrade I used to get occasional failures.

Such a simple change ... if only I'd discovered it sooner.
Re: [OSM-dev] minute diff - max delay
They're okay. There were two produced at the same time, but for different hourly periods. You need to open the corresponding state file to know what time each one is for.

324.state.txt contains
timestamp=2010-06-29T18\:00\:00Z
but 325.state.txt contains
timestamp=2010-06-29T19\:00\:00Z

On Sat, Aug 14, 2010 at 7:17 PM, bernhard zwischenbrugger b...@datenkueche.com wrote:
> Hi
>
> It looks like the hour diffs also have a problem. They are on time but sometimes there are two files.
>
> example:
> 324.osc.gz 29-Jun-2010 20:02 6.0M
> 325.osc.gz 29-Jun-2010 20:02 46K
>
> Bernhard
>
> On 14.08.10 03:38, Brett Henderson wrote:
>> On Sat, Aug 14, 2010 at 9:30 AM, Tom Hughes t...@compton.nu wrote:
>>> On 14/08/10 00:19, Grant Slater wrote:
>>>> On 14 August 2010 00:10, Brett Henderson br...@bretth.com wrote:
>>>>> Is anybody aware of anything that happened on that day *other* than the database upgrade? Any new imports, etc.
>>>>
>>>> The database was fully re-imported (planned and triple backed up) and the transaction IDs were reset due to this. zere was able to set the transaction id used by osmosis diff export because I believe you were not around or weren't available at the time.
>>>>
>>>> Also: PostgreSQL 8.3 to 8.4. RAID 10 on 10 disks to RAID 10 on 16 disks. RAID stripe size changed from 256KB to 64KB.
>>>
>>> There's not really any great mystery here, we know it was the upgrade to postgres 8.4 (or just as likely the reimport of the db) that triggered it.
>>
>> Okay. I didn't realise that a database upgrade had occurred, I thought it was only disk/RAID changes.
>>
>>> We just need to get to the bottom of what is making some of the queries run slowly, but it's not a very easy thing to do.
>>
>> Is it only Osmosis queries that are running slowly?
>>
>>> My assumption was that it was choosing a bad execution plan as the way our schema works tends to confuse Postgres's statistics, but the plan I looked at didn't show any sign of that. Equally it doesn't seem to be a lock contention issue.
>>
>> Is there anything I can add that might make it easier to investigate such as additional query options, log query timings, etc? I'm not sure what to try at this point. About the only thing I can think to do is to set up a local database and try to replicate the problem. I've been meaning to do that but it's not a quick task and I haven't had much time to spend on it.
>>
>> Brett
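
The state-file check described at the top of this message can be sketched in a few lines of Python. This is only an illustration: it assumes the state file holds key=value lines with colons escaped as "\:" (as in the two timestamps quoted above), and the function name parse_state_timestamp is made up for the example.

```python
from datetime import datetime, timezone


def parse_state_timestamp(state_text: str) -> datetime:
    """Extract the UTC timestamp from the text of a state file."""
    for line in state_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "timestamp":
            # Colons appear escaped as "\:" in the quoted files; undo that.
            cleaned = value.strip().replace("\\:", ":")
            parsed = datetime.strptime(cleaned, "%Y-%m-%dT%H:%M:%SZ")
            return parsed.replace(tzinfo=timezone.utc)
    raise ValueError("no timestamp entry found")


# The two values quoted in the message above.
state_324 = "timestamp=2010-06-29T18\\:00\\:00Z"
state_325 = "timestamp=2010-06-29T19\\:00\\:00Z"

gap = parse_state_timestamp(state_325) - parse_state_timestamp(state_324)
print(gap)  # 1:00:00
```

The one-hour gap confirms that the two files cover different hourly periods even though they share the same modification time in the directory listing.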
Re: [OSM-dev] minute diff - max delay
Hi,

> If anybody has time it would be interesting to look at the minute diffs over the last few months and summarise how many diffs are produced per hour or day and plot that on a graph. It would show whether or not this problem has gradually gotten worse over a period of time, or if it occurred suddenly.

Looking at the attached graphs for the number of minutely replication diffs per day, it looks like it appeared very suddenly on 2010-07-03 but continues to get worse.

HTH,
Patrick Petschge Kilian

attachment: reps_per_day.png

18-Sep-2009 462 19-Sep-2009 1397 20-Sep-2009 1438 21-Sep-2009 1401 22-Sep-2009 1354 23-Sep-2009 1399 24-Sep-2009 1438 25-Sep-2009 1277 26-Sep-2009 724 27-Sep-2009 1324 28-Sep-2009 1362 29-Sep-2009 1440 30-Sep-2009 1439 01-Oct-2009 1349 02-Oct-2009 1340 03-Oct-2009 1437 04-Oct-2009 1431 05-Oct-2009 1440 06-Oct-2009 1424 07-Oct-2009 1435 08-Oct-2009 1434 09-Oct-2009 1434 10-Oct-2009 1433 11-Oct-2009 1438 12-Oct-2009 1433 13-Oct-2009 1437 14-Oct-2009 1438 15-Oct-2009 1435 16-Oct-2009 1437 17-Oct-2009 1439 18-Oct-2009 1440 19-Oct-2009 1423 20-Oct-2009 1402 21-Oct-2009 1304 22-Oct-2009 1440 23-Oct-2009 1431 24-Oct-2009 1430 25-Oct-2009 1495 26-Oct-2009 1437 27-Oct-2009 1303 28-Oct-2009 1336 29-Oct-2009 1436 30-Oct-2009 1327 31-Oct-2009 537 01-Nov-2009 833 02-Nov-2009 913 03-Nov-2009 1367 04-Nov-2009 1343 05-Nov-2009 1370 06-Nov-2009 1440 07-Nov-2009 1398 08-Nov-2009 1440 09-Nov-2009 1440 10-Nov-2009 1431 11-Nov-2009 1439 12-Nov-2009 1439 13-Nov-2009 814 14-Nov-2009 1429 15-Nov-2009 1422 16-Nov-2009 1427 17-Nov-2009 1415 18-Nov-2009 1412 19-Nov-2009 1434 20-Nov-2009 1433 21-Nov-2009 1440 22-Nov-2009 1426 23-Nov-2009 1402 24-Nov-2009 1427 25-Nov-2009 1397 26-Nov-2009 1440 27-Nov-2009 1403 28-Nov-2009 1412 29-Nov-2009 1433 30-Nov-2009 1427 01-Dec-2009 1430 02-Dec-2009 1387 03-Dec-2009 1426 04-Dec-2009 1431 05-Dec-2009 1313 06-Dec-2009 1432 07-Dec-2009 1419 08-Dec-2009 1426 09-Dec-2009 1435 10-Dec-2009 1438 11-Dec-2009 1440 12-Dec-2009 1413 13-Dec-2009 1432
14-Dec-2009 1411 15-Dec-2009 1431 16-Dec-2009 1440 17-Dec-2009 1428 18-Dec-2009 1439 19-Dec-2009 1438 20-Dec-2009 1439 21-Dec-2009 1432 22-Dec-2009 1434 23-Dec-2009 1438 24-Dec-2009 1439 25-Dec-2009 1439 26-Dec-2009 1439 27-Dec-2009 1440 28-Dec-2009 1440 29-Dec-2009 1440 30-Dec-2009 1439 31-Dec-2009 1440 01-Jan-2010 1439 02-Jan-2010 1440 03-Jan-2010 1438 04-Jan-2010 1419 05-Jan-2010 1413 06-Jan-2010 1397 07-Jan-2010 1415 08-Jan-2010 1393 09-Jan-2010 1432 10-Jan-2010 1440 11-Jan-2010 1440 12-Jan-2010 1436 13-Jan-2010 1439 14-Jan-2010 1437 15-Jan-2010 1440 16-Jan-2010 1439 17-Jan-2010 1414 18-Jan-2010 1440 19-Jan-2010 1440 20-Jan-2010 1440 21-Jan-2010 1440 22-Jan-2010 1440 23-Jan-2010 1440 24-Jan-2010 1440 25-Jan-2010 1440 26-Jan-2010 1438 27-Jan-2010 1439 28-Jan-2010 1437 29-Jan-2010 1440 30-Jan-2010 1439 31-Jan-2010 1434 01-Feb-2010 1434 02-Feb-2010 1435 03-Feb-2010 1422 04-Feb-2010 1427 05-Feb-2010 1437 06-Feb-2010 1430 07-Feb-2010 1405 08-Feb-2010 1437 09-Feb-2010 1427 10-Feb-2010 1436 11-Feb-2010 1434 12-Feb-2010 1440 13-Feb-2010 1439 14-Feb-2010 1439 15-Feb-2010 1439 16-Feb-2010 1437 17-Feb-2010 1433 18-Feb-2010 1438 19-Feb-2010 1439 20-Feb-2010 1435 21-Feb-2010 1440 22-Feb-2010 1432 23-Feb-2010 1436 24-Feb-2010 1432 25-Feb-2010 1436 26-Feb-2010 1440 27-Feb-2010 1438 28-Feb-2010 1432 01-Mar-2010 1438 02-Mar-2010 1437 03-Mar-2010 1440 04-Mar-2010 1439 05-Mar-2010 1440 06-Mar-2010 1440 07-Mar-2010 1439 08-Mar-2010 1431 09-Mar-2010 1440 10-Mar-2010 1439 11-Mar-2010 1440 12-Mar-2010 1092 13-Mar-2010 1436 14-Mar-2010 1421 15-Mar-2010 1433 16-Mar-2010 1438 17-Mar-2010 1440 18-Mar-2010 1440 19-Mar-2010 1439 20-Mar-2010 1440 21-Mar-2010 1437 22-Mar-2010 1433 23-Mar-2010 1440 24-Mar-2010 1438 25-Mar-2010 1439 26-Mar-2010 1439 27-Mar-2010 1430 28-Mar-2010 1373 29-Mar-2010 1440 30-Mar-2010 1440 31-Mar-2010 1437 01-Apr-2010 1426 02-Apr-2010 1385 03-Apr-2010 1436 04-Apr-2010 1438 05-Apr-2010 1381 06-Apr-2010 1439 07-Apr-2010 1438 08-Apr-2010 1431 09-Apr-2010 1440 
10-Apr-2010 1424 11-Apr-2010 1417 12-Apr-2010 1431 13-Apr-2010 1424 14-Apr-2010 1409 15-Apr-2010 1439 16-Apr-2010 1423 17-Apr-2010 1438 18-Apr-2010 1438 19-Apr-2010 1403 20-Apr-2010 1428 21-Apr-2010 1439 22-Apr-2010 1431 23-Apr-2010 1437 24-Apr-2010 1433 25-Apr-2010 1424 26-Apr-2010 1430 27-Apr-2010 1437 28-Apr-2010 1438 29-Apr-2010 1429 30-Apr-2010 1438 01-May-2010 1434 02-May-2010 1439 03-May-2010 1419 04-May-2010 1437 05-May-2010 1407 06-May-2010 1421 07-May-2010 1424 08-May-2010 1418 09-May-2010 1429 10-May-2010 1437 11-May-2010 1420 12-May-2010 1407 13-May-2010 1406 14-May-2010 1434 15-May-2010 1423 16-May-2010 1434 17-May-2010 1431 18-May-2010 1427 19-May-2010 1291 20-May-2010 1358 21-May-2010 1368 22-May-2010 1428 23-May-2010 1427 24-May-2010 1418 25-May-2010 1436 26-May-2010 1410 27-May-2010 1436 28-May-2010 1432 29-May-2010 1415 30-May-2010 1424 31-May-2010 1432 01-Jun-2010 1426 02-Jun-2010 1400 03-Jun-2010 1408 04-Jun-2010 1433 05-Jun-2010 1431 06-Jun-2010 1426 07-Jun-2010
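
A per-day summary like the one above could be produced with something along these lines. This is a sketch, not the script Patrick actually used: it assumes directory-listing lines of the form "name DD-Mon-YYYY HH:MM size" (the shape of the listing lines quoted elsewhere in this thread), and the function names are invented for the example. Note that %b month parsing depends on an English locale.

```python
from collections import Counter
from datetime import datetime

EXPECTED_PER_DAY = 24 * 60  # one minute diff per minute of the day


def diffs_per_day(listing_lines):
    """Count .osc.gz entries per calendar day in directory-listing lines."""
    counts = Counter()
    for line in listing_lines:
        fields = line.split()
        # Assumed shape: file name, date (DD-Mon-YYYY), time, size.
        if len(fields) < 2 or not fields[0].endswith(".osc.gz"):
            continue
        day = datetime.strptime(fields[1], "%d-%b-%Y").date()
        counts[day] += 1
    return counts


def shortfall(counts):
    """Return the days with fewer diffs than expected, and how many are missing."""
    return {
        day: EXPECTED_PER_DAY - n
        for day, n in sorted(counts.items())
        if n < EXPECTED_PER_DAY
    }


# Two sample lines in the assumed listing format.
sample = [
    "324.osc.gz 29-Jun-2010 20:02 6.0M",
    "325.osc.gz 29-Jun-2010 20:02 46K",
]
counts = diffs_per_day(sample)
print(counts)
print(shortfall(counts))
```

A day with a count close to 1440 is healthy; a sudden and persistent drop would show up as a growing shortfall.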
Re: [OSM-dev] minute diff - max delay
Thanks Patrick, that's much appreciated.

Is anybody aware of anything that happened on that day *other* than the database upgrade? Any new imports, etc.

On Fri, Aug 13, 2010 at 4:33 PM, Patrick Petschge o...@petschge.de wrote:
> Hi,
>
>> If anybody has time it would be interesting to look at the minute diffs over the last few months and summarise how many diffs are produced per hour or day and plot that on a graph. It would show whether or not this problem has gradually gotten worse over a period of time, or if it occurred suddenly.
>
> Looking at the attached graphs for the number of minutely replication diffs per day, it looks like it appeared very suddenly on 2010-07-03 but continues to get worse.
>
> HTH,
> Patrick Petschge Kilian
Re: [OSM-dev] minute diff - max delay
On 14 August 2010 00:10, Brett Henderson br...@bretth.com wrote:
> Is anybody aware of anything that happened on that day *other* than the database upgrade? Any new imports, etc.

The database was fully re-imported (planned and triple backed up) and the transaction IDs were reset due to this. zere was able to set the transaction id used by osmosis diff export because I believe you were not around or weren't available at the time.

Also:
PostgreSQL 8.3 to 8.4.
RAID 10 on 10 disks to RAID 10 on 16 disks.
RAID stripe size changed from 256KB to 64KB.

Regards
Grant
Re: [OSM-dev] minute diff - max delay
On 14/08/10 00:19, Grant Slater wrote:
> On 14 August 2010 00:10, Brett Henderson br...@bretth.com wrote:
>> Is anybody aware of anything that happened on that day *other* than the database upgrade? Any new imports, etc.
>
> The database was fully re-imported (planned and triple backed up) and the transaction IDs were reset due to this. zere was able to set the transaction id used by osmosis diff export because I believe you were not around or weren't available at the time.
>
> Also: PostgreSQL 8.3 to 8.4. RAID 10 on 10 disks to RAID 10 on 16 disks. RAID stripe size changed from 256KB to 64KB.

There's not really any great mystery here, we know it was the upgrade to postgres 8.4 (or just as likely the reimport of the db) that triggered it.

We just need to get to the bottom of what is making some of the queries run slowly, but it's not a very easy thing to do.

My assumption was that it was choosing a bad execution plan as the way our schema works tends to confuse Postgres's statistics, but the plan I looked at didn't show any sign of that. Equally it doesn't seem to be a lock contention issue.

Tom

--
Tom Hughes (t...@compton.nu)
http://compton.nu/
Re: [OSM-dev] minute diff - max delay
On Sat, Aug 14, 2010 at 9:30 AM, Tom Hughes t...@compton.nu wrote:
> On 14/08/10 00:19, Grant Slater wrote:
>> On 14 August 2010 00:10, Brett Henderson br...@bretth.com wrote:
>>> Is anybody aware of anything that happened on that day *other* than the database upgrade? Any new imports, etc.
>>
>> The database was fully re-imported (planned and triple backed up) and the transaction IDs were reset due to this. zere was able to set the transaction id used by osmosis diff export because I believe you were not around or weren't available at the time.
>>
>> Also: PostgreSQL 8.3 to 8.4. RAID 10 on 10 disks to RAID 10 on 16 disks. RAID stripe size changed from 256KB to 64KB.
>
> There's not really any great mystery here, we know it was the upgrade to postgres 8.4 (or just as likely the reimport of the db) that triggered it.

Okay. I didn't realise that a database upgrade had occurred, I thought it was only disk/RAID changes.

> We just need to get to the bottom of what is making some of the queries run slowly, but it's not a very easy thing to do.

Is it only Osmosis queries that are running slowly?

> My assumption was that it was choosing a bad execution plan as the way our schema works tends to confuse Postgres's statistics, but the plan I looked at didn't show any sign of that. Equally it doesn't seem to be a lock contention issue.

Is there anything I can add that might make it easier to investigate such as additional query options, log query timings, etc? I'm not sure what to try at this point. About the only thing I can think to do is to set up a local database and try to replicate the problem. I've been meaning to do that but it's not a quick task and I haven't had much time to spend on it.

Brett
[OSM-dev] minute diff - max delay
Hi all

Is there a maximum delay time the minute diffs can have?

Here for example: http://planet.openstreetmap.org/minute-replicate/000/439/
At 11:54 there is a 15 minute delay.

Regards,
Bernhard
Re: [OSM-dev] minute diff - max delay
bernhard zwischenbrugger wrote:
> Is there a maximum delay time the minute diffs can have?
>
> Here for example: http://planet.openstreetmap.org/minute-replicate/000/439/
> At 11:54 there is a 15 minute delay.

Files with data for 10 minutes or more have been quite frequent for some time now. This causes, for example, the TRAPI servers to stop serving data because they have no current data. http://datenkueche.com/osmlive/ is quite useless because it replays the same data again and again.

Norbert
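
One way a consumer could spot the long gaps described above is to compare the timestamps of consecutive diffs. The sketch below is only an illustration: the function name long_spans is invented, the 10-minute threshold is taken from Norbert's remark, and the sample timestamps are made up (only the 11:54 time and the 15 minute delay come from the thread).

```python
from datetime import datetime, timedelta


def long_spans(timestamps, threshold=timedelta(minutes=10)):
    """Return (start, end) pairs of consecutive diffs at least `threshold` apart."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a >= threshold]


# Hypothetical sample: the date is invented for illustration.
stamps = [
    datetime(2010, 8, 13, 11, 38),
    datetime(2010, 8, 13, 11, 39),
    datetime(2010, 8, 13, 11, 54),  # 15 minutes after the previous diff
]

for start, end in long_spans(stamps):
    print(f"gap of {end - start} between {start:%H:%M} and {end:%H:%M}")
```

In practice the timestamps would come from each diff's state file rather than from the file modification times.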