Re: [Toolserver-l] performance issues
One more DB performance report/question. I am seeing some UPDATE queries
that only change one row but take much, much longer than they ought to.
Is anyone else seeing this?

For example, the following type of query is getting killed from time to
time in the p_enwp10 database on sql-s1. The query killer says it ran
for over 400 seconds before being killed. The update is on a primary
key, and I don't see any way to optimize it. At the time this runs, the
database connection is inside a transaction (AutoCommit = 0), if that
matters.

    UPDATE tmpcategories
    SET c_category = 'A-Class_Water_supply_and_sanitation_articles',
        c_ranking = '425',
        c_replacement = 'A-Class'
    WHERE c_project = 'Water_supply_and_sanitation'
      and c_rating = 'A-Class'
      and c_type = 'quality'

- Carl

mysql> show indexes from tmpcategories;
+---------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table         | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+---------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| tmpcategories |          0 | PRIMARY  |            1 | c_project   | A         |        4055 |     NULL | NULL   |      | BTREE      |         |
| tmpcategories |          0 | PRIMARY  |            2 | c_type      | A         |        8110 |     NULL | NULL   |      | BTREE      |         |
| tmpcategories |          0 | PRIMARY  |            3 | c_rating    | A         |       40553 |     NULL | NULL   |      | BTREE      |         |
+---------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

___
Toolserver-l mailing list (Toolserver-l@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/toolserver-l
Posting guidelines for this list: https://wiki.toolserver.org/view/Mailing_list_etiquette
Re: [Toolserver-l] performance issues
On Wed, Oct 26, 2011 at 9:50 AM, Ja Ga jaga_...@yahoo.com wrote:
> So, is anyone else seeing really horrible mysql performance on s1? I
> heard a mysql upgrade introduced regressions, but I'm seeing queries
> take 2x - 3x as long, sometimes worse.

Another problem that has started to appear very recently is that
sometimes s1 (or sql-s1-user at least) seems to disappear. I got the
following error from a perl script on willow in the past 8 hours:

    DBI connect('database=p_enwp10:host=sql-s1-user:mysql_read_default_file=/home/project/e/n/w/enwp10/.my.cnf','',...) failed: Unknown MySQL server host 'sql-s1-user' (0) at database_routines.pl line 627
    Couldn't connect to database: Unknown MySQL server host 'sql-s1-user' (0) at database_routines.pl line 627.

- Carl
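One client-side way to ride out an intermittent "Unknown MySQL server host" failure like the one above is to retry the connection a few times before giving up. The following is a minimal sketch in Python; the script in this thread is Perl/DBI, so this only illustrates the pattern and is not a drop-in fix. The name `connect_with_retry` and all of its parameters are hypothetical, and `connect` stands for whatever callable opens the database handle.

```python
import time


def connect_with_retry(connect, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call connect() until it succeeds or the attempts are used up.

    connect    -- zero-argument callable that returns a connection or raises
    attempts   -- total number of tries (must be at least 1)
    base_delay -- seconds to wait after the first failure; doubled each time
    sleep      -- injectable so the waiting can be skipped in tests
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return connect()
        except Exception:
            # Out of tries: let the last error propagate to the caller.
            if attempt == attempts - 1:
                raise
            # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
            sleep(base_delay * 2 ** attempt)
```

Catching `Exception` is deliberately broad to keep the sketch driver-independent; a real script would presumably narrow it to the connection error class of the database driver in use, so that programming errors are not retried.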
Re: [Toolserver-l] performance issues
On Wed, Oct 26, 2011 at 10:18 PM, Platonides platoni...@gmail.com wrote:
> Marlen Caemmerer wrote:
>> Rosemary (one of the enwiki DB hosts) seems to be running at the
>> maximum possible I/O; the disk graphs are clipping there. The MySQL
>> traffic graph shows clipping too. The strange thing is that this
>> started in the middle of September. Can anyone remember any important
>> change around that time? That is not the case with thyme (the other
>> enwiki DB host), so the question is whether we could share the load
>> better. I have also enabled query caching and will see whether it
>> helps.
>
> Maybe there's some job/tool which started at that time which loaded
> rosemary so much?

In the last three hours, I got no less than 21 immensely useless mails
from the query killer. The following query, apparently, was killed after
running for a whopping 71 minutes:

    SELECT /* SLOW_OK */ /* GLAMOROUS */
        gil_wiki,gil_page_title,gil_page_namespace,gil_to
    from globalimagelinks,page,categorylinks
    where gil_to=page_title
      and cl_to =Images_from_the_National_Archives_and_Records_Administration
      AND page_id=cl_from
      AND page_namespace=6
      AND gil_page_namespace=

These queries, and others like them, have not caused problems for the
last one (two? I can't remember) years they've been in place. Something
has changed for the far, far worse.

Magnus
Re: [Toolserver-l] performance issues
Hello,
At Saturday 29 October 2011 15:04:30 DaB. wrote:
> These queries, and others like them, have not caused problems for the
> last one (two? I can't remember) years they've been in place. Something
> has changed for the far, far worse.

Have you ever thought about the possibility that maybe there are more
users on the TS, and more people using the Toolserver, now than 2 years
ago? Or maybe it is just people like you, who let a query for a WEBTOOL
run for 71 minutes!

We are just short on hardware at the moment, and the situation will not
change for some time. So we all have to live with what we have, and if a
tool breaks because it consumes too many resources, then the dev has to
see whether he/she can optimize it, or it breaks!

Sincerely,
DaB.

--
Userpage: [[:w:de:User:DaB.]] — PGP: 2B255885
Re: [Toolserver-l] performance issues
DaB. wrote:
> Have you ever thought about the possibility that maybe there are more
> users on the TS, and more people using the Toolserver, now than 2 years
> ago? Or maybe it is just people like you, who let a query for a WEBTOOL
> run for 71 minutes!
>
> We are just short on hardware at the moment, and the situation will not
> change for some time. So we all have to live with what we have, and if a
> tool breaks because it consumes too many resources, then the dev has to
> see whether he/she can optimize it, or it breaks!
>
> Sincerely,
> DaB.

Something has changed very recently. I had occasionally received a cron
error of

    Unable to run job: got no response from JSV script /sge62/default/common/jsv.sh.

But in the last ~24 hours I have received many of them. I don't know
what may be wrong (is the target server too slow to answer the RPC? has
a flood of jobs migrated to SGE?), but I'm a bit concerned that we have
redundant servers to ensure that the crons registering with SGE are run,
yet the registration may fail for pretty much no reason. Maybe we should
start launching locally if cronsub fails.
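The "launch locally if cronsub fails" idea could be sketched roughly as follows. This is an assumption-laden illustration, not an existing Toolserver tool: the function name `submit_or_run_locally` is hypothetical, the exact cronsub invocation is not shown in this thread and is therefore supplied by the caller, and the sketch assumes that a failed submission is reported as a non-zero exit status.

```python
import subprocess


def submit_or_run_locally(submit_command, local_command, run=subprocess.call):
    """Try to submit a job; run it on the local host if submission fails.

    submit_command -- argv list for the submission (e.g. the cronsub call)
    local_command  -- argv list that runs the same job directly
    run            -- callable returning an exit status; injectable for tests
    Returns a (mode, exit_status) tuple, where mode is "submitted" or "local".
    """
    status = run(submit_command)
    if status == 0:
        return ("submitted", status)
    # Submission failed (non-zero exit), so fall back to a local run.
    return ("local", run(local_command))
```

The trade-off is that a local run bypasses the scheduler's load balancing, and if a submission actually went through but its response was lost, the job could run twice. A fallback like this is only safe for jobs that tolerate both.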
Re: [Toolserver-l] performance issues
Hello,
At Saturday 29 October 2011 16:24:30 DaB. wrote:
> Something has changed very recently. I had occasionally received a cron
> error of
>
>     Unable to run job: got no response from JSV script /sge62/default/common/jsv.sh.

We are trying to find the reason for this at the moment.

Sincerely,
DaB.

--
Userpage: [[:w:de:User:DaB.]] — PGP: 2B255885
Re: [Toolserver-l] performance issues
Hi,

I noticed unusually high replag on s1 in the past week, which is
probably related. But I didn't hear about any recent upgrades. They are
usually announced, because they cause some downtime.

Petr Onderka
[[en:User:Svick]]
Re: [Toolserver-l] performance issues
On Wed, Oct 26, 2011 at 6:29 PM, Carl (CBM) cbm.wikipe...@gmail.com wrote:
> the following query of mine was killed after 1700 seconds:
>
>     SELECT ns_id, ns_name
>     FROM toolserver.namespacename
>     where (ns_type = 'canonical' or ns_type = 'primary')
>       and dbname = 'enwiki_p'
>
> That should be an extremely fast query; there is an index on dbname.

Just a little sidenote, you may be interested in ns_is_favorite. There
is one entry per namespace per dbname where `ns_is_favorite = 1`, which
is also the one used by the wiki when creating/redirecting native links.

- Krinkle
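Following the ns_is_favorite hint, the namespace lookup could be written to return exactly one name per namespace. The sketch below is in Python and runs against an in-memory SQLite stand-in, not the real Toolserver MySQL server: the sample rows are made up for illustration, the helper names are hypothetical, and only the column names (ns_id, ns_name, ns_type, dbname, ns_is_favorite) come from this thread.

```python
import sqlite3

# One row per namespace for the given wiki: the "favorite" name.
# "?" is sqlite3's placeholder; MySQL drivers typically use "%s" instead.
FAVORITE_NAMESPACES_SQL = """
    SELECT ns_id, ns_name
    FROM toolserver.namespacename
    WHERE dbname = ? AND ns_is_favorite = 1
    ORDER BY ns_id
"""


def make_standin_db():
    """Build an in-memory stand-in with made-up rows (not real Toolserver data)."""
    conn = sqlite3.connect(":memory:")
    # Attach a second in-memory database so the "toolserver." prefix resolves.
    conn.execute("ATTACH DATABASE ':memory:' AS toolserver")
    conn.execute(
        "CREATE TABLE toolserver.namespacename ("
        "dbname TEXT, ns_id INTEGER, ns_name TEXT, "
        "ns_type TEXT, ns_is_favorite INTEGER)"
    )
    conn.executemany(
        "INSERT INTO toolserver.namespacename VALUES (?, ?, ?, ?, ?)",
        [
            ("enwiki_p", 1, "Talk", "primary", 1),
            ("enwiki_p", 4, "Wikipedia", "primary", 1),
            ("enwiki_p", 4, "Project", "canonical", 0),
            ("dewiki_p", 4, "Wikipedia", "primary", 1),
        ],
    )
    return conn


def favorite_namespaces(conn, dbname):
    """Return (ns_id, ns_name) pairs, one per namespace, for one wiki."""
    return conn.execute(FAVORITE_NAMESPACES_SQL, (dbname,)).fetchall()
```

Compared with filtering on `ns_type`, this form cannot return two names for the same namespace, provided the one-favorite-per-namespace property described above holds.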
Re: [Toolserver-l] performance issues
Hello,

I am not sure if I can already be helpful since I am new to the
environment, but I have some strange findings in the monitoring graphs.

First, I'd like to know which MySQL host you are querying.

Then, this is the graph:
http://munin.toolserver.org/Database/rosemary/mysql_bytes.html

Rosemary (one of the enwiki DB hosts) seems to be running at the maximum
possible I/O; the disk graphs are clipping there. The MySQL traffic
graph shows clipping too. The strange thing is that this started in the
middle of September. Can anyone remember any important change around
that time?

That is not the case with thyme (the other enwiki DB host), so the
question is whether we could share the load better. I have also enabled
query caching and will see whether it helps.

Another question would be: which queries are especially slow? Do you
have particular trouble with joins?

Cheers
nosy

--
Marlen Caemmerer
monoro
Richard-Sorge-Str. 82, 10249 Berlin
Tel: 0179/733 90 72
USt-ID: DE 252684276
Re: [Toolserver-l] performance issues
Marlen Caemmerer wrote:
> Rosemary (one of the enwiki DB hosts) seems to be running at the
> maximum possible I/O; the disk graphs are clipping there. The MySQL
> traffic graph shows clipping too. The strange thing is that this
> started in the middle of September. Can anyone remember any important
> change around that time? That is not the case with thyme (the other
> enwiki DB host), so the question is whether we could share the load
> better. I have also enabled query caching and will see whether it
> helps.

Maybe there's some job/tool which started at that time which loaded
rosemary so much?