[MTT users] mtt reports arrive without subject.
Hi,

I'm running MTT trunk (r1126) and I'm getting mail reports without any subject.

In the ini file I have:

[Reporter: send email]
module = Email
email_to = pa...@mellanox.co.il
email_subject = MPI test results: $phase / $sectiona

In the mail body I got:

Subject:
start_timestamp: Thu Jan 10 12:25:56 2008

Thanks.

--
Pavel Shamis (Pasha)
Mellanox Technologies
Re: [MTT users] hostlist enhancement
Mellanox will try. Thanks!

Jeff Squyres wrote:
> Mellanox told me that the MTT &hostlist() funclet is returning a comma-delimited list of hosts (and &hostlist_hosts()). That is fine for Open MPI, but it is not for MVAPICH; MVAPICH requires a space-delimited list of hosts for their mpirun.
>
> Here's a patch that introduces an optional parameter to &hostlist() and &hostlist_hosts(). The optional parameter is a delimiter for the hostlist. So if you call:
>
>   &hostlist_hosts()
>
> you'll get the same comma-delimited list that is returned today. But if you call
>
>   &hostlist_hosts(" ")
>
> you should get a space-delimited list.
>
> Can Mellanox try this patch and see if it works for them? If so, I'll commit it to the MTT trunk.
>
> ___
> mtt-users mailing list
> mtt-us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users

--
Pavel Shamis (Pasha)
Mellanox Technologies
Re: [MTT users] hostlist enhancement
Here is the reply from Oleg:

The patch didn't work. The broken code is in sub hostlist_hosts in lib/MTT/Values/Functions.pm:

  my $ret = join(/$delimiter/, @hosts);

It must be:

  $ret = join($delimiter, @hosts);

Ethan Mallove wrote:
> Works for me.
>
> -Ethan
>
> On Thu, Jan/10/2008 08:48:20AM, Jeff Squyres wrote:
> > Mellanox told me that the MTT &hostlist() funclet is returning a comma-delimited list of hosts (and &hostlist_hosts()). That is fine for Open MPI, but it is not for MVAPICH; MVAPICH requires a space-delimited list of hosts for their mpirun.
> >
> > Here's a patch that introduces an optional parameter to &hostlist() and &hostlist_hosts(). The optional parameter is a delimiter for the hostlist. So if you call:
> >
> >   &hostlist_hosts()
> >
> > you'll get the same comma-delimited list that is returned today. But if you call
> >
> >   &hostlist_hosts(" ")
> >
> > you should get a space-delimited list.
> >
> > Can Mellanox try this patch and see if it works for them? If so, I'll commit it to the MTT trunk.
> >
> > --
> > Jeff Squyres
> > Cisco Systems

--
Pavel Shamis (Pasha)
Mellanox Technologies
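The delimiter behaviour discussed in this thread can be sketched outside MTT. What follows is a minimal Python illustration, not MTT's Perl code: the function name mirrors the funclet, but the signature and the sample host names are assumptions made for the example. Oleg's fix matters because Perl's join takes a plain string as its first argument; written as /$delimiter/, that argument is evaluated as a pattern match, so the match result rather than the delimiter ends up between the hosts.

```python
def hostlist_hosts(hosts, delimiter=","):
    """Join host names with an optional delimiter (comma by default)."""
    return delimiter.join(hosts)


hosts = ["node1", "node2", "node3"]  # hypothetical host names

comma_list = hostlist_hosts(hosts)       # default: comma-delimited
space_list = hostlist_hosts(hosts, " ")  # space-delimited, as MVAPICH's mpirun needs

print(comma_list)
print(space_list)
```

Keeping the comma as the default means existing callers that pass no argument see no change, which is the compatibility property Jeff describes.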
Re: [MTT users] mtt reports arrive without subject.
Ethan Mallove wrote:
> Hi Pasha,
>
> I confess that I don't use the Email reporter. I think the email_subject is out-of-date. Try this:
>
>   email_subject = MPI test results: &current_section()
>
> I'm unsure about the empty email body, but have an idea. What does this output for you?
>
>   $ which Mail mailx mail rmail mutt elm

which Mail mailx mail rmail mutt elm
/usr/bin/Mail
/usr/bin/mailx
/usr/bin/mail
/usr/bin/mutt

> I'm guessing the problem is that your mail agent has a problem with doing this:
>
>   $ cat foo.txt | -s "MTT Mail Test" pa...@mellanox.co.il
>   $ cat foo.txt
>   Subject:
>   start_timestamp: blah blah

The command:

  cat foo.txt | /usr/bin/Mail -s "MTT Mail Test" pa...@mellanox.co.il

worked nicely for me; I got the mail with the subject.

Pasha
Re: [MTT users] mtt reports arrive without subject.
> I might've misread your last email. Did the new email_subject INI parameter from above solve your issue? I'd have to see the --debug output to know why the Subject was blank.

Sorry, I forgot to test "email_subject = MPI test results: &current_section()". I just tested the command-line mailer, and it works well.

> > ...
> > email_subject = MPI test results: $phase / $sectiona
> >
> > In mail body I got:
> >
> > Subject:
> > start_timestamp: Thu Jan 10 12:25:56 2008
>
> Was the body of the email okay, or was it really blank following the start_timestamp line?

The email body was OK; only the subject was empty.

> Thanks,
> Ethan

Pasha

--
Pavel Shamis (Pasha)
Mellanox Technologies
Re: [MTT users] mtt reports arrive without subject.
I found the problem: it was a typo in the name of a variable. I had something like:

  email_subject: MPI regression $broken_name

After fixing the name I started to get reports!

Thanks.

Pasha
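The root cause reported here, a misspelled variable name that leaves the subject empty, is easy to reproduce in miniature. The sketch below is a Python illustration only and does not reflect how MTT actually expands INI values; expand, expand_strict, and the variable table are hypothetical. It contrasts an expansion that silently drops unknown names with one that reports them.

```python
import re

VAR = re.compile(r"\$(\w+)")


def expand(template, variables):
    """Lenient expansion: unknown $names silently become empty strings."""
    return VAR.sub(lambda m: str(variables.get(m.group(1), "")), template)


def expand_strict(template, variables):
    """Strict expansion: raise KeyError naming the unknown variable."""
    def lookup(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"unknown variable ${name}")
        return str(variables[name])
    return VAR.sub(lookup, template)


variables = {"phase": "Test run"}  # hypothetical values

lenient = expand("MPI regression $broken_name", variables)
print(repr(lenient))  # the misspelled variable has vanished without any warning
```

A strict mode of this kind would have pointed at the misspelled name directly instead of producing a report with a blank Subject line.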
[MTT users] MTT server side problem
About the issue:

1. On the client side I see "*** WARNING: MTTDatabase client did not get a serial". As a result of the error, some of the MTT results are not visible via the web reporter.

2. On the server side I found the following error messages:

[client 10.4.3.214] PHP Fatal error: Allowed memory size of 33554432 bytes exhausted (tried to allocate 23592960 bytes) in /.autodirect/swgwork/MTT/mtt/submit/index.php(79) : eval()'d code on line 77515
[Mon May 05 19:26:05 2008] [notice] caught SIGTERM, shutting down
[Mon May 05 19:30:54 2008] [notice] suEXEC mechanism enabled (wrapper: /usr/sbin/suexec)
[Mon May 05 19:30:54 2008] [notice] Digest: generating secret for digest authentication ...
[Mon May 05 19:30:54 2008] [notice] Digest: done
[Mon May 05 19:30:54 2008] [notice] LDAP: Built with OpenLDAP LDAP SDK
[Mon May 05 19:30:54 2008] [notice] LDAP: SSL support unavailable

My memory limit in the php.ini file was set to 256MB!

Any ideas?

Thanks.

--
Pavel Shamis (Pasha)
Mellanox Technologies
Re: [MTT users] MTT server side problem
Hi Ethan,

I don't run the latest version. Tomorrow I will try to update the server side, and I will keep you updated.

Another question: after we brought the server up, we forgot to update the crontab to run the MTT cron jobs. Is it OK to start the cron jobs now? I'm afraid that the late cron start-up will cause problems for the DB.

Thanks.

Ethan Mallove wrote:
> Hi Pasha,
>
> I thought this issue was solved in r1119 (see below). Do you have the latest mtt/server scripts?
>
> https://svn.open-mpi.org/trac/mtt/changeset/1119/trunk/server/php/submit
>
> -Ethan
>
> On Tue, May/06/2008 03:26:43PM, Pavel Shamis (Pasha) wrote:
> > About the issue:
> >
> > 1. On the client side I see "*** WARNING: MTTDatabase client did not get a serial". As a result of the error, some of the MTT results are not visible via the web reporter.
> >
> > 2. On the server side I found the following error messages:
> >
> > [client 10.4.3.214] PHP Fatal error: Allowed memory size of 33554432 bytes exhausted (tried to allocate 23592960 bytes) in /.autodirect/swgwork/MTT/mtt/submit/index.php(79) : eval()'d code on line 77515
> > [Mon May 05 19:26:05 2008] [notice] caught SIGTERM, shutting down
> > [Mon May 05 19:30:54 2008] [notice] suEXEC mechanism enabled (wrapper: /usr/sbin/suexec)
> > [Mon May 05 19:30:54 2008] [notice] Digest: generating secret for digest authentication ...
> > [Mon May 05 19:30:54 2008] [notice] Digest: done
> > [Mon May 05 19:30:54 2008] [notice] LDAP: Built with OpenLDAP LDAP SDK
> > [Mon May 05 19:30:54 2008] [notice] LDAP: SSL support unavailable
> >
> > My memory limit in the php.ini file was set to 256MB!
> >
> > Any ideas?
> >
> > Thanks.

--
Pavel Shamis (Pasha)
Mellanox Technologies
Re: [MTT users] MTT server side problem
> I'm not sure which cron jobs you're referring to. Do you mean these?
>
> https://svn.open-mpi.org/trac/mtt/browser/trunk/server/php/cron

I was talking about this one:

https://svn.open-mpi.org/trac/mtt/wiki/ServerMaintenance

> The only things there are the regular mtt-resu...@open-mpi.org email alerts and some out-of-date DB monitoring junk. You can ignore that stuff.
>
> Josh, are there some nightly (DB pruning/cleaning/vacuuming?) cron jobs that Pasha should be running?
>
> -Ethan
--
Pavel Shamis (Pasha)
Mellanox Technologies
Re: [MTT users] MTT server side problem
Hi,

I upgraded the server side (MTT is still running, so I don't know yet whether the problem was resolved).

During the upgrade I had some problems with the submit/index.php script: it had some duplicated functions, and some of them were broken. Please review the attached patch.

Pasha

Ethan Mallove wrote:
> On Tue, May/06/2008 06:29:33PM, Pavel Shamis (Pasha) wrote:
> > > I'm not sure which cron jobs you're referring to. Do you mean these?
> > >
> > > https://svn.open-mpi.org/trac/mtt/browser/trunk/server/php/cron
> >
> > I talked about this one:
> >
> > https://svn.open-mpi.org/trac/mtt/wiki/ServerMaintenance
>
> I'm guessing you would only be concerned with the below periodic-maintenance.pl script, which just runs ANALYZE/VACUUM queries. I think you can start that up whenever you want (and it should optimize the Reporter).
>
> https://svn.open-mpi.org/trac/mtt/browser/trunk/server/sql/cron/periodic-maintenance.pl
>
> -Ethan

--
Pavel Shamis (Pasha)
Mellanox Technologies

Index: submit/index.php
===
--- submit/index.php (revision 1200)
+++ submit/index.php (working copy)
@@ -1,6 +1,7 @@
 value) function associative_select($cmd) {
@@ -1722,21 +1584,6 @@ function associative_select($cmd) {
     return pg_fetch_array($result);
 }
 
-# Fetch 2D array
-function select($cmd) {
-    do_pg_connect();
-
-    debug("\nSQL: $cmd\n");
-    if (! ($result = pg_query($cmd))) {
-        $out = "\nSQL QUERY: " . $cmd .
-               "\nSQL ERROR: " . pg_last_error() .
-               "\nSQL ERROR: " . pg_result_error();
-        mtt_error($out);
-        mtt_send_mail($out);
-    }
-    return pg_fetch_all($result);
-}
-
 ##
 # Function for reporting errors back to the client
Re: [MTT users] MTT server side problem
Hi Josh.

> Looking at the patch I'm a little bit concerned. The "get_table_fields()" is, as you mentioned, no longer used so should be removed. However, the other functions are critical to the submission script, particularly 'do_pg_connect', which opens the connection to the backend database.

All the functions are implemented in the $topdir/database.inc file. The database.inc implementation is better because it uses the password and username from config.ini. The original implementation in submit/index.php uses hardcoded values defined in the file.

> Are you using the current development trunk (mtt/trunk) or the stable release branch (mtt/branches/ompi-core-testers)?

trunk

> Can you send us the error messages that you were receiving?

1. On the client side I see "*** WARNING: MTTDatabase client did not get a serial". As a result of the error, some of the MTT results are not visible via the web reporter.

2. On the server side I found the following error messages:

[client 10.4.3.214] PHP Fatal error: Allowed memory size of 33554432 bytes exhausted (tried to allocate 23592960 bytes) in /.autodirect/swgwork/MTT/mtt/submit/index.php(79) : eval()'d code on line 77515
[Mon May 05 19:26:05 2008] [notice] caught SIGTERM, shutting down
[Mon May 05 19:30:54 2008] [notice] suEXEC mechanism enabled (wrapper: /usr/sbin/suexec)
[Mon May 05 19:30:54 2008] [notice] Digest: generating secret for digest authentication ...
[Mon May 05 19:30:54 2008] [notice] Digest: done
[Mon May 05 19:30:54 2008] [notice] LDAP: Built with OpenLDAP LDAP SDK
[Mon May 05 19:30:54 2008] [notice] LDAP: SSL support unavailable

My memory limit in the php.ini file was set to 256MB!

Regards,
Pasha

> Cheers,
> Josh
>
> On May 7, 2008, at 4:49 AM, Pavel Shamis (Pasha) wrote:
> > Hi,
> > I upgraded the server side (the mtt is still running, so don't know if the problem was resolved). During upgrade I had some problem with the submit/index.php script, it had some duplicated functions and some of them were broken. Please review the attached patch.
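Pasha's argument for the database.inc versions is that the connection settings come from config.ini instead of being hardcoded in the script. Neither file is shown in the thread, so the Python sketch below uses an invented [database] section and invented key names purely to illustrate the idea; load_db_settings is a hypothetical helper.

```python
import configparser

# Hypothetical contents: the real config.ini layout is not shown in the thread.
SAMPLE_INI = """
[database]
dbname = mtt
username = mtt_user
password = secret
"""


def load_db_settings(ini_text):
    """Return the connection settings from the [database] section as a dict."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return dict(parser["database"])


settings = load_db_settings(SAMPLE_INI)
print(settings["username"])
```

With the settings in one file, a credential change touches the configuration only, and every script that shares the helper picks it up.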
Re: [MTT users] MTT server side problem
Hi Josh,

I had the original problem with an old revision from trunk. Today I updated the server to the latest revision from trunk plus the patch, and everything looks good.

Can I commit the patch?

Pasha

Ethan Mallove wrote:
> On Wed, May/07/2008 06:04:07PM, Pavel Shamis (Pasha) wrote:
> > My memory limit in php.ini file was set on 256MB !
>
> Looks like PHP is actually using a 32MB limit ("Allowed memory size of 33554432 ..."). Does a (Apache?) daemon need to be restarted for the php.ini file to take effect? To check your settings, this little PHP script will print an HTML page of all the active system settings (search on "memory_limit").
>
> -Ethan
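Ethan's reading of the log can be checked with a little arithmetic: 33554432 bytes is exactly 32 MiB, not the 256 MB set in php.ini, which is consistent with the running PHP not having picked up the edited setting. The Python sketch below only does the unit conversion; to_bytes is a hypothetical helper that mimics the K/M/G shorthand used in php.ini and is not part of MTT or PHP. The script Ethan refers to did not survive in the archive; PHP's built-in phpinfo() function produces a settings page of that kind.

```python
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def to_bytes(setting):
    """Convert a shorthand value such as '256M' to a number of bytes."""
    setting = setting.strip().upper()
    if setting and setting[-1] in UNITS:
        return int(setting[:-1]) * UNITS[setting[-1]]
    return int(setting)


limit_in_log = 33554432  # from "Allowed memory size of 33554432 bytes"
attempted = 23592960     # from "tried to allocate 23592960 bytes"

print(limit_in_log // UNITS["M"], "MiB limit in effect")
print(attempted / UNITS["M"], "MiB requested in one allocation")
print(to_bytes("256M"), "bytes expected from php.ini")
```

Comparing the limit quoted in the error message with the configured value is a quick way to tell whether the wrong php.ini is being read or the web server was not restarted after the change.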
Re: [MTT users] MTT server side problem
Hi Josh,

I ported the error handling mechanism from submit/index.php to database.inc. Please review.

Thanks,
Pasha

Josh Hursey wrote:
> Pasha,
>
> I'm looking at the patch a bit closer, and even though at a high level the do_pg_connect, do_pg_query, simple_select, and select functions do the same thing, the versions in submit/index.php have some additional error handling mechanisms that the ones in database.inc do not have. Specifically, they send email when the functions fail, with messages indicating what failed so corrections can be made.
>
> So though I agree that we should unify the functionality, I cannot recommend this patch since it will result in losing useful error handling functionality. Maybe there is another way to clean this up to preserve the error reporting.
>
> -- Josh
>
> On May 7, 2008, at 11:56 AM, Pavel Shamis (Pasha) wrote:
> > Hi Josh,
> > I had the original problem with some old revision from trunk. Today I updated the server to latest revision from trunk + the patch and everything looks good. Can I commit the patch ?
> > Pasha
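Josh's concern is that unifying the helpers must not drop the error reporting. One way to picture the merged behaviour is a single query helper that, on failure, builds a message in the same shape as the removed select() function (query text plus error text) and hands it to a notifier. This is a Python sketch with stand-in callables, not the PHP that was reviewed; run_query, execute, and notify are hypothetical names.

```python
def run_query(execute, cmd, notify):
    """Run cmd via execute(); on failure, report the query and error via notify()."""
    try:
        return execute(cmd)
    except Exception as err:
        notify(f"\nSQL QUERY: {cmd}\nSQL ERROR: {err}")
        return None


def failing_execute(cmd):
    # Stand-in for a database call that fails.
    raise RuntimeError("relation does not exist")


sent = []  # collects messages instead of sending mail
result = run_query(failing_execute, "SELECT 1 FROM missing", sent.append)
print(result, len(sent))
```

Passing the notifier in as a parameter keeps the shared helper usable both by scripts that mail their errors and by scripts that only log them.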
Re: [MTT users] MTT server side problem
Hello,

Did you have a chance to review this patch?

Regards,
Pasha

Josh Hursey wrote:
> Sorry for the delay on this. I probably will not have a chance to look at it until later this week or early next. Thank you for the work on the patch.
>
> Cheers,
> Josh
>
> On May 12, 2008, at 8:08 AM, Pavel Shamis (Pasha) wrote:
> > Hi Josh,
> > I ported the error handling mechanism from submit/index.php to the database.inc. Please review.
> > Thanks,
> > Pasha
Re: [MTT users] MTT server side problem
Josh Hursey wrote:

I just got back from travel this morning. I reviewed the patch and it looks good to me. I updated the ticket: https://svn.open-mpi.org/trac/mtt/ticket/357

Do you need me to commit it?

Thank you. I will commit it.

Regards, Pasha

Cheers, Josh

On May 19, 2008, at 7:32 AM, Pavel Shamis (Pasha) wrote:

Hello, Did you have a chance to review this patch?

Regards, Pasha

Josh Hursey wrote:

Sorry for the delay on this. I probably will not have a chance to look at it until later this week or early next. Thank you for the work on the patch.

Cheers, Josh

On May 12, 2008, at 8:08 AM, Pavel Shamis (Pasha) wrote:

Hi Josh, I ported the error handling mechanism from submit/index.php to database.inc. Please review.

Thanks, Pasha

Josh Hursey wrote:

Pasha, I'm looking at the patch a bit closer, and even though at a high level the do_pg_connect, do_pg_query, simple_select, and select functions do the same thing, the versions in submit/index.php have some additional error handling mechanisms that the ones in database.inc do not have. Specifically, they send email when the functions fail, with messages indicating what failed so corrections can be made. So though I agree that we should unify the functionality, I cannot recommend this patch since it will result in losing useful error handling functionality. Maybe there is another way to clean this up to preserve the error reporting.

-- Josh

On May 7, 2008, at 11:56 AM, Pavel Shamis (Pasha) wrote:

Hi Josh, I had the original problem with some old revision from trunk. Today I updated the server to the latest revision from trunk + the patch and everything looks good. Can I commit the patch?

Pasha

Ethan Mallove wrote:

On Wed, May/07/2008 06:04:07PM, Pavel Shamis (Pasha) wrote:

Hi Josh.

Looking at the patch I'm a little bit concerned. The "get_table_fields()" is, as you mentioned, no longer used so should be removed.
However the other functions are critical to the submission script particularly 'do_pg_connect' which opens the connection to the backend database. All the functions are implemented in $topdir/database.inc file. And the "database.inc" implementation is better because it use password and username from config.ini. The original implementation from submit/index use hardcoded values defined in the file.
[MTT users] Can not find my testing results in OMPI MTT DB
Hi,

Here is the test result from my last mtt run:

+-------------+----------------+------+------+----------+------+
| Phase       | Section        | Pass | Fail | Time out | Skip |
+-------------+----------------+------+------+----------+------+
| MPI install | ompi/gcc       | 1    | 0    | 0        | 0    |
| MPI install | ompi/intel-9.0 | 1    | 0    | 0        | 0    |
| Test Build  | trivial        | 1    | 0    | 0        | 0    |
| Test Build  | trivial        | 1    | 0    | 0        | 0    |
| Test Build  | intel-suite    | 1    | 0    | 0        | 0    |
| Test Build  | intel-suite    | 1    | 0    | 0        | 0    |
| Test Build  | imb            | 1    | 0    | 0        | 0    |
| Test Build  | imb            | 1    | 0    | 0        | 0    |
| Test Build  | presta         | 1    | 0    | 0        | 0    |
| Test Build  | presta         | 1    | 0    | 0        | 0    |
| Test Build  | osu_benchmarks | 1    | 0    | 0        | 0    |
| Test Build  | osu_benchmarks | 1    | 0    | 0        | 0    |
| Test Build  | netpipe        | 1    | 0    | 0        | 0    |
| Test Build  | netpipe        | 1    | 0    | 0        | 0    |
| Test Run    | trivial        | 64   | 0    | 0        | 0    |
| Test Run    | trivial        | 64   | 0    | 0        | 0    |
| Test Run    | intel-suite    | 3179 | 165  | 400      | 0    |
| Test Run    | intel-suite    | 492  | 0    | 0        | 0    |
+-------------+----------------+------+------+----------+------+

In the OMPI MTT DB (http://www.open-mpi.org/mtt) I found the following "test run" results:

| Test Run    | trivial        | 64   | 0    | 0        | 0    |
| Test Run    | trivial        | 64   | 0    | 0        | 0    |
| Test Run    | intel-suite    | 492  | 0    | 0        | 0    |

And I cannot find this one:

| Test Run    | intel-suite    | 3179 | 165  | 400      | 0    |

From the log I see that all test results were submitted successfully. Can you please check?

Thanks, Pasha
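
The comparison described above (local summary rows versus rows visible in the database) can be sketched as a multiset difference. This is a minimal illustration, not part of MTT: the tuples are typed in by hand from the "Test Run" rows in this message, and `missing_rows` is a hypothetical helper name. A `Counter` is used rather than a set because identical rows (the two `trivial` runs) appear more than once.

```python
from collections import Counter

# (section, pass, fail, timed_out, skipped), copied from the local summary.
local_test_runs = [
    ("trivial", 64, 0, 0, 0),
    ("trivial", 64, 0, 0, 0),
    ("intel-suite", 3179, 165, 400, 0),
    ("intel-suite", 492, 0, 0, 0),
]

# Rows reported as visible in the database.
db_test_runs = [
    ("trivial", 64, 0, 0, 0),
    ("trivial", 64, 0, 0, 0),
    ("intel-suite", 492, 0, 0, 0),
]


def missing_rows(local, remote):
    """Return rows present locally but absent remotely, respecting duplicates."""
    difference = Counter(local) - Counter(remote)
    return sorted(difference.elements())


if __name__ == "__main__":
    for row in missing_rows(local_test_runs, db_test_runs):
        print("missing from DB:", row)
```

With the data above, the only row reported is the large `intel-suite` run (3179 passed, 165 failed, 400 timed out).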
Re: [MTT users] Can not find my testing results in OMPI MTT DB
Jeff Squyres wrote: Are we running into http max memory problems or http max upload size problems again? I guess it is some server side issue, you need to check the /var/log/httpd/* log on the server.
Re: [MTT users] Can not find my testing results in OMPI MTT DB
I sent it directly to your email. Please check.

Thanks, Pasha

Ethan Mallove wrote:

On Wed, May/21/2008 05:19:44PM, Pavel Shamis (Pasha) wrote:

Jeff Squyres wrote:

Are we running into http max memory problems or http max upload size problems again?

I guess it is some server side issue, you need to check the /var/log/httpd/* log on the server.

The only thing I found in the httpd logs (/var/log/httpd/www.open-mpi.org/error_log*) was this PHP warning, which I don't think would result in lost results:

    PHP Warning: array_shift(): The argument should be an array in .../submit/index.php on line 1683

I haven't received any emailed Postgres errors either. When were these results submitted? I searched for "mellanox" over the past four days. It seems the results aren't buried in here, because I see no test run failures ...

http://www.open-mpi.org/mtt/index.php?do_redir=659

I'm assuming you're running with two Reporter INI sections: Textfile and MTTDatabase? Can you send some MTT client --verbose/--debug output from the below runs?

Thanks, Ethan

On May 21, 2008, at 5:28 AM, Pavel Shamis (Pasha) wrote:

Hi, Here is the test result from my last mtt run:

+-------------+----------------+------+------+----------+------+
| Phase       | Section        | Pass | Fail | Time out | Skip |
+-------------+----------------+------+------+----------+------+
| MPI install | ompi/gcc       | 1    | 0    | 0        | 0    |
| MPI install | ompi/intel-9.0 | 1    | 0    | 0        | 0    |
| Test Build  | trivial        | 1    | 0    | 0        | 0    |
| Test Build  | trivial        | 1    | 0    | 0        | 0    |
| Test Build  | intel-suite    | 1    | 0    | 0        | 0    |
| Test Build  | intel-suite    | 1    | 0    | 0        | 0    |
| Test Build  | imb            | 1    | 0    | 0        | 0    |
| Test Build  | imb            | 1    | 0    | 0        | 0    |
| Test Build  | presta         | 1    | 0    | 0        | 0    |
| Test Build  | presta         | 1    | 0    | 0        | 0    |
| Test Build  | osu_benchmarks | 1    | 0    | 0        | 0    |
| Test Build  | osu_benchmarks | 1    | 0    | 0        | 0    |
| Test Build  | netpipe        | 1    | 0    | 0        | 0    |
| Test Build  | netpipe        | 1    | 0    | 0        | 0    |
| Test Run    | trivial        | 64   | 0    | 0        | 0    |
| Test Run    | trivial        | 64   | 0    | 0        | 0    |
| Test Run    | intel-suite    | 3179 | 165  | 400      | 0    |
| Test Run    | intel-suite    | 492  | 0    | 0        | 0    |
+-------------+----------------+------+------+----------+------+

In the OMPI MTT DB (http://www.open-mpi.org/mtt) I found the following "test run" results:

| Test Run    | trivial        | 64   | 0    | 0        | 0    |
| Test Run    | trivial        | 64   | 0    | 0        | 0    |
| Test Run    | intel-suite    | 492  | 0    | 0        | 0    |

And I cannot find this one:

| Test Run    | intel-suite    | 3179 | 165  | 400      | 0    |

From the log I see that all test results were submitted successfully. Can you please check?

Thanks, Pasha
Re: [MTT users] Can not find my testing results in OMPI MTT DB
I had a similar problem on my server. I upgraded the server to the latest trunk and the problem disappeared (see the "MTT server side problem" thread).

Pasha

Jeff Squyres wrote:

FWIW: I think we have at least one open ticket on this issue (break up submits so that we don't overflow PHP and/or apache).

On May 21, 2008, at 2:36 PM, Ethan Mallove wrote:

On Wed, May/21/2008 06:46:06PM, Pavel Shamis (Pasha) wrote:

I sent it directly to your email. Please check.

Thanks, Pasha

Got it. Thanks. It's a PHP memory overload issue. (Apparently I didn't look far back enough in the httpd error_logs.) See below.

Ethan Mallove wrote:

On Wed, May/21/2008 05:19:44PM, Pavel Shamis (Pasha) wrote:

Jeff Squyres wrote:

Are we running into http max memory problems or http max upload size problems again?

I guess it is some server side issue, you need to check the /var/log/httpd/* log on the server.

The only thing I found in the httpd logs (/var/log/httpd/www.open-mpi.org/error_log*) was this PHP warning, which I don't think would result in lost results:

    PHP Warning: array_shift(): The argument should be an array in .../submit/index.php on line 1683

I haven't received any emailed Postgres errors either. When were these results submitted? I searched for "mellanox" over the past four days. It seems the results aren't buried in here, because I see no test run failures ...

http://www.open-mpi.org/mtt/index.php?do_redir=659

I'm assuming you're running with two Reporter INI sections: Textfile and MTTDatabase? Can you send some MTT client --verbose/--debug output from the below runs?

Thanks, Ethan

On May 21, 2008, at 5:28 AM, Pavel Shamis (Pasha) wrote:

Hi, Here is the test result from my last mtt run:

+-------------+----------------+------+------+----------+------+
| Phase       | Section        | Pass | Fail | Time out | Skip |
+-------------+----------------+------+------+----------+------+
| MPI install | ompi/gcc       | 1    | 0    | 0        | 0    |
| MPI install | ompi/intel-9.0 | 1    | 0    | 0        | 0    |
| Test Build  | trivial        | 1    | 0    | 0        | 0    |
| Test Build  | trivial        | 1    | 0    | 0        | 0    |
| Test Build  | intel-suite    | 1    | 0    | 0        | 0    |
| Test Build  | intel-suite    | 1    | 0    | 0        | 0    |
| Test Build  | imb            | 1    | 0    | 0        | 0    |
| Test Build  | imb            | 1    | 0    | 0        | 0    |
| Test Build  | presta         | 1    | 0    | 0        | 0    |
| Test Build  | presta         | 1    | 0    | 0        | 0    |
| Test Build  | osu_benchmarks | 1    | 0    | 0        | 0    |
| Test Build  | osu_benchmarks | 1    | 0    | 0        | 0    |
| Test Build  | netpipe        | 1    | 0    | 0        | 0    |
| Test Build  | netpipe        | 1    | 0    | 0        | 0    |
| Test Run    | trivial        | 64   | 0    | 0        | 0    |
| Test Run    | trivial        | 64   | 0    | 0        | 0    |
| Test Run    | intel-suite    | 3179 | 165  | 400      | 0    |
| Test Run    | intel-suite    | 492  | 0    | 0        | 0    |
+-------------+----------------+------+------+----------+------+

In the OMPI MTT DB (http://www.open-mpi.org/mtt) I found the following "test run" results:

| Test Run    | trivial        | 64   | 0    | 0        | 0    |
| Test Run    | trivial        | 64   | 0    | 0        | 0    |
| Test Run    | intel-suite    | 492  | 0    | 0        | 0    |

And I cannot find this one:

| Test Run    | intel-suite    | 3179 | 165  | 400      | 0    |

Some missing results are in mttdb_debug_file.16.txt (and 17.txt), which are the largest .txt files of the bunch. 8 variants isn't that much, but maybe it causes a problem when there's lots of stderr/stdout? I'm surprised submit/index.php barfs on files this size:

    $ ls -l
    ...
    -rw-r--r--   1 em162155 staff  956567 May 21 14:21 mttdb_debug_file.16.inc.gz
    -rw-r--r--   1 em162155 staff 9603132 May 21 14:09 mttdb_debug_file.16.txt
    ...

    $ client/mtt-submit -h www.open-mpi.org -f mttdb_debug_file.16.txt -z -u sun -p sun4sun -d
    LWP::UserAgent::new: ()
    LWP::UserAgent::proxy: http
    Filelist: $VAR1 = 'mttdb_debug_file.16.txt';
    LWP::MediaTypes::read_media_types: Reading media types from /ws/ompi-tools/lib/perl5/5.8.8/LWP/media.types
    LWP::MediaTypes::read_media_types: Reading media types from /usr/perl5/site_perl/5.8.4/LWP/media.types
    LWP::MediaTypes::read_media_types: Reading media types from /home/em162155/.mime.types
    LWP::UserAgent::request: ()
    LWP::UserAgent::send_request: POST http://www.open-mpi.org/mtt/submit/index.php
    LWP::UserAgent::_need_proxy: Not proxied
    LWP::Protocol::http::request: ()
    LWP::UserAgent::request: Simple response: OK

    $ tail -f /var/log/httpd/www.open-mpi.org/error_log | grep -w submit
    ...
    [client 192.18.128.5] PHP Fatal error: Allowed memory size of 33554432 bytes exhausted (tried to alloc
Re: [MTT users] Can not find my testing results in OMPI MTT DB
Oops, in the "MTT server side problem" we discussed other issue. But anyway I did not see the problem on my server after the upgrade :) Pasha
Re: [MTT users] Can not find my testing results in OMPI MTT DB
Hi All,

I still see that MTT (http://www.open-mpi.org/mtt) sometimes "loses" my testing results. In the local log I see:

    ### Test progress: 474 of 474 section tests complete (100%)
    Submitting to MTTDatabase...
    MTTDatabase client trying proxy: / Default (none)
    MTTDatabase proxy successful / not 500
    MTTDatabase response is a success
    MTTDatabase client got response:
    *** WARNING: MTTDatabase client did not get a serial; phases will be isolated from each other in the reports
    MTTDatabase client submit complete

And I cannot find these results in the DB. Is there any progress with this issue?

Regards. Pasha

Ethan Mallove wrote:

On Wed, May/21/2008 09:53:11PM, Pavel Shamis (Pasha) wrote:

Oops, in the "MTT server side problem" thread we discussed a different issue. But anyway, I did not see the problem on my server after the upgrade :)

We took *some* steps to alleviate the PHP memory overload problem (e.g., r668, and then r1119), but evidently there's more work to do :-)

Pasha

Pavel Shamis (Pasha) wrote:

I had a similar problem on my server. I upgraded the server to the latest trunk and the problem disappeared (see the "MTT server side problem" thread).

Pasha

Jeff Squyres wrote:

FWIW: I think we have at least one open ticket on this issue (break up submits so that we don't overflow PHP and/or apache).

https://svn.open-mpi.org/trac/mtt/ticket/221

-Ethan

On May 21, 2008, at 2:36 PM, Ethan Mallove wrote:

On Wed, May/21/2008 06:46:06PM, Pavel Shamis (Pasha) wrote:

I sent it directly to your email. Please check.

Thanks, Pasha

Got it. Thanks. It's a PHP memory overload issue. (Apparently I didn't look far back enough in the httpd error_logs.) See below.

Ethan Mallove wrote:

On Wed, May/21/2008 05:19:44PM, Pavel Shamis (Pasha) wrote:

Jeff Squyres wrote:

Are we running into http max memory problems or http max upload size problems again?

I guess it is some server side issue, you need to check the /var/log/httpd/* log on the server.
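
The client log quoted in this message shows why the loss is easy to miss: the HTTP response is reported as a success, and only the "did not get a serial" warning hints that the server did not store the results. A minimal sketch of scanning a saved client log for that warning follows. It is not part of MTT; the helper name is hypothetical, and the sample log assumes the flattened text above was originally one message per line.

```python
# Substring taken from the warning quoted in this thread.
WARNING_TEXT = "MTTDatabase client did not get a serial"

# Sample log, assuming one message per line in the original output.
SAMPLE_LOG = """\
### Test progress: 474 of 474 section tests complete (100%)
Submitting to MTTDatabase...
MTTDatabase client trying proxy: / Default (none)
MTTDatabase proxy successful / not 500
MTTDatabase response is a success
MTTDatabase client got response:
*** WARNING: MTTDatabase client did not get a serial; phases will be isolated from each other in the reports
MTTDatabase client submit complete
"""


def find_missing_serial_warnings(log_text):
    """Return the 1-based line numbers of lines that carry the warning."""
    return [
        number
        for number, line in enumerate(log_text.splitlines(), start=1)
        if WARNING_TEXT in line
    ]


if __name__ == "__main__":
    hits = find_missing_serial_warnings(SAMPLE_LOG)
    if hits:
        print("possible lost submission, see log lines:", hits)
    else:
        print("no missing-serial warnings found")
```

Checking for the warning text, rather than for the "response is a success" line, is the point of the sketch: in the log above both appear in the same submission.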
Re: [MTT users] Can not find my testing results in OMPI MTT DB
Ethan Mallove wrote:

If you are running on the trunk, there's this Test run INI parameter:

    submit_results_after_each = 1

That tells MTT to submit results after each new test executable. This might add significant overhead, but at least you'd have some results in the DB to look at. I'll go ahead and add "submit_results_after_n_tests". (Is that an okay name?) E.g., this would submit results after every 100 new test executables:

    submit_results_after_n_tests = 100

The "submit_results_after_n_tests" solution sounds good to me. In which section should I define it?

Pasha

On Tue, Jul/08/2008 04:11:46PM, Jeff Squyres wrote:

...and/or we could finally make the client capable of breaking up large sets of submit data into multiple submits to the server. :-) I unfortunately have no cycles to work on this, but it shouldn't be *too* hard to do... E.g., the client can loop over submitting N results at a time until all results have been submitted.

On Jul 8, 2008, at 4:07 PM, Ethan Mallove wrote:

I see a bunch of these errors in the httpd logs today:

    PHP Fatal error: Allowed memory size of 33554432

Could you send your INI file? One way around this issue is to submit fewer test results at one time by breaking up your large MTT run into multiple runs. If you're running on the trunk, this parameter will submit results after each run.

-Ethan

On Tue, Jul/08/2008 10:22:47AM, Pavel Shamis (Pasha) wrote:

Hi All,

I still see that MTT (http://www.open-mpi.org/mtt) sometimes "loses" my testing results. In the local log I see:

    ### Test progress: 474 of 474 section tests complete (100%)
    Submitting to MTTDatabase...
    MTTDatabase client trying proxy: / Default (none)
    MTTDatabase proxy successful / not 500
    MTTDatabase response is a success
    MTTDatabase client got response:
    *** WARNING: MTTDatabase client did not get a serial; phases will be isolated from each other in the reports
    MTTDatabase client submit complete

And I cannot find these results in the DB. Is there any progress with this issue?

Regards.
Pasha
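
Jeff's suggestion in this message, that the client loop over submitting N results at a time until everything has been sent, can be sketched as follows. This is only an illustration of the batching idea and not MTT's actual client code (which is written in Perl): `chunked`, `submit_in_batches`, and the `submit` callback are hypothetical names, and error handling and retries are left out.

```python
def chunked(results, batch_size):
    """Yield consecutive slices of `results`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(results), batch_size):
        yield results[start:start + batch_size]


def submit_in_batches(results, batch_size, submit):
    """Call submit(batch) once per batch and return the number of calls made."""
    calls = 0
    for batch in chunked(results, batch_size):
        submit(batch)
        calls += 1
    return calls


if __name__ == "__main__":
    # Stand-in for the 474 test results mentioned in the log above.
    fake_results = list(range(474))
    sizes = []
    # The callback only records batch sizes; a real one would do an HTTP POST.
    n_calls = submit_in_batches(fake_results, 100, lambda batch: sizes.append(len(batch)))
    print(n_calls, sizes)  # 5 [100, 100, 100, 100, 74]
```

Smaller batches keep each request under the server's PHP memory limit at the cost of more HTTP round trips, which is the same trade-off Ethan describes for submit_results_after_each.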
Re: [MTT users] RETRY EXCEEDED ERROR
The "RETRY EXCEEDED ERROR" error is related to IB and not to MTT. The error says that IB failed to send an IB packet from machine 10.2.1.90 to 10.2.1.50. You need to run your IB network monitoring tool and find the issue. Usually it is a bad cable in the IB fabric that causes such errors.

Regards, Pasha

Rafael Folco wrote:

Hi,

I need some help, please. I'm running a set of MTT tests on my cluster and I have issues with one particular node.

    [0,1,7][btl_openib_component.c:1332:btl_openib_component_progress] from 10.2.1.90 to: 10.2.1.50 error polling HP CQ with status RETRY EXCEEDED ERROR status number 12 for wr_id 268870712 opcode 0

I am able to ping from 10.2.1.90 to 10.2.1.50, and they are visible to each other on the network, just like the other nodes. I've already checked the drivers and reinstalled openmpi, but nothing changes...

On 10.2.1.90:

    # ping 10.2.1.50
    PING 10.2.1.50 (10.2.1.50) 56(84) bytes of data.
    64 bytes from 10.2.1.50: icmp_seq=1 ttl=64 time=9.95 ms
    64 bytes from 10.2.1.50: icmp_seq=2 ttl=64 time=0.076 ms
    64 bytes from 10.2.1.50: icmp_seq=3 ttl=64 time=0.114 ms

The cable connections are the same for every node and all tests run fine without 10.2.1.90. On the other hand, when I add 10.2.1.90 to the hostlist, I get many failures. Please, could someone tell me why 10.2.1.90 doesn't like 10.2.1.50? Any clue? I don't see any problems with other combinations of nodes. This is very, very weird.

MTT Results Summary

hostname: p6ihopenhpc1-ib0
uname: Linux p6ihopenhpc1-ib0 2.6.16.60-0.21-ppc64 #1 SMP Tue May 6 12:41:02 UTC 2008 ppc64 ppc64 ppc64 GNU/Linux
who am i: root pts/3 Jul 31 13:31 (elm3b150:S.0)

+-------------+-----------------+------+------+----------+------+
| Phase       | Section         | Pass | Fail | Time out | Skip |
+-------------+-----------------+------+------+----------+------+
| MPI install | openmpi-1.2.5   | 1    | 0    | 0        | 0    |
| Test Build  | trivial         | 1    | 0    | 0        | 0    |
| Test Build  | ibm             | 1    | 0    | 0        | 0    |
| Test Build  | onesided        | 1    | 0    | 0        | 0    |
| Test Build  | mpicxx          | 1    | 0    | 0        | 0    |
| Test Build  | imb             | 1    | 0    | 0        | 0    |
| Test Build  | netpipe         | 1    | 0    | 0        | 0    |
| Test Run    | trivial         | 4    | 4    | 0        | 0    |
| Test Run    | ibm             | 59   | 120  | 0        | 3    |
| Test Run    | onesided        | 95   | 37   | 0        | 0    |
| Test Run    | mpicxx          | 0    | 1    | 0        | 0    |
| Test Run    | imb correctness | 0    | 1    | 0        | 0    |
| Test Run    | imb performance | 0    | 12   | 0        | 0    |
| Test Run    | netpipe         | 1    | 0    | 0        | 0    |
+-------------+-----------------+------+------+----------+------+

I also attached one of the errors here.

Thanks in advance, Rafael
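
When many tests fail with this error, it can help to count the failures per (source, destination) pair to see whether they concentrate on one link, which would fit the bad-cable explanation given in the reply. The sketch below is a rough illustration under stated assumptions: the regular expression is written against the single error line quoted in this message and may not match other Open MPI versions, and the helper names are made up for this example.

```python
import re
from collections import Counter

# Error line copied from the message above.
ERROR_LINE = (
    "[0,1,7][btl_openib_component.c:1332:btl_openib_component_progress] "
    "from 10.2.1.90 to: 10.2.1.50 error polling HP CQ with status "
    "RETRY EXCEEDED ERROR status number 12 for wr_id 268870712 opcode 0"
)

# Captures source host, destination host, status text and status number.
PATTERN = re.compile(
    r"from (?P<src>\S+) to: (?P<dst>\S+) error polling \S+ CQ "
    r"with status (?P<status>.+?) status number (?P<code>\d+)"
)


def parse_openib_error(line):
    """Return (src, dst, status, code), or None if the line does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return (
        match.group("src"),
        match.group("dst"),
        match.group("status"),
        int(match.group("code")),
    )


def count_failing_pairs(lines):
    """Count matching error lines per (src, dst) host pair."""
    counts = Counter()
    for line in lines:
        parsed = parse_openib_error(line)
        if parsed is not None:
            counts[(parsed[0], parsed[1])] += 1
    return counts


if __name__ == "__main__":
    print(parse_openib_error(ERROR_LINE))
    print(count_failing_pairs([ERROR_LINE, "unrelated output", ERROR_LINE]))
```

A successful ping does not contradict such a tally, since ping travels over the IP layer while the failing transfers use the InfiniBand transport directly.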