Re: [MTT users] Fwd: [Alert] Found server-side submit error messages

2008-10-28 Thread Tim Mattox
I ran into this a week ago on sif, so I added report_after_n_results = 100
for our regular nightly tests on sif, odin and bigred.  Josh, could this be a
problem with any of the tests you run?
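
For anyone else who hits this, the fix is a single line in the affected "Test
run" section(s) of your ini file.  A minimal sketch (the section name here is
only an example; use whatever yours is called):

  [Test run: intel]
  ...your existing test run settings...
  report_after_n_results = 100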

On Tue, Oct 28, 2008 at 6:15 PM, Jeff Squyres <jsquy...@cisco.com> wrote:
> That host is in one of IU's clusters (odin).
>
> Tim/Josh -- this is you guys...
>
>
> On Oct 28, 2008, at 3:45 PM, Ethan Mallove wrote:
>
>> Folks,
>>
>> I got an alert from the http-log-checker.pl script. Somebody appears to
>> have lost some MTT results. (Possibly due to an oversized database
>> submission to submit/index.php?) There's an open ticket for this (see
>> https://svn.open-mpi.org/trac/mtt/ticket/375).  Currently there exists a
>> simple workaround for this problem. Put the line below in the problematic
>> "Test run" section(s). This will prevent oversized submissions by directing
>> MTT to submit the results in batches of 50 instead of an entire section at
>> a time, which can reach 400+ results for an Intel test run section.
>>
>>   report_after_n_results = 50
>>
>> It's hard to know whose errors are in the HTTP error log with only the IP
>> address. If you want to verify whether they are yours, visit a bogus URL
>> off open-mpi.org, e.g., www.open-mpi.org/what-is-foobars-ip-address, and
>> ping me about it. This will write your IP address to the log file, and I
>> can then match it against the IP addresses in the submit.php errors.
>>
>> -Ethan
>>
>>
>> - Forwarded message from Ethan Mallove <emall...@osl.iu.edu> -
>>
>> From: Ethan Mallove <emall...@osl.iu.edu>
>> Date: Tue, 28 Oct 2008 08:00:41 -0400
>> To: ethan.mall...@sun.com, http-log-checker.pl-no-re...@open-mpi.org
>> Subject: [Alert] Found server-side submit error messages
>> Original-recipient: rfc822;ethan.mall...@sun.com
>>
>> This email was automatically sent by http-log-checker.pl. You have
>> received
>> it because some error messages were found in the HTTP(S) logs that
>> might indicate some MTT results were not successfully submitted by the
>> server-side PHP submit script (even if the MTT client has not
>> indicated a submission error).
>>
>> ###
>> #
>> # The below log messages matched "gz.*submit/index.php" in
>> # /var/log/httpd/www.open-mpi.org/ssl_error_log
>> #
>> ###
>>
>> [client 129.79.240.114] PHP Warning:  gzeof(): supplied argument is not a
>> valid stream resource in
>> /nfs/rontok/xraid/data/osl/www/www.open-mpi.org/mtt/submit/index.php on line
>> 1923
>> [client 129.79.240.114] PHP Warning:  gzgets(): supplied argument is not a
>> valid stream resource in
>> /nfs/rontok/xraid/data/osl/www/www.open-mpi.org/mtt/submit/index.php on line
>> 1924
>> ...
>> 
>>
>>
>>
>>
>> - End forwarded message -
>> ___
>> mtt-users mailing list
>> mtt-us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users
>
>
> --
> Jeff Squyres
> Cisco Systems
>
> ___
> mtt-users mailing list
> mtt-us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users
>



-- 
Tim Mattox, Ph.D. - http://homepage.mac.com/tmattox/
 tmat...@gmail.com || timat...@open-mpi.org
I'm a bright... http://www.the-brights.net/


Re: [MTT users] Patch to add --local-scratch option

2008-09-19 Thread Tim Mattox
I've also been thinking about this a bit more, and although
having the name match the INI section name has some appeal,
I ultimately think the best name is: --mpi-build-scratch, since
that is what it does.  As Ethan mentioned, the actual MPI
install goes into --scratch.  And on the other side of it,
the MPI Get also goes into --scratch.  The --mpi-build-scratch
is only used for untarring/copying the MPI source tree, and for running
configure, make, and make check.  The actual "make install"
simply copies the binaries from --mpi-build-scratch into --scratch.
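
If we do rename it, I would expect the ini spelling to mirror the existing
local_scratch syntax.  A sketch of what I have in mind (the option name is
not final until the patch is adjusted):

  [MTT]
  # hypothetical renamed form of the current local_scratch option
  mpi_build_scratch = ("echo /tmp/`whoami`_mtt")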

As for names like local-scratch or fast-scratch, they don't convey
what the directory is used for: should it be fast-for-big-files, or fast-for-small-files?
Or similarly, "local" to my cluster, my node, or what?
I think mpi-build-scratch conveys the most useful meaning, since you
should pick a filesystem that is tuned (or at least not horrible) for
doing configure/make.

Unfortunately, I won't have time today to get the patch adjusted
and into svn.  Maybe on Monday.

On Fri, Sep 19, 2008 at 11:23 AM, Ethan Mallove <ethan.mall...@sun.com> wrote:
> On Thu, Sep/18/2008 05:35:13PM, Jeff Squyres wrote:
>> On Sep 18, 2008, at 10:45 AM, Ethan Mallove wrote:
>>
>>>> Ah, yeah, ok, now I see why you would call it --mpi-install-scratch, so
>>>> that it matches the MTT ini section name.  Sure, that works for me.
>>>
>>> Since this does seem like a feature that should eventually
>>> propagate to all the other phases (except for Test run),
>>> what will we call the option to group all the fast phase
>>> scratches?
>>
>> --scratch
>>
>> :-)
>>
>> Seriously, *if* we ever implement the other per-phase scratches, I think
>> having one overall --scratch and fine-grained per-phase specifications is
>> fine.  I don't think we need to go overboard to have a way to say I want
>> phases X, Y, and Z to use scratch A.  Meaning that you could just use
>> --X-scratch=A, --Y-scratch=A, and --Z-scratch=A.
>
> --mpi-install-scratch actually has MTT install (using
> DESTDIR) into --scratch. Is that confusing? Though
> --fast-scratch could also be misleading, as I could see a
> user thinking that --fast-scratch will do some magical
> optimization to make their NFS directory go faster. I guess
> I'm done splitting hairs over --mpi-install-scratch :-)
>
> -Ethan
>
>
>>
>> --
>> Jeff Squyres
>> Cisco Systems
>>
>> ___
>> mtt-users mailing list
>> mtt-us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users
> ___
> mtt-users mailing list
> mtt-us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users
>



-- 
Tim Mattox, Ph.D. - http://homepage.mac.com/tmattox/
 tmat...@gmail.com || timat...@open-mpi.org
I'm a bright... http://www.the-brights.net/


Re: [MTT users] Patch to add --local-scratch option

2008-09-18 Thread Tim Mattox
I guess I should comment on Jeff's comments too.

On Thu, Sep 18, 2008 at 9:00 AM, Jeff Squyres <jsquy...@cisco.com> wrote:
> On Sep 16, 2008, at 12:07 PM, Ethan Mallove wrote:
>
>> What happens if one uses --local-scratch, but leaves out the
>> --scratch option? In this case, I think MTT should assume
>> --scratch is the same as --local-scratch.
>
> In this case, my $0.02 is that it should be an error.  --scratch implies a
> --local-scratch, but I don't think the implication should go the other way.

Yeah, I agree, especially if we call it --mpi-install-scratch.

>> Could the "local" in --local-scratch ever be misleading?
>> Couldn't a user be using a mounted filesystem that's faster
>> than all of their local filesystems? Should it be
>> --fast-scratch?
>
> Mmm... good point.  What if we name it what it really is:
> --mpi-install-scratch?  This also opens the door for other phase scratches
> if we ever want them.  And it keeps everything consistent and simple (from
> the user's point of view).

Ah, yeah, ok, now I see why you would call it --mpi-install-scratch, so
that it matches the MTT ini section name.  Sure, that works for me.

>> For future work, how about --scratch taking a (CSV?) list of
>> scratch directories. MTT then does a quick check for which
>> is the fastest filesystem (if such a check is
>> possible/feasible), and proceeds accordingly. That is, doing
>> everything it possibly can in a fast scratch (builds,
>> writing out metadata, etc.), and installing the MPI(s) into
>> the slow mounted scratch. Would this be possible?
>
> Hmm.  I'm not quite sure how that would work -- "fastest" is a hard thing to
> determine.  What is "fastest" at this moment may be "slowest" 2 minutes from
> now, for example.

Yeah, I claim that auto-detecting file system speed is outside the
scope of MTT. :-)

> I'm looking at the patch in detail now... sorry for the delay...
>
> --
> Jeff Squyres
> Cisco Systems
>
> ___
> mtt-users mailing list
> mtt-us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users
>



-- 
Tim Mattox, Ph.D. - http://homepage.mac.com/tmattox/
 tmat...@gmail.com || timat...@open-mpi.org
 I'm a bright... http://www.the-brights.net/


Re: [MTT users] Patch to add --local-scratch option

2008-09-18 Thread Tim Mattox
OK, so how about calling it --mpi-build-scratch?
Once we get a consensus on what to call it, I can commit the patch to svn.

Can anyone give it a quick check for vpath builds?

Just an FYI: once already this week I've run into the "downside" I mentioned.
I had to rerun my MTT to get access to the build directory, since it
was in /tmp on some random BigRed compute node.  Hmm... maybe a
feature enhancement would be to copy the build tree back to your regular --scratch if
a build failure was detected?  Maybe later I'll do that as yet another option,
say, --copy-mpi-build-on-failure.  No time this week, but hey, it's an idea.

On Thu, Sep 18, 2008 at 9:10 AM, Jeff Squyres <jsquy...@cisco.com> wrote:
> Patch looks good.  Please also update the CHANGES file (this file reflects
> bullets for things that have happened since the core testers branch).
>
>
> On Sep 15, 2008, at 6:15 PM, Tim Mattox wrote:
>
>> Hello,
>> Attached is a patchfile for the mtt trunk that adds a --local-scratch
>> option to client/mtt.  You can also specify something like
>> this in your [MTT] ini section:
>> local_scratch = ("echo /tmp/`whoami`_mtt")
>>
>> This local-scratch directory is then used for part of the --mpi-install
>> phase to speed up your run.  Specifically, the source code of the
>> MPI is untarred there, and configure, make all, and make check are run there.
>> Then, when make install is invoked, the MPI is installed into the
>> usual place as if you hadn't used --local-scratch.  If you don't
>> use --local-scratch, then the builds occur in the usual place that
>> they always have.
>>
>> For the clusters at IU that seem to have slow NFS home directories,
>> this cuts the --mpi-install phase time in half.
>>
>> The downside is that if the MPI build fails, your build directory is out
>> on some compute node's /tmp and is harder to debug.  But, since
>> MPI build failures are now rare, this should make for quicker turnaround
>> for the general case.
>>
>> I think I adjusted the code properly for the vpath build case, but I've
>> never used that so haven't tested it.  Also, I adjusted the free disk
>> space
>> check code.  Right now it only checks the free space on --scratch,
>> and won't detect if --local-scratch is full.  If people really care, I
>> could make it check both later.  But for now, if your /tmp is full
>> you probably have other problems to worry about.
>>
>> Comments?  Please try it out, and if I get no objections, I'd like
>> to put this into the MTT trunk this week.
>> --
>> Tim Mattox, Ph.D. - http://homepage.mac.com/tmattox/
>> tmat...@gmail.com || timat...@open-mpi.org
>> I'm a bright... http://www.the-brights.net/
>> ___
>> mtt-users mailing list
>> mtt-us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users
>
>
> --
> Jeff Squyres
> Cisco Systems
>
> ___
> mtt-users mailing list
> mtt-us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users
>



-- 
Tim Mattox, Ph.D. - http://homepage.mac.com/tmattox/
 tmat...@gmail.com || timat...@open-mpi.org
 I'm a bright... http://www.the-brights.net/


[MTT users] Patch to add --local-scratch option

2008-09-15 Thread Tim Mattox
Hello,
Attached is a patchfile for the mtt trunk that adds a --local-scratch
option to client/mtt.  You can also specify something like
this in your [MTT] ini section:
local_scratch = ("echo /tmp/`whoami`_mtt")

This local-scratch directory is then used for part of the --mpi-install
phase to speed up your run.  Specifically, the source code of the
MPI is untarred there, and configure, make all, and make check are run there.
Then, when make install is invoked, the MPI is installed into the
usual place as if you hadn't used --local-scratch.  If you don't
use --local-scratch, then the builds occur in the usual place that
they always have.
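
For anyone who prefers the command line over the ini setting, the invocation
would look roughly like this (the ini filename and the scratch paths here are
just placeholders for your own):

  client/mtt --file your-tests.ini --scratch /path/to/nfs/mtt-scratch \
      --local-scratch /tmp/`whoami`_mtt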

For the clusters at IU that seem to have slow NFS home directories,
this cuts the --mpi-install phase time in half.

The downside is that if the MPI build fails, your build directory is out
on some compute node's /tmp and is harder to debug.  But, since
MPI build failures are now rare, this should make for quicker turnaround
for the general case.

I think I adjusted the code properly for the vpath build case, but I've
never used that so haven't tested it.  Also, I adjusted the free disk space
check code.  Right now it only checks the free space on --scratch,
and won't detect if --local-scratch is full.  If people really care, I
could make it check both later.  But for now, if your /tmp is full
you probably have other problems to worry about.

Comments?  Please try it out, and if I get no objections, I'd like
to put this into the MTT trunk this week.
-- 
Tim Mattox, Ph.D. - http://homepage.mac.com/tmattox/
 tmat...@gmail.com || timat...@open-mpi.org
 I'm a bright... http://www.the-brights.net/


mtt-local-scratch.patch
Description: Binary data


Re: [MTT users] Problems running MTT with already installed MPICH-MX

2007-09-28 Thread Tim Mattox
home/pjesa/mtt/scratch2/installs
> > Unique directory: CoY6
> > Making dir: CoY6 (cwd: /home/pjesa/mtt/scratch2/installs)
> > CoY6 does not exist -- creating
> > chdir CoY6/
> > chdir /home/pjesa/mtt/scratch2/installs
> > chdir /home/pjesa/mtt/scratch2/installs/CoY6
> > Value: module
> > Evaluating: MPICH2
> > Replacing vars from section mpi install: mpich-mx: MPICH2
> > Got final version before escapes: MPICH2
> > Returning: MPICH2
> > Value: description
> > Value: description
> > Evaluating: [testbake]
> > Replacing vars from section MTT: [testbake]
> > Got final version before escapes: [testbake]
> > Returning: [testbake]
> > chdir /home/pjesa/mtt/scratch2/installs/CoY6
> > chdir ..
> > chdir /home/pjesa/mtt/scratch2/installs/CoY6
> > Sym linked: CoY6 to mpich-mx#mpich-mx#1.2.7
> > Value: env_module
> > Value: setenv
> > Value: unsetenv
> > Value: prepend_path
> > Value: append_path
> > Value: configure_arguments
> > Value: vpath_mode
> > Value: make_all_arguments
> > Value: make_check
> > Value: compiler_name
> > Value: compiler_version
> > Value: save_stdout_on_success
> > Evaluating: 1
> > Replacing vars from section mpi install: mpich-mx: 1
> > Got final version before escapes: 1
> > Returning: 1
> > Value: merge_stdout_stderr
> > Evaluating: 0
> > Replacing vars from section mpi install: mpich-mx: 0
> > Got final version before escapes: 0
> > Returning: 0
> > Value: stderr_save_lines
> > Value: stdout_save_lines
> > Running command: rm -rf src
> > Command complete, exit status: 0
> > Making dir: src (cwd: /home/pjesa/mtt/scratch2/installs/CoY6)
> > src does not exist -- creating
> > chdir src/
> > chdir /home/pjesa/mtt/scratch2/installs/CoY6
> > chdir /home/pjesa/mtt/scratch2/installs/CoY6/src
> > Evaluating: require MTT::MPI::Get::AlreadyInstalled
> > Evaluating: $ret =
> >::MPI::Get::AlreadyInstalled::PrepareForInstall(@args)
> > chdir /home/pjesa/mtt/scratch2/installs/CoY6/src
> > chdir /home/pjesa/mtt/scratch2/installs/CoY6/src
> > Making dir: /home/pjesa/mtt/scratch2/installs/CoY6/install (cwd:
> >/home/pjesa/mtt/scratch2/installs/CoY6/src)
> > /home/pjesa/mtt/scratch2/installs/CoY6/install does not exist -- creating
> > chdir /home/pjesa/mtt/scratch2/installs/CoY6/install/
> > chdir /home/pjesa/mtt/scratch2/installs/CoY6/src
> > Evaluating: require MTT::MPI::Install::MPICH2
> > Evaluating: $ret = ::MPI::Install::MPICH2::Install(@args)
> > Value: mpich2_make_all_arguments
> > Value: mpich2_compiler_name
> > Value: bitness
> > Evaluating: _mpi_install_bitness("")
> > --> Prefix now:
> > --> Remaining (after &): get_mpi_install_bitness("")
> > --> Found func name: get_mpi_install_bitness
> > --> Found beginning of arguments: "")
> > --> Initial param search: "")
> > --> Loop: trimmed search: "")
> > --> Examining char: " (pos 0)
> > --> Found beginning quote
> > --> Found last quote
> > --> Examining char: ) (pos 2)
> > --> Found end of arg (pos 2)
> > Found argument: ""
> > --> Remainder:
> > --> Calling: $ret = MTT::Values::Functions::get_mpi_install_bitness("");
> > _mpi_intall_bitness
> > &_find_libmpi returning:
> > Couldn't find libmpi!
> > --> After eval(string), remaining: 0
> > Got final version before escapes: 0
> > Returning: 0
> > Value: endian
> > Evaluating: _mpi_install_endian("")
> > --> Prefix now:
> > --> Remaining (after &): get_mpi_install_endian("")
> > --> Found func name: get_mpi_install_endian
> > --> Found beginning of arguments: "")
> > --> Initial param search: "")
> > --> Loop: trimmed search: "")
> > --> Examining char: " (pos 0)
> > --> Found beginning quote
> > --> Found last quote
> > --> Examining char: ) (pos 2)
> > --> Found end of arg (pos 2)
> > Found argument: ""
> > --> Remainder:
> > --> Calling: $ret = MTT::Values::Functions::get_mpi_install_endian("");
> > _mpi_intall_endian
> > &_find_libmpi returning:
> > *** Could not find libmpi to calculate endian-ness
> > --> After eval(string), remaining: 0
> > Got final version before escapes: 0
> > Returning: 0
> > Found whatami: /home/pjesa/mtt/collective-bakeoff/client/whatami/whatami
> > Value: platform_type
> > Value: platform_type
> > Value: platform_hardware
> > Value: platform_hardware
> > Value: os_name
> > Value: os_name
> > Value: os_version
> > Value: os_version
> >Skipped MPI install
> > *** MPI install phase complete
> > >> Phase: MPI Install
> >Started:   Thu Sep 27 22:39:37 2007
> >Stopped:   Thu Sep 27 22:39:38 2007
> >Elapsed:   00:00:01
> >Total elapsed: 00:00:01
> > *** Test get phase starting
> > chdir /home/pjesa/mtt/scratch2/sources
> > >> Test get: [test get: netpipe]
> >Checking for new test sources...
> > Value: module
> > Evaluating: Download
> > Replacing vars from section test get: netpipe: Download
> > Got final version before escapes: Download
> > Returning: Download
> > chdir /home/pjesa/mtt/scratch2/sources
> > Making dir: test_get__netpipe (cwd: /home/pjesa/mtt/scratch2/sources)
> > test_get__netpipe does not exist -- creating
> > chdir test_get__netpipe/
> > chdir /home/pjesa/mtt/scratch2/sources
> > chdir /home/pjesa/mtt/scratch2/sources/test_get__netpipe
> > Evaluating: require MTT::Test::Get::Download
> > Evaluating: $ret = ::Test::Get::Download::Get(@args)
> > Value: download_url
> > Evaluating: http://www.scl.ameslab.gov/netpipe/code/NetPIPE_3.6.2.tar.gz
> > Replacing vars from section test get: netpipe:
> >http://www.scl.ameslab.gov/netpipe/code/NetPIPE_3.6.2.tar.gz
> > Got final version before escapes:
> >http://www.scl.ameslab.gov/netpipe/code/NetPIPE_3.6.2.tar.gz
> > Returning: http://www.scl.ameslab.gov/netpipe/code/NetPIPE_3.6.2.tar.gz
> > >> Download got url:
> >http://www.scl.ameslab.gov/netpipe/code/NetPIPE_3.6.2.tar.gz
> > Value: download_username
> > Value: download_password
> > >> MTT::FindProgram::FindProgram returning /usr/bin/wget
> > Running command: wget -nv
> >http://www.scl.ameslab.gov/netpipe/code/NetPIPE_3.6.2.tar.gz
> > OUT:22:39:55
> >URL:http://www.scl.ameslab.gov/netpipe/code/NetPIPE_3.6.2.tar.gz
> >[369585/369585] -> "NetPIPE_3.6.2.tar.gz" [1]
> > Command complete, exit status: 0
> > Value: download_version
> > >> Download complete
> >Got new test sources
> > *** Test get phase complete
> > >> Phase: Test Get
> >Started:   Thu Sep 27 22:39:38 2007
> >Stopped:   Thu Sep 27 22:39:55 2007
> >Elapsed:   00:00:17
> >Total elapsed: 00:00:18
> > *** Test build phase starting
> > chdir /home/pjesa/mtt/scratch2/installs
> > >> Test build [test build: netpipe]
> > Value: test_get
> > Evaluating: netpipe
> > Replacing vars from section test build: netpipe: netpipe
> > Got final version before escapes: netpipe
> > Returning: netpipe
> > *** Test build phase complete
> > >> Phase: Test Build
> >Started:   Thu Sep 27 22:39:55 2007
> >Stopped:   Thu Sep 27 22:39:55 2007
> >Elapsed:   00:00:00
> >Total elapsed: 00:00:18
> > *** Run test phase starting
> > >> Test run [netpipe]
> > Value: test_build
> > Evaluating: netpipe
> > Replacing vars from section test run: netpipe: netpipe
> > Got final version before escapes: netpipe
> > Returning: netpipe
> > *** Run test phase complete
> > >> Phase: Test Run
> >Started:   Thu Sep 27 22:39:55 2007
> >Stopped:   Thu Sep 27 22:39:55 2007
> >Elapsed:   00:00:00
> >Total elapsed: 00:00:18
> > >> Phase: Trim
> >Started:   Thu Sep 27 22:39:55 2007
> >Stopped:   Thu Sep 27 22:39:55 2007
> >Elapsed:   00:00:00
> >Total elapsed: 00:00:18
> > *** Reporter finalizing
> > Evaluating: require MTT::Reporter::MTTDatabase
> > Evaluating: $ret = ::Reporter::MTTDatabase::Finalize(@args)
> > Evaluating: require MTT::Reporter::TextFile
> > Evaluating: $ret = ::Reporter::TextFile::Finalize(@args)
> > *** Reporter finalized
>
> > ___
> > mtt-users mailing list
> > mtt-us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users
>
> ___
> mtt-users mailing list
> mtt-us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users
>


-- 
Tim Mattox, Ph.D. - http://homepage.mac.com/tmattox/
 tmat...@gmail.com || timat...@open-mpi.org
I'm a bright... http://www.the-brights.net/


[MTT users] Recent OMPI Trunk fails MPI_Allgatherv_* MTT tests

2007-04-01 Thread Tim Mattox

Hi All,
I just checked the recent nightly MTT results and found two things of note,
one for the MTT community, the other for the OMPI developers.

For both, see http://www.open-mpi.org/mtt/reporter.php?do_redir=143
for details of the failed MTT tests with the OMPI trunk at r14180.

1) For MTT developers:
The MTT intel test suite is incorrectly seeing a failed MPI_Allgatherv_f
test as passed, yet is correctly detecting that the MPI_Allgatherv_c
test is failing.
The STDOUT from the "passed" MPI_Allgatherv_f run seems to indicate that the test
actually failed in a similar way to the _c version, but MTT thinks it passed.
I've not had time to diagnose why MTT is missing this...  anyone else have
some spare cycles to look at this?

2) For OMPI developers:
The MPI_Allgatherv_* tests are failing as of r14180 in all test conditions
on the IU machines (and others), yet they passed the night before with r14172.

Looking at the svn log for r#'s r14173 thru r14180, I can narrow it down to
one of these changes as the culprit:
https://svn.open-mpi.org/trac/ompi/changeset/14180
https://svn.open-mpi.org/trac/ompi/changeset/14179
https://svn.open-mpi.org/trac/ompi/changeset/14174 (Not likely)

My money is on the much larger r14180 changeset.
The other r#'s aren't culprits for obvious reasons.
--
Tim Mattox - http://homepage.mac.com/tmattox/
tmat...@gmail.com || timat...@open-mpi.org
   I'm a bright... http://www.the-brights.net/


Re: [MTT users] Minor bug found in MTT 2 client side.

2007-01-19 Thread Tim Mattox

Hi All,

On 1/18/07, Jeff Squyres <jsquy...@cisco.com> wrote:

On Jan 18, 2007, at 10:37 PM, Tim Mattox wrote:

[snip description of a newline bug]


Fixed -- none have newlines now, so they'll all be in the one-line
output format.


Thanks.


> I don't know if it is related or not, but for tests that fail without
> timing out,
> the debug output from MTT for that test does NOT have a line like
> these:
> test_result: 1  (passed)
> test_result: 2  (skipped)
> test_result: 3  (timed out)

Are you sure?  Lines 80-84 of Correctness.pm are:

 if ($results->{timed_out}) {
 Warning("$str TIMED OUT (failed)\n");
 } else {
 Warning("$str FAILED\n");
 }

Are you talking about some other output?  Or are you asking for
something in (parentheses)?


Sorry, I wasn't clear.  The current output for each test in the debug file
usually includes a line "test_result: X" with X replaced by a number.
However, for tests that fail outright, this line is missing.  This missing
line happened to correspond to the tests that had a newline in the result
message that I discussed (snipped) above.

Please don't put in the parenthetical notes.  That was just me commenting
on which number meant what.



If you're in the middle of revamping your parser to match the MTT 2.0
output, I might suggest that it would be far easier to just
incorporate your desired output into MTT itself, for two reasons:

1. the debug output can change at any time; it was meant to be for
debugging, not necessarily for screen scraping.


Point taken.


2. there would be no need for screen scraping/parsing; you would have
the data immediately available and all you have to do is output it
into the format that you want.  We should be able to accommodate
whatever you need via an MTT Reporter plugin.  I'm guessing this
should be pretty straightforward...?


Where can I find some documentation for or examples of a plugin?


--
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems



--
Tim Mattox - http://homepage.mac.com/tmattox/
tmat...@gmail.com || timat...@open-mpi.org
   I'm a bright... http://www.the-brights.net/


[MTT users] Minor bug found in MTT 2 client side.

2007-01-18 Thread Tim Mattox

Hi MTT developers,
(Sorry to those who are just MTT users, you can skip this message).

I found some minor bugs/inconveniences in lib/MTT/Test/Analyze/Correctness.pm.

It is not consistent about whether "$report->{result_message}" gets assigned
a value with an embedded newline.  For example, at lines 93 & 96
a newline is embedded, yet at lines 72, 76, 87-88 the string is ended
without a newline.  For our purposes at IU, which include a local e-mail
generation script, it would be great if those \n newlines could be removed
from lines 93 & 96, so we could parse the MTT debug output more easily.

As it is right now, the result message for a test is reported in two very
distinct ways, depending on how a test passes or fails:
1) ugly format from failed tests:
RESULT_MESSAGE_BEGIN
Failed; exit status: 139
RESULT_MESSAGE_END

2) preferred format:
result_message: Failed; timeout expired (120 seconds) )
or
result_message: Passed
or
result_message: Skipped

I don't know if it is related or not, but for tests that fail without
timing out,
the debug output from MTT for that test does NOT have a line like
these:
test_result: 1  (passed)
test_result: 2  (skipped)
test_result: 3  (timed out)

Again, for our e-mail generation script, it would be much easier if there
was a corresponding "test_result: X" line for each test regardless of whether it failed,
timed out, was skipped, or passed.
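
To make the request concrete, here is a minimal sketch of the kind of
single-pass parsing a consistent one-line format would allow (hypothetical
code, not our actual script):

  # Hypothetical sketch: tally one-line result entries from MTT debug output.
  my (%count_by_result, $last_message);
  while (my $line = <>) {
      if ($line =~ /^test_result:\s*(\d+)/) {
          $count_by_result{$1}++;    # e.g. 1 = passed, 2 = skipped, 3 = timed out
      } elsif ($line =~ /^result_message:\s*(.*)/) {
          $last_message = $1;        # whole message on one line, no embedded newline
      }
  }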

Thanks!
--
Tim Mattox - http://homepage.mac.com/tmattox/
tmat...@gmail.com || timat...@open-mpi.org
   I'm a bright... http://www.the-brights.net/


Re: [MTT users] [devel-core] MTT 2.0 tutorial teleconference

2007-01-04 Thread Tim Mattox

I'll be there for the call on Tuesday.
We are looking forward to switching IU to MTT 2.0
The new report/results pages are great!
--
Tim Mattox - http://homepage.mac.com/tmattox/
tmat...@gmail.com || timat...@open-mpi.org
   I'm a bright... http://www.the-brights.net/


[MTT users] Ignore trunk failures on thor from last night

2006-12-11 Thread Tim Mattox

Hello Ethan & others,
I was expanding our testing with the thor cluster this weekend,
and discovered this morning that one of its nodes has a faulty
Myrinet card or configuration.

So, please remove or ignore the trunk failures from last night &
early today on IU's thor cluster.  I've excluded the faulty node,
and am rerunning the tests.
--
Tim Mattox - http://homepage.mac.com/tmattox/
tmat...@gmail.com || timat...@open-mpi.org
   I'm a bright... http://www.the-brights.net/


Re: [MTT users] New MTT home page

2006-11-10 Thread Tim Mattox

Hi Ethan,
These look great!  Can you add one more column of choices with the
heading "Failed Test Runs"?  It would be the same as "Test Runs",
but without the entries that had zero failures.

If it wouldn't be too much trouble, could you also add a "Past 48
Hours" section?  That is lower priority than adding a "Failed Test
Runs" column, though.

Thanks!

On 11/10/06, Ethan Mallove <ethan.mall...@sun.com> wrote:

Folks,

The MTT home page has been updated to display a number of
quick links divided up by date range, organization, and
testing phase (MPI installation, test build, and test run).
The custom and summary reports are linked from there.

http://www.open-mpi.org/mtt/

--
-Ethan
___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users




--
Tim Mattox - http://homepage.mac.com/tmattox/
tmat...@gmail.com || timat...@open-mpi.org
   I'm a bright... http://www.the-brights.net/


Re: [MTT users] nightly OMPI tarballs

2006-11-08 Thread Tim Mattox

It would help our MTT runs here at IU as well if the tarball generation were a
little earlier each day.  9pm Indiana/Eastern time would be good, I think.
That would make it 6pm West coast time...  Does that work for
the West coasters?  Or should we do 10pm Eastern/7pm West?
Gleb, George, hpcstork, as three that I know do svn commits outside the typical
US workday, how would this affect you?

Maybe we could make the non-trunk tarballs even earlier, since
the gatekeepers would know when we were "done for the day".
What time would Sun need to have the 1.2 tarballs ready
for them to do their MTT runs?  7pm Eastern?

I can work on making the tarball generation go more quickly,
but I suspect I can't get it reliably faster than 1 hour, especially if
we have changes on all three branches (trunk, v1.1, v1.2).
I have some ideas, though, on how to speed it up from its
current 2-hour span.  One of the ideas is to have the v1.2 (and maybe v1.1)
tarballs be built earlier, so that we only have one tarball to build
at the designated time.

As for doing multiple builds per day, I am a bit opposed to doing that
on a regular basis, for two reasons:
1) It takes time & resources (both human and computer) per tarball
to run the testing and to look at the results.  One set per
day seems to be what we as a group can currently handle.
2) If we have different groups testing from different tarball sets,
then it would become harder to aggregate the testing results,
since we would not necessarily be testing the same tarball.

On 11/8/06, Jeff Squyres <jsquy...@cisco.com> wrote:

I'm wondering if it's worthwhile to either a) move back the nightly
tarball generation to, say, 9pm US Indiana time or b) perhaps make
the tarballs at multiple times during the day.

Since we're doing more and more testing, it seems like we need more
time to do it before the 9am reports.  Right now, we're pretty
limited to starting at about 2am (to guarantee that the tarballs have
finished building).  If you start before then, you could be testing a
tarball that's about a day old.

This was happening to Sun, for example, who (I just found out) starts
their testing at 7pm because they have limited time and access to
resources (starting at 7pm lets them finish all their testing by 9am).

So what do people think about my proposals from above?  Either 9pm,
or perhaps make them every 6 hours throughout the day.

--
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems

___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users




--
Tim Mattox - http://homepage.mac.com/tmattox/
tmat...@gmail.com || timat...@open-mpi.org
   I'm a bright... http://www.the-brights.net/