Re: [MTT users] MTT username/password and report upload

2018-03-14 Thread Josh Hursey
I am also interested in some help moving to the Python client. Since a few
of us who are interested will be at the face-to-face next week, could Howard
(or someone familiar with setting it up for OMPI testing) give us a
presentation on it? I'll add it to the agenda. The initial hurdle of getting
started has kept me from taking the leap, but maybe next week is a good
opportunity to do so.

On Tue, Mar 13, 2018 at 9:14 PM, Kawashima, Takahiro <
t-kawash...@jp.fujitsu.com> wrote:

> Jeff,
>
> Thank you. I received the password. I cannot remember whether I had
> received it before...
>
> My colleague had been working with the Perl client, but that work was
> suspended when his job changed. That is why we currently use the Perl
> client. We want to switch to the Python client if the switch does not
> require much work.
>
> Takahiro Kawashima,
> MPI development team,
> Fujitsu
>
> > Yes, it's trivial to reset the Fujitsu MTT password -- I'll send you a
> mail off-list with the new password.
> >
> > If you're just starting up with MTT, you might want to use the Python
> client, instead. That's where 95% of ongoing development is occurring.
> >
> > If all goes well, I plan to sit down with Howard + Ralph next week and
> try converting Cisco's Perl config to use the Python client.
> >
> >
> > > On Mar 12, 2018, at 10:23 PM, Kawashima, Takahiro <
> t-kawash...@jp.fujitsu.com> wrote:
> > >
> > > Hi,
> > >
> > > Fujitsu has resumed the work of running MTT on our machines.
> > >
> > > Could you give me a username/password to upload reports to the server?
> > > I suppose the username is "fujitsu".
> > >
> > > Our machines are not connected to the Internet directly.
> > > Is there a document on uploading reports outside of an MTT run?
> > > We are currently using the Perl client.
> > >
> > > I'll attend the OMPI developer's meeting next week.
> > > I hope we can talk about it.
> > >
> > > Takahiro Kawashima,
> > > MPI development team,
> > > Fujitsu
>
> ___
> mtt-users mailing list
> mtt-users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/mtt-users
>



-- 
Josh Hursey
IBM Spectrum MPI Developer
___
mtt-users mailing list
mtt-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/mtt-users

Re: [MTT users] MTT Server Downtime - Fri., Oct. 21, 2016 (Updated)

2016-10-22 Thread Josh Hursey
MTT Service has been successfully moved to the AWS servers.

Everything looks good so far, so folks can start using it. Note that the
DNS record had to be updated. This -should- have propagated everywhere by
now. I had to clear my browser cache to get it to resolve correctly the
first time.

This move requires -no- changes to any of your MTT client setups.

Let me know if you have any issues.

-- Josh


On Fri, Oct 21, 2016 at 9:53 PM, Josh Hursey <jjhur...@open-mpi.org> wrote:

> I have taken down the MTT Reporter at mtt.open-mpi.org while we finish up
> the migration. I'll send out another email when everything is up and
> running again.
>
> On Fri, Oct 21, 2016 at 10:17 AM, Josh Hursey <jjhur...@open-mpi.org>
> wrote:
>
>> Reminder that the MTT will go offline starting at *Noon US Eastern (11
>> am US Central) today*.
>>
>> Any MTT client submissions to the MTT database will return in error
>> during this window of downtime. I will try to keep the MTT Reporter
>> interface as available as possible (although permalinks will not be
>> available) at the normal URL.
>>   https://mtt.open-mpi.org
>> However, there will be a time when that will go down as well. I'll send a
>> note when that occurs.
>>
>> I will send another email once MTT is back online.
>>
>> Thank you for your patience. Let me know if you have any questions.
>>
>> -- Josh
>>
>>
>> On Wed, Oct 19, 2016 at 10:14 AM, Josh Hursey <jjhur...@open-mpi.org>
>> wrote:
>>
>>> Based on current estimates we need to extend the window of downtime for
>>> MTT to 24 hours.
>>>
>>> *Start time*: *Fri., Oct. 21, 2016 at Noon US Eastern* (11 am US
>>> Central)
>>> *End time*: *Sat., Oct. 22, 2016 at Noon US Eastern* (estimated)
>>>
>>> I will send an email just before taking down the MTT site on Friday, and
>>> another once it is back up on Sat.
>>>
>>> During this time all of the MTT services will be down - MTT Reporter and
>>> MTT submission interface. If you have an MTT client running during this
>>> time you will receive an error message if you try to submit results to the
>>> MTT server.
>>>
>>> Let me know if you have any questions or concerns.
>>>
>>>
>>> On Tue, Oct 18, 2016 at 10:59 AM, Josh Hursey <jjhur...@open-mpi.org>
>>> wrote:
>>>
>>>> We are moving this downtime to *Friday, Oct. 21 from 2-5 pm US Eastern*
>>>> .
>>>>
>>>> We hit a snag with the AWS configuration that we are working through.
>>>>
>>>> On Sun, Oct 16, 2016 at 9:53 AM, Josh Hursey <jjhur...@open-mpi.org>
>>>> wrote:
>>>>
>>>>> I will announce this on the Open MPI developer's teleconf on Tuesday,
>>>>> before the move.
>>>>>
>>>>> Geoff - Please add this item to the agenda.
>>>>>
>>>>>
>>>>> Short version:
>>>>> ---
>>>>> MTT server (mtt.open-mpi.org) will be going down for maintenance on
>>>>> Tuesday, Oct. 18, 2016 from 2-5 pm US Eastern. During this time the MTT
>>>>> Reporter and the MTT client submission interface will not be accessible. I
>>>>> will send an email out when the service is back online.
>>>>>
>>>>>
>>>>> Longer version:
>>>>> ---
>>>>> We need to move the MTT Server/Database from the IU server to the AWS
>>>>> server. This move will be completely transparent to users submitting to 
>>>>> the
>>>>> database, except for a window of downtime to move the database.
>>>>>
>>>>> I estimate that moving the database will take about two hours. So I
>>>>> have blocked off three hours to give us time to test, and redirect the DNS
>>>>> record.
>>>>>
>>>>> Once the service comes back online, you should be able to access MTT
>>>>> using the mtt.open-mpi.org URL. No changes are needed in your MTT
>>>>> client setup, and all permalinks are expected to still work after the 
>>>>> move.
>>>>>
>>>>>
>>>>> Let me know if you have any questions or concerns about the move.
>>>>>
>>>>>
>>>>> --
>>>>> Josh Hursey
>>>>> IBM Spectrum MPI Developer
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Josh Hursey
>>>> IBM Spectrum MPI Developer
>>>>
>>>
>>>
>>>
>>> --
>>> Josh Hursey
>>> IBM Spectrum MPI Developer
>>>
>>
>>
>>
>> --
>> Josh Hursey
>> IBM Spectrum MPI Developer
>>
>
>
>
> --
> Josh Hursey
> IBM Spectrum MPI Developer
>



-- 
Josh Hursey
IBM Spectrum MPI Developer
___
mtt-users mailing list
mtt-users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/mtt-users

Re: [MTT users] MTT Server Downtime - Fri., Oct. 21, 2016 (Updated)

2016-10-21 Thread Josh Hursey
I have taken down the MTT Reporter at mtt.open-mpi.org while we finish up
the migration. I'll send out another email when everything is up and
running again.

On Fri, Oct 21, 2016 at 10:17 AM, Josh Hursey <jjhur...@open-mpi.org> wrote:

> Reminder that the MTT will go offline starting at *Noon US Eastern (11 am
> US Central) today*.
>
> Any MTT client submissions to the MTT database will return in error during
> this window of downtime. I will try to keep the MTT Reporter interface as
> available as possible (although permalinks will not be available) at the
> normal URL.
>   https://mtt.open-mpi.org
> However, there will be a time when that will go down as well. I'll send a
> note when that occurs.
>
> I will send another email once MTT is back online.
>
> Thank you for your patience. Let me know if you have any questions.
>
> -- Josh
>
>
> On Wed, Oct 19, 2016 at 10:14 AM, Josh Hursey <jjhur...@open-mpi.org>
> wrote:
>
>> Based on current estimates we need to extend the window of downtime for
>> MTT to 24 hours.
>>
>> *Start time*: *Fri., Oct. 21, 2016 at Noon US Eastern* (11 am US Central)
>> *End time*: *Sat., Oct. 22, 2016 at Noon US Eastern* (estimated)
>>
>> I will send an email just before taking down the MTT site on Friday, and
>> another once it is back up on Sat.
>>
>> During this time all of the MTT services will be down - MTT Reporter and
>> MTT submission interface. If you have an MTT client running during this
>> time you will receive an error message if you try to submit results to the
>> MTT server.
>>
>> Let me know if you have any questions or concerns.
>>
>>
>> On Tue, Oct 18, 2016 at 10:59 AM, Josh Hursey <jjhur...@open-mpi.org>
>> wrote:
>>
>>> We are moving this downtime to *Friday, Oct. 21 from 2-5 pm US Eastern*.
>>>
>>> We hit a snag with the AWS configuration that we are working through.
>>>
>>> On Sun, Oct 16, 2016 at 9:53 AM, Josh Hursey <jjhur...@open-mpi.org>
>>> wrote:
>>>
>>>> I will announce this on the Open MPI developer's teleconf on Tuesday,
>>>> before the move.
>>>>
>>>> Geoff - Please add this item to the agenda.
>>>>
>>>>
>>>> Short version:
>>>> ---
>>>> MTT server (mtt.open-mpi.org) will be going down for maintenance on
>>>> Tuesday, Oct. 18, 2016 from 2-5 pm US Eastern. During this time the MTT
>>>> Reporter and the MTT client submission interface will not be accessible. I
>>>> will send an email out when the service is back online.
>>>>
>>>>
>>>> Longer version:
>>>> ---
>>>> We need to move the MTT Server/Database from the IU server to the AWS
>>>> server. This move will be completely transparent to users submitting to the
>>>> database, except for a window of downtime to move the database.
>>>>
>>>> I estimate that moving the database will take about two hours. So I
>>>> have blocked off three hours to give us time to test, and redirect the DNS
>>>> record.
>>>>
>>>> Once the service comes back online, you should be able to access MTT
>>>> using the mtt.open-mpi.org URL. No changes are needed in your MTT
>>>> client setup, and all permalinks are expected to still work after the move.
>>>>
>>>>
>>>> Let me know if you have any questions or concerns about the move.
>>>>
>>>>
>>>> --
>>>> Josh Hursey
>>>> IBM Spectrum MPI Developer
>>>>
>>>
>>>
>>>
>>> --
>>> Josh Hursey
>>> IBM Spectrum MPI Developer
>>>
>>
>>
>>
>> --
>> Josh Hursey
>> IBM Spectrum MPI Developer
>>
>
>
>
> --
> Josh Hursey
> IBM Spectrum MPI Developer
>



-- 
Josh Hursey
IBM Spectrum MPI Developer
___
mtt-users mailing list
mtt-users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/mtt-users

Re: [MTT users] MTT Server Downtime - Fri., Oct. 21, 2016 (Updated)

2016-10-21 Thread Josh Hursey
Reminder that the MTT will go offline starting at *Noon US Eastern (11 am
US Central) today*.

Any MTT client submissions to the MTT database will return in error during
this window of downtime. I will try to keep the MTT Reporter interface as
available as possible (although permalinks will not be available) at the
normal URL.
  https://mtt.open-mpi.org
However, there will be a time when that will go down as well. I'll send a
note when that occurs.

I will send another email once MTT is back online.

Thank you for your patience. Let me know if you have any questions.

-- Josh


On Wed, Oct 19, 2016 at 10:14 AM, Josh Hursey <jjhur...@open-mpi.org> wrote:

> Based on current estimates we need to extend the window of downtime for
> MTT to 24 hours.
>
> *Start time*: *Fri., Oct. 21, 2016 at Noon US Eastern* (11 am US Central)
> *End time*: *Sat., Oct. 22, 2016 at Noon US Eastern* (estimated)
>
> I will send an email just before taking down the MTT site on Friday, and
> another once it is back up on Sat.
>
> During this time all of the MTT services will be down - MTT Reporter and
> MTT submission interface. If you have an MTT client running during this
> time you will receive an error message if you try to submit results to the
> MTT server.
>
> Let me know if you have any questions or concerns.
>
>
> On Tue, Oct 18, 2016 at 10:59 AM, Josh Hursey <jjhur...@open-mpi.org>
> wrote:
>
>> We are moving this downtime to *Friday, Oct. 21 from 2-5 pm US Eastern*.
>>
>> We hit a snag with the AWS configuration that we are working through.
>>
>> On Sun, Oct 16, 2016 at 9:53 AM, Josh Hursey <jjhur...@open-mpi.org>
>> wrote:
>>
>>> I will announce this on the Open MPI developer's teleconf on Tuesday,
>>> before the move.
>>>
>>> Geoff - Please add this item to the agenda.
>>>
>>>
>>> Short version:
>>> ---
>>> MTT server (mtt.open-mpi.org) will be going down for maintenance on
>>> Tuesday, Oct. 18, 2016 from 2-5 pm US Eastern. During this time the MTT
>>> Reporter and the MTT client submission interface will not be accessible. I
>>> will send an email out when the service is back online.
>>>
>>>
>>> Longer version:
>>> ---
>>> We need to move the MTT Server/Database from the IU server to the AWS
>>> server. This move will be completely transparent to users submitting to the
>>> database, except for a window of downtime to move the database.
>>>
>>> I estimate that moving the database will take about two hours. So I have
>>> blocked off three hours to give us time to test, and redirect the DNS
>>> record.
>>>
>>> Once the service comes back online, you should be able to access MTT
>>> using the mtt.open-mpi.org URL. No changes are needed in your MTT client
>>> setup, and all permalinks are expected to still work after the move.
>>>
>>>
>>> Let me know if you have any questions or concerns about the move.
>>>
>>>
>>> --
>>> Josh Hursey
>>> IBM Spectrum MPI Developer
>>>
>>
>>
>>
>> --
>> Josh Hursey
>> IBM Spectrum MPI Developer
>>
>
>
>
> --
> Josh Hursey
> IBM Spectrum MPI Developer
>



-- 
Josh Hursey
IBM Spectrum MPI Developer
___
mtt-users mailing list
mtt-users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/mtt-users

Re: [MTT users] MTT Server Downtime - Fri., Oct. 21, 2016 (Updated)

2016-10-19 Thread Josh Hursey
Based on current estimates we need to extend the window of downtime for MTT
to 24 hours.

*Start time*: *Fri., Oct. 21, 2016 at Noon US Eastern* (11 am US Central)
*End time*: *Sat., Oct. 22, 2016 at Noon US Eastern* (estimated)

I will send an email just before taking down the MTT site on Friday, and
another once it is back up on Sat.

During this time all of the MTT services will be down - MTT Reporter and
MTT submission interface. If you have an MTT client running during this
time you will receive an error message if you try to submit results to the
MTT server.

Let me know if you have any questions or concerns.


On Tue, Oct 18, 2016 at 10:59 AM, Josh Hursey <jjhur...@open-mpi.org> wrote:

> We are moving this downtime to *Friday, Oct. 21 from 2-5 pm US Eastern*.
>
> We hit a snag with the AWS configuration that we are working through.
>
> On Sun, Oct 16, 2016 at 9:53 AM, Josh Hursey <jjhur...@open-mpi.org>
> wrote:
>
>> I will announce this on the Open MPI developer's teleconf on Tuesday,
>> before the move.
>>
>> Geoff - Please add this item to the agenda.
>>
>>
>> Short version:
>> ---
>> MTT server (mtt.open-mpi.org) will be going down for maintenance on
>> Tuesday, Oct. 18, 2016 from 2-5 pm US Eastern. During this time the MTT
>> Reporter and the MTT client submission interface will not be accessible. I
>> will send an email out when the service is back online.
>>
>>
>> Longer version:
>> ---
>> We need to move the MTT Server/Database from the IU server to the AWS
>> server. This move will be completely transparent to users submitting to the
>> database, except for a window of downtime to move the database.
>>
>> I estimate that moving the database will take about two hours. So I have
>> blocked off three hours to give us time to test, and redirect the DNS
>> record.
>>
>> Once the service comes back online, you should be able to access MTT
>> using the mtt.open-mpi.org URL. No changes are needed in your MTT client
>> setup, and all permalinks are expected to still work after the move.
>>
>>
>> Let me know if you have any questions or concerns about the move.
>>
>>
>> --
>> Josh Hursey
>> IBM Spectrum MPI Developer
>>
>
>
>
> --
> Josh Hursey
> IBM Spectrum MPI Developer
>



-- 
Josh Hursey
IBM Spectrum MPI Developer
___
mtt-users mailing list
mtt-users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/mtt-users

Re: [MTT users] MTT Server Downtime - Tues., Oct. 18, 2016

2016-10-18 Thread Josh Hursey
We are moving this downtime to *Friday, Oct. 21 from 2-5 pm US Eastern*.

We hit a snag with the AWS configuration that we are working through.

On Sun, Oct 16, 2016 at 9:53 AM, Josh Hursey <jjhur...@open-mpi.org> wrote:

> I will announce this on the Open MPI developer's teleconf on Tuesday,
> before the move.
>
> Geoff - Please add this item to the agenda.
>
>
> Short version:
> ---
> MTT server (mtt.open-mpi.org) will be going down for maintenance on
> Tuesday, Oct. 18, 2016 from 2-5 pm US Eastern. During this time the MTT
> Reporter and the MTT client submission interface will not be accessible. I
> will send an email out when the service is back online.
>
>
> Longer version:
> ---
> We need to move the MTT Server/Database from the IU server to the AWS
> server. This move will be completely transparent to users submitting to the
> database, except for a window of downtime to move the database.
>
> I estimate that moving the database will take about two hours. So I have
> blocked off three hours to give us time to test, and redirect the DNS
> record.
>
> Once the service comes back online, you should be able to access MTT using
> the mtt.open-mpi.org URL. No changes are needed in your MTT client setup,
> and all permalinks are expected to still work after the move.
>
>
> Let me know if you have any questions or concerns about the move.
>
>
> --
> Josh Hursey
> IBM Spectrum MPI Developer
>



-- 
Josh Hursey
IBM Spectrum MPI Developer
___
mtt-users mailing list
mtt-users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/mtt-users

[MTT users] MTT Server Downtime - Tues., Oct. 18, 2016

2016-10-16 Thread Josh Hursey
I will announce this on the Open MPI developer's teleconf on Tuesday,
before the move.

Geoff - Please add this item to the agenda.


Short version:
---
MTT server (mtt.open-mpi.org) will be going down for maintenance on
Tuesday, Oct. 18, 2016 from 2-5 pm US Eastern. During this time the MTT
Reporter and the MTT client submission interface will not be accessible. I
will send an email out when the service is back online.


Longer version:
---
We need to move the MTT Server/Database from the IU server to the AWS
server. This move will be completely transparent to users submitting to the
database, except for a window of downtime to move the database.

I estimate that moving the database will take about two hours. So I have
blocked off three hours to give us time to test, and redirect the DNS
record.

Once the service comes back online, you should be able to access MTT using
the mtt.open-mpi.org URL. No changes are needed in your MTT client setup,
and all permalinks are expected to still work after the move.


Let me know if you have any questions or concerns about the move.


-- 
Josh Hursey
IBM Spectrum MPI Developer
___
mtt-users mailing list
mtt-users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/mtt-users

Re: [MTT users] Open MPI - MTT perf

2016-09-01 Thread Josh Hursey
The MTT Perl client (./client/mtt) has some performance gathering and
reporting built in as modules. These modules know how to break down the
results from, say, NetPipe and package them up to send to the MTT Server for
storage. The available modules are here:

https://github.com/open-mpi/mtt/tree/master/lib/MTT/Test/Analyze/Performance
There are a couple of examples in this sample file showing how to set it up
in your nightly runs (links to NetPipe below, but there are others in the
file); a configuration sketch follows the links:

https://github.com/open-mpi/mtt/blob/master/samples/perl/ompi-core-template.ini#L458-L462

https://github.com/open-mpi/mtt/blob/master/samples/perl/ompi-core-template.ini#L532-L541

https://github.com/open-mpi/mtt/blob/master/samples/perl/ompi-core-template.ini#L692-L704
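
For orientation only, a minimal sketch of what such a "Test run" section can
look like in a Perl-client .ini. The section name is arbitrary and the
analyze_module key name is my recollection of the linked template, so verify
both against ompi-core-template.ini before using it:

  # Hypothetical sketch: attach the NetPipe performance-analysis module to
  # a "Test run" section. All other required keys (build/launch settings,
  # np, and so on) come from the linked template.
  [Test run: netpipe]
  analyze_module = NetPipe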

The MTT Reporter will display the performance in the Test Run results
column.
  https://mtt.open-mpi.org/index.php?do_redir=2349


Currently, the report generated by MTT could use some work. I actually
don't know if the Performance button works any more.

We started brainstorming ways to improve this aspect of MTT at the Open MPI
developers meeting in Aug. 2016. We are just getting started on that. If
you are interested in participating in those discussions let us know.
Otherwise we will try to keep the list updated on any new performance
features.


On Thu, Sep 1, 2016 at 2:01 AM, Christoph Niethammer <nietham...@hlrs.de>
wrote:

> Hi Jeff,
>
> I noticed that there is a perf field in the MTT database.
> Unfortunately I could not find anything about it in the wiki.
> Can anyone give me a link or some information about it?
>
> How does one report results there? And can one automatically detect
> degradations / compare results across versions?
>
> Best
> Christoph Niethammer
>
> --
>
> Christoph Niethammer
> High Performance Computing Center Stuttgart (HLRS)
> Nobelstrasse 19
> 70569 Stuttgart
>
> Tel: ++49(0)711-685-87203
> email: nietham...@hlrs.de
> http://www.hlrs.de/people/niethammer
> ___
> mtt-users mailing list
> mtt-users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/mtt-users
>



-- 
Josh Hursey
IBM Spectrum MPI Developer
___
mtt-users mailing list
mtt-users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/mtt-users

Re: [MTT users] Pyclient fails to report to IU database

2016-08-10 Thread Josh Hursey
This should be the correct (current) url:
  https://mtt.open-mpi.org/submit/cpy/api/

We might want to change it in the future, but the following is just for the
perl client:
  https://mtt.open-mpi.org/submit/

It looks like the service at IU has been down since May 13. I just
restarted it so you should be fine now. It should have auto-restarted, so
we'll have to look into why that didn't happen when we move it over to the
new server next week.
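
For reference, here is Ralph's Reporter stage from the quoted message below
with only the url changed to the cpy/api endpoint named above (every other
value copied verbatim from his .ini; untested sketch):

  [Reporter:IUdatabase]
  plugin = IUDatabase

  realm = OMPI
  username = intel
  pwfile = /home/common/mttpwd.txt
  platform = bend-rsh
  hostname = rhc00[1-2]
  url = https://mtt.open-mpi.org/submit/cpy/api/
  email = r...@open-mpi.org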


On Wed, Aug 10, 2016 at 6:14 PM, r...@open-mpi.org <r...@open-mpi.org> wrote:

> Hi folks
>
> I hit the following error when trying to upload results from Pyclient to
> IU Reporter:
>
> <<<<<<< Response -->>>>>>
> Result: 404: text/html; charset=iso-8859-1
> {'date': 'Wed, 10 Aug 2016 23:07:03 GMT', 'content-length': '296',
> 'content-type': 'text/html; charset=iso-8859-1', 'connection': 'close',
> 'server': 'Apache/2.2.15 (Red Hat)'}
> Not Found
> <<<<<<< Raw Output (Start) >>>>>>
> 404 Not Found
> The requested URL /submit//submit was not found on this server.
> Apache/2.2.15 (Red Hat) Server at mtt.open-mpi.org Port 443
>
> <<<<<<< Raw Output (End  ) >>>>>>
>
> Here was my .ini stage:
>
> [Reporter:IUdatabase]
> plugin = IUDatabase
>
> realm = OMPI
> username = intel
> pwfile = /home/common/mttpwd.txt
> platform = bend-rsh
> hostname = rhc00[1-2]
> url = https://mtt.open-mpi.org/submit/
> email = r...@open-mpi.org
>
> #------
>
> Is the URL incorrect?
> Ralph
>
> ___
> mtt-users mailing list
> mtt-users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/mtt-users
>



-- 
Josh Hursey
IBM Spectrum MPI Developer
___
mtt-users mailing list
mtt-users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/mtt-users

Re: [MTT users] Python client

2015-12-13 Thread Josh Hursey
I think this is fine. If we do start to organize ourselves for a formal
release then we might want to move to pull requests to keep the branch
stable for a bit, but for now this is ok with me.

The Python client looks like it will be a nice addition. Hopefully, I will
have the REST submission interface finished in the next couple months and
that will make it easier to submit to the DB. I probably won't get cycles
to finish that work until after Jan 12. I plan to have it done before the
end of Jan - let me know if you need it before then.


On Thu, Dec 10, 2015 at 1:10 PM, Ralph Castain  wrote:

> Hey folks
>
> I'm working on the Python client and it is coming along pretty well. The
> code is completely separate from the Perl-based client and doesn't interact
> with it, so I would like to push it into the repo on an on-going basis so
> others can look at it and comment as I go rather than hold it until it is
> "complete".
>
> Any objections? Obviously, it won't be fully functional at times - mostly
> available for architectural review and directional suggestions.
>
> Ralph
>
>
> ___
> mtt-users mailing list
> mtt-us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users
> Link to this post:
> http://www.open-mpi.org/community/lists/mtt-users/2015/12/0834.php
>



-- 
Joshua Hursey
Assistant Professor of Computer Science
University of Wisconsin-La Crosse
http://cs.uwlax.edu/~jjhursey


Re: [MTT users] Actual releases?

2015-12-10 Thread Josh Hursey
I think that would be good. I won't have any cycles to help until after the
first of the year. We started working towards a release way back when, but
I think we got stuck with the license to package up the graphing library
for the MTT Reporter. We could just remove that feature from the release
since the new reporter will do something different.

Releasing where we are now should be pretty straightforward if folks just
want to post a versioned tarball. We would have to assess how to get MTT
into a more packaged configuration (e.g., an rpm) if folks want that.


On Wed, Dec 9, 2015 at 11:36 AM, Ralph Castain  wrote:

> Hey folks
>
> There is interest in packaging MTT in the OpenHPC distribution. However,
> we don't actually have "releases" of MTT. Any objection to actually
> tagging/releasing versions?
>
> Ralph
>
>
> ___
> mtt-users mailing list
> mtt-us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users
> Searchable archives:
> http://www.open-mpi.org/community/lists/mtt-users/2015/12/0832.php
>



-- 
Joshua Hursey
Assistant Professor of Computer Science
University of Wisconsin-La Crosse
http://cs.uwlax.edu/~jjhursey


Re: [MTT users] [OMPI devel] Open MPI MTT is moving

2012-11-05 Thread Josh Hursey
The MTT server migration went well this weekend. I have updated the
Open MPI website to redirect you appropriately to the new MTT
Reporter.

You will need to update your .ini files to submit your tests to the
new server at the address below:
  https://mtt.open-mpi.org/submit/
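
For the Perl client, that means pointing mttdatabase_url at the new host in
your Reporter section. A minimal sketch, using the key names that appear in
the BigRed relay example later in this archive (all other mttdatabase_*
values stay whatever you already use):

  [Reporter: IU database]
  module = MTTDatabase

  mttdatabase_url = https://mtt.open-mpi.org/submit/
  # realm, username, password, platform, etc. unchanged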

Let me know if you experience any problems with the new server.

-- Josh

On Fri, Nov 2, 2012 at 9:26 AM, Josh Hursey <jjhur...@open-mpi.org> wrote:
> Reminder that we will be shutting down the MTT submission and reporter
> services this weekend to migrate it to another machine. The MTT
> services will go offline at COB today, and be brought back by Monday
> morning.
>
>
> On Wed, Oct 31, 2012 at 7:54 AM, Jeff Squyres <jsquy...@cisco.com> wrote:
>> *** IF YOU RUN MTT, YOU NEED TO READ THIS.
>>
>> Due to some server re-organization at Indiana University (read: our gracious 
>> hosting provider), we are moving the Open MPI community MTT database to a 
>> new server.  Instead of being found under www.open-mpi.org/mtt/, the OMPI 
>> MTT results will soon be located under mtt.open-mpi.org.
>>
>> Josh and I have been running tests on the new server and we think it is 
>> ready; it's now time to move the rest of the community to it.
>>
>> 1. In order to make this change, we need some "quiet time" where no one is 
>> submitting new MTT results.  As such, we will be shutting down 
>> MTT/disallowing new MTT submissions over this upcoming weekend: from COB 
>> Friday, 2 Nov 2012 through Monday morning, 5 Nov 2012 (all times US Central).
>>
>> ** Translation: don't expect submissions or queries to work after about 5pm 
>> on Friday through about 8am Monday (US Central).
>>
>>  Super obvious translation: turn off your MTT runs this weekend.
>>
>> 2. After this weekend, you will need to update your MTT submission URL from:
>>
>> https://www.open-mpi.org/mtt/submit/
>>
>> to
>>
>> https://mtt.open-mpi.org/submit/
>>
>> Thanks!
>>
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to: 
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
>
>
> --
> Joshua Hursey
> Assistant Professor of Computer Science
> University of Wisconsin-La Crosse
> http://cs.uwlax.edu/~jjhursey



-- 
Joshua Hursey
Assistant Professor of Computer Science
University of Wisconsin-La Crosse
http://cs.uwlax.edu/~jjhursey



Re: [MTT users] [OMPI devel] Open MPI MTT is moving

2012-11-02 Thread Josh Hursey
Reminder that we will be shutting down the MTT submission and reporter
services this weekend to migrate it to another machine. The MTT
services will go offline at COB today, and be brought back by Monday
morning.


On Wed, Oct 31, 2012 at 7:54 AM, Jeff Squyres  wrote:
> *** IF YOU RUN MTT, YOU NEED TO READ THIS.
>
> Due to some server re-organization at Indiana University (read: our gracious 
> hosting provider), we are moving the Open MPI community MTT database to a new 
> server.  Instead of being found under www.open-mpi.org/mtt/, the OMPI MTT 
> results will soon be located under mtt.open-mpi.org.
>
> Josh and I have been running tests on the new server and we think it is 
> ready; it's now time to move the rest of the community to it.
>
> 1. In order to make this change, we need some "quiet time" where no one is 
> submitting new MTT results.  As such, we will be shutting down 
> MTT/disallowing new MTT submissions over this upcoming weekend: from COB 
> Friday, 2 Nov 2012 through Monday morning, 5 Nov 2012 (all times US Central).
>
> ** Translation: don't expect submissions or queries to work after about 5pm 
> on Friday through about 8am Monday (US Central).
>
>  Super obvious translation: turn off your MTT runs this weekend.
>
> 2. After this weekend, you will need to update your MTT submission URL from:
>
> https://www.open-mpi.org/mtt/submit/
>
> to
>
> https://mtt.open-mpi.org/submit/
>
> Thanks!
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to: 
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel



-- 
Joshua Hursey
Assistant Professor of Computer Science
University of Wisconsin-La Crosse
http://cs.uwlax.edu/~jjhursey



Re: [MTT users] Fwd: [Alert] Found server-side submit error messages

2008-10-30 Thread Josh Hursey
This is probably me. I haven't had a chance to do anything about it yet.
Hopefully tomorrow.

I'm running the release branch (I believe); does this option exist for
the release branch yet?


-- Josh

On Oct 30, 2008, at 11:36 AM, Ethan Mallove wrote:


On Wed, Oct/29/2008 09:15:37AM, Ethan Mallove wrote:

On Tue, Oct/28/2008 06:17:12PM, Jeff Squyres wrote:
Should we set a default value of report_after_n_results to, say, 50 or 100?




We should.

-Ethan



On Oct 28, 2008, at 6:15 PM, Jeff Squyres wrote:


That host is in one of IU's clusters (odin).

Tim/Josh -- this is you guys...


Got another submit.php failure alert last night from IU. If
the IU tests are running on the MTT trunk, an "svn up" on it
should eliminate the issue. (report_after_n_results now
defaults to 100 - see r1239.)

-Ethan





On Oct 28, 2008, at 3:45 PM, Ethan Mallove wrote:


Folks,

I got an alert from the http-log-checker.pl script. Somebody appears to
have lost some MTT results. (Possibly due to an oversized database
submission to submit/index.php?) There's an open ticket for this (see
https://svn.open-mpi.org/trac/mtt/ticket/375). Currently there exists a
simple workaround for this problem. Put the below line in the problematic
"Test run" section(s). This will prevent oversized submissions by directing
MTT to submit the results in batches of 50 results instead of an entire
section at a time, which can reach 400+ for an Intel test run section.

 report_after_n_results = 50
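
For context, that line goes inside the affected "Test run" section of the
client's .ini; a minimal sketch, with a hypothetical section name and the
section's existing keys left untouched:

 [Test run: intel]
 # ...existing keys for this section unchanged...
 report_after_n_results = 50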

It's hard to know whose errors are in the HTTP error log with only the IP
address. If you want to verify they are or are not yours, visit a bogus
URL off open-mpi.org, e.g., www.open-mpi.org/what-is-foobars-ip-address,
and ping me about it. This will write your IP address to the log file, and
then this can be matched with the IP addr against the submit.php errors.

-Ethan


- Forwarded message from Ethan Mallove   
-


From: Ethan Mallove 
Date: Tue, 28 Oct 2008 08:00:41 -0400
To: ethan.mall...@sun.com, http-log-checker.pl-no-re...@open-mpi.org
Subject: [Alert] Found server-side submit error messages
Original-recipient: rfc822;ethan.mall...@sun.com

This email was automatically sent by http-log-checker.pl. You have received
it because some error messages were found in the HTTP(S) logs that might
indicate some MTT results were not successfully submitted by the
server-side PHP submit script (even if the MTT client has not indicated a
submission error).

###
#
# The below log messages matched "gz.*submit/index.php" in
# /var/log/httpd/www.open-mpi.org/ssl_error_log
#
###

[client 129.79.240.114] PHP Warning:  gzeof(): supplied argument is not a
valid stream resource in
/nfs/rontok/xraid/data/osl/www/www.open-mpi.org/mtt/submit/index.php on line 1923
[client 129.79.240.114] PHP Warning:  gzgets(): supplied argument is not a
valid stream resource in
/nfs/rontok/xraid/data/osl/www/www.open-mpi.org/mtt/submit/index.php on line 1924
...





- End forwarded message -
___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users



--
Jeff Squyres
Cisco Systems

___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users



--
Jeff Squyres
Cisco Systems

___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users

___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users




Re: [MTT users] MTT server side problem

2008-05-07 Thread Josh Hursey

Pasha,

Looking at the patch, I'm a little bit concerned. The "get_table_fields()"
function is, as you mentioned, no longer used, so it should be removed.
However, the other functions are critical to the submission script,
particularly 'do_pg_connect', which opens the connection to the backend
database.


Are you using the current development trunk (mtt/trunk) or the stable  
release branch (mtt/branches/ompi-core-testers)?


Can you send us the error messages that you were receiving?

Cheers,
Josh

On May 7, 2008, at 4:49 AM, Pavel Shamis (Pasha) wrote:


Hi,
I upgraded the server side (MTT is still running, so I don't know yet
whether the problem was resolved).
During the upgrade I had some problems with the submit/index.php script;
it had some duplicated functions and some of them were broken.

Please review the attached patch.

Pasha

Ethan Mallove wrote:

On Tue, May/06/2008 06:29:33PM, Pavel Shamis (Pasha) wrote:


I'm not sure which cron jobs you're referring to. Do you
mean these?

 https://svn.open-mpi.org/trac/mtt/browser/trunk/server/php/cron


I talked about this one: 
https://svn.open-mpi.org/trac/mtt/wiki/ServerMaintenance



I'm guessing you would only be concerned with the below
periodic-maintenance.pl script, which just runs
ANALYZE/VACUUM queries. I think you can start that up
whenever you want (and it should optimize the Reporter).

 
https://svn.open-mpi.org/trac/mtt/browser/trunk/server/sql/cron/periodic-maintenance.pl

-Ethan




The only thing there are the regular
mtt-resu...@open-mpi.org email alerts and some out-of-date
DB monitoring junk. You can ignore that stuff.

Josh, are there some nightly (DB
pruning/cleaning/vacuuming?) cron jobs that Pasha should be
running?

-Ethan



Thanks.

Ethan Mallove wrote:


Hi Pasha,

I thought this issue was solved in r1119 (see below). Do you
have the latest mtt/server scripts?

 https://svn.open-mpi.org/trac/mtt/changeset/1119/trunk/server/php/submit

-Ethan

On Tue, May/06/2008 03:26:43PM, Pavel Shamis (Pasha) wrote:


About the issue:
1. On the client side I see "*** WARNING: MTTDatabase client did not get a
serial". As a result of the error, some MTT results are not visible via
the web reporter.

2. On the server side I found the following error message:
[client 10.4.3.214] PHP Fatal error:  Allowed memory size of  
33554432 bytes exhausted (tried to allocate 23592960
bytes) in /.autodirect/swgwork/MTT/mtt/submit/index.php(79) :  
eval()'d code on line 77515
[Mon May 05 19:26:05 2008] [notice] caught SIGTERM, shutting  
down
[Mon May 05 19:30:54 2008] [notice] suEXEC mechanism enabled  
(wrapper: /usr/sbin/suexec)
[Mon May 05 19:30:54 2008] [notice] Digest: generating secret  
for digest authentication ...

[Mon May 05 19:30:54 2008] [notice] Digest: done
[Mon May 05 19:30:54 2008] [notice] LDAP: Built with OpenLDAP  
LDAP SDK
[Mon May 05 19:30:54 2008] [notice] LDAP: SSL support  
unavailable

My memory limit in php.ini file was set on 256MB !

Any ideas ?

Thanks.


--
Pavel Shamis (Pasha)
Mellanox Technologies

___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users




--
Pavel Shamis (Pasha)
Mellanox Technologies

___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users




--
Pavel Shamis (Pasha)
Mellanox Technologies

___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users







--
Pavel Shamis (Pasha)
Mellanox Technologies

Index: submit/index.php
===
--- submit/index.php(revision 1200)
+++ submit/index.php(working copy)
@@ -1,6 +1,7 @@
+# Copyright (c) 2008  Mellanox Technologies.  All rights  
reserved.


#
#
@@ -24,8 +25,7 @@ if (file_exists("$topdir/config.inc")) {
ini_set("memory_limit", "32M");

$topdir = '..';
-$ompi_home = '/l/osl/www/doc/www.open-mpi.org';
-include_once("$ompi_home/dbpassword.inc");
+include_once("$topdir/database.inc");
include_once("$topdir/reporter.inc");


@@ -1465,60 +1465,6 @@ function get_table_indexes($table_name,
return simple_select($sql_cmd);
}

-# Function used to determine which _POST fields
-# to INSERT. Prevent non-existent fields from being
-# INSERTed
-function get_table_fields($table_name) {
-
-global $dbname;
-global $id;
-
-# These indexes are special in that they link phases
-# together and hence, can and do show up in _POST
-if ($table_name == "test_build")
-$special_indexes = array("mpi_install$id");
-elseif ($table_name == "test_run")
-$special_indexes = array("test_build$id");
-
-# Crude way to tell whether a field is an index
-$is_not_index_clause =
-   "\n\t (table_name = '$table_name' AND NOT " .
-   "\n\t (data_type = 'integer' AND " .
-   "\n\t column_name ~ '_id$' AND " .
-

Re: [MTT users] MTT server side problem

2008-05-07 Thread Josh Hursey

Pasha,

All of the scripts can be run whenever. They should not be saving  
state between runs, so there should not be any bad effects on the  
database by starting them up late in the game.


The 'periodic-maintenance.pl' script is a postgresql cleaning/ 
vacuuming script that helps the database run a bit faster by doing  
some analysis on itself. Out of all the scripts this one is probably  
the most important for performance of the MTT Reporter.


Looking at the ServerMaintenance wiki page it seems to be in need of  
updating. Nothing major is missing, but there are some new cron  
scripts that should be added. I'll put that on my todo list.


Cheers,
Josh

On May 6, 2008, at 11:29 AM, Pavel Shamis (Pasha) wrote:




I'm not sure which cron jobs you're referring to. Do you
mean these?

 https://svn.open-mpi.org/trac/mtt/browser/trunk/server/php/cron


I talked about this one: 
https://svn.open-mpi.org/trac/mtt/wiki/ServerMaintenance


The only thing there are the regular
mtt-resu...@open-mpi.org email alerts and some out-of-date
DB monitoring junk. You can ignore that stuff.

Josh, are there some nightly (DB
pruning/cleaning/vacuuming?) cron jobs that Pasha should be
running?

-Ethan



Thanks.

Ethan Mallove wrote:


Hi Pasha,

I thought this issue was solved in r1119 (see below). Do you
have the latest mtt/server scripts?

 https://svn.open-mpi.org/trac/mtt/changeset/1119/trunk/server/php/submit

-Ethan

On Tue, May/06/2008 03:26:43PM, Pavel Shamis (Pasha) wrote:


About the issue:
1. On the client side I see "*** WARNING: MTTDatabase client did not get a
serial". As a result of the error, some MTT results are not visible via
the web reporter.

2. On the server side I found the following error message:
[client 10.4.3.214] PHP Fatal error:  Allowed memory size of  
33554432 bytes exhausted (tried to allocate 23592960
bytes) in /.autodirect/swgwork/MTT/mtt/submit/index.php(79) :  
eval()'d code on line 77515

[Mon May 05 19:26:05 2008] [notice] caught SIGTERM, shutting down
[Mon May 05 19:30:54 2008] [notice] suEXEC mechanism enabled  
(wrapper: /usr/sbin/suexec)
[Mon May 05 19:30:54 2008] [notice] Digest: generating secret  
for digest authentication ...

[Mon May 05 19:30:54 2008] [notice] Digest: done
[Mon May 05 19:30:54 2008] [notice] LDAP: Built with OpenLDAP  
LDAP SDK

[Mon May 05 19:30:54 2008] [notice] LDAP: SSL support unavailable
My memory limit in php.ini file was set on 256MB !

Any ideas ?

Thanks.


--
Pavel Shamis (Pasha)
Mellanox Technologies

___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users




--
Pavel Shamis (Pasha)
Mellanox Technologies

___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users







--
Pavel Shamis (Pasha)
Mellanox Technologies




[MTT users] BLACS Support

2008-01-31 Thread Josh Hursey

Has anyone tried to use the BLACS tests in ompi-tests with MTT?

IU is considering adding it to our testing matrix and wanted to hear  
of any experiences.


Cheers,
Josh


Re: [MTT users] Reporter problems

2008-01-30 Thread Josh Hursey
I'm seeing between 12 and 20 seconds on a fairly idle machine. We can  
likely do better. I'll dig into it this week[end] and see what I can do.


12 - 20 isn't too bad though considering the amount of data that query  
is returning. :)


On Jan 30, 2008, at 2:52 PM, Ethan Mallove wrote:


I don't remember a "past 24 hour" summary taking "24
seconds". Are we seeing a slow down due to an accumulation
of results? I thought the week-long table partitions would
prevent this type of effect?

-Ethan


On Wed, Jan/30/2008 11:00:46AM, Josh Hursey wrote:

This maintenance is complete. The reporter should be operating as
normal.

There are a few other maintenance items, but I am pushing them to the
weekend since it will result in a bit of a slowdown again.

Thanks for your patience.

Cheers,
Josh

On Jan 29, 2008, at 9:47 AM, Josh Hursey wrote:


The reporter should be responding much better now. I tweaked the
maintenance scripts so they no longer push nearly as hard on the
database. They are still running, but the query you specified seems to
run in approx. 15-20 sec. with the current load.

-- Josh

On Jan 29, 2008, at 8:38 AM, Josh Hursey wrote:


For the next 24 - 48 hours this is to be expected. Sorry :(

I started some maintenance work last night, and it is taking a bit
longer than I expected (due to integrity constraint checking most
likely). The maintenance scripts are pushing fairly hard on the
database, so I would expect some slowdown with the reporter (and maybe
client submits).

If this becomes a substantial problem for anyone please let me know,
and I may be able to shift this work to the weekend. In the meantime
I'll see if I can reduce the load a bit.

-- Josh


On Jan 29, 2008, at 7:44 AM, Tim Prins wrote:


Hi,

Using the reporter this morning it is being awfully slow, as in it is
taking about 3 minutes to do a top level summary search for:
Date: past 24 hours
Org: IU
Platform name: IU_Odin

I don't know whether this is a known problem or not. I seem to recall
that after the last database upgrade such a search was taking just a few
seconds.

Thanks,

Tim
___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users


___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users


___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users


___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users

___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users




Re: [MTT users] [MTT devel] Test runs not getting into database

2007-09-06 Thread Josh Hursey

As a quick followup here.

The problem seems to be with how the mtt-relay is reporting the  
server name to the submission site. I implemented a quick workaround
(allowing this particular host to connect), but we are working on a
real solution at the moment. I have filed a bug about this:

http://svn.open-mpi.org/trac/mtt/ticket/305

-- Josh

On Sep 6, 2007, at 10:04 AM, Josh Hursey wrote:


Weird, this looks like a mirror issue again. Below is some more debug
output from MTT on BigRed:
<>
*** Reporter initializing
Evaluating: MTTDatabase

Initializing reporter module: MTTDatabase

Evaluating: require MTT::Reporter::MTTDatabase
Evaluating: $ret = ::Reporter::MTTDatabase::Init(@args)
Evaluating: XXUsernameXX
Evaluating: XXPasswordXX
Evaluating: http://s10c2b3.dim:8008/
Evaluating: OMPI
Evaluating: 1
Evaluating: IU_BigRed
Set HTTP credentials for realm "OMPI"
MTTDatabase getting a client serial number...
MTTDatabase trying proxy:  / Default (none)
MTTDatabase got response: Sorry, this page is not mirrored.  Please see the
original version of this page on the main Open MPI web site
(http://www.open-mpi.org/mtt/submit/index.php).
*** WARNING: MTTDatabase did not get a serial
Making dir:
/N/ptl01/mpiteam/bigred/20070906-CronTest-cron/pb_0/mttdatabase-
submit
(cwd: /N/ptl01/mpiteam/bigred/20070906-CronTest-cron/pb_0)
<>

In the INI file we have the following for the reporter so we can do
the redirect through the head node (s10c2b3.dim):
<>
[Reporter: IU database]
module = MTTDatabase

mttdatabase_realm = OMPI

mttdatabase_url = http://s10c2b3.dim:8008/

mttdatabase_username = XXUsernameXX
mttdatabase_password = XXPasswordXX

mttdatabase_platform = IU_BigRed
mttdatabase_keep_debug_files = 1

<>

It looks like IU is using the trunk version of the mtt-relay, and the
branch version of the MTT client. The mtt-relay code is the same on
both the trunk and the branch. The relay seems to be submitting to:
https://www.open-mpi.org/mtt/submit/index.php

Any thoughts on why this might be happening? It looks like the mirror
check is messed up again.

-- Josh

On Sep 5, 2007, at 11:31 PM, Josh Hursey wrote:


yeah I'll try to take a look at it tomorrow. I suspect that something
is going wrong with the relay, but I can't really think of what it
might be at the moment.

-- Josh

On Sep 5, 2007, at 9:11 PM, Jeff Squyres wrote:


Josh / Ethan --

Not getting a serial means that the client is not getting a value
back from the server that it can parse into a serial.

Can you guys dig into this and see why the mtt dbdebug file that Tim
has at the end of this message is not getting a serial?

Thanks...


On Sep 5, 2007, at 9:24 AM, Tim Prins wrote:


Here is the smallest one. Let me know if you need anything else.

Tim

Jeff Squyres wrote:

Can you send any one of those mtt database files?  We'll need to
figure out if this is a client or a server problem.  :-(
On Sep 5, 2007, at 7:40 AM, Tim Prins wrote:

Hi,

BigRed has not gotten its test results into the database for a
while.
This is running the ompi-core-testers branch. We run by passing
the
results through the mtt-relay.

The mtt-output file has lines like:
*** WARNING: MTTDatabase did not get a serial; phases will be
isolated
from each other in the reports

Reported to MTTDatabase: 1 successful submit, 0 failed submits

(total of 1 result)

I have the database submit files if they would help.

Thanks,

Tim

___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users


$VAR1 = {
  'exit_signal_1' => -1,
  'duration_1' => '5 seconds',
  'mpi_version' => '1.3a1r16038',
  'trial' => 0,
  'mpi_install_section_name_1' => 'bigred 32 bit gcc',
  'client_serial' => undef,
  'hostname' => 's1c2b12',
  'result_stdout_1' => '/bin/rm -f *.o *~ PI* core IMB-IO
IMB-EXT IMB-MPI1 exe_io exe_ext exe_mpi1
touch IMB_declare.h
touch exe_mpi1 *.c; rm -rf exe_io exe_ext
make MPI1 CPP=MPI1
make[1]: Entering directory `/N/ptl01/mpiteam/bigred/20070905-
Wednesday/pb_0/installs/d7Ri/tests/imb/IMB_2.3/src\'
mpicc  -I.  -DMPI1 -O -c IMB.c
mpicc  -I.  -DMPI1 -O -c IMB_declare.c
mpicc  -I.  -DMPI1 -O -c IMB_init.c
mpicc  -I.  -DMPI1 -O -c IMB_mem_manager.c
mpicc  -I.  -DMPI1 -O -c IMB_parse_name_mpi1.c
mpicc  -I.  -DMPI1 -O -c IMB_benchlist.c
mpicc  -I.  -DMPI1 -O -c IMB_strgs.c
mpicc  -I.  -DMPI1 -O -c IMB_err_handler.c
mpicc  -I.  -DMPI1 -O -c IMB_g_info.c
mpicc  -I.  -DMPI1 -O -c IMB_warm_up.c
mpicc  -I.  -DMPI1 -O -c IMB_output.c
mpicc  -I.  -DMPI1 -O -c IMB_pingpong.c
mpicc  -I.  -DMPI1 -O -c IMB_pingping.c
mpicc  -I.  -DMPI1 -O -c IMB_allreduce.c
mpicc  -I.  -DMPI1 -O -c IMB_reduce_scatter.c
mpicc  -I.  -DMPI1 -O -c I

Re: [MTT users] Test runs not getting into database

2007-09-06 Thread Josh Hursey
Yeah, I'll try to take a look at it tomorrow. I suspect that something
is going wrong with the relay, but I can't really think of what it
might be at the moment.


-- Josh

On Sep 5, 2007, at 9:11 PM, Jeff Squyres wrote:


Josh / Ethan --

Not getting a serial means that the client is not getting a value
back from the server that it can parse into a serial.

Can you guys dig into this and see why the mtt dbdebug file that Tim
has at the end of this message is not getting a serial?

Thanks...


On Sep 5, 2007, at 9:24 AM, Tim Prins wrote:


Here is the smallest one. Let me know if you need anything else.

Tim

Jeff Squyres wrote:

Can you send any one of those mtt database files?  We'll need to
figure out if this is a client or a server problem.  :-(
On Sep 5, 2007, at 7:40 AM, Tim Prins wrote:

Hi,

BigRed has not gotten its test results into the database for a
while.
This is running the ompi-core-testers branch. We run by passing the
results through the mtt-relay.

The mtt-output file has lines like:
*** WARNING: MTTDatabase did not get a serial; phases will be
isolated
from each other in the reports

Reported to MTTDatabase: 1 successful submit, 0 failed submits

(total of 1 result)

I have the database submit files if they would help.

Thanks,

Tim

___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users


$VAR1 = {
  'exit_signal_1' => -1,
  'duration_1' => '5 seconds',
  'mpi_version' => '1.3a1r16038',
  'trial' => 0,
  'mpi_install_section_name_1' => 'bigred 32 bit gcc',
  'client_serial' => undef,
  'hostname' => 's1c2b12',
  'result_stdout_1' => '/bin/rm -f *.o *~ PI* core IMB-IO
IMB-EXT IMB-MPI1 exe_io exe_ext exe_mpi1
touch IMB_declare.h
touch exe_mpi1 *.c; rm -rf exe_io exe_ext
make MPI1 CPP=MPI1
make[1]: Entering directory `/N/ptl01/mpiteam/bigred/20070905-
Wednesday/pb_0/installs/d7Ri/tests/imb/IMB_2.3/src\'
mpicc  -I.  -DMPI1 -O -c IMB.c
mpicc  -I.  -DMPI1 -O -c IMB_declare.c
mpicc  -I.  -DMPI1 -O -c IMB_init.c
mpicc  -I.  -DMPI1 -O -c IMB_mem_manager.c
mpicc  -I.  -DMPI1 -O -c IMB_parse_name_mpi1.c
mpicc  -I.  -DMPI1 -O -c IMB_benchlist.c
mpicc  -I.  -DMPI1 -O -c IMB_strgs.c
mpicc  -I.  -DMPI1 -O -c IMB_err_handler.c
mpicc  -I.  -DMPI1 -O -c IMB_g_info.c
mpicc  -I.  -DMPI1 -O -c IMB_warm_up.c
mpicc  -I.  -DMPI1 -O -c IMB_output.c
mpicc  -I.  -DMPI1 -O -c IMB_pingpong.c
mpicc  -I.  -DMPI1 -O -c IMB_pingping.c
mpicc  -I.  -DMPI1 -O -c IMB_allreduce.c
mpicc  -I.  -DMPI1 -O -c IMB_reduce_scatter.c
mpicc  -I.  -DMPI1 -O -c IMB_reduce.c
mpicc  -I.  -DMPI1 -O -c IMB_exchange.c
mpicc  -I.  -DMPI1 -O -c IMB_bcast.c
mpicc  -I.  -DMPI1 -O -c IMB_barrier.c
mpicc  -I.  -DMPI1 -O -c IMB_allgather.c
mpicc  -I.  -DMPI1 -O -c IMB_allgatherv.c
mpicc  -I.  -DMPI1 -O -c IMB_alltoall.c
mpicc  -I.  -DMPI1 -O -c IMB_sendrecv.c
mpicc  -I.  -DMPI1 -O -c IMB_init_transfer.c
mpicc  -I.  -DMPI1 -O -c IMB_chk_diff.c
mpicc  -I.  -DMPI1 -O -c IMB_cpu_exploit.c
mpicc   -o IMB-MPI1 IMB.o IMB_declare.o  IMB_init.o
IMB_mem_manager.o IMB_parse_name_mpi1.o  IMB_benchlist.o
IMB_strgs.o IMB_err_handler.o IMB_g_info.o  IMB_warm_up.o
IMB_output.o IMB_pingpong.o IMB_pingping.o IMB_allreduce.o
IMB_reduce_scatter.o IMB_reduce.o IMB_exchange.o IMB_bcast.o
IMB_barrier.o IMB_allgather.o IMB_allgatherv.o IMB_alltoall.o
IMB_sendrecv.o IMB_init_transfer.o  IMB_chk_diff.o IMB_cpu_exploit.o
make[1]: Leaving directory `/N/ptl01/mpiteam/bigred/20070905-
Wednesday/pb_0/installs/d7Ri/tests/imb/IMB_2.3/src\'
',
  'mpi_name' => 'ompi-nightly-trunk',
  'number_of_results' => '1',
  'phase' => 'Test Build',
  'compiler_version_1' => '3.3.3',
  'exit_value_1' => 0,
  'result_message_1' => 'Success',
  'start_timestamp_1' => 'Wed Sep  5 04:16:52 2007',
  'compiler_name_1' => 'gnu',
  'suite_name_1' => 'imb',
  'test_result_1' => 1,
  'mtt_client_version' => '2.1devel',
  'fields' =>
'compiler_name,compiler_version,duration,exit_signal,exit_value,mpi_g 
e
t_section_name,mpi_install_id,mpi_install_section_name,mpi_name,mpi_v 
e
rsion,phase,result_message,result_stdout,start_timestamp,suite_name,t 
e

st_result',
  'mpi_install_id' => undef,
  'platform_name' => 'IU_BigRed',
  'local_username' => 'mpiteam',
  'mpi_get_section_name_1' => 'ompi-nightly-trunk'
};
___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users



--
Jeff Squyres
Cisco Systems

___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users




Re: [MTT users] Database submit error

2007-08-28 Thread Josh Hursey

Short Version:
--
I just finished the fix, and the submit script is back up and running.

This was a bug that arose in testing, but somehow did not get  
propagated to the production database.


Long Version:
-
The new database uses partition tables to archive test results. As
part of this there are some complex rules to mask the partition table  
complexity from the users of the db. There was a bug in the insert  
rule in which the 'id' of the submitted result (mpi_install,  
test_build, and test_run) was a different value than expected since  
the 'id' was not translated properly to the partition table setup.


The fix was to drop all rules and replace them with the correct  
versions. The submit errors you saw below were caused by integrity  
checks in the submit script that keep data from being submitted that  
do not have a proper lineage (e.g., you cannot submit a test_run  
without having submitted a test_build and an mpi_install result). The  
bug caused the client and the server to become confused about what the
proper 'id' should be, and when the submit script attempted to 'guess'
the correct run it was unsuccessful and errored out.


So sorry this bug lived this long, but it should be fixed now.

-- Josh

On Aug 28, 2007, at 10:16 AM, Jeff Squyres wrote:


Josh found the problem and is in the process of fixing it.  DB
submits are currently disabled while Josh is working on the fix.
More specific details coming soon.

Unfortunately, it looks like all data from last night will be
junk.  :-(  You might as well kill any MTT scripts that are still
running from last night.


On Aug 28, 2007, at 9:14 AM, Jeff Squyres wrote:


Josh and I are investigating -- the total runs in the db in the
summary report from this morning is far too low.  :-(


On Aug 28, 2007, at 9:13 AM, Tim Prins wrote:


It installed and the tests built and made it into the database:
http://www.open-mpi.org/mtt/reporter.php?do_redir=293

Tim

Jeff Squyres wrote:

Did you get a correct MPI install section for mpich2?

On Aug 28, 2007, at 9:05 AM, Tim Prins wrote:


Hi all,

I am working with the jms branch, and when trying to use mpich2, I get
the following submit error:

*** WARNING: MTTDatabase server notice: mpi_install_section_name is not in
 mtt database.
 MTTDatabase server notice: number_of_results is not in mtt database.
 MTTDatabase server notice: phase is not in mtt database.
 MTTDatabase server notice: test_type is not in mtt database.
 MTTDatabase server notice: test_build_section_name is not in mtt database.
 MTTDatabase server notice: variant is not in mtt database.
 MTTDatabase server notice: command is not in mtt database.
 MTTDatabase server notice: fields is not in mtt database.
 MTTDatabase server notice: resource_manager is not in mtt database.

 MTT submission for test run
 MTTDatabase server notice: Invalid test_build_id (47368) given.
 Guessing that it should be -1
 MTTDatabase server error: ERROR: Unable to find a test_build to
 associate with this test_run.

 MTTDatabase abort: (Tried to send HTTP error) 400
 MTTDatabase abort:
 No test_build associated with this test_run
*** WARNING: MTTDatabase did not get a serial; phases will be isolated from
 each other in the reports

Reported to MTTDatabase: 1 successful submit, 0 failed submits (total of
 12 results)

This happens for each test run section.

Thanks,

Tim
___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users






___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users



--
Jeff Squyres
Cisco Systems

___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users



--
Jeff Squyres
Cisco Systems

___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users




Re: [MTT users] trouble with new reporter

2007-08-27 Thread Josh Hursey
I don't think so. But it is possible that they got perturbed a bit
with the upgrade, I guess. :/


On Aug 27, 2007, at 4:31 PM, Jeff Squyres wrote:


Josh -- did our cookies change?

On Aug 27, 2007, at 4:29 PM, Tim Prins wrote:


Hmm... I just tried this at home and it works.

Maybe I need to get rid of old cookies?

Tim

On Monday 27 August 2007 02:30:17 pm Jeff Squyres wrote:

Is this an effect of "preferences" cookies not propagating properly?

On Aug 27, 2007, at 2:26 PM, Josh Hursey wrote:

Weird. I just tried this and it worked fine for me. Showing 25 skampi
runs for IU all trials. Can you try it again?

-- Josh

On Aug 27, 2007, at 2:11 PM, Tim Prins wrote:

All,

First, I have to say the new faster reporter is very nice.

However, I am running into some difficulty with trial runs.  
Here is

what
I did:

1. went to www.open-mpi.org/mtt/reporter.php
2. Clicked preferences, toggled show trial runs
3. typed 'IU' into org
4. Press summary

So far so good, I see the performance results I expect. But then if I
click on the performance results, I get 'no data available for the
specified query'.

Thanks,

Tim
___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users


___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users



___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users



--
Jeff Squyres
Cisco Systems

___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users




Re: [MTT users] trouble with new reporter

2007-08-27 Thread Josh Hursey
Weird. I just tried this and it worked fine for me. Showing 25 skampi  
runs for IU all trials. Can you try it again?


-- Josh

On Aug 27, 2007, at 2:11 PM, Tim Prins wrote:


All,

First, I have to say the new faster reporter is very nice.

However, I am running into some difficulty with trial runs. Here is  
what

I did:

1. went to www.open-mpi.org/mtt/reporter.php
2. Clicked preferences, toggled show trial runs
3. typed 'IU' into org
4. Press summary

So far so good, I see the performance results I expect. But then if I
click on the performance results, I get 'no data available for the
specified query'.

Thanks,

Tim
___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users




Re: [MTT users] MTT database performance boost

2007-03-01 Thread Josh Hursey

That's awesome. Good work :)

-- Josh

On Mar 1, 2007, at 11:59 AM, Ethan Mallove wrote:


Folks,

If some of you hadn't already noticed, reports (see
http://www.open-mpi.org/mtt/) on Test Runs have been taking
upwards of 5-7 minutes to load as of late.  This was due
in part to some database design issues (compounded by the
fact that we now have nearly 3 million test results
archived, dating back to November).  To mitigate the
performance issues, there is now a sliding window n-day
"speedy" database that will be used automatically for recent
reports.  (Currently n=7, but there is only 2 days worth of
"speedy" data as of this email).  Reports which date back
earlier than the sliding window will take some time as they
will be coming from the slower "archive" database.
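
As a rough illustration of that routing rule (this is not the
reporter's actual code -- just a sketch with made-up names of the
behavior described above):

use strict;
use warnings;

# Sketch only: decide which database a report should be served from,
# given the oldest timestamp the query touches and the window size.
sub pick_database {
    my ($oldest_epoch, $window_days) = @_;
    my $cutoff = time() - $window_days * 24 * 60 * 60;
    return $oldest_epoch >= $cutoff ? 'speedy' : 'archive';
}

print pick_database(time() - 2  * 86400, 7), "\n";   # recent query -> "speedy"
print pick_database(time() - 30 * 86400, 7), "\n";   # older query  -> "archive"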

Cheers,
Ethan

___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users




Re: [MTT users] [devel-core] MTT 2.0 tutorial teleconference

2007-01-05 Thread Josh Hursey

I'll be there as well

On Jan 4, 2007, at 3:44 PM, Tim Mattox wrote:


I'll be there for the call on Tuesday.
We are looking forward to switching IU to MTT 2.0
The new report/results pages are great!
--
Tim Mattox - http://homepage.mac.com/tmattox/
 tmat...@gmail.com || timat...@open-mpi.org
I'm a bright... http://www.the-brights.net/
___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users



Josh Hursey
jjhur...@open-mpi.org
http://www.open-mpi.org/



Re: [MTT users] Corrupted MTT database or incorrect query

2006-11-13 Thread Josh Hursey


On Nov 13, 2006, at 10:27 AM, Ethan Mallove wrote:


I can infer that you have an MPI Install section labeled
"odin 64 bit gcc". A few questions:

* What is the mpi_get for that section (or how does that
  parameter get filled in by your automated scripts)?


I attached the generated INI file for you to look at.


nightly-trunk-64-gcc.ini-gen
Description: Binary data


It is the same value for all parallel runs of GCC+64bit (same value  
for all branches)




* Do you start with a fresh scratch tree every run?


Yep. Every run, and all of the parallel runs.


* Could you email me your scratch/installs/mpi_installs.xml
  files?

The attached mpi_installs.xml is from the trunk+gcc+64bit parallel  
scratch directory.




I checked on how widespread this issue is, and found that
18,700 out of 474,000 Test Run rows in the past month have an
mpi_version/command (v1.2-trunk) mismatch, occurring in both
directions (version=1.2, command=trunk and vice versa).
They occur on these clusters:

 Cisco MPI development cluster
 IU Odin
 IU - Thor - TESTING



Interesting...


There *is* that race condition in which one mtt submission
could overwrite another's index. Do you have "trunk" and
"1.2" runs submitting to the database at the same time?


Yes we do. :(

The parallel blocks, as we call them, are separate scratch directories
in which MTT is running concurrently. Meaning that we have N parallel
block scratch directories, each running one instance of MTT. So it is
possible (and highly likely) that when the reporter phase fires, all
of the N parallel blocks are firing it at about the same time.


Without knowing how the reporter does the inserts into the
database I don't think I can help much more than that on debugging.
When the reporter fires for the DB (the two patterns I mean are
sketched below):
 - Does it start a transaction for the connection, do the inserts,
then commit?
 - Does it ship the inserts to the server and let it run them,
or does the client do all of the individual inserts?
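
To be concrete, here is a minimal sketch of the two submission
patterns I have in mind. It is not the actual reporter code; the
table/column names, connection details, and sample rows are made up,
and it assumes Perl DBI with the PostgreSQL driver:

use strict;
use warnings;
use DBI;

# Hypothetical connection parameters and a couple of fake result rows.
my @results = ( { test_build_id => 47368, result => 'passed' },
                { test_build_id => 47368, result => 'failed' } );
my $dbh = DBI->connect("dbi:Pg:dbname=mtt", "mtt", "secret",
                       { RaiseError => 1, AutoCommit => 1 });

# Pattern 1: one transaction for the whole submission -- all rows from
# this client become visible at commit time, as a unit.
$dbh->begin_work();
my $sth = $dbh->prepare(
    "INSERT INTO test_run (test_build_id, result) VALUES (?, ?)");
$sth->execute($_->{test_build_id}, $_->{result}) for @results;
$dbh->commit();

# Pattern 2: individual autocommitted inserts -- each row is its own
# transaction, so rows from concurrent clients can interleave.
for my $r (@results) {
    $dbh->do("INSERT INTO test_run (test_build_id, result) VALUES (?, ?)",
             undef, $r->{test_build_id}, $r->{result});
}

$dbh->disconnect();

With the first pattern a whole submission lands atomically; with the
second, rows from concurrently running parallel blocks can interleave
between inserts, which is exactly where an index/id mix-up could bite.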


-- Josh




On Sun, Nov/12/2006 06:04:17PM, Jeff Squyres (jsquyres) wrote:


   I feel somewhat better now.  Ethan - can you fix?
-Original Message-
   From:   Tim Mattox [[1]mailto:timat...@open-mpi.org]
   Sent:   Sunday, November 12, 2006 05:34 PM Eastern Standard Time
   To: General user list for the MPI Testing Tool
    Subject: [MTT users] Corrupted MTT database or incorrect query

   Hello,
   I just noticed that the MTT summary page is presenting
   incorrect information for our recent runs at IU.  It is
    showing failures for the 1.2b1 that actually came from
   the trunk!  See the first entry in this table:
   http://www.open-mpi.org/mtt/reporter.php? 
_start_test_timestamp=200
   6-11-12%2019:12:02%20through%202006-11-12% 
2022:12:02_platform_id=co

ntains_platform_id=IU_phase=runs_success=fail_atom=*by_ 
t
   est_case=Table_agg_timestamp=- 
_mpi_name=All_mpi_version

=All_os_name=All_os_version=All_platform_hardware=All 
_
   platform_id=All_platform_id=off&1- 
page=off_bookmarks_bookmar

   ks
   Click on the [i] in the upper right (the first entry)
    to get the popup window which shows the mpirun cmd as:
   mpirun -mca btl tcp,sm,self -np 6 --prefix
   /san/homedirs/mpiteam/mtt-runs/odin/20061112-Testing-NOCLN/ 
parallel-bl
   ock-3/installs/ompi-nightly-trunk/odin_64_bit_gcc/1.3a1r12559/ 
install

   dynamic/spawn Note the path has "1.3a1r12559" in the
   name... it's a run from the trunk, yet the table showed
   this as a 1.2b1 run.  There are several of these
    misattributed errors.  This would explain why Jeff saw
   some ddt errors on the 1.2 brach yesterday, but was
   unable to reproduce them.  They were from the trunk!
   --
   Tim Mattox - [2]http://homepage.mac.com/tmattox/
tmat...@gmail.com || timat...@open-mpi.org
   I'm a bright... [3]http://www.the-brights.net/
   ___
   mtt-users mailing list
   mtt-us...@open-mpi.org
   [4]http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users

References

   1. mailto:timat...@open-mpi.org
   2. http://homepage.mac.com/tmattox/
   3. http://www.the-brights.net/
   4. http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users



___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users



--
-Ethan
___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users



Josh Hursey
jjhur...@open-mpi.org
http://www.open-mpi.org/



Re: [MTT users] Fwd: [mtt-results] Nightly MPI Install Failures

2006-11-03 Thread Josh Hursey

IU/Thor Short Story:
-
The IU/thor tests are borked because of the scheduler. Ignore these  
results for now.


IU/Thor Longer Story:
-
SLURM is set up to kill any job that's 'idle' for more than N minutes,
where N is kinda small. We are compiling, but SLURM is not looking at
the compile but at the MTT script, which is pretty much doing nothing
until the compile completes. Thus SLURM thinks that MTT is 'idle' and
kills the allocation :(


We fixed this on Odin, but our sysadmin needs to make the change to
Thor. It is one line in a config file, but getting him to do much is
like pulling teeth with telepathy some days :/


Ignore thor for now. It should be running alongside Odin in the next
day or two.


-- Josh

On Nov 3, 2006, at 11:23 AM, Jeff Squyres wrote:


I see some failures from LANL and IU/thor that *look* like the tests
were aborted before they completed (e.g., "rm -rf" of the scratch dir
while MTT was running).

Can someone from both organizations confirm that these are bogus
results?



Begin forwarded message:


From: mtt-resu...@osl.iu.edu
Date: November 3, 2006 9:00:12 AM EST
To: mtt-resu...@open-mpi.org
Subject: [mtt-results] Nightly MPI Install Failures
Reply-To: MPI Test Tool result submissions 


Query Description
Current Time (GMT): 2006-11-03 14:00:11
Date Range (GMT): 2006-11-02 14:00:11 through 2006-11-03 14:00:11
successfail
CountBy test case *



Summary of MPI Installs that failed

Hardware: sun4u | Os: SunOS | Os ver: SunOS 5.10 | Mpi: Open MPI trunk |
Mpi rev: 1.3a1r12408 | Cluster: Sun 32-bit | Compiler: sun | Compiler ver: 5.7 |
MPI Install [i]: Pass 0 / Fail 1
Details

Config args:
--enable-shared --enable-mpi-f90 --with-mpi-f90-size=trivial
CC=cc CXX=CC FC=f90 F77=f77
CFLAGS=-xarch=v8plusa -xO5 -xmemalign=8s
CXXFLAGS=-xarch=v8plusa -xO5 -xmemalign=8s
FFLAGS=-xarch=v8plusa -xO5 -xmemalign=8s
FCFLAGS=-xarch=v8plusa -xO5 -xmemalign=8s -KPIC

Stdout:
libtool: compile: cc -DHAVE_CONFIG_H -I. -I. -I../../../opal/
include -I../../../orte/include -I../../../ompi/include -I../../../
ompi/include -I../../../opal/libltdl -I../../.. -DNDEBUG -
xarch=v8plusa -xO5 -xmemalign=8s -mt -c base/io_base_close.c -KPIC -
DPIC -o base/.libs/io_base_close.o
libtool: compile: cc -DHAVE_CONFIG_H -I. -I. -I../../../opal/
include -I../../../orte/include -I../../../ompi/include -I../../../
ompi/include -I../../../opal/libltdl -I../../.. -DNDEBUG -
xarch=v8plusa -xO5 -xmemalign=8s -mt -c base/
io_base_component_list.c -KPIC -DPIC -o base/.libs/
io_base_component_list.o
libtool: compile: cc -DHAVE_CONFIG_H -I. -I. -I../../../opal/
include -I../../../orte/include -I../../../ompi/include -I../../../
ompi/include -I../../../opal/libltdl -I../../.. -DNDEBUG -
xarch=v8plusa -xO5 -xmemalign=8s -mt -c base/io_base_delete.c -KPIC
-DPIC -o base/.libs/io_base_delete.o
source=base/io_base_find_available.c object=base/
io_base_find_available.lo libtool=yes \
DEPDIR=.deps depmode=none /bin/bash ../../../config/depcomp \
/bin/bash ../../../libtool --tag=CC --mode=compile cc -
DHAVE_CONFIG_H -I. -I. -I../../../opal/include -I../../../orte/
include -I../../../ompi/include -I../../../ompi/include -I../../../
opal/libltdl -I../../.. -DNDEBUG -xarch=v8plusa -xO5 -xmemalign=8s -
mt -c -o base/io_base_find_available.lo base/io_base_find_available.c
source=base/io_base_open.c object=base/io_base_open.lo libtool=yes \
DEPDIR=.deps depmode=none /bin/bash ../../../config/depcomp \
/bin/bash ../../../libtool --tag=CC --mode=compile cc -
DHAVE_CONFIG_H -I. -I. -I../../../opal/include -I../../../orte/
include -I../../../ompi/include -I../../../ompi/include -I../../../
opal/libltdl -I../../.. -DNDEBUG -xarch=v8plusa -xO5 -xmemalign=8s -
mt -c -o base/io_base_open.lo base/io_base_open.c
source=base/io_base_request.c object=base/io_base_request.lo
libtool=yes \
DEPDIR=.deps depmode=none /bin/bash ../../../config/depcomp \
/bin/bash ../../../libtool --tag=CC --mode=compile cc -
DHAVE_CONFIG_H -I. -I. -I../../../opal/include -I../../../orte/
include -I../../../ompi/include -I../../../ompi/include -I../../../
opal/libltdl -I../../.. -DNDEBUG -xarch=v8plusa -xO5 -xmemalign=8s -
mt -c -o base/io_base_request.lo base/io_base_request.c
source=base/io_base_register_datarep.c object=base/
io_base_register_datarep.lo libtool=yes \
DEPDIR=.deps depmode=none /bin/bash ../../../config/depcomp \
/bin/bash ../../../libtool --tag=CC --mode=compile cc -
DHAVE_CONFIG_H -I. -I. -I../../../opal/include -I../../../orte/
include -I../../../ompi/include -I../../../ompi/include -I../../../
opal/libltdl -I../../.. -DNDEBUG -xarch=v8plusa -xO5 -xmemalign=8s -
mt -c -o base/io_base_register_datarep.lo base/
io_base_register_datarep.c
libtool: compile: cc -DHAVE_CONFIG_H -I. -I. -I../../../opal/
include -I../../../orte/include -I../../../ompi/include -I../../../
ompi/include -I../../../opal/libltdl -I../../.. -DNDEBUG -
xarch=v8plusa -xO5 -xmemalign=8s -mt -c base/io_base_request.c -
KPIC -DPIC -o base/.libs/io_base_request.o
libtool: compile: 

Re: [MTT users] Discussion on teleconf yesterday?

2006-10-25 Thread Josh Hursey
The discussion started with the bug characteristics of v1.2 versus  
the trunk.


It seemed from the call that IU was the only institution that can
assess this via MTT, as no one else spoke up. Since people were
interested in seeing things that were breaking, I suggested that I
start forwarding the IU internal MTT reports (run nightly and weekly)
to the test...@open-mpi.org list. This was met by Brian insisting that it
would result in "thousands" of emails to the development list. I
clarified that it is only 3 - 4 messages a day from IU. However, if
all other institutions do this then it would be a bunch of email
(where 'a bunch' would still be less than 'thousands'). That's how we
got to a 'we need a single summary presented to the group' comment.
It should be noted that we brought up IU sending to the 'testing@open-
mpi.org' list as a band-aid until MTT could do it better.


This single summary can be an email or a webpage that people can check.
Rich said that he would prefer a webpage, and no one else really had a
comment. That got us talking about the current summary page that MTT
generates. Tim M mentioned that with the current website it is difficult to
figure out how to get the answers you need. I agree; it is hard
[usability-wise] for someone to go to the summary page and answer the
question "So what failed from IU last night, and how does that differ
from yesterday -- e.g., what regressed and progressed yesterday at
IU?". The website is flexible enough to do it, but having a couple
of basic summary pages would be nice for basic users. What that
should look like we can discuss further.


The IU group really likes the emails that we currently generate: a
plain-text summary of the previous run. I posted copies on the MTT
bug tracker here:

http://svn.open-mpi.org/trac/mtt/ticket/61
Currently we have not put the work in to aggregate the runs, so for  
each ini file that we run we get 1 email to the IU group. This is  
fine for the moment, but as we add the rest of the clusters and  
dimensions in the testing matrix we will need MTT to aggregate the  
results for us and generate such an email.


So I think the general feel of the discussion is that we need the  
following from MTT:
 - A 'basic' summary page providing answers to some general  
frequently asked queries. The current interface is too advanced for  
the current users.
 - A summary email [preferably in plain text] similar to the one
that IU generated, showing an aggregation of the previous night's
results for (a) all reporters and (b) my institution [so I can track them
down and file bugs].

 - 1 email a day on the previous night's testing results.

Some relevant bugs currently in existence:
http://svn.open-mpi.org/trac/mtt/ticket/92
http://svn.open-mpi.org/trac/mtt/ticket/61
http://svn.open-mpi.org/trac/mtt/ticket/94


The other concern is that, given the frequency of testing, as bugs
appear from the testing someone needs to make sure the bug tracker is
updated. I think the group is unclear about how this is done. Meaning:
when MTT identifies a test as failed, who is responsible for
putting the bug in the bug tracker?
The obvious solution is the institution that identified the bug.  
[Warning: My opinion] But then that becomes unwieldy for IU since we  
have a large testing matrix, and would need to commit someone to  
doing this every day (and it may take all day to properly track a set
of bugs). Also this kind of punishes an institution for testing more  
instead of providing incentive to test.


-- Page Break -- Context switch --

In case you all want to know what we are doing here at IU, I attached
to this email our planned MTT testing matrix. Currently we have BigRed
and Odin running the complete matrix less the BLACS tests. Wotan and  
Thor will come online as we get more resources to support them.


In order to do such a complex testing matrix we have various .ini  
files that we use. And since some of the dimensions in the matrix are  
large, we break some of the tests into a couple of .ini files that are
submitted concurrently to have them run in a reasonable time.


 | BigRed   | Odin | Thor   | Wotan
-+--+--++--
Sun  |N |N |  IMB   |  BLACS
Mon  |N BLACS   |N |N   |N
Tues |N |N IMB*|N   |N
Wed  |N IMB*|N |N   |N
Thur |N |N BLACS   |N   |N
Fri  |N |N |N   |N
Sat  |N Intel*  |N Intel*  |  BLACS |  IMB

N = Nightly run
* = Large runs
All runs start at 2 am on the day listed.

=
BigRed
=
Nightly
---
- Branches: trunk, v1.2
- Configurations: All 64 and 32 bit builds
  * MX, LoadLeveler, No debug, gcc 3.x
- Test Suites
  * Trivial
  * IBM suite
- Processing Elements/tasks/cores/...
  * # < 8 hours
  * 7 nodes/28 tasks [to start with]
- Runtime Parameters
  * PML ob1/BTL mx,sm,self
  * PML cm /MTL mx

Weekly: Monday 2am 

Re: [MTT users] Ibm test suite build "failing"

2006-10-02 Thread Josh Hursey
Yea I believe so. From what I can tell it looks pretty much the same  
IIRC.


-- Josh

On Oct 2, 2006, at 7:35 PM, Jeff Squyres wrote:


Josh --

Is your "failed" IBM build looking like this?

http://www.open-mpi.org/~emallove/svn/mtt/trunk/server/php/ 
reporter.php?
_start_test_timestamp=2006-10-01%2000:00:00%20through%202006-10-02% 
2021:27:2
7_agg_timestamp=- 
_phase=builds_success=Fail_platform_hardwar
e=All_os_name=All_os_version=All_mpi_name=All_mpi_name 
=off
ef_mpi_version=All_mpi_version=off_platform_id=All_platfor 
m_id=o
n_atom=*by_test_case=Table_test_build_section_name=off_t 
est_bu
ild_section_name=All_title=Details%20of%20Test%20Builds%20that 
%20faile

d&1-page=off_bookmarks_bookmarks

(hopefully that'll wrap ok and you can click on it...)

It's a development web page, so don't bookmark it, but it looks  
like Sun is
having the same problems you are (trunk).  The build outwardly  
succeeds --
no error messages are shown -- but MTT is reporting that it fails  
(click on

the "[I]" to see the stdout/stderr).

--
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems
___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users



Josh Hursey
jjhur...@open-mpi.org
http://www.open-mpi.org/



Re: [MTT users] Running the v1.1 nightly

2006-10-01 Thread Josh Hursey
So I dug into this about as much as I can, and found that the v1.1  
build seems to have completed successfully, but doesn't contain the  
'success = 1' line. I attached the build dumps in case that helps  
figure this one out.




ibm-build.tar.bz2
Description: Binary data


Any thoughts on why this might happen,
Josh

On Sep 29, 2006, at 3:40 PM, Josh Hursey wrote:


Has anyone been using MTT to test the v1.1 nightly?

I have been trying to run the [trivial,ibm] tests against
[trunk,v1.2,v1.1]. MTT will build all the sources and all the tests
with all the sources. It will then run the trivial tests against all
three sources, but only the ibm tests against the trunk and v1.2.

I looked at the logs produced and there didn't seem to be any errors
with the ibm+v1.1 test build. Would there be any other reason this
would happen?


Josh Hursey
jjhur...@open-mpi.org
http://www.open-mpi.org/

___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users



Josh Hursey
jjhur...@open-mpi.org
http://www.open-mpi.org/



[MTT users] Running the v1.1 nightly

2006-09-29 Thread Josh Hursey

Has anyone been using MTT to test the v1.1 nightly?

I have been trying to run the [trivial,ibm] tests against  
[trunk,v1.2,v1.1]. MTT will build all the sources and all the tests  
with all the sources. It will then run the trivial tests against all  
three sources, but only the ibm tests against the trunk and v1.2.


I looked at the logs produced and there didn't seem to be any errors  
with the ibm+v1.1 test build. Would there be any other reason this  
would happen?



Josh Hursey
jjhur...@open-mpi.org
http://www.open-mpi.org/



Re: [MTT users] Post run result submission

2006-09-26 Thread Josh Hursey
So the login node is the only one that has a window to the outside  
world. I can't access the outside world from within an allocation.


So my script does:
 - Login Node:
   1) Get MPI Tarballs
 - 1 Compute node:
   0) Allocate a compute node to compile.
   1) Build/Install MPI builds
   2) Deallocate compute node
 - Login Node:
   1) Get MPI Test sources
 - N Compute Nodes:
   0) Allocate N compute Nodes to run the tests on
   1) Build/Install Tests
   2) Run the tests...
 - Login Node:
   0) Check to make sure we are all done (scheduler didn't kill the  
job, etc.).

   1) Report results to MTT *

* This is what I am missing currently.

I currently have the "Reporter: IU Database" section commented out so  
that once the tests finish they don't try to post the database, since  
they can't see the outside world.


On Sep 26, 2006, at 3:17 PM, Ethan Mallove wrote:


On Tue, Sep/26/2006 02:01:41PM, Josh Hursey wrote:

I'm setting up MTT on BigRed at IU, and due to some visibility
requirements of the compute nodes I segment the MTT operations.
Currently I have a perl script that does all the svn and wget
interactions from the login node, then compiles and runs on the
compute nodes. This all seems to work fine.

Now I am wondering how to get the textfile results that were
generated back to the MTT database once the run has finished.



If you run the "MPI Install", "Test build", and "Test run"
sections from the same machine (call it the
"Install-Build-Run" node), I would think you could then
additionally run the "Reporter: IU Database" section. Or can
you not do the HTTP POST from the Install-Build-Run node?

-Ethan


I know HLRS deals with this situation; is there a supported way of
doing this yet, or is it still a future work item?

Currently I have a method to send a summary email to our team after
the results are generated, so this isn't a show stopper for IU or
anything, just something so we can share our results with the rest of
the team.

Cheers,
Josh
___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users




Re: [MTT users] How goes the MTT?

2006-09-19 Thread Josh Hursey
Things are working well at IU. We have nightly and weekly runs going  
smoothly on our Odin cluster.
Have not started using it on BigRed (our LoadLeveler scheduled  
environment), but hope to do that soonish.


Cheers,
Josh

On Sep 19, 2006, at 10:18 AM, Ethan Mallove wrote:


Folks,

Just checking in with everyone on how the new client is working out
for you. Any feature requests and/or bugs for the client and/or
reports? I see submissions from HLRS and IU, so I am assuming things
are working (to some degree). Do keep us apprised of any issues,
observations, complaints, etc.

Thanks!

-Ethan
___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users




Re: [MTT users] New stuff

2006-09-14 Thread Josh Hursey
After iterating a bit with Jeff, it seems that the error indicates
that I had a malformed ini file. I accidentally left a bit of the old
script in there when I updated. :[


After removing that and doing a sanity check for other bits, things
are working once again.


Thanks :)

Josh

On Sep 14, 2006, at 5:36 PM, Josh Hursey wrote:


Here you go:
[mpiteam@odin ~/mtt]$ ./client/mtt --mpi-get --mpi-install --scratch /
u/mpiteam/mtt-runs/Testing-09-14-2006-17-14-18 --file /u/mpiteam/
local/etc/ompi-iu-odin-core.ini --verbose --print-time --debug | tee
~/mtt.out
Debug is 1, Verbose is 1
*** MTT: ./client/mtt --mpi-get --mpi-install --scratch
/u/mpiteam/mtt-runs/Testing-09-14-2006-17-14-18 --file
/u/mpiteam/local/etc/ompi-iu-odin-core.ini --verbose --print-time
--debug
Scratch: /u/mpiteam/mtt-runs/Testing-09-14-2006-17-14-18
Scratch resolved:
/san/homedirs/mpiteam/mtt-runs/Testing-09-14-2006-17-14-18
Making dir:
/san/homedirs/mpiteam/mtt-runs/Testing-09-14-2006-17-14-18/
sources (cwd:
/san/homedirs/mpiteam/mtt-runs/Testing-09-14-2006-17-14-18)
Making dir:
/san/homedirs/mpiteam/mtt-runs/Testing-09-14-2006-17-14-18/ 
installs

(cwd: /san/homedirs/mpiteam/mtt-runs/Testing-09-14-2006-17-14-18)
Reading ini file: /u/mpiteam/local/etc/ompi-iu-odin-core.ini
*** WARNING: Could not read INI file:
 /u/mpiteam/local/etc/ompi-iu-odin-core.ini; skipping

[mpiteam@odin ~/mtt]$ cat ~/mtt.out
Debug is 1, Verbose is 1
*** MTT: ./client/mtt --mpi-get --mpi-install --scratch
/u/mpiteam/mtt-runs/Testing-09-14-2006-17-14-18 --file
/u/mpiteam/local/etc/ompi-iu-odin-core.ini --verbose --print-time
--debug
Scratch: /u/mpiteam/mtt-runs/Testing-09-14-2006-17-14-18
Scratch resolved:
/san/homedirs/mpiteam/mtt-runs/Testing-09-14-2006-17-14-18
Making dir:
/san/homedirs/mpiteam/mtt-runs/Testing-09-14-2006-17-14-18/
sources (cwd:
/san/homedirs/mpiteam/mtt-runs/Testing-09-14-2006-17-14-18)
Making dir:
/san/homedirs/mpiteam/mtt-runs/Testing-09-14-2006-17-14-18/ 
installs

(cwd: /san/homedirs/mpiteam/mtt-runs/Testing-09-14-2006-17-14-18)
Reading ini file: /u/mpiteam/local/etc/ompi-iu-odin-core.ini
*** WARNING: Could not read INI file:
 /u/mpiteam/local/etc/ompi-iu-odin-core.ini; skipping

[mpiteam@odin ~/mtt]$ ls -l ~/local/etc/ompi-iu-odin-core.ini
-rw-r-  1 mpiteam projects 13741 Sep 14 17:01 /u/mpiteam/local/
etc/ompi-iu-odin-core.ini

On Sep 14, 2006, at 5:33 PM, Ethan Mallove wrote:


On Thu, Sep/14/2006 05:20:23PM, Josh Hursey wrote:
Maybe I jumped the gun a bit, but I just updated and tried to run  
mtt

and get the following error message when I run:
Reading ini file: /u/mpiteam/local/etc/ompi-iu-odin-core.ini
*** WARNING: Could not read INI file:
 /u/mpiteam/local/etc/ompi-iu-odin-core.ini; skipping

The file exists and was working previously. Any thoughts on why this
might happen?



Never seen this one. I think I need more details. Could you do:

$ client/mtt -f file.ini | tee mtt.out
$ cat mtt.out
$ ls -l file.ini

I assume the mtt.out is very short if it's dying while trying to
read the ini.

Thanks!

-Ethan



Cheers,
Josh

On Sep 14, 2006, at 2:53 PM, Jeff Squyres wrote:


Howdy MTT users!

We have a bunch of important updates for you, including some that
*REQURE*
action tomorrow morning (15 Sep 2006: update your client and update
your INI
files).  Please go read the full text of the announcement here:

http://svn.open-mpi.org/trac/mtt/wiki/News-14-Sep-2006

As usual, please let us know if you have any questions, comments,
feedback,
etc.

Thanks!

--
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems
___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users



Josh Hursey
jjhur...@open-mpi.org
http://www.open-mpi.org/

___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users



Josh Hursey
jjhur...@open-mpi.org
http://www.open-mpi.org/

___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users



Josh Hursey
jjhur...@open-mpi.org
http://www.open-mpi.org/



Re: [MTT users] Tests timing out

2006-08-30 Thread Josh Hursey
This fixes the hanging and gets me running (and passing) some/most of  
the tests [Trivial and ibm]. Yay!


I have a 16 processor job running on Odin at the moment that seems to  
be going well so far.


Thanks for your help.

Want me to file a bug about the tcsh problem below?

-- Josh

On Aug 30, 2006, at 2:30 PM, Jeff Squyres wrote:


Bah!

This is the result of perl expanding $? to 0 -- it seems that I need to
escape $? so that it's not output as 0.

Sorry about that!
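
A tiny standalone Perl sketch of the effect (not MTT's code; the
install path below is just a stand-in, and the leading system() call
only ensures $? holds a real zero status, as it would inside MTT):

use strict;
use warnings;

system($^X, '-e', 'exit 0');          # run a child so $? is a (zero) status
my $libdir = '/san//install/lib';     # stand-in for the real install lib dir

# Unescaped: Perl interpolates $? (the last child's exit status, 0 here),
# so the generated tcsh test comes out as "0LD_LIBRARY_PATH == 0".
print <<END;
if ($?LD_LIBRARY_PATH == 0) then
  setenv LD_LIBRARY_PATH $libdir
endif
END

# Escaped: the literal text $?LD_LIBRARY_PATH survives into the tcsh file.
print <<END;
if (\$?LD_LIBRARY_PATH == 0) then
  setenv LD_LIBRARY_PATH $libdir
endif
END

The first print reproduces the broken "0LD_LIBRARY_PATH" seen in the
generated file above; the second is what the sourcing file needs.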

So is this just for the sourcing files, or for your overall (hanging)
problems?


On 8/30/06 2:28 PM, "Josh Hursey" <jjhur...@open-mpi.org> wrote:


So here are the results of my exploration. I have things running now.
The problem was that the user that I am running under does not set
the LD_LIBRARY_PATH variable at any point. So when MTT tries to
export the variable it does:
if (0LD_LIBRARY_PATH == 0) then
 setenv LD_LIBRARY_PATH /san//install/lib
else
 setenv LD_LIBRARY_PATH /san//install/lib:
$LD_LIBRARY_PATH
endif

So this causes tcsh to emit the error that LD_LIBRARY_PATH is not
defined. So it is not set, due to the error.

I fixed this by always setting it to "" in the .cshrc file. However,
MTT could do a sanity check to see whether the variable is defined
before trying to use it. Something like:


if (! $?LD_LIBRARY_PATH) then
   setenv LD_LIBRARY_PATH ""
endif

if ($?LD_LIBRARY_PATH == 0) then
 setenv LD_LIBRARY_PATH /san//install/lib
else
 setenv LD_LIBRARY_PATH /san//install/lib:$LD_LIBRARY_PATH
endif


or something of the sort.

As another note, could we start a "How to debug MTT" Wiki page with
some of the information that Jeff sent in this message regarding the
dumping of env vars? I think that would be helpful when getting
things started.

Thanks for all your help, I'm sure I'll have more questions in the
near future.

Cheers,
Josh


On Aug 30, 2006, at 12:31 PM, Jeff Squyres wrote:


On 8/30/06 12:10 PM, "Josh Hursey" <jjhur...@open-mpi.org> wrote:

MTT directly sets environment variables in its own environment  
(via

$ENV{whatever} = "foo") before using fork/exec to launch compiles
and runs.
Hence, the forked children inherit the environment variables that
we set
(E.g., PATH and LD_LIBRARY_PATH).

So if you source the env vars files that MTT drops, that should be
sufficient.


Does it drop them to a file, or is it printed in the debugging  
output

anywhere? I'm having a bit of trouble finding these strings in the
output.


It does not put these in the -debug output.

The files that it drops are in the scratch dir.  You'll need to go
into
scratch/installs, and then it depends on what your INI file section
names
are.  You'll go to:

/installs

And there should be files named "mpi_installed_vars.[csh|sh]" that
you can
source, depending on your shell.  It should set PATH and
LD_LIBRARY_PATH.

The intent of these files is for exactly this purpose -- for a
human to test
borked MPI installs inside the MTT scratch tree.



As for setting the values on *remote* nodes, we do it solely via the
--prefix option.  I wonder if --prefix is broken under SLURM...?  That might
be something to check -- you might be inadvertently mixing installations of
OMPI...?


Yep I'll check it out.

Cheers,
Josh




On 8/30/06 10:36 AM, "Josh Hursey" <jjhur...@open-mpi.org> wrote:


I'm trying to replicate the MTT environment as much as possible,
and
have a couple of questions.

Assume there is no mpirun in my PATH/LD_LIBRARY_PATH when I start
MTT. After MTT builds Open MPI, how does it export these
variables so
that it can build the tests? How does it export these when it  
runs

those tests (solely via --prefix)?

Cheers,
josh

On Aug 30, 2006, at 10:25 AM, Josh Hursey wrote:

I already tried that. However I'm trying it in a couple  
different

ways and getting some mixed results. Let me formulate the error
cases
and get back to you.

Cheers,
Josh

On Aug 30, 2006, at 10:17 AM, Ralph H Castain wrote:

Well, why don't you try first separating this from MTT? Just  
run

the command
manually in batch mode and see if it works. If that works,
then the
problem
is with MTT. Otherwise, we have a problem with notification.

Or are you saying that you have already done this?
Ralph


On 8/30/06 8:03 AM, "Josh Hursey" <jjhur...@open-mpi.org>  
wrote:



yet another point (sorry for the spam). This may not be an MTT
issue
but a broken ORTE on the trunk :(

When I try to run in an allocation (srun -N 16 -A) things run
fine.
But if I try to run in batch mode (srun -N 16 -b myscript.sh)
then I
see the same hang as in MTT. seems that mpirun is not getting
properly notified of the completion of the job. :(

I'll try to investigate a bit further today. Any thoughts on
what
might be causing this?

Cheers,
Josh

On Aug 30, 2006, at 9:54 AM, Josh Hursey wrote:


forgot this bit in my mail. With the mpirun just hanging out
there I
at

Re: [MTT users] Tests timing out

2006-08-30 Thread Josh Hursey


On Aug 30, 2006, at 11:36 AM, Jeff Squyres wrote:


(sorry -- been afk much of this morning)

MTT directly sets environment variables in its own environment (via
$ENV{whatever} = "foo") before using fork/exec to launch compiles and runs.
Hence, the forked children inherit the environment variables that we set
(e.g., PATH and LD_LIBRARY_PATH).
(E.g., PATH and LD_LIBRARY_PATH).

So if you source the env vars files that MTT drops, that should be
sufficient.
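
A simplified illustration of that pattern (not the real MTT code; the
install prefix below is just a placeholder):

use strict;
use warnings;

# Placeholder prefix; MTT derives the real one from its scratch tree.
my $prefix = '/path/to/scratch/installs/some-mpi/install';

# Set the variables in our own environment...
$ENV{PATH} = "$prefix/bin:$ENV{PATH}";
$ENV{LD_LIBRARY_PATH} = defined $ENV{LD_LIBRARY_PATH}
    ? "$prefix/lib:$ENV{LD_LIBRARY_PATH}"
    : "$prefix/lib";

# ...and anything we fork/exec afterwards (compiles, mpirun) inherits them.
system('sh', '-c', 'command -v mpicc') == 0
    or warn "mpicc not found in the adjusted PATH\n";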


Does it drop them to a file, or is it printed in the debugging output  
anywhere? I'm having a bit of trouble finding these strings in the  
output.




As for setting the values on *remote* nodes, we do it solely via the
--prefix option.  I wonder if --prefix is broken under SLURM...?  That might
be something to check -- you might be inadvertently mixing installations of
OMPI...?
OMPI...?


Yep I'll check it out.

Cheers,
Josh




On 8/30/06 10:36 AM, "Josh Hursey" <jjhur...@open-mpi.org> wrote:


I'm trying to replicate the MTT environment as much as possible, and
have a couple of questions.

Assume there is no mpirun in my PATH/LD_LIBRARY_PATH when I start
MTT. After MTT builds Open MPI, how does it export these variables so
that it can build the tests? How does it export these when it runs
those tests (solely via --prefix)?

Cheers,
josh

On Aug 30, 2006, at 10:25 AM, Josh Hursey wrote:


I already tried that. However I'm trying it in a couple different
ways and getting some mixed results. Let me formulate the error  
cases

and get back to you.

Cheers,
Josh

On Aug 30, 2006, at 10:17 AM, Ralph H Castain wrote:


Well, why don't you try first separating this from MTT? Just run
the command
manually in batch mode and see if it works. If that works, then the
problem
is with MTT. Otherwise, we have a problem with notification.

Or are you saying that you have already done this?
Ralph


On 8/30/06 8:03 AM, "Josh Hursey" <jjhur...@open-mpi.org> wrote:

yet another point (sorry for the spam). This may not be an MTT  
issue

but a broken ORTE on the trunk :(

When I try to run in an allocation (srun -N 16 -A) things run fine.
But if I try to run in batch mode (srun -N 16 -b myscript.sh)  
then I

see the same hang as in MTT. seems that mpirun is not getting
properly notified of the completion of the job. :(

I'll try to investigate a bit further today. Any thoughts on what
might be causing this?

Cheers,
Josh

On Aug 30, 2006, at 9:54 AM, Josh Hursey wrote:


forgot this bit in my mail. With the mpirun just hanging out
there I
attached GDB and got the following stack trace:
(gdb) bt
#0  0x003d1b9bd1af in poll () from /lib64/tls/libc.so.6
#1  0x002a956e6389 in opal_poll_dispatch (base=0x5136d0,
arg=0x513730, tv=0x7fbfffee70) at poll.c:191
#2  0x002a956e28b6 in opal_event_base_loop (base=0x5136d0,
flags=5) at event.c:584
#3  0x002a956e26b7 in opal_event_loop (flags=5) at event.c: 
514

#4  0x002a956db7c7 in opal_progress () at runtime/
opal_progress.c:
259
#5  0x0040334c in opal_condition_wait (c=0x509650,
m=0x509600) at ../../../opal/threads/condition.h:81
#6  0x00402f52 in orterun (argc=9, argv=0x7fb0b8) at
orterun.c:444
#7  0x004028a3 in main (argc=9, argv=0x7fb0b8) at
main.c:13

Seems that mpirun is waiting for things to complete :/

On Aug 30, 2006, at 9:53 AM, Josh Hursey wrote:



On Aug 30, 2006, at 7:19 AM, Jeff Squyres wrote:

On 8/29/06 8:57 PM, "Josh Hursey" <jjhur...@open-mpi.org>  
wrote:


Does this apply to *all* tests, or only some of the tests  
(like

allgather)?


All of the tests: Trivial and ibm. They all time out :(


Blah.  The trivial tests are simply "hello world", so they  
should

take just
about no time at all.

Is this running under SLURM?  I put the code in there to  
know how

many procs
to use in SLURM but have not tested it in eons.  I doubt that's
the
problem,
but that's one thing to check.



Yep it is in SLURM. and it seems that the 'number of procs'
code is
working fine as it changes with different allocations.

Can you set a super-long timeout (e.g., a few minutes), and  
while

one of the
trivial tests is running, do some ps's on the relevant nodes  
and

see what,
if anything, is running?  E.g., mpirun, the test executable on
the
nodes,
etc.


Without setting a long timeout. It seems that mpirun is running,
but
nothing else and only on the launching node.

When a test starts you see the mpirun launching properly:
$ ps aux | grep ...
USER   PID %CPU %MEM   VSZ  RSS TTY  STAT START   TIME
COMMAND
mpiteam  15117  0.5  0.8 113024 33680 ?  S09:32   0:06
perl ./
client/mtt --debug --scratch /u/mpiteam/tmp/mtt-scratch -- 
file /u/

mpiteam/local/etc/ompi-iu-odin-core.ini --verbose --print-time
mpiteam  15294  0.0  0.0 00 ?Z09:32   0:00
[sh]

mpiteam  28453  0.2  0.0 38072 3536 ?S09:50   0:00
mpirun
-mca btl tcp,self -np 32 --prefix /san/homedirs/mpiteam/tmp/mtt-
scratch/installs/omp

Re: [MTT users] Tests timing out

2006-08-30 Thread Josh Hursey
I'm trying to replicate the MTT environment as much as possible, and  
have a couple of questions.


Assume there is no mpirun in my PATH/LD_LIBRARY_PATH when I start  
MTT. After MTT builds Open MPI, how does it export these variables so  
that it can build the tests? How does it export these when it runs  
those tests (solely via --prefix)?


Cheers,
josh

On Aug 30, 2006, at 10:25 AM, Josh Hursey wrote:


I already tried that. However I'm trying it in a couple different
ways and getting some mixed results. Let me formulate the error cases
and get back to you.

Cheers,
Josh

On Aug 30, 2006, at 10:17 AM, Ralph H Castain wrote:


Well, why don't you try first separating this from MTT? Just run
the command
manually in batch mode and see if it works. If that works, then the
problem
is with MTT. Otherwise, we have a problem with notification.

Or are you saying that you have already done this?
Ralph


On 8/30/06 8:03 AM, "Josh Hursey" <jjhur...@open-mpi.org> wrote:


yet another point (sorry for the spam). This may not be an MTT issue
but a broken ORTE on the trunk :(

When I try to run in an allocation (srun -N 16 -A) things run fine.
But if I try to run in batch mode (srun -N 16 -b myscript.sh) then I
see the same hang as in MTT. seems that mpirun is not getting
properly notified of the completion of the job. :(

I'll try to investigate a bit further today. Any thoughts on what
might be causing this?

Cheers,
Josh

On Aug 30, 2006, at 9:54 AM, Josh Hursey wrote:

forgot this bit in my mail. With the mpirun just hanging out  
there I

attached GDB and got the following stack trace:
(gdb) bt
#0  0x003d1b9bd1af in poll () from /lib64/tls/libc.so.6
#1  0x002a956e6389 in opal_poll_dispatch (base=0x5136d0,
arg=0x513730, tv=0x7fbfffee70) at poll.c:191
#2  0x002a956e28b6 in opal_event_base_loop (base=0x5136d0,
flags=5) at event.c:584
#3  0x002a956e26b7 in opal_event_loop (flags=5) at event.c:514
#4  0x002a956db7c7 in opal_progress () at runtime/
opal_progress.c:
259
#5  0x0040334c in opal_condition_wait (c=0x509650,
m=0x509600) at ../../../opal/threads/condition.h:81
#6  0x00402f52 in orterun (argc=9, argv=0x7fb0b8) at
orterun.c:444
#7  0x004028a3 in main (argc=9, argv=0x7fb0b8) at
main.c:13

Seems that mpirun is waiting for things to complete :/

On Aug 30, 2006, at 9:53 AM, Josh Hursey wrote:



On Aug 30, 2006, at 7:19 AM, Jeff Squyres wrote:


On 8/29/06 8:57 PM, "Josh Hursey" <jjhur...@open-mpi.org> wrote:


Does this apply to *all* tests, or only some of the tests (like
allgather)?


All of the tests: Trivial and ibm. They all time out :(


Blah.  The trivial tests are simply "hello world", so they should
take just
about no time at all.

Is this running under SLURM?  I put the code in there to know how
many procs
to use in SLURM but have not tested it in eons.  I doubt that's
the
problem,
but that's one thing to check.



Yep it is in SLURM, and it seems that the 'number of procs'
code is

working fine as it changes with different allocations.


Can you set a super-long timeout (e.g., a few minutes), and while
one of the
trivial tests is running, do some ps's on the relevant nodes and
see what,
if anything, is running?  E.g., mpirun, the test executable on  
the

nodes,
etc.


Without setting a long timeout. It seems that mpirun is running,
but
nothing else and only on the launching node.

When a test starts you see the mpirun launching properly:
$ ps aux | grep ...
USER   PID %CPU %MEM   VSZ  RSS TTY  STAT START   TIME
COMMAND
mpiteam  15117  0.5  0.8 113024 33680 ?  S09:32   0:06
perl ./
client/mtt --debug --scratch /u/mpiteam/tmp/mtt-scratch --file /u/
mpiteam/local/etc/ompi-iu-odin-core.ini --verbose --print-time
mpiteam  15294  0.0  0.0 00 ?Z09:32   0:00  
[sh]


mpiteam  28453  0.2  0.0 38072 3536 ?S09:50   0:00
mpirun
-mca btl tcp,self -np 32 --prefix /san/homedirs/mpiteam/tmp/mtt-
scratch/installs/ompi-nightly-trunk/odin_gcc_warnings/1.3a1r11497/
install collective/allgather_in_place
mpiteam  28454  0.0  0.0 41716 2040 ?Sl   09:50   0:00
srun --
nodes=16 --ntasks=16 --
nodelist=odin022,odin021,odin020,odin019,odin018,odin017,odin016,o 
d

in
0
15
,odin014,odin013,odin012,odin011,odin010,odin009,odin008,odin007
orted --no-daemonize --bootproxy 1 --ns-nds slurm --name 0.0.1 --
num_procs 16 --vpid_start 0 --universe
mpit...@odin007.cs.indiana.edu:default-universe-28453 --nsreplica
"0.0.0;tcp://129.79.240.107:40904" --gprreplica "0.0.0;tcp://
129.79.240.107:40904"
mpiteam  28455  0.0  0.0 23212 1804 ?Ssl  09:50   0:00
srun --
nodes=16 --ntasks=16 --
nodelist=odin022,odin021,odin020,odin019,odin018,odin017,odin016,o 
d

in
0
15
,odin014,odin013,odin012,odin011,odin010,odin009,odin008,odin007
orted --no-daemonize --bootproxy 1 --ns-nds slurm --name 0.0.1 --
num_procs 16 --vpid_start 0 --universe
mpit...@odin007.cs.indiana.edu:

Re: [MTT users] Tests timing out

2006-08-29 Thread Josh Hursey


On Aug 29, 2006, at 6:57 PM, Jeff Squyres wrote:


On 8/29/06 1:55 PM, "Josh Hursey" <jjhur...@open-mpi.org> wrote:


So I'm having trouble getting tests to complete without timing out in
MTT. It seems that the tests time out and hang in MTT, but complete
normally outside of MTT.


Does this apply to *all* tests, or only some of the tests (like  
allgather)?


All of the tests: Trivial and ibm. They all time out :(




Here are some details:
Build:
   Open MPI Trunk (1.3a1r11481)

Tests:
   Trivial
   ibm

BTL:
   tcp
   self

Nodes/processes:
   16 nodes (32 processors) on the Odin Cluster at IU


In MTT all of the tests time out:

Running command: mpirun  -mca btl tcp,self -np 32 --prefix
/san/homedirs/mpiteam/tmp/mtt-scratch/installs/ompi-nightly- 
trunk/

odin_g
cc_warnings/1.3a1r11481/install collective/allgather
Timeout: 1 - 1156872348 (vs. now: 1156872028)
Past timeout! 1156872348 < 1156872349
Past timeout! 1156872348 < 1156872349

[snipped]

&or: returning 0
String now: 0
*** WARNING: Test: allgather, np=32, variant=1: TIMED OUT (failed)


Outside of MTT using the same build the test runs and completes
normally:
  $ cd ~/tmp/mtt-scratch/installs/ompi-nightly-trunk/
odin_gcc_warnings/1.3a1r11481/tests/ibm/ibm/
  $ mpirun -mca btl tcp,self -np 32 --prefix /san/homedirs/mpiteam/
tmp/mtt-scratch/installs/ompi-nightly-trunk/odin_gcc_warnings/
1.3a1r11481/install collective/allgather


Where is mpirun in your path?

MTT actually drops sourceable files in the top-level install dir  
(i.e., the

1.3a1r11481) that you can source in your shell and set the
PATH/LD_LIBRARY_PATH for that install.  Can you source it and try  
to run

again?


Yep I exported the PATH/LD_LIBRARY_PATH to the one cited in the
--prefix argument before running manually.





How long does it take to run manually -- just a few seconds, or a  
long time

(that could potentially timeout)?


Just a few seconds (say 5 or so).



--
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems



Josh Hursey
jjhur...@open-mpi.org
http://www.open-mpi.org/



[MTT users] Tests timing out

2006-08-29 Thread Josh Hursey

Hey all,

So I'm having trouble getting tests to complete without timing out in  
MTT. It seems that the tests time out and hang in MTT, but complete
normally outside of MTT.


Here are some details:
Build:
  Open MPI Trunk (1.3a1r11481)

Tests:
  Trivial
  ibm

BTL:
  tcp
  self

Nodes/processes:
  16 nodes (32 processors) on the Odin Cluster at IU


In MTT all of the tests time out:

Running command: mpirun  -mca btl tcp,self -np 32 --prefix
   /san/homedirs/mpiteam/tmp/mtt-scratch/installs/ompi-nightly-trunk/ 
odin_g

   cc_warnings/1.3a1r11481/install collective/allgather
Timeout: 1 - 1156872348 (vs. now: 1156872028)
Past timeout! 1156872348 < 1156872349
Past timeout! 1156872348 < 1156872349
Command complete, exit status: 72057594037927935
Evaluating: &or(&eq(&test_exit_status(), 0), &eq(&test_exit_status(), 77))

Got name: test_exit_status
Got args:
_do: $ret = MTT::Values::Functions::test_exit_status()
&test_exit_status returning: 72057594037927935
String now: &or(&eq(72057594037927935, 0), &eq(&test_exit_status(), 77))
Got name: eq
Got args: 72057594037927935, 0
_do: $ret = MTT::Values::Functions::eq(72057594037927935, 0)
&eq got: 72057594037927935 0
&eq: returning 0
String now: &or(0, &eq(&test_exit_status(), 77))
Got name: test_exit_status
Got args:
_do: $ret = MTT::Values::Functions::test_exit_status()
&test_exit_status returning: 72057594037927935
String now: &or(0, &eq(72057594037927935, 77))
Got name: eq
Got args: 72057594037927935, 77
_do: $ret = MTT::Values::Functions::eq(72057594037927935, 77)
&eq got: 72057594037927935 77
&eq: returning 0
String now: &or(0, 0)
Got name: or
Got args: 0, 0
_do: $ret = MTT::Values::Functions::or(0, 0)
&or got: 0 0
&or: returning 0
String now: 0
*** WARNING: Test: allgather, np=32, variant=1: TIMED OUT (failed)
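
For reference, the expression traced above boils down to something
like this standalone Perl check (not MTT code); the captured status is
0xffffffffffffff (2^56 - 1), so neither comparison can succeed:

use strict;
use warnings;

my $exit_status = 72057594037927935;   # the value reported above
my $pass = ($exit_status == 0) || ($exit_status == 77);
printf "status=0x%x  pass=%d\n", $exit_status, $pass;   # pass=0 -> marked failed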


Outside of MTT using the same build the test runs and completes  
normally:
 $ cd ~/tmp/mtt-scratch/installs/ompi-nightly-trunk/ 
odin_gcc_warnings/1.3a1r11481/tests/ibm/ibm/
 $ mpirun -mca btl tcp,self -np 32 --prefix /san/homedirs/mpiteam/ 
tmp/mtt-scratch/installs/ompi-nightly-trunk/odin_gcc_warnings/ 
1.3a1r11481/install collective/allgather

 $

Any thoughts on why this might be happening in MTT but not outside of  
it?


Cheers,
Josh