[MTT devel] MTT queries... problems

2012-05-30 Thread Eugene Loh

I seem to get unreliable results from MTT queries.

To reproduce:
- go to http://www.open-mpi.org/mtt
- click on "Test run"
- for "Date range:" enter "2012-03-23 00:30:00 - 2012-03-23 23:55:00"
- for "Org:" enter "oracle"
- for "Platform name:" enter "t2k-0"
- for "Suite:" enter "ibm-32"
- click on "Summary"
- click on "Detail"

The summary indicates there are 193 passes and 6 skips.  The detail 
shows 199 results distributed over two pages, 1-100 on one page and 
101-199 on the next.  The total (199=193+6) is correct, but I think the 
second page is suspect.  It includes "Date range" output (which is nice, 
but I didn't ask for it, and I think it is a symptom of what's going wrong 
here).  That second page also repeats some results from the first page 
(e.g., "00_create" and "00_create_cxx").  Because the total (199) is 
correct and there are repeats, other results must be missing 
entirely.  Another indication of a problem is that there are 
only three skipped tests in the "detail" view but six in the summary.  
(I believe the summary is the correct one.)


Before I click on "Detail", I can go to "Preferences" and set the number 
of rows per page to be 200.  Doing so and then clicking on "Detail", the 
repeats (00_create, 00_create_cxx, and others) disappear and the number 
of Skipped tests is correct.


So, the problem seems to be with distributing results over multiple pages.
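
The symptom described above can be checked mechanically. Here is a minimal sketch, assuming the test names from each "Detail" page have been copied by hand into Python lists and that each name should appear only once in the query. The helper names and the page contents are hypothetical (only "00_create" and "00_create_cxx" come from the report); none of this is part of MTT.

```python
def cross_page_repeats(pages):
    """Map each name that appears on more than one page to its page numbers.

    `pages` is a list of lists of test names, one inner list per page.
    """
    seen = {}
    for page_no, names in enumerate(pages, start=1):
        for name in set(names):  # count a name once per page
            seen.setdefault(name, []).append(page_no)
    return {name: nums for name, nums in seen.items() if len(nums) > 1}


def rows_and_distinct(pages):
    """Return (rows displayed, distinct names) over all pages."""
    all_names = [name for names in pages for name in names]
    return len(all_names), len(set(all_names))


# Placeholder page contents: only "00_create" and "00_create_cxx" come from
# the report above; "test_a" and "test_b" are invented for illustration.
page_1 = ["00_create", "00_create_cxx", "test_a"]
page_2 = ["00_create", "00_create_cxx", "test_b"]

repeats = cross_page_repeats([page_1, page_2])
rows, distinct = rows_and_distinct([page_1, page_2])
print(sorted(repeats))   # names shown on both pages
print(rows - distinct)   # rows taken up by repeats
```

If the row total is fixed by the summary (199 here), every row taken up by a repeat corresponds to a result that is not shown at all.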


Re: [MTT devel] duplicate results

2012-02-24 Thread Eugene Loh
This is recycled e-mail from about 1.5 months ago.  I observed this 
problem again.  That is, if one queries the MTT database, certain 
results are reported twice.


The date range in question is 2012/02/23 from about 08:48 to 09:02.  The 
submitting system is once again ^burl-ct-v20z-2$.  The problem is once 
again with v1.5 testing and with intel-64 tests.  On the client side, 
the log seems to indicate that each result is submitted once.  If I 
query the database, however, it shows a number of results reported 
twice.  These incidents are consecutive -- that is, the behavior starts 
at some time and ends at another.


Even if no one has time to figure this out, I figured I'd report this 
for the record books.


On 1/9/2012 11:13 AM, Josh Hursey wrote:
Well if the debug results seem correct then there must be some bug in 
the submission script. :/ It is a pretty old piece of code, so it is 
possible that something is going awry in there.


Let us know if you notice further problems like this. I won't have 
time to look into them in the near term, but I'll try to put it on the 
short list to get to when I get free cycles. If you happen to come 
across a repeater scenario (not likely since this seems like something 
difficult to reproduce) that would help the debugging effort.


On Fri, Jan 6, 2012 at 2:07 PM, Eugene Loh <eugene@oracle.com> wrote:


On 01/06/12 08:52, Josh Hursey wrote:

Weird. I don't know what is going on here, unless the client is
somehow submitting some of the results too many times. One thing
to check is the debug output file that the MTT client is
submitting to the server. Check that for duplicates.

Sorry, I don't understand where to check.  I do know that if I
look at the output from the MTT client, I see a bunch of messages
like this:

>> Reported to MTTDatabase client: 1 successful submit, 0 failed
submits (total of 6 results)

If I add up those numbers of results submitted, the totals match
what I would expect.  So, there is some indication that the number
of client submissions is right.


That will help determine whether this is a server side problem or
client side problem. I have not noticed this behavior on the
server before,

I haven't either, but I only just started looking more closely at
results.  Mostly, in any case, things look fine.


but might be something with the submit.php script - just a guess
though at this point.

Unfortunately I have zero time to spend on MTT for a few weeks at
least. :/

On Thu, Jan 5, 2012 at 8:11 PM, Eugene Loh <eugene@oracle.com> wrote:

I go to MTT and I choose:

Test run
Date range: 2012-01-05 05:00:00 - 2012-01-05 12:00:00
Org: Oracle
Platform name: $burl-ct-v20z-2$
Suite: intel-64

and I get:

1 oracle burl-ct-v20z-2 i86pc SunOS ompi-nightly-trunk
1.7a1r25692 intel-64 4 870 0 86 0 0
2 oracle burl-ct-v20z-2 i86pc SunOS ompi-nightly-v1.5
1.5.5rc2r25683 intel-64 4 915 0 92 0 0

I get more tests (passing and skipped) with v1.5 than I do
with the trunk run.  I have lots of ways of judging what the
numbers should be. The "trunk" numbers are right.  The "v1.5"
numbers are too high;  they should be the same as the trunk
numbers.

I can see the explanation by clicking on "Detail" and looking
at individual runs.  (To get time stamps, I add a " | by
minute" qualifier before clicking on "Detail".  Maybe there's
a more proper way of getting time stamps, but that seems to
work for me.)  Starting with record 890 and ending with 991,
records are repeated.  That is, 890 and 891 have identical
command lines, time stamps, output, etc.  One of them is a
duplicate.  Same with 892 and 893, then for 894 and 895, then
896 and 897, and so on.  So, for about a one-hour period, the
records sent in by this test run appear duplicated when I
submit queries to the database. These 51 duplicates are the
45 extra passes and 6 extra skips seen in the results above.

Can someone figure out what's going wrong here?  Clearly, I'd
like to be able to rely on query results.



Re: [MTT devel] duplicate results

2012-01-06 Thread Eugene Loh

On 1/6/2012 5:52 AM, Josh Hursey wrote:
Weird. I don't know what is going on here, unless the client is 
somehow submitting some of the results too many times. One thing to 
check is the debug output file that the MTT client is submitting to 
the server. Check that for duplicates.


That will help determine whether this is a server side problem or 
client side problem. I have not noticed this behavior on the server 
before, but might be something with the submit.php script - just a 
guess though at this point.


Unfortunately I have zero time to spend on MTT for a few weeks at 
least. :/


-- Josh

On Thu, Jan 5, 2012 at 8:11 PM, Eugene Loh <eugene@oracle.com> wrote:


I go to MTT and I choose:

Test run
Date range: 2012-01-05 05:00:00 - 2012-01-05 12:00:00
Org: Oracle
Platform name: $burl-ct-v20z-2$
Suite: intel-64

and I get:

1 oracle burl-ct-v20z-2 i86pc SunOS ompi-nightly-trunk 1.7a1r25692
intel-64 4 870 0 86 0 0
2 oracle burl-ct-v20z-2 i86pc SunOS ompi-nightly-v1.5
1.5.5rc2r25683 intel-64 4 915 0 92 0 0

I get more tests (passing and skipped) with v1.5 than I do with
the trunk run.  I have lots of ways of judging what the numbers
should be. The "trunk" numbers are right.  The "v1.5" numbers are
too high;  they should be the same as the trunk numbers.

I can see the explanation by clicking on "Detail" and looking at
individual runs.  (To get time stamps, I add a " | by minute"
qualifier before clicking on "Detail".  Maybe there's a more
proper way of getting time stamps, but that seems to work for me.)
 Starting with record 890 and ending with 991, records are
repeated.  That is, 890 and 891 have identical command lines, time
stamps, output, etc.  One of them is a duplicate.  Same with 892
and 893, then for 894 and 895, then 896 and 897, and so on.  So,
for about a one-hour period, the records sent in by this test run
appear duplicated when I submit queries to the database. These 51
duplicates are the 45 extra passes and 6 extra skips seen in the
results above.

Can someone figure out what's going wrong here?  Clearly, I'd like
to be able to rely on query results.




--
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey


Re: [MTT devel] duplicate results

2012-01-06 Thread Eugene Loh

On 01/06/12 08:52, Josh Hursey wrote:
Weird. I don't know what is going on here, unless the client is 
somehow submitting some of the results too many times. One thing to 
check is the debug output file that the MTT client is submitting to 
the server. Check that for duplicates.

Sorry, I don't understand where to check.  I do know that if I look at 
the output from the MTT client, I see a bunch of messages like this:


>> Reported to MTTDatabase client: 1 successful submit, 0 failed 
submits (total of 6 results)


If I add up those numbers of results submitted, the totals match what I 
would expect.  So, there is some indication that the number of client 
submissions is right.
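
The tally described above could be automated along these lines. This is only a sketch: the regular expression is modeled on the single message quoted above, the handling of singular/plural wording and of line wrapping is an assumption, and `tally_submissions` is a hypothetical helper, not part of the MTT client.

```python
import re

# Pattern modeled on the client message quoted above. Whitespace is matched
# loosely because the message may be wrapped across lines; accepting both
# "submit" and "submits" (and "result"/"results") is an assumption.
REPORT_RE = re.compile(
    r"Reported to MTTDatabase client:\s*(\d+)\s+successful\s+submits?,"
    r"\s*(\d+)\s+failed\s+submits?\s*\(total\s+of\s+(\d+)\s+results?\)"
)


def tally_submissions(log_text):
    """Sum successful submits, failed submits, and results over all messages."""
    ok = failed = results = 0
    for match in REPORT_RE.finditer(log_text):
        ok += int(match.group(1))
        failed += int(match.group(2))
        results += int(match.group(3))
    return ok, failed, results


# The first message is the one quoted above (including its line wrap);
# the second is invented to show the summing.
sample_log = (
    ">> Reported to MTTDatabase client: 1 successful submit, 0 failed \n"
    "submits (total of 6 results)\n"
    ">> Reported to MTTDatabase client: 1 successful submit, 0 failed "
    "submits (total of 4 results)\n"
)
totals = tally_submissions(sample_log)
print(totals)
```

Comparing the last number of the tuple with the number of results the database reports for the same run would show whether the extra records appear before or after the client-side report.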

That will help determine whether this is a server side problem or 
client side problem. I have not noticed this behavior on the server 
before,

I haven't either, but I only just started looking more closely at 
results.  Mostly, in any case, things look fine.

but might be something with the submit.php script - just a guess 
though at this point.


Unfortunately I have zero time to spend on MTT for a few weeks at 
least. :/


-- Josh

On Thu, Jan 5, 2012 at 8:11 PM, Eugene Loh <eugene@oracle.com> wrote:


I go to MTT and I choose:

Test run
Date range: 2012-01-05 05:00:00 - 2012-01-05 12:00:00
Org: Oracle
Platform name: $burl-ct-v20z-2$
Suite: intel-64

and I get:

1 oracle burl-ct-v20z-2 i86pc SunOS ompi-nightly-trunk 1.7a1r25692
intel-64 4 870 0 86 0 0
2 oracle burl-ct-v20z-2 i86pc SunOS ompi-nightly-v1.5
1.5.5rc2r25683 intel-64 4 915 0 92 0 0

I get more tests (passing and skipped) with v1.5 than I do with
the trunk run.  I have lots of ways of judging what the numbers
should be. The "trunk" numbers are right.  The "v1.5" numbers are
too high;  they should be the same as the trunk numbers.

I can see the explanation by clicking on "Detail" and looking at
individual runs.  (To get time stamps, I add a " | by minute"
qualifier before clicking on "Detail".  Maybe there's a more
proper way of getting time stamps, but that seems to work for me.)
 Starting with record 890 and ending with 991, records are
repeated.  That is, 890 and 891 have identical command lines, time
stamps, output, etc.  One of them is a duplicate.  Same with 892
and 893, then for 894 and 895, then 896 and 897, and so on.  So,
for about a one-hour period, the records sent in by this test run
appear duplicated when I submit queries to the database. These 51
duplicates are the 45 extra passes and 6 extra skips seen in the
results above.

Can someone figure out what's going wrong here?  Clearly, I'd like
to be able to rely on query results.



[MTT devel] duplicate results

2012-01-05 Thread Eugene Loh

I go to MTT and I choose:

Test run
Date range: 2012-01-05 05:00:00 - 2012-01-05 12:00:00
Org: Oracle
Platform name: $burl-ct-v20z-2$
Suite: intel-64

and I get:

1 oracle burl-ct-v20z-2 i86pc SunOS ompi-nightly-trunk 1.7a1r25692 
intel-64 4 870 0 86 0 0
2 oracle burl-ct-v20z-2 i86pc SunOS ompi-nightly-v1.5 1.5.5rc2r25683 
intel-64 4 915 0 92 0 0


I get more tests (passing and skipped) with v1.5 than I do with the 
trunk run.  I have lots of ways of judging what the numbers should be. 
The "trunk" numbers are right.  The "v1.5" numbers are too high;  they 
should be the same as the trunk numbers.


I can see the explanation by clicking on "Detail" and looking at 
individual runs.  (To get time stamps, I add a " | by minute" qualifier 
before clicking on "Detail".  Maybe there's a more proper way of getting 
time stamps, but that seems to work for me.)  Starting with record 890 
and ending with 991, records are repeated.  That is, 890 and 891 have 
identical command lines, time stamps, output, etc.  One of them is a 
duplicate.  Same with 892 and 893, then for 894 and 895, then 896 and 
897, and so on.  So, for about a one-hour period, the records sent in by 
this test run appear duplicated when I submit queries to the database. 
These 51 duplicates are the 45 extra passes and 6 extra skips seen in 
the results above.
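
The pairwise repetition and the arithmetic above can be written down as a short sketch. The record tuples (command line, time stamp, result) and the `adjacent_duplicates` helper are hypothetical stand-ins for what the "Detail" view shows; only the counts in the final check are taken from the numbers quoted in this message.

```python
def adjacent_duplicates(records):
    """Return the indices i for which records[i] is identical to records[i - 1]."""
    return [i for i in range(1, len(records)) if records[i] == records[i - 1]]


# Hypothetical records as (command line, time stamp, result) tuples.
records = [
    ("mpirun -np 4 test_a", "2012-01-05 08:00", "pass"),
    ("mpirun -np 4 test_b", "2012-01-05 08:01", "pass"),
    ("mpirun -np 4 test_b", "2012-01-05 08:01", "pass"),  # repeat of previous
    ("mpirun -np 4 test_c", "2012-01-05 08:02", "skip"),
    ("mpirun -np 4 test_c", "2012-01-05 08:02", "skip"),  # repeat of previous
]
dup_indices = adjacent_duplicates(records)

# Consistency check using only the numbers quoted in the message above.
extra_passes = 915 - 870          # v1.5 passes minus trunk passes
extra_skips = 92 - 86             # v1.5 skips minus trunk skips
pairs = (991 - 890 + 1) // 2      # records 890..991, two per duplicated result
print(dup_indices, extra_passes, extra_skips, pairs)
```

The three figures agree: 45 extra passes plus 6 extra skips is 51, which is also the number of record pairs between 890 and 991.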


Can someone figure out what's going wrong here?  Clearly, I'd like to be 
able to rely on query results.