Re: [MTT devel] duplicate results

2012-02-24 Thread Eugene Loh
This is recycled e-mail from about 1.5 months ago.  I observed this 
problem again.  That is, if one queries the MTT database, certain 
results are reported twice.


The date range in question is 2012/02/23 from about 08:48 to 09:02.  The 
submitting system is once again ^burl-ct-v20z-2$.  The problem is once 
again with v1.5 testing and with intel-64 tests.  On the client side, 
the log seems to indicate that each result is submitted once.  If I 
query the database, however, it shows a number of results reported 
twice.  These incidents are consecutive -- that is, the behavior starts 
at some time and ends at another.
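
For anyone wanting to repeat the client-side check, here is a minimal sketch that adds up the per-submission totals from the client output. It assumes the log contains summary lines worded like the "Reported to MTTDatabase client" message quoted further down this thread; the function name, the sample log, and the second sample entry are made up for illustration and are not part of MTT.

```python
import re

# Matches the client summary line quoted in this thread. Other wordings
# of the message are an assumption and may need a different pattern.
REPORT_RE = re.compile(
    r"Reported to MTTDatabase client: (\d+) successful submits?, "
    r"(\d+) failed submits? \(total of (\d+) results?\)"
)

def sum_reported_results(log_text):
    """Return (successful submits, failed submits, total results)."""
    # The message can wrap across lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    ok = failed = results = 0
    for m in REPORT_RE.finditer(flat):
        ok += int(m.group(1))
        failed += int(m.group(2))
        results += int(m.group(3))
    return ok, failed, results

# Hypothetical sample: the first entry mirrors the quoted message,
# the second is invented to show the summation.
sample_log = """\
>> Reported to MTTDatabase client: 1 successful submit, 0 failed
submits (total of 6 results)
>> Reported to MTTDatabase client: 1 successful submit, 0 failed
submits (total of 4 results)
"""

print(sum_reported_results(sample_log))
```

If the summed total matches the expected number of tests while the database query shows more, that points away from the client and toward the server side.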


Even if no one has time to figure this out, I figured I'd report this 
for the record books.


On 1/9/2012 11:13 AM, Josh Hursey wrote:
Well if the debug results seem correct then there must be some bug in 
the submission script. :/ It is a pretty old piece of code, so it is 
possible that something is going awry in there.


Let us know if you notice further problems like this. I won't have 
time to look into them in the near term, but I'll try to put it on the 
short list to get to when I get free cycles. If you happen to come 
across a repeater scenario (not likely since this seems like something 
difficult to reproduce) that would help the debugging effort.


On Fri, Jan 6, 2012 at 2:07 PM, Eugene Loh wrote:


On 01/06/12 08:52, Josh Hursey wrote:

Weird. I don't know what is going on here, unless the client is
somehow submitting some of the results too many times. One thing
to check is the debug output file that the MTT client is
submitting to the server. Check that for duplicates.

Sorry, I don't understand where to check.  I do know that if I
look at the output from the MTT client, I see a bunch of messages
like this:

>> Reported to MTTDatabase client: 1 successful submit, 0 failed
submits (total of 6 results)

If I add up those numbers of results submitted, the totals match
what I would expect.  So, there is some indication that the number
of client submissions is right.


That will help determine whether this is a server side problem or
client side problem. I have not noticed this behavior on the
server before,

I haven't either, but I only just started looking more closely at
results.  Mostly, in any case, things look fine.


but might be something with the submit.php script - just a guess
though at this point.

Unfortunately I have zero time to spend on MTT for a few weeks at
least. :/

On Thu, Jan 5, 2012 at 8:11 PM, Eugene Loh <eugene@oracle.com> wrote:

I go to MTT and I choose:

Test run
Date range: 2012-01-05 05:00:00 - 2012-01-05 12:00:00
Org: Oracle
Platform name: $burl-ct-v20z-2$
Suite: intel-64

and I get:

1 oracle burl-ct-v20z-2 i86pc SunOS ompi-nightly-trunk
1.7a1r25692 intel-64 4 870 0 86 0 0
2 oracle burl-ct-v20z-2 i86pc SunOS ompi-nightly-v1.5
1.5.5rc2r25683 intel-64 4 915 0 92 0 0

I get more tests (passing and skipped) with v1.5 than I do
with the trunk run.  I have lots of ways of judging what the
numbers should be. The "trunk" numbers are right.  The "v1.5"
numbers are too high;  they should be the same as the trunk
numbers.

I can see the explanation by clicking on "Detail" and looking
at individual runs.  (To get time stamps, I add a " | by
minute" qualifier before clicking on "Detail".  Maybe there's
a more proper way of getting time stamps, but that seems to
work for me.)  Starting with record 890 and ending with 991,
records are repeated.  That is, 890 and 891 have identical
command lines, time stamps, output, etc.  One of them is a
duplicate.  Same with 892 and 893, then for 894 and 895, then
896 and 897, and so on.  So, for about a one-hour period, the
records sent in by this test run appear duplicated when I
submit queries to the database. These 51 duplicates are the
45 extra passes and 6 extra skips seen in the results above.

Can someone figure out what's going wrong here?  Clearly, I'd
like to be able to rely on query results.



Re: [MTT devel] duplicate results

2012-01-09 Thread Josh Hursey
Well if the debug results seem correct then there must be some bug in the
submission script. :/ It is a pretty old piece of code, so it is possible
that something is going awry in there.

Let us know if you notice further problems like this. I won't have time to
look into them in the near term, but I'll try to put it on the short list
to get to when I get free cycles. If you happen to come across a repeater
scenario (not likely since this seems like something difficult
to reproduce) that would help the debugging effort.

Thanks and sorry for the trouble...

-- Josh

On Fri, Jan 6, 2012 at 2:07 PM, Eugene Loh wrote:

>  On 01/06/12 08:52, Josh Hursey wrote:
>
> Weird. I don't know what is going on here, unless the client is somehow
> submitting some of the results too many times. One thing to check is the
> debug output file that the MTT client is submitting to the server. Check
> that for duplicates.
>
> Sorry, I don't understand where to check.  I do know that if I look at the
> output from the MTT client, I see a bunch of messages like this:
>
> >> Reported to MTTDatabase client: 1 successful submit, 0 failed submits
> (total of 6 results)
>
> If I add up those numbers of results submitted, the totals match what I
> would expect.  So, there is some indication that the number of client
> submissions is right.
>
> That will help determine whether this is a server side problem or client
> side problem. I have not noticed this behavior on the server before,
>
> I haven't either, but I only just started looking more closely at
> results.  Mostly, in any case, things look fine.
>
> but might be something with the submit.php script - just a guess though at
> this point.
>
>  Unfortunately I have zero time to spend on MTT for a few weeks at least.
> :/
>
>  -- Josh
>
> On Thu, Jan 5, 2012 at 8:11 PM, Eugene Loh wrote:
>
>> I go to MTT and I choose:
>>
>> Test run
>> Date range: 2012-01-05 05:00:00 - 2012-01-05 12:00:00
>> Org: Oracle
>> Platform name: $burl-ct-v20z-2$
>> Suite: intel-64
>>
>> and I get:
>>
>> 1 oracle burl-ct-v20z-2 i86pc SunOS ompi-nightly-trunk 1.7a1r25692
>> intel-64 4 870 0 86 0 0
>> 2 oracle burl-ct-v20z-2 i86pc SunOS ompi-nightly-v1.5 1.5.5rc2r25683
>> intel-64 4 915 0 92 0 0
>>
>> I get more tests (passing and skipped) with v1.5 than I do with the trunk
>> run.  I have lots of ways of judging what the numbers should be. The
>> "trunk" numbers are right.  The "v1.5" numbers are too high;  they should
>> be the same as the trunk numbers.
>>
>> I can see the explanation by clicking on "Detail" and looking at
>> individual runs.  (To get time stamps, I add a " | by minute" qualifier
>> before clicking on "Detail".  Maybe there's a more proper way of getting
>> time stamps, but that seems to work for me.)  Starting with record 890 and
>> ending with 991, records are repeated.  That is, 890 and 891 have identical
>> command lines, time stamps, output, etc.  One of them is a duplicate.  Same
>> with 892 and 893, then for 894 and 895, then 896 and 897, and so on.  So,
>> for about a one-hour period, the records sent in by this test run appear
>> duplicated when I submit queries to the database. These 51 duplicates are
>> the 45 extra passes and 6 extra skips seen in the results above.
>>
>> Can someone figure out what's going wrong here?  Clearly, I'd like to be
>> able to rely on query results.
>>
>
> ___
> mtt-devel mailing list
> mtt-de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/mtt-devel
>



-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey


Re: [MTT devel] duplicate results

2012-01-06 Thread Eugene Loh

On 1/6/2012 5:52 AM, Josh Hursey wrote:
Weird. I don't know what is going on here, unless the client is 
somehow submitting some of the results too many times. One thing to 
check is the debug output file that the MTT client is submitting to 
the server. Check that for duplicates.


That will help determine whether this is a server side problem or 
client side problem. I have not noticed this behavior on the server 
before, but might be something with the submit.php script - just a 
guess though at this point.


Unfortunately I have zero time to spend on MTT for a few weeks at 
least. :/


-- Josh

On Thu, Jan 5, 2012 at 8:11 PM, Eugene Loh wrote:


I go to MTT and I choose:

Test run
Date range: 2012-01-05 05:00:00 - 2012-01-05 12:00:00
Org: Oracle
Platform name: $burl-ct-v20z-2$
Suite: intel-64

and I get:

1 oracle burl-ct-v20z-2 i86pc SunOS ompi-nightly-trunk 1.7a1r25692
intel-64 4 870 0 86 0 0
2 oracle burl-ct-v20z-2 i86pc SunOS ompi-nightly-v1.5
1.5.5rc2r25683 intel-64 4 915 0 92 0 0

I get more tests (passing and skipped) with v1.5 than I do with
the trunk run.  I have lots of ways of judging what the numbers
should be. The "trunk" numbers are right.  The "v1.5" numbers are
too high;  they should be the same as the trunk numbers.

I can see the explanation by clicking on "Detail" and looking at
individual runs.  (To get time stamps, I add a " | by minute"
qualifier before clicking on "Detail".  Maybe there's a more
proper way of getting time stamps, but that seems to work for me.)
 Starting with record 890 and ending with 991, records are
repeated.  That is, 890 and 891 have identical command lines, time
stamps, output, etc.  One of them is a duplicate.  Same with 892
and 893, then for 894 and 895, then 896 and 897, and so on.  So,
for about a one-hour period, the records sent in by this test run
appear duplicated when I submit queries to the database. These 51
duplicates are the 45 extra passes and 6 extra skips seen in the
results above.

Can someone figure out what's going wrong here?  Clearly, I'd like
to be able to rely on query results.




--
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey 


Re: [MTT devel] duplicate results

2012-01-06 Thread Eugene Loh

On 01/06/12 08:52, Josh Hursey wrote:
> Weird. I don't know what is going on here, unless the client is
> somehow submitting some of the results too many times. One thing to
> check is the debug output file that the MTT client is submitting to
> the server. Check that for duplicates.

Sorry, I don't understand where to check.  I do know that if I look at
the output from the MTT client, I see a bunch of messages like this:

  >> Reported to MTTDatabase client: 1 successful submit, 0 failed submits (total of 6 results)

If I add up those numbers of results submitted, the totals match what I
would expect.  So, there is some indication that the number of client
submissions is right.

> That will help determine whether this is a server side problem or
> client side problem. I have not noticed this behavior on the server
> before,

I haven't either, but I only just started looking more closely at
results.  Mostly, in any case, things look fine.

> but might be something with the submit.php script - just a guess
> though at this point.
>
> Unfortunately I have zero time to spend on MTT for a few weeks at
> least. :/
>
> -- Josh

On Thu, Jan 5, 2012 at 8:11 PM, Eugene Loh wrote:


I go to MTT and I choose:

Test run
Date range: 2012-01-05 05:00:00 - 2012-01-05 12:00:00
Org: Oracle
Platform name: $burl-ct-v20z-2$
Suite: intel-64

and I get:

1 oracle burl-ct-v20z-2 i86pc SunOS ompi-nightly-trunk 1.7a1r25692
intel-64 4 870 0 86 0 0
2 oracle burl-ct-v20z-2 i86pc SunOS ompi-nightly-v1.5
1.5.5rc2r25683 intel-64 4 915 0 92 0 0

I get more tests (passing and skipped) with v1.5 than I do with
the trunk run.  I have lots of ways of judging what the numbers
should be. The "trunk" numbers are right.  The "v1.5" numbers are
too high;  they should be the same as the trunk numbers.

I can see the explanation by clicking on "Detail" and looking at
individual runs.  (To get time stamps, I add a " | by minute"
qualifier before clicking on "Detail".  Maybe there's a more
proper way of getting time stamps, but that seems to work for me.)
 Starting with record 890 and ending with 991, records are
repeated.  That is, 890 and 891 have identical command lines, time
stamps, output, etc.  One of them is a duplicate.  Same with 892
and 893, then for 894 and 895, then 896 and 897, and so on.  So,
for about a one-hour period, the records sent in by this test run
appear duplicated when I submit queries to the database. These 51
duplicates are the 45 extra passes and 6 extra skips seen in the
results above.

Can someone figure out what's going wrong here?  Clearly, I'd like
to be able to rely on query results.



Re: [MTT devel] duplicate results

2012-01-06 Thread Josh Hursey
Weird. I don't know what is going on here, unless the client is somehow
submitting some of the results too many times. One thing to check is the
debug output file that the MTT client is submitting to the server. Check
that for duplicates. That will help determine whether this is a server side
problem or client side problem. I have not noticed this behavior on the
server before, but might be something with the submit.php script - just a
guess though at this point.
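
As a rough illustration of that duplicate check, the sketch below groups records by the fields that were reported as identical (command line, time stamp, output) and lists any group seen more than once. The record layout, field names, and sample values are hypothetical; the real debug file would first have to be parsed into records of this shape.

```python
from collections import Counter

def find_duplicates(records, key_fields=("command", "timestamp", "output")):
    """Return {key tuple: count} for every key that occurs more than once."""
    counts = Counter(tuple(r[f] for f in key_fields) for r in records)
    return {key: n for key, n in counts.items() if n > 1}

# Hypothetical records, invented only to exercise the function.
records = [
    {"command": "mpirun -np 4 ./a", "timestamp": "08:50", "output": "ok"},
    {"command": "mpirun -np 4 ./a", "timestamp": "08:50", "output": "ok"},
    {"command": "mpirun -np 4 ./b", "timestamp": "08:51", "output": "ok"},
]

for key, n in find_duplicates(records).items():
    print(n, "copies of", key)
```

Running the same check on what the client sends and on what the database returns would show on which side the second copy first appears.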

Unfortunately I have zero time to spend on MTT for a few weeks at least. :/

-- Josh

On Thu, Jan 5, 2012 at 8:11 PM, Eugene Loh wrote:

> I go to MTT and I choose:
>
> Test run
> Date range: 2012-01-05 05:00:00 - 2012-01-05 12:00:00
> Org: Oracle
> Platform name: $burl-ct-v20z-2$
> Suite: intel-64
>
> and I get:
>
> 1 oracle burl-ct-v20z-2 i86pc SunOS ompi-nightly-trunk 1.7a1r25692
> intel-64 4 870 0 86 0 0
> 2 oracle burl-ct-v20z-2 i86pc SunOS ompi-nightly-v1.5 1.5.5rc2r25683
> intel-64 4 915 0 92 0 0
>
> I get more tests (passing and skipped) with v1.5 than I do with the trunk
> run.  I have lots of ways of judging what the numbers should be. The
> "trunk" numbers are right.  The "v1.5" numbers are too high;  they should
> be the same as the trunk numbers.
>
> I can see the explanation by clicking on "Detail" and looking at
> individual runs.  (To get time stamps, I add a " | by minute" qualifier
> before clicking on "Detail".  Maybe there's a more proper way of getting
> time stamps, but that seems to work for me.)  Starting with record 890 and
> ending with 991, records are repeated.  That is, 890 and 891 have identical
> command lines, time stamps, output, etc.  One of them is a duplicate.  Same
> with 892 and 893, then for 894 and 895, then 896 and 897, and so on.  So,
> for about a one-hour period, the records sent in by this test run appear
> duplicated when I submit queries to the database. These 51 duplicates are
> the 45 extra passes and 6 extra skips seen in the results above.
>
> Can someone figure out what's going wrong here?  Clearly, I'd like to be
> able to rely on query results.


-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey


[MTT devel] duplicate results

2012-01-05 Thread Eugene Loh

I go to MTT and I choose:

Test run
Date range: 2012-01-05 05:00:00 - 2012-01-05 12:00:00
Org: Oracle
Platform name: $burl-ct-v20z-2$
Suite: intel-64

and I get:

1 oracle burl-ct-v20z-2 i86pc SunOS ompi-nightly-trunk 1.7a1r25692 
intel-64 4 870 0 86 0 0
2 oracle burl-ct-v20z-2 i86pc SunOS ompi-nightly-v1.5 1.5.5rc2r25683 
intel-64 4 915 0 92 0 0


I get more tests (passing and skipped) with v1.5 than I do with the 
trunk run.  I have lots of ways of judging what the numbers should be. 
The "trunk" numbers are right.  The "v1.5" numbers are too high;  they 
should be the same as the trunk numbers.


I can see the explanation by clicking on "Detail" and looking at 
individual runs.  (To get time stamps, I add a " | by minute" qualifier 
before clicking on "Detail".  Maybe there's a more proper way of getting 
time stamps, but that seems to work for me.)  Starting with record 890 
and ending with 991, records are repeated.  That is, 890 and 891 have 
identical command lines, time stamps, output, etc.  One of them is a 
duplicate.  Same with 892 and 893, then for 894 and 895, then 896 and 
897, and so on.  So, for about a one-hour period, the records sent in by 
this test run appear duplicated when I submit queries to the database. 
These 51 duplicates are the 45 extra passes and 6 extra skips seen in 
the results above.
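
The arithmetic behind that last statement can be checked with a few lines. This is only a consistency check on the numbers quoted above; reading 870/915 as passes and 86/92 as skips follows the description in this message rather than any column header.

```python
# Counts copied from the two query rows above.
trunk = {"pass": 870, "skip": 86}
v15 = {"pass": 915, "skip": 92}

extra_pass = v15["pass"] - trunk["pass"]
extra_skip = v15["skip"] - trunk["skip"]
extra_total = extra_pass + extra_skip

# Records 890 through 991 inclusive, paired as (890, 891), (892, 893), ...
first_record, last_record = 890, 991
records_in_range = last_record - first_record + 1
duplicate_pairs = records_in_range // 2

print(extra_pass, extra_skip, extra_total)   # 45 6 51
print(records_in_range, duplicate_pairs)     # 102 51
```

The 102 records in the affected range form 51 pairs, which matches the 51 surplus results in the v1.5 row.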


Can someone figure out what's going wrong here?  Clearly, I'd like to be 
able to rely on query results.