On 10 Sep 2015, at 14:49, Roger Riggs <roger.ri...@oracle.com> wrote:
> Hi Chris,
>
> ok, updated the webrev with the 30 sec timeouts.

Thanks Roger.

I remember going many rounds on false timeouts from tests in other areas
a few years back. We came to a consensus that 30 secs, as a timeout that
should never be triggered, was a reasonable value.

> I also expect that the timeoutFactor on slow systems would be applied by
> jtreg.

Yes, but this does not cater for swamped systems. I think we can err on the
side of caution here, without any real cost.

Thanks,
-Chris.

> Roger
>
>
> On 9/10/2015 9:43 AM, Chris Hegarty wrote:
>> Roger,
>>
>> The timeouts, in this test, are just to ensure that the test does not block
>> indefinitely, if it encounters a bug in the JDK, right? If a timeout is
>> ever triggered then there is a bug, right?
>>
> correct
>>
>> If this is the case, then we have used larger timeouts in other areas (net,
>> concurrency) to cover running on slooooow, or busy, machines. Typically 30
>> secs. To ensure no false failures. The large value doesn’t really matter
>> because it is never expected to actually wait that long. If it does time out,
>> then there is definitely a JDK bug. Does it make sense to bump these to 30
>> secs also?
>>
>> -Chris.
>>
>> On 10 Sep 2015, at 14:30, Roger Riggs <roger.ri...@oracle.com> wrote:
>>
>>> Hi Joe,
>>>
>>> I think adjusting the timeouts is already covered.
>>> The test uses Process.waitFor(timeout) to wait for the process to exit, but
>>> only up to the timeout value.
>>> The "Utils.adjustTimeout(5)" call performs the desired adjustment based on
>>> the jtreg timeoutFactor.
>>> Utils is in the testlibrary.
>>>
>>> Roger
>>>
>>>
>>> On 9/9/2015 8:08 PM, Joseph D. Darcy wrote:
>>>
>>>> Hi Roger,
>>>>
>>>> If timeouts need to be used, I suggest rather than fixed values they be
>>>> adjusted according to the timeout factor being used in the test run.
>>>>
>>>> Can some sort of repeated testing with exponential backoff to a longer
>>>> timeout be used?
>>>> If the system is actually ready in a fraction of a
>>>> second, it is preferable for the test to be able to complete without
>>>> waiting the full timeout value. (Perhaps that is already encapsulated in
>>>> the existing code.)
>>>>
>>>> Thanks,
>>>>
>>>> -Joe
>>>>
>>>> On 9/9/2015 2:49 PM, Roger Riggs wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Please review this update to extract the uid from the owner of the
>>>>> /proc/<pid> file.
>>>>> It should be more reliable than using the owner of the
>>>>> /proc/<pid>/cmdline file.
>>>>>
>>>>> Webrev:
>>>>>
>>>>> http://cr.openjdk.java.net/~rriggs/webrev-info-8133552/
>>>>>
>>>>>
>>>>> Thanks, Roger
>>>>>
>>>>>
>>>>> On 9/9/2015 12:56 PM, Roger Riggs wrote:
>>>>>
>>>>>> Hi Volker,
>>>>>>
>>>>>> Thanks for the review and diagnosis.
>>>>>>
>>>>>> Can opening /proc/pid be used as a fallback if the st_uid is zero or
>>>>>> is it worth the overhead of stat'ing /proc/pid always?
>>>>>>
>>>>>> Thanks, Roger
>>>>>>
>>>>>>
>>>>>> On 9/9/2015 11:46 AM, Volker Simonis wrote:
>>>>>>
>>>>>>> Hi Roger,
>>>>>>>
>>>>>>> I think your change looks good and it surely improves the test
>>>>>>> stability but I don't think it solves the problem in all cases.
>>>>>>>
>>>>>>> I think this problem is caused by a <defunct> (i.e. "zombie") process
>>>>>>> (the spawned process was too short-lived and was already a zombie when
>>>>>>> the info object was created). If you look at the proc-file system entry
>>>>>>> of a <defunct> process you can see that its 'cmdline' file has zero size
>>>>>>> and the file is owned by root. This is exactly what is reported by the
>>>>>>> corresponding info object in the bug report (user=root and no cmd
>>>>>>> field).
>>>>>>>
>>>>>>> We may need to improve the way we get the uid of a pid on Linux.
>>>>>>> The current way of querying the owner of /proc/<pid>/cmdline seems to
>>>>>>> be unreliable. We may instead take the owner of /proc/<pid>, which
>>>>>>> still seems to be the initial user of the process.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Volker
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Sep 8, 2015 at 11:35 PM, Roger Riggs <roger.ri...@oracle.com> wrote:
>>>>>>>
>>>>>>>> With link to webrev corrected:
>>>>>>>>
>>>>>>>> On 9/8/2015 5:08 PM, Roger Riggs wrote:
>>>>>>>>
>>>>>>>>> Please review an intermittent test bug fix.
>>>>>>>>> The test setup time is very short and the user may be returned as 0
>>>>>>>>> which is reported as root.
>>>>>>>>> The correction lengthens the time allowed for the process to start.
>>>>>>>>>
>>>>>>>>> The test is removed from the ProblemList.
>>>>>>>>>
>>>>>>>>> Webrev:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~rriggs//webrev-info-8133552
>>>>>>>>>
>>>>>>>>> Bug:
>>>>>>>>>
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8133552
>>>>>>>>>
>>>>>>>>> Thanks, Roger
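
For reference, the timeout approach discussed in this thread (a generous
upper bound, scaled by the jtreg timeout factor, with the wait ending as soon
as the condition holds) might look roughly like the sketch below. It is
illustrative only and is not the code in the webrev: the class TimeoutSketch
and its methods are hypothetical names, and reading the factor from the
"test.timeout.factor" system property is an assumption about how the factor
reaches the test.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of the JDK test library.
final class TimeoutSketch {

    // Scale a base timeout in seconds by a factor, rounded to the nearest second.
    static long scale(long seconds, double factor) {
        return Math.round(seconds * factor);
    }

    // Assumption: the harness passes its timeout factor in the
    // "test.timeout.factor" system property; default to 1.0 if it is absent.
    static long adjustTimeout(long seconds) {
        String prop = System.getProperty("test.timeout.factor", "1.0");
        return scale(seconds, Double.parseDouble(prop));
    }

    // Poll 'ready' with exponentially growing sleeps (10 ms, 20 ms, ... capped
    // at 1 s). Returns true as soon as the condition holds, or false once the
    // deadline passes or the thread is interrupted.
    static boolean awaitCondition(BooleanSupplier ready, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        long delayMillis = 10;
        try {
            while (!ready.getAsBoolean()) {
                long remaining = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                if (remaining <= 0) {
                    return false;
                }
                Thread.sleep(Math.min(delayMillis, remaining));
                delayMillis = Math.min(delayMillis * 2, 1000);
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }

    public static void main(String[] args) {
        // 30 seconds is the "should never be triggered" bound from the thread.
        long timeoutMillis = TimeUnit.SECONDS.toMillis(adjustTimeout(30));
        long start = System.nanoTime();
        // Stand-in condition: becomes true after roughly 50 ms.
        boolean ok = awaitCondition(() -> System.nanoTime() - start > 50_000_000L, timeoutMillis);
        System.out.println("condition met: " + ok);
    }
}
```

When the thing being waited on is a child process exiting,
Process.waitFor(timeout, unit) already returns as soon as the process exits,
so a large timeout costs nothing in the normal case; a polling loop like the
one above would only be needed for other kinds of readiness checks.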