On 10 Sep 2015, at 14:49, Roger Riggs <roger.ri...@oracle.com> wrote:

> Hi Chris,
> 
> ok, updated the webrev with the 30 sec timeouts.  

Thanks Roger.

I remember going many rounds on false timeouts from tests in other areas a few 
years back. We reached a consensus that 30 secs was a reasonable value for a 
timeout that should never be triggered.

> I also expect that the timeoutFactor on slow systems would be applied by 
> jtreg.

Yes, but this does not cater for swamped systems.  I think we can err on the 
side of caution here, without any real cost.
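
To make the idea concrete, here is a minimal sketch of a "safety" timeout 
that is never expected to fire. It is not the code in the webrev. The class 
and method names are hypothetical, and reading the "test.timeout.factor" 
system property is my assumption about how the jtreg timeoutFactor reaches 
the test (the test itself goes through Utils.adjustTimeout in the 
testlibrary).

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, for illustration only; not part of the JDK testlibrary.
class SafetyTimeout {
    // Generous bound: only reached if the child process hangs.
    static final long SAFETY_TIMEOUT_SECONDS = 30;

    // Scale a base timeout by a factor such as the jtreg timeoutFactor.
    static long scale(long seconds, double factor) {
        return Math.round(seconds * factor);
    }

    // Returns true if the process exited within the bound. On expiry the
    // process is killed and false is returned, which the caller should
    // treat as a test failure.
    static boolean awaitExit(Process p, long timeoutSeconds)
            throws InterruptedException {
        if (p.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            return true;
        }
        p.destroyForcibly();
        return false;
    }

    public static void main(String[] args) throws Exception {
        // Assumption: the harness passes its factor as "test.timeout.factor".
        double factor = Double.parseDouble(
                System.getProperty("test.timeout.factor", "1.0"));

        // Spawn a short-lived child: the running JVM's own launcher.
        Process p = new ProcessBuilder(
                System.getProperty("java.home") + "/bin/java", "-version")
                .inheritIO()
                .start();

        if (!awaitExit(p, scale(SAFETY_TIMEOUT_SECONDS, factor))) {
            throw new AssertionError("child did not exit within the timeout");
        }
    }
}
```

Since waitFor returns as soon as the child exits, the 30 secs is only an 
upper bound and costs nothing on a healthy run.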

Thanks,
-Chris.

> Roger
> 
> 
> On 9/10/2015 9:43 AM, Chris Hegarty wrote:
>> Roger,
>> 
>> The timeouts, in this test, are just to ensure that the test does not block 
>> indefinitely, if it encounters a bug in the JDK, right?  If a timeout is 
>> ever triggered then there is a bug, right?
>> 
> correct
>> 
>> If this is the case, we have used larger timeouts in other areas (net, 
>> concurrency) to cover running on slooooow, or busy, machines, typically 30 
>> secs, to ensure no false failures.  The large value doesn’t really matter 
>> because it is never expected to actually wait that long. If it does time out, 
>> then there is definitely a JDK bug.  Does it make sense to bump these to 30 
>> secs also?
>> 
>> -Chris.
>> 
>> On 10 Sep 2015, at 14:30, Roger Riggs <roger.ri...@oracle.com> wrote:
>> 
>> 
>>> Hi Joe,
>>> 
>>> I think adjusting the timeouts is already covered.
>>> The test uses Process.waitFor(timeout) to wait for the process to exit, but 
>>> only up to the timeout value.
>>> The "Utils.adjustTimeout(5)" call performs the desired adjustment based on 
>>> the jtreg timeoutFactor.
>>> Utils is in the testlibrary.
>>> 
>>> Roger
>>> 
>>> 
>>> On 9/9/2015 8:08 PM, Joseph D. Darcy wrote:
>>> 
>>>> Hi Roger,
>>>> 
>>>> If timeouts need to be used, I suggest rather than fixed values they be 
>>>> adjusted according to the timeout factor being used in the test run.
>>>> 
>>>> Can some sort of repeated testing with exponential backoff to a longer 
>>>> timeout be used? If the system is actually ready in a fraction of a 
>>>> second, it is preferable for the test to be able to complete without 
>>>> waiting the full timeout value. (Perhaps that is already encapsulated in 
>>>> the existing code.)
>>>> 
>>>> Thanks,
>>>> 
>>>> -Joe
>>>> 
>>>> On 9/9/2015 2:49 PM, Roger Riggs wrote:
>>>> 
>>>>> Hi,
>>>>> 
>>>>> Please review this update to extract the uid from the owner of the 
>>>>> /proc/<pid> file.
>>>>> It should be more reliable than using the owner of the 
>>>>> /proc/<pid>/cmdline file.
>>>>> 
>>>>> Webrev:
>>>>>    
>>>>> http://cr.openjdk.java.net/~rriggs/webrev-info-8133552/
>>>>> 
>>>>> 
>>>>> Thanks, Roger
>>>>> 
>>>>> 
>>>>> On 9/9/2015 12:56 PM, Roger Riggs wrote:
>>>>> 
>>>>>> Hi Volker,
>>>>>> 
>>>>>> Thanks for the review and diagnosis.
>>>>>> 
>>>>>> Can opening /proc/pid be used as a fallback if the st_uid is zero or
>>>>>> is it worth the overhead of stat'ing /proc/pid always?
>>>>>> 
>>>>>> Thanks, Roger
>>>>>> 
>>>>>> 
>>>>>> On 9/9/2015 11:46 AM, Volker Simonis wrote:
>>>>>> 
>>>>>>> Hi Roger,
>>>>>>> 
>>>>>>> I think your change looks good and it surely improves the test
>>>>>>> stability but I don't think it solves the problem in all cases.
>>>>>>> 
>>>>>>> I think this problem is caused by a <defunct> (i.e. "zombie") process
>>>>>>> (the spawned process lived too short and was already a zombie when the
>>>>>>> info object was created). If you look at the proc-file system entry of
>>>>>>> a <defunct> process you can see that its 'cmdline' file has zero size
>>>>>>> and the file is owned by root. This is exactly what is reported by the
>>>>>>> corresponding info object in the bug report (user=root and no cmd
>>>>>>> field).
>>>>>>> 
>>>>>>> We may need to improve the way we get the uid of a pid on Linux.
>>>>>>> The current way of querying the owner of /proc/<pid>/cmdline seems to
>>>>>>> be unreliable. We may instead take the owner of /proc/<pid> which
>>>>>>> seems to be still the initial user of the process.
>>>>>>> 
>>>>>>> Regards,
>>>>>>> Volker
>>>>>>> 
>>>>>>> 
>>>>>>> On Tue, Sep 8, 2015 at 11:35 PM, Roger Riggs <roger.ri...@oracle.com> wrote:
>>>>>>> 
>>>>>>>> With link to webrev corrected:
>>>>>>>> 
>>>>>>>> On 9/8/2015 5:08 PM, Roger Riggs wrote:
>>>>>>>> 
>>>>>>>>> Please review an intermittent test bug fix.
>>>>>>>>> The test setup time is very short and the user may be returned as 0,
>>>>>>>>> which is reported as root.
>>>>>>>>> The correction lengthens the time allowed for the process to start.
>>>>>>>>> 
>>>>>>>>> The test is removed from the ProblemList.
>>>>>>>>> 
>>>>>>>>> Webrev:
>>>>>>>>> 
>>>>>>>>> http://cr.openjdk.java.net/~rriggs//webrev-info-8133552
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Bug:
>>>>>>>>>   
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8133552
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Thanks, Roger
>>>>>>>>> 
>>>>>>>>> 
> 
