Just committed HBASE-4874.
Even though file:///dev/urandom worked in my tests as well, I ended up leaving it 
at file:/dev/./urandom.
A bit of googling brought up *many* references for file:/dev/./urandom 
(including the Sun bug below) and only one or two for file:///dev/urandom.

I hope that this will generally help with hanging tests. TestHCM does not use 
SecureRandom, but TestHCM.testConnectionUniqueness()
is generating many random ints, thereby exhausting the system's entropy 
reservoirs. So any test following TestHCM could hang when
it needs a secure random number (such as UUID.randomUUID(), which is used to 
generate a new cluster UUID for the mini clusters).
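
For anyone who wants to check this on their own box, here is a minimal sketch (not part of HBASE-4874; the class and method names are made up for illustration). It prints the java.security.egd setting the JVM was started with and times one UUID.randomUUID() call, which goes through SecureRandom. I am assuming a Linux machine where /dev/random can block.

```java
import java.util.UUID;

public class EntropySourceCheck {

    // Returns the entropy gathering device the JVM was started with,
    // or a placeholder when -Djava.security.egd was not passed.
    static String configuredEgd() {
        return System.getProperty("java.security.egd", "(not set)");
    }

    // Times a single UUID.randomUUID() call in milliseconds.
    // randomUUID() draws its bytes from a SecureRandom instance, so this
    // call may stall if the underlying source blocks waiting for entropy.
    static long timeRandomUuidMillis() {
        long start = System.nanoTime();
        UUID.randomUUID();
        return (System.nanoTime() - start) / 1_000_000L;
    }

    public static void main(String[] args) {
        System.out.println("java.security.egd = " + configuredEgd());
        System.out.println("UUID.randomUUID() took " + timeRandomUuidMillis() + " ms");
    }
}
```

Run it once plain and once as "java -Djava.security.egd=file:/dev/./urandom EntropySourceCheck". If the first run stalls and the second does not, the box is probably low on entropy.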


-- Lars

________________________________
From: Todd Lipcon <[email protected]>
To: [email protected]; lars hofhansl <[email protected]> 
Sent: Saturday, November 26, 2011 9:09 PM
Subject: Re: Status of 0.92RC

On Sat, Nov 26, 2011 at 5:10 PM, lars hofhansl <[email protected]> wrote:
> Are you sure this works?
>
> According to this: 
> http://bugs.sun.com/view_bug.do;jsessionid=ff625daf459fdffffffffcd54f1c775299e0?bug_id=6202721
> the JDK will treat /dev/urandom as a special string and use /dev/random 
> anyway (hence the workaround with /dev/./urandom)

It worked for me so long as I had enough slashes - file:///dev/urandom
and not file:/dev/urandom as some other resources showed.

-Todd

>
>
> ----- Original Message -----
> From: Todd Lipcon <[email protected]>
> To: [email protected]
> Cc: lars hofhansl <[email protected]>
> Sent: Friday, November 25, 2011 10:22 PM
> Subject: Re: Status of 0.92RC
>
> In Hadoop we use this in our pom:
>             <java.security.egd>file:///dev/urandom</java.security.egd>
>
> which also works in JDK6. See HADOOP-7841
>
>
> On Fri, Nov 25, 2011 at 10:12 PM, Li Pi <[email protected]> wrote:
>> Very, very nice discovery!
>>
>> Once I saw the move your mouse around part, I smiled.
>>
>> Entropy can also be gathered from other system events, such as network
>> traffic, so on a production machine, you should always have enough
>> entropy. So it should be fine to change the test to use urandom.
>>
>> On Fri, Nov 25, 2011 at 9:06 PM, lars hofhansl <[email protected]> wrote:
>>> Here is something that will boggle your mind:
>>> After you run TestHCM often enough, it will hang until you move your mouse 
>>> around!
>>>
>>> No joke... But what sounds like an impossibility is actually related to the 
>>> use of SecureRandom.
>>> SecureRandom uses /dev/random on Linux and that gathers entropy from system 
>>> events
>>> such as network activity, disk activity, keyboard activity, mouse movement, 
>>> etc.
>>> If not enough entropy is available, /dev/random will block until enough 
>>> entropy is gathered!
>>>
>>> When I add this to the command line (via pom.xml) TestHCM never hangs:
>>> "-Djava.security.egd=file:/dev/./urandom"
>>>
>>> (note that -Djava.security.egd=file:/dev/urandom does not work for some 
>>> reason on JDK 1.5 or newer,  the extra ./ is needed)
>>>
>>>
>>> This instructs the JDK to use /dev/urandom for SecureRandom, which unlike 
>>> /dev/random will never block, but if not enough
>>> entropy exists it will generate random numbers of lesser quality... But for 
>>> tests we don't care.
>>>
>>>
>>> That explains why tests only time out sometimes... when the system happens 
>>> to run out of entropy bits when a secure random
>>> is used.
>>>
>>> -- Lars
>>>
>>>
>>> ________________________________
>>> From: Stack <[email protected]>
>>> To: [email protected]
>>> Cc: lars hofhansl <[email protected]>
>>> Sent: Friday, November 25, 2011 7:14 PM
>>> Subject: Re: Status of 0.92RC
>>>
>>> On Fri, Nov 25, 2011 at 6:16 PM, Ted Yu <[email protected]> wrote:
>>>> I then looped TestHCM 4 times and there was no test failure.
>>>>
>>>
>>> It's fine on mac.  On ubuntu:
>>>
>>> -------------------------------------------------------------------------------
>>> Test set: org.apache.hadoop.hbase.client.TestHCM
>>> -------------------------------------------------------------------------------
>>> Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed:
>>> 757.578 sec <<< FAILURE!
>>> testClosing(org.apache.hadoop.hbase.client.TestHCM)  Time elapsed:
>>> 35.34 sec  <<< FAILURE!
>>> java.lang.AssertionError
>>>         at org.junit.Assert.fail(Assert.java:92)
>>>         at org.junit.Assert.assertTrue(Assert.java:43)
>>>         at org.junit.Assert.assertFalse(Assert.java:68)
>>>         at org.junit.Assert.assertFalse(Assert.java:79)
>>>         at 
>>> org.apache.hadoop.hbase.client.TestHCM.testClosing(TestHCM.java:221)
>>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>
>>> ....
>>>
>>> Line numbers are off because I'm messing.  It's saying connection 1 is
>>> closed if I test it just after creating it.
>>>
>>> St.Ack
>>>
>>>> On Fri, Nov 25, 2011 at 5:39 PM, Ted Yu <[email protected]> wrote:
>>>>
>>>>> I looped TestHCM#testClosing 5 times on MacBook and didn't see test
>>>>> failure.
>>>>>
>>>>> Stack:
>>>>> Can you share the test output ?
>>>>>
>>>>> Thanks
>>>>>
>>>>>
>>>>> On Fri, Nov 25, 2011 at 5:04 PM, lars hofhansl <[email protected]> wrote:
>>>>>
>>>>>> I added testClosing as part of HBASE-4805, I'll have a look as soon as I
>>>>>> get a chance.
>>>>>>
>>>>>>
>>>>>>
>>>>>> ________________________________
>>>>>>  From: Stack <[email protected]>
>>>>>> To: HBase Dev List <[email protected]>
>>>>>> Sent: Friday, November 25, 2011 2:12 PM
>>>>>> Subject: Status of 0.92RC
>>>>>>
>>>>>> I'm having a little difficulty getting all tests to pass.  On
>>>>>> linux/ubuntu, TestHCM (testClosing strange issue) and TestReplication
>>>>>> are failing for me.  On mac osx, it'll build without fail about 50% of
>>>>>> the time.  I'd like to make it so tests pass all the time before
>>>>>> cutting the RC.  Thats what I'm at these times.
>>>>>>
>>>>>> Also, 0.92 build on jenkins has been turned off by Apache
>>>>>> Infrastructure.  It was hanging.  Its done this in the past too and
>>>>>> when it hangs it requires a jenkins reboot which doesn't make Apache
>>>>>> Infrastructure team too happy.  The hang looks to me like a Jenkins
>>>>>> bug because build hangs before we even checkout src.  Am trying to see
>>>>>> what can be done to get it going again but thats the story at the mo.
>>>>>>
>>>>>> St.Ack
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>
>



-- 
Todd Lipcon
Software Engineer, Cloudera
