Dear Lars

That was a nice learning experience.  I really appreciate you making things clear.

Thanks to Todd for the same. :)

Regards
Ram



-----Original Message-----
From: lars hofhansl [mailto:[email protected]] 
Sent: Monday, November 28, 2011 5:08 AM
To: Todd Lipcon; [email protected]
Subject: Re: Status of 0.92RC

Just committed HBASE-4874.
Even though file:///dev/urandom worked in my tests as well, I ended up leaving
it at file:/dev/./urandom.
A bit of googling brought up *many* references for file:/dev/./urandom
(including the Sun bug below) and only one or two for file:///dev/urandom.

I hope that this will generally help with hanging tests. TestHCM does not
use SecureRandom, but TestHCM.testConnectionUniqueness()
is generating many random ints, thereby exhausting the system's entropy
reservoirs. So any test following TestHCM could hang when
it needs a secure random number (such as UUID.randomUUID(), which is used to
generate a new cluster UUID for the mini clusters).
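
A minimal Java sketch for anyone who wants to check this on their own machine:
it prints the java.security.egd setting and times a single UUID.randomUUID()
call, which draws from SecureRandom. The class and method names (EntropyProbe,
configuredEgd, timeRandomUuidNanos) are made up for illustration and are not
part of HBase or HBASE-4874.

```java
import java.util.UUID;

public class EntropyProbe {
    /** Returns the configured entropy source, or a placeholder when the property is unset. */
    static String configuredEgd() {
        return System.getProperty("java.security.egd", "(not set)");
    }

    /** Times one UUID.randomUUID() call; a stall here points at a blocked entropy source. */
    static long timeRandomUuidNanos() {
        long start = System.nanoTime();
        UUID id = UUID.randomUUID();
        long elapsed = System.nanoTime() - start;
        // randomUUID() produces version 4 (random) UUIDs; use the value so it is not optimized away.
        if (id.version() != 4) {
            throw new IllegalStateException("unexpected UUID version: " + id.version());
        }
        return elapsed;
    }

    public static void main(String[] args) {
        System.out.println("java.security.egd = " + configuredEgd());
        System.out.println("UUID.randomUUID() took " + (timeRandomUuidNanos() / 1000000L) + " ms");
    }
}
```

Running it once with and once without -Djava.security.egd=file:/dev/./urandom
should show whether the call stalls on a given box; whether it does depends on
the kernel and JDK in use.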


-- Lars

________________________________
From: Todd Lipcon <[email protected]>
To: [email protected]; lars hofhansl <[email protected]> 
Sent: Saturday, November 26, 2011 9:09 PM
Subject: Re: Status of 0.92RC

On Sat, Nov 26, 2011 at 5:10 PM, lars hofhansl <[email protected]> wrote:
> Are you sure this works?
>
> According to this:
> http://bugs.sun.com/view_bug.do;jsessionid=ff625daf459fdffffffffcd54f1c775299e0?bug_id=6202721
> the JDK will treat /dev/urandom as a special string and use /dev/random
> anyway (hence the workaround with /dev/./urandom)

It worked for me so long as I had enough slashes - file:///dev/urandom
and not file:/dev/urandom as some other resources showed.
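
To illustrate why the exact spelling can matter: the three forms below name the
same device once normalized, yet they differ as strings. Assuming, as the Sun
bug report describes, that the JDK special-cases the literal value
file:/dev/urandom, only a spelling that differs as a string gets past that
check. This is a rough sketch, not the JDK's actual code; the class name
EgdUriForms and its helpers are hypothetical.

```java
import java.net.URI;

public class EgdUriForms {
    /** Path component of the URI after "." segments have been removed. */
    static String normalizedPath(String uri) {
        return URI.create(uri).normalize().getPath();
    }

    /** True when the value is exactly the literal the bug report says is special-cased (assumption). */
    static boolean isLiteralUrandom(String egd) {
        return "file:/dev/urandom".equals(egd);
    }

    public static void main(String[] args) {
        String[] forms = {"file:/dev/urandom", "file:/dev/./urandom", "file:///dev/urandom"};
        for (String form : forms) {
            System.out.println(form + " -> path " + normalizedPath(form)
                    + ", literal match: " + isLiteralUrandom(form));
        }
    }
}
```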

-Todd

>
>
> ----- Original Message -----
> From: Todd Lipcon <[email protected]>
> To: [email protected]
> Cc: lars hofhansl <[email protected]>
> Sent: Friday, November 25, 2011 10:22 PM
> Subject: Re: Status of 0.92RC
>
> In Hadoop we use this in our pom:
>             <java.security.egd>file:///dev/urandom</java.security.egd>
>
> which also works in JDK6. See HADOOP-7841
>
>
> On Fri, Nov 25, 2011 at 10:12 PM, Li Pi <[email protected]> wrote:
>> Very, very nice discovery!
>>
>> Once I saw the "move your mouse around" part, I smiled.
>>
>> Entropy can also be gathered from other system events, such as network
>> traffic, so on a production machine, you should always have enough
>> entropy. So it should be fine to change the test to use urandom.
>>
>> On Fri, Nov 25, 2011 at 9:06 PM, lars hofhansl <[email protected]> wrote:
>>> Here is something that will boggle your mind:
>>> After you have run TestHCM often enough, it will hang until you move
>>> your mouse around!
>>>
>>> No joke... But what sounds like an impossibility is actually related to
>>> the use of SecureRandom.
>>> SecureRandom uses /dev/random on Linux, and that gathers entropy from
>>> system events such as network activity, disk activity, keyboard activity,
>>> mouse movement, etc.
>>> If not enough entropy is available, /dev/random will block until
>>> enough entropy is gathered!
>>>
>>> When I add this to the command line (via pom.xml) TestHCM never hangs:
>>> "-Djava.security.egd=file:/dev/./urandom"
>>>
>>> (note that -Djava.security.egd=file:/dev/urandom does not work for some
>>> reason on JDK 1.5 or newer; the extra ./ is needed)
>>>
>>>
>>> This instructs the JDK to use /dev/urandom for SecureRandom, which unlike
>>> /dev/random will never block, but if not enough entropy exists it will
>>> generate random numbers of lesser quality... But for tests we don't care.
>>>
>>>
>>> That explains why tests only time out sometimes... when the system
>>> happens to run out of entropy bits when a secure random is used.
>>>
>>> -- Lars
>>>
>>>
>>> ________________________________
>>> From: Stack <[email protected]>
>>> To: [email protected]
>>> Cc: lars hofhansl <[email protected]>
>>> Sent: Friday, November 25, 2011 7:14 PM
>>> Subject: Re: Status of 0.92RC
>>>
>>> On Fri, Nov 25, 2011 at 6:16 PM, Ted Yu <[email protected]> wrote:
>>>> I then looped TestHCM 4 times and there was no test failure.
>>>>
>>>
>>> It's fine on mac.  On ubuntu:
>>>
>>> -------------------------------------------------------------------------------
>>> Test set: org.apache.hadoop.hbase.client.TestHCM
>>> -------------------------------------------------------------------------------
>>> Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 757.578 sec <<< FAILURE!
>>> testClosing(org.apache.hadoop.hbase.client.TestHCM)  Time elapsed: 35.34 sec  <<< FAILURE!
>>> java.lang.AssertionError
>>>         at org.junit.Assert.fail(Assert.java:92)
>>>         at org.junit.Assert.assertTrue(Assert.java:43)
>>>         at org.junit.Assert.assertFalse(Assert.java:68)
>>>         at org.junit.Assert.assertFalse(Assert.java:79)
>>>         at org.apache.hadoop.hbase.client.TestHCM.testClosing(TestHCM.java:221)
>>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>
>>> ....
>>>
>>> Line numbers are off because I'm messing.  It's saying connection 1 is
>>> closed if I test it just after creating it.
>>>
>>> St.Ack
>>>
>>>> On Fri, Nov 25, 2011 at 5:39 PM, Ted Yu <[email protected]> wrote:
>>>>
>>>>> I looped TestHCM#testClosing 5 times on MacBook and didn't see test
>>>>> failure.
>>>>>
>>>>> Stack:
>>>>> Can you share the test output ?
>>>>>
>>>>> Thanks
>>>>>
>>>>>
>>>>> On Fri, Nov 25, 2011 at 5:04 PM, lars hofhansl <[email protected]> wrote:
>>>>>
>>>>>> I added testClosing as part of HBASE-4805, I'll have a look as soon
>>>>>> as I get a chance.
>>>>>>
>>>>>>
>>>>>>
>>>>>> ________________________________
>>>>>>  From: Stack <[email protected]>
>>>>>> To: HBase Dev List <[email protected]>
>>>>>> Sent: Friday, November 25, 2011 2:12 PM
>>>>>> Subject: Status of 0.92RC
>>>>>>
>>>>>> I'm having a little difficulty getting all tests to pass.  On
>>>>>> linux/ubuntu, TestHCM (testClosing strange issue) and TestReplication
>>>>>> are failing for me.  On mac osx, it'll build without fail about 50% of
>>>>>> the time.  I'd like to make it so tests pass all the time before
>>>>>> cutting the RC.  That's what I'm at these times.
>>>>>>
>>>>>> Also, 0.92 build on jenkins has been turned off by Apache
>>>>>> Infrastructure.  It was hanging.  It's done this in the past too and
>>>>>> when it hangs it requires a jenkins reboot which doesn't make Apache
>>>>>> Infrastructure team too happy.  The hang looks to me like a Jenkins
>>>>>> bug because build hangs before we even checkout src.  Am trying to see
>>>>>> what can be done to get it going again but that's the story at the mo.
>>>>>>
>>>>>> St.Ack
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>
>



-- 
Todd Lipcon
Software Engineer, Cloudera
