[ 
https://issues.apache.org/jira/browse/HDFS-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012258#comment-13012258
 ] 

John George commented on HDFS-1782:
-----------------------------------

This was caused because of a null pointer deference. The change was tested by 
the same person who filed this bug and it passed. 

I thought about adding a test to run multiple balancers to as part of the unit 
test, but the timing of balancers does not guarantee that this will always test 
the change that was put in. Adding a "mock" test was a possibility, but it 
seems to me that it requires source code change to be able to "mock" and test 
the particular change.


    [exec] BUILD SUCCESSFUL
     [exec] Total time: 49 seconds
     [exec] 
     [exec] 
     [exec] 
     [exec] 
     [exec] -1 overall.  
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     -1 tests included.  The patch doesn't appear to include any new 
or modified tests.
     [exec]                         Please justify why no new tests are needed 
for this patch.
     [exec]                         Also please list what manual steps were 
performed to verify this patch.
     [exec] 
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning 
messages.
     [exec] 
     [exec]     +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
     [exec] 
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs 
(version 1.3.9) warnings.
     [exec] 
     [exec]     +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
     [exec] 
     [exec]     +1 system test framework.  The patch passed system test 
framework compile.
     [exec] 


> FSNamesystem.startFileInternal(..) throws NullPointerException
> --------------------------------------------------------------
>
>                 Key: HDFS-1782
>                 URL: https://issues.apache.org/jira/browse/HDFS-1782
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 0.22.0
>            Reporter: John George
>            Assignee: John George
>             Fix For: 0.22.0
>
>         Attachments: HDFS-1782.patch
>
>
> I'm observing when there is one balancer running trying to run another one 
> results in
> "Java.lang.NullPointerException" error. I was hoping to see message "Another 
> balancer is running. 
> Exiting....  Exiting ...". This is a reproducible issue.
> Details
> ========
> 1) Cluster ->elrond
> [hdfs@gsbl90568 smilli]$ hadoop version
> Hadoop 0.22.0.1102280202
> Subversion 
> git://hadoopre5.corp.sk1.yahoo.com/home/y/var/builds/thread2/workspace/Cloud-HadoopCOMMON-0.22-Secondary
>  -r
> c7c9a21d7289e29f0133452acf8b761e455a84b5
> Compiled by hadoopqa on Mon Feb 28 02:12:38 PST 2011
> From source with checksum 9ecbc6f17e8847a1cddca2282dbd9b31
> [hdfs@gsbl90568 smilli]$
> 2) Run first balancer
> [hdfs@gsbl90565 smilli]$ hdfs balancer
> 11/03/09 16:33:56 INFO balancer.Balancer: namenodes = 
> [gsbl90565.blue.ygrid.yahoo.com/98.137.97.57:8020,
> gsbl90569.blue.ygrid.yahoo.com/98.137.97.53:8020]
> 11/03/09 16:33:56 INFO balancer.Balancer: p         = 
> Balancer.Parameters[BalancingPolicy.Node, threshold=10.0]
> Time Stamp               Iteration#  Bytes Already Moved  Bytes Left To Move  
> Bytes Being Moved
> 11/03/09 16:33:57 WARN conf.Configuration: mapred.task.id is deprecated. 
> Instead, use mapreduce.task.attempt.id
> 11/03/09 16:33:57 INFO balancer.Balancer: Block token params received from 
> NN: keyUpdateInterval=600 min(s),
> tokenLifetime=600 min(s)
> 11/03/09 16:33:57 INFO block.BlockTokenSecretManager: Setting block keys
> 11/03/09 16:33:57 INFO balancer.Balancer: Balancer will update its block keys 
> every 150 minute(s)
> 11/03/09 16:33:57 INFO block.BlockTokenSecretManager: Setting block keys
> 11/03/09 16:33:57 INFO balancer.Balancer: Block token params received from 
> NN: keyUpdateInterval=600 min(s),
> tokenLifetime=600 min(s)
> 11/03/09 16:33:57 INFO block.BlockTokenSecretManager: Setting block keys
> 11/03/09 16:33:57 INFO balancer.Balancer: Balancer will update its block keys 
> every 150 minute(s)
> 11/03/09 16:33:57 INFO block.BlockTokenSecretManager: Setting block keys
> 11/03/09 16:33:57 INFO net.NetworkTopology: Adding a new node: 
> /98.137.97.0/98.137.97.62:1004
> 11/03/09 16:33:57 INFO net.NetworkTopology: Adding a new node: 
> /98.137.97.0/98.137.97.58:1004
> 11/03/09 16:33:57 INFO net.NetworkTopology: Adding a new node: 
> /98.137.97.0/98.137.97.60:1004
> 11/03/09 16:33:57 INFO net.NetworkTopology: Adding a new node: 
> /98.137.97.0/98.137.97.59:1004
> 11/03/09 16:33:57 INFO balancer.Balancer: 1 over-utilized: 
> [Source[98.137.97.62:1004, utilization=24.152507825759344]]
> 11/03/09 16:33:57 INFO balancer.Balancer: 0 underutilized: []
> 11/03/09 16:33:57 INFO balancer.Balancer: Need to move 207.98 GB to make the 
> cluster balanced.
> 11/03/09 16:33:57 INFO balancer.Balancer: Decided to move 10 GB bytes from 
> 98.137.97.62:1004 to 98.137.97.58:1004
> 11/03/09 16:33:57 INFO balancer.Balancer: Will move 10 GB in this iteration
> Mar 9, 2011 4:33:57 PM            0                 0 KB           207.98 GB  
>             10 GB
> .
> .
> .
> 11/03/09 16:34:36 INFO balancer.Balancer: Moving block -63570336576981940 
> from 98.137.97.62:1004 to 98.137.97.59:1004
> through 98.137.97.62:1004 is succeeded.
> 11/03/09 16:34:39 INFO balancer.Balancer: Moving block 2379736326585824737 
> from 98.137.97.62:1004 to 98.137.97.59:1004
> through 98.137.97.62:1004 is succeeded.
> 11/03/09 16:35:21 INFO balancer.Balancer: Moving block 8884583953927078028 
> from 98.137.97.62:1004 to 98.137.97.59:1004
> through 98.137.97.62:1004 is succeeded.
> 11/03/09 16:35:24 INFO balancer.Balancer: Moving block -135758138424743964 
> from 98.137.97.62:1004 to 98.137.97.59:1004
> through 98.137.97.62:1004 is succeeded.
> 11/03/09 16:35:27 INFO balancer.Balancer: Moving block -4598153351946352185 
> from 98.137.97.62:1004 to 98.137.97.59:1004
> through 98.137.97.62:1004 is succeeded.
> 11/03/09 16:35:33 INFO balancer.Balancer: Moving block 2966087210491094643 
> from 98.137.97.62:1004 to 98.137.97.59:1004
> through 98.137.97.62:1004 is succeeded.
> 11/03/09 16:35:42 INFO balancer.Balancer: Moving block -5573983508500804184 
> from 98.137.97.62:1004 to 98.137.97.59:1004
> through 98.137.97.62:1004 is succeeded.
> 11/03/09 16:35:58 INFO balancer.Balancer: Moving block -6222779741597113957 
> from 98.137.97.62:1004 to 98.137.97.59:1004
> through 98.137.97.62:1004 is succeeded.
> 3) Run another balancer observe
> [hdfs@gsbl90568 smilli]$ hdfs balancer
> 11/03/09 16:34:32 INFO balancer.Balancer: namenodes = 
> [gsbl90565.blue.ygrid.yahoo.com/98.137.97.57:8020,
> gsbl90569.blue.ygrid.yahoo.com/98.137.97.53:8020]
> 11/03/09 16:34:32 INFO balancer.Balancer: p         = 
> Balancer.Parameters[BalancingPolicy.Node, threshold=10.0]
> Time Stamp               Iteration#  Bytes Already Moved  Bytes Left To Move  
> Bytes Being Moved
> 11/03/09 16:34:33 WARN conf.Configuration: mapred.task.id is deprecated. 
> Instead, use mapreduce.task.attempt.id
> 11/03/09 16:34:33 INFO balancer.Balancer: Block token params received from 
> NN: keyUpdateInterval=600 min(s),
> tokenLifetime=600 min(s)
> 11/03/09 16:34:33 INFO block.BlockTokenSecretManager: Setting block keys
> 11/03/09 16:34:33 INFO balancer.Balancer: Balancer will update its block keys 
> every 150 minute(s)
> 11/03/09 16:34:33 INFO block.BlockTokenSecretManager: Setting block keys
> java.io.IOException: java.lang.NullPointerException
>         at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1400)
>         at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1284)
>         at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:779)
>         at sun.reflect.GeneratedMethodAccessor46.invoke(Unknown Source)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at 
> org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:346)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1399)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1395)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1094)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1393)
> .  Exiting ...
> Balancing took 1.366 seconds
> [hdfs@gsbl90568 smilli]$
> Pls let me know if you need additional information.
> Thanks,
> Suma
>               

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to