[ https://issues.apache.org/jira/browse/HBASE-3408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977980#action_12977980 ]
Hudson commented on HBASE-3408:
-------------------------------
Integrated in HBase-TRUNK #1703 (See
[https://hudson.apache.org/hudson/job/HBase-TRUNK/1703/])
> AssignmentManager NullPointerException
> --------------------------------------
>
> Key: HBASE-3408
> URL: https://issues.apache.org/jira/browse/HBASE-3408
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.90.0
> Reporter: Matt Corgan
> Fix For: 0.90.0
>
> Attachments: HBASE-3408[0.90.0].patch
>
> Original Estimate: 24h
> Remaining Estimate: 24h
>
> If AssignmentManager tries to move a region to an invalid destination server,
> it throws an NPE rather than choosing a random server as intended.
> Line 1009 should check that existingPlan.getDestination() != null:
>     if (existingPlan == null || forceNewPlan ||
>         (existingPlan.getDestination() != null &&
>          existingPlan.getDestination().equals(serverToExclude))) {
> I triggered it by trying to manually move regions around, probably to an
> invalid destination server. I'm not currently able to build the project to
> test if that's the extent of the problem, so here's a little more info...
> It leaves a stranded region-in-transition until the master and/or
> regionserver are restarted and causes problems like the following. "hbck
> -fix" was unable to repair it.
> 2011-01-04 00:14:10,948 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor: Scanned 4287 catalog row(s) and gc'd 0 unreferenced parent region(s)
> 2011-01-04 00:14:18,574 DEBUG org.apache.hadoop.hbase.master.HMaster: Not running balancer because 1 region(s) in transition: {23ebce9a5d174f87bfb96ed1da387fdc=RandomValue,,1291219068335.23ebce9a5d174f87bfb96ed1da387fdc. state=OFFLINE, ts=1294118046139}
> 2011-01-04 00:14:36,142 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out: RandomValue,,1291219068335.23ebce9a5d174f87bfb96ed1da387fdc. state=OFFLINE, ts=1294118046139
> 2011-01-04 00:14:36,142 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been OFFLINE for too long, reassigning RandomValue,,1291219068335.23ebce9a5d174f87bfb96ed1da387fdc. to a random server
> 2011-01-04 00:14:36,142 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; was=RandomValue,,1291219068335.23ebce9a5d174f87bfb96ed1da387fdc. state=OFFLINE, ts=1294118046139
> 2011-01-04 00:14:36,142 ERROR org.apache.hadoop.hbase.master.AssignmentManager$TimeoutMonitor: Caught exception
> java.lang.NullPointerException
>     at org.apache.hadoop.hbase.master.AssignmentManager.getRegionPlan(AssignmentManager.java:934)
>     (I think this is 0.90.0RC1, so same bug on a different line number)
>     at org.apache.hadoop.hbase.master.AssignmentManager.getRegionPlan(AssignmentManager.java:909)
>     at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:822)
>     at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:663)
>     at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:643)
>     at org.apache.hadoop.hbase.master.AssignmentManager$TimeoutMonitor.chore(AssignmentManager.java:1481)
>     at org.apache.hadoop.hbase.Chore.run(Chore.java:66)
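
The null guard proposed in the description can be illustrated in isolation. The following is only a minimal sketch, not the HBase source: the class RegionPlanCheck, the nested Plan type, and the method needsNewPlan are hypothetical stand-ins, and server names are modeled as plain strings. It shows that the guarded condition evaluates without an NPE when the existing plan has no destination.

```java
public class RegionPlanCheck {

    /** Hypothetical stand-in for a region plan; only the destination is modeled. */
    static final class Plan {
        private final String destination; // may be null for an invalid destination

        Plan(String destination) {
            this.destination = destination;
        }

        String getDestination() {
            return destination;
        }
    }

    /**
     * Mirrors the condition quoted in the issue description: a new plan is
     * needed if there is no existing plan, if a new plan is forced, or if the
     * existing plan targets the server that must be excluded. The extra
     * null check keeps equals() from being called on a null destination.
     */
    static boolean needsNewPlan(Plan existingPlan, boolean forceNewPlan,
                                String serverToExclude) {
        return existingPlan == null
            || forceNewPlan
            || (existingPlan.getDestination() != null
                && existingPlan.getDestination().equals(serverToExclude));
    }

    public static void main(String[] args) {
        // An existing plan with a null destination no longer triggers an NPE.
        Plan invalid = new Plan(null);
        System.out.println(needsNewPlan(invalid, false, "rs1"));
        // A plan that targets the excluded server still requires a new plan.
        System.out.println(needsNewPlan(new Plan("rs1"), false, "rs1"));
    }
}
```

Note that with this guard alone, an unforced call with a null destination evaluates to false, so the caller would keep the existing plan. Whether a null destination should instead force a fresh plan is a separate question that this sketch does not settle.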