[jira] [Resolved] (HBASE-5482) In 0.90, balancer algo leading to same region balanced twice and picking same region with Src and Destination as same RS.

2012-03-20 Thread ramkrishna.s.vasudevan (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan resolved HBASE-5482.
---

  Resolution: Fixed
Hadoop Flags: Reviewed

 In 0.90, balancer algo leading to same region balanced twice and picking same 
 region with Src and Destination as same RS.
 -

 Key: HBASE-5482
 URL: https://issues.apache.org/jira/browse/HBASE-5482
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.5
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.90.7

 Attachments: 5482-v2.txt, HBASE-5482_1.patch, HBASE-5482_2.patch


 There are two possible problems:
 - When we populate regionsToMove while iterating over the server info in 
 descending order, the same region can be added twice. In the first loop the 
 regions are picked in randomized order, whereas when neededRegions != 0 we 
 just take the region at the index and add it again. This can leave the same 
 region in the regionsToMove list twice.
 - When the first problem occurs, an entry in regionsToMove can end up with 
 the same source and destination, and the same region can be picked every 
 5 minutes.
 {code}
 for (Map.Entry<HServerInfo, List<HRegionInfo>> server :
     serversByLoad.descendingMap().entrySet()) {
   BalanceInfo balanceInfo = serverBalanceInfo.get(server.getKey());
   int idx =
     balanceInfo == null ? 0 : balanceInfo.getNextRegionForUnload();
   if (idx >= server.getValue().size()) break;
   HRegionInfo region = server.getValue().get(idx);
   if (region.isMetaRegion()) continue; // Don't move meta regions.
   regionsToMove.add(new RegionPlan(region, server.getKey(), null));
   if (--neededRegions == 0) {
     // No more regions needed, done shedding
     break;
   }
 }
 {code}
 If the meta and root regions are on the two most loaded region servers (3 RS 
 in total), we skip the regions on those servers and populate regionsToMove 
 with a region from the least loaded RS.
 Then in the next loop we iterate from the least loaded server and set that 
 same server as the destination.
 This leads to a state where balancing runs every 5 minutes with the same 
 server as both source and destination.
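
The two failure modes described above can be illustrated with a small standalone sketch. It does not use any HBase classes: `Plan`, `addPlan` and `dropNoOps` are hypothetical names, and this is not the attached patch. It only shows one possible way to guard a move list against duplicate regions and against plans whose source and destination are the same server.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a region move plan; not the HBase RegionPlan class.
final class Plan {
  final String region;
  final String source;
  String destination; // null until a destination is chosen

  Plan(String region, String source) {
    this.region = region;
    this.source = source;
  }
}

final class BalancerSketch {
  // Adds a plan only if the region is not already scheduled to move.
  static boolean addPlan(List<Plan> plans, Set<String> scheduled,
      String region, String source) {
    if (!scheduled.add(region)) {
      return false; // region already in the list, skip the duplicate
    }
    plans.add(new Plan(region, source));
    return true;
  }

  // Returns the plans whose destination differs from their source.
  static List<Plan> dropNoOps(List<Plan> plans) {
    List<Plan> result = new ArrayList<Plan>();
    for (Plan p : plans) {
      if (p.destination != null && p.destination.equals(p.source)) {
        continue; // moving a region onto its own server is a no-op
      }
      result.add(p);
    }
    return result;
  }

  public static void main(String[] args) {
    List<Plan> plans = new ArrayList<Plan>();
    Set<String> scheduled = new HashSet<String>();
    addPlan(plans, scheduled, "r1", "rs3");
    addPlan(plans, scheduled, "r1", "rs3"); // rejected as a duplicate
    plans.get(0).destination = "rs3";       // same server as the source
    System.out.println(dropNoOps(plans).size()); // prints 0
  }
}
```

Tracking scheduled regions in a set keeps the duplicate check cheap; filtering no-op plans at the end is only one option, and the real balancer might instead avoid choosing such a destination in the first place.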

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HBASE-5490) Move the enum RS_ZK_REGION_FAILED_OPEN to the last of the enum list in 0.90 EventHandler

2012-03-18 Thread ramkrishna.s.vasudevan (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan resolved HBASE-5490.
---

   Resolution: Fixed
Fix Version/s: (was: 0.90.7)
   0.90.6
 Assignee: ramkrishna.s.vasudevan

This is already committed to 0.90.6.  Changing the fix version to 0.90.6.

 Move the enum RS_ZK_REGION_FAILED_OPEN to the last of the enum list in 0.90 
 EventHandler
 

 Key: HBASE-5490
 URL: https://issues.apache.org/jira/browse/HBASE-5490
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.90.6

 Attachments: 5490-v2.txt, HBASE-5490.patch


 The newly added state RS_ZK_REGION_FAILED_OPEN was causing the rolling 
 restart to fail.
 So move the new enum constant to the end of the list.
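
A minimal sketch of why the position of a new enum constant can matter during a rolling restart. It assumes the event type is exchanged between old and new nodes as its ordinal; the issue text does not state the encoding, so that is an assumption. The three enums below are hypothetical and their constant lists are illustrative, not the real 0.90 EventHandler list.

```java
// Constant list as an old node would know it (illustrative only).
enum OldEvents { RS_ZK_REGION_CLOSED, RS_ZK_REGION_OPENED, M_ZK_REGION_OFFLINE }

// New constant inserted in the middle: ordinals of later constants shift.
enum InsertedInMiddle {
  RS_ZK_REGION_CLOSED, RS_ZK_REGION_FAILED_OPEN, RS_ZK_REGION_OPENED, M_ZK_REGION_OFFLINE
}

// New constant appended at the end: existing ordinals are unchanged.
enum AppendedAtEnd {
  RS_ZK_REGION_CLOSED, RS_ZK_REGION_OPENED, M_ZK_REGION_OFFLINE, RS_ZK_REGION_FAILED_OPEN
}

final class EnumOrdinalSketch {
  // Decodes an ordinal received from another node using the given enum's constants.
  static <E extends Enum<E>> String decode(Class<E> type, int ordinal) {
    return type.getEnumConstants()[ordinal].name();
  }

  public static void main(String[] args) {
    int sent = OldEvents.RS_ZK_REGION_OPENED.ordinal(); // 1 on the old node
    System.out.println(decode(InsertedInMiddle.class, sent)); // RS_ZK_REGION_FAILED_OPEN
    System.out.println(decode(AppendedAtEnd.class, sent));    // RS_ZK_REGION_OPENED
  }
}
```

With the constant appended at the end, a node running the new code decodes every ordinal sent by an old node to the same name the old node meant.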





[jira] [Resolved] (HBASE-5321) this.allRegionServersOffline not set to false after one RS comes online and assignment is done in 0.90.

2012-02-06 Thread ramkrishna.s.vasudevan (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan resolved HBASE-5321.
---

Resolution: Fixed

Committed to 0.90.

 this.allRegionServersOffline  not set to false after one RS comes online and 
 assignment is done in 0.90.
 

 Key: HBASE-5321
 URL: https://issues.apache.org/jira/browse/HBASE-5321
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.5
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.90.6

 Attachments: HBASE-5321.patch


 With HBASE-5160 we do not wait for TM to assign the regions after the first 
 RS comes online.
 After that change the variable this.allRegionServersOffline needs to be 
 reset, which is not done in 0.90.





[jira] [Resolved] (HBASE-4893) HConnectionImplementation is closed but not deleted

2012-01-25 Thread ramkrishna.s.vasudevan (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan resolved HBASE-4893.
---

Resolution: Fixed
  Assignee: Mubarak Seyed

Resolving the issue

 HConnectionImplementation is closed but not deleted
 ---

 Key: HBASE-4893
 URL: https://issues.apache.org/jira/browse/HBASE-4893
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.1
 Environment: Linux 2.6, HBase-0.90.1
Reporter: Mubarak Seyed
Assignee: Mubarak Seyed
  Labels: noob
 Fix For: 0.90.6

 Attachments: HBASE-4893.v1.patch, HBASE-4893.v2.patch


 In abort() of HConnectionManager$HConnectionImplementation, the 
 HConnectionImplementation instance is marked as this.closed=true.
 There is no way for a client application to check whether the HBase client 
 connection is still open (this.closed=false) or not. We need a method such 
 as isClosed() to validate the state of a connection.
 {code}
 public boolean isClosed() {
   return this.closed;
 }
 {code}
 Once the connection is closed it should also be deleted. Currently the client 
 application still gets the closed connection from 
 HConnectionManager.getConnection(Configuration) and tries to make an RPC call 
 to the RS. Since the connection is already closed, 
 HConnectionImplementation.getRegionServerWithRetries throws 
 RetriesExhaustedException with the error message
 {code}
 Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying 
 to contact region server null for region , row 
 '----xxx', but failed after 10 attempts.
 Exceptions:
 java.io.IOException: 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
  closed
 java.io.IOException: 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
  closed
 java.io.IOException: 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
  closed
 java.io.IOException: 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
  closed
 java.io.IOException: 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
  closed
 java.io.IOException: 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
  closed
 java.io.IOException: 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
  closed
 java.io.IOException: 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
  closed
 java.io.IOException: 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
  closed
 java.io.IOException: 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
  closed
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1008)
   at org.apache.hadoop.hbase.client.HTable.get(HTable.java:546)
 {code}
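
The behaviour asked for above (a closed connection should not be handed out again) can be sketched as follows. `FakeConnection` and `ConnectionCacheSketch` are hypothetical classes written for illustration; they are not HConnectionManager, and the attached patches may solve this differently.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical connection exposing the isClosed() accessor proposed in this issue.
final class FakeConnection {
  private volatile boolean closed;

  void abort() {
    this.closed = true;
  }

  boolean isClosed() {
    return this.closed;
  }
}

// Hypothetical cache keyed by a configuration name.
final class ConnectionCacheSketch {
  private final Map<String, FakeConnection> cache =
      new HashMap<String, FakeConnection>();

  // Returns a cached connection, replacing it first if it has been closed.
  synchronized FakeConnection getConnection(String key) {
    FakeConnection c = cache.get(key);
    if (c == null || c.isClosed()) {
      c = new FakeConnection();
      cache.put(key, c); // overwrites (deletes) the stale, closed instance
    }
    return c;
  }

  public static void main(String[] args) {
    ConnectionCacheSketch manager = new ConnectionCacheSketch();
    FakeConnection first = manager.getConnection("conf");
    first.abort();
    FakeConnection second = manager.getConnection("conf");
    System.out.println(first != second); // prints true
  }
}
```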





[jira] [Resolved] (HBASE-5235) HLogSplitter writer thread's streams not getting closed when any of the writer threads has exceptions.

2012-01-25 Thread ramkrishna.s.vasudevan (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan resolved HBASE-5235.
---

Resolution: Fixed

Committed to 0.92, trunk and 0.90

 HLogSplitter writer thread's streams not getting closed when any of the 
 writer threads has exceptions.
 --

 Key: HBASE-5235
 URL: https://issues.apache.org/jira/browse/HBASE-5235
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.5, 0.92.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.90.6, 0.92.1

 Attachments: HBASE-5235_0.90.patch, HBASE-5235_0.90_1.patch, 
 HBASE-5235_0.90_2.patch, HBASE-5235_trunk.patch


 Please find the analysis below.  Correct me if I am wrong.
 {code}
 2012-01-15 05:14:02,374 FATAL 
 org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: WriterThread-9 Got 
 while writing log entry to log
 java.io.IOException: All datanodes 10.18.40.200:50010 are bad. Aborting...
   at 
 org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:3373)
   at 
 org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2811)
   at 
 org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:3026)
 {code}
 Here one of the writer threads hit an exception. When any exception occurs we 
 store it in an atomic variable:
 {code}
 private void writerThreadError(Throwable t) {
   thrown.compareAndSet(null, t);
 }
 {code}
 In the finally block of splitLog we try to close the streams.
 {code}
 for (WriterThread t : writerThreads) {
   try {
     t.join();
   } catch (InterruptedException ie) {
     throw new IOException(ie);
   }
   checkForErrors();
 }
 LOG.info("Split writers finished");

 return closeStreams();
 {code}
 Inside checkForErrors
 {code}
 private void checkForErrors() throws IOException {
   Throwable thrown = this.thrown.get();
   if (thrown == null) return;
   if (thrown instanceof IOException) {
     throw (IOException)thrown;
   } else {
     throw new RuntimeException(thrown);
   }
 }
 {code}
 So once checkForErrors throws the exception, closeStreams() is never reached 
 and the DFSStreamer threads are not closed.
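
One possible shape for avoiding the leak is to close the streams in a finally block so that a rethrown writer error cannot skip the close. The sketch below is standalone and hypothetical: `SplitCloseSketch`, `finish` and the `streams` list are illustrative names, and it is not the attached patch.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

final class SplitCloseSketch {
  final AtomicReference<Throwable> thrown = new AtomicReference<Throwable>();
  final List<Closeable> streams = new ArrayList<Closeable>();

  // Records only the first error reported by any writer thread.
  void writerThreadError(Throwable t) {
    thrown.compareAndSet(null, t);
  }

  // Rethrows the recorded writer error, if any.
  void checkForErrors() throws IOException {
    Throwable t = thrown.get();
    if (t == null) return;
    if (t instanceof IOException) {
      throw (IOException) t;
    }
    throw new RuntimeException(t);
  }

  // Closes every stream; a failure on one stream does not stop the others.
  void closeStreams() {
    for (Closeable c : streams) {
      try {
        c.close();
      } catch (IOException e) {
        // a real implementation would log this
      }
    }
  }

  // Surfaces writer errors, but always closes the streams first.
  void finish() throws IOException {
    try {
      checkForErrors();
    } finally {
      closeStreams();
    }
  }

  public static void main(String[] args) {
    SplitCloseSketch s = new SplitCloseSketch();
    s.writerThreadError(new IOException("All datanodes are bad"));
    try {
      s.finish();
    } catch (IOException e) {
      System.out.println("error surfaced after streams were closed");
    }
  }
}
```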





[jira] [Resolved] (HBASE-5269) IllegalMonitorStateException while retryin HLog split in 0.90 branch.

2012-01-24 Thread ramkrishna.s.vasudevan (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan resolved HBASE-5269.
---

  Resolution: Fixed
Hadoop Flags: Reviewed

Committed to 0.90.
Thanks for the review Stack and Ted.

 IllegalMonitorStateException while retryin HLog split in 0.90 branch.
 -

 Key: HBASE-5269
 URL: https://issues.apache.org/jira/browse/HBASE-5269
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.5
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.90.6

 Attachments: HBASE-5269.patch


 This bug was introduced as part of the HBASE-5137 fix.  The splitLogLock is 
 released in the finally block inside the do-while loop, so when the loop 
 executes a second time the unlock of splitLogLock throws 
 IllegalMonitorStateException.
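
The failure described above can be reproduced with a plain `java.util.concurrent.locks.ReentrantLock`. This is a standalone sketch, assuming the lock is acquired once before the retry loop; `RetryLockSketch` and its methods are hypothetical names and the loop body stands in for a split attempt.

```java
import java.util.concurrent.locks.ReentrantLock;

final class RetryLockSketch {
  // Buggy shape: lock once outside the loop, unlock in a finally inside it.
  static boolean unlockInsideLoopThrows(int attempts) {
    ReentrantLock lock = new ReentrantLock();
    lock.lock();
    try {
      for (int i = 0; i < attempts; i++) {
        try {
          // one split attempt would run here
        } finally {
          lock.unlock(); // on the second iteration the lock is no longer held
        }
      }
    } catch (IllegalMonitorStateException e) {
      return true;
    }
    return false;
  }

  // One possible shape: acquire and release the lock within each attempt.
  static int lockPerAttempt(int attempts) {
    ReentrantLock lock = new ReentrantLock();
    int completed = 0;
    for (int i = 0; i < attempts; i++) {
      lock.lock();
      try {
        completed++;
      } finally {
        lock.unlock();
      }
    }
    return completed;
  }

  public static void main(String[] args) {
    System.out.println(unlockInsideLoopThrows(2)); // prints true
    System.out.println(lockPerAttempt(2));         // prints 2
  }
}
```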





[jira] [Resolved] (HBASE-5225) Backport HBASE-3845 -data loss because lastSeqWritten can miss memstore edits

2012-01-21 Thread ramkrishna.s.vasudevan (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan resolved HBASE-5225.
---

Resolution: Fixed

Committed to 0.90.

 Backport HBASE-3845 -data loss because lastSeqWritten can miss memstore edits
 -

 Key: HBASE-5225
 URL: https://issues.apache.org/jira/browse/HBASE-5225
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.4
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.90.6

 Attachments: HBASE-3845-90.patch, HBASE-3845_0.90_1.patch


 Critical defect. Patch from HBASE-3845 was not integrated to 0.90.





[jira] [Resolved] (HBASE-5207) Apply HBASE-5155 to trunk

2012-01-16 Thread ramkrishna.s.vasudevan (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan resolved HBASE-5207.
---

Resolution: Duplicate

Same as HBASE-5206

 Apply HBASE-5155  to trunk
 --

 Key: HBASE-5207
 URL: https://issues.apache.org/jira/browse/HBASE-5207
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan

 The issue HBASE-5155 has been fixed on the 0.90 branch.  The same fix has to 
 be applied to trunk as well.





[jira] [Resolved] (HBASE-5155) ServerShutDownHandler And Disable/Delete should not happen parallely leading to recreation of regions that were deleted

2012-01-15 Thread ramkrishna.s.vasudevan (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan resolved HBASE-5155.
---

Resolution: Fixed

committed to branch 0.90.

 ServerShutDownHandler And Disable/Delete should not happen parallely leading 
 to recreation of regions that were deleted
 ---

 Key: HBASE-5155
 URL: https://issues.apache.org/jira/browse/HBASE-5155
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.4
Reporter: ramkrishna.s.vasudevan
Priority: Blocker
 Fix For: 0.90.6

 Attachments: HBASE-5155_1.patch, HBASE-5155_2.patch, 
 HBASE-5155_3.patch, HBASE-5155_latest.patch, hbase-5155_6.patch


 ServerShutdownHandler and the disable/delete table handlers race.  This issue 
 is not due to TM.
 - A regionserver goes down.  In our cluster the regionserver holds a lot of 
 regions.
 - A region R1 has two daughters, D1 and D2.
 - The ServerShutdownHandler gets called, scans META and gets all the user 
 regions.
 - In parallel a table is disabled. (No problem in this step.)
 - Delete table is done.
 - The table and its regions are deleted, including R1, D1 and D2. (So META is 
 cleaned.)
 - Now ServerShutdownHandler starts processDeadRegion:
 {code}
 if (hri.isOffline() && hri.isSplit()) {
   LOG.debug("Offlined and split region " + hri.getRegionNameAsString() +
     "; checking daughter presence");
   fixupDaughters(result, assignmentManager, catalogTracker);
 {code}
 As part of fixupDaughters, since the daughters D1 and D2 are missing for R1,
 {code}
 if (isDaughterMissing(catalogTracker, daughter)) {
   LOG.info("Fixup; missing daughter " + daughter.getRegionNameAsString());
   MetaEditor.addDaughter(catalogTracker, daughter, null);
   // TODO: Log WARN if the regiondir does not exist in the fs.  If its not
   // there then something wonky about the split -- things will keep going
   // but could be missing references to parent region.
   // And assign it.
   assignmentManager.assign(daughter, true);
 {code}
 we call assign on the daughters.
 After this we continue with the code below.
 {code}
 if (processDeadRegion(e.getKey(), e.getValue(),
 this.services.getAssignmentManager(),
 this.server.getCatalogTracker())) {
   this.services.getAssignmentManager().assign(e.getKey(), true);
 {code}
 When the SSH scanned META it had seen R1, D1 and D2.
 So as part of the above code, D1 and D2, which were already assigned by 
 fixupDaughters, are assigned again by
 {code}
 this.services.getAssignmentManager().assign(e.getKey(), true);
 {code}
 This leads to a ZooKeeper error due to a bad version, which kills the master.
 More importantly, the regions that were deleted are recreated, which I think 
 is more critical.
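
The race above comes from acting on a META snapshot that has gone stale. As a generic illustration only (this is not the HBASE-5155 patch, and `StaleSnapshotSketch` and `regionsToAssign` are hypothetical names), one way to avoid both the recreation and the double assignment is to re-check each region against the current catalog and against the regions already assigned in this pass:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class StaleSnapshotSketch {
  // Returns the regions from an earlier scan that still exist in the catalog
  // and have not been assigned yet; alreadyAssigned is updated as a side effect.
  static List<String> regionsToAssign(List<String> snapshot,
      Set<String> currentCatalog, Set<String> alreadyAssigned) {
    List<String> result = new ArrayList<String>();
    for (String region : snapshot) {
      if (!currentCatalog.contains(region)) {
        continue; // deleted since the scan, do not recreate it
      }
      if (!alreadyAssigned.add(region)) {
        continue; // already assigned earlier, do not assign twice
      }
      result.add(region);
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> snapshot = Arrays.asList("R1", "D1", "D2");
    Set<String> catalogAfterDelete = new HashSet<String>(); // table was deleted
    System.out.println(
        regionsToAssign(snapshot, catalogAfterDelete, new HashSet<String>())); // prints []
  }
}
```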





[jira] [Resolved] (HBASE-5192) Backport HBASE-4236 Don't lock the stream while serializing the response

2012-01-14 Thread ramkrishna.s.vasudevan (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan resolved HBASE-5192.
---

  Resolution: Fixed
Hadoop Flags: Reviewed

Thanks for the review Ted.
Committed to 0.90

 Backport HBASE-4236 Don't lock the stream while serializing the response
 

 Key: HBASE-5192
 URL: https://issues.apache.org/jira/browse/HBASE-5192
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
 Fix For: 0.90.6

 Attachments: HBASE-4236_0.90.patch


 Backporting to 0.90.6





[jira] [Resolved] (HBASE-5160) Backport HBASE-4397 - -ROOT-, .META. tables stay offline for too long in recovery phase after all RSs are shutdown at the same time

2012-01-14 Thread ramkrishna.s.vasudevan (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan resolved HBASE-5160.
---

  Resolution: Fixed
Hadoop Flags: Reviewed

Committed to 0.90
Thanks for the review Ted

 Backport HBASE-4397 - -ROOT-, .META. tables stay offline for too long in 
 recovery phase after all RSs are shutdown at the same time
 ---

 Key: HBASE-5160
 URL: https://issues.apache.org/jira/browse/HBASE-5160
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
 Fix For: 0.90.6

 Attachments: HBASE-5160-AssignmentManager.patch, HBASE-5160_2.patch


 Backporting to 0.90.6 considering the importance of the issue.





[jira] [Resolved] (HBASE-5159) Backport HBASE-4079 - HTableUtil - helper class for loading data

2012-01-14 Thread ramkrishna.s.vasudevan (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan resolved HBASE-5159.
---

   Resolution: Fixed
Fix Version/s: 0.90.6
 Hadoop Flags: Reviewed

 Backport HBASE-4079 - HTableUtil - helper class for loading data 
 -

 Key: HBASE-5159
 URL: https://issues.apache.org/jira/browse/HBASE-5159
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
 Fix For: 0.90.6

 Attachments: HBASE-4079.patch


 Backporting to 0.90.6 considering the usefulness of the feature.





[jira] [Resolved] (HBASE-5184) Backport HBASE-5152 - Region is on service before completing initialization when doing rollback of split, it will affect read correctness

2012-01-14 Thread ramkrishna.s.vasudevan (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan resolved HBASE-5184.
---

  Resolution: Fixed
Hadoop Flags: Reviewed

 Backport HBASE-5152 - Region is on service before completing initialization 
 when doing rollback of split, it will affect read correctness 
 --

 Key: HBASE-5184
 URL: https://issues.apache.org/jira/browse/HBASE-5184
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
 Fix For: 0.90.6

 Attachments: HBASE-5152_0.90.patch


 Important issue to be merged into 0.90.





[jira] [Resolved] (HBASE-5168) Backport HBASE-5100 - Rollback of split could cause closed region to be opened again

2012-01-14 Thread ramkrishna.s.vasudevan (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan resolved HBASE-5168.
---

  Resolution: Fixed
Hadoop Flags: Reviewed

 Backport HBASE-5100 - Rollback of split could cause closed region to be 
 opened again
 

 Key: HBASE-5168
 URL: https://issues.apache.org/jira/browse/HBASE-5168
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
 Fix For: 0.90.6

 Attachments: HBASE-5100_0.90.patch


 Considering the importance of the defect merging it to 0.90.6





[jira] [Resolved] (HBASE-5158) Backport HBASE-4878 - Master crash when splitting hlog may cause data loss

2012-01-14 Thread ramkrishna.s.vasudevan (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan resolved HBASE-5158.
---

  Resolution: Fixed
Hadoop Flags: Reviewed

 Backport HBASE-4878 - Master crash when splitting hlog may cause data loss
 --

 Key: HBASE-5158
 URL: https://issues.apache.org/jira/browse/HBASE-5158
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
 Fix For: 0.90.6

 Attachments: HBASE-4878_branch90_1.patch


 Backporting to 0.90.6 considering the importance of the issue.





[jira] [Resolved] (HBASE-5178) Backport HBASE-4101 - Regionserver Deadlock

2012-01-12 Thread ramkrishna.s.vasudevan (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan resolved HBASE-5178.
---

Resolution: Fixed

 Backport HBASE-4101 - Regionserver Deadlock
 ---

 Key: HBASE-5178
 URL: https://issues.apache.org/jira/browse/HBASE-5178
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
 Fix For: 0.90.6

 Attachments: HBASE-4101_0.90_1.patch


 Critical issue not merged to 0.90.  





[jira] [Resolved] (HBASE-5137) MasterFileSystem.splitLog() should abort even if waitOnSafeMode() throws IOException

2012-01-10 Thread ramkrishna.s.vasudevan (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan resolved HBASE-5137.
---

   Resolution: Fixed
Fix Version/s: (was: 0.92.1)
   0.92.0

 MasterFileSystem.splitLog() should abort even if waitOnSafeMode() throws 
 IOException
 

 Key: HBASE-5137
 URL: https://issues.apache.org/jira/browse/HBASE-5137
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.4
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.0, 0.90.6

 Attachments: 5137-trunk.txt, HBASE-5137.patch, HBASE-5137.patch


 I am not sure whether this bug has already been raised in JIRA.
 In our test cluster we had a scenario where the RS had gone down and 
 ServerShutdownHandler started splitLog.
 But as HDFS was down, the waitOnSafeMode check threw an IOException.
 {code}
 try {
   // If FS is in safe mode, just wait till out of it.
   FSUtils.waitOnSafeMode(conf,
     conf.getInt(HConstants.THREAD_WAKE_FREQUENCY, 1000));
   splitter.splitLog();
 } catch (OrphanHLogAfterSplitException e) {
 {code}
 We catch the exception
 {code}
 } catch (IOException e) {
   checkFileSystem();
   LOG.error("Failed splitting " + logDir.toString(), e);
 }
 {code}
 So the HLog split itself did not happen. We found about 4 regions, recently 
 split on the crashed RS, that were lost.
 Can we abort the Master in such scenarios? Please suggest.





[jira] [Resolved] (HBASE-5073) Registered listeners not getting removed leading to memory leak in HBaseAdmin

2011-12-27 Thread ramkrishna.s.vasudevan (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan resolved HBASE-5073.
---

Resolution: Fixed

Committed to branch hence resolving.

 Registered listeners not getting removed leading to memory leak in HBaseAdmin
 -

 Key: HBASE-5073
 URL: https://issues.apache.org/jira/browse/HBASE-5073
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.4
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.90.6

 Attachments: HBASE-5073.patch


 HBaseAdmin APIs like tableExists(), flush, split and closeRegion use the 
 catalog tracker.  Every time, the root node tracker and meta node tracker are 
 started and a listener is registered.  But after the operations are performed 
 the listeners are not removed. Hence if the admin APIs are used repeatedly 
 this may lead to a memory leak.
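
The leak pattern, and one way to avoid it, can be sketched without HBase classes. `WatcherSketch`, `TrackerSketch` and `AdminSketch` are hypothetical stand-ins for the watcher, the node trackers and the admin API; the real fix may differ. The idea shown is simply to unregister the listener in a finally block once the operation is done.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical long-lived watcher that holds registered listeners.
final class WatcherSketch {
  final List<Object> listeners = new CopyOnWriteArrayList<Object>();
}

// Hypothetical tracker that registers itself on start and unregisters on stop.
final class TrackerSketch {
  private final WatcherSketch watcher;

  TrackerSketch(WatcherSketch watcher) {
    this.watcher = watcher;
  }

  void start() {
    watcher.listeners.add(this);
  }

  void stop() {
    watcher.listeners.remove(this);
  }
}

final class AdminSketch {
  // Each call starts a short-lived tracker and always stops it again.
  static boolean tableExists(WatcherSketch watcher) {
    TrackerSketch tracker = new TrackerSketch(watcher);
    tracker.start();
    try {
      return true; // the catalog lookup would go here
    } finally {
      tracker.stop(); // without this, one listener leaks per call
    }
  }

  public static void main(String[] args) {
    WatcherSketch watcher = new WatcherSketch();
    for (int i = 0; i < 1000; i++) {
      tableExists(watcher);
    }
    System.out.println(watcher.listeners.size()); // prints 0
  }
}
```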





[jira] [Resolved] (HBASE-4840) If I call split fast enough, while inserting, rows disappear.

2011-11-22 Thread ramkrishna.s.vasudevan (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan resolved HBASE-4840.
---

Resolution: Duplicate

Duplicate of HBASE-4841

 If I call split fast enough, while inserting, rows disappear. 
 --

 Key: HBASE-4840
 URL: https://issues.apache.org/jira/browse/HBASE-4840
 Project: HBase
  Issue Type: Bug
Reporter: Alex Newman

 I'll attach a unit test for this. Basically if you call split while 
 inserting data, you can get to the point where the cluster becomes 
 unstable, or rows disappear.





[jira] [Resolved] (HBASE-4585) Avoid seek operation when current kv is deleted

2011-10-18 Thread ramkrishna.s.vasudevan (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan resolved HBASE-4585.
---

  Resolution: Fixed
Hadoop Flags: Reviewed

 Avoid seek operation when current kv is deleted
 ---

 Key: HBASE-4585
 URL: https://issues.apache.org/jira/browse/HBASE-4585
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Fix For: 0.94.0

 Attachments: hbase-4585-89.patch, hbase-4585-apache-trunk.patch


 When the current kv is deleted during matching in the ScanQueryMatcher, 
 the matcher currently returns skip and continues to seek.
 Actually, if the current kv is deleted because of a family delete or a column 
 delete, the matcher should seek to the next column.
 If the current kv is deleted because of a version delete, the matcher should 
 just return skip.
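
The proposed mapping from the kind of delete to the matcher's action can be written as a small decision function. The enum and method names below are hypothetical and chosen for illustration; they are not taken from the ScanQueryMatcher source or from the attached patches.

```java
// Hypothetical classification of why (or whether) the current kv is deleted.
enum DeleteKind { NOT_DELETED, VERSION_DELETED, COLUMN_DELETED, FAMILY_DELETED }

// Hypothetical subset of the actions a matcher can return.
enum Action { INCLUDE, SKIP, SEEK_NEXT_COL }

final class DeleteMatchSketch {
  static Action actionFor(DeleteKind kind) {
    switch (kind) {
      case FAMILY_DELETED:
      case COLUMN_DELETED:
        // every remaining version of this column is covered by the delete
        return Action.SEEK_NEXT_COL;
      case VERSION_DELETED:
        // only this version is deleted; older versions may still be visible
        return Action.SKIP;
      default:
        // simplification: a real matcher applies further checks before including
        return Action.INCLUDE;
    }
  }

  public static void main(String[] args) {
    System.out.println(actionFor(DeleteKind.COLUMN_DELETED));  // SEEK_NEXT_COL
    System.out.println(actionFor(DeleteKind.VERSION_DELETED)); // SKIP
  }
}
```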





[jira] [Resolved] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

2011-10-13 Thread ramkrishna.s.vasudevan (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan resolved HBASE-4540.
---

   Resolution: Fixed
Fix Version/s: 0.90.5

Resolved both in 0.92 and 0.90.5.

 OpenedRegionHandler is not enforcing atomicity of the operation it is 
 performing
 

 Key: HBASE-4540
 URL: https://issues.apache.org/jira/browse/HBASE-4540
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.0, 0.90.5

 Attachments: HBASE-4540_1.patch, HBASE-4540_90.patch, 
 HBASE-4540_90_1.patch


 - OpenedRegionHandler has not yet deleted the znode of the region R1 opened 
 by RS1.
 - RS1 goes down.
 - ServerShutdownHandler assigns the region R1 to RS2.
 - The znode of R1 is moved to OFFLINE state by the master, or to OPENING 
 state by RS2 if RS2 has started opening the region.
 - Now the first OpenedRegionHandler tries to delete the znode, thinking it is 
 in OPENED state, but fails.
 - Though it fails, it removes the node from RIT and adds RS1 as the owner of 
 R1 in the master's memory.
 - Now when RS2 completes opening the region, the master is not able to open 
 the region, as the region has already been deleted from RIT.
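
The atomicity problem in the steps above is that the in-memory update happens even when the conditional znode delete fails. A minimal standalone sketch of the intended ordering follows, using in-memory maps in place of ZooKeeper; `OpenedHandlerSketch` and its fields are hypothetical, and this is not the attached patch.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class OpenedHandlerSketch {
  // region -> znode state; stands in for the unassigned znodes in ZooKeeper
  final ConcurrentMap<String, String> znodeState =
      new ConcurrentHashMap<String, String>();
  // region -> state while in transition; stands in for the master's RIT map
  final ConcurrentMap<String, String> regionsInTransition =
      new ConcurrentHashMap<String, String>();
  // region -> owning server in the master's memory
  final ConcurrentMap<String, String> owner =
      new ConcurrentHashMap<String, String>();

  // Updates in-memory state only if the znode was deleted in OPENED state.
  boolean handleOpened(String region, String server) {
    // remove(key, value) deletes only when the current state still matches
    if (!znodeState.remove(region, "OPENED")) {
      return false; // state changed underneath us, leave RIT and owner alone
    }
    regionsInTransition.remove(region);
    owner.put(region, server);
    return true;
  }

  public static void main(String[] args) {
    OpenedHandlerSketch h = new OpenedHandlerSketch();
    h.znodeState.put("R1", "OPENING");                // RS2 already started opening
    h.regionsInTransition.put("R1", "PENDING_OPEN");
    System.out.println(h.handleOpened("R1", "RS1"));  // prints false
    System.out.println(h.regionsInTransition.containsKey("R1")); // prints true
  }
}
```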
 {code}
 Master
 ==
 2011-10-05 20:49:45,301 INFO 
 org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Finished 
 processing of shutdown of linux146,60020,1317827727647
 2011-10-05 20:49:54,177 DEBUG org.apache.hadoop.hbase.master.HMaster: Not 
 running balancer because 1 region(s) in transition: 
 {3e69d628a8bd8e9b7c5e7a2a6e03aad9=t1,,1317827883842.3e69d628a8bd8e9b7c5e7a2a6e03aad9.
  state=PENDING_OPEN, ts=1317827985272, server=linux76,60020,1317827746847}
 2011-10-05 20:49:57,720 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=M_ZK_REGION_OFFLINE, server=linux76,6,1317827742012, 
 region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
 2011-10-05 20:50:14,501 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x132d3dc13090023 Deleting existing unassigned node for 
 3e69d628a8bd8e9b7c5e7a2a6e03aad9 that is in expected state RS_ZK_REGION_OPENED
 2011-10-05 20:50:14,505 WARN org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x132d3dc13090023 Attempting to delete unassigned node 
 3e69d628a8bd8e9b7c5e7a2a6e03aad9 in RS_ZK_REGION_OPENED state but node is in 
 RS_ZK_REGION_OPENING state
 After the region is opened in RS2
 =
 2011-10-05 20:50:48,066 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_OPENING, server=linux76,60020,1317827746847, 
 region=3e69d628a8bd8e9b7c5e7a2a6e03aad9, which is more than 15 seconds late
 2011-10-05 20:50:48,290 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Received OPENING for region 
 3e69d628a8bd8e9b7c5e7a2a6e03aad9 from server linux76,60020,1317827746847 but 
 region was in  the state null and not in expected PENDING_OPEN or OPENING 
 states
 2011-10-05 20:50:53,743 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_OPENING, server=linux76,60020,1317827746847, 
 region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
 2011-10-05 20:50:54,182 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor: 
 Scanned 1 catalog row(s) and gc'd 0 unreferenced parent region(s)
 2011-10-05 20:50:54,397 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Received OPENING for region 
 3e69d628a8bd8e9b7c5e7a2a6e03aad9 from server linux76,60020,1317827746847 but 
 region was in  the state null and not in expected PENDING_OPEN or OPENING 
 states
 {code}
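
The steps and logs above come down to one rule: the master's in-memory state should change only if the znode delete actually succeeded. The sketch below illustrates that rule with in-memory maps standing in for ZooKeeper and for the master's bookkeeping. It is not the attached patch; the class, field, and method names are all hypothetical, and `Map.remove(key, value)` is used here only to imitate a delete that is conditional on the expected state.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Sketch only: in-memory stand-ins, not the HBase or ZooKeeper API. */
final class OpenedRegionSketch {

    enum ZnodeState { OFFLINE, OPENING, OPENED }

    // Stand-in for the unassigned znodes: region -> current znode state.
    final Map<String, ZnodeState> znodes = new ConcurrentHashMap<>();
    // Stand-ins for the master's memory: RIT and region -> owning server.
    final Map<String, String> regionsInTransition = new ConcurrentHashMap<>();
    final Map<String, String> assignments = new ConcurrentHashMap<>();

    /** Deletes the znode only if it is still in the expected state. */
    boolean deleteIfInState(String region, ZnodeState expected) {
        return znodes.remove(region, expected);
    }

    /** Returns true only if the region was marked online on the server. */
    boolean handleOpened(String region, String server) {
        if (!deleteIfInState(region, ZnodeState.OPENED)) {
            // A newer assignment owns the znode now; leave RIT and the
            // assignment map untouched so that assignment can complete.
            return false;
        }
        regionsInTransition.remove(region);
        assignments.put(region, server);
        return true;
    }
}
```

With this shape, a stale handler for RS1 that finds the znode in OPENING returns without touching RIT, so the later OPENED event from RS2 still finds the region in transition.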





[jira] [Resolved] (HBASE-4539) OpenedRegionHandler racing with itself in ServerShutDownhandler flow leading to HMaster abort

2011-10-09 Thread ramkrishna.s.vasudevan (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan resolved HBASE-4539.
---

Resolution: Fixed

Fixed as part of HBASE-4540

 OpenedRegionHandler racing with itself in ServerShutDownhandler flow leading 
 to HMaster abort
 -

 Key: HBASE-4539
 URL: https://issues.apache.org/jira/browse/HBASE-4539
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan

 Steps to reproduce
 ==
 - Region R1 is being opened in RS1.
 - After transitioning the znode to OPENED, RS1 goes down.
 - Before the OpenedRegionHandler executor deletes the znode, 
 ServerShutdownHandler assigns the region to RS2. RS2 transitions the node to 
 OPENED, and the OpenedRegionHandler executor for that open deletes the znode.
 - When the first OpenedRegionHandler then tries to delete the znode, it throws 
 a NoNode exception, which causes the HMaster to abort.
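
One way to picture the failure in the last step is a delete that treats "node already gone" as an ordinary outcome instead of a fatal error. The sketch below is an assumption-laden illustration, not the fix applied in HBASE-4540: the store is an in-memory set, `NoNodeException` is a local stand-in for ZooKeeper's no-node error, and the path used in the usage note is made up.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Sketch only: an in-memory stand-in for znode deletion. */
final class NoNodeTolerantDelete {

    /** Local stand-in for ZooKeeper's "no such node" error. */
    static final class NoNodeException extends Exception {
        NoNodeException(String path) { super("No node: " + path); }
    }

    enum Outcome { DELETED, ALREADY_GONE }

    private final Set<String> znodes = ConcurrentHashMap.newKeySet();

    void create(String path) { znodes.add(path); }

    /** Raw delete: throws if the node does not exist. */
    void delete(String path) throws NoNodeException {
        if (!znodes.remove(path)) {
            throw new NoNodeException(path);
        }
    }

    /** Handler-level delete: a missing node is reported, not fatal. */
    Outcome deleteOpenedNode(String path) {
        try {
            delete(path);
            return Outcome.DELETED;
        } catch (NoNodeException e) {
            // Another handler already removed the node; the caller can log
            // this and move on instead of aborting the process.
            return Outcome.ALREADY_GONE;
        }
    }
}
```

Calling `deleteOpenedNode` twice on the same hypothetical path, for example `/unassigned/R1`, yields `DELETED` and then `ALREADY_GONE`, which mirrors two OpenedRegionHandler instances racing on one znode.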
