[jira] [Assigned] (HBASE-4492) TestRollingRestart fails intermittently

2011-09-28 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4492:
-

Assignee: ramkrishna.s.vasudevan  (was: Jonathan Gray)

 TestRollingRestart fails intermittently
 ---

 Key: HBASE-4492
 URL: https://issues.apache.org/jira/browse/HBASE-4492
 Project: HBase
  Issue Type: Test
Reporter: Ted Yu
Assignee: ramkrishna.s.vasudevan
 Attachments: 4492-v2.txt, 4492.txt, HBASE-4492.patch


 I got the following when running test suite on TRUNK:
 {code}
 testBasicRollingRestart(org.apache.hadoop.hbase.master.TestRollingRestart)  
 Time elapsed: 300.28 sec   ERROR!
 java.lang.Exception: test timed out after 300000 milliseconds
 at java.lang.Thread.sleep(Native Method)
 at 
 org.apache.hadoop.hbase.master.TestRollingRestart.waitForRSShutdownToStartAndFinish(TestRollingRestart.java:313)
 at 
 org.apache.hadoop.hbase.master.TestRollingRestart.testBasicRollingRestart(TestRollingRestart.java:210)
 {code}
 I ran TestRollingRestart#testBasicRollingRestart manually afterwards, which 
 wiped out the test output file for the failed test.
 A similar failure can be found on Jenkins:
 https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.92/19/testReport/junit/org.apache.hadoop.hbase.master/TestRollingRestart/testBasicRollingRestart/

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (HBASE-2794) ROWCOL bloom filter not used if multiple columns within same family are requested in a Get

2011-09-30 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-2794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-2794:
-

Assignee: Mikhail Bautin

 ROWCOL bloom filter not used if multiple columns within same family are 
 requested in a Get
 --

 Key: HBASE-2794
 URL: https://issues.apache.org/jira/browse/HBASE-2794
 Project: HBase
  Issue Type: Improvement
  Components: performance
Reporter: Kannan Muthukkaruppan
Assignee: Mikhail Bautin
 Fix For: 0.92.0


 Noticed the following snippet in StoreFile.java:Scanner:shouldSeek():
 {code}
 switch(bloomFilterType) {
   case ROW:
 key = row;
 break;
   case ROWCOL:
 if (columns.size() == 1) {
   byte[] col = columns.first();
   key = Bytes.add(row, col);
   break;
 }
 //$FALL-THROUGH$
   default:
 return true;
 }
 {code}
 If columns.size() > 1, we currently don't take advantage of the bloom 
 filter.  We should optimize this to check the bloom for each of the columns 
 and, if none of the columns are present in the bloom, avoid opening the file.
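
A minimal sketch of the proposed per-column check, not the actual StoreFile code. `RowColBloom` and `RowColBloomCheck` are hypothetical stand-ins introduced only for illustration; the key is built as row followed by column, mirroring `Bytes.add(row, col)` in the snippet above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a ROWCOL bloom filter; this is not an HBase class.
interface RowColBloom {
  boolean mightContain(byte[] rowColKey);
}

class RowColBloomCheck {
  // Concatenates row and column, the same shape of key the original snippet builds.
  static byte[] concat(byte[] row, byte[] col) {
    byte[] key = Arrays.copyOf(row, row.length + col.length);
    System.arraycopy(col, 0, key, row.length, col.length);
    return key;
  }

  // Returns true when the file has to be opened, false when the bloom
  // rules out every requested column for this row.
  static boolean shouldSeek(RowColBloom bloom, byte[] row, List<byte[]> columns) {
    if (columns.isEmpty()) {
      return true; // no explicit columns: the bloom cannot rule the file out
    }
    for (byte[] col : columns) {
      if (bloom.mightContain(concat(row, col))) {
        return true; // at least one column may be present
      }
    }
    return false; // none of the columns can be in this file
  }
}
```

The cost is one bloom lookup per requested column, so for very wide Gets it may be worth capping the number of lookups; that trade-off is not addressed here.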





[jira] [Assigned] (HBASE-4508) Backport HBASE-3777 to 0.90 branch

2011-10-12 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4508:
-

Assignee: Bright Fulton

 Backport HBASE-3777 to 0.90 branch
 --

 Key: HBASE-4508
 URL: https://issues.apache.org/jira/browse/HBASE-4508
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Bright Fulton
 Attachments: HBASE-4508.v1.patch, HBASE-4508.v2.patch, 
 HBASE-4508.v3.patch, HBASE-4508.v4.patch


 See discussion here: 
 http://search-hadoop.com/m/MJBId1aazTR1/backporting+HBASE-3777+to+0.90subj=backporting+HBASE+3777+to+0+90
 Rocketfuel has been running 0.90.3 with HBASE-3777 since its resolution.
 They have 10 RS nodes, 1 Master and 1 ZooKeeper node.
 The cluster serves live writes and reads but is very read-heavy. The cache hit rate is pretty high.
 The QPS at one of their data centers is 50K.





[jira] [Assigned] (HBASE-4550) When master passed regionserver different address , because regionserver didn't create new zookeeper znode, as a result stop-hbase.sh is hang

2011-10-12 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4550:
-

Assignee: wanbin

 When master passed regionserver different address , because regionserver 
 didn't create new zookeeper znode,  as  a result stop-hbase.sh is hang
 ---

 Key: HBASE-4550
 URL: https://issues.apache.org/jira/browse/HBASE-4550
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.3
Reporter: wanbin
Assignee: wanbin
 Fix For: 0.90.5

 Attachments: patch

   Original Estimate: 2h
  Remaining Estimate: 2h

 When the master passes the regionserver a different address, the regionserver 
 does not create a new ZooKeeper znode, but the master stores the new address in 
 ServerManager. When stop-hbase.sh is called, the path received by 
 RegionServerTracker.nodeDeleted is the old address, so 
 serverManager.expireServer is never called and stop-hbase.sh hangs.





[jira] [Assigned] (HBASE-4595) HFilePrettyPrinter Scanned kv count always 0

2011-10-15 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4595:
-

Assignee: Matteo Bertozzi

 HFilePrettyPrinter Scanned kv count always 0
 

 Key: HBASE-4595
 URL: https://issues.apache.org/jira/browse/HBASE-4595
 Project: HBase
  Issue Type: Bug
  Components: io
Affects Versions: 0.92.0, 0.94.0, 0.92.1
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
 Attachments: HBASE-4595.patch


 The count variable used to print the Scanned kv count is never 
 incremented.
 A local count variable in scanKeysValues() method is updated instead.
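
A hedged miniature of this kind of bug, not the HFilePrettyPrinter source. The class and method names below are illustrative only: a local variable with the same name as the field hides it, so the field that gets printed stays at 0.

```java
// Hypothetical miniature of the shadowing bug; names are illustrative only.
class ShadowedCounter {
  private int count; // the field that is printed at the end

  // Buggy variant: the local 'count' hides the field, so the field never changes.
  void scanBuggy(int[] kvs) {
    int count = 0; // shadows this.count
    for (int kv : kvs) {
      count++;
    }
  }

  // Fixed variant: no local declaration, so the field itself is incremented.
  void scanFixed(int[] kvs) {
    for (int kv : kvs) {
      count++;
    }
  }

  int getCount() {
    return count;
  }
}
```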





[jira] [Assigned] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss

2011-10-16 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4562:
-

Assignee: bluedavy

 When split doing offlineParentInMeta encounters error, it'll cause data loss
 

 Key: HBASE-4562
 URL: https://issues.apache.org/jira/browse/HBASE-4562
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.4
Reporter: bluedavy
Assignee: bluedavy
Priority: Blocker
 Fix For: 0.90.5

 Attachments: HBASE-4562-0.90.patch, HBASE-4562-0.92.patch, 
 HBASE-4562-trunk.patch, test-4562-0.90.txt, test-4562-0.92.txt, 
 test-4562-trunk.txt


 Follow the steps below to reproduce the problem:
 1. Change SplitTransaction.java as below to simulate a timeout error.
{code:title=SplitTransaction.java|borderStyle=solid}
   if (!testing) {
 MetaEditor.offlineParentInMeta(server.getCatalogTracker(),
this.parent.getRegionInfo(), a.getRegionInfo(), b.getRegionInfo());
  throw new IOException("some unexpected error in split");
   }
{code} 
 2. Update the regionserver code and restart it;
 3. create a table and put some data into the table;
 4. split the table;
 5. kill the regionserver hosting the table;
 6. wait some time after the master's ServerShutdownHandler.process executes, 
 then scan the table; you'll find that the data written before is lost.
 The attached patch fixes the bug.





[jira] [Assigned] (HBASE-4612) Allow ColumnPrefixFilter to support multiple prefixes

2011-10-18 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4612:
-

Assignee: Eran Kutner

 Allow ColumnPrefixFilter to support multiple prefixes
 -

 Key: HBASE-4612
 URL: https://issues.apache.org/jira/browse/HBASE-4612
 Project: HBase
  Issue Type: Improvement
  Components: filters
Affects Versions: 0.90.4
Reporter: Eran Kutner
Assignee: Eran Kutner
Priority: Minor
 Fix For: 0.94.0

 Attachments: HBASE-4612-0.90.patch


 When having a lot of columns grouped by name I've found that it would be very 
 useful to be able to scan them using multiple prefixes, allowing to fetch 
 specific groups in one scan, without fetching the entire row. This is 
 impossible to achieve using a FilterList, so I've added such support to the 
 existing ColumnPrefixFilter while keeping backward compatibility.
 The attached patch is based on 0.90.4, I noticed that the 0.92 branch has a 
 new method to support instantiating filters using Thrift. I'm not sure how 
 the serialization works there so I didn't implement that, but the rest of my 
 code should work in 0.92 as well.
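
A rough sketch of the matching rule described above, written without any HBase dependency. `MultiPrefixMatcher` is a hypothetical name and is not the code in the attached patch; it only shows that a column qualifier is kept when it starts with any one of the configured prefixes.

```java
import java.util.List;

// Hypothetical illustration of multi-prefix matching; not the attached patch.
class MultiPrefixMatcher {
  private final List<byte[]> prefixes;

  MultiPrefixMatcher(List<byte[]> prefixes) {
    this.prefixes = prefixes;
  }

  // True if 'qualifier' begins with all bytes of 'prefix'.
  static boolean startsWith(byte[] qualifier, byte[] prefix) {
    if (qualifier.length < prefix.length) {
      return false;
    }
    for (int i = 0; i < prefix.length; i++) {
      if (qualifier[i] != prefix[i]) {
        return false;
      }
    }
    return true;
  }

  // A column is included when it matches at least one prefix.
  boolean include(byte[] qualifier) {
    for (byte[] prefix : prefixes) {
      if (startsWith(qualifier, prefix)) {
        return true;
      }
    }
    return false;
  }
}
```

This linear scan is the simplest form; a real filter over sorted columns could keep the prefixes sorted and seek to the next candidate prefix instead of testing every column.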





[jira] [Assigned] (HBASE-4644) LoadIncrementalHFiles ignores additional configurations

2011-10-21 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4644:
-

Assignee: Alexey Zotov

 LoadIncrementalHFiles ignores additional configurations
 ---

 Key: HBASE-4644
 URL: https://issues.apache.org/jira/browse/HBASE-4644
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.3
 Environment: Centos 5.5, Cloudera cdh3u1 distribution.
Reporter: Alexey Zotov
Assignee: Alexey Zotov
Priority: Minor
  Labels: Configuration, LoadIncrementalHFiles

 The run method ignores the configuration that was passed in as a constructor argument:
 {code}
 LoadIncrementalHFiles hFilesMergeTask = new LoadIncrementalHFiles(conf);
 ToolRunner.run(hFilesMergeTask, args); 
 {code}
 This happens because the HTable creation (_new HTable(tableName);_ in the 
 LoadIncrementalHFiles.run() method) skips the existing configuration and tries to 
 create a new one for the HTable. If there is no hbase-site.xml on the classpath, 
 previously loaded properties (via the -conf configuration file) will be lost. 
 Quick fix:
 {code}
 --- LoadIncrementalHFiles.java2011-07-18 08:20:38.0 +0400
 +++ LoadIncrementalHFiles.java2011-10-19 18:08:31.228972054 +0400
 @@ -447,14 +446,20 @@
  if (!tableExists) this.createTable(tableName,dirPath);
  
  Path hfofDir = new Path(dirPath);
 -HTable table = new HTable(tableName);
 +HTable table;
 +Configuration configuration = getConf();
 +if (configuration != null) {
 +  table = new HTable(configuration, tableName);
 +} else {
 +  table = new HTable(tableName);
 +}
  
  doBulkLoad(hfofDir, table);
  return 0;
}
 {code}





[jira] [Assigned] (HBASE-4578) NPE when altering a table that has moving regions

2011-10-22 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4578:
-

Assignee: gaojinchao

 NPE when altering a table that has moving regions
 -

 Key: HBASE-4578
 URL: https://issues.apache.org/jira/browse/HBASE-4578
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Assignee: gaojinchao
Priority: Blocker
 Fix For: 0.92.0

 Attachments: HBASE-4578_trial_Trunk.patch


 I'm still not a 100% sure on the source of this error, but here's what I was 
 able to get twice while altering a table that was doing a bunch of splits:
 {quote}
 2011-10-11 23:48:59,344 INFO 
 org.apache.hadoop.hbase.master.handler.SplitRegionHandler: Handled SPLIT 
 report); 
 parent=TestTable,0002608338,1318376880454.a75d6815fdfc513fb1c8aabe086c6763. 
 daughter 
 a=TestTable,0002608338,1318376938764.ef170ff6cd8695dc8aec92e542dc9ac1.daughter
  b=TestTable,0003301408,1318376938764.36eb2530341bd46888ede312c5559b5d.
 2011-10-11 23:49:09,579 DEBUG 
 org.apache.hadoop.hbase.master.handler.TableEventHandler: Ignoring table not 
 disabled exception for supporting online schema changes.
 2011-10-11 23:49:09,580 INFO 
 org.apache.hadoop.hbase.master.handler.TableEventHandler: Handling table 
 operation C_M_MODIFY_TABLE on table TestTable
 2011-10-11 23:49:09,612 INFO org.apache.hadoop.hbase.util.FSUtils: 
 TableInfoPath = hdfs://sv4r11s38:9100/hbase/TestTable/.tableinfo tmpPath = 
 hdfs://sv4r11s38:9100/hbase/TestTable/.tmp/.tableinfo.1318376949612
 2011-10-11 23:49:09,692 INFO org.apache.hadoop.hbase.util.FSUtils: 
 TableDescriptor stored. TableInfoPath = 
 hdfs://sv4r11s38:9100/hbase/TestTable/.tableinfo
 2011-10-11 23:49:09,693 INFO org.apache.hadoop.hbase.util.FSUtils: Updated 
 tableinfo=hdfs://sv4r11s38:9100/hbase/TestTable/.tableinfo to blah
 2011-10-11 23:49:09,695 INFO 
 org.apache.hadoop.hbase.master.handler.TableEventHandler: Bucketing regions 
 by region server...
 2011-10-11 23:49:09,695 DEBUG org.apache.hadoop.hbase.client.MetaScanner: 
 Scanning .META. starting at row=TestTable,,00 for max=2147483647 
 rows
 2011-10-11 23:49:09,709 DEBUG 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: 
 The connection to hconnection-0x132f043bbde02e9 has been closed.
 2011-10-11 23:49:09,709 ERROR org.apache.hadoop.hbase.executor.EventHandler: 
 Caught throwable while processing event C_M_MODIFY_TABLE
 java.lang.NullPointerException
   at java.util.TreeMap.getEntry(TreeMap.java:324)
   at java.util.TreeMap.containsKey(TreeMap.java:209)
   at 
 org.apache.hadoop.hbase.master.handler.TableEventHandler.reOpenAllRegions(TableEventHandler.java:114)
   at 
 org.apache.hadoop.hbase.master.handler.TableEventHandler.process(TableEventHandler.java:90)
   at 
 org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:168)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 {quote}
 The first time the shell reported that all the regions were updated 
 correctly, the second time it got stuck for a while:
 {quote}
 6/14 regions updated.
 0/14 regions updated.
 ...
 0/14 regions updated.
 2/16 regions updated.
 ...
 2/16 regions updated.
 8/9 regions updated.
 ...
 8/9 regions updated.
 {quote}
 After which I killed it, redid the alter and it worked.
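
One way a trace ending in TreeMap.getEntry can arise: a TreeMap that uses natural ordering throws NullPointerException when asked about a null key. Whether a null key is really what reOpenAllRegions passed is an assumption, and `NullKeyLookup` below is a hypothetical helper, not HBase code.

```java
import java.util.TreeMap;

// Hypothetical helper; illustrates guarding a TreeMap lookup against a null key.
class NullKeyLookup {
  // Returns false for a null key instead of letting TreeMap.containsKey throw.
  static <V> boolean safeContainsKey(TreeMap<String, V> map, String key) {
    return key != null && map.containsKey(key);
  }
}
```

A guard like this only hides the symptom; if the key is null because a region has no known location while it is moving, the caller still has to decide what to do with that region.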





[jira] [Assigned] (HBASE-4669) Add an option of using round-robin assignment for enabling table

2011-10-27 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4669:
-

Assignee: Jieshan Bean

 Add an option of using round-robin assignment for enabling table
 

 Key: HBASE-4669
 URL: https://issues.apache.org/jira/browse/HBASE-4669
 Project: HBase
  Issue Type: Improvement
  Components: master
Affects Versions: 0.90.4, 0.94.0
Reporter: Jieshan Bean
Assignee: Jieshan Bean
Priority: Minor
 Fix For: 0.94.0

 Attachments: HBASE-4669-90-V2.patch, HBASE-4669-90.patch, 
 HBASE-4669-Trunk-V2.patch, HBASE-4669-Trunk.patch


 In some scenarios we disable and then re-enable a table. Currently, 
 enabling a table uses random assignment. We would like the regions to 
 be evenly distributed, no matter how many regions and how many 
 regionservers there are.
 So I suggest adding an option to use round-robin assignment on 
 enable-table. 
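
A small sketch of what round-robin assignment means here, using plain strings for regions and servers. `RoundRobinAssignment` is a hypothetical name and this is not the code in the attached patches; it only shows why the per-server counts differ by at most one.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; regions and servers are plain strings here.
class RoundRobinAssignment {
  // Deals regions out to servers in turn, like dealing cards.
  static Map<String, List<String>> assign(List<String> regions, List<String> servers) {
    if (servers.isEmpty()) {
      throw new IllegalArgumentException("no servers to assign to");
    }
    Map<String, List<String>> plan = new LinkedHashMap<>();
    for (String server : servers) {
      plan.put(server, new ArrayList<>());
    }
    for (int i = 0; i < regions.size(); i++) {
      String server = servers.get(i % servers.size());
      plan.get(server).add(regions.get(i));
    }
    return plan;
  }
}
```

With random assignment the counts are only balanced on average, which is most visible when the number of regions is small relative to the number of servers.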





[jira] [Assigned] (HBASE-4641) Block cache can be mistakenly instantiated on Master

2011-10-28 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4641:
-

Assignee: Ted Yu  (was: Jonathan Gray)

 Block cache can be mistakenly instantiated on Master
 

 Key: HBASE-4641
 URL: https://issues.apache.org/jira/browse/HBASE-4641
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0, 0.94.0
Reporter: Jonathan Gray
Assignee: Ted Yu
Priority: Critical
 Fix For: 0.92.0

 Attachments: 4641-suggestion-v3.txt, 4641-v4.txt, 
 HBASE-4641-v1.patch, HBASE-4641-v2.patch


 After changes in the block cache instantiation over in HBASE-4422, it looks 
 like the HMaster can now end up with a block cache instantiated.  Not a huge 
 deal but prevents the process from shutting down properly.





[jira] [Assigned] (HBASE-4702) Allow override of scan cache value for rowcounter

2011-10-29 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4702:
-

Assignee: Ted Yu

 Allow override of scan cache value for rowcounter
 -

 Key: HBASE-4702
 URL: https://issues.apache.org/jira/browse/HBASE-4702
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.2
 Environment: Operating System: Linux
Reporter: Rita M
Assignee: Ted Yu

 Doing a row count for a large table via MapReduce may take a long time.
 I tried to change the default scan cache size, but there is no knob to tune it.
 See here for more details, 
 http://search-hadoop.com/m/ECEs6237AIXsubj=Re+speeding+up+rowcount





[jira] [Assigned] (HBASE-4690) Intermittent TestRegionServerCoprocessorExceptionWithAbort#testExceptionFromCoprocessorDuringPut failure

2011-10-29 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4690:
-

Assignee: Ted Yu  (was: Eugene Koontz)

 Intermittent 
 TestRegionServerCoprocessorExceptionWithAbort#testExceptionFromCoprocessorDuringPut
  failure
 

 Key: HBASE-4690
 URL: https://issues.apache.org/jira/browse/HBASE-4690
 Project: HBase
  Issue Type: Test
Affects Versions: 0.92.0
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.92.0

 Attachments: 4690-trunk.txt


 See 
 https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.92/83/testReport/junit/org.apache.hadoop.hbase.coprocessor/TestRegionServerCoprocessorExceptionWithAbort/testExceptionFromCoprocessorDuringPut/
 Somehow getRSForFirstRegionInTable() wasn't able to retrieve the region 
 server.
 One fix for this issue is to spin up MiniCluster with 1 region server so that 
 we don't need to search for the region server where first region is hosted.





[jira] [Assigned] (HBASE-4741) Online schema change doesn't return errors

2011-11-03 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4741:
-

Assignee: Ted Yu

 Online schema change doesn't return errors
 --

 Key: HBASE-4741
 URL: https://issues.apache.org/jira/browse/HBASE-4741
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Assignee: Ted Yu
Priority: Critical
 Fix For: 0.92.0


 Still after the fun I had over in HBASE-4729, I tried to finish altering my 
 table (remove a family) since only half of it was changed so I did this:
 {quote}
 hbase(main):002:0> alter 'TestTable', NAME => 'allo', METHOD => 'delete' 
 Updating all regions with the new schema...
 244/244 regions updated.
 Done.
 0 row(s) in 1.2480 seconds
 {quote}
 Nice it all looks good, but over in the master log:
 {quote}
 org.apache.hadoop.hbase.InvalidFamilyOperationException: Family 'allo' does 
 not exist so cannot be deleted
 at 
 org.apache.hadoop.hbase.master.handler.TableDeleteFamilyHandler.handleTableOperation(TableDeleteFamilyHandler.java:56)
 at 
 org.apache.hadoop.hbase.master.handler.TableEventHandler.process(TableEventHandler.java:86)
 at 
 org.apache.hadoop.hbase.master.HMaster.deleteColumn(HMaster.java:1011)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:348)
 at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1242)
 {quote}
 Maybe we should do checks before launching the async task.
 Marking critical as this is a regression.
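
A hedged sketch of the "check before launching the async task" idea, with no HBase types involved. `SchemaChangeSketch` and its parameters are hypothetical: validation runs on the caller's thread so the error reaches the client, and only a request that passed validation is handed to the executor.

```java
import java.util.Set;
import java.util.concurrent.Executor;

// Hypothetical illustration; not the HMaster or TableEventHandler code.
class SchemaChangeSketch {
  // Validates synchronously, then schedules the actual change.
  static void deleteFamily(Set<String> families, String family, Executor executor) {
    if (!families.contains(family)) {
      // Thrown on the caller's thread, so the client sees the failure.
      throw new IllegalArgumentException(
          "Family '" + family + "' does not exist so cannot be deleted");
    }
    executor.execute(() -> families.remove(family));
  }
}
```

The set passed in should be safe for concurrent use if the executor runs on another thread. The check can also go stale between validation and execution, so the async task still needs its own error handling.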





[jira] [Assigned] (HBASE-4741) Online schema change doesn't return errors

2011-11-04 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4741:
-

Assignee: (was: Ted Yu)

I may not have time to work on this in the next week.

 Online schema change doesn't return errors
 --

 Key: HBASE-4741
 URL: https://issues.apache.org/jira/browse/HBASE-4741
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Priority: Critical
 Fix For: 0.92.0

 Attachments: 4741-v2.txt, 4741-v3.txt, 4741-v4.txt, 4741.txt


 Still after the fun I had over in HBASE-4729, I tried to finish altering my 
 table (remove a family) since only half of it was changed so I did this:
 {quote}
 hbase(main):002:0> alter 'TestTable', NAME => 'allo', METHOD => 'delete' 
 Updating all regions with the new schema...
 244/244 regions updated.
 Done.
 0 row(s) in 1.2480 seconds
 {quote}
 Nice it all looks good, but over in the master log:
 {quote}
 org.apache.hadoop.hbase.InvalidFamilyOperationException: Family 'allo' does 
 not exist so cannot be deleted
 at 
 org.apache.hadoop.hbase.master.handler.TableDeleteFamilyHandler.handleTableOperation(TableDeleteFamilyHandler.java:56)
 at 
 org.apache.hadoop.hbase.master.handler.TableEventHandler.process(TableEventHandler.java:86)
 at 
 org.apache.hadoop.hbase.master.HMaster.deleteColumn(HMaster.java:1011)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:348)
 at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1242)
 {quote}
 Maybe we should do checks before launching the async task.
 Marking critical as this is a regression.





[jira] [Assigned] (HBASE-4751) Make TestAdmin#testEnableTableRoundRobinAssignment friendly to concurrent tests

2011-11-04 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4751:
-

Assignee: Jieshan Bean

 Make TestAdmin#testEnableTableRoundRobinAssignment friendly to concurrent 
 tests
 ---

 Key: HBASE-4751
 URL: https://issues.apache.org/jira/browse/HBASE-4751
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu
Assignee: Jieshan Bean

 From 
 https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2410/artifact/trunk/target/surefire-reports/org.apache.hadoop.hbase.client.TestAdmin.txt
  :
 {code}
 testEnableTableRoundRobinAssignment(org.apache.hadoop.hbase.client.TestAdmin) 
  Time elapsed: 4.345 sec   ERROR!
 java.lang.IllegalArgumentException: Check the value configured in 
 'zookeeper.znode.parent'. There could be a mismatch with the one configured 
 in the master.
   at 
 org.apache.hadoop.hbase.zookeeper.RootRegionTracker.waitRootRegionLocation(RootRegionTracker.java:81)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:753)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:733)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:866)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:765)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:733)
   at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:202)
   at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:157)
   at 
 org.apache.hadoop.hbase.client.TestAdmin.testEnableTableRoundRobinAssignment(TestAdmin.java:604)
 {code}
 This was due to:
 {code}
 HTable metaTable = new HTable(HConstants.META_TABLE_NAME);
 {code}
 A few lines above, we have the correct usage:
 {code}
 HTable ht = new HTable(TEST_UTIL.getConfiguration(), tableName);
 {code}





[jira] [Assigned] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache

2011-11-06 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4752:
-

Assignee: Ted Yu  (was: Benoit Sigoure)

 Don't create an unnecessary LinkedList when evicting from the BlockCache
 

 Key: HBASE-4752
 URL: https://issues.apache.org/jira/browse/HBASE-4752
 Project: HBase
  Issue Type: Improvement
  Components: performance, regionserver
Affects Versions: 0.90.4
Reporter: Benoit Sigoure
Assignee: Ted Yu
Priority: Minor
 Attachments: 
 0001-HBASE-4752-Don-t-create-an-unnecessary-LinkedList-wh.patch, 
 4752-trunk.txt


 When evicting from the BlockCache, the code creates a LinkedList containing 
 every single block sorted by access time.  This list is created from a 
 PriorityQueue.  I don't believe it is necessary, as the PriorityQueue can be 
 used directly.
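
A minimal sketch of using the queue directly, assuming a simplified `Block` type with an access time and a size. `EvictionSketch` is a hypothetical name and this is not the BlockCache code or the attached patch; it polls the least recently accessed block until enough bytes are freed, without first copying the queue into a sorted list.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Hypothetical illustration; Block is a simplified stand-in for a cached block.
class EvictionSketch {
  static final class Block {
    final String name;
    final long accessTime;
    final long size;

    Block(String name, long accessTime, long size) {
      this.name = name;
      this.accessTime = accessTime;
      this.size = size;
    }
  }

  // Queue whose head is the block with the oldest access time.
  static PriorityQueue<Block> byAccessTime() {
    return new PriorityQueue<>(Comparator.comparingLong((Block b) -> b.accessTime));
  }

  // Polls blocks in access-time order until bytesToFree is reached; returns bytes freed.
  static long evict(PriorityQueue<Block> queue, long bytesToFree) {
    long freed = 0;
    while (freed < bytesToFree && !queue.isEmpty()) {
      Block evicted = queue.poll();
      freed += evicted.size;
    }
    return freed;
  }
}
```

Only the blocks that are actually evicted are removed from the heap, so the work is proportional to the number of evictions rather than to the size of the queue.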





[jira] [Assigned] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel

2011-11-06 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4746:
-

Assignee: Mikhail Bautin

 Use a random ZK client port in unit tests so we can run them in parallel
 

 Key: HBASE-4746
 URL: https://issues.apache.org/jira/browse/HBASE-4746
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: 4746-trunk-v2.txt, D255.1.patch, D279.1.patch, 
 D279.2.patch


 The hard-coded ZK client port has long been a problem for running HBase test 
 suite in parallel. The mini ZK cluster should run on a random free port, and 
 that port should be passed to all parts of the unit tests that need to talk 
 to the mini cluster. In fact, randomizing the port exposes a lot of places in 
 the code where a new configuration is instantiated, and as a result the 
 client tries to talk to the default ZK client port and times out.
 The initial fix is for 0.89-fb, where it already allows the unit tests to be 
 run in parallel in 10 minutes. A fix for the trunk will follow.
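
A small sketch of one common way to obtain a random free port in Java: bind a ServerSocket to port 0 and read back the port the OS picked. `FreePort` is a hypothetical helper and is not taken from the attached patches.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

// Hypothetical helper; asks the OS for any currently free TCP port.
class FreePort {
  static int findFreePort() {
    // Port 0 tells the OS to choose a free ephemeral port.
    try (ServerSocket socket = new ServerSocket(0)) {
      return socket.getLocalPort();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

The socket is closed before the caller uses the port, so another process could grab it in between. Tests that run in parallel should be ready to retry, and the chosen port still has to be written into the configuration that every client in the test uses.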





[jira] [Assigned] (HBASE-4478) Improve AssignmentManager.handleRegion so that it can process certain ZK state in the case of RS offline

2011-11-10 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4478:
-

Assignee: ramkrishna.s.vasudevan  (was: Ming Ma)

 Improve AssignmentManager.handleRegion so that it can process certain ZK 
 state in the case of RS offline
 

 Key: HBASE-4478
 URL: https://issues.apache.org/jira/browse/HBASE-4478
 Project: HBase
  Issue Type: Bug
Reporter: Ming Ma
Assignee: ramkrishna.s.vasudevan

 Currently AssignmentManager.handleRegion skips processing of a ZK event 
 change if the RS is offline. It relies on TimeoutMonitor and 
 ServerShutdownHandler to process RIT.
   // Verify this is a known server
   if (!serverManager.isServerOnline(sn) &&
       !this.master.getServerName().equals(sn)) {
     LOG.warn("Attempted to handle region transition for server but " +
       "server is not online: " + Bytes.toString(data.getRegionName()));
     return;
   }
 For certain states like OPENED, OPENING, FAILED_OPEN, CLOSED, it can continue 
 processing even if the RS is offline.





[jira] [Assigned] (HBASE-4799) Catalog Janitor logic bug causes region leackage

2011-11-16 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4799:
-

Assignee: Max Lapan

 Catalog Janitor logic bug causes region leackage
 

 Key: HBASE-4799
 URL: https://issues.apache.org/jira/browse/HBASE-4799
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.4
Reporter: Max Lapan
Assignee: Max Lapan
Priority: Critical
 Attachments: 0001-Fix-of-Regions-Leaks-problem-in-janitor.patch, 
 0002-Temporary-fix-to-remove-leaked-regions.patch


 When a region split takes a significant amount of time, CatalogJanitor can 
 clean up one of the SPLIT records but leave the other in META. When the other 
 split finishes, the janitor cleans up the remaining SPLIT record, but the 
 parent region is not removed from the FS and META is not cleared.
 The race condition is as follows:
 1. A region split starts.
 2. One of the daughter regions, say A, finishes splitting (it has no 
 reference storefiles), but the other (B) has not.
 3. The janitor starts and, in the checkDaughter routine, removes SPLITA from 
 META, but sees that SPLITB still has references and does nothing.
 4. Region B completes its split.
 5. The janitor wakes up and removes SPLITB, but sees that there is no record 
 for A and again does nothing.
 Result: the parent region hangs around forever.
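
The steps above can be reproduced in a toy model. The class below is only a sketch of the described behaviour under simplifying assumptions: `JanitorRaceModel`, its fields, and `buggyJanitorPass` are hypothetical names, and none of this is the real CatalogJanitor code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Toy model of the race; all names are hypothetical, not HBase's real classes. */
final class JanitorRaceModel {
  // SPLIT records still present in the parent's META row.
  final Set<String> splitRecords = new HashSet<>(Arrays.asList("SPLITA", "SPLITB"));
  // Daughters that still hold reference storefiles pointing at the parent.
  final Set<String> daughtersWithReferences = new HashSet<>(Arrays.asList("A", "B"));
  boolean parentDeleted = false;

  /** One janitor run: checks each daughter on its own, then decides about the parent. */
  void buggyJanitorPass() {
    boolean aDone = checkDaughter("A");
    boolean bDone = checkDaughter("B");
    if (aDone && bDone) {
      parentDeleted = true;
    }
  }

  private boolean checkDaughter(String daughter) {
    String record = "SPLIT" + daughter;
    if (!splitRecords.contains(record)) {
      return false; // no record for this daughter: the pass "does nothing"
    }
    if (daughtersWithReferences.contains(daughter)) {
      return false; // daughter still references the parent
    }
    // Side effect: the record is dropped even though the other daughter is not done,
    // so a later pass can no longer tell that this daughter had finished.
    splitRecords.remove(record);
    return true;
  }
}
```

Running one pass after A finishes and another after B finishes leaves both SPLIT records removed while `parentDeleted` stays false, which matches the leak described in the report.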





[jira] [Assigned] (HBASE-4839) Re-enable TestInstantSchemaChangeFailover#testInstantSchemaOperationsInZKForMasterFailover

2011-11-21 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4839:
-

Assignee: Subbu M Iyer

 Re-enable 
 TestInstantSchemaChangeFailover#testInstantSchemaOperationsInZKForMasterFailover
 --

 Key: HBASE-4839
 URL: https://issues.apache.org/jira/browse/HBASE-4839
 Project: HBase
  Issue Type: Test
Reporter: Ted Yu
Assignee: Subbu M Iyer

 TestInstantSchemaChangeFailover#testInstantSchemaOperationsInZKForMasterFailover
  was disabled for instant schema change (HBASE-4213) after it failed on 
 Jenkins.
 We should enable it and make it pass on Jenkins and dev environments.





[jira] [Assigned] (HBASE-4856) Upgrade zookeeper to 3.4.0 release

2011-11-23 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4856:
-

Assignee: Ted Yu

 Upgrade zookeeper to 3.4.0 release
 --

 Key: HBASE-4856
 URL: https://issues.apache.org/jira/browse/HBASE-4856
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu
Assignee: Ted Yu

 Zookeeper 3.4.0 has been released.
 We should upgrade.





[jira] [Assigned] (HBASE-4857) Recursive loop on KeeperException in AuthenticationTokenSecretManager/ZKLeaderManager

2011-11-23 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4857:
-

Assignee: Gary Helmling

 Recursive loop on KeeperException in 
 AuthenticationTokenSecretManager/ZKLeaderManager
 -

 Key: HBASE-4857
 URL: https://issues.apache.org/jira/browse/HBASE-4857
 Project: HBase
  Issue Type: Bug
  Components: security
Affects Versions: 0.92.0, 0.94.0
Reporter: Gary Helmling
Assignee: Gary Helmling
 Fix For: 0.92.0

 Attachments: HBASE-4857.patch


 Looking through stack traces for {{TestMasterFailover}}, I see a case where 
 the leader {{AuthenticationTokenSecretManager}} can get into a recursive loop 
 when a {{KeeperException}} is encountered:
 {noformat}
 Thread-1-EventThread daemon prio=10 tid=0x7f9fb47b2800 nid=0x77f6 
 waiting on condition [0x7f9fab376000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
 at java.lang.Thread.sleep(Native Method)
 at java.lang.Thread.sleep(Thread.java:302)
 at java.util.concurrent.TimeUnit.sleep(TimeUnit.java:328)
 at 
 org.apache.hadoop.hbase.util.RetryCounter.sleepUntilNextRetry(RetryCounter.java:55)
 at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:206)
 at 
 org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndFailSilent(ZKUtil.java:891)
 at 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.createBaseZNodes(ZooKeeperWatcher.java:161)
 at 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.init(ZooKeeperWatcher.java:154)
 at 
 org.apache.hadoop.hbase.master.HMaster.tryRecoveringExpiredZKSession(HMaster.java:1397)
 at org.apache.hadoop.hbase.master.HMaster.abortNow(HMaster.java:1435)
 at org.apache.hadoop.hbase.master.HMaster.abort(HMaster.java:1374)
 at 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.abort(ZooKeeperWatcher.java:450)
 at 
 org.apache.hadoop.hbase.zookeeper.ZKLeaderManager.stepDownAsLeader(ZKLeaderManager.java:166)
 at 
 org.apache.hadoop.hbase.security.token.AuthenticationTokenSecretManager$LeaderElector.stop(AuthenticationTokenSecretManager.java:293)
 at 
 org.apache.hadoop.hbase.zookeeper.ZKLeaderManager.stepDownAsLeader(ZKLeaderManager.java:167)
 at 
 org.apache.hadoop.hbase.security.token.AuthenticationTokenSecretManager$LeaderElector.stop(AuthenticationTokenSecretManager.java:293)
 at 
 org.apache.hadoop.hbase.zookeeper.ZKLeaderManager.stepDownAsLeader(ZKLeaderManager.java:167)
 at 
 org.apache.hadoop.hbase.security.token.AuthenticationTokenSecretManager$LeaderElector.stop(AuthenticationTokenSecretManager.java:293)
 at 
 org.apache.hadoop.hbase.zookeeper.ZKLeaderManager.handleLeaderChange(ZKLeaderManager.java:96)
 at 
 org.apache.hadoop.hbase.zookeeper.ZKLeaderManager.nodeDeleted(ZKLeaderManager.java:78)
 at 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:286)
 at 
 org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:521)
 at 
 org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:497)
 {noformat}
 The {{KeeperException}} causes {{ZKLeaderManager}} to call 
 {{AuthenticationTokenSecretManager$LeaderElector.stop()}}, which calls 
 {{ZKLeaderManager.stepDownAsLeader()}}, which will encounter another 
 {{KeeperException}}, and so on...
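
One common way to break this kind of re-entrant loop is to make the stop path idempotent. The sketch below is not the attached patch; `GuardedElector` is a hypothetical stand-in, and the simulated failure merely plays the role of the {{KeeperException}}.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for a leader elector; not HBase's real class. */
final class GuardedElector {
  private final AtomicBoolean stopping = new AtomicBoolean(false);
  int stepDownAttempts = 0;

  void stop() {
    // Only the first caller proceeds; re-entrant calls return immediately.
    if (!stopping.compareAndSet(false, true)) {
      return;
    }
    stepDownAsLeader();
  }

  private void stepDownAsLeader() {
    stepDownAttempts++;
    try {
      // Stands in for the ZooKeeper call that fails while stepping down.
      throw new IllegalStateException("simulated ZooKeeper failure");
    } catch (IllegalStateException e) {
      // The error path calls back into stop(), as in the stack trace above.
      stop();
    }
  }
}
```

With the guard in place, a single call to `stop()` attempts to step down once and the nested call returns without recursing further.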





[jira] [Assigned] (HBASE-4864) testRegionTransitionOperations occasional failures

2011-11-24 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4864:
-

Assignee: gaojinchao

 testRegionTransitionOperations occasional failures
 --

 Key: HBASE-4864
 URL: https://issues.apache.org/jira/browse/HBASE-4864
 Project: HBase
  Issue Type: Bug
  Components: test
Reporter: gaojinchao
Assignee: gaojinchao
Priority: Minor
 Fix For: 0.92.0, 0.94.0

 Attachments: HBASE-4864_Branch92.patch


 See these logs:
 https://builds.apache.org/job/HBase-TRUNK-security/ws/trunk/target/surefire-reports/
 It seems that we should wait until the region is added to the online region 
 set.
 I made a patch; please review.





[jira] [Assigned] (HBASE-4862) Split hlog and open region concurrently happend may cause data loss

2011-11-24 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4862:
-

Assignee: chunhui shen

 Split hlog and open region concurrently happend may cause data loss
 ---

 Key: HBASE-4862
 URL: https://issues.apache.org/jira/browse/HBASE-4862
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.2
Reporter: chunhui shen
Assignee: chunhui shen
 Fix For: 0.92.0, 0.94.0, 0.90.5

 Attachments: 4862.patch


 Case description:
 1. The split hlog thread creates a writer for the file 
 region A/recovered.edits/123456 and is appending log entries.
 2. The regionserver is now opening region A, and in 
 replayRecoveredEditsIfAny() it deletes the file 
 region A/recovered.edits/123456.
 3. The split hlog thread catches the IO exception and stops parsing this log 
 file. If skipError = true, it adds the file to the corrupt logs. However, 
 data for other regions in this log file will be lost.
 4. If skipError = false, it checks the filesystem. The filesystem is of 
 course fine, so it only prints an error log and continues assigning regions. 
 Therefore, data in other log files will also be lost!
 The case may happen as follows:
 1. Move a region from server A to server B.
 2. Kill server A and server B.
 3. Restart server A and server B.
 We could prevent this exception by forbidding deletion of a recovered.edits 
 file that the split hlog thread is still appending to.





[jira] [Assigned] (HBASE-4773) HBaseAdmin may leak ZooKeeper connections

2011-11-28 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4773:
-

Assignee: xufeng

 HBaseAdmin may leak ZooKeeper connections
 -

 Key: HBASE-4773
 URL: https://issues.apache.org/jira/browse/HBASE-4773
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.4
Reporter: gaojinchao
Assignee: xufeng
Priority: Critical
 Fix For: 0.90.5

 Attachments: 4773.patch, branches_4773.patch, trunk_4773_patch.patch


 When the master crashes, HBaseAdmin leaks ZooKeeper connections.
 I think we should close the ZK connection when MasterNotRunningException is 
 thrown:
   public HBaseAdmin(Configuration c)
       throws MasterNotRunningException, ZooKeeperConnectionException {
     this.conf = HBaseConfiguration.create(c);
     this.connection = HConnectionManager.getConnection(this.conf);
     this.pause = this.conf.getLong("hbase.client.pause", 1000);
     this.numRetries = this.conf.getInt("hbase.client.retries.number", 10);
     this.retryLongerMultiplier =
         this.conf.getInt("hbase.client.retries.longer.multiplier", 10);
     // We should add this code to close the ZK connection
     try {
       this.connection.getMaster();
     } catch (MasterNotRunningException e) {
       HConnectionManager.deleteConnection(conf, false);
       throw e;
     }
   }





[jira] [Assigned] (HBASE-4885) Building against Hadoop 0.23 uses out-of-date MapReduce artifacts

2011-11-28 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4885:
-

Assignee: Tom White

 Building against Hadoop 0.23 uses out-of-date MapReduce artifacts
 -

 Key: HBASE-4885
 URL: https://issues.apache.org/jira/browse/HBASE-4885
 Project: HBase
  Issue Type: Bug
  Components: build
Reporter: Tom White
Assignee: Tom White
 Fix For: 0.94.0

 Attachments: HBASE-4885.patch


 The hadoop-mapred artifacts have been replaced by hadoop-mapreduce-* 
 artifacts in 0.23 onwards.





[jira] [Assigned] (HBASE-4936) Cached HRegionInterface connections crash when getting UnknownHost exceptions

2011-12-03 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4936:
-

Assignee: Andrei Dragomir

 Cached HRegionInterface connections crash when getting UnknownHost exceptions
 -

 Key: HBASE-4936
 URL: https://issues.apache.org/jira/browse/HBASE-4936
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.92.0
Reporter: Andrei Dragomir
Assignee: Andrei Dragomir
 Attachments: HBASE-4936.patch


 This issue is unlikely to come up in a cluster test case. However, during 
 development, the following happens: 
 1. Start the HBase cluster locally, on network A (DNS A, etc.)
 2. The region locations are cached using the hostname 
 (mycomputer.company.com, 211.x.y.z - real IP)
 3. Change network location (go home)
 4. Start the HBase cluster locally. My hostname / IPs are now different 
 (mycomputer, 192.168.0.130 - new IP)
 If the region locations have been cached using the hostname, an 
 UnknownHostException is thrown in 
 CatalogTracker.getCachedConnection(ServerName sn) and is not handled by any 
 of the catch clauses. The server will crash constantly. 
 The error should be caught and not rethrown, so that the cached connection 
 expires normally. 





[jira] [Assigned] (HBASE-4946) HTable.coprocessorExec (and possibly coprocessorProxy) does not work with dynamically loaded coprocessors (from hdfs or local system), because the RPC system tries to de

2011-12-05 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4946:
-

Assignee: Andrei Dragomir

 HTable.coprocessorExec (and possibly coprocessorProxy) does not work with 
 dynamically loaded coprocessors (from hdfs or local system), because the RPC 
 system tries to deserialize an unknown class. 
 -

 Key: HBASE-4946
 URL: https://issues.apache.org/jira/browse/HBASE-4946
 Project: HBase
  Issue Type: Bug
  Components: coprocessors
Affects Versions: 0.92.0
Reporter: Andrei Dragomir
Assignee: Andrei Dragomir
 Attachments: HBASE-4946-v2.patch, HBASE-4946.patch


 Loading coprocessor jars from hdfs works fine. I load the jar from the shell, 
 after setting the attribute, and it gets loaded:
 {noformat}
 INFO org.apache.hadoop.hbase.regionserver.HRegion: Setting up tabledescriptor 
 config now ...
 INFO org.apache.hadoop.hbase.coprocessor.CoprocessorHost: Class 
 com.MyCoprocessorClass needs to be loaded from a file - 
 hdfs://localhost:9000/coproc/rt-  0.0.1-SNAPSHOT.jar.
 INFO org.apache.hadoop.hbase.coprocessor.CoprocessorHost: loadInstance: 
 com.MyCoprocessorClass
 INFO org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost: 
 RegionEnvironment createEnvironment
 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Registered protocol 
 handler: region=t1,,1322572939753.6409aee1726d31f5e5671a59fe6e384f. 
 protocol=com.MyCoprocessorClassProtocol
 INFO org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost: Load 
 coprocessor com.MyCoprocessorClass from HTD of t1 successfully.
 {noformat}
 The problem is that this coprocessor simply extends BaseEndpointCoprocessor, 
 with a dynamic method. When calling this method from the client with 
 HTable.coprocessorExec, I get errors on the HRegionServer, because the call 
 cannot be deserialized from writables. 
 The cause is that Exec tries to resolve the coprocessor class early. The 
 coprocessor class is loaded, but only in the context of the 
 HRegionServer / HRegion, so the call fails:
 {noformat}
 2011-12-02 00:34:17,348 ERROR org.apache.hadoop.hbase.io.HbaseObjectWritable: 
 Error in readFields
 java.io.IOException: Protocol class com.MyCoprocessorClassProtocol not found
   at org.apache.hadoop.hbase.client.coprocessor.Exec.readFields(Exec.java:125)
   at 
 org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:575)
   at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:105)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1237)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1167)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:703)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:495)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:470)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:680)
 Caused by: java.lang.ClassNotFoundException: com.MyCoprocessorClassProtocol
   at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
   at java.security.AccessController.doPrivileged(Native Method)
   at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
   at java.lang.Class.forName0(Native Method)
   at java.lang.Class.forName(Class.java:247)
   at 
 org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:943)
   at org.apache.hadoop.hbase.client.coprocessor.Exec.readFields(Exec.java:122)
   ... 10 more
 {noformat}
 Probably the correct way to fix this is to make Exec really smart, so that it 
 knows all the class definitions loaded in CoprocessorHost(s).
 I created a small patch that simply doesn't resolve the class definition in 
 the Exec, instead passing it as string down to the HRegion layer. This layer 
 knows all the definitions, and simply loads it by name. 
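
The idea of carrying the class name as a string and resolving it later can be sketched as follows. This is not the attached patch: `LazyProtocolRef` is a hypothetical name, and the class loader passed to `resolve` merely stands in for whatever loader the HRegion layer holds for the coprocessor jar.

```java
/** Hypothetical holder for a call whose protocol class is resolved lazily by name. */
final class LazyProtocolRef {
  private final String className;

  LazyProtocolRef(String className) {
    this.className = className;
  }

  /** The name travels as a plain string, so deserialization never needs the class itself. */
  String getClassName() {
    return className;
  }

  /** Resolves the class only when asked, against a loader that actually knows it. */
  Class<?> resolve(ClassLoader loader) {
    try {
      // 'false' avoids running static initializers during lookup.
      return Class.forName(className, false, loader);
    } catch (ClassNotFoundException e) {
      throw new IllegalStateException("Protocol class " + className + " not found", e);
    }
  }
}
```

The design choice here is that the RPC layer only moves the string around, and the lookup is deferred to the layer that owns the right class loader.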
