[jira] [Commented] (HBASE-5778) Turn on WAL compression by default

2012-04-12 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253127#comment-13253127
 ] 

Ted Yu commented on HBASE-5778:
---

The remaining issue is how the replication sink correctly decompresses the WAL.
From the test output, I saw:
{code}
java.io.EOFException
  at java.io.DataInputStream.readFully(DataInputStream.java:180)
  at org.apache.hadoop.hbase.KeyValue.readFields(KeyValue.java:2243)
  at org.apache.hadoop.hbase.KeyValue.readFields(KeyValue.java:2249)
  at 
org.apache.hadoop.hbase.regionserver.wal.WALEdit.readFields(WALEdit.java:129)
  at 
org.apache.hadoop.hbase.regionserver.wal.HLog$Entry.readFields(HLog.java:1700)
{code}
For the replication sink, there is no CompressionContext in HLog$Entry that can be 
used to perform decompression.
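
For illustration, a minimal sketch of what the sink-side read path would need; the setter used below is an assumption for this sketch, not necessarily the shipped API:
{code}
import java.io.DataInput;
import java.io.IOException;

import org.apache.hadoop.hbase.regionserver.wal.CompressionContext;
import org.apache.hadoop.hbase.regionserver.wal.HLog;

// Sketch only: without a CompressionContext holding the dictionaries built up
// while reading the file, Entry.readFields() cannot expand the compressed
// KeyValues, so readFully() runs off the end of the stream (the EOFException
// above). The setter below is an assumption made for this illustration.
final class SinkSideReadSketch {
  static HLog.Entry readOneEntry(DataInput in, CompressionContext ctx) throws IOException {
    HLog.Entry entry = new HLog.Entry();
    entry.setCompressionContext(ctx); // assumed hook mirroring the reader side
    entry.readFields(in);
    return entry;
  }
}
{code}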

I agree the change should be reverted.

> Turn on WAL compression by default
> --
>
> Key: HBASE-5778
> URL: https://issues.apache.org/jira/browse/HBASE-5778
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jean-Daniel Cryans
>Assignee: Lars Hofhansl
>Priority: Blocker
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5778-addendum.txt, 5778.addendum, HBASE-5778.patch
>
>
> I ran some tests to verify if WAL compression should be turned on by default.
> For a use case where it's not very useful (values two orders of magnitude 
> bigger than the keys), the insert time wasn't different and CPU usage was 15% 
> higher (150% CPU usage vs 130% when not compressing the WAL).
> When values are smaller than the keys, I saw a 38% improvement in insert 
> run time, and CPU usage was 33% higher (600% CPU usage vs 450%). I'm not sure 
> WAL compression accounts for all the additional CPU usage; it might just be 
> that we're able to insert faster and we spend more time in the MemStore per 
> second (because our MemStores are bad when they contain tens of thousands of 
> values).
> Those are two extremes, but it shows that for the price of some CPU we can 
> save a lot. My machines have 2 quads with HT, so I still had a lot of idle 
> CPUs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5778) Turn on WAL compression by default

2012-04-12 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253107#comment-13253107
 ] 

Ted Yu commented on HBASE-5778:
---

{code}
+  } catch (IndexOutOfBoundsException iobe) {
+// this can happen with a corrupted file, fall through
+  }
{code}
I think we should note down the cause of the failure to retrieve the dictionary 
entry and provide a clearer message in the IOException below:
{code}
   if (entry == null) {
 throw new IOException("Missing dictionary entry for index "
 + dictIdx);
{code}
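
For instance, a sketch of what that could look like (entry and dictIdx come from the quoted hunks; dict.getEntry() and the message wording are assumptions for this illustration):
{code}
byte[] entry = null;
IndexOutOfBoundsException cause = null;
try {
  entry = dict.getEntry(dictIdx);   // assumed lookup, per the hunk's context
} catch (IndexOutOfBoundsException iobe) {
  // can happen with a corrupted file; keep the cause for the message below
  cause = iobe;
}
if (entry == null) {
  throw new IOException("Missing dictionary entry for index " + dictIdx
      + (cause != null ? " (lookup failed: " + cause + ")" : ""), cause);
}
{code}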

> Turn on WAL compression by default
> --
>
> Key: HBASE-5778
> URL: https://issues.apache.org/jira/browse/HBASE-5778
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jean-Daniel Cryans
>Assignee: Lars Hofhansl
>Priority: Blocker
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5778-addendum.txt, 5778.addendum, HBASE-5778.patch
>
>
> I ran some tests to verify if WAL compression should be turned on by default.
> For a use case where it's not very useful (values two orders of magnitude 
> bigger than the keys), the insert time wasn't different and CPU usage was 15% 
> higher (150% CPU usage vs 130% when not compressing the WAL).
> When values are smaller than the keys, I saw a 38% improvement in insert 
> run time, and CPU usage was 33% higher (600% CPU usage vs 450%). I'm not sure 
> WAL compression accounts for all the additional CPU usage; it might just be 
> that we're able to insert faster and we spend more time in the MemStore per 
> second (because our MemStores are bad when they contain tens of thousands of 
> values).
> Those are two extremes, but it shows that for the price of some CPU we can 
> save a lot. My machines have 2 quads with HT, so I still had a lot of idle 
> CPUs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5780) Fix race in HBase regionserver startup vs ZK SASL authentication

2012-04-12 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253102#comment-13253102
 ] 

Ted Yu commented on HBASE-5780:
---

I think throwing an exception immediately is better.

Please also run through the test suite using '-Psecurity', since Hadoop QA doesn't 
test the security profile. Let us know the test results.

Thanks

> Fix race in HBase regionserver startup vs ZK SASL authentication
> 
>
> Key: HBASE-5780
> URL: https://issues.apache.org/jira/browse/HBASE-5780
> Project: HBase
>  Issue Type: Bug
>  Components: security
>Affects Versions: 0.92.1, 0.94.0
>Reporter: Shaneal Manek
>Assignee: Shaneal Manek
> Attachments: HBASE-5780.patch
>
>
> Secure RegionServers sometimes fail to start with the following backtrace:
> 2012-03-22 17:20:16,737 FATAL 
> org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server 
> centos60-20.ent.cloudera.com,60020,1332462015929: Unexpected exception during 
> initialization, aborting
> org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = 
> NoAuth for /hbase/shutdown
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:113)
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1131)
> at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:295)
> at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataInternal(ZKUtil.java:518)
> at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:494)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.start(ZooKeeperNodeTracker.java:77)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:569)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:532)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:634)
> at java.lang.Thread.run(Thread.java:662)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-12 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253085#comment-13253085
 ] 

Ted Yu commented on HBASE-5604:
---

TestWALPlayer.java doesn't have a test category.
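
i.e. something along these lines (MediumTests is only an assumption here; the right category depends on the test's runtime):
{code}
import org.apache.hadoop.hbase.MediumTests;
import org.junit.experimental.categories.Category;

@Category(MediumTests.class)
public class TestWALPlayer {
  // ... existing tests unchanged ...
}
{code}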

> M/R tool to replay WAL files
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 
> 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restoring a backup using the HLogs.
> This could be an M/R job (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter out all WALEdits that don't 
> fit into the time range or don't belong to any of the tables, and then use 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5780) Fix race in HBase regionserver startup vs ZK SASL authentication

2012-04-12 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253074#comment-13253074
 ] 

Ted Yu commented on HBASE-5780:
---

{code}
+} catch (InterruptedException e) {
+  LOG.error("Interrupted while waiting for the ZookeeperWatcher to 
authenticate", e);
{code}
Is it safe to proceed with start() in the above case ?
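
A sketch of the fail-fast alternative (an illustration, not the attached patch; it assumes the enclosing method can throw IOException):
{code}
} catch (InterruptedException e) {
  // restore the interrupt flag and fail fast rather than proceeding with
  // start() on a watcher that never authenticated
  Thread.currentThread().interrupt();
  InterruptedIOException iioe = new InterruptedIOException(
      "Interrupted while waiting for the ZookeeperWatcher to authenticate");
  iioe.initCause(e);
  throw iioe;
}
{code}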

> Fix race in HBase regionserver startup vs ZK SASL authentication
> 
>
> Key: HBASE-5780
> URL: https://issues.apache.org/jira/browse/HBASE-5780
> Project: HBase
>  Issue Type: Bug
>  Components: security
>Affects Versions: 0.92.1, 0.94.0
>Reporter: Shaneal Manek
>Assignee: Shaneal Manek
> Attachments: HBASE-5780.patch
>
>
> Secure RegionServers sometimes fail to start with the following backtrace:
> 2012-03-22 17:20:16,737 FATAL 
> org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server 
> centos60-20.ent.cloudera.com,60020,1332462015929: Unexpected exception during 
> initialization, aborting
> org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = 
> NoAuth for /hbase/shutdown
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:113)
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1131)
> at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:295)
> at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataInternal(ZKUtil.java:518)
> at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:494)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.start(ZooKeeperNodeTracker.java:77)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:569)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:532)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:634)
> at java.lang.Thread.run(Thread.java:662)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-12 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252916#comment-13252916
 ] 

Ted Yu commented on HBASE-5604:
---

Patch looks good.
Minor comment:
{code}
+  HLog.Entry temp;
+  long i=-1;
{code}
Insert spaces around the equals sign above.

> M/R tool to replay WAL files
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 
> 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restoring a backup using the HLogs.
> This could be an M/R job (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter out all WALEdits that don't 
> fit into the time range or don't belong to any of the tables, and then use 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5773) HtablePool constructor not reading config files in certain cases

2012-04-12 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252551#comment-13252551
 ] 

Ted Yu commented on HBASE-5773:
---

Patch makes sense.

> HtablePool constructor not reading config files in certain cases
> 
>
> Key: HBASE-5773
> URL: https://issues.apache.org/jira/browse/HBASE-5773
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.6, 0.92.1, 0.94.1
>Reporter: Ioan Eugen Stan
>Priority: Minor
> Fix For: 0.90.7, 0.92.2, 0.94.1
>
> Attachments: different-config-behaviour.patch
>
>
> Creating an HTablePool can result in two different behaviours depending on the 
> constructor called. 
> Case 1: loads the configs from hbase-site
>   public HTablePool() {
> this(HBaseConfiguration.create(), Integer.MAX_VALUE);
>   }
> Calling this with null values for Configuration: 
> public HTablePool(final Configuration config, final int maxSize) {
> this(config, maxSize, null, null);
>   }
> will delegate to:
>  public HTablePool(final Configuration config, final int maxSize,
>   final HTableInterfaceFactory tableFactory, PoolType poolType) {
> // Make a new configuration instance so I can safely cleanup when
> // done with the pool.
> this.config = config == null ? new Configuration() : config;
> which does not read the hbase-site config files as 
> HBaseConfiguration.create() does. 
> I've tracked this problem to all versions of hbase. 
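
A minimal sketch of one direction a fix could take (an illustration only; the attached patch may do something different): fall back to HBaseConfiguration.create(), which loads hbase-default.xml and hbase-site.xml, instead of a bare new Configuration():
{code}
public HTablePool(final Configuration config, final int maxSize,
    final HTableInterfaceFactory tableFactory, PoolType poolType) {
  // Make a new configuration instance so I can safely cleanup when
  // done with the pool.
  this.config = config == null ? HBaseConfiguration.create() : config;
  // ... rest unchanged ...
}
{code}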

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-12 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252537#comment-13252537
 ] 

Ted Yu commented on HBASE-5604:
---

If you run the patch through 'arc lint', you would see (just an excerpt):
{code}
   Warning  (TXT3) Line Too Long
This line is 106 characters long, but the convention is 100 characters.

 274 System.err.println("  -D" + BULK_OUTPUT_CONF_KEY + 
"=/path/for/output");
 275 System.err.println("  (Exactly one table must be specified 
for this option, and not mapping can be specified!)");
 276 System.err.println("Other options:");
>>>  277 System.err.println("  -D" + HLogInputFormat.START_TIME_KEY 
+ "=ms (only apply edit after this time)");
 278 System.err.println("  -D" + HLogInputFormat.END_TIME_KEY + 
"=ms (only apply edit before this time)");
 279 System.err.println("For performance also consider the 
following options:\n"
 280 + "  -Dmapred.map.tasks.speculative.execution=false\n"

   Warning  (TXT3) Line Too Long
This line is 105 characters long, but the convention is 100 characters.

 275 System.err.println("  (Exactly one table must be specified 
for this option, and not mapping can be specified!)");
 276 System.err.println("Other options:");
 277 System.err.println("  -D" + HLogInputFormat.START_TIME_KEY 
+ "=ms (only apply edit after this time)");
>>>  278 System.err.println("  -D" + HLogInputFormat.END_TIME_KEY + 
"=ms (only apply edit before this time)");
 279 System.err.println("For performance also consider the 
following options:\n"
 280 + "  -Dmapred.map.tasks.speculative.execution=false\n"
 281 + "  
-Dmapred.reduce.tasks.speculative.execution=false");
{code}
Below is a long line:
{code}
+throw new IOException(option + " must be specified either in the form 
2001-02-20T16:35:06.99 or as a number of milliseconds");
{code}
Minor comment: the 'a' in 'as a number' should be omitted.
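
For example, the quoted statement could be wrapped to stay under the 100-character convention (message text left as in the patch):
{code}
throw new IOException(option
    + " must be specified either in the form 2001-02-20T16:35:06.99"
    + " or as a number of milliseconds");
{code}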

> M/R tool to replay WAL files
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Attachments: 5604-v4.txt, 5604-v6.txt, 5604-v7.txt, 5604-v8.txt, 
> 5604-v9.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restoring a backup using the HLogs.
> This could be an M/R job (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter out all WALEdits that don't 
> fit into the time range or don't belong to any of the tables, and then use 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5767) Add the hbase shell table_att for any attribute

2012-04-12 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252515#comment-13252515
 ] 

Ted Yu commented on HBASE-5767:
---

In order for HBASE-5335 to be backported to 0.92, you need to provide a backport 
for 0.94 first :-)
Remember to take into account the two required JIRAs, including HBASE-5359.

> Add the hbase shell table_att for any attribute
> ---
>
> Key: HBASE-5767
> URL: https://issues.apache.org/jira/browse/HBASE-5767
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Xing Shi
>Priority: Minor
> Attachments: HBASE-5767-V2.patch, HBASE-5767.patch
>
>
> Now HTableDescriptor supports the setValue(String key, String value) method, 
> but the hbase shell does not support it.
> Maybe like this:
> {quote}
> hbase(main):003:0> alter 'test', METHOD=>'table_att', 'key1'=>'value1'
> Updating all regions with the new schema...
> 1/1 regions updated.
> Done.
> 0 row(s) in 1.0820 seconds
> hbase(main):005:0> describe 'test'
> DESCRIPTION   
> ENABLED  
>  {NAME => 'test', key1 => 'value1', FAMILIES => [{NAME => 'f1', BLOOMFILTER 
> => 'NONE', RE true 
>  PLICATION_SCOPE => '0', VERSIONS => '3', COMPRESSION => 'NONE', MIN_VERSIONS 
> => '0', TTL  
>   => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 
> 'true'}]} 
> 1 row(s) in 0.0300 seconds
> hbase(main):007:0> alter 'test', METHOD=>'table_att_unset', NAME=>'key1'
> Updating all regions with the new schema...
> 1/1 regions updated.
> Done.
> 0 row(s) in 1.0860 seconds
> hbase(main):008:0> describe 'test'
> DESCRIPTION   
> ENABLED  
>  {NAME => 'test', FAMILIES => [{NAME => 'f1', BLOOMFILTER => 'NONE', 
> REPLICATION_SCOPE => false
>   '0', VERSIONS => '3', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => 
> '2147483647',   
>  BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}  
>  
> 1 row(s) in 0.0280 seconds
> {quote}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5727) secure hbase build broke because of 'HBASE-5451 Switch RPC call envelope/headers to PBs'

2012-04-08 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13249612#comment-13249612
 ] 

Ted Yu commented on HBASE-5727:
---

TestRowProcessorEndpoint in the security profile passed smoothly with the latest patch.

+1 on integrating the patch and launching another secure build.

Thanks for working over the weekend, Devaraj.

> secure hbase build broke because of 'HBASE-5451 Switch RPC call 
> envelope/headers to PBs'
> 
>
> Key: HBASE-5727
> URL: https://issues.apache.org/jira/browse/HBASE-5727
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: Devaraj Das
>Priority: Blocker
> Attachments: 5727.1.patch, 5727.patch
>
>
> If you build with the security profile -- i.e. add '-P security' on the 
> command line -- you'll see that the secure build is broken since we messed with 
> rpc.
> Assigning Devaraj to take a look.   If you can't work on this now DD, just 
> give it back to me and I'll have a go at it.  Thanks.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5656) LoadIncrementalHFiles createTable should detect and set compression algorithm

2012-04-08 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13249602#comment-13249602
 ] 

Ted Yu commented on HBASE-5656:
---

Latest patch looks good to me.

> LoadIncrementalHFiles createTable should detect and set compression algorithm
> -
>
> Key: HBASE-5656
> URL: https://issues.apache.org/jira/browse/HBASE-5656
> Project: HBase
>  Issue Type: Bug
>  Components: util
>Affects Versions: 0.92.1
>Reporter: Cosmin Lehene
>Assignee: Cosmin Lehene
> Fix For: 0.92.2, 0.94.0, 0.96.0
>
> Attachments: 5656-simple.txt, HBASE-5656-0.92.patch, 
> HBASE-5656-0.92.patch, HBASE-5656-0.92.patch, HBASE-5656-0.92.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> LoadIncrementalHFiles doesn't set compression when creating the table.
> This can be detected from the files within each family dir. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5656) LoadIncrementalHFiles createTable should detect and set compression algorithm

2012-04-08 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13249555#comment-13249555
 ] 

Ted Yu commented on HBASE-5656:
---

Patch looks good. Minor comment:
{code}
+hcd.setCompressionType(reader.getCompressionAlgorithm());
{code}
I think we should log the compression type used in the if block.
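
For example (a sketch only; the enclosing if condition is elided, the log wording is illustrative, and it assumes the class's existing LOG):
{code}
hcd.setCompressionType(reader.getCompressionAlgorithm());
LOG.info("Setting compression " + reader.getCompressionAlgorithm()
    + " for family " + hcd.getNameAsString());
{code}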

> LoadIncrementalHFiles createTable should detect and set compression algorithm
> -
>
> Key: HBASE-5656
> URL: https://issues.apache.org/jira/browse/HBASE-5656
> Project: HBase
>  Issue Type: Bug
>  Components: util
>Affects Versions: 0.92.1
>Reporter: Cosmin Lehene
>Assignee: Cosmin Lehene
> Fix For: 0.92.2, 0.94.0, 0.96.0
>
> Attachments: 5656-simple.txt, HBASE-5656-0.92.patch, 
> HBASE-5656-0.92.patch, HBASE-5656-0.92.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> LoadIncrementalHFiles doesn't set compression when creating the table.
> This can be detected from the files within each family dir. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5436) Right-size the map when reading attributes.

2012-04-01 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243958#comment-13243958
 ] 

Ted Yu commented on HBASE-5436:
---

Integrated to 0.92 branch.

> Right-size the map when reading attributes.
> ---
>
> Key: HBASE-5436
> URL: https://issues.apache.org/jira/browse/HBASE-5436
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.92.0
>Reporter: Benoit Sigoure
>Assignee: Benoit Sigoure
>Priority: Trivial
>  Labels: performance
> Fix For: 0.94.0
>
> Attachments: 0001-Right-size-the-map-when-reading-attributes.patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5666) RegionServer doesn't retry to check if base node is available

2012-04-01 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243863#comment-13243863
 ] 

Ted Yu commented on HBASE-5666:
---

Patch v2 looks good.

> RegionServer doesn't retry to check if base node is available
> -
>
> Key: HBASE-5666
> URL: https://issues.apache.org/jira/browse/HBASE-5666
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver, zookeeper
>Reporter: Matteo Bertozzi
>Assignee: Matteo Bertozzi
> Attachments: HBASE-5666-v1.patch, HBASE-5666-v2.patch, 
> hbase-1-regionserver.log, hbase-2-regionserver.log, hbase-3-regionserver.log, 
> hbase-master.log, hbase-regionserver.log, hbase-zookeeper.log
>
>
> I've a script that starts hbase and a couple of region servers in distributed 
> mode (hbase.cluster.distributed = true)
> {code}
> $HBASE_HOME/bin/start-hbase.sh
> $HBASE_HOME/bin/local-regionservers.sh start 1 2 3
> {code}
> but the region servers are not able to start...
> It seems that during the RS start the znode is still not available, and 
> HRegionServer.initializeZooKeeper() checks just once whether the base node is 
> available.
> {code}
> 2012-03-28 21:54:05,013 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Check the value 
> configured in 'zookeeper.znode.parent'. There could be a mismatch with the 
> one configured in the master.
> 2012-03-28 21:54:08,598 FATAL 
> org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server 
> localhost,60202,133296824: Initialization of RS failed.  Hence aborting 
> RS.
> java.io.IOException: Received the shutdown message while waiting.
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.blockAndCheckIfStopped(HRegionServer.java:626)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:596)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:558)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:672)
>   at java.lang.Thread.run(Thread.java:662)
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5666) RegionServer doesn't retry to check if base node is available

2012-04-01 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243845#comment-13243845
 ] 

Ted Yu commented on HBASE-5666:
---

{code}
+if (keeperEx != null)
+  throw keeperEx;
{code}
Please either lift the throw onto the same line as the if, or add curly braces.
{code}
+checkExists(zk, parentZNode, maxTimeMs);
+LOG.info("Parent znode exists: " + parentZNode);
{code}
If checkExists() returns -1, would the log statement still be true ?
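
For example, the log could be keyed off the return value (a sketch; it assumes checkExists() returns -1 while the znode is still absent):
{code}
int version = checkExists(zk, parentZNode, maxTimeMs);
if (version != -1) {
  LOG.info("Parent znode exists: " + parentZNode);
} else {
  LOG.warn("Parent znode still missing after " + maxTimeMs + "ms: " + parentZNode);
}
{code}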

> RegionServer doesn't retry to check if base node is available
> -
>
> Key: HBASE-5666
> URL: https://issues.apache.org/jira/browse/HBASE-5666
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver, zookeeper
>Reporter: Matteo Bertozzi
>Assignee: Matteo Bertozzi
> Attachments: HBASE-5666-v1.patch, hbase-1-regionserver.log, 
> hbase-2-regionserver.log, hbase-3-regionserver.log, hbase-master.log, 
> hbase-regionserver.log, hbase-zookeeper.log
>
>
> I've a script that starts hbase and a couple of region servers in distributed 
> mode (hbase.cluster.distributed = true)
> {code}
> $HBASE_HOME/bin/start-hbase.sh
> $HBASE_HOME/bin/local-regionservers.sh start 1 2 3
> {code}
> but the region servers are not able to start...
> It seems that during the RS start the znode is still not available, and 
> HRegionServer.initializeZooKeeper() checks just once whether the base node is 
> available.
> {code}
> 2012-03-28 21:54:05,013 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Check the value 
> configured in 'zookeeper.znode.parent'. There could be a mismatch with the 
> one configured in the master.
> 2012-03-28 21:54:08,598 FATAL 
> org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server 
> localhost,60202,133296824: Initialization of RS failed.  Hence aborting 
> RS.
> java.io.IOException: Received the shutdown message while waiting.
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.blockAndCheckIfStopped(HRegionServer.java:626)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:596)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:558)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:672)
>   at java.lang.Thread.run(Thread.java:662)
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5665) Repeated split causes HRegionServer failures and breaks table

2012-04-01 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243842#comment-13243842
 ] 

Ted Yu commented on HBASE-5665:
---

HBASE-5665-trunk.patch looks good.

> Repeated split causes HRegionServer failures and breaks table 
> --
>
> Key: HBASE-5665
> URL: https://issues.apache.org/jira/browse/HBASE-5665
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.92.0, 0.92.1
>Reporter: Cosmin Lehene
>Assignee: Cosmin Lehene
>Priority: Blocker
> Attachments: HBASE-5665-0.92.patch, HBASE-5665-trunk.patch
>
>
> Repeated splits on large tables (2 consecutive would suffice) will 
> essentially "break" the table (and the cluster), unrecoverably.
> The regionserver doing the split dies and the master will get into an 
> infinite loop trying to assign regions that seem to have the files missing 
> from HDFS.
> The table can be disabled once; upon trying to re-enable it, it will remain 
> in an intermediate state forever.
> I was able to reproduce this on a smaller table consistently.
> {code}
> hbase(main):030:0> (0..1).each{|x| put 't1', "#{x}", 'f1:t', 'dd'}
> hbase(main):030:0> (0..1000).each{|x| split 't1', "#{x*10}"}
> {code}
> Running overlapping splits in parallel (e.g. "#{x*10+1}", "#{x*10+2}"... ) 
> will reproduce the issue almost instantly and consistently. 
> {code}
> 2012-03-28 10:57:16,320 INFO org.apache.hadoop.hbase.catalog.MetaEditor: 
> Offlined parent region t1,,1332957435767.2fb0473f4e71339e88dab0ee0d4dffa1. in 
> META
> 2012-03-28 10:57:16,321 DEBUG 
> org.apache.hadoop.hbase.regionserver.CompactSplitThread: Split requested for 
> t1,5,1332957435767.648d30de55a5cec6fc2f56dcb3c7eee1..  
> compaction_queue=(0:1), split_queue=10
> 2012-03-28 10:57:16,343 INFO 
> org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup 
> of failed split of t1,,1332957435767.2fb0473f4e71339e88dab0ee0d4dffa1.; 
> Failed ld2,60020,1332957343833-daughterOpener=2469c5650ea2aeed631eb85d3cdc3124
> java.io.IOException: Failed 
> ld2,60020,1332957343833-daughterOpener=2469c5650ea2aeed631eb85d3cdc3124
> at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.openDaughters(SplitTransaction.java:363)
> at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:451)
> at 
> org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:67)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.FileNotFoundException: File does not exist: 
> /hbase/t1/589c44cabba419c6ad8c9b427e5894e3.2fb0473f4e71339e88dab0ee0d4dffa1/f1/d62a852c25ad44e09518e102ca557237
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1822)
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.(DFSClient.java:1813)
> at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:544)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:187)
> at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:456)
> at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:341)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Reader.(StoreFile.java:1008)
> at 
> org.apache.hadoop.hbase.io.HalfStoreFileReader.(HalfStoreFileReader.java:65)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:467)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:548)
> at 
> org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:284)
> at org.apache.hadoop.hbase.regionserver.Store.(Store.java:221)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:2511)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:450)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3229)
> at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction.openDaughterRegion(SplitTransaction.java:504)
> at 
> org.apache.hadoop.hbase.regionserver.SplitTransaction$DaughterOpener.run(SplitTransaction.java:484)
> ... 1 more
> 2012-03-28 10:57:16,345 FATAL 
> org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server 
> ld2,60020,1332957343833: Abort; we got an error after point-of-no-return
> {code}
> http://hastebin.com/diqinibajo.avrasm
> later edit:
> (I'm using the last 4 characters from each string)
> Region 94e3 has storefile

[jira] [Commented] (HBASE-5693) When creating a region, the master initializes it and creates a memstore within the master server

2012-04-01 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243765#comment-13243765
 ] 

Ted Yu commented on HBASE-5693:
---

@N:
Can you rebase the patch for trunk?
{code}
Hunk #3 FAILED at 3613.
1 out of 3 hunks FAILED -- saving rejects to file 
src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java.rej
{code}

> When creating a region, the master initializes it and creates a memstore 
> within the master server
> -
>
> Key: HBASE-5693
> URL: https://issues.apache.org/jira/browse/HBASE-5693
> Project: HBase
>  Issue Type: Improvement
>  Components: master, regionserver
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Attachments: 5693.v1.patch
>
>
> I didn't do a complete analysis, but the attached patch saves more than 0.25s 
> for each region creation and locally all the unit tests work.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5693) When creating a region, the master initializes it and creates a memstore within the master server

2012-04-01 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243762#comment-13243762
 ] 

Ted Yu commented on HBASE-5693:
---

It is called from OpenRegionHandler.openRegion()

I once made some threads daemons, which passed unit tests but resulted in the master 
and region server failing to start.

Testing on a real cluster is desirable.

> When creating a region, the master initializes it and creates a memstore 
> within the master server
> -
>
> Key: HBASE-5693
> URL: https://issues.apache.org/jira/browse/HBASE-5693
> Project: HBase
>  Issue Type: Improvement
>  Components: master, regionserver
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Attachments: 5693.v1.patch
>
>
> I didn't do a complete analysis, but the attached patch saves more than 0.25s 
> for each region creation and locally all the unit tests work.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5677) The master never does balance because duplicate openhandled the one region

2012-04-01 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243747#comment-13243747
 ] 

Ted Yu commented on HBASE-5677:
---

Interesting.
Chunhui proposed a safe mode for the Master in HBASE-5270. See 
https://issues.apache.org/jira/browse/HBASE-5270?focusedCommentId=13214394&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13214394

Can you verify that this issue has been fixed in 0.92.2 ?

Thanks

> The master never does balance because duplicate openhandled the one region
> --
>
> Key: HBASE-5677
> URL: https://issues.apache.org/jira/browse/HBASE-5677
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6
> Environment: 0.90
>Reporter: xufeng
>Assignee: xufeng
>
> If a region gets assigned while the master is doing initialization (before it does 
> processFailover), the region will be open-handled twice, 
> because the unassigned node in ZooKeeper will be handled again in 
> AssignmentManager#processFailover().
> This causes the region to be stuck in RIT, so the master never does balance.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5693) When creating a region, the master initializes it and creates a memstore within the master server

2012-04-01 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243746#comment-13243746
 ] 

Ted Yu commented on HBASE-5693:
---

CreateTableHandler isn't initializing the regions.
Who will initialize them ?

> When creating a region, the master initializes it and creates a memstore 
> within the master server
> -
>
> Key: HBASE-5693
> URL: https://issues.apache.org/jira/browse/HBASE-5693
> Project: HBase
>  Issue Type: Improvement
>  Components: master, regionserver
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Attachments: 5693.v1.patch
>
>
> I didn't do a complete analysis, but the attached patch saves more than 0.25s 
> for each region creation and locally all the unit tests work.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5689) Skipping RecoveredEdits may cause data loss

2012-04-01 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243743#comment-13243743
 ] 

Ted Yu commented on HBASE-5689:
---

Using a TreeMap is common practice.

Please attach the test suite results - Hadoop QA is not working.

> Skipping RecoveredEdits may cause data loss
> ---
>
> Key: HBASE-5689
> URL: https://issues.apache.org/jira/browse/HBASE-5689
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.94.0
>Reporter: chunhui shen
>Assignee: chunhui shen
>Priority: Critical
> Fix For: 0.94.0
>
> Attachments: 5689-simplified.txt, 5689-testcase.patch, 
> HBASE-5689.patch
>
>
> Let's see the following scenario:
> 1.Region is on the server A
> 2.put KV(r1->v1) to the region
> 3.move region from server A to server B
> 4.put KV(r2->v2) to the region
> 5.move region from server B to server A
> 6.put KV(r3->v3) to the region
> 7.kill -9 server B and start it
> 8.kill -9 server A and start it 
> 9.scan the region, we could only get two KV(r1->v1,r2->v2), the third 
> KV(r3->v3) is lost.
> Let's analyse the above scenario from the code:
> 1. The edit logs of KV(r1->v1) and KV(r3->v3) are both recorded in the same 
> hlog file on server A.
> 2. When we split server B's hlog file in the process of ServerShutdownHandler, 
> we create one RecoveredEdits file f1 for the region.
> 3. When we split server A's hlog file in the process of ServerShutdownHandler, 
> we create another RecoveredEdits file f2 for the region.
> 4. However, RecoveredEdits file f2 will be skipped when initializing the region 
> in HRegion#replayRecoveredEditsIfAny:
> {code}
>  for (Path edits: files) {
>   if (edits == null || !this.fs.exists(edits)) {
> LOG.warn("Null or non-existent edits file: " + edits);
> continue;
>   }
>   if (isZeroLengthThenDelete(this.fs, edits)) continue;
>   if (checkSafeToSkip) {
> Path higher = files.higher(edits);
> long maxSeqId = Long.MAX_VALUE;
> if (higher != null) {
>   // Edit file name pattern, HLog.EDITFILES_NAME_PATTERN: "-?[0-9]+"
>   String fileName = higher.getName();
>   maxSeqId = Math.abs(Long.parseLong(fileName));
> }
> if (maxSeqId <= minSeqId) {
>   String msg = "Maximum possible sequenceid for this log is " + 
> maxSeqId
>   + ", skipped the whole file, path=" + edits;
>   LOG.debug(msg);
>   continue;
> } else {
>   checkSafeToSkip = false;
> }
>   }
> {code}
>  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5641) decayingSampleTick1 prevents HBase from shutting down.

2012-03-26 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239139#comment-13239139
 ] 

Ted Yu commented on HBASE-5641:
---

+1 on patch.

> decayingSampleTick1 prevents HBase from shutting down.
> --
>
> Key: HBASE-5641
> URL: https://issues.apache.org/jira/browse/HBASE-5641
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Blocker
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5641.txt
>
>
> I think this is the problem. It creates a non-daemon thread.
> {code}
>   private static final ScheduledExecutorService TICK_SERVICE = 
>   Executors.newScheduledThreadPool(1, 
>   Threads.getNamedThreadFactory("decayingSampleTick"));
> {code}
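
A sketch of the usual fix for this kind of hang (an assumption for illustration, not necessarily what 5641.txt does): give the pool a factory that marks the tick thread as a daemon so it cannot keep the JVM alive on shutdown.
{code}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadFactory;

final class DaemonTickServiceSketch {
  private static final ScheduledExecutorService TICK_SERVICE =
      Executors.newScheduledThreadPool(1, new ThreadFactory() {
        @Override
        public Thread newThread(Runnable r) {
          Thread t = new Thread(r, "decayingSampleTick");
          t.setDaemon(true); // daemon threads do not block JVM shutdown
          return t;
        }
      });

  private DaemonTickServiceSketch() {
  }
}
{code}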

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-2600) Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid

2012-03-26 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238891#comment-13238891
 ] 

Ted Yu commented on HBASE-2600:
---

@Alex:
Can you attach hbase-2600-root.dir.tgz to this JIRA ?
Please briefly describe how you generated the tar ball.

Thanks

> Change how we do meta tables; from tablename+STARTROW+randomid to instead, 
> tablename+ENDROW+randomid
> 
>
> Key: HBASE-2600
> URL: https://issues.apache.org/jira/browse/HBASE-2600
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: Alex Newman
> Attachments: 
> 0001-Changed-regioninfo-format-to-use-endKey-instead-of-s.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v2.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v4.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v6.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v7.2.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v8, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v8.1, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v9.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen.patch, 
> 2600-trunk-01-17.txt, HBASE-2600+5217-Sun-Mar-25-2012-v3.patch, 
> HBASE-2600+5217-Sun-Mar-25-2012-v4.patch, jenkins.pdf
>
>
> This is an idea that Ryan and I have been kicking around on and off for a 
> while now.
> If regionnames were made of tablename+endrow instead of tablename+startrow, 
> then in the metatables, doing a search for the region that contains the 
> wanted row, we'd just have to open a scanner using passed row and the first 
> row found by the scan would be that of the region we need (If offlined 
> parent, we'd have to scan to the next row).
> If we redid the meta tables in this format, we'd be using an access that is 
> natural to hbase, a scan as opposed to the perverse, expensive 
> getClosestRowBefore we currently have that has to walk backward in meta 
> finding a containing region.
> This issue is about changing the way we name regions.
> If we were using scans, prewarming client cache would be near costless (as 
> opposed to what we'll currently have to do which is first a 
> getClosestRowBefore and then a scan from the closestrowbefore forward).
> Converting to the new method, we'd have to run a migration on startup 
> changing the content in meta.
> Up to this, the randomid component of a region name has been the timestamp of 
> region creation.   HBASE-2531 "32-bit encoding of regionnames waaay 
> too susceptible to hash clashes" proposes changing the randomid so that it 
> contains actual name of the directory in the filesystem that hosts the 
> region.  If we had this in place, I think it would help with the migration to 
> this new way of doing the meta because as is, the region name in fs is a hash 
> of regionname... changing the format of the regionname would mean we generate 
> a different hash... so we'd need hbase-2531 to be in place before we could do 
> this change.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5623) Race condition when rolling the HLog and hlogFlush

2012-03-26 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238675#comment-13238675
 ] 

Ted Yu commented on HBASE-5623:
---

Last patch should be good to go.

> Race condition when rolling the HLog and hlogFlush
> --
>
> Key: HBASE-5623
> URL: https://issues.apache.org/jira/browse/HBASE-5623
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
>Priority: Critical
> Fix For: 0.94.0
>
> Attachments: 5623-suggestion.txt, 5623-v7.txt, 5623-v8.txt, 5623.txt, 
> 5623v2.txt, HBASE-5623_v0.patch, HBASE-5623_v4.patch, HBASE-5623_v5.patch, 
> HBASE-5623_v6-alt.patch, HBASE-5623_v6-alt.patch
>
>
> When doing a ycsb test with a large number of handlers 
> (regionserver.handler.count=60), I get the following exceptions:
> {code}
> Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.getLength(SequenceFile.java:1099)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.getLength(SequenceFileLogWriter.java:314)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1291)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1388)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
>   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:920)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:152)
>   at $Proxy1.multi(Unknown Source)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1691)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1689)
>   at 
> org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:214)
> {code}
> and 
> {code}
>   java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.checkAndWriteSync(SequenceFile.java:1026)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1068)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1035)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:279)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.hlogFlush(HLog.java:1237)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1271)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1391)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
> {code}
> It seems the root cause of the issue is that we open a new log writer and 
> close the old one at HLog#rollWriter() holding the updateLock, but the other 
> threads doing syncer() calls
> {code} 
> logSyncerThread.hlogFlush(this.writer);
> {code}
> without holding the updateLock. LogSyncer only synchronizes against 
> concurrent appends and flush(), but not on the passed writer, which can be 
> closed already by rollWriter(). In this case, since 
> SequenceFile#Writer.close() sets its out field to null, we get the NPE. 
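
A sketch of one way to close the race (an illustration only, not the committed fix; the method shape and retry contract are assumptions): never let hlogFlush() touch a writer that rollWriter() has already closed, for example by null-checking and serializing on the writer instance that rollWriter() closes under the same lock.
{code}
private void hlogFlush(HLog.Writer writer, List<HLog.Entry> pending) throws IOException {
  if (writer == null) {
    return; // writer was rolled away; the caller retries with the new writer
  }
  synchronized (writer) { // rollWriter() would close the old writer under this same lock
    for (HLog.Entry e : pending) {
      writer.append(e);
    }
  }
}
{code}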

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-2600) Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid

2012-03-25 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238074#comment-13238074
 ] 

Ted Yu commented on HBASE-2600:
---

dev-support/test-patch.sh doesn't use the '--binary' option when applying patches.

I tried the following command:
{code}
patch -p0 --binary -i HBASE-2600+5217-Sun-Mar-25-2012-v4.patch
{code}
But src/test/data/hbase-2600-root.dir.tgz wasn't unpacked from the patch.

> Change how we do meta tables; from tablename+STARTROW+randomid to instead, 
> tablename+ENDROW+randomid
> 
>
> Key: HBASE-2600
> URL: https://issues.apache.org/jira/browse/HBASE-2600
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: Alex Newman
> Attachments: 
> 0001-Changed-regioninfo-format-to-use-endKey-instead-of-s.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v2.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v4.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v6.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v7.2.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v8, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v8.1, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v9.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen.patch, 
> 2600-trunk-01-17.txt, HBASE-2600+5217-Sun-Mar-25-2012-v3.patch, 
> HBASE-2600+5217-Sun-Mar-25-2012-v4.patch, jenkins.pdf
>
>
> This is an idea that Ryan and I have been kicking around on and off for a 
> while now.
> If regionnames were made of tablename+endrow instead of tablename+startrow, 
> then in the metatables, doing a search for the region that contains the 
> wanted row, we'd just have to open a scanner using passed row and the first 
> row found by the scan would be that of the region we need (If offlined 
> parent, we'd have to scan to the next row).
> If we redid the meta tables in this format, we'd be using an access that is 
> natural to hbase, a scan as opposed to the perverse, expensive 
> getClosestRowBefore we currently have that has to walk backward in meta 
> finding a containing region.
> This issue is about changing the way we name regions.
> If we were using scans, prewarming client cache would be near costless (as 
> opposed to what we'll currently have to do which is first a 
> getClosestRowBefore and then a scan from the closestrowbefore forward).
> Converting to the new method, we'd have to run a migration on startup 
> changing the content in meta.
> Up to this, the randomid component of a region name has been the timestamp of 
> region creation.   HBASE-2531 "32-bit encoding of regionnames waaay 
> too susceptible to hash clashes" proposes changing the randomid so that it 
> contains actual name of the directory in the filesystem that hosts the 
> region.  If we had this in place, I think it would help with the migration to 
> this new way of doing the meta because as is, the region name in fs is a hash 
> of regionname... changing the format of the regionname would mean we generate 
> a different hash... so we'd need hbase-2531 to be in place before we could do 
> this change.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5615) the master never does balance because of balancing the parent region

2012-03-25 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238030#comment-13238030
 ] 

Ted Yu commented on HBASE-5615:
---

@Xufeng:
You're welcome.

In the future, please grant license to Apache when you attach patches.

> the master never does balance because of balancing the parent region
> 
>
> Key: HBASE-5615
> URL: https://issues.apache.org/jira/browse/HBASE-5615
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.7
>Reporter: xufeng
>Assignee: xufeng
>Priority: Critical
> Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
> Attachments: 5615-trunk.txt, HBASE-5615-90.patch, HBASE-5615.patch, 
> NoPatched-surefire-report-5615-90.html, Patched_surefire-report-5615-90.html
>
>
> The master never does balance because, when the master does 
> rebuildUserRegions(), it adds the parent region into AssignmentManager#servers.
> If the balancer lets the parent region move, the parent will stay in RIT 
> forever, and thus balance will never be executed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-2600) Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid

2012-03-25 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238010#comment-13238010
 ] 

Ted Yu commented on HBASE-2600:
---

@Alex:
Try this:
{code}
git diff --no-prefix --binary 
{code}

Thanks

> Change how we do meta tables; from tablename+STARTROW+randomid to instead, 
> tablename+ENDROW+randomid
> 
>
> Key: HBASE-2600
> URL: https://issues.apache.org/jira/browse/HBASE-2600
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: Alex Newman
> Attachments: 
> 0001-Changed-regioninfo-format-to-use-endKey-instead-of-s.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v2.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v4.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v6.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v7.2.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v8, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v8.1, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v9.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen.patch, 
> 2600-trunk-01-17.txt, HBASE-2600+5217-Sun-Mar-25-2012-v3.patch, jenkins.pdf
>
>
> This is an idea that Ryan and I have been kicking around on and off for a 
> while now.
> If regionnames were made of tablename+endrow instead of tablename+startrow, 
> then in the metatables, doing a search for the region that contains the 
> wanted row, we'd just have to open a scanner using passed row and the first 
> row found by the scan would be that of the region we need (If offlined 
> parent, we'd have to scan to the next row).
> If we redid the meta tables in this format, we'd be using an access that is 
> natural to hbase, a scan as opposed to the perverse, expensive 
> getClosestRowBefore we currently have that has to walk backward in meta 
> finding a containing region.
> This issue is about changing the way we name regions.
> If we were using scans, prewarming client cache would be near costless (as 
> opposed to what we'll currently have to do which is first a 
> getClosestRowBefore and then a scan from the closestrowbefore forward).
> Converting to the new method, we'd have to run a migration on startup 
> changing the content in meta.
> Up to this, the randomid component of a region name has been the timestamp of 
> region creation.   HBASE-2531 "32-bit encoding of regionnames waaay 
> too susceptible to hash clashes" proposes changing the randomid so that it 
> contains actual name of the directory in the filesystem that hosts the 
> region.  If we had this in place, I think it would help with the migration to 
> this new way of doing the meta because as is, the region name in fs is a hash 
> of regionname... changing the format of the regionname would mean we generate 
> a different hash... so we'd need hbase-2531 to be in place before we could do 
> this change.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5615) the master never does balance because of balancing the parent region

2012-03-25 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238009#comment-13238009
 ] 

Ted Yu commented on HBASE-5615:
---

Integrated to 0.92, 0.94 and TRUNK as well.

> the master never does balance because of balancing the parent region
> 
>
> Key: HBASE-5615
> URL: https://issues.apache.org/jira/browse/HBASE-5615
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.7
>Reporter: xufeng
>Assignee: xufeng
>Priority: Critical
> Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
> Attachments: 5615-trunk.txt, HBASE-5615-90.patch, HBASE-5615.patch, 
> NoPatched-surefire-report-5615-90.html, Patched_surefire-report-5615-90.html
>
>
> The master never does balance because, when the master does 
> rebuildUserRegions(), it adds the parent region into AssignmentManager#servers.
> If the balancer lets the parent region move, the parent will stay in RIT 
> forever, and thus balance will never be executed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5623) Race condition when rolling the HLog and hlogFlush

2012-03-24 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13237714#comment-13237714
 ] 

Ted Yu commented on HBASE-5623:
---

Adding LOG.debug should be fine.

> Race condition when rolling the HLog and hlogFlush
> --
>
> Key: HBASE-5623
> URL: https://issues.apache.org/jira/browse/HBASE-5623
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
>Priority: Critical
> Fix For: 0.94.0
>
> Attachments: 5623-suggestion.txt, 5623-v7.txt, 5623-v8.txt, 5623.txt, 
> 5623v2.txt, HBASE-5623_v0.patch, HBASE-5623_v4.patch, HBASE-5623_v5.patch, 
> HBASE-5623_v6-alt.patch, HBASE-5623_v6-alt.patch
>
>
> When doing a ycsb test with a large number of handlers 
> (regionserver.handler.count=60), I get the following exceptions:
> {code}
> Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.getLength(SequenceFile.java:1099)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.getLength(SequenceFileLogWriter.java:314)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1291)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1388)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
>   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:920)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:152)
>   at $Proxy1.multi(Unknown Source)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1691)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1689)
>   at 
> org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:214)
> {code}
> and 
> {code}
>   java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.checkAndWriteSync(SequenceFile.java:1026)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1068)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1035)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:279)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.hlogFlush(HLog.java:1237)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1271)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1391)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
> {code}
> It seems the root cause of the issue is that we open a new log writer and 
> close the old one at HLog#rollWriter() holding the updateLock, but the other 
> threads doing syncer() calls
> {code} 
> logSyncerThread.hlogFlush(this.writer);
> {code}
> without holding the updateLock. LogSyncer only synchronizes against 
> concurrent appends and flush(), but not on the passed writer, which can be 
> closed already by rollWriter(). In this case, since 
> SequenceFile#Writer.close() sets its out field to null, we get the NPE. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5615) the master never does balance because of balancing the parent region

2012-03-24 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13237556#comment-13237556
 ] 

Ted Yu commented on HBASE-5615:
---

Integrated to 0.90 branch.

Thanks for the patch, Xufeng.

Thanks for the review, Ramkrishna and Jinchao.

Patch for TRUNK to follow.

> the master never does balance because of balancing the parent region
> 
>
> Key: HBASE-5615
> URL: https://issues.apache.org/jira/browse/HBASE-5615
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.7
>Reporter: xufeng
>Assignee: xufeng
>Priority: Critical
> Fix For: 0.90.7, 0.96.0
>
> Attachments: 5615-trunk.txt, HBASE-5615-90.patch, HBASE-5615.patch, 
> NoPatched-surefire-report-5615-90.html, Patched_surefire-report-5615-90.html
>
>
> The master never does balance because, when the master does 
> rebuildUserRegions(), it adds the parent region into AssignmentManager#servers.
> If the balancer lets the parent region move, the parent will stay in RIT 
> forever, and thus balance will never be executed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5596) Few minor bugs from HBASE-5209

2012-03-19 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233216#comment-13233216
 ] 

Ted Yu commented on HBASE-5596:
---

Is this patch ready for integration?

From https://builds.apache.org/job/PreCommit-HBASE-Build/1212//testReport/org.apache.hadoop.hbase.client/TestInstantSchemaChangeFailover/testInstantSchemaChangeWhileRSCrash/:
{code}
Caused by: java.lang.RuntimeException: Master not initialized after 200 seconds
at 
org.apache.hadoop.hbase.util.JVMClusterUtil.startup(JVMClusterUtil.java:208)
at 
org.apache.hadoop.hbase.LocalHBaseCluster.startup(LocalHBaseCluster.java:424)
at 
org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:200)
{code}

> Few minor bugs from HBASE-5209
> --
>
> Key: HBASE-5596
> URL: https://issues.apache.org/jira/browse/HBASE-5596
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.92.1, 0.94.0
>Reporter: David S. Wang
>Assignee: David S. Wang
>Priority: Minor
> Attachments: HBASE_5596.patch
>
>
> A few leftover bugs from HBASE-5209.  Comments are documented here:
> https://reviews.apache.org/r/3892/

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5564) Bulkload is discarding duplicate records

2012-03-19 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233179#comment-13233179
 ] 

Ted Yu commented on HBASE-5564:
---

{code}
+public int getTimestapKeyColumnIndex() {
{code}
Please fix typo in the above method name.
{code}
+  "  -D" + TIMESTAMP_CONF_KEY + "=currentTimeAsLong - use the specified 
timestamp for the import. This option is ignored if HBASE_TS_KEY is specfied in 
'importtsv.columns'\n" +
{code}
Please wrap the long line above.
{code}
+// Should never get 0.
+ts = conf.getLong(ImportTsv.TIMESTAMP_CONF_KEY, 0);
{code}
Please explain why 0 wouldn't be returned.
{code}
+  if (parser.getTimestapKeyColumnIndex() != -1)
+ts = parsed.getTimestamp();
{code}
Please use curly braces around the assignment.
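For instance, just adding the braces (the method name is left exactly as it appears in the patch, typo included):
{code}
if (parser.getTimestapKeyColumnIndex() != -1) {
  ts = parsed.getTimestamp();
}
{code}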

> Bulkload is discarding duplicate records
> 
>
> Key: HBASE-5564
> URL: https://issues.apache.org/jira/browse/HBASE-5564
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0
> Environment: HBase 0.92
>Reporter: Laxman
>Assignee: Laxman
>  Labels: bulkloader
> Attachments: HBASE-5564_trunk.patch
>
>
> Duplicate records are getting discarded when duplicate records exist in the 
> same input file, and more specifically if they exist in the same split.
> Duplicate records are retained only if the records come from different splits.
> Version under test: HBase 0.92

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3996) Support multiple tables and scanners as input to the mapper in map/reduce jobs

2012-03-19 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233120#comment-13233120
 ] 

Ted Yu commented on HBASE-3996:
---

Eran might be busy.

I created https://reviews.apache.org/r/4411/ for people to review.

> Support multiple tables and scanners as input to the mapper in map/reduce jobs
> --
>
> Key: HBASE-3996
> URL: https://issues.apache.org/jira/browse/HBASE-3996
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Eran Kutner
>Assignee: Eran Kutner
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 3996-v2.txt, 3996-v3.txt, HBase-3996.patch
>
>
> It seems that in many cases feeding data from multiple tables or multiple 
> scanners on a single table can save a lot of time when running map/reduce 
> jobs.
> I propose a new MultiTableInputFormat class that would allow doing this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble

2012-03-12 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228009#comment-13228009
 ] 

Ted Yu commented on HBASE-5399:
---

From test output:
{code}
Exception in thread "Thread-211" junit.framework.AssertionFailedError   at 
junit.framework.Assert.fail(Assert.java:48)
at junit.framework.Assert.fail(Assert.java:56)
at 
org.apache.hadoop.hbase.regionserver.TestAtomicOperation$2.run(TestAtomicOperation.java:392)
{code}
Here is the related code in the test:
{code}
  if (r.size() != 1) {
LOG.debug(r);
failures.incrementAndGet();
fail();
  }
{code}

> Cut the link between the client and the zookeeper ensemble
> --
>
> Key: HBASE-5399
> URL: https://issues.apache.org/jira/browse/HBASE-5399
> Project: HBase
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 0.94.0
> Environment: all
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: 5399.v27.patch, 5399.v38.patch, 5399.v39.patch, 
> 5399.v40.patch, 5399.v41.patch, 5399.v42.patch, 5399.v42.patch, 
> 5399.v42.patch, 5399.v42.patch, 5399_inprogress.patch, 
> 5399_inprogress.v14.patch, 5399_inprogress.v16.patch, 
> 5399_inprogress.v18.patch, 5399_inprogress.v20.patch, 
> 5399_inprogress.v21.patch, 5399_inprogress.v23.patch, 
> 5399_inprogress.v3.patch, 5399_inprogress.v32.patch, 
> 5399_inprogress.v9.patch, nochange.patch
>
>
> The link is often considered as an issue, for various reasons. One of them 
> being that there is a limit on the number of connection that ZK can manage. 
> Stack was suggesting as well to remove the link to master from HConnection.
> There are choices to be made considering the existing API (that we don't want 
> to break).
> The first patches I will submit on hadoop-qa should not be committed: they 
> are here to show the progress on the direction taken.
> ZooKeeper is used for:
> - public getter, to let the client do whatever he wants, and close ZooKeeper 
> when closing the connection => we have to deprecate this but keep it.
> - read get master address to create a master => now done with a temporary 
> zookeeper connection
> - read root location => now done with a temporary zookeeper connection, but 
> questionable. Used in public function "locateRegion". To be reworked.
> - read cluster id => now done once with a temporary zookeeper connection.
> - check if base done is available => now done once with a zookeeper 
> connection given as a parameter
> - isTableDisabled/isTableAvailable => public functions, now done with a 
> temporary zookeeper connection.
>  - Called internally from HBaseAdmin and HTable
> - getCurrentNrHRS(): public function to get the number of region servers and 
> create a pool of threads => now done with a temporary zookeeper connection
> -
> Master is used for:
> - getMaster public getter, as for ZooKeeper => we have to deprecate this but 
> keep it.
> - isMasterRunning(): public function, used internally by HMerge & HBaseAdmin
> - getHTableDescriptor*: public functions offering access to the master.  => 
> we could make them using a temporary master connection as well.
> Main points are:
> - hbase class for ZooKeeper; ZooKeeperWatcher is really designed for a 
> strongly coupled architecture ;-). This can be changed, but requires a lot of 
> modifications in these classes (likely adding a class in the middle of the 
> hierarchy, something like that). Anyway, non connected client will always be 
> really slower, because it's a tcp connection, and establishing a tcp 
> connection is slow.
> - having a link between ZK and all the clients seems to make sense for some 
> use cases. However, it won't scale if a TCP connection is required for every 
> client
> - if we move the table descriptor part away from the client, we need to find 
> a new place for it.
> - we will have the same issue with HBaseAdmin (for both ZK & Master); maybe we 
> can put a timeout on the connection. That would make the whole system less 
> deterministic, however.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble

2012-03-12 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227997#comment-13227997
 ] 

Ted Yu commented on HBASE-5399:
---

TestAtomicOperation failed in the latest TRUNK build:
https://builds.apache.org/job/HBase-TRUNK/2676/testReport/org.apache.hadoop.hbase.regionserver/TestAtomicOperation/testMultiRowMutationMultiThreads/

A similar failure shows up in the latest Hadoop QA run of HBASE-5542.

> Cut the link between the client and the zookeeper ensemble
> --
>
> Key: HBASE-5399
> URL: https://issues.apache.org/jira/browse/HBASE-5399
> Project: HBase
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 0.94.0
> Environment: all
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: 5399.v27.patch, 5399.v38.patch, 5399.v39.patch, 
> 5399.v40.patch, 5399.v41.patch, 5399.v42.patch, 5399.v42.patch, 
> 5399.v42.patch, 5399.v42.patch, 5399_inprogress.patch, 
> 5399_inprogress.v14.patch, 5399_inprogress.v16.patch, 
> 5399_inprogress.v18.patch, 5399_inprogress.v20.patch, 
> 5399_inprogress.v21.patch, 5399_inprogress.v23.patch, 
> 5399_inprogress.v3.patch, 5399_inprogress.v32.patch, 
> 5399_inprogress.v9.patch, nochange.patch
>
>
> The link is often considered as an issue, for various reasons. One of them 
> being that there is a limit on the number of connection that ZK can manage. 
> Stack was suggesting as well to remove the link to master from HConnection.
> There are choices to be made considering the existing API (that we don't want 
> to break).
> The first patches I will submit on hadoop-qa should not be committed: they 
> are here to show the progress on the direction taken.
> ZooKeeper is used for:
> - public getter, to let the client do whatever he wants, and close ZooKeeper 
> when closing the connection => we have to deprecate this but keep it.
> - read get master address to create a master => now done with a temporary 
> zookeeper connection
> - read root location => now done with a temporary zookeeper connection, but 
> questionable. Used in public function "locateRegion". To be reworked.
> - read cluster id => now done once with a temporary zookeeper connection.
> - check if base done is available => now done once with a zookeeper 
> connection given as a parameter
> - isTableDisabled/isTableAvailable => public functions, now done with a 
> temporary zookeeper connection.
>  - Called internally from HBaseAdmin and HTable
> - getCurrentNrHRS(): public function to get the number of region servers and 
> create a pool of threads => now done with a temporary zookeeper connection
> -
> Master is used for:
> - getMaster public getter, as for ZooKeeper => we have to deprecate this but 
> keep it.
> - isMasterRunning(): public function, used internally by HMerge & HBaseAdmin
> - getHTableDescriptor*: public functions offering access to the master.  => 
> we could make them using a temporary master connection as well.
> Main points are:
> - hbase class for ZooKeeper; ZooKeeperWatcher is really designed for a 
> strongly coupled architecture ;-). This can be changed, but requires a lot of 
> modifications in these classes (likely adding a class in the middle of the 
> hierarchy, something like that). Anyway, non connected client will always be 
> really slower, because it's a tcp connection, and establishing a tcp 
> connection is slow.
> - having a link between ZK and all the clients seems to make sense for some 
> use cases. However, it won't scale if a TCP connection is required for every 
> client
> - if we move the table descriptor part away from the client, we need to find 
> a new place for it.
> - we will have the same issue with HBaseAdmin (for both ZK & Master); maybe we 
> can put a timeout on the connection. That would make the whole system less 
> deterministic, however.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4608) HLog Compression

2012-03-12 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227984#comment-13227984
 ] 

Ted Yu commented on HBASE-4608:
---

For code-specific review, please use https://reviews.apache.org/r/4185/ where 
there is context.

I can add WAL_VERSION as v2 in the metadata.
My question is: would HLog v2 be allowed not to compress log entries?
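To make the metadata part concrete, here is a minimal sketch of what the writer side could record; the key and value strings are assumptions for illustration, not what the patch actually writes:
{code}
SequenceFile.Metadata meta = new SequenceFile.Metadata();
meta.set(new Text("WAL_VERSION"), new Text("2"));                     // hypothetical key
meta.set(new Text("WAL_COMPRESSION_TYPE"), new Text("dictionary"));   // hypothetical key
SequenceFile.Writer writer = SequenceFile.createWriter(fs, conf, path,
    HLogKey.class, WALEdit.class, SequenceFile.CompressionType.NONE, null, null, meta);
{code}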

If desirable, we can discuss in more detail, face to face, on the 27th.

> HLog Compression
> 
>
> Key: HBASE-4608
> URL: https://issues.apache.org/jira/browse/HBASE-4608
> Project: HBase
>  Issue Type: New Feature
>Reporter: Li Pi
>Assignee: Li Pi
> Fix For: 0.94.0
>
> Attachments: 4608-v19.txt, 4608-v20.txt, 4608-v22.txt, 4608v1.txt, 
> 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt, 
> 4608v18.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt
>
>
> The current bottleneck to HBase write speed is replicating the WAL appends 
> across different datanodes. We can speed up this process by compressing the 
> HLog. Current plan involves using a dictionary to compress table name, region 
> id, cf name, and possibly other bits of repeated data. Also, HLog format may 
> be changed in other ways to produce a smaller HLog.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4608) HLog Compression

2012-03-12 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227961#comment-13227961
 ] 

Ted Yu commented on HBASE-4608:
---

Uploaded v23 onto the review board.
After the WAL version metadata design is finalized, I will add that.

> HLog Compression
> 
>
> Key: HBASE-4608
> URL: https://issues.apache.org/jira/browse/HBASE-4608
> Project: HBase
>  Issue Type: New Feature
>Reporter: Li Pi
>Assignee: Li Pi
> Fix For: 0.94.0
>
> Attachments: 4608-v19.txt, 4608-v20.txt, 4608-v22.txt, 4608v1.txt, 
> 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt, 
> 4608v18.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt
>
>
> The current bottleneck to HBase write speed is replicating the WAL appends 
> across different datanodes. We can speed up this process by compressing the 
> HLog. Current plan involves using a dictionary to compress table name, region 
> id, cf name, and possibly other bits of repeated data. Also, HLog format may 
> be changed in other ways to produce a smaller HLog.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4608) HLog Compression

2012-03-12 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227954#comment-13227954
 ] 

Ted Yu commented on HBASE-4608:
---

bq. Should the Compression class in wal package ...
I only see KeyValueCompression.java under the wal package. Please elaborate on 
which class should carry more comments.

> HLog Compression
> 
>
> Key: HBASE-4608
> URL: https://issues.apache.org/jira/browse/HBASE-4608
> Project: HBase
>  Issue Type: New Feature
>Reporter: Li Pi
>Assignee: Li Pi
> Fix For: 0.94.0
>
> Attachments: 4608-v19.txt, 4608-v20.txt, 4608-v22.txt, 4608v1.txt, 
> 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt, 
> 4608v18.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt
>
>
> The current bottleneck to HBase write speed is replicating the WAL appends 
> across different datanodes. We can speed up this process by compressing the 
> HLog. Current plan involves using a dictionary to compress table name, region 
> id, cf name, and possibly other bits of repeated data. Also, HLog format may 
> be changed in other ways to produce a smaller HLog.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4608) HLog Compression

2012-03-12 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227946#comment-13227946
 ] 

Ted Yu commented on HBASE-4608:
---

I think the WAL_VERSION metadata is orthogonal to the compression type metadata, and I 
would expect both to be present in new HLog files written with this feature.
Say we define WAL_VERSION as v2, which has WAL compression capability. We still 
need to check the compression type metadata before applying dictionary compression.
In this regard, adding WAL_VERSION seems redundant.
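As a sketch of that redundancy argument (key and value names are illustrative, not taken from the patch), the reader would effectively gate decompression on the compression-type entry alone:
{code}
SequenceFile.Reader reader = new SequenceFile.Reader(fs, path, conf);
SequenceFile.Metadata meta = reader.getMetadata();
Text type = meta.get(new Text("WAL_COMPRESSION_TYPE"));   // hypothetical key
boolean useDictionary = type != null && "dictionary".equals(type.toString());
{code}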

> HLog Compression
> 
>
> Key: HBASE-4608
> URL: https://issues.apache.org/jira/browse/HBASE-4608
> Project: HBase
>  Issue Type: New Feature
>Reporter: Li Pi
>Assignee: Li Pi
> Fix For: 0.94.0
>
> Attachments: 4608-v19.txt, 4608-v20.txt, 4608-v22.txt, 4608v1.txt, 
> 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt, 
> 4608v18.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt
>
>
> The current bottleneck to HBase write speed is replicating the WAL appends 
> across different datanodes. We can speed up this process by compressing the 
> HLog. Current plan involves using a dictionary to compress table name, region 
> id, cf name, and possibly other bits of repeated data. Also, HLog format may 
> be changed in other ways to produce a smaller HLog.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4608) HLog Compression

2012-03-12 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227919#comment-13227919
 ] 

Ted Yu commented on HBASE-4608:
---

bq. Its the test of a single entry only
Please take a look at the following in the test:
{code}
for(int i = 1; i < Short.MAX_VALUE; i++){
  assertTrue(testee.findEntry(BigInteger.valueOf(i).toByteArray(), 0,
  BigInteger.valueOf(i).toByteArray().length) == -1);
}
{code}
32766 entries of the dictionary are tested.

If only the compression is expected to evolve, I think checking against the 
compression type metadata would be adequate.

> HLog Compression
> 
>
> Key: HBASE-4608
> URL: https://issues.apache.org/jira/browse/HBASE-4608
> Project: HBase
>  Issue Type: New Feature
>Reporter: Li Pi
>Assignee: Li Pi
> Fix For: 0.94.0
>
> Attachments: 4608-v19.txt, 4608-v20.txt, 4608-v22.txt, 4608v1.txt, 
> 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt, 
> 4608v18.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt
>
>
> The current bottleneck to HBase write speed is replicating the WAL appends 
> across different datanodes. We can speed up this process by compressing the 
> HLog. Current plan involves using a dictionary to compress table name, region 
> id, cf name, and possibly other bits of repeated data. Also, HLog format may 
> be changed in other ways to produce a smaller HLog.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4608) HLog Compression

2012-03-12 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227901#comment-13227901
 ] 

Ted Yu commented on HBASE-4608:
---

bq. try a paragraph of text going in and out
LRUDictionary deals with byte arrays:
{code}
  public short findEntry(byte[] data, int offset, int length) {
{code}
In this regard, piping text into the dictionary is functionally the same as piping 
the byte[] form of an integer.
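For what it's worth, a line of text could be exercised in the same test along these lines (a sketch only; it assumes, as the existing assertions do, that findEntry() adds a missing entry and returns -1 on the first call):
{code}
byte[] text = Bytes.toBytes("a paragraph of text going in and out");
assertTrue(testee.findEntry(text, 0, text.length) == -1);   // first call: miss, entry added
assertTrue(testee.findEntry(text, 0, text.length) != -1);   // second call: hit
{code}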

> HLog Compression
> 
>
> Key: HBASE-4608
> URL: https://issues.apache.org/jira/browse/HBASE-4608
> Project: HBase
>  Issue Type: New Feature
>Reporter: Li Pi
>Assignee: Li Pi
> Fix For: 0.94.0
>
> Attachments: 4608-v19.txt, 4608-v20.txt, 4608-v22.txt, 4608v1.txt, 
> 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt, 
> 4608v18.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt
>
>
> The current bottleneck to HBase write speed is replicating the WAL appends 
> across different datanodes. We can speed up this process by compressing the 
> HLog. Current plan involves using a dictionary to compress table name, region 
> id, cf name, and possibly other bits of repeated data. Also, HLog format may 
> be changed in other ways to produce a smaller HLog.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4608) HLog Compression

2012-03-12 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227892#comment-13227892
 ] 

Ted Yu commented on HBASE-4608:
---

Introducing WAL_VERSION would imply that we may change HLog aspects other than 
compression in the future.
Is there a plan for that?
Having another compression type is nice, but it requires making HLogKey persistence 
pluggable.

I think it would be better to introduce one meta entry instead of two.

> HLog Compression
> 
>
> Key: HBASE-4608
> URL: https://issues.apache.org/jira/browse/HBASE-4608
> Project: HBase
>  Issue Type: New Feature
>Reporter: Li Pi
>Assignee: Li Pi
> Fix For: 0.94.0
>
> Attachments: 4608-v19.txt, 4608-v20.txt, 4608-v22.txt, 4608v1.txt, 
> 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt, 
> 4608v18.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt
>
>
> The current bottleneck to HBase write speed is replicating the WAL appends 
> across different datanodes. We can speed up this process by compressing the 
> HLog. Current plan involves using a dictionary to compress table name, region 
> id, cf name, and possibly other bits of repeated data. Also, HLog format may 
> be changed in other ways to produce a smaller HLog.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5206) Port HBASE-5155 to 0.92 and TRUNK

2012-03-12 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227688#comment-13227688
 ] 

Ted Yu commented on HBASE-5206:
---

The following error is reproducible on a MacBook (patch for 0.92):
{code}
Tests in error: 
  org.apache.hadoop.hbase.TestDrainingServer: 
org.apache.hadoop.hbase.TableNotEnabledException: t
{code}

> Port HBASE-5155 to 0.92 and TRUNK
> -
>
> Key: HBASE-5206
> URL: https://issues.apache.org/jira/browse/HBASE-5206
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.2, 0.96.0
>Reporter: Zhihong Yu
> Attachments: 5206_92_1.patch, 5206_92_latest_1.patch, 
> 5206_trunk_1.patch, 5206_trunk_latest_1.patch
>
>
> This JIRA ports HBASE-5155 (ServerShutDownHandler And Disable/Delete should 
> not happen parallely leading to recreation of regions that were deleted) to 
> 0.92 and TRUNK

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5206) Port HBASE-5155 to 0.92 and TRUNK

2012-03-12 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227669#comment-13227669
 ] 

Ted Yu commented on HBASE-5206:
---

{code}
+  String errorMsg = "Unable to ensure that the table " + tableName
+  + "will be" + " enabled because of a ZooKeeper issue";
{code}
A space should be added between the closing quote and "will"; otherwise the table 
name runs straight into "will be".
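For instance, just the corrected concatenation:
{code}
String errorMsg = "Unable to ensure that the table " + tableName
    + " will be enabled because of a ZooKeeper issue";
{code}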

> Port HBASE-5155 to 0.92 and TRUNK
> -
>
> Key: HBASE-5206
> URL: https://issues.apache.org/jira/browse/HBASE-5206
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.2, 0.96.0
>Reporter: Zhihong Yu
> Attachments: 5206_92_1.patch, 5206_92_latest_1.patch, 
> 5206_trunk_1.patch, 5206_trunk_latest_1.patch
>
>
> This JIRA ports HBASE-5155 (ServerShutDownHandler And Disable/Delete should 
> not happen parallely leading to recreation of regions that were deleted) to 
> 0.92 and TRUNK

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5206) Port HBASE-5155 to 0.92 and TRUNK

2012-03-12 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227661#comment-13227661
 ] 

Ted Yu commented on HBASE-5206:
---

Patch v2 looks good.
Minor comments:
{code}
 // Call to undisableTable does this. TODO: Make a more formal purge table.
-am.getZKTable().setEnabledTable(Bytes.toString(tableName));
+am.getZKTable().setDeletedTable(Bytes.toString(tableName));
{code}
I don't see undisableTable. Can we remove the comment above?
{code}
+  } else if (!this.zkTable
+  .isEnabledTable(region.getTableNameAsString())) {
+setEnabledTable(region);
{code}
setEnabledTable(HRegionInfo hri) already calls zkTable.isEnabledTable(). It 
seems we can call setEnabledTable(region) directly above.
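In other words, something along these lines (a sketch of the suggested simplification, not the actual patch):
{code}
} else {
  // isEnabledTable() is re-checked inside setEnabledTable(), so the extra
  // zkTable.isEnabledTable(...) guard here can be dropped
  setEnabledTable(region);
}
{code}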

> Port HBASE-5155 to 0.92 and TRUNK
> -
>
> Key: HBASE-5206
> URL: https://issues.apache.org/jira/browse/HBASE-5206
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.2, 0.96.0
>Reporter: Zhihong Yu
> Attachments: 5206_92_1.patch, 5206_92_latest_1.patch, 
> 5206_trunk_1.patch, 5206_trunk_latest_1.patch
>
>
> This JIRA ports HBASE-5155 (ServerShutDownHandler And Disable/Delete should 
> not happen parallely leading to recreation of regions that were deleted) to 
> 0.92 and TRUNK

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5542) Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()

2012-03-12 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227569#comment-13227569
 ] 

Ted Yu commented on HBASE-5542:
---

I think we should keep the time bound; its default value can be large.

> Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()
> 
>
> Key: HBASE-5542
> URL: https://issues.apache.org/jira/browse/HBASE-5542
> Project: HBase
>  Issue Type: Improvement
>Reporter: Scott Chen
>Assignee: Scott Chen
> Fix For: 0.96.0
>
> Attachments: HBASE-5542.D2217.1.patch, HBASE-5542.D2217.2.patch, 
> HBASE-5542.D2217.3.patch, HBASE-5542.D2217.4.patch, HBASE-5542.D2217.5.patch, 
> HBASE-5542.D2217.6.patch, HBASE-5542.D2217.7.patch
>
>
> mutateRowsWithLocks() does atomic mutations on multiple rows.
> processRow() does atomic read-modify-writes on a single row.
> It will be useful to generalize both and have a
> processRowsWithLocks() that does atomic read-modify-writes on multiple rows.
> This also helps reduce some redundancy in the codes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5174) Coalesce aborted tasks in the TaskMonitor

2012-01-15 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186527#comment-13186527
 ] 

Ted Yu commented on HBASE-5174:
---

A slight variation of my previous proposal:
MonitoredTaskImpl can maintain a Map whose String key is the description passed to 
TaskMonitor.createStatus(), prepended with the MonitoredTask.State and a separator 
string (such as '||').

A task may have two entries in the map, one starting with 'ABORTED', the other 
starting with 'COMPLETE'. This corresponds to task retries.
Special handling would be added to MonitoredTaskImpl.setState().
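A rough sketch of that shape (field and method names are illustrative only, not from a patch):
{code}
// key = state + separator + description, e.g. "ABORTED||Flushing test1,,1326223218996..."
private final Map<String, MonitoredTask> coalescedTasks =
    new HashMap<String, MonitoredTask>();

private static String taskKey(MonitoredTask.State state, String description) {
  return state + "||" + description;
}
{code}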

> Coalesce aborted tasks in the TaskMonitor
> -
>
> Key: HBASE-5174
> URL: https://issues.apache.org/jira/browse/HBASE-5174
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.92.0
>Reporter: Jean-Daniel Cryans
> Fix For: 0.94.0, 0.92.1
>
>
> Some tasks can get repeatedly canceled, like flushing when splitting is going 
> on; in the logs it looks like this:
> {noformat}
> 2012-01-10 19:28:29,164 INFO 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush of region 
> test1,,1326223218996.3eea0d89af7b851c3a9b4246389a4f2c. due to global heap 
> pressure
> 2012-01-10 19:28:29,164 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> NOT flushing memstore for region 
> test1,,1326223218996.3eea0d89af7b851c3a9b4246389a4f2c., flushing=false, 
> writesEnabled=false
> 2012-01-10 19:28:29,164 DEBUG 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up 
> because memory above low water=1.6g
> 2012-01-10 19:28:29,164 INFO 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush of region 
> test1,,1326223218996.3eea0d89af7b851c3a9b4246389a4f2c. due to global heap 
> pressure
> 2012-01-10 19:28:29,164 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> NOT flushing memstore for region 
> test1,,1326223218996.3eea0d89af7b851c3a9b4246389a4f2c., flushing=false, 
> writesEnabled=false
> 2012-01-10 19:28:29,164 DEBUG 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up 
> because memory above low water=1.6g
> 2012-01-10 19:28:29,164 INFO 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush of region 
> test1,,1326223218996.3eea0d89af7b851c3a9b4246389a4f2c. due to global heap 
> pressure
> 2012-01-10 19:28:29,164 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> NOT flushing memstore for region 
> test1,,1326223218996.3eea0d89af7b851c3a9b4246389a4f2c., flushing=false, 
> writesEnabled=false
> {noformat}
> But in the TaskMonitor UI you'll get MAX_TASKS (1000) displayed on top of the 
> regions. Basically 1000x:
> {noformat}
> Tue Jan 10 19:28:29 UTC 2012  Flushing 
> test1,,1326223218996.3eea0d89af7b851c3a9b4246389a4f2c. ABORTED (since 31sec 
> ago)   Not flushing since writes not enabled (since 31sec ago)
> {noformat}
> It's ugly and I'm sure some users will freak out seeing this, plus you have 
> to scroll down all the way to see your regions. Coalescing consecutive 
> aborted tasks seems like a good solution.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5201) Add TThreadedSelectorServer and remove repeat codes in ThriftServer and HRegionThriftServer

2012-01-15 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186520#comment-13186520
 ] 

Ted Yu commented on HBASE-5201:
---

@Scott:
I think you need to mark your work complete before seeing the 'Submit Patch' 
button.
After you click 'Submit Patch', future attachments will be picked up by Hadoop 
QA.

> Add TThreadedSelectorServer and remove repeat codes in ThriftServer and 
> HRegionThriftServer
> ---
>
> Key: HBASE-5201
> URL: https://issues.apache.org/jira/browse/HBASE-5201
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Reporter: Scott Chen
>Assignee: Scott Chen
> Fix For: 0.94.0
>
> Attachments: HBASE-5201-v2.txt, HBASE-5201-v3.txt, HBASE-5201.txt
>
>
> TThreadedSelectorServer is good for RPC-heavy situations because IO is not 
> limited to one CPU. See
> https://issues.apache.org/jira/browse/Thrift-1167
> I am porting the related classes from thrift trunk (they are not there in 
> thrift-0.7.0).
> There is a lot of repeated code in ThriftServer and HRegionThriftServer.
> This code is now moved to a Runnable called ThriftServerRunner.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4120) isolation and allocation

2011-12-13 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13168257#comment-13168257
 ] 

Ted Yu commented on HBASE-4120:
---

I think the two failed tests should be modified: more coprocessors will be loaded 
in the future, so it is desirable to make the assertions deterministic.

> isolation and allocation
> 
>
> Key: HBASE-4120
> URL: https://issues.apache.org/jira/browse/HBASE-4120
> Project: HBase
>  Issue Type: New Feature
>  Components: master, regionserver
>Affects Versions: 0.90.2, 0.90.3, 0.90.4, 0.92.0
>Reporter: Liu Jia
>Assignee: Liu Jia
> Fix For: 0.94.0
>
> Attachments: Design_document_for_HBase_isolation_and_allocation.pdf, 
> Design_document_for_HBase_isolation_and_allocation_Revised.pdf, 
> HBase_isolation_and_allocation_user_guide.pdf, 
> Performance_of_Table_priority.pdf, System Structure.jpg, TablePriority.patch, 
> TablePriority_v12.patch, TablePriority_v12.patch, 
> TablePriority_v15_with_coprocessor.patch, TablePriority_v8.patch, 
> TablePriority_v8.patch, TablePriority_v8_for_trunk.patch, 
> TablePrioriy_v9.patch
>
>
> The HBase isolation and allocation tool is designed to help users manage 
> cluster resources among different applications and tables.
> When we have a large-scale HBase cluster with many applications running on it, 
> there will be lots of problems. In Taobao there is a cluster for many 
> departments to test their applications' performance; these applications are 
> based on HBase. With one cluster of 12 servers, only one application can run 
> exclusively on it, and many other applications must wait until the previous 
> test has finished.
> After we add the allocation management function to the cluster, applications 
> can share the cluster and run concurrently. Also, if the test engineer wants 
> to make sure there is no interference, he/she can move other tables out of 
> this group.
> Within groups we use table priority to allocate resources; when the system is 
> busy, we can make sure high-priority tables are not affected by lower-priority 
> tables.
> Different groups can have different region server configurations: groups 
> optimized for reading can have a large block cache size, and others optimized 
> for writing can have a large memstore size.
> Tables and region servers can be moved easily between groups; after changing 
> the configuration, a group can be restarted alone instead of restarting the 
> whole cluster.
> git entry: https://github.com/ICT-Ope/HBase_allocation
> We hope our work is helpful.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5001) Improve the performance of block cache keys

2011-12-12 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13167947#comment-13167947
 ] 

Ted Yu commented on HBASE-5001:
---

In LruBlockCache, please change the javadoc for the cacheKey parameter.

> Improve the performance of block cache keys
> ---
>
> Key: HBASE-5001
> URL: https://issues.apache.org/jira/browse/HBASE-5001
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.90.4
>Reporter: Jean-Daniel Cryans
>Assignee: Lars Hofhansl
>Priority: Minor
> Fix For: 0.94.0
>
> Attachments: 5001-v1.txt
>
>
> Doing a pure random read test on data that's 100% block cache, I see that we 
> are spending quite some time in getBlockCacheKey:
> {quote}
> "IPC Server handler 19 on 62023" daemon prio=10 tid=0x7fe0501ff800 
> nid=0x6c87 runnable [0x7fe0577f6000]
>java.lang.Thread.State: RUNNABLE
>   at java.util.Arrays.copyOf(Arrays.java:2882)
>   at 
> java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
>   at 
> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)
>   at java.lang.StringBuilder.append(StringBuilder.java:119)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFile.getBlockCacheKey(HFile.java:457)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:249)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.seekToDataBlock(HFileBlockIndex.java:209)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.seekTo(HFileReaderV2.java:521)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.seekTo(HFileReaderV2.java:536)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:178)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:111)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekExactly(StoreFileScanner.java:219)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.(StoreScanner.java:80)
>   at 
> org.apache.hadoop.hbase.regionserver.Store.getScanner(Store.java:1689)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.(HRegion.java:2857)
> {quote}
> Since the HFile name size is known and the offset is a long, it should be 
> possible to allocate exactly what we need. Maybe use byte[] as the key and 
> drop the separator too.
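Just to illustrate that suggestion (a sketch, not the committed change): the key can be sized exactly as the HFile name bytes plus the 8-byte offset, with no separator string:
{code}
static byte[] getBlockCacheKey(String hfileName, long offset) {
  byte[] nameBytes = Bytes.toBytes(hfileName);
  byte[] key = new byte[nameBytes.length + Bytes.SIZEOF_LONG];
  System.arraycopy(nameBytes, 0, key, 0, nameBytes.length);
  Bytes.putLong(key, nameBytes.length, offset);
  return key;
}
{code}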

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4951) master process can not be stopped when it is initializing

2011-12-12 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13167581#comment-13167581
 ] 

Ted Yu commented on HBASE-4951:
---

+1 on patch. 

> master process can not be stopped when it is initializing
> -
>
> Key: HBASE-4951
> URL: https://issues.apache.org/jira/browse/HBASE-4951
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.90.3
>Reporter: xufeng
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 0.90.6
>
> Attachments: HBASE-4951.patch
>
>
> It is easy to reproduce with the following steps:
> Step 1: start the master process (do not start any regionserver process in the 
> cluster). The master will wait for regionservers to check in:
> org.apache.hadoop.hbase.master.ServerManager: Waiting on regionserver(s) to 
> checkin
> Step 2: stop the master with the command bin/hbase master stop.
> Result: the master process will never die because the catalogTracker.waitForRoot() 
> method will block until the root region is assigned.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4994) TestHeapSize broke in trunk

2011-12-09 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13166405#comment-13166405
 ] 

Ted Yu commented on HBASE-4994:
---

Thanks for fixing this. 
Hadoop QA didn't report this.

> TestHeapSize broke in trunk
> ---
>
> Key: HBASE-4994
> URL: https://issues.apache.org/jira/browse/HBASE-4994
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
> Fix For: 0.92.0
>
> Attachments: heapsize.txt
>
>
> This commit added Map to HRegion
> {code}
> commit 888d73a9f5fe907f7c616211322fff339eeaa446
> Author: Zhihong Yu 
> Date:   Fri Dec 9 06:01:58 2011 +
> HBASE-4946  HTable.coprocessorExec (and possibly coprocessorProxy) does 
> not work with
>dynamically loaded coprocessors (Andrei Dragomir)
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4120) isolation and allocation

2011-12-08 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165876#comment-13165876
 ] 

Ted Yu commented on HBASE-4120:
---

In PriorityFunction.java, several lines are much wider than 80 characters.
Please use the formatter from HBASE-3678 on the new files.

> isolation and allocation
> 
>
> Key: HBASE-4120
> URL: https://issues.apache.org/jira/browse/HBASE-4120
> Project: HBase
>  Issue Type: New Feature
>  Components: master, regionserver
>Affects Versions: 0.90.2, 0.90.3, 0.90.4, 0.92.0
>Reporter: Liu Jia
>Assignee: Liu Jia
> Fix For: 0.94.0
>
> Attachments: Design_document_for_HBase_isolation_and_allocation.pdf, 
> Design_document_for_HBase_isolation_and_allocation_Revised.pdf, 
> HBase_isolation_and_allocation_user_guide.pdf, 
> Performance_of_Table_priority.pdf, System Structure.jpg, TablePriority.patch, 
> TablePriority_v12.patch, TablePriority_v12.patch, TablePriority_v8.patch, 
> TablePriority_v8.patch, TablePriority_v8_for_trunk.patch, 
> TablePrioriy_v9.patch
>
>
> The HBase isolation and allocation tool is designed to help users manage 
> cluster resources among different applications and tables.
> When we have a large-scale HBase cluster with many applications running on 
> it, there will be lots of problems. In Taobao there is a cluster used by many 
> departments to test the performance of their applications, which are based 
> on HBase. With one cluster of 12 servers, only one application can run 
> exclusively on it at a time, and many other applications must wait until the 
> previous test has finished.
> After we add the allocation management function to the cluster, applications 
> can share the cluster and run concurrently. Also, if the test engineer wants 
> to make sure there is no interference, he/she can move other tables out of 
> this group.
> In groups we use table priority to allocate resources; when the system is 
> busy, we can make sure high-priority tables are not affected by 
> lower-priority tables.
> Different groups can have different region server configurations, some groups 
> optimized for reading can have large block cache size, and others optimized 
> for writing can have large memstore size. 
> Tables and region servers can be moved easily between groups; after changing 
> the configuration, a group can be restarted alone instead of restarting the 
> whole cluster.
> git entry : https://github.com/ICT-Ope/HBase_allocation .
> We hope our work is helpful.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4946) HTable.coprocessorExec (and possibly coprocessorProxy) does not work with dynamically loaded coprocessors (from hdfs or local system), because the RPC system tries to d

2011-12-08 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165870#comment-13165870
 ] 

Ted Yu commented on HBASE-4946:
---

Integrated to 0.92 and TRUNK.

Thanks for the patch Andrei.

Thanks for the review Lars and Stack.

> HTable.coprocessorExec (and possibly coprocessorProxy) does not work with 
> dynamically loaded coprocessors (from hdfs or local system), because the RPC 
> system tries to deserialize an unknown class. 
> -
>
> Key: HBASE-4946
> URL: https://issues.apache.org/jira/browse/HBASE-4946
> Project: HBase
>  Issue Type: Bug
>  Components: coprocessors
>Affects Versions: 0.92.0
>Reporter: Andrei Dragomir
>Assignee: Andrei Dragomir
> Fix For: 0.92.0, 0.94.0
>
> Attachments: 4946-v4.txt, 4946-v5.txt, HBASE-4946-v2.patch, 
> HBASE-4946-v3.patch, HBASE-4946.patch
>
>
> Loading coprocessors jars from hdfs works fine. I load it from the shell, 
> after setting the attribute, and it gets loaded:
> {noformat}
> INFO org.apache.hadoop.hbase.regionserver.HRegion: Setting up tabledescriptor 
> config now ...
> INFO org.apache.hadoop.hbase.coprocessor.CoprocessorHost: Class 
> com.MyCoprocessorClass needs to be loaded from a file - 
> hdfs://localhost:9000/coproc/rt-  >0.0.1-SNAPSHOT.jar.
> INFO org.apache.hadoop.hbase.coprocessor.CoprocessorHost: loadInstance: 
> com.MyCoprocessorClass
> INFO org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost: 
> RegionEnvironment createEnvironment
> DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Registered protocol 
> handler: region=t1,,1322572939753.6409aee1726d31f5e5671a59fe6e384f. 
> protocol=com.MyCoprocessorClassProtocol
> INFO org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost: Load 
> coprocessor com.MyCoprocessorClass from HTD of t1 successfully.
> {noformat}
> The problem is that this coprocessor simply extends BaseEndpointCoprocessor, 
> with a dynamic method. When calling this method from the client with 
> HTable.coprocessorExec, I get errors on the HRegionServer, because the call 
> cannot be deserialized from writables. 
> The problem is that Exec tries to do an "early" resolve of the coprocessor 
> class. The coprocessor class is loaded, but it is in the context of the 
> HRegionServer / HRegion. So, the call fails:
> {noformat}
> 2011-12-02 00:34:17,348 ERROR org.apache.hadoop.hbase.io.HbaseObjectWritable: 
> Error in readFields
> java.io.IOException: Protocol class com.MyCoprocessorClassProtocol not found
>   at org.apache.hadoop.hbase.client.coprocessor.Exec.readFields(Exec.java:125)
>   at 
> org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:575)
>   at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:105)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1237)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1167)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:703)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:495)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:470)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:680)
> Caused by: java.lang.ClassNotFoundException: com.MyCoprocessorClassProtocol
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:247)
>   at 
> org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:943)
>   at org.apache.hadoop.hbase.client.coprocessor.Exec.readFields(Exec.java:122)
>   ... 10 more
> {noformat}
> Probably the correct way to fix this is to make Exec really smart, so that it 
> knows all the class definitions loaded in CoprocessorHost(s).
> I created a small patch that simply doesn't resolve the class definition in 
> the Exec, instead passing it as a string down to the HRegion layer. This layer 
> knows all the definitions, and simply loads it by name. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-4120) isolation and allocation

2011-12-08 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165857#comment-13165857
 ] 

Ted Yu commented on HBASE-4120:
---

Please add license to QosRegionObserver.java

I think we should enhance ScannerListener.leaseExpired() with 
pre/postScannerClose() so that features such as table priority can receive 
consistent notifications.
The trick here is that RegionScanner only exposes HRegionInfo. We should be 
able to use onlineRegions to look up the HRegion by region name.
Then we should be able to call the following:
{code}
if (region != null && region.getCoprocessorHost() != null) {
  region.getCoprocessorHost().postScannerClose(s);
}
{code}
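A rough sketch of that lookup inside the lease-expiration path (the getFromOnlineRegions accessor name is an assumption for illustration; only the postScannerClose call above is from the existing API):
{code}
// Hypothetical sketch: when a scanner lease expires, locate the owning region
// through the region server's online-regions map and fire the coprocessor hook.
HRegionInfo regionInfo = s.getRegionInfo();            // s is the expiring RegionScanner
HRegion region = getFromOnlineRegions(regionInfo.getEncodedName());
if (region != null && region.getCoprocessorHost() != null) {
  region.getCoprocessorHost().postScannerClose(s);
}
{code}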

> isolation and allocation
> 
>
> Key: HBASE-4120
> URL: https://issues.apache.org/jira/browse/HBASE-4120
> Project: HBase
>  Issue Type: New Feature
>  Components: master, regionserver
>Affects Versions: 0.90.2, 0.90.3, 0.90.4, 0.92.0
>Reporter: Liu Jia
>Assignee: Liu Jia
> Fix For: 0.94.0
>
> Attachments: Design_document_for_HBase_isolation_and_allocation.pdf, 
> Design_document_for_HBase_isolation_and_allocation_Revised.pdf, 
> HBase_isolation_and_allocation_user_guide.pdf, 
> Performance_of_Table_priority.pdf, System Structure.jpg, TablePriority.patch, 
> TablePriority_v12.patch, TablePriority_v12.patch, TablePriority_v8.patch, 
> TablePriority_v8.patch, TablePriority_v8_for_trunk.patch, 
> TablePrioriy_v9.patch
>
>
> The HBase isolation and allocation tool is designed to help users manage 
> cluster resources among different applications and tables.
> When we have a large-scale HBase cluster with many applications running on 
> it, there will be lots of problems. In Taobao there is a cluster used by many 
> departments to test the performance of their applications, which are based 
> on HBase. With one cluster of 12 servers, only one application can run 
> exclusively on it at a time, and many other applications must wait until the 
> previous test has finished.
> After we add the allocation management function to the cluster, applications 
> can share the cluster and run concurrently. Also, if the test engineer wants 
> to make sure there is no interference, he/she can move other tables out of 
> this group.
> In groups we use table priority to allocate resources; when the system is 
> busy, we can make sure high-priority tables are not affected by 
> lower-priority tables.
> Different groups can have different region server configurations, some groups 
> optimized for reading can have large block cache size, and others optimized 
> for writing can have large memstore size. 
> Tables and region servers can be moved easily between groups; after changing 
> the configuration, a group can be restarted alone instead of restarting the 
> whole cluster.
> git entry : https://github.com/ICT-Ope/HBase_allocation .
> We hope our work is helpful.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4946) HTable.coprocessorExec (and possibly coprocessorProxy) does not work with dynamically loaded coprocessors (from hdfs or local system), because the RPC system tries to d

2011-12-08 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165839#comment-13165839
 ] 

Ted Yu commented on HBASE-4946:
---

Will commit patch v4 tomorrow.

> HTable.coprocessorExec (and possibly coprocessorProxy) does not work with 
> dynamically loaded coprocessors (from hdfs or local system), because the RPC 
> system tries to deserialize an unknown class. 
> -
>
> Key: HBASE-4946
> URL: https://issues.apache.org/jira/browse/HBASE-4946
> Project: HBase
>  Issue Type: Bug
>  Components: coprocessors
>Affects Versions: 0.92.0
>Reporter: Andrei Dragomir
>Assignee: Andrei Dragomir
> Attachments: 4946-v4.txt, HBASE-4946-v2.patch, HBASE-4946-v3.patch, 
> HBASE-4946.patch
>
>
> Loading coprocessors jars from hdfs works fine. I load it from the shell, 
> after setting the attribute, and it gets loaded:
> {noformat}
> INFO org.apache.hadoop.hbase.regionserver.HRegion: Setting up tabledescriptor 
> config now ...
> INFO org.apache.hadoop.hbase.coprocessor.CoprocessorHost: Class 
> com.MyCoprocessorClass needs to be loaded from a file - 
> hdfs://localhost:9000/coproc/rt-  >0.0.1-SNAPSHOT.jar.
> INFO org.apache.hadoop.hbase.coprocessor.CoprocessorHost: loadInstance: 
> com.MyCoprocessorClass
> INFO org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost: 
> RegionEnvironment createEnvironment
> DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Registered protocol 
> handler: region=t1,,1322572939753.6409aee1726d31f5e5671a59fe6e384f. 
> protocol=com.MyCoprocessorClassProtocol
> INFO org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost: Load 
> coprocessor com.MyCoprocessorClass from HTD of t1 successfully.
> {noformat}
> The problem is that this coprocessor simply extends BaseEndpointCoprocessor, 
> with a dynamic method. When calling this method from the client with 
> HTable.coprocessorExec, I get errors on the HRegionServer, because the call 
> cannot be deserialized from writables. 
> The problem is that Exec tries to do an "early" resolve of the coprocessor 
> class. The coprocessor class is loaded, but it is in the context of the 
> HRegionServer / HRegion. So, the call fails:
> {noformat}
> 2011-12-02 00:34:17,348 ERROR org.apache.hadoop.hbase.io.HbaseObjectWritable: 
> Error in readFields
> java.io.IOException: Protocol class com.MyCoprocessorClassProtocol not found
>   at org.apache.hadoop.hbase.client.coprocessor.Exec.readFields(Exec.java:125)
>   at 
> org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:575)
>   at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:105)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1237)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1167)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:703)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:495)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:470)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:680)
> Caused by: java.lang.ClassNotFoundException: com.MyCoprocessorClassProtocol
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:247)
>   at 
> org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:943)
>   at org.apache.hadoop.hbase.client.coprocessor.Exec.readFields(Exec.java:122)
>   ... 10 more
> {noformat}
> Probably the correct way to fix this is to make Exec really smart, so that it 
> knows all the class definitions loaded in CoprocessorHost(s).
> I created a small patch that simply doesn't resolve the class definition in 
> the Exec, instead passing it as a string down to the HRegion layer. This layer 
> knows all the definitions, and simply loads it by name. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-4880) Region is on service before openRegionHandler completes, may cause data loss

2011-12-08 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165837#comment-13165837
 ] 

Ted Yu commented on HBASE-4880:
---

+1 on patch v4.

> Region is on service before openRegionHandler completes, may cause data loss
> 
>
> Key: HBASE-4880
> URL: https://issues.apache.org/jira/browse/HBASE-4880
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0, 0.94.0
>Reporter: chunhui shen
>Assignee: chunhui shen
> Attachments: 4880.txt, hbase-4880.patch, hbase-4880v2.patch, 
> hbase-4880v3.patch, hbase-4880v4.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that the region is on service before step 4.
> It means a client could put data into this region after step 3.
> What will happen if step 4 fails?
> It will execute OpenRegionHandler#cleanupFailedOpen, which closes the 
> region, and the master assigns this region to another regionserver.
> If closing the region fails, the data put between step 3 and step 4 
> may be lost, because the region has been opened on another regionserver and 
> new data has been put to it. Therefore, that data may not be recovered through 
> replayRecoveredEdit() because the edit's LogSeqId is smaller than the current 
> region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4682) Support deleted rows using Import/Export

2011-12-07 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165048#comment-13165048
 ] 

Ted Yu commented on HBASE-4682:
---

In the new Delete constructor, the row key should be included in the IOException message. 
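Purely as illustration (not the patch itself), the message could render the key with Bytes.toStringBinary; the surrounding check and the kv variable here are hypothetical:
{code}
// Hypothetical validation fragment: surface the offending row key in the exception.
if (!Bytes.equals(row, kv.getRow())) {
  throw new IOException("The row in KeyValue " + Bytes.toStringBinary(kv.getRow())
      + " doesn't match the original row " + Bytes.toStringBinary(row));
}
{code}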

> Support deleted rows using Import/Export
> 
>
> Key: HBASE-4682
> URL: https://issues.apache.org/jira/browse/HBASE-4682
> Project: HBase
>  Issue Type: Sub-task
>  Components: mapreduce
>Affects Versions: 0.94.0
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0
>
> Attachments: 4682-v1.txt
>
>
> Parent allows keeping deleted rows around. Would be nice if those could be 
> exported and imported as well.
> All the building blocks are there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4980) Null pointer exception in HBaseClient receiveResponse

2011-12-07 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165031#comment-13165031
 ] 

Ted Yu commented on HBASE-4980:
---

Please use --no-prefix (i.e. git diff --no-prefix) to generate the patch. 

> Null pointer exception in HBaseClient receiveResponse
> -
>
> Key: HBASE-4980
> URL: https://issues.apache.org/jira/browse/HBASE-4980
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.92.0
>Reporter: Shrijeet Paliwal
>  Labels: newbie
> Attachments: 
> 0001-HBASE-4980-Fix-NPE-in-HBaseClient-receiveResponse.patch, 
> 0002-HBASE-4980-Fix-NPE-in-HBaseClient-receiveResponse.patch
>
>
> Relevant Stack trace: 
> 2011-11-30 13:10:26,557 [IPC Client (47) connection to 
> xx.xx.xx/172.22.4.68:60020 from an unknown user] WARN  
> org.apache.hadoop.ipc.HBaseClient - Unexpected exception receiving call 
> responses
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.receiveResponse(HBaseClient.java:583)
> at 
> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:511)
> {code}
>   if (LOG.isDebugEnabled())
>   LOG.debug(getName() + " got value #" + id);
> Call call = calls.remove(id);
> // Read the flag byte
> byte flag = in.readByte();
> boolean isError = ResponseFlag.isError(flag);
> if (ResponseFlag.isLength(flag)) {
>   // Currently length if present is unused.
>   in.readInt();
> }
> int state = in.readInt(); // Read the state.  Currently unused.
> if (isError) {
>   //noinspection ThrowableInstanceNeverThrown
>   call.setException(new RemoteException( WritableUtils.readString(in),
>   WritableUtils.readString(in)));
> } else {
> {code}
> This line {code}Call call = calls.remove(id);{code} may return a null 
> 'call'. This is because, if you have rpc timeout enabled, we proactively clean 
> up other calls which have expired their lifetime along with the call for 
> which the socket timeout exception happened.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4980) Null pointer exception in HBaseClient receiveResponse

2011-12-07 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165025#comment-13165025
 ] 

Ted Yu commented on HBASE-4980:
---

Please attach the patch for trunk last so that HadoopQA can test it. 

> Null pointer exception in HBaseClient receiveResponse
> -
>
> Key: HBASE-4980
> URL: https://issues.apache.org/jira/browse/HBASE-4980
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.92.0
>Reporter: Shrijeet Paliwal
>  Labels: newbie
> Attachments: 
> 0001-HBASE-4980-Fix-NPE-in-HBaseClient-receiveResponse.patch, 
> 0002-HBASE-4980-Fix-NPE-in-HBaseClient-receiveResponse.patch
>
>
> Relevant Stack trace: 
> 2011-11-30 13:10:26,557 [IPC Client (47) connection to 
> xx.xx.xx/172.22.4.68:60020 from an unknown user] WARN  
> org.apache.hadoop.ipc.HBaseClient - Unexpected exception receiving call 
> responses
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.receiveResponse(HBaseClient.java:583)
> at 
> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:511)
> {code}
>   if (LOG.isDebugEnabled())
>   LOG.debug(getName() + " got value #" + id);
> Call call = calls.remove(id);
> // Read the flag byte
> byte flag = in.readByte();
> boolean isError = ResponseFlag.isError(flag);
> if (ResponseFlag.isLength(flag)) {
>   // Currently length if present is unused.
>   in.readInt();
> }
> int state = in.readInt(); // Read the state.  Currently unused.
> if (isError) {
>   //noinspection ThrowableInstanceNeverThrown
>   call.setException(new RemoteException( WritableUtils.readString(in),
>   WritableUtils.readString(in)));
> } else {
> {code}
> This line {code}Call call = calls.remove(id);{code} may return a null 
> 'call'. This is because, if you have rpc timeout enabled, we proactively clean 
> up other calls which have expired their lifetime along with the call for 
> which the socket timeout exception happened.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4980) Null pointer exception in HBaseClient receiveResponse

2011-12-07 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165021#comment-13165021
 ] 

Ted Yu commented on HBASE-4980:
---

The patch changes the semantics of the original error handling. 
I think we shouldn't leave data unread in the input stream. 
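A sketch of that direction (not the committed fix): drain the error payload from the stream first, and only then decide whether there is still a call to attach it to. Variable names follow the snippet quoted below.
{code}
// Hypothetical sketch: read the error strings unconditionally so the stream
// stays correctly positioned, then attach them only if the call still exists.
if (isError) {
  String exceptionClass = WritableUtils.readString(in);
  String exceptionMessage = WritableUtils.readString(in);
  if (call != null) {
    call.setException(new RemoteException(exceptionClass, exceptionMessage));
  }
}
{code}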

> Null pointer exception in HBaseClient receiveResponse
> -
>
> Key: HBASE-4980
> URL: https://issues.apache.org/jira/browse/HBASE-4980
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.92.0
>Reporter: Shrijeet Paliwal
>  Labels: newbie
> Attachments: 
> 0001-HBASE-4980-Fix-NPE-in-HBaseClient-receiveResponse.patch
>
>
> Relevant Stack trace: 
> 2011-11-30 13:10:26,557 [IPC Client (47) connection to 
> xx.xx.xx/172.22.4.68:60020 from an unknown user] WARN  
> org.apache.hadoop.ipc.HBaseClient - Unexpected exception receiving call 
> responses
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.receiveResponse(HBaseClient.java:583)
> at 
> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:511)
> {code}
>   if (LOG.isDebugEnabled())
>   LOG.debug(getName() + " got value #" + id);
> Call call = calls.remove(id);
> // Read the flag byte
> byte flag = in.readByte();
> boolean isError = ResponseFlag.isError(flag);
> if (ResponseFlag.isLength(flag)) {
>   // Currently length if present is unused.
>   in.readInt();
> }
> int state = in.readInt(); // Read the state.  Currently unused.
> if (isError) {
>   //noinspection ThrowableInstanceNeverThrown
>   call.setException(new RemoteException( WritableUtils.readString(in),
>   WritableUtils.readString(in)));
> } else {
> {code}
> This line {code}Call call = calls.remove(id);{code} may return a null 
> 'call'. This is because, if you have rpc timeout enabled, we proactively clean 
> up other calls which have expired their lifetime along with the call for 
> which the socket timeout exception happened.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4120) isolation and allocation

2011-12-07 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165017#comment-13165017
 ] 

Ted Yu commented on HBASE-4120:
---

Looks like we should expose scanner lease expiration through a new coprocessor 
API. 

> isolation and allocation
> 
>
> Key: HBASE-4120
> URL: https://issues.apache.org/jira/browse/HBASE-4120
> Project: HBase
>  Issue Type: New Feature
>  Components: master, regionserver
>Affects Versions: 0.90.2, 0.90.3, 0.90.4, 0.92.0
>Reporter: Liu Jia
>Assignee: Liu Jia
> Fix For: 0.94.0
>
> Attachments: Design_document_for_HBase_isolation_and_allocation.pdf, 
> Design_document_for_HBase_isolation_and_allocation_Revised.pdf, 
> HBase_isolation_and_allocation_user_guide.pdf, 
> Performance_of_Table_priority.pdf, System Structure.jpg, TablePriority.patch, 
> TablePriority_v12.patch, TablePriority_v12.patch, TablePriority_v8.patch, 
> TablePriority_v8.patch, TablePriority_v8_for_trunk.patch, 
> TablePrioriy_v9.patch
>
>
> The HBase isolation and allocation tool is designed to help users manage 
> cluster resources among different applications and tables.
> When we have a large-scale HBase cluster with many applications running on 
> it, there will be lots of problems. In Taobao there is a cluster used by many 
> departments to test the performance of their applications, which are based 
> on HBase. With one cluster of 12 servers, only one application can run 
> exclusively on it at a time, and many other applications must wait until the 
> previous test has finished.
> After we add the allocation management function to the cluster, applications 
> can share the cluster and run concurrently. Also, if the test engineer wants 
> to make sure there is no interference, he/she can move other tables out of 
> this group.
> In groups we use table priority to allocate resources; when the system is 
> busy, we can make sure high-priority tables are not affected by 
> lower-priority tables.
> Different groups can have different region server configurations, some groups 
> optimized for reading can have large block cache size, and others optimized 
> for writing can have large memstore size. 
> Tables and region servers can be moved easily between groups; after changing 
> the configuration, a group can be restarted alone instead of restarting the 
> whole cluster.
> git entry : https://github.com/ICT-Ope/HBase_allocation .
> We hope our work is helpful.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4120) isolation and allocation

2011-12-07 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164960#comment-13164960
 ] 

Ted Yu commented on HBASE-4120:
---

But preScannerClose() is only passed one scanner. 

> isolation and allocation
> 
>
> Key: HBASE-4120
> URL: https://issues.apache.org/jira/browse/HBASE-4120
> Project: HBase
>  Issue Type: New Feature
>  Components: master, regionserver
>Affects Versions: 0.90.2, 0.90.3, 0.90.4, 0.92.0
>Reporter: Liu Jia
>Assignee: Liu Jia
> Fix For: 0.94.0
>
> Attachments: Design_document_for_HBase_isolation_and_allocation.pdf, 
> Design_document_for_HBase_isolation_and_allocation_Revised.pdf, 
> HBase_isolation_and_allocation_user_guide.pdf, 
> Performance_of_Table_priority.pdf, System Structure.jpg, TablePriority.patch, 
> TablePriority_v12.patch, TablePriority_v12.patch, TablePriority_v8.patch, 
> TablePriority_v8.patch, TablePriority_v8_for_trunk.patch, 
> TablePrioriy_v9.patch
>
>
> The HBase isolation and allocation tool is designed to help users manage 
> cluster resources among different applications and tables.
> When we have a large-scale HBase cluster with many applications running on 
> it, there will be lots of problems. In Taobao there is a cluster used by many 
> departments to test the performance of their applications, which are based 
> on HBase. With one cluster of 12 servers, only one application can run 
> exclusively on it at a time, and many other applications must wait until the 
> previous test has finished.
> After we add the allocation management function to the cluster, applications 
> can share the cluster and run concurrently. Also, if the test engineer wants 
> to make sure there is no interference, he/she can move other tables out of 
> this group.
> In groups we use table priority to allocate resources; when the system is 
> busy, we can make sure high-priority tables are not affected by 
> lower-priority tables.
> Different groups can have different region server configurations, some groups 
> optimized for reading can have large block cache size, and others optimized 
> for writing can have large memstore size. 
> Tables and region servers can be moved easily between groups; after changing 
> the configuration, a group can be restarted alone instead of restarting the 
> whole cluster.
> git entry : https://github.com/ICT-Ope/HBase_allocation .
> We hope our work is helpful.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4927) CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty

2011-12-07 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164699#comment-13164699
 ] 

Ted Yu commented on HBASE-4927:
---

Also ran the two tests from the original patch:
{code}
mt -Dtest=TestHRegionInfo
mt -Dtest=TestCatalogJanitor
{code}
They passed as well.

Integrated to 0.92 and TRUNK.

Thanks for the addendum, Jimmy.

Thanks for the help, Jonathan.

> CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the 
> last region when the endkey is empty
> ---
>
> Key: HBASE-4927
> URL: https://issues.apache.org/jira/browse/HBASE-4927
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.92.0, 0.94.0
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Fix For: 0.92.0
>
> Attachments: 
> 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch, 
> 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch, 
> 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator-.patch, 
> 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator_v2.patch, 
> hbase-4927-fix-ws.txt
>
>
> When reviewing HBASE-4238 backporting, Jon found this issue.
> What happens if the split points are  (empty end key is the last key, empty 
> start key is the first key)
> Parent [A,)
> L daughter [A,B), 
> R daughter [B,)
> When sorted, we get to the end key comparison, which results in this incorrect 
> order:
> [A,B), [A,), [B,) 
> we wanted:
> [A,), [A,B), [B,)
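For illustration, a hedged sketch of a comparator that yields the wanted order (compare start keys first; on a tie, the empty end key, i.e. the parent spanning to the end of the table, sorts first). This is the shape of the idea, not the committed fix:
{code}
// Hypothetical comparator sketch: parent regions (empty end key) sort before
// their daughters when the start keys are equal.
Comparator<HRegionInfo> splitParentFirst = new Comparator<HRegionInfo>() {
  public int compare(HRegionInfo left, HRegionInfo right) {
    int result = Bytes.compareTo(left.getStartKey(), right.getStartKey());
    if (result != 0) return result;
    byte[] leftEnd = left.getEndKey();
    byte[] rightEnd = right.getEndKey();
    if (leftEnd.length == 0 && rightEnd.length != 0) return -1; // left is the parent
    if (rightEnd.length == 0 && leftEnd.length != 0) return 1;  // right is the parent
    return Bytes.compareTo(leftEnd, rightEnd);
  }
};
{code}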

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4927) CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the last region when the endkey is empty

2011-12-07 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164695#comment-13164695
 ] 

Ted Yu commented on HBASE-4927:
---

Ran through the previously failing tests:
{code}
mt -Dtest=TestMasterRestartAfterDisablingTable
mt -Dtest=TestOfflineMetaRebuildBase#testMetaRebuild
mt -Dtest=TestOfflineMetaRebuildHole
{code}
They pass now.

Going to commit to 0.92 and TRUNK.

> CatalogJanior:SplitParentFirstComparator doesn't sort as expected, for the 
> last region when the endkey is empty
> ---
>
> Key: HBASE-4927
> URL: https://issues.apache.org/jira/browse/HBASE-4927
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.92.0, 0.94.0
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Fix For: 0.92.0
>
> Attachments: 
> 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch, 
> 0001-Fixed-TestOffline-failure-caused-by-HBASE-4927.patch, 
> 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator-.patch, 
> 0001-HBASE-4927-CatalogJanior-SplitParentFirstComparator_v2.patch, 
> hbase-4927-fix-ws.txt
>
>
> When reviewing HBASE-4238 backporting, Jon found this issue.
> What happens if the split points are  (empty end key is the last key, empty 
> start key is the first key)
> Parent [A,)
> L daughter [A,B), 
> R daughter [B,)
> When sorted, we get to the end key comparison, which results in this incorrect 
> order:
> [A,B), [A,), [B,) 
> we wanted:
> [A,), [A,B), [B,)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4729) Clash between region unassign and splitting kills the master

2011-12-07 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164658#comment-13164658
 ] 

Ted Yu commented on HBASE-4729:
---

The HadoopQA report @ 
https://builds.apache.org/job/PreCommit-HBASE-Build/405//testReport/ showed 
basically no tests were run.

A manual test suite execution should have been performed.

> Clash between region unassign and splitting kills the master
> 
>
> Key: HBASE-4729
> URL: https://issues.apache.org/jira/browse/HBASE-4729
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Jean-Daniel Cryans
>Assignee: stack
>Priority: Critical
> Fix For: 0.92.0, 0.94.0
>
> Attachments: 4729-v2.txt, 4729-v3.txt, 4729-v4.txt, 4729-v5.txt, 
> 4729-v6-092.txt, 4729-v6-trunk.txt, 4729.txt
>
>
> I was running an online alter while regions were splitting, and suddenly the 
> master died and left my table half-altered (haven't restarted the master yet).
> What killed the master:
> {quote}
> 2011-11-02 17:06:44,428 FATAL org.apache.hadoop.hbase.master.HMaster: 
> Unexpected ZK exception creating node CLOSING
> org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = 
> NodeExists for /hbase/unassigned/f7e1783e65ea8d621a4bc96ad310f101
> at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:110)
> at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
> at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637)
> at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:459)
> at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:441)
> at 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:769)
> at 
> org.apache.hadoop.hbase.zookeeper.ZKAssign.createNodeClosing(ZKAssign.java:568)
> at 
> org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1722)
> at 
> org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1661)
> at org.apache.hadoop.hbase.master.BulkReOpen$1.run(BulkReOpen.java:69)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> {quote}
> A znode was created because the region server was splitting the region 4 
> seconds before:
> {quote}
> 2011-11-02 17:06:40,704 INFO 
> org.apache.hadoop.hbase.regionserver.SplitTransaction: Starting split of 
> region TestTable,0012469153,1320253135043.f7e1783e65ea8d621a4bc96ad310f101.
> 2011-11-02 17:06:40,704 DEBUG 
> org.apache.hadoop.hbase.regionserver.SplitTransaction: 
> regionserver:62023-0x132f043bbde0710 Creating ephemeral node for 
> f7e1783e65ea8d621a4bc96ad310f101 in SPLITTING state
> 2011-11-02 17:06:40,751 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
> regionserver:62023-0x132f043bbde0710 Attempting to transition node 
> f7e1783e65ea8d621a4bc96ad310f101 from RS_ZK_REGION_SPLITTING to 
> RS_ZK_REGION_SPLITTING
> ...
> 2011-11-02 17:06:44,061 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
> regionserver:62023-0x132f043bbde0710 Successfully transitioned node 
> f7e1783e65ea8d621a4bc96ad310f101 from RS_ZK_REGION_SPLITTING to 
> RS_ZK_REGION_SPLIT
> 2011-11-02 17:06:44,061 INFO 
> org.apache.hadoop.hbase.regionserver.SplitTransaction: Still waiting on the 
> master to process the split for f7e1783e65ea8d621a4bc96ad310f101
> {quote}
> Now that the master is dead the region server is spewing those last two lines 
> like mad.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4224) Need a flush by regionserver rather than by table option

2011-12-07 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164599#comment-13164599
 ] 

Ted Yu commented on HBASE-4224:
---

@Akash:
Do you have a newer patch?
If so, please upload it to this JIRA.

> Need a flush by regionserver rather than by table option
> 
>
> Key: HBASE-4224
> URL: https://issues.apache.org/jira/browse/HBASE-4224
> Project: HBase
>  Issue Type: Bug
>  Components: shell
>Reporter: stack
>Assignee: Akash Ashok
> Attachments: HBase-4224.patch
>
>
> This evening I needed to clean out logs on the cluster.  Logs are by 
> regionserver.  To let go of logs, we need to have all edits emptied from 
> memory.  Currently flush is only by table or region.  We need to be able to 
> flush the regionserver.  Need to add this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4970) Add a parameter to change keepAliveTime of Htable thread pool.

2011-12-07 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164274#comment-13164274
 ] 

Ted Yu commented on HBASE-4970:
---

I think there shouldn't be upper case letters in the name of the new config. 
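For example (the exact key name here is hypothetical, just illustrating the lowercase, dot-separated convention):
{code}
// Hypothetical config key for the new setting, all lowercase and dot-separated.
Configuration conf = HBaseConfiguration.create();
conf.setLong("hbase.htable.threads.keepalivetime", 3600);  // keep-alive in seconds
{code}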

> Add a parameter  to change keepAliveTime of Htable thread pool.
> ---
>
> Key: HBASE-4970
> URL: https://issues.apache.org/jira/browse/HBASE-4970
> Project: HBase
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 0.90.4
>Reporter: gaojinchao
>Assignee: gaojinchao
>Priority: Trivial
> Fix For: 0.90.5
>
> Attachments: HBASE-4970_Branch90.patch
>
>
> In my cluster, I changed keepAliveTime from 60 s to 3600 s.  The increase in 
> RES is slowed down.
> Why does increasing the keepAliveTime of the HBase thread pool slow down our 
> problem occurrence [RES value increase]?
> You can go through the source of sun.nio.ch.Util. Every thread holds 3 
> soft references to direct buffers (mustangsrc) for reuse. The code names these 3 
> soft references the buffer cache. If all the buffers are occupied or none is 
> suitable in size, and a new request comes, a new direct buffer is allocated. 
> After the service, the bigger one replaces the smaller one in the buffer cache. 
> The replaced buffer is released.
> So I think we can add a parameter to change the keepAliveTime of the HTable 
> thread pool.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHandler, may cause data loss

2011-12-06 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164168#comment-13164168
 ] 

Ted Yu commented on HBASE-4880:
---

Please check the failed tests. 
It seems trunk is broken now. 

> Region is on service before completing openRegionHandler, may cause data loss
> -
>
> Key: HBASE-4880
> URL: https://issues.apache.org/jira/browse/HBASE-4880
> Project: HBase
>  Issue Type: Bug
>Reporter: chunhui shen
>Assignee: chunhui shen
> Attachments: hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that the region is on service before step 4.
> It means a client could put data into this region after step 3.
> What will happen if step 4 fails?
> It will execute OpenRegionHandler#cleanupFailedOpen, which closes the 
> region, and the master assigns this region to another regionserver.
> If closing the region fails, the data put between step 3 and step 4 
> may be lost, because the region has been opened on another regionserver and 
> new data has been put to it. Therefore, that data may not be recovered through 
> replayRecoveredEdit() because the edit's LogSeqId is smaller than the current 
> region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHandler, may cause data loss

2011-12-06 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164104#comment-13164104
 ] 

Ted Yu commented on HBASE-4880:
---

Did RestAdmin pass for patch v2?

> Region is on service before completing openRegionHandler, may cause data loss
> -
>
> Key: HBASE-4880
> URL: https://issues.apache.org/jira/browse/HBASE-4880
> Project: HBase
>  Issue Type: Bug
>Reporter: chunhui shen
>Assignee: chunhui shen
> Attachments: hbase-4880.patch, hbase-4880v2.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that the region is on service before step 4.
> It means a client could put data into this region after step 3.
> What will happen if step 4 fails?
> It will execute OpenRegionHandler#cleanupFailedOpen, which closes the 
> region, and the master assigns this region to another regionserver.
> If closing the region fails, the data put between step 3 and step 4 
> may be lost, because the region has been opened on another regionserver and 
> new data has been put to it. Therefore, that data may not be recovered through 
> replayRecoveredEdit() because the edit's LogSeqId is smaller than the current 
> region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4605) Constraints

2011-12-06 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163939#comment-13163939
 ] 

Ted Yu commented on HBASE-4605:
---

Renaming IntegrationTestConstraint as described above makes sense.

Thanks Jesse.

> Constraints
> ---
>
> Key: HBASE-4605
> URL: https://issues.apache.org/jira/browse/HBASE-4605
> Project: HBase
>  Issue Type: Improvement
>  Components: client, coprocessors
>Affects Versions: 0.94.0
>Reporter: Jesse Yates
>Assignee: Jesse Yates
> Attachments: 4605.v7, constraint_as_cp.txt, java_Constraint_v2.patch, 
> java_HBASE-4605_v1.patch, java_HBASE-4605_v2.patch, java_HBASE-4605_v3.patch
>
>
> From Jesse's comment on dev:
> {quote}
> What I would like to propose is a simple interface that people can use to 
> implement a 'constraint' (matching the classic database definition). This 
> would help HBase more easily check that box, minimize code duplication 
> across organizations, and lead to easier adoption.
> Essentially, people would implement a 'Constraint' interface for checking 
> keys before they are put into a table. Puts that are valid get written to the 
> table; if not, an exception is thrown that gets propagated 
> back to the client explaining why the put was invalid.
> Constraints would be set on a per-table basis and the user would be expected 
> to ensure the jars containing the constraint are present on the machines 
> serving that table.
> Yes, people could roll their own mechanism for doing this via coprocessors 
> each time, but this would make it easier: you only have to 
> implement a very minimal interface and not worry about the specifics.
> {quote}
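For illustration, a minimal sketch of the kind of interface described in the quote (this is the shape of the idea, not the API that was ultimately committed):
{code}
// Hypothetical sketch of a per-table constraint checked before a Put is applied.
public interface Constraint {
  /** Throw to reject the Put; a passing Put is written as usual. */
  void check(Put p) throws IOException;
}

// Example: require every Put to carry at least one column in family "meta".
public class RequireMetaFamily implements Constraint {
  public void check(Put p) throws IOException {
    if (!p.getFamilyMap().containsKey(Bytes.toBytes("meta"))) {
      throw new IOException("Put for row " + Bytes.toStringBinary(p.getRow())
          + " is missing required family 'meta'");
    }
  }
}
{code}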

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4120) isolation and allocation

2011-12-06 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163902#comment-13163902
 ] 

Ted Yu commented on HBASE-4120:
---

Andy suggested placing the PriorityFunction.initRegionPriority(region) call in 
RegionObserver.postOpen()
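For illustration, a rough sketch of that wiring (PriorityFunction.initRegionPriority comes from the patch under review, not from core HBase):
{code}
// Hypothetical observer placing the patch's priority initialization in postOpen().
public class QosRegionObserver extends BaseRegionObserver {
  @Override
  public void postOpen(ObserverContext<RegionCoprocessorEnvironment> e) {
    // initRegionPriority is provided by the TablePriority patch, not core HBase.
    PriorityFunction.initRegionPriority(e.getEnvironment().getRegion());
  }
}
{code}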

> isolation and allocation
> 
>
> Key: HBASE-4120
> URL: https://issues.apache.org/jira/browse/HBASE-4120
> Project: HBase
>  Issue Type: New Feature
>  Components: master, regionserver
>Affects Versions: 0.90.2, 0.90.3, 0.90.4, 0.92.0
>Reporter: Liu Jia
>Assignee: Liu Jia
> Fix For: 0.94.0
>
> Attachments: Design_document_for_HBase_isolation_and_allocation.pdf, 
> Design_document_for_HBase_isolation_and_allocation_Revised.pdf, 
> HBase_isolation_and_allocation_user_guide.pdf, 
> Performance_of_Table_priority.pdf, System Structure.jpg, TablePriority.patch, 
> TablePriority_v12.patch, TablePriority_v12.patch, TablePriority_v8.patch, 
> TablePriority_v8.patch, TablePriority_v8_for_trunk.patch, 
> TablePrioriy_v9.patch
>
>
> The HBase isolation and allocation tool is designed to help users manage 
> cluster resources among different applications and tables.
> When we have a large-scale HBase cluster with many applications running on 
> it, there will be lots of problems. In Taobao there is a cluster used by many 
> departments to test the performance of their applications, which are based 
> on HBase. With one cluster of 12 servers, only one application can run 
> exclusively on it at a time, and many other applications must wait until the 
> previous test has finished.
> After we add the allocation management function to the cluster, applications 
> can share the cluster and run concurrently. Also, if the test engineer wants 
> to make sure there is no interference, he/she can move other tables out of 
> this group.
> In groups we use table priority to allocate resources; when the system is 
> busy, we can make sure high-priority tables are not affected by 
> lower-priority tables.
> Different groups can have different region server configurations, some groups 
> optimized for reading can have large block cache size, and others optimized 
> for writing can have large memstore size. 
> Tables and region servers can be moved easily between groups; after changing 
> the configuration, a group can be restarted alone instead of restarting the 
> whole cluster.
> git entry : https://github.com/ICT-Ope/HBase_allocation .
> We hope our work is helpful.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4120) isolation and allocation

2011-12-06 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163853#comment-13163853
 ] 

Ted Yu commented on HBASE-4120:
---

From Andrew Purtell (see 'trip report for Hadoop In China' up on dev@):

After 0.92 is out I intend to champion / mentor / co-develop 4120 and the 
follow on table allocation work and target 0.94 for it. I think the RPC QoS 
aspect is not too controversial. The allocation/reservation aspects I'd like to 
aim for a coprocessor or at least master plugin based integration so they won't 
impact stability for users who don't enable it. Unlike RPC QoS I suspect the 
changes needed to core can be minimized to coprocessor framework additions. 
Follow up in new JIRAs soon.

> isolation and allocation
> 
>
> Key: HBASE-4120
> URL: https://issues.apache.org/jira/browse/HBASE-4120
> Project: HBase
>  Issue Type: New Feature
>  Components: master, regionserver
>Affects Versions: 0.90.2, 0.90.3, 0.90.4, 0.92.0
>Reporter: Liu Jia
>Assignee: Liu Jia
> Fix For: 0.94.0
>
> Attachments: Design_document_for_HBase_isolation_and_allocation.pdf, 
> Design_document_for_HBase_isolation_and_allocation_Revised.pdf, 
> HBase_isolation_and_allocation_user_guide.pdf, 
> Performance_of_Table_priority.pdf, System Structure.jpg, TablePriority.patch, 
> TablePriority_v12.patch, TablePriority_v12.patch, TablePriority_v8.patch, 
> TablePriority_v8.patch, TablePriority_v8_for_trunk.patch, 
> TablePrioriy_v9.patch
>
>
> The HBase isolation and allocation tool is designed to help users manage 
> cluster resources among different applications and tables.
> When we have a large-scale HBase cluster with many applications running on 
> it, there will be lots of problems. In Taobao there is a cluster used by many 
> departments to test the performance of their applications, which are based 
> on HBase. With one cluster of 12 servers, only one application can run 
> exclusively on it at a time, and many other applications must wait until the 
> previous test has finished.
> After we add the allocation management function to the cluster, applications 
> can share the cluster and run concurrently. Also, if the test engineer wants 
> to make sure there is no interference, he/she can move other tables out of 
> this group.
> In groups we use table priority to allocate resources; when the system is 
> busy, we can make sure high-priority tables are not affected by 
> lower-priority tables.
> Different groups can have different region server configurations, some groups 
> optimized for reading can have large block cache size, and others optimized 
> for writing can have large memstore size. 
> Tables and region servers can be moved easily between groups; after changing 
> the configuration, a group can be restarted alone instead of restarting the 
> whole cluster.
> git entry : https://github.com/ICT-Ope/HBase_allocation .
> We hope our work is helpful.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4946) HTable.coprocessorExec (and possibly coprocessorProxy) does not work with dynamically loaded coprocessors (from hdfs or local system), because the RPC system tries to d

2011-12-06 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163840#comment-13163840
 ] 

Ted Yu commented on HBASE-4946:
---

For coprocessor/Exec.java, the javadoc doesn't match the code:
{code}
 try {
   protocol = (Class)conf.getClassByName(protocolName);
 }
 catch (ClassNotFoundException cnfe) {
-  throw new IOException("Protocol class "+protocolName+" not found", cnfe);
+  // can't do eager instantiation. pass it as a string and try to deserialize later.
+  //throw new IOException("Protocol class "+protocolName+" not found", cnfe);
{code}
I think the above try block should be commented out.
TestCoprocessorEndpoint passes without the above assignment to protocol.

> HTable.coprocessorExec (and possibly coprocessorProxy) does not work with 
> dynamically loaded coprocessors (from hdfs or local system), because the RPC 
> system tries to deserialize an unknown class. 
> -
>
> Key: HBASE-4946
> URL: https://issues.apache.org/jira/browse/HBASE-4946
> Project: HBase
>  Issue Type: Bug
>  Components: coprocessors
>Affects Versions: 0.92.0
>Reporter: Andrei Dragomir
>Assignee: Andrei Dragomir
> Attachments: HBASE-4946-v2.patch, HBASE-4946-v3.patch, 
> HBASE-4946.patch
>
>
> Loading coprocessors jars from hdfs works fine. I load it from the shell, 
> after setting the attribute, and it gets loaded:
> {noformat}
> INFO org.apache.hadoop.hbase.regionserver.HRegion: Setting up tabledescriptor 
> config now ...
> INFO org.apache.hadoop.hbase.coprocessor.CoprocessorHost: Class 
> com.MyCoprocessorClass needs to be loaded from a file - 
> hdfs://localhost:9000/coproc/rt-  >0.0.1-SNAPSHOT.jar.
> INFO org.apache.hadoop.hbase.coprocessor.CoprocessorHost: loadInstance: 
> com.MyCoprocessorClass
> INFO org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost: 
> RegionEnvironment createEnvironment
> DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Registered protocol 
> handler: region=t1,,1322572939753.6409aee1726d31f5e5671a59fe6e384f. 
> protocol=com.MyCoprocessorClassProtocol
> INFO org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost: Load 
> coprocessor com.MyCoprocessorClass from HTD of t1 successfully.
> {noformat}
> The problem is that this coprocessor simply extends BaseEndpointCoprocessor, 
> with a dynamic method. When calling this method from the client with 
> HTable.coprocessorExec, I get errors on the HRegionServer, because the call 
> cannot be deserialized from writables. 
> The problem is that Exec tries to do an "early" resolve of the coprocessor 
> class. The coprocessor class is loaded, but it is in the context of the 
> HRegionServer / HRegion. So, the call fails:
> {noformat}
> 2011-12-02 00:34:17,348 ERROR org.apache.hadoop.hbase.io.HbaseObjectWritable: 
> Error in readFields
> java.io.IOException: Protocol class com.MyCoprocessorClassProtocol not found
>   at org.apache.hadoop.hbase.client.coprocessor.Exec.readFields(Exec.java:125)
>   at 
> org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:575)
>   at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:105)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1237)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1167)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:703)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:495)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:470)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:680)
> Caused by: java.lang.ClassNotFoundException: com.MyCoprocessorClassProtocol
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:247)
>   at 
> org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:943)
>   at org.apache.hadoop.hbase.client.coprocessor.Exec.readFields(Exec.java:122)
>   ... 10 more
> {noformat}
> Probably the correct way to fix this is to make Exec really smart, so that it 
> knows all the class definitions loaded in CoprocessorHost(s).
> I created a small patch that simply doesn't resolve the class definition in 
> the Exec, instead passing it as a string down to the HRegion layer. This layer 
> knows all the definitions, and simply loads it by name. 

[jira] [Commented] (HBASE-4946) HTable.coprocessorExec (and possibly coprocessorProxy) does not work with dynamically loaded coprocessors (from hdfs or local system), because the RPC system tries to d

2011-12-06 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163819#comment-13163819
 ] 

Ted Yu commented on HBASE-4946:
---

The failed tests were due to 'unable to create new native thread'

> HTable.coprocessorExec (and possibly coprocessorProxy) does not work with 
> dynamically loaded coprocessors (from hdfs or local system), because the RPC 
> system tries to deserialize an unknown class. 
> -
>
> Key: HBASE-4946
> URL: https://issues.apache.org/jira/browse/HBASE-4946
> Project: HBase
>  Issue Type: Bug
>  Components: coprocessors
>Affects Versions: 0.92.0
>Reporter: Andrei Dragomir
>Assignee: Andrei Dragomir
> Attachments: HBASE-4946-v2.patch, HBASE-4946-v3.patch, 
> HBASE-4946.patch
>
>
> Loading coprocessor jars from hdfs works fine. I load it from the shell, 
> after setting the attribute, and it gets loaded:
> {noformat}
> INFO org.apache.hadoop.hbase.regionserver.HRegion: Setting up tabledescriptor 
> config now ...
> INFO org.apache.hadoop.hbase.coprocessor.CoprocessorHost: Class 
> com.MyCoprocessorClass needs to be loaded from a file - 
> hdfs://localhost:9000/coproc/rt-  >0.0.1-SNAPSHOT.jar.
> INFO org.apache.hadoop.hbase.coprocessor.CoprocessorHost: loadInstance: 
> com.MyCoprocessorClass
> INFO org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost: 
> RegionEnvironment createEnvironment
> DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Registered protocol 
> handler: region=t1,,1322572939753.6409aee1726d31f5e5671a59fe6e384f. 
> protocol=com.MyCoprocessorClassProtocol
> INFO org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost: Load 
> coprocessor com.MyCoprocessorClass from HTD of t1 successfully.
> {noformat}
> The problem is that this coprocessor simply extends BaseEndpointCoprocessor, 
> with a dynamic method. When calling this method from the client with 
> HTable.coprocessorExec, I get errors on the HRegionServer, because the call 
> cannot be deserialized from writables. 
> The problem is that Exec tries to do an "early" resolve of the coprocessor 
> class. The coprocessor class is loaded, but it is in the context of the 
> HRegionServer / HRegion. So, the call fails:
> {noformat}
> 2011-12-02 00:34:17,348 ERROR org.apache.hadoop.hbase.io.HbaseObjectWritable: 
> Error in readFields
> java.io.IOException: Protocol class com.MyCoprocessorClassProtocol not found
>   at org.apache.hadoop.hbase.client.coprocessor.Exec.readFields(Exec.java:125)
>   at 
> org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:575)
>   at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:105)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1237)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1167)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:703)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:495)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:470)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:680)
> Caused by: java.lang.ClassNotFoundException: com.MyCoprocessorClassProtocol
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:247)
>   at 
> org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:943)
>   at org.apache.hadoop.hbase.client.coprocessor.Exec.readFields(Exec.java:122)
>   ... 10 more
> {noformat}
> Probably the correct way to fix this is to make Exec really smart, so that it 
> knows all the class definitions loaded in CoprocessorHost(s).
> I created a small patch that simply doesn't resolve the class definition in 
> the Exec, instead passing it as a string down to the HRegion layer. This layer 
> knows all the definitions, and simply loads it by name. 
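To illustrate the approach described above (a minimal sketch only, not the attached patch; the class and method names here are invented for the example), deferred resolution looks roughly like this:
{code}
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical sketch: carry the protocol class name as a plain string during
// (de)serialization and resolve it only where a class loader that knows about
// dynamically loaded coprocessor classes is available.
public class DeferredProtocolRef {
  private String protocolName;

  public void write(DataOutput out) throws IOException {
    out.writeUTF(protocolName);
  }

  public void readFields(DataInput in) throws IOException {
    // No eager getClassByName() here: the class may only be visible to the
    // coprocessor host's class loader on the region server.
    protocolName = in.readUTF();
  }

  // Called at the HRegion / coprocessor-host layer, passing the class loader
  // that loaded the coprocessor jar.
  public Class<?> resolve(ClassLoader loader) throws ClassNotFoundException {
    return Class.forName(protocolName, true, loader);
  }

  public static void main(String[] args) throws Exception {
    DeferredProtocolRef ref = new DeferredProtocolRef();
    ref.protocolName = "java.lang.Runnable";

    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    ref.write(new DataOutputStream(bos));

    DeferredProtocolRef decoded = new DeferredProtocolRef();
    decoded.readFields(new DataInputStream(new ByteArrayInputStream(bos.toByteArray())));
    System.out.println(decoded.resolve(DeferredProtocolRef.class.getClassLoader()));
  }
}
{code}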

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-4954) IllegalArgumentException in hfile2 blockseek

2011-12-05 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163380#comment-13163380
 ] 

Ted Yu commented on HBASE-4954:
---

Can I mark this issue as Won't Fix?

> IllegalArgumentException in hfile2 blockseek
> 
>
> Key: HBASE-4954
> URL: https://issues.apache.org/jira/browse/HBASE-4954
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Priority: Critical
> Fix For: 0.92.0
>
>
> On Tue, Nov 29, 2011 at 10:20 PM, Stack  wrote:
> > The first hbase 0.92.0 release candidate is available for download:
> >
> >  http://people.apache.org/~stack/hbase-0.92.0-candidate-0/
> Here's another persistent issue that I'd appreciate somebody taking
> a quick look at:
> 
> http://bigtop01.cloudera.org:8080/view/Hadoop%200.22/job/Bigtop-hadoop22-smoketest/28/testReport/org.apache.bigtop.itest.hbase.smoke/TestHFileOutputFormat/testMRIncrementalLoadWithSplit/
> Caused by: java.lang.IllegalArgumentException
>at java.nio.Buffer.position(Buffer.java:218)
>at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.blockSeek(HFileReaderV2.java:632)
>at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.loadBlockAndSeekToKey(HFileReaderV2.java:545)
>at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.seekTo(HFileReaderV2.java:503)
>at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.seekTo(HFileReaderV2.java:511)
>at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.seekTo(HFileReaderV2.java:475)
>at 
> org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekTo(HalfStoreFileReader.java:157)
>at 
> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.copyHFileHalf(LoadIncrementalHFiles.java:544)
>at 
> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.splitStoreFile(LoadIncrementalHFiles.java:516)
>at 
> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.splitStoreFile(LoadIncrementalHFiles.java:377)
>at 
> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.groupOrSplit(LoadIncrementalHFiles.java:441)
>at 
> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$2.call(LoadIncrementalHFiles.java:325)
>at 
> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$2.call(LoadIncrementalHFiles.java:323)
>at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>at java.lang.Thread.run(Thread.java:619)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4954) IllegalArgumentException in hfile2 blockseek

2011-12-05 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163212#comment-13163212
 ] 

Ted Yu commented on HBASE-4954:
---

The stack trace for 0.92 is essentially the same.
But I don't see any test output; the output file is empty:
{code}
-rw-r--r--  1 zhihyu  110088321  0 Dec  5 16:21 
target/surefire-reports/org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat-output.txt
{code}


> IllegalArgumentException in hfile2 blockseek
> 
>
> Key: HBASE-4954
> URL: https://issues.apache.org/jira/browse/HBASE-4954
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Priority: Critical
> Fix For: 0.92.0
>
>
> On Tue, Nov 29, 2011 at 10:20 PM, Stack  wrote:
> > The first hbase 0.92.0 release candidate is available for download:
> >
> >  http://people.apache.org/~stack/hbase-0.92.0-candidate-0/
> Here's another persistent issue that I'd appreciate somebody taking
> a quick look at:
> 
> http://bigtop01.cloudera.org:8080/view/Hadoop%200.22/job/Bigtop-hadoop22-smoketest/28/testReport/org.apache.bigtop.itest.hbase.smoke/TestHFileOutputFormat/testMRIncrementalLoadWithSplit/
> Caused by: java.lang.IllegalArgumentException
>at java.nio.Buffer.position(Buffer.java:218)
>at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.blockSeek(HFileReaderV2.java:632)
>at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.loadBlockAndSeekToKey(HFileReaderV2.java:545)
>at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.seekTo(HFileReaderV2.java:503)
>at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.seekTo(HFileReaderV2.java:511)
>at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.seekTo(HFileReaderV2.java:475)
>at 
> org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekTo(HalfStoreFileReader.java:157)
>at 
> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.copyHFileHalf(LoadIncrementalHFiles.java:544)
>at 
> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.splitStoreFile(LoadIncrementalHFiles.java:516)
>at 
> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.splitStoreFile(LoadIncrementalHFiles.java:377)
>at 
> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.groupOrSplit(LoadIncrementalHFiles.java:441)
>at 
> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$2.call(LoadIncrementalHFiles.java:325)
>at 
> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$2.call(LoadIncrementalHFiles.java:323)
>at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>at java.lang.Thread.run(Thread.java:619)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4954) IllegalArgumentException in hfile2 blockseek

2011-12-05 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163208#comment-13163208
 ] 

Ted Yu commented on HBASE-4954:
---

It turns out that we should be using the following commands:
{code}
mvn clean -Dhadoop.profile=22 compile
mvn -Dhadoop.profile=22 -P localTests test -Dtest=TestHFileOutputFormat#testMRIncrementalLoadWithSplit
{code}
where I saw (under TRUNK):
{code}
testMRIncrementalLoadWithSplit(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat)
  Time elapsed: 18.591 sec  <<< FAILURE!
java.lang.AssertionError
  at org.junit.Assert.fail(Assert.java:92)
  at org.junit.Assert.assertTrue(Assert.java:43)
  at org.junit.Assert.assertTrue(Assert.java:54)
  at 
org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat.runIncrementalPELoad(TestHFileOutputFormat.java:479)
  at 
org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat.doIncrementalLoadTest(TestHFileOutputFormat.java:391)
  at 
org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat.testMRIncrementalLoadWithSplit(TestHFileOutputFormat.java:370)
{code}

> IllegalArgumentException in hfile2 blockseek
> 
>
> Key: HBASE-4954
> URL: https://issues.apache.org/jira/browse/HBASE-4954
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Priority: Critical
> Fix For: 0.92.0
>
>
> On Tue, Nov 29, 2011 at 10:20 PM, Stack  wrote:
> > The first hbase 0.92.0 release candidate is available for download:
> >
> >  http://people.apache.org/~stack/hbase-0.92.0-candidate-0/
> Here's another persistent issue that I'd appreciate somebody taking
> a quick look at:
> 
> http://bigtop01.cloudera.org:8080/view/Hadoop%200.22/job/Bigtop-hadoop22-smoketest/28/testReport/org.apache.bigtop.itest.hbase.smoke/TestHFileOutputFormat/testMRIncrementalLoadWithSplit/
> Caused by: java.lang.IllegalArgumentException
>at java.nio.Buffer.position(Buffer.java:218)
>at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.blockSeek(HFileReaderV2.java:632)
>at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.loadBlockAndSeekToKey(HFileReaderV2.java:545)
>at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.seekTo(HFileReaderV2.java:503)
>at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.seekTo(HFileReaderV2.java:511)
>at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.seekTo(HFileReaderV2.java:475)
>at 
> org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekTo(HalfStoreFileReader.java:157)
>at 
> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.copyHFileHalf(LoadIncrementalHFiles.java:544)
>at 
> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.splitStoreFile(LoadIncrementalHFiles.java:516)
>at 
> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.splitStoreFile(LoadIncrementalHFiles.java:377)
>at 
> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.groupOrSplit(LoadIncrementalHFiles.java:441)
>at 
> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$2.call(LoadIncrementalHFiles.java:325)
>at 
> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$2.call(LoadIncrementalHFiles.java:323)
>at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>at java.lang.Thread.run(Thread.java:619)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4847) Activate single jvm for small tests on jenkins

2011-12-05 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163069#comment-13163069
 ] 

Ted Yu commented on HBASE-4847:
---

I don't see TestAssignmentManager under TRUNK.
I was able to run TestMemStore individually.

> Activate single jvm for small tests on jenkins
> --
>
> Key: HBASE-4847
> URL: https://issues.apache.org/jira/browse/HBASE-4847
> Project: HBase
>  Issue Type: Improvement
>  Components: build, test
>Affects Versions: 0.94.0
> Environment: build
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Attachments: 4847_all.v10.patch, 4847_all.v10.patch, 
> 4847_all.v10.patch, 4847_all.v11.patch, 4847_all.v11.patch, 
> 4847_all.v12.patch, 4847_all.v4.patch, 4847_all.v5.patch, 4847_all.v6.patch, 
> 4847_all.v6.patch, 4847_all.v7.patch, 4847_all.v7.patch, 4847_all.v7.patch, 
> 4847_all.v7.patch, 4847_all.v8.patch, 4847_all.v8.patch, 4847_all.v9.patch, 
> 4847_pom.patch, 4847_pom.v2.patch, 4847_pom.v2.patch, 4847_pom.v2.patch, 
> 4847_pom.v3.patch
>
>
> This will not revolutionize performance on its own. We will win between 1 and 
> 4 minutes.
> But we win as well:
>  - it's a step toward parallelizing the tests
>  - new tests are less expensive as they do not create a new jvm: it's a 
> continuous win
>  - it will allow us to push it to the dev environment while keeping the same 
> environment on dev and on the build, and 3 minutes is 10% of the small + 
> medium test execution time.
> I will do a few "submit patch" runs to see if it works well before asking for 
> the real commit.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4954) IllegalArgumentException in hfile2 blockseek

2011-12-05 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163010#comment-13163010
 ] 

Ted Yu commented on HBASE-4954:
---

In our internal Jenkins build which uses hadoop 0.22, there was no such 
exception.
I last merged from Apache 0.92 on Nov. 14th, so this test failure might have 
been introduced by JIRAs integrated after Nov. 14th - possibly the backport of 
HBASE-2856.

> IllegalArgumentException in hfile2 blockseek
> 
>
> Key: HBASE-4954
> URL: https://issues.apache.org/jira/browse/HBASE-4954
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Priority: Critical
> Fix For: 0.92.0
>
>
> On Tue, Nov 29, 2011 at 10:20 PM, Stack  wrote:
> > The first hbase 0.92.0 release candidate is available for download:
> >
> >  http://people.apache.org/~stack/hbase-0.92.0-candidate-0/
> Here's another persistent issue that I'd appreciate somebody taking
> a quick look at:
> 
> http://bigtop01.cloudera.org:8080/view/Hadoop%200.22/job/Bigtop-hadoop22-smoketest/28/testReport/org.apache.bigtop.itest.hbase.smoke/TestHFileOutputFormat/testMRIncrementalLoadWithSplit/
> Caused by: java.lang.IllegalArgumentException
>at java.nio.Buffer.position(Buffer.java:218)
>at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.blockSeek(HFileReaderV2.java:632)
>at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.loadBlockAndSeekToKey(HFileReaderV2.java:545)
>at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.seekTo(HFileReaderV2.java:503)
>at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.seekTo(HFileReaderV2.java:511)
>at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.seekTo(HFileReaderV2.java:475)
>at 
> org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekTo(HalfStoreFileReader.java:157)
>at 
> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.copyHFileHalf(LoadIncrementalHFiles.java:544)
>at 
> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.splitStoreFile(LoadIncrementalHFiles.java:516)
>at 
> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.splitStoreFile(LoadIncrementalHFiles.java:377)
>at 
> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.groupOrSplit(LoadIncrementalHFiles.java:441)
>at 
> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$2.call(LoadIncrementalHFiles.java:325)
>at 
> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$2.call(LoadIncrementalHFiles.java:323)
>at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>at java.lang.Thread.run(Thread.java:619)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4946) HTable.coprocessorExec (and possibly coprocessorProxy) does not work with dynamically loaded coprocessors (from hdfs or local system), because the RPC system tries to d

2011-12-05 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162740#comment-13162740
 ] 

Ted Yu commented on HBASE-4946:
---

The change in BloomFilterFactory is unrelated; please remove it from the patch. 

> HTable.coprocessorExec (and possibly coprocessorProxy) does not work with 
> dynamically loaded coprocessors (from hdfs or local system), because the RPC 
> system tries to deserialize an unknown class. 
> -
>
> Key: HBASE-4946
> URL: https://issues.apache.org/jira/browse/HBASE-4946
> Project: HBase
>  Issue Type: Bug
>  Components: coprocessors
>Affects Versions: 0.92.0
>Reporter: Andrei Dragomir
>Assignee: Andrei Dragomir
> Attachments: HBASE-4946-v2.patch, HBASE-4946.patch
>
>
> Loading coprocessor jars from hdfs works fine. I load it from the shell, 
> after setting the attribute, and it gets loaded:
> {noformat}
> INFO org.apache.hadoop.hbase.regionserver.HRegion: Setting up tabledescriptor 
> config now ...
> INFO org.apache.hadoop.hbase.coprocessor.CoprocessorHost: Class 
> com.MyCoprocessorClass needs to be loaded from a file - 
> hdfs://localhost:9000/coproc/rt-  >0.0.1-SNAPSHOT.jar.
> INFO org.apache.hadoop.hbase.coprocessor.CoprocessorHost: loadInstance: 
> com.MyCoprocessorClass
> INFO org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost: 
> RegionEnvironment createEnvironment
> DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Registered protocol 
> handler: region=t1,,1322572939753.6409aee1726d31f5e5671a59fe6e384f. 
> protocol=com.MyCoprocessorClassProtocol
> INFO org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost: Load 
> coprocessor com.MyCoprocessorClass from HTD of t1 successfully.
> {noformat}
> The problem is that this coprocessor simply extends BaseEndpointCoprocessor, 
> with a dynamic method. When calling this method from the client with 
> HTable.coprocessorExec, I get errors on the HRegionServer, because the call 
> cannot be deserialized from writables. 
> The problem is that Exec tries to do an "early" resolve of the coprocessor 
> class. The coprocessor class is loaded, but it is in the context of the 
> HRegionServer / HRegion. So, the call fails:
> {noformat}
> 2011-12-02 00:34:17,348 ERROR org.apache.hadoop.hbase.io.HbaseObjectWritable: 
> Error in readFields
> java.io.IOException: Protocol class com.MyCoprocessorClassProtocol not found
>   at org.apache.hadoop.hbase.client.coprocessor.Exec.readFields(Exec.java:125)
>   at 
> org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:575)
>   at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:105)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1237)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1167)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:703)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:495)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:470)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:680)
> Caused by: java.lang.ClassNotFoundException: com.MyCoprocessorClassProtocol
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:247)
>   at 
> org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:943)
>   at org.apache.hadoop.hbase.client.coprocessor.Exec.readFields(Exec.java:122)
>   ... 10 more
> {noformat}
> Probably the correct way to fix this is to make Exec really smart, so that it 
> knows all the class definitions loaded in CoprocessorHost(s).
> I created a small patch that simply doesn't resolve the class definition in 
> the Exec, instead passing it as a string down to the HRegion layer. This layer 
> knows all the definitions, and simply loads it by name. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-4605) Constraints

2011-12-04 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162565#comment-13162565
 ] 

Ted Yu commented on HBASE-4605:
---

Thanks Gary for the detailed review.
For the check() API:
{code}
  public void check(Put p) throws ConstraintException;
{code}
We could abstract the input parameter to Mutation.

I think we need to reach agreement on the following major issues:
1. introduction of a Guava dependency in the client library - Jonathan Gray 
would strongly disagree with this practice.
2. whether IntegerConstraint should be included in the constraint package
3. the constraint configuration serialization method

If the current implementation of some of the above (#1 and #2) isn't critical 
to this feature, we can defer them to future JIRAs.

Actually a fourth issue (raised by Suraj under the discussion of 'Question on 
Coprocessors and Atomicity') is more important: if we don't provide atomicity 
by holding the row lock during the check, the use cases for Constraints 
decrease.
Issue #4 definitely doesn't have to be covered by this JIRA.

The fifth issue is how IntegrationTestConstraint.java should be tagged.
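For discussion, here is a minimal sketch of what a Mutation-based check() could look like (the interface and exception shapes below are assumptions for illustration, not the committed design):
{code}
import java.io.IOException;
import org.apache.hadoop.hbase.client.Mutation;
import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical sketch: a Constraint that validates any Mutation (Put, Delete,
// ...) rather than only Put, per the suggestion above.
interface Constraint {
  /** Throw ConstraintException to reject the mutation before it is applied. */
  void check(Mutation m) throws ConstraintException;
}

class ConstraintException extends IOException {
  ConstraintException(String msg) { super(msg); }
}

// Example constraint: reject mutations whose row does not start with a prefix.
class PrefixConstraint implements Constraint {
  private final byte[] prefix = Bytes.toBytes("user_");

  @Override
  public void check(Mutation m) throws ConstraintException {
    byte[] row = m.getRow();
    if (row.length < prefix.length
        || Bytes.compareTo(row, 0, prefix.length, prefix, 0, prefix.length) != 0) {
      throw new ConstraintException("Row " + Bytes.toStringBinary(row)
          + " does not start with the required prefix");
    }
  }
}
{code}
With such a signature a Put or Delete is handled uniformly, without casting in the constraint processor.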

> Constraints
> ---
>
> Key: HBASE-4605
> URL: https://issues.apache.org/jira/browse/HBASE-4605
> Project: HBase
>  Issue Type: Improvement
>  Components: client, coprocessors
>Affects Versions: 0.94.0
>Reporter: Jesse Yates
>Assignee: Jesse Yates
> Attachments: 4605.v7, constraint_as_cp.txt, java_Constraint_v2.patch, 
> java_HBASE-4605_v1.patch, java_HBASE-4605_v2.patch, java_HBASE-4605_v3.patch
>
>
> From Jesse's comment on dev:
> {quote}
> What I would like to propose is a simple interface that people can use to 
> implement a 'constraint' (matching the classic database definition). This 
> would help HBase more easily check that box, minimize code duplication 
> across organizations, and ease adoption.
> Essentially, people would implement a 'Constraint' interface for checking 
> keys before they are put into a table. Puts that are valid get written to the 
> table; if not, an exception is thrown and propagated back to the client 
> explaining why the put was invalid.
> Constraints would be set on a per-table basis and the user would be expected 
> to ensure the jars containing the constraint are present on the machines 
> serving that table.
> Yes, people could roll their own mechanism for doing this via coprocessors 
> each time, but this would make it easier to do so, so you only have to 
> implement a very minimal interface and not worry about the specifics.
> {quote}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4944) Optionally verify bulk loaded HFiles

2011-12-04 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162549#comment-13162549
 ] 

Ted Yu commented on HBASE-4944:
---

Patch v3 looks good.

Minor comment for the case of different families:
{code}
+  + " previous=" + Bytes.toStringBinary(prevKV.getKey())
+  + " current=" + Bytes.toStringBinary(kv.getKey()));
{code}
I think it would be nice to include the family names, by calling getFamily(), 
in the above message.
This can be done at commit time.
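For example, the family check could read roughly as follows (a sketch only; the actual patch throws InvalidHFileException from inside the verification loop):
{code}
import java.io.IOException;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical fragment of the bulk-load verification: when two adjacent
// KeyValues have different families, name both families in the message.
final class FamilyCheckSketch {
  static void checkSameFamily(KeyValue prevKV, KeyValue kv, String path)
      throws IOException {
    if (Bytes.compareTo(prevKV.getFamily(), kv.getFamily()) != 0) {
      throw new IOException("Previous key had a different family: path=" + path
          + " previous=" + Bytes.toStringBinary(prevKV.getKey())
          + " (family=" + Bytes.toStringBinary(prevKV.getFamily()) + ")"
          + " current=" + Bytes.toStringBinary(kv.getKey())
          + " (family=" + Bytes.toStringBinary(kv.getFamily()) + ")");
    }
  }
}
{code}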

> Optionally verify bulk loaded HFiles
> 
>
> Key: HBASE-4944
> URL: https://issues.apache.org/jira/browse/HBASE-4944
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 0.92.0, 0.94.0, 0.90.5
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
> Attachments: HBASE-4944-v2.patch, HBASE-4944-v3.patch
>
>
> We rely on users to produce properly formatted HFiles for bulk import. 
> Attached patch adds an optional code path, toggled by a configuration 
> property, that verifies the HFile under consideration for import is properly 
> sorted. The default maintains the current behavior, which does not scan the 
> file for correctness.
> Patch is against trunk but can apply against all active branches.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4942) HMaster is unable to start if HFile V1 is used

2011-12-04 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162546#comment-13162546
 ] 

Ted Yu commented on HBASE-4942:
---

Integrated to 0.92 branch and TRUNK.

Thanks for the patch, Honghua.

Thanks for the review, Andy.

> HMaster is unable to start if HFile V1 is used
> --
>
> Key: HBASE-4942
> URL: https://issues.apache.org/jira/browse/HBASE-4942
> Project: HBase
>  Issue Type: Bug
>  Components: io
>Affects Versions: 0.92.0
>Reporter: Ted Yu
>Assignee: honghua zhu
> Fix For: 0.92.0, 0.94.0
>
> Attachments: HBase_0.92.0_HBASE-4942, HBase_0.94.0_HBASE-4942
>
>
> This was reported by HH Zhu (zhh200...@gmail.com)
> If the following is specified in hbase-site.xml:
> {code}
> <property>
>   <name>hfile.format.version</name>
>   <value>1</value>
> </property>
> {code}
> Clear the hdfs directory "hbase.rootdir" so that MasterFileSystem.bootstrap() 
> is executed.
> You would see:
> {code}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV1.close(HFileReaderV1.java:358)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Reader.close(StoreFile.java:1083)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile.closeReader(StoreFile.java:570)
> at org.apache.hadoop.hbase.regionserver.Store.close(Store.java:441)
> at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:782)
> at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:717)
> at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:688)
> at 
> org.apache.hadoop.hbase.master.MasterFileSystem.bootstrap(MasterFileSystem.java:390)
> at 
> org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:356)
> at 
> org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:128)
> at 
> org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:113)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:435)
> at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:314)
> at java.lang.Thread.run(Thread.java:619)
> {code}
> The above exception would lead to:
> {code}
> java.lang.RuntimeException: HMaster Aborted
> at 
> org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:152)
> at 
> org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:103)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> at 
> org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
> at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1512)
> {code}
> In org.apache.hadoop.hbase.master.HMaster.HMaster(Configuration conf), we 
> have:
> {code}
> this.conf.setFloat(CacheConfig.HFILE_BLOCK_CACHE_SIZE_KEY, 0.0f);
> {code}
> When CacheConfig is instantiated, the following is called:
> {code}
> org.apache.hadoop.hbase.io.hfile.CacheConfig.instantiateBlockCache(Configuration
>  conf)
> {code}
> Since "hfile.block.cache.size" is 0.0, instantiateBlockCache() would return 
> null, resulting in blockCache field of CacheConfig to be null.
> When master closes Root region, 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV1.close(boolean evictOnClose) 
> would be called. cacheConf.getBlockCache() returns null, leading to master 
> abort.
> The following should be called in HFileReaderV1.close(), similar to the code 
> in HFileReaderV2.close():
> {code}
> if (evictOnClose && cacheConf.isBlockCacheEnabled())
> {code}
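Spelled out, the suggested guard would look roughly like this (a sketch modeled on HFileReaderV2.close(); the surrounding field names are assumptions, not the exact HFileReaderV1 code):
{code}
// Hypothetical sketch of HFileReaderV1.close(boolean evictOnClose): only touch
// the block cache when one is actually configured, so a null blockCache cannot
// cause an NPE while the master closes the Root region.
public void close(boolean evictOnClose) throws IOException {
  if (evictOnClose && cacheConf.isBlockCacheEnabled()) {
    // getBlockCache() is non-null only when the cache is enabled.
    cacheConf.getBlockCache().evictBlocksByHfileName(name);
  }
  if (closeIStream && istream != null) {
    istream.close();
    istream = null;
  }
}
{code}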

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4944) Optionally verify bulk loaded HFiles

2011-12-03 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162281#comment-13162281
 ] 

Ted Yu commented on HBASE-4944:
---

Looks like the patch should be rebased:
{code}
4 out of 5 hunks FAILED -- saving rejects to file 
src/main/java/org/apache/hadoop/hbase/regionserver/Store.java.rej
{code}

> Optionally verify bulk loaded HFiles
> 
>
> Key: HBASE-4944
> URL: https://issues.apache.org/jira/browse/HBASE-4944
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 0.92.0, 0.94.0, 0.90.5
>Reporter: Andrew Purtell
>Priority: Minor
> Attachments: 4944.txt
>
>
> We rely on users to produce properly formatted HFiles for bulk import. 
> Attached patch adds an optional code path, toggled by a configuration 
> property, that verifies the HFile under consideration for import is properly 
> sorted. The default maintains the current behavior, which does not scan the 
> file for correctness.
> Patch is against trunk but can apply against all active branches.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4944) Optionally verify bulk loaded HFiles

2011-12-03 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162280#comment-13162280
 ] 

Ted Yu commented on HBASE-4944:
---

Minor comments:
{code}
+KeyValue pkv = null;
{code}
The variable could be named prevKV, which is clearer.
{code}
+  throw new InvalidHFileException("Previous row is greater then"
{code}
Typo above, should be 'greater than'.
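Putting the two comments together, the verification might read along these lines (a standalone sketch; the actual patch walks the HFile with a scanner and throws InvalidHFileException):
{code}
import java.io.IOException;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical sketch of the sortedness check with the renamed variable and
// the corrected message wording.
final class SortOrderCheckSketch {
  static void assertSorted(Iterable<KeyValue> kvs) throws IOException {
    KeyValue prevKV = null;   // was 'pkv' in the draft patch
    for (KeyValue kv : kvs) {
      if (prevKV != null && KeyValue.COMPARATOR.compareRows(prevKV, kv) > 0) {
        throw new IOException("Previous row is greater than current row:"
            + " previous=" + Bytes.toStringBinary(prevKV.getRow())
            + " current=" + Bytes.toStringBinary(kv.getRow()));
      }
      prevKV = kv;
    }
  }
}
{code}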



> Optionally verify bulk loaded HFiles
> 
>
> Key: HBASE-4944
> URL: https://issues.apache.org/jira/browse/HBASE-4944
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 0.92.0, 0.94.0, 0.90.5
>Reporter: Andrew Purtell
>Priority: Minor
> Attachments: 4944.txt
>
>
> We rely on users to produce properly formatted HFiles for bulk import. 
> Attached patch adds an optional code path, toggled by a configuration 
> property, that verifies the HFile under consideration for import is properly 
> sorted. The default maintains the current behavior, which does not scan the 
> file for correctness.
> Patch is against trunk but can apply against all active branches.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4880) Region is in service before completing openRegionHandler, may cause data loss

2011-12-03 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162278#comment-13162278
 ] 

Ted Yu commented on HBASE-4880:
---

I think the patch makes sense.

I ran TestHCM with patch applied. It passed.

> Region is in service before completing openRegionHandler, may cause data loss
> -
>
> Key: HBASE-4880
> URL: https://issues.apache.org/jira/browse/HBASE-4880
> Project: HBase
>  Issue Type: Bug
>Reporter: chunhui shen
>Assignee: chunhui shen
> Attachments: hbase-4880.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can see that the region is in service before step 4, which means a client 
> could put data to this region after step 3.
> What happens if step 4 fails?
> OpenRegionHandler#cleanupFailedOpen will be executed, which closes the 
> region, and the master assigns this region to another regionserver.
> If closing the region fails, the data put between step 3 and step 4 may be 
> lost, because the region has been opened on another regionserver and has 
> received new data. Therefore, the data may not be recovered through 
> replayRecoveredEdit(), because the edit's LogSeqId is smaller than the 
> current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4945) NPE in HRegion.bulkLoadHFiles(...)

2011-12-03 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162275#comment-13162275
 ] 

Ted Yu commented on HBASE-4945:
---

Good catch, Lars.

I think the fix should go to 0.92

> NPE in HRegion.bulkLoadHFiles(...)
> --
>
> Key: HBASE-4945
> URL: https://issues.apache.org/jira/browse/HBASE-4945
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.94.0
>Reporter: Lars Hofhansl
>Priority: Minor
>
> Was playing with "completebulkload", and ran into an NPE.
> The problem is here (HRegion.bulkLoadHFiles(...)).
> {code}
> Store store = getStore(familyName);
> if (store == null) {
>   IOException ioe = new DoNotRetryIOException(
>   "No such column family " + Bytes.toStringBinary(familyName));
>   ioes.add(ioe);
>   failures.add(p);
> }
> try {
>   store.assertBulkLoadHFileOk(new Path(path));
> } catch (WrongRegionException wre) {
>   // recoverable (file doesn't fit in region)
>   failures.add(p);
> } catch (IOException ioe) {
>   // unrecoverable (hdfs problem)
>   ioes.add(ioe);
> }
> {code}
> This should be 
> {code}
> Store store = getStore(familyName);
> if (store == null) {
> ...
> } else {
>   try {
> store.assertBulkLoadHFileOk(new Path(path));
> ...
> }
> {code}
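Spelled out, the corrected control flow is simply the following (a sketch of the flow described above, not the committed change):
{code}
// Hypothetical sketch: only dereference 'store' when the family exists;
// otherwise record the failure and move on to the next (family, path) pair.
Store store = getStore(familyName);
if (store == null) {
  ioes.add(new DoNotRetryIOException(
      "No such column family " + Bytes.toStringBinary(familyName)));
  failures.add(p);
} else {
  try {
    store.assertBulkLoadHFileOk(new Path(path));
  } catch (WrongRegionException wre) {
    failures.add(p);   // recoverable (file doesn't fit in region)
  } catch (IOException ioe) {
    ioes.add(ioe);     // unrecoverable (hdfs problem)
  }
}
{code}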

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4936) Cached HRegionInterface connections crash when getting UnknownHost exceptions

2011-12-03 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162272#comment-13162272
 ] 

Ted Yu commented on HBASE-4936:
---

+1 on patch.

@Andrei:
Please re-attach the patch using the --no-prefix option.
HadoopQA uses -p0 to apply patches.

> Cached HRegionInterface connections crash when getting UnknownHost exceptions
> -
>
> Key: HBASE-4936
> URL: https://issues.apache.org/jira/browse/HBASE-4936
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.92.0
>Reporter: Andrei Dragomir
> Attachments: HBASE-4936.patch
>
>
> This issue is unlikely to come up in a cluster test case. However, for 
> development, the following thing happens: 
> 1. Start the HBase cluster locally, on network A (DNS A, etc)
> 2. The region locations are cached using the hostname 
> (mycomputer.company.com, 211.x.y.z - real ip)
> 3. Change network location (go home)
> 4. Start the HBase cluster locally. My hostname / IPs are now different 
> (mycomputer, 192.168.0.130 - new ip)
> If the region locations have been cached using the hostname, there is an 
> UnknownHostException in CatalogTracker.getCachedConnection(ServerName sn), 
> uncaught in the catch statements. The server will crash constantly. 
> The error should be caught and not rethrown, so that the cached connection 
> expires normally. 
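In code, the idea is roughly the following (a hedged sketch of getCachedConnection(); the exact connection lookup, catch list and logging in the attached patch may differ):
{code}
// Hypothetical sketch: treat an unresolvable host like any other "connection
// unusable" case so the stale cached location can expire instead of crashing
// the caller.
private HRegionInterface getCachedConnection(ServerName sn) throws IOException {
  HRegionInterface protocol = null;
  try {
    protocol = connection.getHRegionConnection(sn.getHostname(), sn.getPort());
  } catch (java.net.UnknownHostException uhe) {
    // Host was cached under an old network location; let the entry expire.
    LOG.debug("Unknown host " + sn + ", cached location is stale", uhe);
  } catch (java.net.SocketException se) {
    LOG.debug("Exception connecting to " + sn, se);
  }
  return protocol;  // null means "no usable connection right now"
}
{code}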

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4942) HMaster is unable to start if HFile V1 is used

2011-12-03 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162270#comment-13162270
 ] 

Ted Yu commented on HBASE-4942:
---

You're right, Lars.

> HMaster is unable to start if HFile V1 is used
> --
>
> Key: HBASE-4942
> URL: https://issues.apache.org/jira/browse/HBASE-4942
> Project: HBase
>  Issue Type: Bug
>  Components: io
>Affects Versions: 0.92.0
>Reporter: Ted Yu
>Assignee: honghua zhu
> Fix For: 0.92.0, 0.94.0
>
>
> This was reported by HH Zhu (zhh200...@gmail.com)
> If the following is specified in hbase-site.xml:
> {code}
> <property>
>   <name>hfile.format.version</name>
>   <value>1</value>
> </property>
> {code}
> Clear the hdfs directory "hbase.rootdir" so that MasterFileSystem.bootstrap() 
> is executed.
> You would see:
> {code}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV1.close(HFileReaderV1.java:358)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Reader.close(StoreFile.java:1083)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile.closeReader(StoreFile.java:570)
> at org.apache.hadoop.hbase.regionserver.Store.close(Store.java:441)
> at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:782)
> at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:717)
> at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:688)
> at 
> org.apache.hadoop.hbase.master.MasterFileSystem.bootstrap(MasterFileSystem.java:390)
> at 
> org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:356)
> at 
> org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:128)
> at 
> org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:113)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:435)
> at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:314)
> at java.lang.Thread.run(Thread.java:619)
> {code}
> The above exception would lead to:
> {code}
> java.lang.RuntimeException: HMaster Aborted
> at 
> org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:152)
> at 
> org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:103)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> at 
> org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
> at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1512)
> {code}
> In org.apache.hadoop.hbase.master.HMaster.HMaster(Configuration conf), we 
> have:
> {code}
> this.conf.setFloat(CacheConfig.HFILE_BLOCK_CACHE_SIZE_KEY, 0.0f);
> {code}
> When CacheConfig is instantiated, the following is called:
> {code}
> org.apache.hadoop.hbase.io.hfile.CacheConfig.instantiateBlockCache(Configuration
>  conf)
> {code}
> Since "hfile.block.cache.size" is 0.0, instantiateBlockCache() would return 
> null, resulting in blockCache field of CacheConfig to be null.
> When master closes Root region, 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV1.close(boolean evictOnClose) 
> would be called. cacheConf.getBlockCache() returns null, leading to master 
> abort.
> The following should be called in HFileReaderV1.close(), similar to the code 
> in HFileReaderV2.close():
> {code}
> if (evictOnClose && cacheConf.isBlockCacheEnabled())
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4379) [hbck] Does not complain about tables with no end region [Z,]

2011-12-03 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162224#comment-13162224
 ] 

Ted Yu commented on HBASE-4379:
---

@Jonathan:
You need to re-attach the patch because HadoopQA checks the attachment Id.
If the attachment Id corresponds to a patch that was verified earlier, there 
will be no test suite execution.

Thanks

> [hbck] Does not complain about tables with no end region [Z,]
> -
>
> Key: HBASE-4379
> URL: https://issues.apache.org/jira/browse/HBASE-4379
> Project: HBase
>  Issue Type: Bug
>  Components: hbck
>Affects Versions: 0.92.0, 0.90.5
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
> Attachments: 
> 0001-HBASE-4379-hbck-does-not-complain-about-tables-with-.patch, 
> hbase-4379.v2.patch
>
>
> hbck does not detect or have an error condition when the last region of a 
> table is missing (end key != '').

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4921) HTable initialization looks for EMPTY_START_ROW

2011-12-03 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162215#comment-13162215
 ] 

Ted Yu commented on HBASE-4921:
---

@Pritam:
Can you provide a patch and let us know the result of running it through the 
test suite?

Thanks

> HTable initialization looks for EMPTY_START_ROW
> ---
>
> Key: HBASE-4921
> URL: https://issues.apache.org/jira/browse/HBASE-4921
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.4
>Reporter: Pritam Damania
>
> The HTable initialization does something like this: 
> {code}this.connection.locateRegion(tableName, 
> HConstants.EMPTY_START_ROW);{code}
> What is the rationale behind this? What would happen if this region is in 
> flight? I ran into a problem where I disabled the first region of the table 
> and now I can't create an HTable instance for this table.
> Disabling the first region is like disabling the entire table from a client 
> perspective. I feel this is not the correct behavior.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3373) Allow regions of specific table to be load-balanced

2011-11-30 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13160233#comment-13160233
 ] 

Ted Yu commented on HBASE-3373:
---

@Ben:
Thanks for trying out 0.94

The code snippet above deals with region servers that recently joined the 
cluster. Its goal is to avoid a hot region server that receives above-average 
load.
This is part of the changes from HBASE-3609. The randomization is done on this 
line:
{code}
Collections.shuffle(sns, RANDOM);
{code}
where we schedule regions to region servers that are shuffled randomly.

Your observation about unbalanced table(s) in the cluster is valid. This is 
due to the master not passing per-table region distribution to 
balanceCluster().
I have a patch in our internal repository where the master calls 
balanceCluster() for each table.
Once we test it in a production cluster, I should be able to contribute it 
back.
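A rough sketch of that per-table invocation (hypothetical names; getAssignmentsByTable() is an assumed helper, and the real patch is still in the internal repository mentioned above):
{code}
// Hypothetical sketch: instead of handing the balancer one global
// server -> regions map, call it once per table so each table's regions are
// spread evenly on their own.
Map<String, Map<ServerName, List<HRegionInfo>>> assignmentsByTable =
    assignmentManager.getAssignmentsByTable();   // assumed helper

List<RegionPlan> plans = new ArrayList<RegionPlan>();
for (Map<ServerName, List<HRegionInfo>> tableAssignments : assignmentsByTable.values()) {
  List<RegionPlan> partialPlans = balancer.balanceCluster(tableAssignments);
  if (partialPlans != null) {
    plans.addAll(partialPlans);
  }
}
// 'plans' is then executed the same way as for a whole-cluster balance run.
{code}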

> Allow regions of specific table to be load-balanced
> ---
>
> Key: HBASE-3373
> URL: https://issues.apache.org/jira/browse/HBASE-3373
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 0.20.6
>Reporter: Ted Yu
> Attachments: HbaseBalancerTest2.java
>
>
> From our experience, the cluster can be well balanced and yet one table's 
> regions may be badly concentrated on a few region servers.
> For example, one table has 839 regions (380 regions at time of table 
> creation), out of which 202 are on one server.
> It would be desirable for the load balancer to distribute regions of 
> specified tables evenly across the cluster. Each such table has a number of 
> regions many times the cluster size.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

2011-11-29 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13159859#comment-13159859
 ] 

Ted Yu commented on HBASE-4899:
---

@Chunhui:
Please let us know the testing result in your QA environment.

> Region would be assigned twice easily with continually  killing server and 
> moving region in testing environment
> ---
>
> Key: HBASE-4899
> URL: https://issues.apache.org/jira/browse/HBASE-4899
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: chunhui shen
>Assignee: chunhui shen
> Attachments: hbase-4899.patch
>
>
> Before assigning a region in ServerShutdownHandler#process, it will check 
> whether the region is in RIT;
> however, this check doesn't work as expected in the following case:
> 1.move region A from server B to server C
> 2.kill server B
> 3.start server B immediately
> Let's see what happens in the code for the above case:
> {code}
> for step1:
> 1.1 server B close the region A,
> 1.2 master setOffline for region 
> A,(AssignmentManager#setOffline:this.regions.remove(regionInfo))
> 1.3 server C start to open region A.(Not completed)
> for step3:
> master ServerShutdownHandler#process() for server B
> {
> ..
> splitlog()
> ...
> List regionsInTransition =
> this.services.getAssignmentManager()
> .processServerShutdown(this.serverName);
> ...
> Skip regions that were in transition unless CLOSING or PENDING_CLOSE
> ...
> assign region
> }
> {code}
> In fact, when running 
> ServerShutdownHandler#process()#this.services.getAssignmentManager().processServerShutdown(this.serverName),
>  region A is in RIT (step 1.3 not completed), but the returned List 
> regionsInTransition doesn't contain it, because region A has been removed 
> from AssignmentManager.regions by AssignmentManager#setOffline in step 1.2.
> Therefore, region A will be assigned twice.
> Actually, a server being killed and started twice will also easily cause a 
> region to be assigned twice.
> Excluding the above reason, there is another possibility: 
> when executing ServerShutdownHandler#process()#MetaReader.getServerUserRegions, 
> a region is included which is in RIT at that moment.
> But after MetaReader.getServerUserRegions completes, the region has been 
> opened on another server and is no longer in RIT.
> In our testing environment, where balancing, moving and killing are executed 
> periodically, assigning a region twice often happens, and it is painful 
> because it affects other test cases.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

2011-11-29 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13159858#comment-13159858
 ] 

Ted Yu commented on HBASE-4899:
---

{code}
+  + " because it has been opened in "
+  + addressFromAM.getServerName());
{code}
We should use the value of rit (RegionState) in the above log message instead 
of hard-coding 'opened'.
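
A minimal sketch of what the reworded message could look like, assuming rit 
holds the region's RegionState and reusing addressFromAM from the snippet 
above (hri and LOG are stand-ins for whatever the surrounding code uses):
{code}
LOG.info("Skipping assign for " + hri.getRegionNameAsString()
    + " because it is in state " + rit.getState()
    + " on " + addressFromAM.getServerName());
{code}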

> Region would be assigned twice easily with continually  killing server and 
> moving region in testing environment
> ---
>
> Key: HBASE-4899
> URL: https://issues.apache.org/jira/browse/HBASE-4899
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: chunhui shen
> Attachments: hbase-4899.patch
>
>
> Before assigning a region in ServerShutdownHandler#process, it checks 
> whether the region is in RIT; however, this check doesn't work as expected 
> in the following case:
> 1. move region A from server B to server C
> 2. kill server B
> 3. start server B immediately
> Let's see what happens in the code for the above case:
> {code}
> for step1:
> 1.1 server B close the region A,
> 1.2 master setOffline for region 
> A,(AssignmentManager#setOffline:this.regions.remove(regionInfo))
> 1.3 server C start to open region A.(Not completed)
> for step3:
> master ServerShutdownHandler#process() for server B
> {
> ..
> splitlog()
> ...
> List regionsInTransition =
> this.services.getAssignmentManager()
> .processServerShutdown(this.serverName);
> ...
> Skip regions that were in transition unless CLOSING or PENDING_CLOSE
> ...
> assign region
> }
> {code}
> In fact, when running 
> ServerShutdownHandler#process()#this.services.getAssignmentManager().processServerShutdown(this.serverName),
> region A is in RIT (step 1.3 not completed), but the returned List 
> regionsInTransition doesn't contain it, because region A has been removed 
> from AssignmentManager.regions by AssignmentManager#setOffline in step 1.2.
> Therefore, region A will be assigned twice.
> Actually, a server that is killed and restarted will also easily cause a 
> region to be assigned twice.
> Besides the above reason, there is another possibility: 
> when ServerShutdownHandler#process()#MetaReader.getServerUserRegions is 
> executed, a region that is currently in RIT is included in the result.
> But after MetaReader.getServerUserRegions completes, that region has been 
> opened on another server and is no longer in RIT.
> In our testing environment, where balancing, moving and killing are executed 
> periodically, double assignment of a region happens often, and it is 
> troublesome because it affects other test cases.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4616) Update hregion encoded name to reduce logic and prevent region collisions in META

2011-11-29 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13159761#comment-13159761
 ] 

Ted Yu commented on HBASE-4616:
---

In SplitTransaction.java, getDaughterRegionIdTimestamp() is removed.
I think we should take care of clock skew.

Also, using hri.getRegionId() - 1 as the id for the daughter regions is not 
intuitive.
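
For reference, a minimal sketch (illustrative only, not the exact removed 
code) of the kind of clock-skew guard getDaughterRegionIdTimestamp() provided, 
assuming EnvironmentEdgeManager supplies the current time and LOG is the class 
logger:
{code}
private static long getDaughterRegionIdTimestamp(final HRegionInfo hri) {
  long rid = EnvironmentEdgeManager.currentTimeMillis();
  // The region id doubles as a timestamp; it must not be smaller than the
  // parent's, otherwise the daughters would sort incorrectly in .META.
  if (rid < hri.getRegionId()) {
    LOG.warn("Clock skew: parent region id is " + hri.getRegionId()
        + " but current time here is " + rid);
    rid = hri.getRegionId() + 1;
  }
  return rid;
}
{code}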

> Update hregion encoded name to reduce logic and prevent region collisions in 
> META
> -
>
> Key: HBASE-4616
> URL: https://issues.apache.org/jira/browse/HBASE-4616
> Project: HBase
>  Issue Type: Umbrella
>Reporter: Alex Newman
>Assignee: Alex Newman
> Attachments: HBASE-4616.patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4616) Update hregion encoded name to reduce logic and prevent region collisions in META

2011-11-29 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13159694#comment-13159694
 ] 

Ted Yu commented on HBASE-4616:
---

In HRegionInfo.java:
{code}
  public static final int END_OF_TABLE_TABLE_NAME = END_OF_TABLE_NAME + 1;
{code}
I think END_OF_TABLE_NAME_FOR_EMPTY_ENDKEY would be a better name.

In createRegionName():
{code}
if (id != null || id.length > 0 ) {
{code}
&& should be used above.
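
With ||, a null id would still reach id.length and throw a 
NullPointerException; the intended check is presumably:
{code}
// append the id only when it is present and non-empty
if (id != null && id.length > 0) {
{code}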

> Update hregion encoded name to reduce logic and prevent region collisions in 
> META
> -
>
> Key: HBASE-4616
> URL: https://issues.apache.org/jira/browse/HBASE-4616
> Project: HBase
>  Issue Type: Umbrella
>Reporter: Alex Newman
>Assignee: Alex Newman
> Attachments: HBASE-4616.patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4616) Update hregion encoded name to reduce logic and prevent region collisions in META

2011-11-29 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13159676#comment-13159676
 ] 

Ted Yu commented on HBASE-4616:
---

Where is MetaSearchRow defined?
What's the difference between MetaSearchRow.getStartSearchRow(tableName, null) 
and MetaSearchRow.getStartSearchRow(tableName, HConstants.EMPTY_BYTE_ARRAY)?

> Update hregion encoded name to reduce logic and prevent region collisions in 
> META
> -
>
> Key: HBASE-4616
> URL: https://issues.apache.org/jira/browse/HBASE-4616
> Project: HBase
>  Issue Type: Umbrella
>Reporter: Alex Newman
>Assignee: Alex Newman
> Attachments: HBASE-4616.patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4893) HConnectionImplementation closed-but-not-deleted, need a way to find the state of connection

2011-11-29 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13159613#comment-13159613
 ] 

Ted Yu commented on HBASE-4893:
---

HConnectionManager.deleteStaleConnection() can be utilized in the above 
proposal.
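
A minimal sketch of how a caller could combine the two, assuming the 
isClosed() accessor proposed in the description below is added (conf is a 
hypothetical Configuration already in scope):
{code}
HConnection connection = HConnectionManager.getConnection(conf);
if (connection.isClosed()) {
  // drop the closed-but-not-deleted instance so a fresh one can be created
  HConnectionManager.deleteStaleConnection(connection);
  connection = HConnectionManager.getConnection(conf);
}
{code}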

> HConnectionImplementation closed-but-not-deleted, need a way to find the 
> state of connection
> 
>
> Key: HBASE-4893
> URL: https://issues.apache.org/jira/browse/HBASE-4893
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.1, 0.90.2, 0.90.3, 0.90.4
> Environment: Linux 2.6, HBase-0.90.1
>Reporter: Mubarak Seyed
>  Labels: hbase-client
> Fix For: 0.90.5
>
>
> In abort() of HConnectionManager$HConnectionImplementation, the instance of 
> HConnectionImplementation is marked as closed (this.closed=true).
> There is no way for a client application to check whether the HBase client 
> connection is still open/good (this.closed=false) or not. We need a method 
> such as isClosed() to validate the state of a connection.
> {code}
> public boolean isClosed() {
>   return this.closed;
> }
> {code}
> Once the connection is closed, it should get deleted. The client application 
> still gets a connection from HConnectionManager.getConnection(Configuration) 
> and tries to make an RPC call to the RS; since the connection is already 
> closed, HConnectionImplementation.getRegionServerWithRetries throws 
> RetriesExhaustedException with the error message
> {code}
> Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying 
> to contact region server null for region , row 
> '----xxx', but failed after 10 attempts.
> Exceptions:
> java.io.IOException: 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
>  closed
> java.io.IOException: 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
>  closed
> java.io.IOException: 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
>  closed
> java.io.IOException: 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
>  closed
> java.io.IOException: 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
>  closed
> java.io.IOException: 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
>  closed
> java.io.IOException: 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
>  closed
> java.io.IOException: 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
>  closed
> java.io.IOException: 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
>  closed
> java.io.IOException: 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
>  closed
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1008)
>   at org.apache.hadoop.hbase.client.HTable.get(HTable.java:546)
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4729) Clash between region unassign and splitting kills the master

2011-11-29 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13159426#comment-13159426
 ] 

Ted Yu commented on HBASE-4729:
---

+1 on patch v5.

> Clash between region unassign and splitting kills the master
> 
>
> Key: HBASE-4729
> URL: https://issues.apache.org/jira/browse/HBASE-4729
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Jean-Daniel Cryans
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 0.92.0, 0.94.0
>
> Attachments: 4729-v2.txt, 4729-v3.txt, 4729-v4.txt, 4729-v5.txt, 
> 4729.txt
>
>
> I was running an online alter while regions were splitting, and suddenly the 
> master died and left my table half-altered (haven't restarted the master yet).
> What killed the master:
> {quote}
> 2011-11-02 17:06:44,428 FATAL org.apache.hadoop.hbase.master.HMaster: 
> Unexpected ZK exception creating node CLOSING
> org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = 
> NodeExists for /hbase/unassigned/f7e1783e65ea8d621a4bc96ad310f101
> at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:110)
> at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
> at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637)
> at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:459)
> at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:441)
> at 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:769)
> at 
> org.apache.hadoop.hbase.zookeeper.ZKAssign.createNodeClosing(ZKAssign.java:568)
> at 
> org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1722)
> at 
> org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1661)
> at org.apache.hadoop.hbase.master.BulkReOpen$1.run(BulkReOpen.java:69)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> {quote}
> A znode was created because the region server was splitting the region 4 
> seconds before:
> {quote}
> 2011-11-02 17:06:40,704 INFO 
> org.apache.hadoop.hbase.regionserver.SplitTransaction: Starting split of 
> region TestTable,0012469153,1320253135043.f7e1783e65ea8d621a4bc96ad310f101.
> 2011-11-02 17:06:40,704 DEBUG 
> org.apache.hadoop.hbase.regionserver.SplitTransaction: 
> regionserver:62023-0x132f043bbde0710 Creating ephemeral node for 
> f7e1783e65ea8d621a4bc96ad310f101 in SPLITTING state
> 2011-11-02 17:06:40,751 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
> regionserver:62023-0x132f043bbde0710 Attempting to transition node 
> f7e1783e65ea8d621a4bc96ad310f101 from RS_ZK_REGION_SPLITTING to 
> RS_ZK_REGION_SPLITTING
> ...
> 2011-11-02 17:06:44,061 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
> regionserver:62023-0x132f043bbde0710 Successfully transitioned node 
> f7e1783e65ea8d621a4bc96ad310f101 from RS_ZK_REGION_SPLITTING to 
> RS_ZK_REGION_SPLIT
> 2011-11-02 17:06:44,061 INFO 
> org.apache.hadoop.hbase.regionserver.SplitTransaction: Still waiting on the 
> master to process the split for f7e1783e65ea8d621a4bc96ad310f101
> {quote}
> Now that the master is dead the region server is spewing those last two lines 
> like mad.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



