[jira] [Commented] (HBASE-4993) Performance regression in minicluster creation

2011-12-14 Thread nkeywal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169186#comment-13169186
 ] 

nkeywal commented on HBASE-4993:


HBaseConfiguration#create(Configuration) is expensive by nature; see the comment in the @return tag below.

Also, the two comments do not say the same thing: "Creates a clone" is not the same as reloading the xml files and then merging in the given configuration.

{noformat}
  /**
   * Creates a clone of passed configuration.
   * @param that Configuration to clone.
   * @return a Configuration created with the hbase-*.xml files plus
   * the given configuration.
   */
  public static Configuration create(final Configuration that) {
    Configuration conf = create(); // this loads the xml files
    merge(conf, that); // for every entry in 'that', replace the content of 'conf' by the content of 'that'
    return conf;
  }
{noformat}

To be compared with a clone done via the Configuration copy constructor:
{noformat}
public static Configuration create(final Configuration that) {
  Configuration conf = new Configuration(that);
  return conf;
}
{noformat}

When I replace the former with the latter, the test is three times faster.

We can either:
- remove the function
- replace the former implementation with the one shown above
- keep the function, add a performance warning, and replace the calls to this 
function with calls to the Configuration copy constructor.

I'm going to check the test results with option 2.
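For reference, the gap is easy to see with a trivial micro-benchmark along these lines (the class name and iteration count are illustrative, and absolute timings will vary by machine):

{code}
// Illustrative micro-benchmark only: compares HBaseConfiguration.create(that),
// which re-reads the hbase-*.xml resources and then merges, with the plain
// Configuration copy constructor. Not part of any patch.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class ConfCopyBench {
  public static void main(String[] args) {
    Configuration base = HBaseConfiguration.create();
    int sink = 0; // keep the copies from being optimised away

    long t0 = System.nanoTime();
    for (int i = 0; i < 1000; i++) {
      sink += HBaseConfiguration.create(base).size(); // reloads the xml files each time
    }
    long createMs = (System.nanoTime() - t0) / 1000000;

    long t1 = System.nanoTime();
    for (int i = 0; i < 1000; i++) {
      sink += new Configuration(base).size();         // plain copy, no xml parsing
    }
    long cloneMs = (System.nanoTime() - t1) / 1000000;

    System.out.println("create(that): " + createMs + " ms, new Configuration(that): "
        + cloneMs + " ms (sink=" + sink + ")");
  }
}
{code}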








 Performance regression in minicluster creation
 --

 Key: HBASE-4993
 URL: https://issues.apache.org/jira/browse/HBASE-4993
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 4993.patch, 4993.v3.patch


 Side effect of 4610: the mini cluster needs 4.5 seconds to start

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-5021) Enforce upper bound on timestamp

2011-12-14 Thread Nicolas Spiegelberg (Created) (JIRA)
Enforce upper bound on timestamp


 Key: HBASE-5021
 URL: https://issues.apache.org/jira/browse/HBASE-5021
 Project: HBase
  Issue Type: Improvement
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg
Priority: Critical
 Fix For: 0.94.0


We have been getting hit with performance problems on our time-series database 
due to invalid timestamps being inserted. We are working on adding proper checks 
to the app server, but production performance could be severely impacted, with 
significant recovery time, if something slips past. Since timestamps are 
considered a fundamental part of the HBase schema and multiple optimizations use 
timestamp information, we should allow the option to sanity-check the upper 
bound on the server side in HBase.
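For illustration, a server-side sanity check of this kind could look roughly like the sketch below; the class name, the slop parameter, and how it would be wired into the put path are assumptions, not the actual change:

{code}
// Hypothetical sketch of an upper-bound timestamp check; names and wiring are
// assumptions for illustration only.
import java.io.IOException;

public class TimestampUpperBoundCheck {
  /** Maximum distance, in milliseconds, a cell timestamp may lie in the future. */
  private final long slopMs;

  public TimestampUpperBoundCheck(long slopMs) {
    this.slopMs = slopMs;
  }

  /** Rejects timestamps that are too far beyond the region server clock. */
  public void check(long cellTimestamp) throws IOException {
    long now = System.currentTimeMillis();
    if (cellTimestamp > now + slopMs) {
      throw new IOException("Timestamp " + cellTimestamp
          + " is beyond the allowed upper bound (now=" + now
          + ", slop=" + slopMs + "ms)");
    }
  }
}
{code}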

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5021) Enforce upper bound on timestamp

2011-12-14 Thread Nicolas Spiegelberg (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169242#comment-13169242
 ] 

Nicolas Spiegelberg commented on HBASE-5021:


Note: my original idea was to allow the user to input a specified time range to 
perform min/max checks. This doesn't seem very useful because the lower bound is 
already handled by the TTL option.

 Enforce upper bound on timestamp
 

 Key: HBASE-5021
 URL: https://issues.apache.org/jira/browse/HBASE-5021
 Project: HBase
  Issue Type: Improvement
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg
Priority: Critical
 Fix For: 0.94.0


 We have been getting hit with performance problems on our time-series 
 database due to invalid timestamps being inserted. We are working on adding 
 proper checks to the app server, but production performance could be severely 
 impacted, with significant recovery time, if something slips past. Since 
 timestamps are considered a fundamental part of the HBase schema and multiple 
 optimizations use timestamp information, we should allow the option to 
 sanity-check the upper bound on the server side in HBase.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5021) Enforce upper bound on timestamp

2011-12-14 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-5021:
---

Attachment: D849.1.patch

nspiegelberg requested code review of [jira] [HBase-5021] Enforce upper bound 
on timestamp.
Reviewers: Kannan, Liyin, JIRA

  We have been getting hit with performance problems on the ODS
  side due to invalid timestamps being inserted.  ODS is working on adding
  proper checks to the app server, but production performance could be
  severely impacted, with significant recovery time, if something slips past.
  Therefore, we should also allow the option to check the upper bound in HBase.

  This is the first draft.  It should probably allow per-CF customization.

TEST PLAN
   - mvn test -Dtest=TestHRegion#testPutWithTsTooNew

REVISION DETAIL
  https://reviews.facebook.net/D849

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java


 Enforce upper bound on timestamp
 

 Key: HBASE-5021
 URL: https://issues.apache.org/jira/browse/HBASE-5021
 Project: HBase
  Issue Type: Improvement
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg
Priority: Critical
 Fix For: 0.94.0

 Attachments: D849.1.patch


 We have been getting hit with performance problems on our time-series 
 database due to invalid timestamps being inserted. We are working on adding 
 proper checks to the app server, but production performance could be severely 
 impacted, with significant recovery time, if something slips past. Since 
 timestamps are considered a fundamental part of the HBase schema and multiple 
 optimizations use timestamp information, we should allow the option to 
 sanity-check the upper bound on the server side in HBase.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-5022) Optimize HBaseConfiguration#create

2011-12-14 Thread nkeywal (Created) (JIRA)
Optimize HBaseConfiguration#create
--

 Key: HBASE-5022
 URL: https://issues.apache.org/jira/browse/HBASE-5022
 Project: HBase
  Issue Type: Improvement
  Components: performance
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5022) Optimize HBaseConfiguration#create

2011-12-14 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5022:
---

Attachment: 5022.patch

 Optimize HBaseConfiguration#create
 --

 Key: HBASE-5022
 URL: https://issues.apache.org/jira/browse/HBASE-5022
 Project: HBase
  Issue Type: Improvement
  Components: performance
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5022.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5022) Optimize HBaseConfiguration#create

2011-12-14 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5022:
---

Status: Patch Available  (was: Open)

 Optimize HBaseConfiguration#create
 --

 Key: HBASE-5022
 URL: https://issues.apache.org/jira/browse/HBASE-5022
 Project: HBase
  Issue Type: Improvement
  Components: performance
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5022.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-5023) A thread named LeaseChecker remains after the shutdown of a cluster

2011-12-14 Thread nkeywal (Created) (JIRA)
A thread named LeaseChecker remains after the shutdown of a cluster
---

 Key: HBASE-5023
 URL: https://issues.apache.org/jira/browse/HBASE-5023
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Priority: Minor


If the minicluster is started/stopped multiple times, there is a new 
LeaseChecker thread each time. This thread is not created by HBase but by hdfs.

This is likely to be HDFS-1840, which is solved in 0.23.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5009) Failure of creating split dir if it already exists prevents splits from happening further

2011-12-14 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169264#comment-13169264
 ] 

Anoop Sam John commented on HBASE-5009:
---

When the max wait timeout elapses and we drop the split attempt, we just shut 
down the thread pool that runs the StoreFileSplitter tasks. This does not 
guarantee that the threads have stopped. Don't you think this could be an issue?

After the timeout, the split thread will start the rollback, but some threads it 
had started might still be alive and can do some work afterwards.
Ram - does the split rollback need to ensure these threads are closed as well?
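For reference, the standard ExecutorService idiom for making sure the worker threads are really gone is the one below (the timeout is arbitrary; this is a generic sketch, not the split code itself):

{code}
// Generic java.util.concurrent idiom: shutdown() only stops accepting new
// tasks, so interrupt the workers and wait for them to terminate.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

public final class PoolShutdown {
  private PoolShutdown() {
  }

  public static void shutdownAndWait(ExecutorService pool) throws InterruptedException {
    pool.shutdownNow();                                  // interrupt running tasks
    if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {  // block until they exit
      throw new IllegalStateException("Worker threads did not stop in time");
    }
  }
}
{code}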


 Failure of creating split dir if it already exists prevents splits from 
 happening further
 -

 Key: HBASE-5009
 URL: https://issues.apache.org/jira/browse/HBASE-5009
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan

 The scenario is
 - The split of a region takes a long time
 - The deletion of the splitDir fails due to HDFS problems.
 - Subsequent splits also fail after that.
 {code}
 private static void createSplitDir(final FileSystem fs, final Path splitdir)
     throws IOException {
   if (fs.exists(splitdir)) throw new IOException("Splitdir already exits? " + splitdir);
   if (!fs.mkdirs(splitdir)) throw new IOException("Failed create of " + splitdir);
 }
 {code}
 Correct me if I am wrong. If it is an issue, can we change the behaviour of 
 throwing an exception?
 Please suggest.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5009) Failure of creating split dir if it already exists prevents splits from happening further

2011-12-14 Thread gaojinchao (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169266#comment-13169266
 ] 

gaojinchao commented on HBASE-5009:
---

Yes, that is the root cause. I think we should guarantee that all child threads 
are stopped.
At the same time, since the splitdir is not useful, we can also delete it. It 
seems to do no harm.


 Failure of creating split dir if it already exists prevents splits from 
 happening further
 -

 Key: HBASE-5009
 URL: https://issues.apache.org/jira/browse/HBASE-5009
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan

 The scenario is
 - The split of a region takes a long time
 - The deletion of the splitDir fails due to HDFS problems.
 - Subsequent splits also fail after that.
 {code}
 private static void createSplitDir(final FileSystem fs, final Path splitdir)
     throws IOException {
   if (fs.exists(splitdir)) throw new IOException("Splitdir already exits? " + splitdir);
   if (!fs.mkdirs(splitdir)) throw new IOException("Failed create of " + splitdir);
 }
 {code}
 Correct me if I am wrong. If it is an issue, can we change the behaviour of 
 throwing an exception?
 Please suggest.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-5024) A thread named LruBlockCache.EvictionThread remains after the shutdown of a cluster

2011-12-14 Thread nkeywal (Created) (JIRA)
A thread named LruBlockCache.EvictionThread remains after the shutdown of a 
cluster
---

 Key: HBASE-5024
 URL: https://issues.apache.org/jira/browse/HBASE-5024
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0
Reporter: nkeywal
Priority: Minor


There is no cleanup function in hbase.io.hfile.CacheConfig. The cache is a 
singleton, shared by all clusters if we launch more than one cluster in a test.

Related code is:

{noformat}
  /**
   * Static reference to the block cache, or null if no caching should be used
   * at all.
   */
  private static BlockCache globalBlockCache;

  /** Boolean whether we have disabled the block cache entirely. */
  private static boolean blockCacheDisabled = false;

  /**
   * Returns the block cache or <code>null</code> in case none should be used.
   *
   * @param conf  The current configuration.
   * @return The block cache or <code>null</code>.
   */
  private static synchronized BlockCache instantiateBlockCache() {
    // initiate globalBlockCache
{noformat}
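A test-only cleanup hook along the following lines would let the minicluster reset the singleton between runs; the method name, and the assumption that the cache exposes a shutdown() that stops the eviction thread, are mine, not existing CacheConfig API:

{noformat}
  /**
   * Hypothetical test-only helper: drop the static cache reference and stop
   * its eviction thread so a later cluster starts from a clean state.
   */
  static synchronized void clearGlobalBlockCacheForTesting() {
    if (globalBlockCache instanceof LruBlockCache) {
      ((LruBlockCache) globalBlockCache).shutdown(); // assumed to stop LruBlockCache.EvictionThread
    }
    globalBlockCache = null;
    blockCacheDisabled = false;
  }
{noformat}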




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5022) Optimize HBaseConfiguration#create

2011-12-14 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169286#comment-13169286
 ] 

Hadoop QA commented on HBASE-5022:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12507329/5022.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 javadoc.  The javadoc tool appears to have generated -152 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 75 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.client.TestAdmin
  org.apache.hadoop.hbase.client.TestInstantSchemaChange
  org.apache.hadoop.hbase.master.TestDistributedLogSplitting

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/503//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/503//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/503//console

This message is automatically generated.

 Optimize HBaseConfiguration#create
 --

 Key: HBASE-5022
 URL: https://issues.apache.org/jira/browse/HBASE-5022
 Project: HBase
  Issue Type: Improvement
  Components: performance
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5022.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5022) Optimize HBaseConfiguration#create

2011-12-14 Thread nkeywal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169328#comment-13169328
 ] 

nkeywal commented on HBASE-5022:


The usual "Too many open files".

It just proves that the TestAdmin.testCheckHBaseAvailableClosesConnection 
flakiness is not linked to the loading of the xml config files...

 Optimize HBaseConfiguration#create
 --

 Key: HBASE-5022
 URL: https://issues.apache.org/jira/browse/HBASE-5022
 Project: HBase
  Issue Type: Improvement
  Components: performance
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5022.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-5025) Two threads named pool-1-thread-1 and pool-2-thread-1 remains after the shutdown of a cluster

2011-12-14 Thread nkeywal (Created) (JIRA)
Two threads named pool-1-thread-1 and pool-2-thread-1 remains after the 
shutdown of a cluster
-

 Key: HBASE-5025
 URL: https://issues.apache.org/jira/browse/HBASE-5025
 Project: HBase
  Issue Type: Bug
  Components: ipc, test
Affects Versions: 0.94.0
Reporter: nkeywal
Priority: Minor


There are two issues with these threads:
 - they should have a better name; pool-x-thread-y is the default name given by 
java.util.concurrent.Executors
 - these threads should not survive a minicluster shutdown

They are created by org.apache.hadoop.ipc.Server$Listener (with version 
0.20.205.0; the code was different before), so the first issue is a 
hadoop-common one. It is unclear for the second one; it could be hadoop-common 
as well.
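For reference, the naming half of this is fixed generically with a ThreadFactory; the sketch below is illustrative only, not a hadoop-common patch:

{code}
// Generic sketch: give pool threads a recognisable name (and daemon status)
// via a ThreadFactory instead of the Executors default pool-x-thread-y.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

final class NamedThreads {
  static ExecutorService newFixedThreadPool(int threads, final String prefix) {
    return Executors.newFixedThreadPool(threads, new ThreadFactory() {
      private final AtomicInteger counter = new AtomicInteger();

      @Override
      public Thread newThread(Runnable r) {
        Thread t = new Thread(r, prefix + "-reader-" + counter.incrementAndGet());
        t.setDaemon(true); // do not keep the JVM alive after a minicluster shutdown
        return t;
      }
    });
  }
}
{code}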

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5025) Two threads named pool-1-thread-1 and pool-2-thread-1 remains after the shutdown of a cluster

2011-12-14 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5025:
---

Description: 
There are two issues with these threads:
 - they should have a better name; pool-x-thread-y is the default name given by 
java.util.concurrent.Executors
 - these threads should not survive a minicluster shutdown

They are created by org.apache.hadoop.ipc.Server$Listener (with version 
0.20.205.0; the code was different before), so the first issue is a 
hadoop-common one. It is unclear for the second one; it could be hadoop-common 
as well.

Constructor for org.apache.hadoop.ipc.Server$Listener:
{noformat}
public Listener() throws IOException {
  //...
  readPool = Executors.newFixedThreadPool(readThreads);  // Lacks a ThreadFactory to set the names
  //...
}
{noformat}


Server#stop shuts down the thread pool.



  was:
There are two issues with these threads:
 - there should have a better name.  pool-x-thread-y is the default name given 
by java.util.concurrent.Executors
 - these threads should not survive to a minicluster shutdown

They are created by org.apache.hadoop.ipc.Server$Listener (with version 
0.20.205.0; the code was different before), so the first issue is a hadoop 
common one. It's unclear for the second one, it could be hadoop-common as well. 


 Two threads named pool-1-thread-1 and pool-2-thread-1 remains after the 
 shutdown of a cluster
 -

 Key: HBASE-5025
 URL: https://issues.apache.org/jira/browse/HBASE-5025
 Project: HBase
  Issue Type: Bug
  Components: ipc, test
Affects Versions: 0.94.0
Reporter: nkeywal
Priority: Minor

 There are two issues with these threads:
  - they should have a better name; pool-x-thread-y is the default name given 
 by java.util.concurrent.Executors
  - these threads should not survive a minicluster shutdown
 They are created by org.apache.hadoop.ipc.Server$Listener (with version 
 0.20.205.0; the code was different before), so the first issue is a 
 hadoop-common one. It is unclear for the second one; it could be hadoop-common 
 as well.
 Constructor for org.apache.hadoop.ipc.Server$Listener:
 {noformat}
 public Listener() throws IOException {
   //...
   readPool = Executors.newFixedThreadPool(readThreads);  // Lacks a ThreadFactory to set the names
   //...
 }
 {noformat}
 Server#stop shuts down the thread pool.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5022) Optimize HBaseConfiguration#create

2011-12-14 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169352#comment-13169352
 ] 

Zhihong Yu commented on HBASE-5022:
---

+1 on patch. 

 Optimize HBaseConfiguration#create
 --

 Key: HBASE-5022
 URL: https://issues.apache.org/jira/browse/HBASE-5022
 Project: HBase
  Issue Type: Improvement
  Components: performance
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5022.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-5026) Add coprocessor hook to HRegionServer.ScannerListener.leaseExpired()

2011-12-14 Thread Liu Jia (Created) (JIRA)
Add coprocessor hook to HRegionServer.ScannerListener.leaseExpired()


 Key: HBASE-5026
 URL: https://issues.apache.org/jira/browse/HBASE-5026
 Project: HBase
  Issue Type: Bug
  Components: coprocessors
Affects Versions: 0.92.0
Reporter: Liu Jia
Assignee: Liu Jia
 Fix For: 0.92.0


The RegionObserver's preScannerClose() and postScannerClose()
methods should cover the scanner leaseExpired() situation. 
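A rough sketch of the idea follows; the surrounding HRegionServer fields and helpers used here (scanners, scannerName, getRegion, LOG) are assumptions for illustration, and this is not the attached patch:

{code}
// Sketch only: route the lease-expired close through the same coprocessor
// hooks as a normal scanner close.
public void leaseExpired() {
  RegionScanner s = scanners.remove(this.scannerName);
  if (s == null) {
    return;
  }
  try {
    HRegion region = getRegion(s.getRegionInfo().getRegionName());
    if (region != null && region.getCoprocessorHost() != null) {
      region.getCoprocessorHost().preScannerClose(s);   // new: pre-close hook
    }
    s.close();
    if (region != null && region.getCoprocessorHost() != null) {
      region.getCoprocessorHost().postScannerClose(s);  // new: post-close hook
    }
  } catch (IOException e) {
    LOG.error("Closing scanner for expired lease failed", e);
  }
}
{code}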

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5026) Add coprocessor hook to HRegionServer.ScannerListener.leaseExpired()

2011-12-14 Thread Liu Jia (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liu Jia updated HBASE-5026:
---

Attachment: RegionObserverLeaseExpired.patch

 Add coprocessor hook to HRegionServer.ScannerListener.leaseExpired()
 

 Key: HBASE-5026
 URL: https://issues.apache.org/jira/browse/HBASE-5026
 Project: HBase
  Issue Type: Bug
  Components: coprocessors
Affects Versions: 0.92.0
Reporter: Liu Jia
Assignee: Liu Jia
 Fix For: 0.92.0

 Attachments: RegionObserverLeaseExpired.patch


 The RegionObserver's preScannerClose() and postScannerClose()
 methods should cover the scanner leaseExpired() situation. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5026) Add coprocessor hook to HRegionServer.ScannerListener.leaseExpired()

2011-12-14 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169490#comment-13169490
 ] 

jirapos...@reviews.apache.org commented on HBASE-5026:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3197/
---

Review request for hbase.


Summary
---

Add a coprocessor hook to HRegionServer.ScannerListener.leaseExpired(): the 
RegionObserver's preScannerClose() and postScannerClose() methods should cover 
the scanner leaseExpired() situation. The following code is provided by Ted.


This addresses bug HBASE-5026.
https://issues.apache.org/jira/browse/HBASE-5026


Diffs
-

  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
 1213130 

Diff: https://reviews.apache.org/r/3197/diff


Testing
---

Test done.


Thanks,

Jia



 Add coprocessor hook to HRegionServer.ScannerListener.leaseExpired()
 

 Key: HBASE-5026
 URL: https://issues.apache.org/jira/browse/HBASE-5026
 Project: HBase
  Issue Type: Bug
  Components: coprocessors
Affects Versions: 0.92.0
Reporter: Liu Jia
Assignee: Liu Jia
 Fix For: 0.92.0

 Attachments: RegionObserverLeaseExpired.patch


 The RegionObserver's preScannerClose() and postScannerClose()
 methods should cover the scanner leaseExpired() situation. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5026) Add coprocessor hook to HRegionServer.ScannerListener.leaseExpired()

2011-12-14 Thread Liu Jia (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liu Jia updated HBASE-5026:
---

Status: Patch Available  (was: Open)

 Add coprocessor hook to HRegionServer.ScannerListener.leaseExpired()
 

 Key: HBASE-5026
 URL: https://issues.apache.org/jira/browse/HBASE-5026
 Project: HBase
  Issue Type: Bug
  Components: coprocessors
Affects Versions: 0.92.0
Reporter: Liu Jia
Assignee: Liu Jia
 Fix For: 0.92.0

 Attachments: RegionObserverLeaseExpired.patch


 The RegionObserver's preScannerClose() and postScannerClose()
 methods should cover the scanner leaseExpired() situation. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4120) isolation and allocation

2011-12-14 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169507#comment-13169507
 ] 

jirapos...@reviews.apache.org commented on HBASE-4120:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1421/
---

(Updated 2011-12-14 17:12:58.306061)


Review request for hbase.


Changes
---

Modify the two test cases 
org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithAbort and 
org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithRemove.
Enable the table priority function by default.


Summary
---

Patch for table priority alone. In this patch, not only can tables have 
different priorities, but the different actions (get, scan, put and delete) can 
have priorities as well.
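One way to picture the queueing side of this: order pending calls by table priority first and action priority second, roughly as in the sketch below (names are invented for illustration and do not match the classes in the patch):

{code}
// Illustrative only: a queue that serves calls for high-priority tables first,
// then orders by action priority (get/scan/put/delete) within a table level.
import java.util.Comparator;
import java.util.concurrent.PriorityBlockingQueue;

final class PrioritizedCall {
  final int tablePriority;   // lower value = more important table
  final int actionPriority;  // lower value = more important action
  final Runnable call;

  PrioritizedCall(int tablePriority, int actionPriority, Runnable call) {
    this.tablePriority = tablePriority;
    this.actionPriority = actionPriority;
    this.call = call;
  }

  static PriorityBlockingQueue<PrioritizedCall> newQueue() {
    return new PriorityBlockingQueue<PrioritizedCall>(64, new Comparator<PrioritizedCall>() {
      @Override
      public int compare(PrioritizedCall a, PrioritizedCall b) {
        int byTable = Integer.compare(a.tablePriority, b.tablePriority);
        return byTable != 0 ? byTable : Integer.compare(a.actionPriority, b.actionPriority);
      }
    });
  }
}
{code}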


This addresses bug HBase-4120.
https://issues.apache.org/jira/browse/HBase-4120


Diffs (updated)
-

  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java
 1213130 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java
 1213130 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/PriorityFunction.java
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/PriorityHBaseServer.java
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/PriorityJobQueue.java
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/QosRegionObserver.java
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
 1213130 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorExceptionWithAbort.java
 1213130 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorExceptionWithRemove.java
 1213130 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/ipc/TestActionPriority.java
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/ipc/TestPriorityJobQueue.java
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/ipc/TestTablePriority.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/1421/diff


Testing
---

Tested with the test cases in TestCase_For_TablePriority_trunk_v1.patch.
Please apply the patch of HBASE-4181 first; in some circumstances that bug will 
affect the performance of the client.


Thanks,

Jia



 isolation and allocation
 

 Key: HBASE-4120
 URL: https://issues.apache.org/jira/browse/HBASE-4120
 Project: HBase
  Issue Type: New Feature
  Components: master, regionserver
Affects Versions: 0.90.2, 0.90.3, 0.90.4, 0.92.0
Reporter: Liu Jia
Assignee: Liu Jia
 Fix For: 0.94.0

 Attachments: Design_document_for_HBase_isolation_and_allocation.pdf, 
 Design_document_for_HBase_isolation_and_allocation_Revised.pdf, 
 HBase_isolation_and_allocation_user_guide.pdf, 
 Performance_of_Table_priority.pdf, System Structure.jpg, TablePriority.patch, 
 TablePriority_v12.patch, TablePriority_v12.patch, 
 TablePriority_v15_with_coprocessor.patch, TablePriority_v8.patch, 
 TablePriority_v8.patch, TablePriority_v8_for_trunk.patch, 
 TablePrioriy_v9.patch


 The HBase isolation and allocation tool is designed to help users manage 
 cluster resources among different applications and tables.
 When we have a large-scale HBase cluster with many applications running on 
 it, there will be lots of problems. In Taobao there is a cluster for many 
 departments to test their applications' performance; these applications are 
 based on HBase. With one cluster of 12 servers, only one application can run 
 exclusively on it, and many other applications must wait until the previous 
 test has finished.
 After we add the allocation management function to the cluster, applications 
 can share the cluster and run concurrently. Also, if the test engineer wants 
 to make sure there is no interference, he/she can move the other tables out 
 of this group.
 In groups we use table priority to allocate resources: when the system is 
 busy, we can make sure high-priority tables are not affected by 
 lower-priority tables.
 Different groups can have different region server configurations: some groups 
 optimized for reading can have a large block cache size, and others optimized 
 for writing can have a large memstore size.
 

[jira] [Updated] (HBASE-4120) isolation and allocation

2011-12-14 Thread Liu Jia (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liu Jia updated HBASE-4120:
---

Attachment: TablePriority_v16_with_coprocessor.patch

Modify the two test cases 
org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithAbort and 
org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithRemove.
Enable the table priority function by default.

 isolation and allocation
 

 Key: HBASE-4120
 URL: https://issues.apache.org/jira/browse/HBASE-4120
 Project: HBase
  Issue Type: New Feature
  Components: master, regionserver
Affects Versions: 0.90.2, 0.90.3, 0.90.4, 0.92.0
Reporter: Liu Jia
Assignee: Liu Jia
 Fix For: 0.94.0

 Attachments: Design_document_for_HBase_isolation_and_allocation.pdf, 
 Design_document_for_HBase_isolation_and_allocation_Revised.pdf, 
 HBase_isolation_and_allocation_user_guide.pdf, 
 Performance_of_Table_priority.pdf, System Structure.jpg, TablePriority.patch, 
 TablePriority_v12.patch, TablePriority_v12.patch, 
 TablePriority_v15_with_coprocessor.patch, 
 TablePriority_v16_with_coprocessor.patch, TablePriority_v8.patch, 
 TablePriority_v8.patch, TablePriority_v8_for_trunk.patch, 
 TablePrioriy_v9.patch


 The HBase isolation and allocation tool is designed to help users manage 
 cluster resources among different applications and tables.
 When we have a large-scale HBase cluster with many applications running on 
 it, there will be lots of problems. In Taobao there is a cluster for many 
 departments to test their applications' performance; these applications are 
 based on HBase. With one cluster of 12 servers, only one application can run 
 exclusively on it, and many other applications must wait until the previous 
 test has finished.
 After we add the allocation management function to the cluster, applications 
 can share the cluster and run concurrently. Also, if the test engineer wants 
 to make sure there is no interference, he/she can move the other tables out 
 of this group.
 In groups we use table priority to allocate resources: when the system is 
 busy, we can make sure high-priority tables are not affected by 
 lower-priority tables.
 Different groups can have different region server configurations: some groups 
 optimized for reading can have a large block cache size, and others optimized 
 for writing can have a large memstore size.
 Tables and region servers can be moved easily between groups; after changing 
 the configuration, a group can be restarted alone instead of restarting the 
 whole cluster.
 git entry : https://github.com/ICT-Ope/HBase_allocation .
 We hope our work is helpful.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4616) Update hregion encoded name to reduce logic and prevent region collisions in META

2011-12-14 Thread Alex Newman (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Newman updated HBASE-4616:
---

Attachment: HBASE-4616-v2.patch

Renamed the file

 Update hregion encoded name to reduce logic and prevent region collisions in 
 META
 -

 Key: HBASE-4616
 URL: https://issues.apache.org/jira/browse/HBASE-4616
 Project: HBase
  Issue Type: Umbrella
Reporter: Alex Newman
Assignee: Alex Newman
 Attachments: HBASE-4616-v2.patch, HBASE-4616.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4616) Update hregion encoded name to reduce logic and prevent region collisions in META

2011-12-14 Thread Alex Newman (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Newman updated HBASE-4616:
---

Attachment: (was: HBASE-2616-v2.patch)

 Update hregion encoded name to reduce logic and prevent region collisions in 
 META
 -

 Key: HBASE-4616
 URL: https://issues.apache.org/jira/browse/HBASE-4616
 Project: HBase
  Issue Type: Umbrella
Reporter: Alex Newman
Assignee: Alex Newman
 Attachments: HBASE-4616-v2.patch, HBASE-4616.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4616) Update hregion encoded name to reduce logic and prevent region collisions in META

2011-12-14 Thread Alex Newman (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169518#comment-13169518
 ] 

Alex Newman commented on HBASE-4616:


@Zhihong Yu

This is because we haven't taken care of migration yet. Stack promised he would 
help me out. 

@Stack perhaps I could drop over at su or something like that so we could plan 
the migration?

 Update hregion encoded name to reduce logic and prevent region collisions in 
 META
 -

 Key: HBASE-4616
 URL: https://issues.apache.org/jira/browse/HBASE-4616
 Project: HBase
  Issue Type: Umbrella
Reporter: Alex Newman
Assignee: Alex Newman
 Attachments: HBASE-4616-v2.patch, HBASE-4616.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5022) Optimize HBaseConfiguration#create

2011-12-14 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169520#comment-13169520
 ] 

Zhihong Yu commented on HBASE-5022:
---

Integrated to TRUNK.

Thanks for the patch N.

 Optimize HBaseConfiguration#create
 --

 Key: HBASE-5022
 URL: https://issues.apache.org/jira/browse/HBASE-5022
 Project: HBase
  Issue Type: Improvement
  Components: performance
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5022.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4120) isolation and allocation

2011-12-14 Thread Liu Jia (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liu Jia updated HBASE-4120:
---

Attachment: Simple_YCSB_Tests_For_TablePriority_Trunk_and_0.90.4.pdf

 isolation and allocation
 

 Key: HBASE-4120
 URL: https://issues.apache.org/jira/browse/HBASE-4120
 Project: HBase
  Issue Type: New Feature
  Components: master, regionserver
Affects Versions: 0.90.2, 0.90.3, 0.90.4, 0.92.0
Reporter: Liu Jia
Assignee: Liu Jia
 Fix For: 0.94.0

 Attachments: Design_document_for_HBase_isolation_and_allocation.pdf, 
 Design_document_for_HBase_isolation_and_allocation_Revised.pdf, 
 HBase_isolation_and_allocation_user_guide.pdf, 
 Performance_of_Table_priority.pdf, 
 Simple_YCSB_Tests_For_TablePriority_Trunk_and_0.90.4.pdf, System 
 Structure.jpg, TablePriority.patch, TablePriority_v12.patch, 
 TablePriority_v12.patch, TablePriority_v15_with_coprocessor.patch, 
 TablePriority_v16_with_coprocessor.patch, TablePriority_v8.patch, 
 TablePriority_v8.patch, TablePriority_v8_for_trunk.patch, 
 TablePrioriy_v9.patch


 The HBase isolation and allocation tool is designed to help users manage 
 cluster resources among different applications and tables.
 When we have a large-scale HBase cluster with many applications running on 
 it, there will be lots of problems. In Taobao there is a cluster for many 
 departments to test their applications' performance; these applications are 
 based on HBase. With one cluster of 12 servers, only one application can run 
 exclusively on it, and many other applications must wait until the previous 
 test has finished.
 After we add the allocation management function to the cluster, applications 
 can share the cluster and run concurrently. Also, if the test engineer wants 
 to make sure there is no interference, he/she can move the other tables out 
 of this group.
 In groups we use table priority to allocate resources: when the system is 
 busy, we can make sure high-priority tables are not affected by 
 lower-priority tables.
 Different groups can have different region server configurations: some groups 
 optimized for reading can have a large block cache size, and others optimized 
 for writing can have a large memstore size.
 Tables and region servers can be moved easily between groups; after changing 
 the configuration, a group can be restarted alone instead of restarting the 
 whole cluster.
 git entry : https://github.com/ICT-Ope/HBase_allocation .
 We hope our work is helpful.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4993) Performance regression in minicluster creation

2011-12-14 Thread nkeywal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169526#comment-13169526
 ] 

nkeywal commented on HBASE-4993:


@ted, stack:
This can be committed imho; it seems we all agree that the TestAdmin issues are 
not linked to this patch.

 Performance regression in minicluster creation
 --

 Key: HBASE-4993
 URL: https://issues.apache.org/jira/browse/HBASE-4993
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 4993.patch, 4993.v3.patch


 Side effect of 4610: the mini cluster needs 4.5 seconds to start

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4605) Constraints

2011-12-14 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-4605:
--

Attachment: 4605-v6.txt

Patch that adds category to TestHTableDescriptor

I plan to integrate v6 if there is no objection from Gary.

 Constraints
 ---

 Key: HBASE-4605
 URL: https://issues.apache.org/jira/browse/HBASE-4605
 Project: HBase
  Issue Type: Improvement
  Components: client, coprocessors
Affects Versions: 0.94.0
Reporter: Jesse Yates
Assignee: Jesse Yates
 Attachments: 4605-v6.txt, 4605.v7, constraint_as_cp.txt, 
 java_Constraint_v2.patch, java_HBASE-4605_v1.patch, java_HBASE-4605_v2.patch, 
 java_HBASE-4605_v3.patch, java_HBASE-4605_v5.patch


 From Jesse's comment on dev:
 {quote}
 What I would like to propose is a simple interface that people can use to 
 implement a 'constraint' (matching the classic database definition). This 
 would help ease of adoption by helping HBase more easily check that box, help 
 minimize code duplication across organizations, and lead to easier adoption.
 Essentially, people would implement a 'Constraint' interface for checking 
 keys before they are put into a table. Puts that are valid get written to the 
 table, but if not an exception will be thrown that gets propagated 
 back to the client explaining why the put was invalid.
 Constraints would be set on a per-table basis and the user would be expected 
 to ensure the jars containing the constraint are present on the machines 
 serving that table.
 Yes, people could roll their own mechanism for doing this via coprocessors 
 each time, but this would make it easier to do so, so you only have to 
 implement a very minimal interface and not worry about the specifics.
 {quote}
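A minimal reading of that proposal could look like the sketch below; the interface shape and the example check are assumptions for illustration, not the attached patches:

{code}
// Minimal sketch: a per-table check applied to each Put before it is written;
// an invalid Put raises an exception that propagates back to the client.
import java.io.IOException;
import org.apache.hadoop.hbase.client.Put;

interface Constraint {
  /** Throws if the Put violates the constraint; otherwise the Put proceeds. */
  void check(Put put) throws IOException;
}

class NonEmptyRowConstraint implements Constraint {
  @Override
  public void check(Put put) throws IOException {
    if (put.getRow() == null || put.getRow().length == 0) {
      throw new IOException("Constraint violated: empty row key");
    }
  }
}
{code}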

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4938) Create a HRegion.getScanner public method that allows reading from a specified readPoint

2011-12-14 Thread dhruba borthakur (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169564#comment-13169564
 ] 

dhruba borthakur commented on HBASE-4938:
-

Hi Todd, can you please take a peek at the latest patch that I uploaded?

 Create a HRegion.getScanner public method that allows reading from a 
 specified readPoint
 

 Key: HBASE-4938
 URL: https://issues.apache.org/jira/browse/HBASE-4938
 Project: HBase
  Issue Type: Improvement
Reporter: dhruba borthakur
Assignee: dhruba borthakur
Priority: Minor

 There is an existing API, HRegion.getScanner(Scan), that allows scanning a 
 table. My proposal is to extend it to HRegion.getScanner(Scan, long readPoint).
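In client terms the proposal is just an extra overload; a usage sketch (the region handle and the readPoint value are placeholders, and only the (Scan, long) signature is what the issue proposes):

{code}
// Usage sketch only. 'region' is an HRegion obtained elsewhere and 42L stands
// in for a previously captured MVCC read point.
Scan scan = new Scan();
RegionScanner latest   = region.getScanner(scan);        // existing API
RegionScanner snapshot = region.getScanner(scan, 42L);   // proposed overload
{code}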

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4993) Performance regression in minicluster creation

2011-12-14 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169565#comment-13169565
 ] 

Zhihong Yu commented on HBASE-4993:
---

What about Stack's proposed changes @ 14/Dec/11 06:42 ?

 Performance regression in minicluster creation
 --

 Key: HBASE-4993
 URL: https://issues.apache.org/jira/browse/HBASE-4993
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 4993.patch, 4993.v3.patch


 Side effect of 4610: the mini cluster needs 4.5 seconds to start

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4605) Constraints

2011-12-14 Thread Jesse Yates (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169570#comment-13169570
 ] 

Jesse Yates commented on HBASE-4605:


I'd like to get the comments updated as per the latest comments on RB. 
Otherwise, +1

 Constraints
 ---

 Key: HBASE-4605
 URL: https://issues.apache.org/jira/browse/HBASE-4605
 Project: HBase
  Issue Type: Improvement
  Components: client, coprocessors
Affects Versions: 0.94.0
Reporter: Jesse Yates
Assignee: Jesse Yates
 Attachments: 4605-v6.txt, 4605.v7, constraint_as_cp.txt, 
 java_Constraint_v2.patch, java_HBASE-4605_v1.patch, java_HBASE-4605_v2.patch, 
 java_HBASE-4605_v3.patch, java_HBASE-4605_v5.patch


 From Jesse's comment on dev:
 {quote}
 What I would like to propose is a simple interface that people can use to 
 implement a 'constraint' (matching the classic database definition). This 
 would help ease of adoption by helping HBase more easily check that box, help 
 minimize code duplication across organizations, and lead to easier adoption.
 Essentially, people would implement a 'Constraint' interface for checking 
 keys before they are put into a table. Puts that are valid get written to the 
 table, but if not an exception will be thrown that gets propagated 
 back to the client explaining why the put was invalid.
 Constraints would be set on a per-table basis and the user would be expected 
 to ensure the jars containing the constraint are present on the machines 
 serving that table.
 Yes, people could roll their own mechanism for doing this via coprocessors 
 each time, but this would make it easier to do so, so you only have to 
 implement a very minimal interface and not worry about the specifics.
 {quote}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4993) Performance regression in minicluster creation

2011-12-14 Thread nkeywal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169572#comment-13169572
 ] 

nkeywal commented on HBASE-4993:


It's not really related to the patch, but to the TestAdmin flakiness, so it can 
be done separately. Note that I still have random failures with 5022; I am not 
sure that removing the clone will change much. I will try.

 Performance regression in minicluster creation
 --

 Key: HBASE-4993
 URL: https://issues.apache.org/jira/browse/HBASE-4993
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 4993.patch, 4993.v3.patch


 Side effect of 4610: the mini cluster needs 4.5 seconds to start

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5026) Add coprocessor hook to HRegionServer.ScannerListener.leaseExpired()

2011-12-14 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169573#comment-13169573
 ] 

Hadoop QA commented on HBASE-5026:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12507376/RegionObserverLeaseExpired.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 javadoc.  The javadoc tool appears to have generated -152 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 75 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.client.TestAdmin

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/504//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/504//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/504//console

This message is automatically generated.

 Add coprocessor hook to HRegionServer.ScannerListener.leaseExpired()
 

 Key: HBASE-5026
 URL: https://issues.apache.org/jira/browse/HBASE-5026
 Project: HBase
  Issue Type: Bug
  Components: coprocessors
Affects Versions: 0.92.0
Reporter: Liu Jia
Assignee: Liu Jia
 Fix For: 0.92.0

 Attachments: RegionObserverLeaseExpired.patch


 The RegionObserver's preScannerClose() and postScannerClose()
 methods should cover the scanner leaseExpired() situation. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4120) isolation and allocation

2011-12-14 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169577#comment-13169577
 ] 

Hadoop QA commented on HBASE-4120:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12507382/Simple_YCSB_Tests_For_TablePriority_Trunk_and_0.90.4.pdf
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/507//console

This message is automatically generated.

 isolation and allocation
 

 Key: HBASE-4120
 URL: https://issues.apache.org/jira/browse/HBASE-4120
 Project: HBase
  Issue Type: New Feature
  Components: master, regionserver
Affects Versions: 0.90.2, 0.90.3, 0.90.4, 0.92.0
Reporter: Liu Jia
Assignee: Liu Jia
 Fix For: 0.94.0

 Attachments: Design_document_for_HBase_isolation_and_allocation.pdf, 
 Design_document_for_HBase_isolation_and_allocation_Revised.pdf, 
 HBase_isolation_and_allocation_user_guide.pdf, 
 Performance_of_Table_priority.pdf, 
 Simple_YCSB_Tests_For_TablePriority_Trunk_and_0.90.4.pdf, System 
 Structure.jpg, TablePriority.patch, TablePriority_v12.patch, 
 TablePriority_v12.patch, TablePriority_v15_with_coprocessor.patch, 
 TablePriority_v16_with_coprocessor.patch, TablePriority_v8.patch, 
 TablePriority_v8.patch, TablePriority_v8_for_trunk.patch, 
 TablePrioriy_v9.patch


 The HBase isolation and allocation tool is designed to help users manage 
 cluster resources among different applications and tables.
 When we have a large-scale HBase cluster with many applications running on 
 it, there will be lots of problems. In Taobao there is a cluster for many 
 departments to test their applications' performance; these applications are 
 based on HBase. With one cluster of 12 servers, only one application can run 
 exclusively on it, and many other applications must wait until the previous 
 test has finished.
 After we add the allocation management function to the cluster, applications 
 can share the cluster and run concurrently. Also, if the test engineer wants 
 to make sure there is no interference, he/she can move the other tables out 
 of this group.
 In groups we use table priority to allocate resources: when the system is 
 busy, we can make sure high-priority tables are not affected by 
 lower-priority tables.
 Different groups can have different region server configurations: some groups 
 optimized for reading can have a large block cache size, and others optimized 
 for writing can have a large memstore size.
 Tables and region servers can be moved easily between groups; after changing 
 the configuration, a group can be restarted alone instead of restarting the 
 whole cluster.
 git entry : https://github.com/ICT-Ope/HBase_allocation .
 We hope our work is helpful.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-5027) HConnection.create(final Connection conf) does not clone, it creates a new Configuration reading *.xmls and then does a merge.

2011-12-14 Thread stack (Created) (JIRA)
HConnection.create(final Connection conf) does not clone, it creates a new 
Configuration reading *.xmls and then does a merge.
--

 Key: HBASE-5027
 URL: https://issues.apache.org/jira/browse/HBASE-5027
 Project: HBase
  Issue Type: Bug
Reporter: stack


It's more expensive than it should be; it's causing TestAdmin to fail after 
HBASE-4417 went in.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5027) HConnection.create(final Connection conf) does not clone, it creates a new Configuration reading *.xmls and then does a merge.

2011-12-14 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169583#comment-13169583
 ] 

stack commented on HBASE-5027:
--

See tail of hbase-4993 for some discussion.

 HConnection.create(final Connection conf) does not clone, it creates a new 
 Configuration reading *.xmls and then does a merge.
 --

 Key: HBASE-5027
 URL: https://issues.apache.org/jira/browse/HBASE-5027
 Project: HBase
  Issue Type: Bug
Reporter: stack

 It's more expensive than it should be; it's causing TestAdmin to fail after 
 HBASE-4417 went in.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4993) Performance regression in minicluster creation

2011-12-14 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169584#comment-13169584
 ] 

stack commented on HBASE-4993:
--

So yeah, HBaseConfiguration#create(Configuration) should be deprecated?  And 
replaced by a clone method that does return new Configuration(conf)?  Will a 
cloned Configuration have the same identity -- connection key -- when it comes 
to connections (see HConnectionManager#getConnection)?  Do we want that?  Should 
it have a new identity?  I opened hbase-5027 to address this.
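For concreteness, the identity question can be phrased as the sketch below, 
using only the existing public API (illustrative, not part of any patch):

{code}
Configuration original = HBaseConfiguration.create();
Configuration clone = new Configuration(original);   // plain Hadoop copy-constructor
HConnection c1 = HConnectionManager.getConnection(original);
HConnection c2 = HConnectionManager.getConnection(clone);
// If the connection key is derived from configuration content, c1 == c2;
// if it is derived from the Configuration instance itself, they will differ.
boolean sameConnection = (c1 == c2);
{code}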

Meantime let me apply this patch.

 Performance regression in minicluster creation
 --

 Key: HBASE-4993
 URL: https://issues.apache.org/jira/browse/HBASE-4993
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 4993.patch, 4993.v3.patch


 Side effect of 4610: the mini cluster needs 4,5 seconds to start

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4993) Performance regression in minicluster creation

2011-12-14 Thread nkeywal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169585#comment-13169585
 ] 

nkeywal commented on HBASE-4993:


Tests done:
- it's much, much faster if we remove the clone, even after hbase-5022 (3 times 
faster)
- the time is now constant, while I see a lot of variation when I keep the 
clone
- it still fails sometimes (1 out of 5).

 Performance regression in minicluster creation
 --

 Key: HBASE-4993
 URL: https://issues.apache.org/jira/browse/HBASE-4993
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 4993.patch, 4993.v3.patch


 Side effect of 4610: the mini cluster needs 4,5 seconds to start

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5027) HConnection.create(final Connection conf) does not clone, it creates a new Configuration reading *.xmls and then does a merge.

2011-12-14 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169587#comment-13169587
 ] 

Zhihong Yu commented on HBASE-5027:
---

I think the cloned Configuration should have the same identity as the original 
one in terms of ConnectionKey.

 HConnection.create(final Connection conf) does not clone, it creates a new 
 Configuration reading *.xmls and then does a merge.
 --

 Key: HBASE-5027
 URL: https://issues.apache.org/jira/browse/HBASE-5027
 Project: HBase
  Issue Type: Bug
Reporter: stack

 It's more expensive than it should be; it's causing TestAdmin to fail after 
 HBASE-4417 went in.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-5028) [book] book.xml - adding information on region assignment and file locality

2011-12-14 Thread Doug Meil (Created) (JIRA)
[book] book.xml - adding information on region assignment and file locality
---

 Key: HBASE-5028
 URL: https://issues.apache.org/jira/browse/HBASE-5028
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor
 Attachments: book_hbase_5028.xml.patch

book.xml
* adding info in Architecture chapter on region assignment 
* adding info in Architecture chapter on region-RS locality
* adding 2 more links in other info appendix. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5028) [book] book.xml - adding information on region assignment and file locality

2011-12-14 Thread Doug Meil (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-5028:
-

Attachment: book_hbase_5028.xml.patch

 [book] book.xml - adding information on region assignment and file locality
 ---

 Key: HBASE-5028
 URL: https://issues.apache.org/jira/browse/HBASE-5028
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor
 Attachments: book_hbase_5028.xml.patch


 book.xml
 * adding info in Architecture chapter on region assignment 
 * adding info in Architecture chapter on region-RS locality
 * adding 2 more links in other info appendix. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5027) HConnection.create(final Connection conf) does not clone, it creates a new Configuration reading *.xmls and then does a merge.

2011-12-14 Thread nkeywal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169590#comment-13169590
 ] 

nkeywal commented on HBASE-5027:


In hbase-5022, I replaced the reload implementation with a simple clone. But 
the comparison with a full removal of the clone in HBaseAdmin shows that the 
clone is still an expensive operation. I believe it triggers some GC as well.

TestAdmin fails sometimes even without the clone (maybe less often? I don't 
know, but it still fails for sure).


 HConnection.create(final Connection conf) does not clone, it creates a new 
 Configuration reading *.xmls and then does a merge.
 --

 Key: HBASE-5027
 URL: https://issues.apache.org/jira/browse/HBASE-5027
 Project: HBase
  Issue Type: Bug
Reporter: stack

 It's more expensive than it should be; it's causing TestAdmin to fail after 
 HBASE-4417 went in.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5026) Add coprocessor hook to HRegionServer.ScannerListener.leaseExpired()

2011-12-14 Thread Andrew Purtell (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169596#comment-13169596
 ] 

Andrew Purtell commented on HBASE-5026:
---

+1

Any case where the CP framework misses a scanner close is a bug.
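For reference, a rough sketch of where such hooks might be invoked from the 
lease-expired path (illustrative only -- the variable and method names below 
are assumptions, not the attached patch):

{code}
// Inside HRegionServer.ScannerListener.leaseExpired(), once the expired
// scanner and its region have been looked up:
RegionCoprocessorHost host = region.getCoprocessorHost();
if (host != null && host.preScannerClose(scanner)) {
  return;               // a coprocessor bypassed the default close
}
scanner.close();        // existing cleanup of the expired scanner
if (host != null) {
  host.postScannerClose(scanner);
}
{code}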

 Add coprocessor hook to HRegionServer.ScannerListener.leaseExpired()
 

 Key: HBASE-5026
 URL: https://issues.apache.org/jira/browse/HBASE-5026
 Project: HBase
  Issue Type: Bug
  Components: coprocessors
Affects Versions: 0.92.0
Reporter: Liu Jia
Assignee: Liu Jia
 Fix For: 0.92.0

 Attachments: RegionObserverLeaseExpired.patch


 The RegionObserver's preScannerClose() and postScannerClose()
 methods should cover the scanner leaseExpired() situation. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4120) isolation and allocation

2011-12-14 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169600#comment-13169600
 ] 

Hadoop QA commented on HBASE-4120:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12507380/TablePriority_v16_with_coprocessor.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 15 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -128 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 93 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.client.TestAdmin
  org.apache.hadoop.hbase.client.TestInstantSchemaChange
  org.apache.hadoop.hbase.coprocessor.TestClassLoading

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/505//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/505//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/505//console

This message is automatically generated.

 isolation and allocation
 

 Key: HBASE-4120
 URL: https://issues.apache.org/jira/browse/HBASE-4120
 Project: HBase
  Issue Type: New Feature
  Components: master, regionserver
Affects Versions: 0.90.2, 0.90.3, 0.90.4, 0.92.0
Reporter: Liu Jia
Assignee: Liu Jia
 Fix For: 0.94.0

 Attachments: Design_document_for_HBase_isolation_and_allocation.pdf, 
 Design_document_for_HBase_isolation_and_allocation_Revised.pdf, 
 HBase_isolation_and_allocation_user_guide.pdf, 
 Performance_of_Table_priority.pdf, 
 Simple_YCSB_Tests_For_TablePriority_Trunk_and_0.90.4.pdf, System 
 Structure.jpg, TablePriority.patch, TablePriority_v12.patch, 
 TablePriority_v12.patch, TablePriority_v15_with_coprocessor.patch, 
 TablePriority_v16_with_coprocessor.patch, TablePriority_v8.patch, 
 TablePriority_v8.patch, TablePriority_v8_for_trunk.patch, 
 TablePrioriy_v9.patch


 The HBase isolation and allocation tool is designed to help users manage 
 cluster resources among different applications and tables.
 When we have a large-scale HBase cluster with many applications running on 
 it, there will be lots of problems. At Taobao there is a cluster used by many 
 departments to test the performance of their applications, which are based 
 on HBase. With one cluster of 12 servers, only one application can run on it 
 exclusively at a time, and all other applications must wait until the 
 previous test has finished.
 After we add the allocation management function to the cluster, applications 
 can share the cluster and run concurrently. Also, if a test engineer wants to 
 make sure there is no interference, he/she can move the other tables out of 
 the group.
 Within a group we use table priority to allocate resources: when the system 
 is busy, we can make sure high-priority tables are not affected by 
 lower-priority tables.
 Different groups can have different region server configurations; some 
 groups optimized for reading can have a large block cache size, and others 
 optimized for writing can have a large memstore size. 
 Tables and region servers can be moved easily between groups; after changing 
 the configuration, a group can be restarted alone instead of restarting the 
 whole cluster.
 git entry: https://github.com/ICT-Ope/HBase_allocation
 We hope our work is helpful.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)

2011-12-14 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169613#comment-13169613
 ] 

Phabricator commented on HBASE-4218:


Kannan has commented on the revision [jira] [HBASE-4218] Delta encoding for 
keys in HFile.

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:93 I forget how 
there ended up being 3 options here. Jacek would have more context here. But I 
am guessing maybe there should just be 2 options:

  a) What delta encoding algo is to be used for a CF?

  b) Whether the encoding is to be in-memory only or on-disk also? [This is 
primarily a testing mode/dev-time option, where one can experiment with 
different delta encoders without touching on-disk format or risking corrupting 
on disk data. So most folks should not even have to worry about this option.]
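For discussion, the two options might surface on HColumnDescriptor roughly as 
follows (the attribute names here are guesses for illustration, not the 
committed API):

{code}
// option (a): which delta-encoding algorithm the column family uses
hcd.setValue("DATA_BLOCK_ENCODING", "PREFIX");
// option (b): keep the encoding in the block cache only (testing/dev mode),
// leaving the on-disk format untouched
hcd.setValue("ENCODE_ON_DISK", "false");
{code}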

REVISION DETAIL
  https://reviews.facebook.net/D447


 Delta Encoding of KeyValues  (aka prefix compression)
 -

 Key: HBASE-4218
 URL: https://issues.apache.org/jira/browse/HBASE-4218
 Project: HBase
  Issue Type: Improvement
  Components: io
Affects Versions: 0.94.0
Reporter: Jacek Migdal
Assignee: Mikhail Bautin
  Labels: compression
 Attachments: 0001-Delta-encoding-fixed-encoded-scanners.patch, 
 D447.1.patch, D447.2.patch, D447.3.patch, D447.4.patch, D447.5.patch, 
 D447.6.patch, D447.7.patch, D447.8.patch, 
 Delta_encoding_with_memstore_TS.patch, open-source.diff


 A compression for keys. Keys are sorted in an HFile and they are usually very 
 similar, so it is possible to design better compression than general-purpose 
 algorithms provide.
 It is an additional step designed to be used in memory. It aims to save 
 memory in the cache as well as to speed up seeks within HFileBlocks. It 
 should improve performance a lot if key lengths are larger than value 
 lengths; for example, it makes a lot of sense to use it when the value is a 
 counter.
 Initial tests on real data (key length ~90 bytes, value length 8 bytes) show 
 that I could achieve a decent level of compression:
  key compression ratio: 92%
  total compression ratio: 85%
  LZO on the same data: 85%
  LZO after delta encoding: 91%
 while having much better performance (20-80% faster decompression than LZO). 
 Moreover, it should allow far more efficient seeking, which should improve 
 performance a bit.
 It seems that simple compression algorithms are good enough. Most of the 
 savings are due to prefix compression, int128 encoding, timestamp diffs and 
 bitfields to avoid duplication. That way, comparisons of compressed data can 
 be much faster than a byte comparator (thanks to prefix compression and 
 bitfields).
 In order to implement it in HBase, two important changes in design will be 
 needed:
 - solidify the interface to the HFileBlock / HFileReader scanner to provide 
 seeking and iterating; access to the uncompressed buffer in HFileBlock will 
 have bad performance
 - extend comparators to support comparison assuming that the first N bytes 
 are equal (or some fields are equal)
 Link to a discussion about something similar:
 http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windowssubj=Re+prefix+compression

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4993) Performance regression in minicluster creation

2011-12-14 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4993:
-

   Resolution: Fixed
Fix Version/s: 0.94.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to TRUNK.  Thanks for patch Nicolas.

 Performance regression in minicluster creation
 --

 Key: HBASE-4993
 URL: https://issues.apache.org/jira/browse/HBASE-4993
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.94.0

 Attachments: 4993.patch, 4993.v3.patch


 Side effect of 4610: the mini cluster needs 4,5 seconds to start

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)

2011-12-14 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169621#comment-13169621
 ] 

Phabricator commented on HBASE-4218:


tedyu has commented on the revision [jira] [HBASE-4218] Delta encoding for 
keys in HFile.

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/KeyValue.java:2036 I think 
SamePrefixComparator should carry byte[] as type parameter.
  src/main/java/org/apache/hadoop/hbase/KeyValue.java:2020 How about 'avoids 
redundant comparisons for better performance' ?
  src/test/java/org/apache/hadoop/hbase/util/TestByteBufferUtils.java:35 
Missing test category.
  
src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestBufferedDeltaEncoder.java:34
 Missing test category.
  
src/test/java/org/apache/hadoop/hbase/io/deltaencoder/TestDeltaEncoders.java:47 
Missing test category.
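For readers following along, a rough sketch of the prefix-aware comparator 
interface being discussed (the signature is approximate and may differ from 
the patch):

{code}
// Compare two keys when the first 'commonPrefix' bytes are already known to
// be equal, so the comparison can skip them.
public interface SamePrefixComparator<T> {
  int compareIgnoringPrefix(int commonPrefix,
      T left, int leftOffset, int leftLength,
      T right, int rightOffset, int rightLength);
}
{code}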

REVISION DETAIL
  https://reviews.facebook.net/D447


 Delta Encoding of KeyValues  (aka prefix compression)
 -

 Key: HBASE-4218
 URL: https://issues.apache.org/jira/browse/HBASE-4218
 Project: HBase
  Issue Type: Improvement
  Components: io
Affects Versions: 0.94.0
Reporter: Jacek Migdal
Assignee: Mikhail Bautin
  Labels: compression
 Attachments: 0001-Delta-encoding-fixed-encoded-scanners.patch, 
 D447.1.patch, D447.2.patch, D447.3.patch, D447.4.patch, D447.5.patch, 
 D447.6.patch, D447.7.patch, D447.8.patch, 
 Delta_encoding_with_memstore_TS.patch, open-source.diff


 A compression for keys. Keys are sorted in an HFile and they are usually very 
 similar, so it is possible to design better compression than general-purpose 
 algorithms provide.
 It is an additional step designed to be used in memory. It aims to save 
 memory in the cache as well as to speed up seeks within HFileBlocks. It 
 should improve performance a lot if key lengths are larger than value 
 lengths; for example, it makes a lot of sense to use it when the value is a 
 counter.
 Initial tests on real data (key length ~90 bytes, value length 8 bytes) show 
 that I could achieve a decent level of compression:
  key compression ratio: 92%
  total compression ratio: 85%
  LZO on the same data: 85%
  LZO after delta encoding: 91%
 while having much better performance (20-80% faster decompression than LZO). 
 Moreover, it should allow far more efficient seeking, which should improve 
 performance a bit.
 It seems that simple compression algorithms are good enough. Most of the 
 savings are due to prefix compression, int128 encoding, timestamp diffs and 
 bitfields to avoid duplication. That way, comparisons of compressed data can 
 be much faster than a byte comparator (thanks to prefix compression and 
 bitfields).
 In order to implement it in HBase, two important changes in design will be 
 needed:
 - solidify the interface to the HFileBlock / HFileReader scanner to provide 
 seeking and iterating; access to the uncompressed buffer in HFileBlock will 
 have bad performance
 - extend comparators to support comparison assuming that the first N bytes 
 are equal (or some fields are equal)
 Link to a discussion about something similar:
 http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windowssubj=Re+prefix+compression

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5026) Add coprocessor hook to HRegionServer.ScannerListener.leaseExpired()

2011-12-14 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169632#comment-13169632
 ] 

Zhihong Yu commented on HBASE-5026:
---

This is the first time I saw the following for a test build 
(https://builds.apache.org/job/PreCommit-HBASE-Build/504//testReport/org.apache.hadoop.hbase.client/TestAdmin/testCheckHBaseAvailableClosesConnection/):
{code}
java.lang.InternalError: errno: 24 error: Unable to open directory /proc/self/fd
{code}

 Add coprocessor hook to HRegionServer.ScannerListener.leaseExpired()
 

 Key: HBASE-5026
 URL: https://issues.apache.org/jira/browse/HBASE-5026
 Project: HBase
  Issue Type: Bug
  Components: coprocessors
Affects Versions: 0.92.0
Reporter: Liu Jia
Assignee: Liu Jia
 Fix For: 0.92.0

 Attachments: RegionObserverLeaseExpired.patch


 The RegionObserver's preScannerClose() and postScannerClose()
 methods should cover the scanner leaseExpired() situation. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5026) Add coprocessor hook to HRegionServer.ScannerListener.leaseExpired()

2011-12-14 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5026:
--

Fix Version/s: 0.94.0
 Hadoop Flags: Reviewed

 Add coprocessor hook to HRegionServer.ScannerListener.leaseExpired()
 

 Key: HBASE-5026
 URL: https://issues.apache.org/jira/browse/HBASE-5026
 Project: HBase
  Issue Type: Bug
  Components: coprocessors
Affects Versions: 0.92.0
Reporter: Liu Jia
Assignee: Liu Jia
 Fix For: 0.92.0, 0.94.0

 Attachments: RegionObserverLeaseExpired.patch


 The RegionObserver's preScannerClose() and postScannerClose()
 methods should cover the scanner leaseExpired() situation. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5026) Add coprocessor hook to HRegionServer.ScannerListener.leaseExpired()

2011-12-14 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169639#comment-13169639
 ] 

Zhihong Yu commented on HBASE-5026:
---

TestAdmin doesn't use coprocessors - the test failures were unrelated.

Integrated to branch and TRUNK.

Thanks for the patch, Jia.

Thanks for the review Andy.

 Add coprocessor hook to HRegionServer.ScannerListener.leaseExpired()
 

 Key: HBASE-5026
 URL: https://issues.apache.org/jira/browse/HBASE-5026
 Project: HBase
  Issue Type: Bug
  Components: coprocessors
Affects Versions: 0.92.0
Reporter: Liu Jia
Assignee: Liu Jia
 Fix For: 0.92.0, 0.94.0

 Attachments: RegionObserverLeaseExpired.patch


 The RegionObserver's preScannerClose() and postScannerClose()
 methods should cover the scanner leaseExpired() situation. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5026) Add coprocessor hook to HRegionServer.ScannerListener.leaseExpired()

2011-12-14 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5026:
--

Resolution: Fixed
Status: Resolved  (was: Patch Available)

 Add coprocessor hook to HRegionServer.ScannerListener.leaseExpired()
 

 Key: HBASE-5026
 URL: https://issues.apache.org/jira/browse/HBASE-5026
 Project: HBase
  Issue Type: Bug
  Components: coprocessors
Affects Versions: 0.92.0
Reporter: Liu Jia
Assignee: Liu Jia
 Fix For: 0.92.0, 0.94.0

 Attachments: RegionObserverLeaseExpired.patch


 The RegionObserver's preScannerClose() and postScannerClose()
 methods should cover the scanner leaseExpired() situation. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4605) Constraints

2011-12-14 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169645#comment-13169645
 ] 

Hadoop QA commented on HBASE-4605:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12507384/4605-v6.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 24 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -150 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 75 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.client.TestAdmin
  org.apache.hadoop.hbase.client.TestInstantSchemaChange

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/506//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/506//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/506//console

This message is automatically generated.

 Constraints
 ---

 Key: HBASE-4605
 URL: https://issues.apache.org/jira/browse/HBASE-4605
 Project: HBase
  Issue Type: Improvement
  Components: client, coprocessors
Affects Versions: 0.94.0
Reporter: Jesse Yates
Assignee: Jesse Yates
 Attachments: 4605-v6.txt, 4605.v7, constraint_as_cp.txt, 
 java_Constraint_v2.patch, java_HBASE-4605_v1.patch, java_HBASE-4605_v2.patch, 
 java_HBASE-4605_v3.patch, java_HBASE-4605_v5.patch


 From Jesse's comment on dev:
 {quote}
 What I would like to propose is a simple interface that people can use to 
 implement a 'constraint' (matching the classic database definition). This 
 would help ease of adoption by helping HBase more easily check that box, help 
 minimize code duplication across organizations, and lead to easier adoption.
 Essentially, people would implement a 'Constraint' interface for checking 
 keys before they are put into a table. Puts that are valid get written to the 
 table, but if not, an exception will be thrown that gets propagated 
 back to the client explaining why the put was invalid.
 Constraints would be set on a per-table basis and the user would be expected 
 to ensure the jars containing the constraint are present on the machines 
 serving that table.
 Yes, people could roll their own mechanism for doing this via coprocessors 
 each time, but this would make it easier to do so, so you only have to 
 implement a very minimal interface and not worry about the specifics.
 {quote}
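 For context, a minimal sketch of the kind of interface being proposed (names 
 are illustrative; see the attached patches for the actual API):
 {code}
 import org.apache.hadoop.conf.Configurable;
 import org.apache.hadoop.hbase.client.Put;
 // ConstraintException is one of the new classes introduced by the patch.

 public interface Constraint extends Configurable {
   /**
    * Check a Put before it is written to the table.
    * @throws ConstraintException if the Put violates the constraint; the
    * exception is propagated back to the client
    */
   void check(Put p) throws ConstraintException;
 }
 {code}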

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4605) Constraints

2011-12-14 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169648#comment-13169648
 ] 

jirapos...@reviews.apache.org commented on HBASE-4605:
--



bq.  On 2011-12-14 00:40:59, Ted Yu wrote:
bq.   src/main/java/org/apache/hadoop/hbase/constraint/Constraints.java, line 
335
bq.   https://reviews.apache.org/r/2579/diff/8/?file=64473#file64473line335
bq.  
bq.   Can we call ByteArrayOutputStream.toString() directly ?

No, because there is no assurance which encoding will be used. This way, we can 
be sure that it is UTF-8 encoded and conforms to the rest of the project. Also, 
the overhead here is just one more short-lived pointer (not a big deal).
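To make the difference concrete (illustrative snippet; 'byteStream' stands in 
for the ByteArrayOutputStream in question):

{code}
// Platform-dependent: uses the JVM's default charset
String s1 = byteStream.toString();
// Explicit UTF-8, consistent with the rest of the project
String s2 = Bytes.toString(byteStream.toByteArray());
{code}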


bq.  On 2011-12-14 00:40:59, Ted Yu wrote:
bq.   src/main/java/org/apache/hadoop/hbase/constraint/Constraints.java, line 
196
bq.   https://reviews.apache.org/r/2579/diff/8/?file=64473#file64473line196
bq.  
bq.   Priority is updated at the end.
bq.   I think we should document concurrency assumption in javadoc.

Added a section to package-info talking about concurrency.


bq.  On 2011-12-14 00:40:59, Ted Yu wrote:
bq.   src/main/java/org/apache/hadoop/hbase/constraint/Constraints.java, line 
179
bq.   https://reviews.apache.org/r/2579/diff/8/?file=64473#file64473line179
bq.  
bq.   Multiple Constraints are supported. Please update javadoc.

done.


bq.  On 2011-12-14 00:40:59, Ted Yu wrote:
bq.   src/main/java/org/apache/hadoop/hbase/constraint/Constraints.java, line 
355
bq.   https://reviews.apache.org/r/2579/diff/8/?file=64473#file64473line355
bq.  
bq.   The dash is not needed.

done.


- Jesse


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2579/#review3897
---


On 2011-12-13 21:38:03, Jesse Yates wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2579/
bq.  ---
bq.  
bq.  (Updated 2011-12-13 21:38:03)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Most of the implementation for adding constraints as a coprocessor. 
bq.  
bq.  Looking for general comments on style/structure, though nitpicks are ok 
too. 
bq.  
bq.  Currently missing implementation for disableConstraints() since that will 
require adding removeCoprocessor() to HTD (also comments on if this is worth it 
would be good). 
bq.  
bq.  
bq.  This addresses bug HBASE-4605.
bq.  https://issues.apache.org/jira/browse/HBASE-4605
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.src/docbkx/book.xml 9617950 
bq.src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java 84a0d1a 
bq.src/main/java/org/apache/hadoop/hbase/constraint/BaseConstraint.java 
PRE-CREATION 
bq.src/main/java/org/apache/hadoop/hbase/constraint/Constraint.java 
PRE-CREATION 
bq.
src/main/java/org/apache/hadoop/hbase/constraint/ConstraintException.java 
PRE-CREATION 
bq.
src/main/java/org/apache/hadoop/hbase/constraint/ConstraintProcessor.java 
PRE-CREATION 
bq.src/main/java/org/apache/hadoop/hbase/constraint/Constraints.java 
PRE-CREATION 
bq.src/main/java/org/apache/hadoop/hbase/constraint/package-info.java 
PRE-CREATION 
bq.src/test/java/org/apache/hadoop/hbase/TestHTableDescriptor.java 
PRE-CREATION 
bq.src/test/java/org/apache/hadoop/hbase/constraint/AllFailConstraint.java 
PRE-CREATION 
bq.src/test/java/org/apache/hadoop/hbase/constraint/AllPassConstraint.java 
PRE-CREATION 
bq.
src/test/java/org/apache/hadoop/hbase/constraint/CheckConfigurationConstraint.java
 PRE-CREATION 
bq.
src/test/java/org/apache/hadoop/hbase/constraint/RuntimeFailConstraint.java 
PRE-CREATION 
bq.src/test/java/org/apache/hadoop/hbase/constraint/TestConstraint.java 
PRE-CREATION 
bq.src/test/java/org/apache/hadoop/hbase/constraint/TestConstraints.java 
PRE-CREATION 
bq.src/test/java/org/apache/hadoop/hbase/constraint/WorksConstraint.java 
PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/2579/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Adding IntegrationTestConstraint and unit tests for Constraints and 
IntegerConstraint. All of those pass.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Jesse
bq.  
bq.



 Constraints
 ---

 Key: HBASE-4605
 URL: https://issues.apache.org/jira/browse/HBASE-4605
 Project: HBase
  Issue Type: Improvement
  Components: client, coprocessors
Affects Versions: 0.94.0
Reporter: Jesse Yates
Assignee: Jesse Yates
 Attachments: 4605-v6.txt, 4605.v7, constraint_as_cp.txt, 
 java_Constraint_v2.patch, 

[jira] [Created] (HBASE-5029) TestDistributedLogSplitting fails on occasion

2011-12-14 Thread stack (Created) (JIRA)
TestDistributedLogSplitting fails on occasion
-

 Key: HBASE-5029
 URL: https://issues.apache.org/jira/browse/HBASE-5029
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: Prakash Khemani


This is how it usually fails: 
https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.92/lastCompletedBuild/testReport/org.apache.hadoop.hbase.master/TestDistributedLogSplitting/testWorkerAbort/

Assigning mighty Prakash since he offered to take a looksee.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4605) Constraints

2011-12-14 Thread Jesse Yates (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HBASE-4605:
---

Attachment: java_HBASE-4605_v7.patch

Updated patch with fixed javadocs as per Ted's comments.

 Constraints
 ---

 Key: HBASE-4605
 URL: https://issues.apache.org/jira/browse/HBASE-4605
 Project: HBase
  Issue Type: Improvement
  Components: client, coprocessors
Affects Versions: 0.94.0
Reporter: Jesse Yates
Assignee: Jesse Yates
 Attachments: 4605-v6.txt, 4605.v7, constraint_as_cp.txt, 
 java_Constraint_v2.patch, java_HBASE-4605_v1.patch, java_HBASE-4605_v2.patch, 
 java_HBASE-4605_v3.patch, java_HBASE-4605_v5.patch, java_HBASE-4605_v7.patch


 From Jesse's comment on dev:
 {quote}
 What I would like to propose is a simple interface that people can use to 
 implement a 'constraint' (matching the classic database definition). This 
 would help ease of adoption by helping HBase more easily check that box, help 
 minimize code duplication across organizations, and lead to easier adoption.
 Essentially, people would implement a 'Constraint' interface for checking 
 keys before they are put into a table. Puts that are valid get written to the 
 table, but if not, an exception will be thrown that gets propagated 
 back to the client explaining why the put was invalid.
 Constraints would be set on a per-table basis and the user would be expected 
 to ensure the jars containing the constraint are present on the machines 
 serving that table.
 Yes, people could roll their own mechanism for doing this via coprocessors 
 each time, but this would make it easier to do so, so you only have to 
 implement a very minimal interface and not worry about the specifics.
 {quote}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5027) HConnection.create(final Connection conf) does not clone, it creates a new Configuration reading *.xmls and then does a merge.

2011-12-14 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169678#comment-13169678
 ] 

stack commented on HBASE-5027:
--

Is it the same test that fails each time, the long-running one of 1000 cycles?

 HConnection.create(final Connection conf) does not clone, it creates a new 
 Configuration reading *.xmls and then does a merge.
 --

 Key: HBASE-5027
 URL: https://issues.apache.org/jira/browse/HBASE-5027
 Project: HBase
  Issue Type: Bug
Reporter: stack

 It's more expensive than it should be; it's causing TestAdmin to fail after 
 HBASE-4417 went in.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5027) HConnection.create(final Connection conf) does not clone, it creates a new Configuration reading *.xmls and then does a merge.

2011-12-14 Thread nkeywal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169682#comment-13169682
 ] 

nkeywal commented on HBASE-5027:


Yes, I executed it alone. Out of 5 runs, it fails once.




 HConnection.create(final Connection conf) does not clone, it creates a new 
 Configuration reading *.xmls and then does a merge.
 --

 Key: HBASE-5027
 URL: https://issues.apache.org/jira/browse/HBASE-5027
 Project: HBase
  Issue Type: Bug
Reporter: stack

 It's more expensive than it should be; it's causing TestAdmin to fail after 
 HBASE-4417 went in.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5030) Some tests do not close the HFile.Reader they use, leaving some file descriptor open

2011-12-14 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5030:
---

Attachment: 5030.patch

 Some tests do not close the HFile.Reader they use, leaving some file 
 descriptor open
 

 Key: HBASE-5030
 URL: https://issues.apache.org/jira/browse/HBASE-5030
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Attachments: 5030.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5030) Some tests do not close the HFile.Reader they use, leaving some file descriptor open

2011-12-14 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5030:
---

Status: Patch Available  (was: Open)

 Some tests do not close the HFile.Reader they use, leaving some file 
 descriptor open
 

 Key: HBASE-5030
 URL: https://issues.apache.org/jira/browse/HBASE-5030
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Attachments: 5030.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5027) HConnection.create(final Connection conf) does not clone, it creates a new Configuration reading *.xmls and then does a merge.

2011-12-14 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169685#comment-13169685
 ] 

Zhihong Yu commented on HBASE-5027:
---

I think we should reduce the number of iterations for this test to e.g. 200.

 HConnection.create(final Connection conf) does not clone, it creates a new 
 Configuration reading *.xmls and then does a merge.
 --

 Key: HBASE-5027
 URL: https://issues.apache.org/jira/browse/HBASE-5027
 Project: HBase
  Issue Type: Bug
Reporter: stack

 It's more expensive than it should be; it's causing TestAdmin to fail after 
 HBASE-4417 went in.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5027) HConnection.create(final Connection conf) does not clone, it creates a new Configuration reading *.xmls and then does a merge.

2011-12-14 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169687#comment-13169687
 ] 

stack commented on HBASE-5027:
--

I was thinking of using whatever the zk client sessions max is, + 1 or doubled.
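A sketch of how that could look in the test (illustrative only; assumes a test 
method that declares the checked exceptions):

{code}
// Size the loop off the configured ZooKeeper connection limit instead of a
// hard-coded iteration count.
int maxCnxns = conf.getInt("hbase.zookeeper.property.maxClientCnxns", 30);
int iterations = 2 * maxCnxns;            // or maxCnxns + 1
for (int i = 0; i < iterations; i++) {
  HBaseAdmin.checkHBaseAvailable(conf);   // must not leak a connection per call
}
{code}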

 HConnection.create(final Connection conf) does not clone, it creates a new 
 Configuration reading *.xmls and then does a merge.
 --

 Key: HBASE-5027
 URL: https://issues.apache.org/jira/browse/HBASE-5027
 Project: HBase
  Issue Type: Bug
Reporter: stack

 It's more expensive than it should be; it's causing TestAdmin to fail after 
 HBASE-4417 went in.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5027) HConnection.create(final Connection conf) does not clone, it creates a new Configuration reading *.xmls and then does a merge.

2011-12-14 Thread nkeywal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169693#comment-13169693
 ] 

nkeywal commented on HBASE-5027:


Is it possible to count the number of open connections? Then we would just have 
to check that it does not increase after two calls.

But maybe there is a real issue behind this flakiness, even if it's not what 
the test is looking for. 

 HConnection.create(final Connection conf) does not clone, it creates a new 
 Configuration reading *.xmls and then does a merge.
 --

 Key: HBASE-5027
 URL: https://issues.apache.org/jira/browse/HBASE-5027
 Project: HBase
  Issue Type: Bug
Reporter: stack

 It's more expensive than it should be; it's causing TestAdmin to fail after 
 HBASE-4417 went in.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5030) Some tests do not close the HFile.Reader they use, leaving some file descriptor open

2011-12-14 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169705#comment-13169705
 ] 

Zhihong Yu commented on HBASE-5030:
---

+1 on patch.
Waiting for QA result.

 Some tests do not close the HFile.Reader they use, leaving some file 
 descriptor open
 

 Key: HBASE-5030
 URL: https://issues.apache.org/jira/browse/HBASE-5030
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Attachments: 5030.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5027) HConnection.create(final Connection conf) does not clone, it creates a new Configuration reading *.xmls and then does a merge.

2011-12-14 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169711#comment-13169711
 ] 

stack commented on HBASE-5027:
--

bq. Is it possible to count the number of open connections? Then we would just 
have to check that it does not increase after two calls.

If you added this:

{code}
diff --git 
a/src/test/java/org/apache/hadoop/hbase/client/HConnectionTestingUtility.java 
b/src/test/java/org/apache/hadoop/hbase/client/HConnectionTestingUtility.java
index 17da42d..00f4541 100644
--- 
a/src/test/java/org/apache/hadoop/hbase/client/HConnectionTestingUtility.java
+++ 
b/src/test/java/org/apache/hadoop/hbase/client/HConnectionTestingUtility.java
@@ -85,4 +85,13 @@ public class HConnectionTestingUtility {
   return connection;
 }
   }
-}
+
+  /**
+   * @return Count of extant connection instances
+   */
+  public static int getConnectionCount() {
+synchronized (HConnectionManager.HBASE_INSTANCES) {
+  return HConnectionManager.HBASE_INSTANCES.size();
+}
+  }
+}
{code}

... you could count connections.  We shouldn't open up access to the 
HCM.HBASE_INSTANCES map.  The above does it in a way that allows tests to count 
instances.
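A sketch of how a test could then use the helper (illustrative; not part of the 
diff above):

{code}
int before = HConnectionTestingUtility.getConnectionCount();
HBaseAdmin.checkHBaseAvailable(conf);
HBaseAdmin.checkHBaseAvailable(conf);
// Repeated calls should not leave extra cached connections behind.
assertEquals(before, HConnectionTestingUtility.getConnectionCount());
{code}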

 HConnection.create(final Connection conf) does not clone, it creates a new 
 Configuration reading *.xmls and then does a merge.
 --

 Key: HBASE-5027
 URL: https://issues.apache.org/jira/browse/HBASE-5027
 Project: HBase
  Issue Type: Bug
Reporter: stack

 It's more expensive than it should be; it's causing TestAdmin to fail after 
 HBASE-4417 went in.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5027) HConnection.create(final Connection conf) does not clone, it creates a new Configuration reading *.xmls and then does a merge.

2011-12-14 Thread nkeywal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169715#comment-13169715
 ] 

nkeywal commented on HBASE-5027:


Ok, I will do that and change the test then.




 HConnection.create(final Connection conf) does not clone, it creates a new 
 Configuration reading *.xmls and then does a merge.
 --

 Key: HBASE-5027
 URL: https://issues.apache.org/jira/browse/HBASE-5027
 Project: HBase
  Issue Type: Bug
Reporter: stack

 It's more expensive than it should be; it's causing TestAdmin to fail after 
 HBASE-4417 went in.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4683) Always cache index and bloom blocks

2011-12-14 Thread Mikhail Bautin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169720#comment-13169720
 ] 

Mikhail Bautin commented on HBASE-4683:
---

+1 on J-D's patch for 0.92

 Always cache index and bloom blocks
 ---

 Key: HBASE-4683
 URL: https://issues.apache.org/jira/browse/HBASE-4683
 Project: HBase
  Issue Type: New Feature
Reporter: Lars Hofhansl
Assignee: Mikhail Bautin
Priority: Minor
 Fix For: 0.92.0, 0.94.0

 Attachments: 4683-v2.txt, 4683.txt, D807.1.patch, D807.2.patch, 
 HBASE-4683-0.92-v2.patch, HBASE-4683-v3.patch


 This would add a new boolean config option: hfile.block.cache.datablocks
 Default would be true.
 Setting this to false puts HBase in a mode where only index blocks are 
 cached, which is useful for analytical scenarios where a useful working set 
 of the data cannot be expected to fit into the (aggregate) cache.
 This is the equivalent of setting cacheBlocks to false on all scans 
 (including scans on behalf of gets).
 I would like to get a general feeling about what folks think about this.
 The change itself would be simple.
 Update (Mikhail): we probably don't need a new conf option. Instead, we will 
 make index blocks cached by default.
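 For comparison, the per-scan equivalent mentioned above uses the existing 
 client API:
 {code}
 Scan scan = new Scan();
 scan.setCacheBlocks(false);   // do not cache data blocks read by this scan
 {code}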

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)

2011-12-14 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169741#comment-13169741
 ] 

Phabricator commented on HBASE-4218:


Kannan has commented on the revision [jira] [HBASE-4218] Delta encoding for 
keys in HFile.

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/KeyValue.java:153 perhaps change these 
too to use the newly introduced constants..
  src/main/java/org/apache/hadoop/hbase/KeyValue.java:2130 In this function 
(compareWithoutRow), is commonPrefix the common part including the rowkey 
portion?

  - If no, then @line 2119, should you pass commonPrefix - (rowLen + 
sizeOfShort) instead of commonPrefix

  - If yes, then @line 2051, should you pass rowLen + sizeOfShort instead of 0?

REVISION DETAIL
  https://reviews.facebook.net/D447


 Delta Encoding of KeyValues  (aka prefix compression)
 -

 Key: HBASE-4218
 URL: https://issues.apache.org/jira/browse/HBASE-4218
 Project: HBase
  Issue Type: Improvement
  Components: io
Affects Versions: 0.94.0
Reporter: Jacek Migdal
Assignee: Mikhail Bautin
  Labels: compression
 Attachments: 0001-Delta-encoding-fixed-encoded-scanners.patch, 
 D447.1.patch, D447.2.patch, D447.3.patch, D447.4.patch, D447.5.patch, 
 D447.6.patch, D447.7.patch, D447.8.patch, 
 Delta_encoding_with_memstore_TS.patch, open-source.diff


 A compression for keys. Keys are sorted in an HFile and they are usually very 
 similar, so it is possible to design better compression than general-purpose 
 algorithms provide.
 It is an additional step designed to be used in memory. It aims to save 
 memory in the cache as well as to speed up seeks within HFileBlocks. It 
 should improve performance a lot if key lengths are larger than value 
 lengths; for example, it makes a lot of sense to use it when the value is a 
 counter.
 Initial tests on real data (key length ~90 bytes, value length 8 bytes) show 
 that I could achieve a decent level of compression:
  key compression ratio: 92%
  total compression ratio: 85%
  LZO on the same data: 85%
  LZO after delta encoding: 91%
 while having much better performance (20-80% faster decompression than LZO). 
 Moreover, it should allow far more efficient seeking, which should improve 
 performance a bit.
 It seems that simple compression algorithms are good enough. Most of the 
 savings are due to prefix compression, int128 encoding, timestamp diffs and 
 bitfields to avoid duplication. That way, comparisons of compressed data can 
 be much faster than a byte comparator (thanks to prefix compression and 
 bitfields).
 In order to implement it in HBase, two important changes in design will be 
 needed:
 - solidify the interface to the HFileBlock / HFileReader scanner to provide 
 seeking and iterating; access to the uncompressed buffer in HFileBlock will 
 have bad performance
 - extend comparators to support comparison assuming that the first N bytes 
 are equal (or some fields are equal)
 Link to a discussion about something similar:
 http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windowssubj=Re+prefix+compression

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-5031) [89-fb] Remove hard-coded non-existent host name from TestScanner

2011-12-14 Thread Mikhail Bautin (Created) (JIRA)
[89-fb] Remove hard-coded non-existent host name from TestScanner 
--

 Key: HBASE-5031
 URL: https://issues.apache.org/jira/browse/HBASE-5031
 Project: HBase
  Issue Type: Bug
Reporter: Mikhail Bautin
Priority: Minor


TestScanner is failing on 0.89-fb because it has a hard-coded fake host name 
that it is trying to look up. Replacing this with 127.0.0.1:random_port 
instead.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5031) [89-fb] Remove hard-coded non-existent host name from TestScanner

2011-12-14 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-5031:
---

Attachment: D867.1.patch

mbautin requested code review of [jira] [HBASE-5031] Remove hard-coded 
non-existent host name from TestScanner.
Reviewers: Liyin, Kannan, JIRA

  TestScanner is failing on 0.89-fb because it has a hard-coded fake host name 
that it is trying to look up. Replacing this with 127.0.0.1:random_port 
instead.
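For reference, one common way to obtain a free local port for such a test is to bind to port 0 and let the OS choose; a sketch of the general technique (the attached patch may do this differently, e.g. via a test utility method):

{code}
// Sketch: ask the OS for an ephemeral port.
import java.io.IOException;
import java.net.ServerSocket;

public class RandomPortSketch {
  static int randomFreePort() throws IOException {
    ServerSocket socket = new ServerSocket(0);
    try {
      return socket.getLocalPort();
    } finally {
      socket.close();
    }
  }
}
{code}

The port can in principle be taken by another process between closing the socket and reusing the number, which is usually an acceptable risk in tests.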

TEST PLAN
  Run TestScanner

REVISION DETAIL
  https://reviews.facebook.net/D867

AFFECTED FILES
  src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestScanner.java

MANAGE HERALD DIFFERENTIAL RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/1839/

Tip: use the X-Herald-Rules header to filter Herald messages in your client.


 [89-fb] Remove hard-coded non-existent host name from TestScanner 
 --

 Key: HBASE-5031
 URL: https://issues.apache.org/jira/browse/HBASE-5031
 Project: HBase
  Issue Type: Bug
Reporter: Mikhail Bautin
Priority: Minor
 Attachments: D867.1.patch


 TestScanner is failing on 0.89-fb because it has a hard-coded fake host name 
 that it is trying to look up. Replacing this with 127.0.0.1:random_port 
 instead.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5029) TestDistributedLogSplitting fails on occasion

2011-12-14 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169758#comment-13169758
 ] 

Zhihong Yu commented on HBASE-5029:
---

I see the following snippet repeated several times before 
SplitLogManager$TimeoutMonitor kicked in:
{code}
2011-12-14 02:48:58,764 DEBUG 
[SplitLogWorker-juno.apache.org,56552,1323830916772] wal.HLogSplitter(1071): 
Creating writer 
path=hdfs://localhost:58644/user/jenkins/splitlog/juno.apache.org,56552,1323830916772_hdfs%3A%2F%2Flocalhost%3A58644%2Fuser%2Fjenkins%2F.logs%2Fjuno.apache.org%2C56552%2C1323830916772%2Fjuno.apache.org%252C56552%252C1323830916772.1323830919190/table/96a6d25cd549f2fcb7ee720c5467b048/recovered.edits/167.temp
 region=96a6d25cd549f2fcb7ee720c5467b048
2011-12-14 02:48:58,768 DEBUG 
[SplitLogWorker-juno.apache.org,56552,1323830916772] 
wal.SequenceFileLogWriter(126): using new createWriter -- HADOOP-6840
2011-12-14 02:48:58,768 DEBUG 
[SplitLogWorker-juno.apache.org,56552,1323830916772] 
wal.SequenceFileLogWriter(136): 
Path=hdfs://localhost:58644/user/jenkins/splitlog/juno.apache.org,56552,1323830916772_hdfs%3A%2F%2Flocalhost%3A58644%2Fuser%2Fjenkins%2F.logs%2Fjuno.apache.org%2C56552%2C1323830916772%2Fjuno.apache.org%252C56552%252C1323830916772.1323830919190/table/0559440bdae0aab2cc02a1b9d5cd72f0/recovered.edits/168.temp,
 syncFs=true, hflush=false
{code}

Later:
{code}
2011-12-14 02:49:03,591 ERROR 
[SplitLogWorker-juno.apache.org,56552,1323830916772] 
regionserver.SplitLogWorker(169): unexpected error 
java.lang.NullPointerException
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeThreads(DFSClient.java:3831)
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:3874)
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3809)
at 
org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:61)
at 
org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:86)
at 
org.apache.hadoop.io.SequenceFile$Writer.close(SequenceFile.java:1017)
at 
org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.close(SequenceFileLogWriter.java:214)
at 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLogFileToTemp(HLogSplitter.java:458)
at 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLogFileToTemp(HLogSplitter.java:351)
at 
org.apache.hadoop.hbase.regionserver.SplitLogWorker$1.exec(SplitLogWorker.java:113)
at 
org.apache.hadoop.hbase.regionserver.SplitLogWorker.grabTask(SplitLogWorker.java:266)
at 
org.apache.hadoop.hbase.regionserver.SplitLogWorker.taskLoop(SplitLogWorker.java:197)
at 
org.apache.hadoop.hbase.regionserver.SplitLogWorker.run(SplitLogWorker.java:165)
{code}

 TestDistributedLogSplitting fails on occasion
 -

 Key: HBASE-5029
 URL: https://issues.apache.org/jira/browse/HBASE-5029
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: Prakash Khemani

 This is how it usually fails: 
 https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.92/lastCompletedBuild/testReport/org.apache.hadoop.hbase.master/TestDistributedLogSplitting/testWorkerAbort/
 Assigning mighty Prakash since he offered to take a looksee.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4605) Constraints

2011-12-14 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169764#comment-13169764
 ] 

Hadoop QA commented on HBASE-4605:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12507427/java_HBASE-4605_v7.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 16 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -152 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 75 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.client.TestAdmin
  org.apache.hadoop.hbase.client.TestInstantSchemaChange

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/508//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/508//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/508//console

This message is automatically generated.

 Constraints
 ---

 Key: HBASE-4605
 URL: https://issues.apache.org/jira/browse/HBASE-4605
 Project: HBase
  Issue Type: Improvement
  Components: client, coprocessors
Affects Versions: 0.94.0
Reporter: Jesse Yates
Assignee: Jesse Yates
 Attachments: 4605-v6.txt, 4605.v7, constraint_as_cp.txt, 
 java_Constraint_v2.patch, java_HBASE-4605_v1.patch, java_HBASE-4605_v2.patch, 
 java_HBASE-4605_v3.patch, java_HBASE-4605_v5.patch, java_HBASE-4605_v7.patch


 From Jesse's comment on dev:
 {quote}
 What I would like to propose is a simple interface that people can use to 
 implement a 'constraint' (matching the classic database definition). This 
 would help ease of adoption by helping HBase more easily check that box, help 
 minimize code duplication across organizations, and lead to easier adoption.
 Essentially, people would implement a 'Constraint' interface for checking 
 keys before they are put into a table. Puts that are valid get written to the 
 table, but if not, an exception will be thrown that gets propagated 
 back to the client explaining why the put was invalid.
 Constraints would be set on a per-table basis and the user would be expected 
 to ensure the jars containing the constraint are present on the machines 
 serving that table.
 Yes, people could roll their own mechanism for doing this via coprocessors 
 each time, but this would make it easier to do so, so you only have to 
 implement a very minimal interface and not worry about the specifics.
 {quote}
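For illustration, a minimal sketch of the kind of interface being described (the names Constraint, ConstraintException and the example check are assumptions for this sketch, not the API in the attached patches):

{code}
// Illustrative sketch only; not the HBASE-4605 patch code.
import java.io.IOException;
import java.util.List;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class ConstraintSketch {
  /** Thrown when a Put violates a constraint; propagated back to the client. */
  public static class ConstraintException extends IOException {
    public ConstraintException(String msg) { super(msg); }
  }

  /** The minimal contract: inspect a Put and throw if it is invalid. */
  public interface Constraint {
    void check(Put put) throws ConstraintException;
  }

  /** Example constraint: reject Puts that contain empty values. */
  public static class NonEmptyValueConstraint implements Constraint {
    public void check(Put put) throws ConstraintException {
      for (List<KeyValue> kvs : put.getFamilyMap().values()) {
        for (KeyValue kv : kvs) {
          if (kv.getValueLength() == 0) {
            throw new ConstraintException("empty value for row "
                + Bytes.toStringBinary(put.getRow()));
          }
        }
      }
    }
  }
}
{code}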

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4605) Constraints

2011-12-14 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-4605:
--

Comment: was deleted

(was: -1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12505275/java_HBASE-4605_v1.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 18 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/384//console

This message is automatically generated.)

 Constraints
 ---

 Key: HBASE-4605
 URL: https://issues.apache.org/jira/browse/HBASE-4605
 Project: HBase
  Issue Type: Improvement
  Components: client, coprocessors
Affects Versions: 0.94.0
Reporter: Jesse Yates
Assignee: Jesse Yates
 Attachments: 4605-v6.txt, 4605.v7, constraint_as_cp.txt, 
 java_Constraint_v2.patch, java_HBASE-4605_v1.patch, java_HBASE-4605_v2.patch, 
 java_HBASE-4605_v3.patch, java_HBASE-4605_v5.patch, java_HBASE-4605_v7.patch


 From Jesse's comment on dev:
 {quote}
 What I would like to propose is a simple interface that people can use to 
 implement a 'constraint' (matching the classic database definition). This 
 would help ease of adoption by helping HBase more easily check that box, help 
 minimize code duplication across organizations, and lead to easier adoption.
 Essentially, people would implement a 'Constraint' interface for checking 
 keys before they are put into a table. Puts that are valid get written to the 
 table, but if not, an exception will be thrown that gets propagated 
 back to the client explaining why the put was invalid.
 Constraints would be set on a per-table basis and the user would be expected 
 to ensure the jars containing the constraint are present on the machines 
 serving that table.
 Yes, people could roll their own mechanism for doing this via coprocessors 
 each time, but this would make it easier to do so, so you only have to 
 implement a very minimal interface and not worry about the specifics.
 {quote}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4605) Constraints

2011-12-14 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-4605:
--

Comment: was deleted

(was: -1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12504935/4605.v7
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 18 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -162 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 68 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.replication.TestReplication
  org.apache.hadoop.hbase.client.TestAdmin
  org.apache.hadoop.hbase.client.TestInstantSchemaChange

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/352//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/352//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/352//console

This message is automatically generated.)

 Constraints
 ---

 Key: HBASE-4605
 URL: https://issues.apache.org/jira/browse/HBASE-4605
 Project: HBase
  Issue Type: Improvement
  Components: client, coprocessors
Affects Versions: 0.94.0
Reporter: Jesse Yates
Assignee: Jesse Yates
 Attachments: 4605-v6.txt, 4605.v7, constraint_as_cp.txt, 
 java_Constraint_v2.patch, java_HBASE-4605_v1.patch, java_HBASE-4605_v2.patch, 
 java_HBASE-4605_v3.patch, java_HBASE-4605_v5.patch, java_HBASE-4605_v7.patch


 From Jesse's comment on dev:
 {quote}
 What I would like to propose is a simple interface that people can use to 
 implement a 'constraint' (matching the classic database definition). This 
 would help ease of adoption by helping HBase more easily check that box, help 
 minimize code duplication across organizations, and lead to easier adoption.
 Essentially, people would implement a 'Constraint' interface for checking 
 keys before they are put into a table. Puts that are valid get written to the 
 table, but if not, an exception will be thrown that gets propagated 
 back to the client explaining why the put was invalid.
 Constraints would be set on a per-table basis and the user would be expected 
 to ensure the jars containing the constraint are present on the machines 
 serving that table.
 Yes, people could roll their own mechanism for doing this via coprocessors 
 each time, but this would make it easier to do so, so you only have to 
 implement a very minimal interface and not worry about the specifics.
 {quote}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5030) Some tests do not close the HFile.Reader they use, leaving some file descriptor open

2011-12-14 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169771#comment-13169771
 ] 

Hadoop QA commented on HBASE-5030:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12507429/5030.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 21 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -152 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 75 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.client.TestInstantSchemaChange
  org.apache.hadoop.hbase.client.TestAdmin

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/509//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/509//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/509//console

This message is automatically generated.

 Some tests do not close the HFile.Reader they use, leaving some file 
 descriptor open
 

 Key: HBASE-5030
 URL: https://issues.apache.org/jira/browse/HBASE-5030
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Attachments: 5030.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4605) Constraints

2011-12-14 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-4605:
--

Comment: was deleted

(was: -1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12505277/java_HBASE-4605_v2.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 18 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/385//console

This message is automatically generated.)

 Constraints
 ---

 Key: HBASE-4605
 URL: https://issues.apache.org/jira/browse/HBASE-4605
 Project: HBase
  Issue Type: Improvement
  Components: client, coprocessors
Affects Versions: 0.94.0
Reporter: Jesse Yates
Assignee: Jesse Yates
 Attachments: 4605-v6.txt, 4605.v7, constraint_as_cp.txt, 
 java_Constraint_v2.patch, java_HBASE-4605_v1.patch, java_HBASE-4605_v2.patch, 
 java_HBASE-4605_v3.patch, java_HBASE-4605_v5.patch, java_HBASE-4605_v7.patch


 From Jesse's comment on dev:
 {quote}
 What I would like to propose is a simple interface that people can use to 
 implement a 'constraint' (matching the classic database definition). This 
 would help ease of adoption by helping HBase more easily check that box, help 
 minimize code duplication across organizations, and lead to easier adoption.
 Essentially, people would implement a 'Constraint' interface for checking 
 keys before they are put into a table. Puts that are valid get written to the 
 table, but if not, an exception will be thrown that gets propagated 
 back to the client explaining why the put was invalid.
 Constraints would be set on a per-table basis and the user would be expected 
 to ensure the jars containing the constraint are present on the machines 
 serving that table.
 Yes, people could roll their own mechanism for doing this via coprocessors 
 each time, but this would make it easier to do so, so you only have to 
 implement a very minimal interface and not worry about the specifics.
 {quote}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4605) Constraints

2011-12-14 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-4605:
--

Comment: was deleted

(was: -1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12507251/java_HBASE-4605_v5.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 16 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/500//console

This message is automatically generated.)

 Constraints
 ---

 Key: HBASE-4605
 URL: https://issues.apache.org/jira/browse/HBASE-4605
 Project: HBase
  Issue Type: Improvement
  Components: client, coprocessors
Affects Versions: 0.94.0
Reporter: Jesse Yates
Assignee: Jesse Yates
 Attachments: 4605-v6.txt, 4605.v7, constraint_as_cp.txt, 
 java_Constraint_v2.patch, java_HBASE-4605_v1.patch, java_HBASE-4605_v2.patch, 
 java_HBASE-4605_v3.patch, java_HBASE-4605_v5.patch, java_HBASE-4605_v7.patch


 From Jesse's comment on dev:
 {quote}
 What I would like to propose is a simple interface that people can use to 
 implement a 'constraint' (matching the classic database definition). This 
 would help ease of adoption by helping HBase more easily check that box, help 
 minimize code duplication across organizations, and lead to easier adoption.
 Essentially, people would implement a 'Constraint' interface for checking 
 keys before they are put into a table. Puts that are valid get written to the 
 table, but if not, an exception will be thrown that gets propagated 
 back to the client explaining why the put was invalid.
 Constraints would be set on a per-table basis and the user would be expected 
 to ensure the jars containing the constraint are present on the machines 
 serving that table.
 Yes, people could roll their own mechanism for doing this via coprocessors 
 each time, but this would make it easier to do so, so you only have to 
 implement a very minimal interface and not worry about the specifics.
 {quote}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5030) Some tests do not close the HFile.Reader they use, leaving some file descriptors open

2011-12-14 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169778#comment-13169778
 ] 

Zhihong Yu commented on HBASE-5030:
---

Integrated to TRUNK.

Thanks for the patch N.

 Some tests do not close the HFile.Reader they use, leaving some file 
 descriptors open
 -

 Key: HBASE-5030
 URL: https://issues.apache.org/jira/browse/HBASE-5030
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Attachments: 5030.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5030) Some tests do not close the HFile.Reader they use, leaving some file descriptors open

2011-12-14 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5030:
--

Hadoop Flags: Reviewed
 Summary: Some tests do not close the HFile.Reader they use, leaving 
some file descriptors open  (was: Some tests do not close the HFile.Reader they 
use, leaving some file descriptor open)

 Some tests do not close the HFile.Reader they use, leaving some file 
 descriptors open
 -

 Key: HBASE-5030
 URL: https://issues.apache.org/jira/browse/HBASE-5030
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Attachments: 5030.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4336) Convert source tree into maven modules

2011-12-14 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169784#comment-13169784
 ] 

Zhihong Yu commented on HBASE-4336:
---

Now src/main/java/org/apache/hadoop/hbase/client is contained in hbase-core.
Would it make sense to create an hbase-client module?

 Convert source tree into maven modules
 --

 Key: HBASE-4336
 URL: https://issues.apache.org/jira/browse/HBASE-4336
 Project: HBase
  Issue Type: Task
  Components: build
Reporter: Gary Helmling
Priority: Critical
 Fix For: 0.94.0


 When we originally converted the build to maven we had a single core module 
 defined, but later reverted this to a module-less build for the sake of 
 simplicity.
 It now looks like it's time to re-address this, as we have an actual need for 
 modules to:
 * provide a trimmed down client library that applications can make use of
 * more cleanly support building against different versions of Hadoop, in 
 place of some of the reflection machinations currently required
 * incorporate the secure RPC engine that depends on some secure Hadoop classes
 I propose we start simply by refactoring into two initial modules:
 * core - common classes and utilities, and client-side code and interfaces
 * server - master and region server implementations and supporting code
 This would also lay the groundwork for incorporating the HBase security 
 features that have been developed.  Once the module structure is in place, 
 security-related features could then be incorporated into a third module -- 
 security -- after normal review and approval.  The security module could 
 then depend on secure Hadoop, without modifying the dependencies of the rest 
 of the HBase code.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4683) Always cache index and bloom blocks

2011-12-14 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-4683:
---

Attachment: D807.3.patch

mbautin updated the revision [jira] [HBASE-4683] Always cache index and bloom 
blocks.
Reviewers: jdcryans, lhofhansl, JIRA

  Rebasing up to r1214519.

REVISION DETAIL
  https://reviews.facebook.net/D807

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
  
src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
  src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaMetrics.java
  src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
  
src/test/java/org/apache/hadoop/hbase/io/hfile/TestForceCacheImportantBlocks.java
  src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestMultiColumnScanner.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWithBloomError.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestSeekOptimizations.java
  
src/test/java/org/apache/hadoop/hbase/regionserver/metrics/TestSchemaConfigured.java


 Always cache index and bloom blocks
 ---

 Key: HBASE-4683
 URL: https://issues.apache.org/jira/browse/HBASE-4683
 Project: HBase
  Issue Type: New Feature
Reporter: Lars Hofhansl
Assignee: Mikhail Bautin
Priority: Minor
 Fix For: 0.92.0, 0.94.0

 Attachments: 4683-v2.txt, 4683.txt, D807.1.patch, D807.2.patch, 
 D807.3.patch, HBASE-4683-0.92-v2.patch, HBASE-4683-v3.patch


 This would add a new boolean config option: hfile.block.cache.datablocks
 Default would be true.
 Setting this to false puts HBase in a mode where only index blocks are 
 cached, which is useful for analytical scenarios where a useful working set 
 of the data cannot be expected to fit into the (aggregate) cache.
 This is the equivalent of setting cacheBlocks to false on all scans 
 (including scans on behalf of gets).
 I would like to get a general feeling about what folks think about this.
 The change itself would be simple.
 Update (Mikhail): we probably don't need a new conf option. Instead, we will 
 make index blocks cached by default.
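For comparison, the per-scan equivalent mentioned in the description already exists on the client side; a minimal sketch using the standard Scan API:

{code}
// Per-scan equivalent of the proposed mode: skip caching data blocks for this scan.
import org.apache.hadoop.hbase.client.Scan;

public class ScanCacheSketch {
  static Scan analyticalScan(byte[] startRow, byte[] stopRow) {
    Scan scan = new Scan(startRow, stopRow);
    scan.setCacheBlocks(false); // data blocks read by this scan are not added to the block cache
    return scan;
  }
}
{code}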

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4683) Always cache index and bloom blocks

2011-12-14 Thread Mikhail Bautin (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin updated HBASE-4683:
--

Attachment: 0001-Cache-important-block-types.patch

Attaching the patch rebased on top of r1214519.

 Always cache index and bloom blocks
 ---

 Key: HBASE-4683
 URL: https://issues.apache.org/jira/browse/HBASE-4683
 Project: HBase
  Issue Type: New Feature
Reporter: Lars Hofhansl
Assignee: Mikhail Bautin
Priority: Minor
 Fix For: 0.92.0, 0.94.0

 Attachments: 0001-Cache-important-block-types.patch, 4683-v2.txt, 
 4683.txt, D807.1.patch, D807.2.patch, D807.3.patch, HBASE-4683-0.92-v2.patch, 
 HBASE-4683-v3.patch


 This would add a new boolean config option: hfile.block.cache.datablocks
 Default would be true.
 Setting this to false puts HBase in a mode where only index blocks are 
 cached, which is useful for analytical scenarios where a useful working set 
 of the data cannot be expected to fit into the (aggregate) cache.
 This is the equivalent of setting cacheBlocks to false on all scans 
 (including scans on behalf of gets).
 I would like to get a general feeling about what folks think about this.
 The change itself would be simple.
 Update (Mikhail): we probably don't need a new conf option. Instead, we will 
 make index blocks cached by default.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5031) [89-fb] Remove hard-coded non-existent host name from TestScanner

2011-12-14 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169798#comment-13169798
 ] 

Phabricator commented on HBASE-5031:


Liyin has commented on the revision [jira] [HBASE-5031] [89-fb] Remove 
hard-coded non-existent host name from TestScanner.

  Nice work. Thanks Mikhail for fixing this unit test.

REVISION DETAIL
  https://reviews.facebook.net/D867


 [89-fb] Remove hard-coded non-existent host name from TestScanner 
 --

 Key: HBASE-5031
 URL: https://issues.apache.org/jira/browse/HBASE-5031
 Project: HBase
  Issue Type: Bug
Reporter: Mikhail Bautin
Priority: Minor
 Attachments: D867.1.patch


 TestScanner is failing on 0.89-fb because it has a hard-coded fake host name 
 that it is trying to look up. Replacing this with 127.0.0.1:random_port 
 instead.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5031) [89-fb] Remove hard-coded non-existent host name from TestScanner

2011-12-14 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169797#comment-13169797
 ] 

Phabricator commented on HBASE-5031:


Liyin has accepted the revision [jira] [HBASE-5031] [89-fb] Remove hard-coded 
non-existent host name from TestScanner.

REVISION DETAIL
  https://reviews.facebook.net/D867


 [89-fb] Remove hard-coded non-existent host name from TestScanner 
 --

 Key: HBASE-5031
 URL: https://issues.apache.org/jira/browse/HBASE-5031
 Project: HBase
  Issue Type: Bug
Reporter: Mikhail Bautin
Priority: Minor
 Attachments: D867.1.patch


 TestScanner is failing on 0.89-fb because it has a hard-coded fake host name 
 that it is trying to look up. Replacing this with 127.0.0.1:random_port 
 instead.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4336) Convert source tree into maven modules

2011-12-14 Thread Jesse Yates (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169801#comment-13169801
 ] 

Jesse Yates commented on HBASE-4336:


I've been struggling with how to set up each of the packages, and I think 
creating an hbase-client module is part of the solution.

A large problem comes from the fact that there are a lot of classes that are 
used across packages. In some cases, it's as simple as abstracting out some 
constant, putting it into HConstants, and then having each class just 
reference that new constant. 

However, there are other things like the MiniCluster (and associated classes) 
that actually use stuff both in core and in server, leading me to want to move 
a lot of the tests to something like an hbase-test module. There are a lot of 
tests in core and server that depend on the minicluster, and I feel like it 
would be bad to have a lot of the client tests in the server module. 

Another alternative would be to have a module hbase-core that has dependencies 
on hbase-client and hbase-server. hbase-core would then also put out a tests jar 
that is picked up by hbase-client-test and hbase-server-test for doing the 
minicluster tests. 

It's feeling like it's leading to a _lot_ of modules. However, the best approach I 
can see is an hbase-core module with the highly interdependent classes, e.g. 
MiniCluster, and then having hbase-it turn into hbase-test, which would hold all the 
minicluster-based tests as well as the api-level tests discussed here: 
http://search-hadoop.com/m/q41O6YiyfN
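As a purely illustrative sketch of one possible layout along the lines discussed above (module names and boundaries are assumptions, not a settled proposal):

{noformat}
hbase (parent pom)
 |-- hbase-client    client-side code and interfaces
 |-- hbase-server    master and region server implementations
 |-- hbase-core      shared pieces such as the MiniCluster; depends on
 |                   hbase-client and hbase-server and publishes a tests jar
 |-- hbase-test      minicluster-based and api-level tests (evolved from hbase-it)
 `-- hbase-security  secure RPC engine, depending on secure Hadoop
{noformat}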

 Convert source tree into maven modules
 --

 Key: HBASE-4336
 URL: https://issues.apache.org/jira/browse/HBASE-4336
 Project: HBase
  Issue Type: Task
  Components: build
Reporter: Gary Helmling
Priority: Critical
 Fix For: 0.94.0


 When we originally converted the build to maven we had a single core module 
 defined, but later reverted this to a module-less build for the sake of 
 simplicity.
 It now looks like it's time to re-address this, as we have an actual need for 
 modules to:
 * provide a trimmed down client library that applications can make use of
 * more cleanly support building against different versions of Hadoop, in 
 place of some of the reflection machinations currently required
 * incorporate the secure RPC engine that depends on some secure Hadoop classes
 I propose we start simply by refactoring into two initial modules:
 * core - common classes and utilities, and client-side code and interfaces
 * server - master and region server implementations and supporting code
 This would also lay the groundwork for incorporating the HBase security 
 features that have been developed.  Once the module structure is in place, 
 security-related features could then be incorporated into a third module -- 
 security -- after normal review and approval.  The security module could 
 then depend on secure Hadoop, without modifying the dependencies of the rest 
 of the HBase code.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-5032) Add DELETE COLUMN into the delete bloom filter

2011-12-14 Thread Liyin Tang (Created) (JIRA)
Add DELETE COLUMN into the delete bloom filter
--

 Key: HBASE-5032
 URL: https://issues.apache.org/jira/browse/HBASE-5032
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang


Previously, the delete family bloom filter only contains the row key which has 
the delete family. It helps us to avoid the top-row seek operation.

This jira attempts to add the delete column into this delete bloom filter as 
well (rename the delete family bloom filter as delete bloom filter).

The motivation is to save seek ops for scan time-range queries if we know there 
is no delete column for this row/column. 
We can seek directly to the exact timestamp we are interested in, instead of 
seeking to the latest timestamp and skipping forward to find out whether there 
is any delete column before the interested timestamp.



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5032) Add DELETE COLUMN into the delete bloom filter

2011-12-14 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5032:
--

Description: 
Previously, the delete family bloom filter only contains the row key which has 
the delete family. It helps us to avoid the top-row seek operation.

This jira attempts to add the delete column into this delete bloom filter as 
well (rename the delete family bloom filter as delete bloom filter).

The motivation is to save seek ops for scan time-range queries if we know there 
is no delete column for this row/column. 
We can seek directly to the exact timestamp we are interested in, instead of 
seeking to the latest timestamp and keeping skipping to find out whether there 
is any delete column before the interested timestamp.



  was:
Previously, the delete family bloom filter only contains the row key which has 
the delete family. It helps us to avoid the top-row seek operation.

This jira attempts to add the delete column into this delete bloom filter as 
well (rename the delete family bloom filter as delete bloom filter).

The motivation is to save seek ops for scan time-range queries if we know there 
is no delete column for this row/column. 
We can seek directly to the exact timestamp we are interested in, instead of 
seeking to the latest timestamp and keeping skipping to found out whether there 
is any delete column before the interested timestamp.




 Add DELETE COLUMN into the delete bloom filter
 --

 Key: HBASE-5032
 URL: https://issues.apache.org/jira/browse/HBASE-5032
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang

 Previously, the delete family bloom filter only contains the row key which 
 has the delete family. It helps us to avoid the top-row seek operation.
 This jira attempts to add the delete column into this delete bloom filter as 
 well (rename the delete family bloom filter as delete bloom filter).
 The motivation is to save seek ops for scan time-range queries if we know 
 there is no delete column for this row/column. 
 We can seek directly to the exact timestamp we are interested in, instead of 
 seeking to the latest timestamp and keeping skipping to find out whether 
 there is any delete column before the interested timestamp.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5014) PutSortReducer and KeyValueSortReduce should adhere to memory limits

2011-12-14 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-5014:


Status: Patch Available  (was: Open)

Hi Kannan, I addressed your comment and ran all unit tests.

 PutSortReducer and KeyValueSortReduce should adhere to memory limits
 

 Key: HBASE-5014
 URL: https://issues.apache.org/jira/browse/HBASE-5014
 Project: HBase
  Issue Type: Improvement
  Components: mapreduce
Reporter: dhruba borthakur
Assignee: dhruba borthakur

 The PutSortReducer class has a configurable threshold for flushing partially 
 sorted data for large rows. However, it was not using the size of the key in 
 the calculation of overall memory used. 
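A minimal sketch of the accounting the description implies, counting the full KeyValue size (key bytes included) against the flush threshold; the Sink interface and method names are illustrative, not the actual PutSortReducer code:

{code}
// Illustrative sketch only; not the actual PutSortReducer implementation.
import java.util.Collection;
import java.util.TreeSet;
import org.apache.hadoop.hbase.KeyValue;

public class SortWithMemoryLimitSketch {
  interface Sink {
    void flush(TreeSet<KeyValue> sortedRun);
  }

  static void sortAndFlush(Collection<KeyValue> kvs, long thresholdBytes, Sink sink) {
    TreeSet<KeyValue> sorted = new TreeSet<KeyValue>(KeyValue.COMPARATOR);
    long curSize = 0;
    for (KeyValue kv : kvs) {
      sorted.add(kv);
      curSize += kv.getLength(); // count the whole KeyValue: key bytes as well as value bytes
      if (curSize > thresholdBytes) {
        sink.flush(sorted);      // emit a partial sorted run and start over
        sorted = new TreeSet<KeyValue>(KeyValue.COMPARATOR);
        curSize = 0;
      }
    }
    if (!sorted.isEmpty()) {
      sink.flush(sorted);
    }
  }
}
{code}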

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4938) Create a HRegion.getScanner public method that allows reading from a specified readPoint

2011-12-14 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-4938:


Status: Patch Available  (was: Open)

I have run all the unit tests for this one.

 Create a HRegion.getScanner public method that allows reading from a 
 specified readPoint
 

 Key: HBASE-4938
 URL: https://issues.apache.org/jira/browse/HBASE-4938
 Project: HBase
  Issue Type: Improvement
Reporter: dhruba borthakur
Assignee: dhruba borthakur
Priority: Minor

 There is an existing api HRegion.getScanner(Scan) that allows scanning a 
 table. My proposal is to extend it to HRegion.getScanner(Scan, long readPoint)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-5033) Opening store files in parallel to reduce region open time

2011-12-14 Thread Liyin Tang (Created) (JIRA)
Opening store files in parallel to reduce region open time
--

 Key: HBASE-5033
 URL: https://issues.apache.org/jira/browse/HBASE-5033
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang


Opening store files in parallel to reduce region open time
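A hedged sketch of the general approach (the Opener hook is hypothetical; the actual change works inside Store/HRegion): submit one open task per store file to a thread pool and wait for all of them before the region is considered open.

{code}
// Illustrative sketch only; not the HBASE-5033 patch itself.
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelOpenSketch {
  /** Hypothetical hook that opens a single store file given its path. */
  public interface Opener<F> {
    F open(String path) throws Exception;
  }

  public static <F> List<F> openAll(List<String> paths, final Opener<F> opener, int threads)
      throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<F>> futures = new ArrayList<Future<F>>();
      for (final String path : paths) {
        futures.add(pool.submit(new Callable<F>() {
          public F call() throws Exception {
            return opener.open(path); // each store file is opened on its own thread
          }
        }));
      }
      List<F> opened = new ArrayList<F>();
      for (Future<F> f : futures) {
        opened.add(f.get()); // wait for every open before the region is usable
      }
      return opened;
    } finally {
      pool.shutdown();
    }
  }
}
{code}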

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5021) Enforce upper bound on timestamp

2011-12-14 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169813#comment-13169813
 ] 

Phabricator commented on HBASE-5021:


stack has commented on the revision [jira] [HBase-5021] Enforce upper bound on 
timestamp.

  +1

  One thought though: rather than the default being FOREVER, have it be 
one hour?  If there is > one hour of drift, then I'd think it is usually indicative of a bad 
cluster setup and you'll be thankful for the warning, or else you are doing 
something 'odd' and should change the default config.

REVISION DETAIL
  https://reviews.facebook.net/D849
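To make the one-hour-drift suggestion concrete, a minimal sketch of the kind of server-side check being discussed (the constant and method here are hypothetical, not what D849 implements):

{code}
// Illustrative sketch only; the bound and the exception type are assumptions.
import java.io.IOException;

public class TimestampBoundSketch {
  /** e.g. one hour of allowed clock drift; FOREVER would correspond to Long.MAX_VALUE. */
  static final long MAX_CLOCK_SKEW_MS = 60L * 60L * 1000L;

  static void checkTimestamp(long ts) throws IOException {
    long upperBound = System.currentTimeMillis() + MAX_CLOCK_SKEW_MS;
    if (ts > upperBound) {
      throw new IOException("Timestamp " + ts + " is beyond the allowed upper bound "
          + upperBound + "; check client clocks or raise the configured limit");
    }
  }
}
{code}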


 Enforce upper bound on timestamp
 

 Key: HBASE-5021
 URL: https://issues.apache.org/jira/browse/HBASE-5021
 Project: HBase
  Issue Type: Improvement
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg
Priority: Critical
 Fix For: 0.94.0

 Attachments: D849.1.patch


 We have been getting hit with performance problems on our time-series 
 database due to invalid timestamps being inserted by the timestamp.  We are 
 working on adding proper checks to app server, but production performance 
 could be severely impacted with significant recovery time if something slips 
 past.  Since timestamps are considered a fundamental part of the HBase schema 
 and multiple optimizations use timestamp information, we should allow the 
 option to sanity check the upper bound on the server-side in HBase.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4932) Block cache can be mistakenly instantiated by tools

2011-12-14 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169815#comment-13169815
 ] 

Hadoop QA commented on HBASE-4932:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12506185/HBASE-4932.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 25 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/511//console

This message is automatically generated.

 Block cache can be mistakenly instantiated by tools
 ---

 Key: HBASE-4932
 URL: https://issues.apache.org/jira/browse/HBASE-4932
 Project: HBase
  Issue Type: Bug
Reporter: Prakash Khemani
Assignee: Prakash Khemani
 Fix For: 0.94.0

 Attachments: HBASE-4932.patch


 Map Reduce tasks that create a writer to write HFiles inadvertently end up 
 creating block cache.
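One way a tool or MapReduce task can avoid instantiating the block cache is to set the cache size to zero before constructing its CacheConfig; a hedged sketch of that idea (the actual fix in the attached patch may differ):

{code}
// Sketch: make CacheConfig report no block cache for an HFile-writing tool.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.io.hfile.CacheConfig;

public class NoBlockCacheSketch {
  static CacheConfig writerCacheConfig() {
    Configuration conf = HBaseConfiguration.create();
    conf.setFloat("hfile.block.cache.size", 0.0f); // a size of 0 means no block cache is created
    return new CacheConfig(conf);
  }
}
{code}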

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5021) Enforce upper bound on timestamp

2011-12-14 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169816#comment-13169816
 ] 

Phabricator commented on HBASE-5021:


nspiegelberg has commented on the revision [jira] [HBase-5021] Enforce upper 
bound on timestamp.

  @stack : looking around, will this be addressed better by HBASE-4605 
Constraints in 0.94?  Should I just put this in 89-fb and help the constraint 
review so we have feature parity?  I don't think you'd want to enable sanity 
checking by default, since not all use cases use currentTimeMillis() as the 
timestamp.

REVISION DETAIL
  https://reviews.facebook.net/D849


 Enforce upper bound on timestamp
 

 Key: HBASE-5021
 URL: https://issues.apache.org/jira/browse/HBASE-5021
 Project: HBase
  Issue Type: Improvement
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg
Priority: Critical
 Fix For: 0.94.0

 Attachments: D849.1.patch


 We have been getting hit with performance problems on our time-series 
 database due to invalid timestamps being inserted by the timestamp.  We are 
 working on adding proper checks to app server, but production performance 
 could be severely impacted with significant recovery time if something slips 
 past.  Since timestamps are considered a fundamental part of the HBase schema 
 and multiple optimizations use timestamp information, we should allow the 
 option to sanity check the upper bound on the server-side in HBase.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5021) Enforce upper bound on timestamp

2011-12-14 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169817#comment-13169817
 ] 

Zhihong Yu commented on HBASE-5021:
---

@Nicolas:
You're welcome to comment on HBASE-4605.

 Enforce upper bound on timestamp
 

 Key: HBASE-5021
 URL: https://issues.apache.org/jira/browse/HBASE-5021
 Project: HBase
  Issue Type: Improvement
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg
Priority: Critical
 Fix For: 0.94.0

 Attachments: D849.1.patch


 We have been getting hit with performance problems on our time-series 
 database due to invalid timestamps being inserted by the timestamp.  We are 
 working on adding proper checks to app server, but production performance 
 could be severely impacted with significant recovery time if something slips 
 past.  Since timestamps are considered a fundamental part of the HBase schema 
 and multiple optimizations use timestamp information, we should allow the 
 option to sanity check the upper bound on the server-side in HBase.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5021) Enforce upper bound on timestamp

2011-12-14 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169822#comment-13169822
 ] 

Phabricator commented on HBASE-5021:


stack has commented on the revision [jira] [HBase-5021] Enforce upper bound on 
timestamp.

  @Nicolas Or apply this patch as is to trunk and 0.89; user can then use this 
mechanism or constraints in 0.94.

REVISION DETAIL
  https://reviews.facebook.net/D849


 Enforce upper bound on timestamp
 

 Key: HBASE-5021
 URL: https://issues.apache.org/jira/browse/HBASE-5021
 Project: HBase
  Issue Type: Improvement
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg
Priority: Critical
 Fix For: 0.94.0

 Attachments: D849.1.patch


 We have been getting hit with performance problems on our time-series 
 database due to invalid timestamps being inserted by the timestamp.  We are 
 working on adding proper checks to app server, but production performance 
 could be severely impacted with significant recovery time if something slips 
 past.  Since timestamps are considered a fundamental part of the HBase schema 
 and multiple optimizations use timestamp information, we should allow the 
 option to sanity check the upper bound on the server-side in HBase.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)

2011-12-14 Thread Mikhail Bautin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169842#comment-13169842
 ] 

Mikhail Bautin commented on HBASE-4218:
---

bq. Maybe we could call it KeyValueEncoding, DataBlockEncoding, HCellEncoding, 
BlockEncoding...

Matt: do you have a specific re-naming of delta encoders in mind? Jacek's 
original delta encoding algorithm names are 
{Bitset,Prefix,Diff,FastDiff}KeyDeltaEncoder. How do these correspond to the 
alternative encoder names you are suggesting?

 Delta Encoding of KeyValues  (aka prefix compression)
 -

 Key: HBASE-4218
 URL: https://issues.apache.org/jira/browse/HBASE-4218
 Project: HBase
  Issue Type: Improvement
  Components: io
Affects Versions: 0.94.0
Reporter: Jacek Migdal
Assignee: Mikhail Bautin
  Labels: compression
 Attachments: 0001-Delta-encoding-fixed-encoded-scanners.patch, 
 D447.1.patch, D447.2.patch, D447.3.patch, D447.4.patch, D447.5.patch, 
 D447.6.patch, D447.7.patch, D447.8.patch, 
 Delta_encoding_with_memstore_TS.patch, open-source.diff


 A compression scheme for keys. Keys are sorted in an HFile and are usually very 
 similar. Because of that, it is possible to design better compression than 
 general purpose algorithms provide.
 It is an additional step designed to be used in memory. It aims to save 
 memory in the cache as well as speed up seeks within HFileBlocks. It should 
 improve performance a lot if key lengths are larger than value lengths, for 
 example when the value is a counter.
 Initial tests on real data (key length ~ 90 bytes, value length = 8 bytes) 
 show that a decent level of compression can be achieved:
  key compression ratio: 92%
  total compression ratio: 85%
  LZO on the same data: 85%
  LZO after delta encoding: 91%
 while delivering much better performance (20-80% faster decompression than 
 LZO). Moreover, it should allow far more efficient seeking, which should 
 improve performance a bit.
 It seems that simple compression algorithms are good enough. Most of the 
 savings come from prefix compression, int128 encoding, timestamp diffs and 
 bitfields to avoid duplication. That way, comparisons of compressed data can 
 be much faster than a byte comparator (thanks to prefix compression and 
 bitfields).
 In order to implement it in HBase, two important design changes will be 
 needed:
 -solidify the interface to HFileBlock / HFileReader Scanner to provide seeking 
 and iterating; access to the uncompressed buffer in HFileBlock will have bad 
 performance
 -extend comparators to support comparison assuming that the first N bytes are 
 equal (or some fields are equal)
 Link to a discussion about something similar:
 http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windowssubj=Re+prefix+compression

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5021) Enforce upper bound on timestamp

2011-12-14 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169856#comment-13169856
 ] 

Phabricator commented on HBASE-5021:


nspiegelberg has commented on the revision [jira] [HBase-5021] Enforce upper 
bound on timestamp.

  @stack : sounds good.  I kind of thought they were slightly different 
features myself but thought I'd get outside confirmation.  Timestamp is a 
specific, explicit optimization of the LSMT algorithm, whereas Constraints are 
more implicit, application-specific schema enforcement.

REVISION DETAIL
  https://reviews.facebook.net/D849


 Enforce upper bound on timestamp
 

 Key: HBASE-5021
 URL: https://issues.apache.org/jira/browse/HBASE-5021
 Project: HBase
  Issue Type: Improvement
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg
Priority: Critical
 Fix For: 0.94.0

 Attachments: D849.1.patch


 We have been getting hit with performance problems on our time-series 
 database due to invalid timestamps being inserted.  We are working on adding 
 proper checks to the app server, but production performance could be 
 severely impacted, with significant recovery time, if something slips past.  
 Since timestamps are considered a fundamental part of the HBase schema and 
 multiple optimizations use timestamp information, we should allow the option 
 to sanity-check the upper bound on the server side in HBase.
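
As a minimal sketch of the kind of server-side sanity check described above (assuming a hypothetical configuration key; the property name, default, and exception choice are illustrative, not the D849 patch):

{noformat}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.DoNotRetryIOException;
import org.apache.hadoop.hbase.KeyValue;

/** Illustrative upper-bound timestamp check; not the actual HBASE-5021 implementation. */
public class TimestampUpperBoundChecker {

  // Hypothetical property: how far ahead of the server clock a timestamp may be.
  public static final String TIMESTAMP_SLOP_KEY = "hbase.region.timestamp.slop.ms";

  private final long slopMillis;

  public TimestampUpperBoundChecker(Configuration conf) {
    // Long.MAX_VALUE disables the check by default.
    this.slopMillis = conf.getLong(TIMESTAMP_SLOP_KEY, Long.MAX_VALUE);
  }

  /** Reject a KeyValue whose timestamp is too far ahead of the server clock. */
  public void check(KeyValue kv) throws DoNotRetryIOException {
    if (slopMillis == Long.MAX_VALUE) {
      return; // check disabled
    }
    long upperBound = System.currentTimeMillis() + slopMillis;
    if (kv.getTimestamp() > upperBound) {
      throw new DoNotRetryIOException("Timestamp " + kv.getTimestamp()
          + " is beyond the allowed upper bound " + upperBound);
    }
  }
}
{noformat}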

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)

2011-12-14 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169860#comment-13169860
 ] 

Phabricator commented on HBASE-4218:


stack has commented on the revision [jira] [HBASE-4218] Delta encoding for 
keys in HFile.

  More to follow (Sorry for piecemealing this review... )

INLINE COMMENTS
  
src/main/java/org/apache/hadoop/hbase/io/deltaencoder/BufferedDeltaEncoder.java:443
 Do all the methods up to here belong elsewhere, in a utility class?  
CompressedInts or something?  Would ByteBufferUtils be a better place?
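
For context, a small hedged sketch of the kind of variable-length int helpers such a utility class (whether called CompressedInts or folded into ByteBufferUtils) might hold; the method names are illustrative, not the ones in D447:

{noformat}
import java.nio.ByteBuffer;

/** Illustrative varint helpers of the sort a utility class could host. */
public final class CompressedInts {

  /** Write a non-negative int using 7 bits per byte; the high bit marks continuation. */
  public static void writeVInt(ByteBuffer out, int value) {
    while ((value & ~0x7F) != 0) {
      out.put((byte) ((value & 0x7F) | 0x80));
      value >>>= 7;
    }
    out.put((byte) value);
  }

  /** Read an int written by writeVInt. */
  public static int readVInt(ByteBuffer in) {
    int value = 0;
    int shift = 0;
    byte b;
    do {
      b = in.get();
      value |= (b & 0x7F) << shift;
      shift += 7;
    } while ((b & 0x80) != 0);
    return value;
  }
}
{noformat}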

REVISION DETAIL
  https://reviews.facebook.net/D447


 Delta Encoding of KeyValues  (aka prefix compression)
 -

 Key: HBASE-4218
 URL: https://issues.apache.org/jira/browse/HBASE-4218
 Project: HBase
  Issue Type: Improvement
  Components: io
Affects Versions: 0.94.0
Reporter: Jacek Migdal
Assignee: Mikhail Bautin
  Labels: compression
 Attachments: 0001-Delta-encoding-fixed-encoded-scanners.patch, 
 D447.1.patch, D447.2.patch, D447.3.patch, D447.4.patch, D447.5.patch, 
 D447.6.patch, D447.7.patch, D447.8.patch, 
 Delta_encoding_with_memstore_TS.patch, open-source.diff


 A compression for keys. Keys are sorted in HFile and they are usually very 
 similar. Because of that, it is possible to design better compression than 
 general purpose algorithms.
 It is an additional step designed to be used in memory. It aims to save 
 memory in the cache as well as to speed up seeks within HFileBlocks. It 
 should improve performance a lot if key lengths are larger than value 
 lengths. For example, it makes a lot of sense to use it when the value is a 
 counter.
 Initial tests on real data (key length = ~90 bytes, value length = 8 bytes) 
 show that I could achieve a decent level of compression:
  key compression ratio: 92%
  total compression ratio: 85%
  LZO on the same data: 85%
  LZO after delta encoding: 91%
 All this while having much better performance (decompression is 20-80% 
 faster than LZO). Moreover, it should allow far more efficient seeking, 
 which should improve performance a bit.
 It seems that simple compression algorithms are good enough. Most of the 
 savings are due to prefix compression, int128 encoding, timestamp diffs and 
 bitfields to avoid duplication. That way, comparisons of compressed data can 
 be much faster than a byte comparator (thanks to prefix compression and 
 bitfields).
 In order to implement it in HBase, two important changes in design will be 
 needed:
 - solidify the interface to HFileBlock / HFileReader Scanner to provide 
 seeking and iterating; access to the uncompressed buffer in HFileBlock will 
 have bad performance
 - extend comparators to support comparison assuming that the first N bytes 
 are equal (or some fields are equal)
 Link to a discussion about something similar:
 http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windowssubj=Re+prefix+compression

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5021) Enforce upper bound on timestamp

2011-12-14 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169869#comment-13169869
 ] 

Phabricator commented on HBASE-5021:


tedyu has commented on the revision [jira] [HBase-5021] Enforce upper bound on 
timestamp.

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:1851 Does 
this operation status code still apply?

REVISION DETAIL
  https://reviews.facebook.net/D849


 Enforce upper bound on timestamp
 

 Key: HBASE-5021
 URL: https://issues.apache.org/jira/browse/HBASE-5021
 Project: HBase
  Issue Type: Improvement
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg
Priority: Critical
 Fix For: 0.94.0

 Attachments: D849.1.patch


 We have been getting hit with performance problems on our time-series 
 database due to invalid timestamps being inserted.  We are working on adding 
 proper checks to the app server, but production performance could be 
 severely impacted, with significant recovery time, if something slips past.  
 Since timestamps are considered a fundamental part of the HBase schema and 
 multiple optimizations use timestamp information, we should allow the option 
 to sanity-check the upper bound on the server side in HBase.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4720) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server

2011-12-14 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169887#comment-13169887
 ] 

Zhihong Yu commented on HBASE-4720:
---

I think option 2 is more favorable.

 Implement atomic update operations (checkAndPut, checkAndDelete) for REST 
 client/server 
 

 Key: HBASE-4720
 URL: https://issues.apache.org/jira/browse/HBASE-4720
 Project: HBase
  Issue Type: Improvement
Reporter: Daniel Lord

 I have several large application/HBase clusters where an application node 
 will occasionally need to talk to HBase from a different cluster.  In order 
 to help ensure some of my consistency guarantees, I have a sentinel table 
 that is updated atomically as users interact with the system.  This works 
 quite well for the regular HBase client, but the REST client does not 
 implement the checkAndPut and checkAndDelete operations.  This exposes the 
 application to some race conditions that have to be worked around.  It would 
 be ideal if the same checkAndPut/checkAndDelete operations could be 
 supported by the REST client.
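
For reference, a sketch of the atomic compare-and-set the regular Java client already exposes and the REST client currently lacks; the table and column names here are made up for illustration, and how this would surface over REST is exactly what this issue needs to define:

{noformat}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class SentinelUpdateExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable sentinel = new HTable(conf, "sentinel");   // hypothetical sentinel table
    byte[] row = Bytes.toBytes("cluster-A");
    byte[] family = Bytes.toBytes("s");
    byte[] qualifier = Bytes.toBytes("owner");

    Put put = new Put(row);
    put.add(family, qualifier, Bytes.toBytes("node-42"));

    // Atomically claim the row only if nobody owns it yet (expected value null).
    // This is the operation that cannot currently be expressed through the REST client.
    boolean claimed = sentinel.checkAndPut(row, family, qualifier, null, put);
    System.out.println("claimed = " + claimed);
    sentinel.close();
  }
}
{noformat}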

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



