[jira] [Updated] (HBASE-10999) Cross-row Transaction : Implement Percolator Algorithm on HBase
[ https://issues.apache.org/jira/browse/HBASE-10999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cuijianwei updated HBASE-10999: --- Description: Cross-row transaction is a desired feature for databases. It is not easy to keep the ACID properties of cross-row transactions in distributed databases such as HBase, because the data touched by one transaction may be located on different machines. In the paper http://research.google.com/pubs/pub36726.html, Google presents an algorithm (named Percolator) to implement cross-row transactions on BigTable. After analyzing the algorithm, we found Percolator might also be a good choice to provide cross-row transactions on HBase. The reasons include:
1. Percolator keeps the ACID properties of cross-row transactions as described in Google's paper. Percolator depends on a Global Incremental Timestamp Service to define the order of transactions, which is essential for ACID.
2. The Percolator algorithm can be implemented entirely on the client side, so there is no need to change server-side logic. Users can include Percolator in their clients and adopt its APIs only when they need cross-row transactions.
3. Percolator is a general algorithm that can be implemented on top of any database providing single-row transactions, so it is feasible to implement it on HBase.
In the last few months, we have implemented Percolator on HBase, done correctness validation and performance testing, and finally applied the algorithm successfully in our production environment. Our work includes:
1. A Percolator implementation on HBase, currently consisting of: a) a Transaction module providing put/delete/get/scan interfaces for cross-row/cross-table transactions; b) a Global Incremental Timestamp Server providing globally, monotonically increasing timestamps for transactions; c) a LockCleaner module to resolve conflicts when concurrent transactions mutate the same column; d) an internal module implementing the prewrite/commit/get/scan logic of Percolator. Although the Percolator logic could live entirely on the client side, our implementation uses HBase's coprocessor framework, because coprocessors can expose Percolator-specific RPC interfaces such as prewrite/commit, reducing RPC round trips and improving efficiency. Another reason to use coprocessors is to decouple Percolator's code from HBase, so that users who don't need cross-row transactions keep a clean HBase codebase. In the future, we will also explore the concurrency characteristics of coprocessors to perform cross-row mutations more efficiently.
2. An AccountTransfer simulation program to validate the correctness of the implementation. The program distributes initial values across different tables, rows, and columns in HBase; each column represents an account. Configured client threads are then started concurrently to read a number of account values from different tables and rows via Percolator's get; afterwards, the clients randomly transfer values among these accounts while keeping the sum unchanged, which simulates concurrent cross-table/cross-row transactions. To check the correctness of the transactions, a checker thread periodically scans the account values from all columns and verifies that the current total equals the initial total. We ran this validation program continuously during development, which helped us catch implementation errors.
3. Performance evaluation under various test situations. We compared Percolator's APIs with HBase's for different data sizes and client thread counts, using single-column transactions, which represent the worst case for Percolator. The results: a) for reads, Percolator achieves 90% of HBase's performance; b) for writes, Percolator achieves 23% of HBase's performance. The drop derives from the overhead of the Percolator logic, and the results are similar to those reported in Google's paper.
4. Performance improvement. Percolator's write performance drops more relative to HBase because a Percolator write must read data to check for write conflicts and needs two RPCs, for prewrite and commit respectively. We are investigating ways to improve write performance.
We are glad to share the current Percolator implementation and hope it offers a choice for users who want cross-row transactions, since it does not require changing the code or logic of the original HBase. Comments and discussions are welcome.
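As a rough illustration of the algorithm described above, here is a minimal, hypothetical sketch of Percolator-style two-phase commit over a store that only guarantees single-row atomicity. All names and the data layout are illustrative, not the actual Themis/HBase API; as in the paper, each row carries data, lock, and write columns keyed by timestamp.

```python
# Hypothetical sketch of Percolator two-phase commit (not the Themis API).

class Store:
    def __init__(self):
        self.rows = {}

    def row(self, key):
        # Each row holds 'data', 'lock' and 'write' columns keyed by timestamp.
        return self.rows.setdefault(key, {'data': {}, 'lock': {}, 'write': {}})

_ts = 0
def next_ts():
    """Stand-in for the Global Incremental Timestamp Server."""
    global _ts
    _ts += 1
    return _ts

def commit_transaction(store, writes):
    """writes: {row_key: value}. Returns True on commit, False on conflict."""
    start_ts = next_ts()
    rows = sorted(writes)
    primary = rows[0]
    locked = []
    # Prewrite phase: lock every row; abort on a conflicting lock or on a
    # write committed after our start timestamp (write-write conflict).
    for r in rows:
        cols = store.row(r)
        if cols['lock'] or any(ts >= start_ts for ts in cols['write']):
            for lr in locked:  # roll back locks we already took
                del store.row(lr)['lock'][start_ts]
                del store.row(lr)['data'][start_ts]
            return False
        cols['data'][start_ts] = writes[r]
        cols['lock'][start_ts] = primary  # secondaries point at the primary
        locked.append(r)
    # Commit phase: erasing the primary lock and writing its commit record is
    # the atomic commit point; secondaries can be rolled forward lazily.
    commit_ts = next_ts()
    for r in rows:
        cols = store.row(r)
        del cols['lock'][start_ts]
        cols['write'][commit_ts] = start_ts
    return True
```

In the real implementation, the prewrite and commit steps are the single-row operations exposed as coprocessor RPCs, and the LockCleaner resolves locks left behind by crashed clients.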
[jira] [Commented] (HBASE-10999) Cross-row Transaction : Implement Percolator Algorithm on HBase
[ https://issues.apache.org/jira/browse/HBASE-10999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14204508#comment-14204508 ] cuijianwei commented on HBASE-10999: In the last few months, we have updated Themis to achieve better performance and to include more features:
1. Improved single-row write performance from 23% (relative to HBase's put) to 60% (for most test cases). For a single-row write transaction, we only write the lock to the MemStore in the prewrite phase; then, in the commit phase, we erase the corresponding lock and write the data and commit information to the HLog. This does not break the correctness of the Percolator algorithm and improves single-row write performance considerably.
2. Support for HBase 0.98. We created a branch, https://github.com/XiaoMi/themis/tree/for_hbase_0.98, to make Themis support HBase 0.98 (currently HBase 0.98.5). All functions of the master branch will also be implemented in this branch.
3. Transaction TTL support and old-data clean. Users can set TTLs for read and write transactions respectively; old data that can no longer be read is then cleaned periodically.
4. MapReduce support. We implemented ThemisTableInputFormat to scan data from Themis-enabled tables in map jobs and ThemisTableOutputFormat to write data via Themis transactions in reduce jobs. Multi-table scans and writes are also supported.
5. A ZooKeeper-based WorkerRegister. As mentioned in the Percolator paper, running workers write a token into the Chubby lock service; ZookeeperWorkerRegister implements this function and helps resolve conflicts more efficiently.
6. Table schema support. Users can set the THEMIS_ENABLE attribute to true on each family that needs Themis transactions; Themis then automatically sets the corresponding attributes on the family and creates the lock family.
For more details, please see: https://github.com/XiaoMi/themis (for HBase 0.94) and https://github.com/XiaoMi/themis/tree/for_hbase_0.98 (for HBase 0.98).
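The old-data clean in item 3 above can be sketched as follows. The function name and data layout are illustrative, not Themis's actual code: versions whose commit timestamp has fallen outside the read-transaction TTL can never be read again, so a periodic sweep may safely drop them while always keeping the newest (still visible) version.

```python
# Hedged sketch of TTL-based old-data clean (illustrative, not Themis code).

def clean_old_versions(versions, read_ttl_ms, now_ms):
    """versions: {commit_ts_ms: value} for one column.

    Keep the newest version unconditionally (it is the visible value) and
    drop older versions whose age exceeds the read-transaction TTL.
    """
    if not versions:
        return {}
    newest = max(versions)
    return {ts: v for ts, v in versions.items()
            if ts == newest or now_ms - ts <= read_ttl_ms}
```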
Cross-row Transaction : Implement Percolator Algorithm on HBase --- Key: HBASE-10999 URL: https://issues.apache.org/jira/browse/HBASE-10999 Project: HBase Issue Type: New Feature Components: Transactions/MVCC Affects Versions: 0.99.0 Reporter: cuijianwei Assignee: cuijianwei
[jira] [Updated] (HBASE-10999) Cross-row Transaction : Implement Percolator Algorithm on HBase
[ https://issues.apache.org/jira/browse/HBASE-10999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cuijianwei updated HBASE-10999: --- Description: Cross-row transaction is a desired feature for databases. It is not easy to keep the ACID properties of cross-row transactions in distributed databases such as HBase, because the data touched by one transaction may be located on different machines. In the paper http://research.google.com/pubs/pub36726.html, Google presents an algorithm (named Percolator) to implement cross-row transactions on BigTable. After analyzing the algorithm, we found Percolator might also be a good choice to provide cross-row transactions on HBase. The reasons include:
1. Percolator keeps the ACID properties of cross-row transactions as described in Google's paper. Percolator depends on a Global Incremental Timestamp Service to define the order of transactions, which is essential for ACID.
2. The Percolator algorithm can be implemented entirely on the client side, so there is no need to change server-side logic. Users can include Percolator in their clients and adopt its APIs only when they need cross-row transactions.
3. Percolator is a general algorithm that can be implemented on top of any database providing single-row transactions, so it is feasible to implement it on HBase.
In the last few months, we have implemented Percolator on HBase, done correctness validation and performance testing, and finally applied the algorithm successfully in our production environment. Our work includes:
1. A Percolator implementation on HBase, currently consisting of: a) a Transaction module providing put/delete/get/scan interfaces for cross-row/cross-table transactions; b) a Global Incremental Timestamp Server providing globally, monotonically increasing timestamps for transactions; c) a LockCleaner module to resolve conflicts when concurrent transactions mutate the same column; d) an internal module implementing the prewrite/commit/get/scan logic of Percolator. Although the Percolator logic could live entirely on the client side, our implementation uses HBase's coprocessor framework, because coprocessors can expose Percolator-specific RPC interfaces such as prewrite/commit, reducing RPC round trips and improving efficiency. Another reason to use coprocessors is to decouple Percolator's code from HBase, so that users who don't need cross-row transactions keep a clean HBase codebase. In the future, we will also explore the concurrency characteristics of coprocessors to perform cross-row mutations more efficiently.
2. An AccountTransfer simulation program to validate the correctness of the implementation. The program distributes initial values across different tables, rows, and columns in HBase; each column represents an account. Configured client threads are then started concurrently to read a number of account values from different tables and rows via Percolator's get; afterwards, the clients randomly transfer values among these accounts while keeping the sum unchanged, which simulates concurrent cross-table/cross-row transactions. To check the correctness of the transactions, a checker thread periodically scans the account values from all columns and verifies that the current total equals the initial total. We ran this validation program continuously during development, which helped us catch implementation errors.
3. Performance evaluation under various test situations. We compared Percolator's APIs with HBase's for different data sizes and client thread counts, using single-column transactions, which represent the worst case for Percolator. The results: a) for reads, Percolator achieves 85% of HBase's performance; b) for writes, Percolator achieves 60% of HBase's performance. The drop derives from the overhead of the Percolator logic. Read performance is about 10% lower than that reported in the Percolator paper (94%), while write performance is much better than the paper's reported 23%. We improved the performance of single-column transactions (and of single-row transactions generally) by writing only to the MemStore in the prewrite phase, which saves one HLog write.
4. MapReduce support. We implemented a group of classes to read data via Themis transactions in mapper jobs and to write data via Themis transactions in reduce jobs.
5. The master branch of Themis (https://github.com/XiaoMi/themis) is based on HBase 0.94; we also created a branch (https://github.com/XiaoMi/themis/tree/for_hbase_0.98) to support HBase 0.98.
We are glad to share the current Percolator implementation and hope it offers a choice for users who want cross-row transactions, since it does not require changing the code or logic of the original HBase.
[jira] [Updated] (HBASE-12444) Total number of requests overflow because it's int
[ https://issues.apache.org/jira/browse/HBASE-12444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhaoyunjiong updated HBASE-12444: - Attachment: hbase-12444.patch Use uint64 for total_number_of_requests. Total number of requests overflow because it's int -- Key: HBASE-12444 URL: https://issues.apache.org/jira/browse/HBASE-12444 Project: HBase Issue Type: Bug Components: hbck, master, regionserver Reporter: zhaoyunjiong Priority: Minor Attachments: hbase-12444.patch When running hbck, I noticed the number of requests was wrong: Average load: 466.41237113402065 Number of requests: -1835941345 Number of regions: 45242 Number of regions in transition: 0 The root cause is that it uses an int, which clearly overflowed. I'll update a patch later. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
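The negative "Number of requests" is a classic 32-bit signed overflow: the cumulative counter passed 2^31 - 1 and wrapped. A quick demonstration of what happens when a growing counter is forced through 32-bit two's-complement, as a Java int or an int32 protobuf field would do (the request total used here is inferred from the wrapped value, not taken from the report):

```python
# Demonstrate 32-bit signed wraparound of a cumulative request counter.

def as_int32(n):
    """Truncate an arbitrary integer to 32-bit two's-complement."""
    n &= 0xFFFFFFFF
    return n - 0x100000000 if n >= 0x80000000 else n

# A counter of 2,459,025,951 total requests wraps to exactly the value
# hbck printed; a 64-bit field (the patch's uint64) holds it unchanged.
print(as_int32(2459025951))   # -1835941345
```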
[jira] [Updated] (HBASE-12444) Total number of requests overflow because it's int
[ https://issues.apache.org/jira/browse/HBASE-12444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhaoyunjiong updated HBASE-12444: - Hadoop Flags: Incompatible change Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-12444) Total number of requests overflow because it's int
[ https://issues.apache.org/jira/browse/HBASE-12444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14204630#comment-14204630 ] Hadoop QA commented on HBASE-12444: ---
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12680559/hbase-12444.patch against trunk revision . ATTACHMENT ID: 12680559
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:red}-1 javac{color}. The patch appears to cause the mvn compile goal to fail. Compilation errors resume:
[ERROR] COMPILATION ERROR :
[ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java:[1094,15] method setTotalNumberOfRequests in class org.apache.hadoop.hbase.protobuf.generated.ClusterStatusProtos.ServerLoad.Builder cannot be applied to given types;
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.2:compile (default-compile) on project hbase-server: Compilation failure
[ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java:[1094,15] method setTotalNumberOfRequests in class org.apache.hadoop.hbase.protobuf.generated.ClusterStatusProtos.ServerLoad.Builder cannot be applied to given types;
[ERROR] required: int
[ERROR] found: long
[ERROR] reason: actual argument long cannot be converted to int by method invocation conversion
[ERROR] - [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn goals -rf :hbase-server
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/11625//console This message is automatically generated.
[jira] [Updated] (HBASE-9817) Add throughput quota and enforce hard limits
[ https://issues.apache.org/jira/browse/HBASE-9817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Liangliang updated HBASE-9817: - Attachment: HBASE-9817.patch Add throughput quota and enforce hard limits Key: HBASE-9817 URL: https://issues.apache.org/jira/browse/HBASE-9817 Project: HBase Issue Type: New Feature Components: Coprocessors Reporter: He Liangliang Assignee: He Liangliang Attachments: HBASE-9817.patch There are plans to add region-count and table-count quotas, as mentioned in HBASE-8410. However, it is also quite useful to control request throughput inside the region server. For example, we don't want a data-dumping task to affect online services, and it is better to enforce the throughput quota inside HBase. Another common scenario is multi-tenancy, i.e. a cluster shared by multiple applications. The following rules will be supported: 1. A per-user quota limits the total read/write throughput initiated by a single user on any table. 2. A per-(user, table) quota limits the total read/write throughput initiated by a single user on a specified table. The implementation will use a coprocessor to intercept and check each request on each region. Each region server allocates a portion of the total specified quota (for that user, or user + table) based on the portion of active regions (of the whole cluster, or of the specified table) assigned to that region server. A request is rejected or delayed if the limit is reached.
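The allocation and enforcement described above can be sketched as follows. This is a hypothetical illustration, not the attached patch's code: each region server is granted user_quota * (local active regions / total active regions), and a simple token bucket enforces that share per user (or per user + table).

```python
# Illustrative sketch of per-server quota allocation plus token-bucket
# enforcement (names and numbers are assumptions, not the HBASE-9817 patch).

def server_share(total_quota, local_regions, total_regions):
    """Portion of a user's cluster-wide quota this region server may serve."""
    return total_quota * local_regions / total_regions

class TokenBucket:
    def __init__(self, rate_per_sec, burst):
        self.rate, self.capacity = rate_per_sec, burst
        self.tokens, self.last = float(burst), 0.0

    def allow(self, cost, now):
        # Refill in proportion to elapsed time, then try to spend `cost`.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False  # the coprocessor would reject or delay the request
```

For example, with a 1000 req/s quota for a user and 20 of the table's 100 active regions local, the server would enforce a 200 req/s share.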
[jira] [Updated] (HBASE-12451) IncreasingToUpperBoundRegionSplitPolicy may cause unnecessary region splits in rolling update of cluster
[ https://issues.apache.org/jira/browse/HBASE-12451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-12451: Summary: IncreasingToUpperBoundRegionSplitPolicy may cause unnecessary region splits in rolling update of cluster (was: IncreasingToUpperBoundRegionSplitPolicy may cause unnecessary region split in rolling update of cluster) IncreasingToUpperBoundRegionSplitPolicy may cause unnecessary region splits in rolling update of cluster Key: HBASE-12451 URL: https://issues.apache.org/jira/browse/HBASE-12451 Project: HBase Issue Type: Bug Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Fix For: 2.0.0 Currently, IncreasingToUpperBoundRegionSplitPolicy is the default region split policy. In this policy, the split size is the number of regions of the same table on this server, cubed, times 2x the region flush size. But when unloading regions from a regionserver with region_mover.rb, the number of regions of that table on the server decreases, and so does the split size, which may cause the remaining regions on the regionserver to split unnecessarily. Region splits also happen when loading regions onto a regionserver. An improvement may be to set a minimum split size in IncreasingToUpperBoundRegionSplitPolicy. Suggestions are welcome. Thanks~
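The effect described above can be sketched numerically. The rule is min(region_count^3 * 2 * flush_size, max_file_size), where region_count is the number of regions of the table on this server; the configuration values below are illustrative defaults, not taken from the report:

```python
# Sketch of the IncreasingToUpperBoundRegionSplitPolicy threshold and why
# draining a server shrinks it (constants are assumed example values).

FLUSH_SIZE = 128 * 1024**2      # e.g. hbase.hregion.memstore.flush.size
MAX_FILE_SIZE = 10 * 1024**3    # e.g. hbase.hregion.max.filesize

def split_size(regions_of_table_on_server):
    return min(regions_of_table_on_server ** 3 * 2 * FLUSH_SIZE, MAX_FILE_SIZE)

# With 3 local regions the threshold is 27 * 256 MB = 6.75 GB, but after
# region_mover.rb drains the server down to 1 region it collapses to 256 MB,
# so a remaining region larger than that may split immediately.
```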
[jira] [Updated] (HBASE-12451) IncreasingToUpperBoundRegionSplitPolicy may cause unnecessary region splits in rolling update of cluster
[ https://issues.apache.org/jira/browse/HBASE-12451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-12451: Description: Currently, IncreasingToUpperBoundRegionSplitPolicy is the default region split policy. In this policy, the split size is the number of regions of the same table on this server, cubed, times 2x the region flush size. But when unloading regions from a regionserver with region_mover.rb, the number of regions of that table on the server decreases, and so does the split size, which may cause the remaining regions on the regionserver to split unnecessarily. Region splits also happen when loading regions onto a regionserver. An improvement may be to set a minimum split size in IncreasingToUpperBoundRegionSplitPolicy. Suggestions are welcome. Thanks~
[jira] [Created] (HBASE-12451) IncreasingToUpperBoundRegionSplitPolicy may cause unnecessary region split in rolling update of cluster
Liu Shaohui created HBASE-12451: --- Summary: IncreasingToUpperBoundRegionSplitPolicy may cause unnecessary region split in rolling update of cluster Key: HBASE-12451 URL: https://issues.apache.org/jira/browse/HBASE-12451 Project: HBase Issue Type: Bug Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Fix For: 2.0.0
[jira] [Created] (HBASE-12452) Add regular expression based split policy
He Liangliang created HBASE-12452: - Summary: Add regular expression based split policy Key: HBASE-12452 URL: https://issues.apache.org/jira/browse/HBASE-12452 Project: HBase Issue Type: Improvement Components: regionserver Reporter: He Liangliang Assignee: He Liangliang Priority: Minor The current DelimitedKeyPrefixRegionSplitPolicy is not flexible enough to describe the split-point prefix in some cases. A regex-based split policy is proposed. For example, ^[^\x00]+\x00[^\x00]+\x00 means the split point will always be truncated to the prefix ending at the second \0 character. We rely on this to implement local secondary indexes with data types.
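The proposed pattern can be illustrated as follows: the regex matches up to and including the second \0, so all rows sharing that prefix stay in the same region. The row-key layout in the example is an assumption for illustration, not from the issue:

```python
import re

# Sketch of the proposed regex-based split-point truncation.
PATTERN = re.compile(rb'^[^\x00]+\x00[^\x00]+\x00')

def truncate_split_point(split_point: bytes) -> bytes:
    m = PATTERN.match(split_point)
    return m.group(0) if m else split_point  # no match: leave the point as-is

# A hypothetical row key of the form <entity>\0<index>\0<value>... is
# truncated at the second \0, so a region never splits inside one entity/index:
print(truncate_split_point(b'entity1\x00idx_age\x0042\x00payload'))
# b'entity1\x00idx_age\x00'
```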
[jira] [Updated] (HBASE-12452) Add regular expression based split policy
[ https://issues.apache.org/jira/browse/HBASE-12452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Liangliang updated HBASE-12452: -- Attachment: HBASE-12452.patch
[jira] [Updated] (HBASE-12452) Add regular expression based split policy
[ https://issues.apache.org/jira/browse/HBASE-12452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Liangliang updated HBASE-12452: -- Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-12451) IncreasingToUpperBoundRegionSplitPolicy may cause unnecessary region splits in rolling update of cluster
[ https://issues.apache.org/jira/browse/HBASE-12451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14204741#comment-14204741 ] zhangduo commented on HBASE-12451: -- Setting a minimum split size would delay the first split operation, which matters for load balancing (a new table will have only one region if it is not pre-split). What about reducing the boost factor (currently region_count^3) and using the total region count of the table (I do not know where it is stored, but I think we can get it somewhere, such as the master or ZooKeeper) instead of the region count of the table on this server?
[jira] [Commented] (HBASE-12444) Total number of requests overflow because it's int
[ https://issues.apache.org/jira/browse/HBASE-12444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14204806#comment-14204806 ] zhaoyunjiong commented on HBASE-12444: -- Do I need to include the two generated files below in the patch? hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ClusterStatusProtos.java hbase-rest/src/main/java/org/apache/hadoop/hbase/rest/protobuf/generated/StorageClusterStatusMessage.java
[jira] [Commented] (HBASE-12452) Add regular expression based split policy
[ https://issues.apache.org/jira/browse/HBASE-12452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14204825#comment-14204825 ] Hadoop QA commented on HBASE-12452: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12680573/HBASE-12452.patch against trunk revision . ATTACHMENT ID: 12680573 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:red}-1 checkstyle{color}. The applied patch generated 3785 checkstyle errors (more than the trunk's current 3783 errors). {color:red}-1 findbugs{color}. The patch appears to introduce 2 new Findbugs (version 2.0.3) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings (more than the trunk's current 0 warnings). {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/11626//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11626//artifact/patchprocess/patchReleaseAuditWarnings.txt Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11626//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11626//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11626//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11626//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11626//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11626//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11626//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11626//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11626//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11626//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11626//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/11626//artifact/patchprocess/checkstyle-aggregate.html Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11626//console This message is automatically generated. Add regular expression based split policy - Key: HBASE-12452 URL: https://issues.apache.org/jira/browse/HBASE-12452 Project: HBase Issue Type: Improvement Components: regionserver Reporter: He Liangliang Assignee: He Liangliang Priority: Minor Attachments: HBASE-12452.patch The current DelimitedKeyPrefixRegionSplitPolicy split policy is not flexible enough to describe the split point prefix in some cases. A regex-based split policy is proposed. For example, ^[^\x00]+\x00[^\x00]+\x00 means the split point will always be truncated to the prefix ending at the second \0 character. We rely on this to implement local secondary indexes with data types. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
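The truncation the proposal describes can be sketched as follows. This is a hypothetical illustration, not the attached patch: `truncate` and its String-based signature are invented for the example (the real split policy operates on byte[] split points), but the regex semantics match the example in the description.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Truncate a candidate split point to the prefix matched by the anchored
// pattern, e.g. everything up to and including the second \x00 separator.
class RegexSplitPointSketch {
    static String truncate(String splitPoint, Pattern prefixPattern) {
        Matcher m = prefixPattern.matcher(splitPoint);
        // lookingAt() matches at the start only; fall back to the full
        // key when the prefix pattern does not match.
        return m.lookingAt() ? splitPoint.substring(0, m.end()) : splitPoint;
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("^[^\\x00]+\\x00[^\\x00]+\\x00");
        // "user\0index\0payload" -> "user\0index\0": rows sharing the
        // two-part prefix stay in the same region after a split.
        System.out.println(truncate("user\u0000index\u0000payload", p).length());
    }
}
```

Keeping all rows with the same two-part prefix on one side of the split point is what makes the local secondary index use case work.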
[jira] [Commented] (HBASE-12445) hbase is removing all remaining cells immediately after the cell marked with marker = KeyValue.Type.DeleteColumn via PUT
[ https://issues.apache.org/jira/browse/HBASE-12445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14204958#comment-14204958 ] sri commented on HBASE-12445: - In ProtoBufUtil.toMutation(final MutationType type, final Mutation mutation, MutationProto.Builder builder, long nonce), the statement QualifierValue.Builder valueBuilder = QualifierValue.newBuilder(); at line number 1110 should be inside the inner for loop at line 1115, or the valueBuilder needs to be cleared. Otherwise valueBuilder carries over the values that were previously set. In this case the DeleteType is carried over to the following columns, deleting all of them. hbase is removing all remaining cells immediately after the cell marked with marker = KeyValue.Type.DeleteColumn via PUT Key: HBASE-12445 URL: https://issues.apache.org/jira/browse/HBASE-12445 Project: HBase Issue Type: Bug Reporter: sri Attachments: TestPutAfterDeleteColumn.java Code executed: {code}
@Test
public void testHbasePutDeleteCell() throws Exception {
  TableName tableName = TableName.valueOf("my_test");
  Configuration configuration = HBaseConfiguration.create();
  HTableInterface table = new HTable(configuration, tableName);
  final byte[] rowKey = Bytes.toBytes("12345");
  final byte[] family = Bytes.toBytes("default");
  // put one row
  Put put = new Put(rowKey);
  put.add(family, Bytes.toBytes("A"), Bytes.toBytes("a"));
  put.add(family, Bytes.toBytes("B"), Bytes.toBytes("b"));
  put.add(family, Bytes.toBytes("C"), Bytes.toBytes("c"));
  put.add(family, Bytes.toBytes("D"), Bytes.toBytes("d"));
  table.put(put);
  // get row back and assert the values
  Get get = new Get(rowKey);
  Result result = table.get(get);
  assertTrue("Column A value should be a", Bytes.toString(result.getValue(family, Bytes.toBytes("A"))).equals("a"));
  assertTrue("Column B value should be b", Bytes.toString(result.getValue(family, Bytes.toBytes("B"))).equals("b"));
  assertTrue("Column C value should be c", Bytes.toString(result.getValue(family, Bytes.toBytes("C"))).equals("c"));
  assertTrue("Column D value should be d", Bytes.toString(result.getValue(family, Bytes.toBytes("D"))).equals("d"));
  // put the same row again with column C deleted
  put = new Put(rowKey);
  put.add(family, Bytes.toBytes("A"), Bytes.toBytes("a1"));
  put.add(family, Bytes.toBytes("B"), Bytes.toBytes("b1"));
  KeyValue marker = new KeyValue(rowKey, family, Bytes.toBytes("C"),
      HConstants.LATEST_TIMESTAMP, KeyValue.Type.DeleteColumn);
  put.add(marker);
  put.add(family, Bytes.toBytes("D"), Bytes.toBytes("d1"));
  table.put(put);
  // get row back and assert the values
  get = new Get(rowKey);
  result = table.get(get);
  assertTrue("Column A value should be a1", Bytes.toString(result.getValue(family, Bytes.toBytes("A"))).equals("a1"));
  assertTrue("Column B value should be b1", Bytes.toString(result.getValue(family, Bytes.toBytes("B"))).equals("b1"));
  assertTrue("Column C should not exist", result.getValue(family, Bytes.toBytes("C")) == null);
  assertTrue("Column D value should be d1", Bytes.toString(result.getValue(family, Bytes.toBytes("D"))).equals("d1"));
}
{code} The last assertion fails: cell D is also deleted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
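The carry-over sri describes is the classic reused-builder pitfall: an optional field set on one iteration survives into the next unless the builder is cleared or re-created. A toy sketch (the builder class below is invented, standing in for the generated protobuf QualifierValue.Builder; the field names are illustrative only):

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for a protobuf builder reused across loop iterations.
class CarryOverSketch {
    static class QualifierValueBuilder {
        String qualifier;
        String deleteType; // optional field, only set for delete markers
        QualifierValueBuilder setQualifier(String q) { qualifier = q; return this; }
        QualifierValueBuilder setDeleteType(String d) { deleteType = d; return this; }
        QualifierValueBuilder clear() { qualifier = null; deleteType = null; return this; }
        String build() { return qualifier + ":" + deleteType; }
    }

    static List<String> serialize(String[][] cells, boolean clearPerCell) {
        // shared builder, like the one created once before the loop
        QualifierValueBuilder b = new QualifierValueBuilder();
        List<String> out = new ArrayList<>();
        for (String[] cell : cells) {
            if (clearPerCell) b.clear();                  // the fix
            b.setQualifier(cell[0]);
            if (cell[1] != null) b.setDeleteType(cell[1]); // delete marker only
            out.add(b.build());
        }
        return out;
    }

    public static void main(String[] args) {
        String[][] cells = {{"C", "DELETE_COLUMN"}, {"D", null}};
        // Without clearing, C's DELETE_COLUMN leaks onto D, so D is
        // serialized as a delete too: the bug in the test above.
        System.out.println(serialize(cells, false));
        System.out.println(serialize(cells, true));
    }
}
```

This mirrors why cell D vanished: the DeleteType set for the C marker was still present in the builder when D was serialized.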
[jira] [Created] (HBASE-12453) Make region available once it's open
Jimmy Xiang created HBASE-12453: --- Summary: Make region available once it's open Key: HBASE-12453 URL: https://issues.apache.org/jira/browse/HBASE-12453 Project: HBase Issue Type: Bug Reporter: Jimmy Xiang Assignee: Jimmy Xiang Currently (in trunk, with zk-less assignment), a region is available to serve requests only after the RS notifies the master that the region is open and the meta is updated with the new location. We may be able to do better than this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12445) hbase is removing all remaining cells immediately after the cell marked with marker = KeyValue.Type.DeleteColumn via PUT
[ https://issues.apache.org/jira/browse/HBASE-12445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hani Nadra updated HBASE-12445: --- Attachment: 0001-HBASE-11788-hbase-is-not-deleting-the-cell-when-a-Pu.patch Here is the patch with the updated test and the fix suggested by sri in 1178: QualifierValue.Builder valueBuilder = QualifierValue.newBuilder(); at line number 1110 should be inside the inner for loop at line 1115. Otherwise valueBuilder carries over the values that were previously set. In this case the DeleteType is carried over to the following columns, deleting all of them. Attached the test case to the defect. Could you please reopen the defect. hbase is removing all remaining cells immediately after the cell marked with marker = KeyValue.Type.DeleteColumn via PUT Key: HBASE-12445 URL: https://issues.apache.org/jira/browse/HBASE-12445 Project: HBase Issue Type: Bug Reporter: sri Attachments: 0001-HBASE-11788-hbase-is-not-deleting-the-cell-when-a-Pu.patch, TestPutAfterDeleteColumn.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12445) hbase is removing all remaining cells immediately after the cell marked with marker = KeyValue.Type.DeleteColumn via PUT
[ https://issues.apache.org/jira/browse/HBASE-12445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-12445: --- Status: Patch Available (was: Open) hbase is removing all remaining cells immediately after the cell marked with marker = KeyValue.Type.DeleteColumn via PUT Key: HBASE-12445 URL: https://issues.apache.org/jira/browse/HBASE-12445 Project: HBase Issue Type: Bug Reporter: sri Attachments: 0001-HBASE-11788-hbase-is-not-deleting-the-cell-when-a-Pu.patch, 12445-v2.patch, TestPutAfterDeleteColumn.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12445) hbase is removing all remaining cells immediately after the cell marked with marker = KeyValue.Type.DeleteColumn via PUT
[ https://issues.apache.org/jira/browse/HBASE-12445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-12445: --- Attachment: 12445-v2.patch Slightly modified patch where valueBuilder is cleared inside the loop. hbase is removing all remaining cells immediately after the cell marked with marker = KeyValue.Type.DeleteColumn via PUT Key: HBASE-12445 URL: https://issues.apache.org/jira/browse/HBASE-12445 Project: HBase Issue Type: Bug Reporter: sri Attachments: 0001-HBASE-11788-hbase-is-not-deleting-the-cell-when-a-Pu.patch, 12445-v2.patch, TestPutAfterDeleteColumn.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12444) Total number of requests overflow because it's int
[ https://issues.apache.org/jira/browse/HBASE-12444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14205035#comment-14205035 ] Ted Yu commented on HBASE-12444: Please include generated files for QA to run tests. Total number of requests overflow because it's int -- Key: HBASE-12444 URL: https://issues.apache.org/jira/browse/HBASE-12444 Project: HBase Issue Type: Bug Components: hbck, master, regionserver Reporter: zhaoyunjiong Priority: Minor Attachments: hbase-12444.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12445) hbase is removing all remaining cells immediately after the cell marked with marker = KeyValue.Type.DeleteColumn via PUT
[ https://issues.apache.org/jira/browse/HBASE-12445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14205158#comment-14205158 ] Hadoop QA commented on HBASE-12445: --- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12680604/12445-v2.patch against trunk revision . ATTACHMENT ID: 12680604 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/11627//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11627//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11627//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11627//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11627//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11627//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11627//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11627//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11627//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11627//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11627//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11627//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/11627//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/11627//console This message is automatically generated. 
hbase is removing all remaining cells immediately after the cell marked with marker = KeyValue.Type.DeleteColumn via PUT Key: HBASE-12445 URL: https://issues.apache.org/jira/browse/HBASE-12445 Project: HBase Issue Type: Bug Reporter: sri Attachments: 0001-HBASE-11788-hbase-is-not-deleting-the-cell-when-a-Pu.patch, 12445-v2.patch, TestPutAfterDeleteColumn.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12445) hbase is removing all remaining cells immediately after the cell marked with marker = KeyValue.Type.DeleteColumn via PUT
[ https://issues.apache.org/jira/browse/HBASE-12445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14205198#comment-14205198 ] Ted Yu commented on HBASE-12445: [~hnadra]: Should this issue be assigned to sri ? hbase is removing all remaining cells immediately after the cell marked with marker = KeyValue.Type.DeleteColumn via PUT Key: HBASE-12445 URL: https://issues.apache.org/jira/browse/HBASE-12445 Project: HBase Issue Type: Bug Reporter: sri Attachments: 0001-HBASE-11788-hbase-is-not-deleting-the-cell-when-a-Pu.patch, 12445-v2.patch, TestPutAfterDeleteColumn.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12445) hbase is removing all remaining cells immediately after the cell marked with marker = KeyValue.Type.DeleteColumn via PUT
[ https://issues.apache.org/jira/browse/HBASE-12445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-12445: --- Fix Version/s: 0.99.2 0.98.9 2.0.0 hbase is removing all remaining cells immediately after the cell marked with marker = KeyValue.Type.DeleteColumn via PUT Key: HBASE-12445 URL: https://issues.apache.org/jira/browse/HBASE-12445 Project: HBase Issue Type: Bug Reporter: sri Fix For: 2.0.0, 0.98.9, 0.99.2 Attachments: 0001-HBASE-11788-hbase-is-not-deleting-the-cell-when-a-Pu.patch, 12445-v2.patch, TestPutAfterDeleteColumn.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12444) Total number of requests overflow because it's int
[ https://issues.apache.org/jira/browse/HBASE-12444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14205269#comment-14205269 ] Elliott Clark commented on HBASE-12444: --- This will break compatibility of rolling restarts. Total number of requests overflow because it's int -- Key: HBASE-12444 URL: https://issues.apache.org/jira/browse/HBASE-12444 Project: HBase Issue Type: Bug Components: hbck, master, regionserver Reporter: zhaoyunjiong Priority: Minor Attachments: hbase-12444.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12445) hbase is removing all remaining cells immediately after the cell marked with marker = KeyValue.Type.DeleteColumn via PUT
[ https://issues.apache.org/jira/browse/HBASE-12445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14205281#comment-14205281 ] sri commented on HBASE-12445: - Hani and I are colleagues. hbase is removing all remaining cells immediately after the cell marked with marker = KeyValue.Type.DeleteColumn via PUT Key: HBASE-12445 URL: https://issues.apache.org/jira/browse/HBASE-12445 Project: HBase Issue Type: Bug Reporter: sri Fix For: 2.0.0, 0.98.9, 0.99.2 Attachments: 0001-HBASE-11788-hbase-is-not-deleting-the-cell-when-a-Pu.patch, 12445-v2.patch, TestPutAfterDeleteColumn.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-12454) Setting didPerformCompaction early in HRegion#compact
Andrew Purtell created HBASE-12454: -- Summary: Setting didPerformCompaction early in HRegion#compact Key: HBASE-12454 URL: https://issues.apache.org/jira/browse/HBASE-12454 Project: HBase Issue Type: Bug Affects Versions: 0.98.8 Reporter: Andrew Purtell

It appears we are setting 'didPerformCompaction' to true before attempting the compaction in HRegion#compact. If Store#compact throws an exception or is interrupted, we won't call Store#cancelRequestedCompaction in the last finally block of the method as it looks like we should.
{code}
--- a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
+++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
@@ -1565,8 +1565,8 @@ public class HRegion implements HeapSize, PropagatingConfigurationObserver {
       doRegionCompactionPrep();
       try {
         status.setStatus("Compacting store " + store);
-        didPerformCompaction = true;
         store.compact(compaction);
+        didPerformCompaction = true;
       } catch (InterruptedIOException iioe) {
         String msg = "compaction interrupted";
         LOG.info(msg, iioe);
{code}
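The failure mode is independent of HBase and can be shown with a minimal sketch (plain Java; class and method names are invented for illustration): if a "performed" flag is set before the risky call and the call fails, a finally block guarded by that flag, like the cancelRequestedCompaction call described above, never runs.

```java
// Toy reproduction of the flag-ordering bug (not HBase code).
public class FlagOrderingSketch {
    // Returns true if the cleanup guarded by !didPerform actually ran.
    static boolean cleanupRuns(boolean setFlagBeforeOperation) {
        boolean didPerform = false;
        boolean cleanedUp = false;
        try {
            try {
                if (setFlagBeforeOperation) {
                    didPerform = true; // buggy order: flag set before the work
                }
                // The operation fails; the fixed order would set
                // didPerform = true only after this call succeeds.
                throw new RuntimeException("compaction interrupted");
            } catch (RuntimeException e) {
                // swallowed, like the InterruptedIOException branch in HRegion#compact
            }
        } finally {
            if (!didPerform) {
                cleanedUp = true; // stands in for store.cancelRequestedCompaction(...)
            }
        }
        return cleanedUp;
    }

    public static void main(String[] args) {
        System.out.println("buggy order cleans up: " + cleanupRuns(true));
        System.out.println("fixed order cleans up: " + cleanupRuns(false));
    }
}
```

With the flag set early, a failed operation still looks "performed" and the cleanup is skipped; moving the assignment after the call restores the invariant the finally block relies on.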
[jira] [Commented] (HBASE-12445) hbase is removing all remaining cells immediately after the cell marked with marker = KeyValue.Type.DeleteColumn via PUT
[ https://issues.apache.org/jira/browse/HBASE-12445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14205291#comment-14205291 ] Ted Yu commented on HBASE-12445: Understood. You two can decide who the assignee is. I can put both your names when committing. Thanks
[jira] [Commented] (HBASE-12454) Setting didPerformCompaction early in HRegion#compact
[ https://issues.apache.org/jira/browse/HBASE-12454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14205296#comment-14205296 ] Andrew Purtell commented on HBASE-12454: Is this right or am I missing something?
[jira] [Updated] (HBASE-12454) Setting didPerformCompaction early in HRegion#compact
[ https://issues.apache.org/jira/browse/HBASE-12454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-12454: --- Attachment: HBASE-12454-0.98.patch
[jira] [Updated] (HBASE-12454) Setting didPerformCompaction early in HRegion#compact
[ https://issues.apache.org/jira/browse/HBASE-12454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-12454: --- Attachment: HBASE-12454.patch
[jira] [Updated] (HBASE-12454) Setting didPerformCompaction early in HRegion#compact
[ https://issues.apache.org/jira/browse/HBASE-12454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-12454: --- Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-12343) Document recommended configuration for 0.98 from HBASE-11964
[ https://issues.apache.org/jira/browse/HBASE-12343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-12343: --- Resolution: Fixed Status: Resolved (was: Patch Available) Pushed to master

Document recommended configuration for 0.98 from HBASE-11964
Key: HBASE-12343 URL: https://issues.apache.org/jira/browse/HBASE-12343 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 2.0.0 Attachments: HBASE-12343.patch

We're not committing the configuration changes from HBASE-11964 to 0.98, but they should be the recommended configuration for replication. Add a paragraph on this to the replication section of the manual.
[jira] [Updated] (HBASE-12454) Setting didPerformCompaction early in HRegion#compact
[ https://issues.apache.org/jira/browse/HBASE-12454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-12454: --- Description:
It appears we are setting 'didPerformCompaction' to true before attempting the compaction in HRegion#compact. If Store#compact throws an exception or is interrupted, we won't call Store#cancelRequestedCompaction in the last finally block of the method as it looks like we should.
{code}
--- a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
+++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
@@ -1540,58 +1540,58 @@ public class HRegion implements HeapSize, PropagatingConfigurationObserver {
       return false;
     }
     status = TaskMonitor.get().createStatus("Compacting " + store + " in " + this);
     if (this.closed.get()) {
       String msg = "Skipping compaction on " + this + " because closed";
       LOG.debug(msg);
       status.abort(msg);
       return false;
     }
     boolean wasStateSet = false;
     try {
       synchronized (writestate) {
         if (writestate.writesEnabled) {
           wasStateSet = true;
           ++writestate.compacting;
         } else {
           String msg = "NOT compacting region " + this + ". Writes disabled.";
           LOG.info(msg);
           status.abort(msg);
           return false;
         }
       }
       LOG.info("Starting compaction on " + store + " in region " + this
           + (compaction.getRequest().isOffPeak() ? " as an off-peak compaction" : ""));
       doRegionCompactionPrep();
       try {
         status.setStatus("Compacting store " + store);
-        didPerformCompaction = true;
         store.compact(compaction);
+        didPerformCompaction = true;
       } catch (InterruptedIOException iioe) {
         String msg = "compaction interrupted";
         LOG.info(msg, iioe);
         status.abort(msg);
         return false;
       }
     } finally {
       if (wasStateSet) {
         synchronized (writestate) {
           --writestate.compacting;
           if (writestate.compacting <= 0) {
             writestate.notifyAll();
           }
         }
       }
     }
     status.markComplete("Compaction complete");
     return true;
   } finally {
     try {
       if (!didPerformCompaction) store.cancelRequestedCompaction(compaction);
       if (status != null) status.cleanup();
     } finally {
       lock.readLock().unlock();
     }
   }
 }
{code}
was: It appears we are setting 'didPerformCompaction' to true before attempting the compaction in HRegion#compact. If Store#compact throws an exception or is interrupted, we won't call Store#cancelRequestedCompaction in the last finally block of the method as it looks like we should.
[jira] [Commented] (HBASE-12445) hbase is removing all remaining cells immediately after the cell marked with marker = KeyValue.Type.DeleteColumn via PUT
[ https://issues.apache.org/jira/browse/HBASE-12445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14205339#comment-14205339 ] Ted Yu commented on HBASE-12445: [~apurtell] [~enis]: Want to take a look? Thanks
[jira] [Commented] (HBASE-12445) hbase is removing all remaining cells immediately after the cell marked with marker = KeyValue.Type.DeleteColumn via PUT
[ https://issues.apache.org/jira/browse/HBASE-12445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14205375#comment-14205375 ] Hani Nadra commented on HBASE-12445: Hi Ted, if it is just a matter of adding the name to the code, sri can be the assignee. Please let us know if anything further is needed to get the patch applied.
[jira] [Commented] (HBASE-12454) Setting didPerformCompaction early in HRegion#compact
[ https://issues.apache.org/jira/browse/HBASE-12454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14205389#comment-14205389 ] Lars Hofhansl commented on HBASE-12454: --- That looks right to me. Too bad the RC is out already.
[jira] [Commented] (HBASE-12319) Inconsistencies during region recovery due to close/open of a region during recovery
[ https://issues.apache.org/jira/browse/HBASE-12319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14205404#comment-14205404 ] Andrew Purtell commented on HBASE-12319: +1, let's commit this

Inconsistencies during region recovery due to close/open of a region during recovery
Key: HBASE-12319 URL: https://issues.apache.org/jira/browse/HBASE-12319 Project: HBase Issue Type: Bug Affects Versions: 0.98.7, 0.99.1 Reporter: Devaraj Das Assignee: Jeffrey Zhong Priority: Critical Fix For: 2.0.0, 0.98.9, 0.99.2 Attachments: HBASE-12319-v2.patch, HBASE-12319.patch

In one of my test runs, I saw the following:
{noformat}
2014-10-14 13:45:30,782 DEBUG [StoreOpener-51af4bd23dc32a940ad2dd5435f00e1d-1] regionserver.HStore: loaded hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d/test_cf/d6df5cfe15ca41d68c619489fbde4d04, isReference=false, isBulkLoadResult=false, seqid=141197, majorCompaction=true
2014-10-14 13:45:30,788 DEBUG [RS_OPEN_REGION-hor9n01:60020-1] regionserver.HRegion: Found 3 recovered edits file(s) under hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d
.
.
2014-10-14 13:45:31,916 WARN [RS_OPEN_REGION-hor9n01:60020-1] regionserver.HRegion: Null or non-existent edits file: hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d/recovered.edits/0198080
{noformat}
The above log is from a regionserver, say RS2. From the initial analysis it seemed like the master asked a certain regionserver (let's say RS1) to open the region and, for some reason, asked it to close soon after. The open was still proceeding on RS1, but the master reassigned the region to RS2. This also started the recovery, but RS2 ended up seeing an inconsistent view of the recovered-edits files (it reports missing files, as per the logs above) since the first regionserver (RS1) deleted some files after it completed the recovery. When RS2 really opens the region, it might not see the recent data that was written by flushes on hor9n10 during the recovery process. Reads of that data would have inconsistencies.
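The race described in the report can be modeled in a few lines (plain Java, not HBase code; the file names are invented): RS2 snapshots the recovered.edits listing while RS1 still owns the region, RS1 then finishes its replay and deletes the files, and RS2's replay loop finds every file in its stale listing missing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Toy model of the recovered-edits race (hypothetical file names).
public class RecoveredEditsRaceSketch {
    public static void main(String[] args) {
        // Files under <region>/recovered.edits/ while RS1 is still recovering.
        TreeSet<String> dir = new TreeSet<>(List.of("0000100", "0000200", "0198080"));

        // RS2, newly assigned the region, lists the directory first.
        List<String> rs2Listing = new ArrayList<>(dir);

        // RS1 finishes its replay and deletes the edits it applied.
        dir.clear();

        // RS2 now replays from its stale listing.
        int missing = 0;
        for (String f : rs2Listing) {
            if (!dir.contains(f)) {
                missing++; // each hit corresponds to a "Null or non-existent edits file" WARN
            }
        }
        System.out.println("missing " + missing + " of " + rs2Listing.size());
    }
}
```

The listing and the replay are not atomic with respect to the other server's cleanup, which is exactly why RS2 can both warn about missing files and silently skip edits it never saw.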
[jira] [Commented] (HBASE-12454) Setting didPerformCompaction early in HRegion#compact
[ https://issues.apache.org/jira/browse/HBASE-12454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14205407#comment-14205407 ] Andrew Purtell commented on HBASE-12454: I'm going to veto the RC for this, and because HBASE-12319 didn't quite make it but was marked critical
[jira] [Commented] (HBASE-12454) Setting didPerformCompaction early in HRegion#compact
[ https://issues.apache.org/jira/browse/HBASE-12454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14205408#comment-14205408 ] Andrew Purtell commented on HBASE-12454: Will commit this shortly to 0.98+ unless there is an objection
[jira] [Commented] (HBASE-12319) Inconsistencies during region recovery due to close/open of a region during recovery
[ https://issues.apache.org/jira/browse/HBASE-12319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14205411#comment-14205411 ] Andrew Purtell commented on HBASE-12319: Will commit this shortly unless there is an objection Inconsistencies during region recovery due to close/open of a region during recovery Key: HBASE-12319 URL: https://issues.apache.org/jira/browse/HBASE-12319 Project: HBase Issue Type: Bug Affects Versions: 0.98.7, 0.99.1 Reporter: Devaraj Das Assignee: Jeffrey Zhong Priority: Critical Fix For: 2.0.0, 0.98.9, 0.99.2 Attachments: HBASE-12319-v2.patch, HBASE-12319.patch In one of my test runs, I saw the following: {noformat} 2014-10-14 13:45:30,782 DEBUG [StoreOpener-51af4bd23dc32a940ad2dd5435f00e1d-1] regionserver.HStore: loaded hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d/test_cf/d6df5cfe15ca41d68c619489fbde4d04, isReference=false, isBulkLoadResult=false, seqid=141197, majorCompaction=true 2014-10-14 13:45:30,788 DEBUG [RS_OPEN_REGION-hor9n01:60020-1] regionserver.HRegion: Found 3 recovered edits file(s) under hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d . . 2014-10-14 13:45:31,916 WARN [RS_OPEN_REGION-hor9n01:60020-1] regionserver.HRegion: Null or non-existent edits file: hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d/recovered.edits/0198080 {noformat} The above logs is from a regionserver, say RS2. From the initial analysis it seemed like the master asked a certain regionserver to open the region (let's say RS1) and for some reason asked it to close soon after. The open was still proceeding on RS1 but the master reassigned the region to RS2. 
RS2 also started the recovery, but it ended up seeing an inconsistent view of the recovered-edits files (it reports missing files, as per the logs above), since the first regionserver (RS1) deleted some files after it completed the recovery. When RS2 really opens the region, it might not see the recent data that was written by flushes on hor9n10 during the recovery process. Reads of that data would have inconsistencies. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
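The race described above can be reproduced in miniature outside HBase: one regionserver deletes a recovered-edits file after replaying it, while the other still works from a path captured earlier. A minimal standalone sketch using plain java.nio (no HBase involved; the file name 0198080 is just taken from the log above):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

// Miniature model of the recovered-edits race: RS1 deletes an edits file
// after replaying it, while RS2 still holds a path from an earlier listing.
public class RecoveredEditsRace {
    static String simulate() throws IOException {
        Path dir = Files.createTempDirectory("recovered.edits");
        Path edits = dir.resolve("0198080");
        Files.write(edits, new byte[] {1, 2, 3});

        Path stale = edits;   // RS2's view, captured before RS1's cleanup
        Files.delete(edits);  // RS1 finishes replay and deletes the file

        try {
            Files.readAllBytes(stale); // RS2 reads from the stale listing
            return "read ok";
        } catch (NoSuchFileException e) {
            return "non-existent edits file";
        } finally {
            Files.delete(dir);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(simulate()); // prints "non-existent edits file"
    }
}
```

The WARN in the log above is the benign symptom; the dangerous part is that data replayed and flushed by RS1 may not be visible to RS2 at all.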
[jira] [Updated] (HBASE-12445) hbase is removing all remaining cells immediately after the cell marked with marker = KeyValue.Type.DeleteColumn via PUT
[ https://issues.apache.org/jira/browse/HBASE-12445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-12445: --- Fix Version/s: (was: 0.98.9) 0.98.8 Hadoop Flags: Reviewed Do you want this in the next 0.98.8 RC [~apurtell] ? hbase is removing all remaining cells immediately after the cell marked with marker = KeyValue.Type.DeleteColumn via PUT Key: HBASE-12445 URL: https://issues.apache.org/jira/browse/HBASE-12445 Project: HBase Issue Type: Bug Reporter: sri Fix For: 2.0.0, 0.98.9, 0.99.2 Attachments: 0001-HBASE-11788-hbase-is-not-deleting-the-cell-when-a-Pu.patch, 12445-v2.patch, TestPutAfterDeleteColumn.java Code executed: {code} @Test public void testHbasePutDeleteCell() throws Exception { TableName tableName = TableName.valueOf("my_test"); Configuration configuration = HBaseConfiguration.create(); HTableInterface table = new HTable(configuration, tableName); final byte[] rowKey = Bytes.toBytes("12345"); final byte[] family = Bytes.toBytes("default"); // put one row Put put = new Put(rowKey); put.add(family, Bytes.toBytes("A"), Bytes.toBytes("a")); put.add(family, Bytes.toBytes("B"), Bytes.toBytes("b")); put.add(family, Bytes.toBytes("C"), Bytes.toBytes("c")); put.add(family, Bytes.toBytes("D"), Bytes.toBytes("d")); table.put(put); // get row back and assert the values Get get = new Get(rowKey); Result result = table.get(get); assertTrue("Column A value should be a", Bytes.toString(result.getValue(family, Bytes.toBytes("A"))).equals("a")); assertTrue("Column B value should be b", Bytes.toString(result.getValue(family, Bytes.toBytes("B"))).equals("b")); assertTrue("Column C value should be c", Bytes.toString(result.getValue(family, Bytes.toBytes("C"))).equals("c")); assertTrue("Column D value should be d", Bytes.toString(result.getValue(family, Bytes.toBytes("D"))).equals("d")); // put the same row again with C column deleted put = new Put(rowKey); put.add(family, Bytes.toBytes("A"), Bytes.toBytes("a1")); put.add(family, Bytes.toBytes("B"), Bytes.toBytes("b1")); KeyValue marker = new KeyValue(rowKey, family, Bytes.toBytes("C"), HConstants.LATEST_TIMESTAMP, KeyValue.Type.DeleteColumn); put.add(marker); put.add(family, Bytes.toBytes("D"), Bytes.toBytes("d1")); table.put(put); // get row back and assert the values get = new Get(rowKey); result = table.get(get); assertTrue("Column A value should be a1", Bytes.toString(result.getValue(family, Bytes.toBytes("A"))).equals("a1")); assertTrue("Column B value should be b1", Bytes.toString(result.getValue(family, Bytes.toBytes("B"))).equals("b1")); assertTrue("Column C should not exist", result.getValue(family, Bytes.toBytes("C")) == null); assertTrue("Column D value should be d1", Bytes.toString(result.getValue(family, Bytes.toBytes("D"))).equals("d1")); } {code} The last assertion fails; cell D is also deleted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
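The expected semantics behind the failing assertion can be stated in a few lines: a DeleteColumn marker masks only cells of its own column whose timestamp is at or below the marker's, so column D written in the same Put should survive. A minimal illustrative model of that rule (plain Java, not HBase's actual ScanDeleteTracker implementation):

```java
// Illustrative model of DeleteColumn masking: a marker hides cells in the
// SAME column with timestamp <= the marker's; other columns are unaffected.
// The reported bug effectively masked following cells in other columns too.
public class DeleteColumnModel {
    static boolean masked(String cellCol, long cellTs, String markerCol, long markerTs) {
        return cellCol.equals(markerCol) && cellTs <= markerTs;
    }

    public static void main(String[] args) {
        long t = 100L; // marker timestamp (LATEST_TIMESTAMP resolves at write time)
        // C at or before the marker is deleted; D from the same Put is not.
        System.out.println(masked("C", 100L, "C", t)); // true
        System.out.println(masked("D", 100L, "C", t)); // false
    }
}
```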
[jira] [Commented] (HBASE-11964) Improve spreading replication load from failed regionservers
[ https://issues.apache.org/jira/browse/HBASE-11964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14205428#comment-14205428 ] Hudson commented on HBASE-11964: SUCCESS: Integrated in HBase-TRUNK #5757 (See [https://builds.apache.org/job/HBase-TRUNK/5757/]) HBASE-12343 Document recommended configuration for 0.98 from HBASE-11964 (apurtell: rev 724b4a4693138e518e420bee1dfef3b69b0d0642) * src/main/docbkx/ops_mgt.xml Improve spreading replication load from failed regionservers Key: HBASE-11964 URL: https://issues.apache.org/jira/browse/HBASE-11964 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 2.0.0, 0.99.2 Attachments: HBASE-11964.patch, HBASE-11964.patch, HBASE-11964.patch Improve replication source thread handling. Improve fanout when transferring queues. Ensure replication sources terminate properly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12343) Document recommended configuration for 0.98 from HBASE-11964
[ https://issues.apache.org/jira/browse/HBASE-12343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14205429#comment-14205429 ] Hudson commented on HBASE-12343: SUCCESS: Integrated in HBase-TRUNK #5757 (See [https://builds.apache.org/job/HBase-TRUNK/5757/]) HBASE-12343 Document recommended configuration for 0.98 from HBASE-11964 (apurtell: rev 724b4a4693138e518e420bee1dfef3b69b0d0642) * src/main/docbkx/ops_mgt.xml Document recommended configuration for 0.98 from HBASE-11964 Key: HBASE-12343 URL: https://issues.apache.org/jira/browse/HBASE-12343 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 2.0.0 Attachments: HBASE-12343.patch We're not committing the configuration changes from HBASE-11964 to 0.98 but they should be the recommend configuration for replication. Add a paragraph to the replication section of the manual on this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-9817) Add throughput quota and enforce hard limits
[ https://issues.apache.org/jira/browse/HBASE-9817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14205430#comment-14205430 ] Jonathan Hsieh commented on HBASE-9817: --- I think HBASE-11598 covers the same functionality and has been committed by [~mbertozzi]. Do you guys agree? Add throughput quota and enforce hard limits Key: HBASE-9817 URL: https://issues.apache.org/jira/browse/HBASE-9817 Project: HBase Issue Type: New Feature Components: Coprocessors Reporter: He Liangliang Assignee: He Liangliang Attachments: HBASE-9817.patch There is planning to add region count and table count quota mentioned in HBASE-8410. However, it's also quite useful to control the requesting throughput inside the region server. For example, we don't want a data dumping task affecting the online services and it's better to enforce the throughput quota inside HBase. Another common scenario is multi-tenancy, i.e. a cluster is shared by multiple applications. The following rules will be supported: 1. per user quota limits the total read/write throughput initiated by a single user on any table. 2. per (user, table) quota limits the total read/write throughput initiated by a single user on a specified table. The implementation will use coprocessor to intercept and check each request on each region. And each region server allocate a portion of quota from the total specified quota (for that user or user + table) based on the portion of active regions (the whole cluster or the specified table) assigned on that region server. The request will be rejected or delayed if the limit is reached. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
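The allocation rule sketched in the description above (each region server enforces a slice of the total quota proportional to its share of the relevant active regions) is simple arithmetic. A hedged sketch with made-up numbers; `localQuota` is a hypothetical helper, not an HBase API:

```java
// Sketch of proportional quota allocation as described in HBASE-9817:
// a region server's local limit is the total quota scaled by the fraction
// of active regions (for that user or user+table) it hosts.
public class QuotaShareSketch {
    static double localQuota(double totalQuota, int localRegions, int totalRegions) {
        if (totalRegions == 0) return 0.0; // no regions assigned yet
        return totalQuota * localRegions / totalRegions;
    }

    public static void main(String[] args) {
        // e.g. 10000 req/s total for a user; this RS hosts 5 of 20 regions.
        System.out.println(localQuota(10000, 5, 20)); // 2500.0
    }
}
```

Requests beyond the local slice would then be rejected or delayed, as the description proposes.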
[jira] [Commented] (HBASE-12454) Setting didPerformCompaction early in HRegion#compact
[ https://issues.apache.org/jira/browse/HBASE-12454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14205433#comment-14205433 ] Hadoop QA commented on HBASE-12454: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12680650/HBASE-12454.patch against trunk revision . ATTACHMENT ID: 12680650 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.TestCompaction org.apache.hadoop.hbase.regionserver.TestCompactionWithCoprocessor Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/11628//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11628//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11628//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11628//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11628//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11628//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11628//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11628//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11628//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11628//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11628//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11628//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/11628//artifact/patchprocess/checkstyle-aggregate.html Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11628//console This message is automatically generated. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12454) Setting didPerformCompaction early in HRegion#compact
[ https://issues.apache.org/jira/browse/HBASE-12454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14205483#comment-14205483 ] Andrew Purtell commented on HBASE-12454: Those are interesting test failures. They didn't fail locally but seem highly relevant. Looking into it. Setting didPerformCompaction early in HRegion#compact - Key: HBASE-12454 URL: https://issues.apache.org/jira/browse/HBASE-12454 Project: HBase Issue Type: Bug Affects Versions: 0.98.8 Reporter: Andrew Purtell Attachments: HBASE-12454-0.98.patch, HBASE-12454.patch It appears we are setting 'didPerformCompaction' to true before attempting the compaction in HRegion#compact. If Store#compact throws an exception or is interrupted, we won't call Store#cancelRequestedCompaction in the last finally block of the method as it looks like we should. {code} --- a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java +++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java @@ -1540,58 +1540,58 @@ public class HRegion implements HeapSize, PropagatingConfigurationObserver { return false; } status = TaskMonitor.get().createStatus("Compacting " + store + " in " + this); if (this.closed.get()) { String msg = "Skipping compaction on " + this + " because closed"; LOG.debug(msg); status.abort(msg); return false; } boolean wasStateSet = false; try { synchronized (writestate) { if (writestate.writesEnabled) { wasStateSet = true; ++writestate.compacting; } else { String msg = "NOT compacting region " + this + ". Writes disabled."; LOG.info(msg); status.abort(msg); return false; } } LOG.info("Starting compaction on " + store + " in region " + this + (compaction.getRequest().isOffPeak() ? " as an off-peak compaction" : "")); doRegionCompactionPrep(); try { status.setStatus("Compacting store " + store); - didPerformCompaction = true; store.compact(compaction); + didPerformCompaction = true; } catch (InterruptedIOException iioe) { String msg = "compaction interrupted"; LOG.info(msg, iioe); status.abort(msg); return false; } } finally { if (wasStateSet) { synchronized (writestate) { --writestate.compacting; if (writestate.compacting <= 0) { writestate.notifyAll(); } } } } status.markComplete("Compaction complete"); return true; } finally { try { if (!didPerformCompaction) store.cancelRequestedCompaction(compaction); - if (status != null) status.cleanup(); } finally { lock.readLock().unlock(); } } }{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
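The patch quoted above boils down to a general try/finally idiom: a "performed" flag must be set only after the risky call returns, otherwise the finally-block cleanup is skipped exactly when it is needed. A minimal standalone sketch of the idiom (hypothetical stand-ins, not the HBase classes):

```java
// Sketch of the didPerformCompaction pattern from the patch above.
// compact() and cancelRequestedCompaction() are hypothetical stand-ins.
public class CompactionFlagSketch {
    static boolean cancelled = false;

    static void compact(boolean fail) {
        if (fail) throw new RuntimeException("compaction interrupted");
    }

    static void cancelRequestedCompaction() { cancelled = true; }

    // Returns true when the compaction completed.
    static boolean runCompaction(boolean fail) {
        boolean didPerformCompaction = false;
        cancelled = false;
        try {
            try {
                compact(fail);               // may throw
                didPerformCompaction = true; // set only AFTER success (the fix)
            } catch (RuntimeException e) {
                return false;
            }
            return true;
        } finally {
            // With the flag set before compact(), this cleanup would be
            // skipped on failure; with the fix it runs as intended.
            if (!didPerformCompaction) cancelRequestedCompaction();
        }
    }

    public static void main(String[] args) {
        System.out.println(runCompaction(false) + " " + cancelled); // true false
        System.out.println(runCompaction(true) + " " + cancelled);  // false true
    }
}
```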
[jira] [Updated] (HBASE-12454) Setting didPerformCompaction early in HRegion#compact
[ https://issues.apache.org/jira/browse/HBASE-12454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-12454: --- Assignee: Andrew Purtell Status: Open (was: Patch Available) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12454) Setting didPerformCompaction early in HRegion#compact
[ https://issues.apache.org/jira/browse/HBASE-12454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14205509#comment-14205509 ] Lars Hofhansl commented on HBASE-12454: --- Hmm... Yes. Maybe put it in the next RC, then. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12454) Setting didPerformCompaction early in HRegion#compact
[ https://issues.apache.org/jira/browse/HBASE-12454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14205537#comment-14205537 ] Andrew Purtell commented on HBASE-12454: Maybe. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-10483) Provide API for retrieving info port when hbase.master.info.port is set to 0
[ https://issues.apache.org/jira/browse/HBASE-10483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14205633#comment-14205633 ] Ted Yu commented on HBASE-10483: Put some minor comments on review board. Please submit to Hadoop QA for test run. Provide API for retrieving info port when hbase.master.info.port is set to 0 Key: HBASE-10483 URL: https://issues.apache.org/jira/browse/HBASE-10483 Project: HBase Issue Type: Improvement Reporter: Ted Yu Assignee: Liu Shaohui Attachments: HBASE-10483-trunk-v1.diff, HBASE-10483-trunk-v2.diff, HBASE-10483-v3.diff When hbase.master.info.port is set to 0, info port is dynamically determined. An API should be provided so that client can retrieve the actual info port. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
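Behind the requested API is the standard JDK pattern: bind to port 0 so the OS assigns a free port, then ask the bound socket which port it actually got. A minimal sketch with a plain ServerSocket (not the HBase info server itself):

```java
import java.io.IOException;
import java.net.ServerSocket;

// Port 0 asks the OS for any free ephemeral port; the bound socket then
// reports the real one, which is what a "get info port" API surfaces.
public class EphemeralPortSketch {
    static int bindEphemeral() throws IOException {
        try (ServerSocket ss = new ServerSocket(0)) {
            return ss.getLocalPort(); // the actual port, never 0 once bound
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("bound to port " + bindEphemeral());
    }
}
```

The HBase change is then about exposing this dynamically chosen port to clients, since they cannot predict it from configuration.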
[jira] [Updated] (HBASE-12454) Setting didPerformCompaction early in HRegion#compact
[ https://issues.apache.org/jira/browse/HBASE-12454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-12454: --- Attachment: HBASE-12454.patch It's a test only accounting problem. The failure is an assert in HRegion#reportCompactionRequestEnd now that we are actually calling HStore#cancelRequestedCompaction when interrupted. However the test goes to HRegion#compactStores directly so the compaction request start is not first reported with HRegion#reportCompactionRequestStart. I wasn't running with assertions enabled before locally but am now. Updated patch. Patch also includes a fix for a NPE from TestCompaction that the debugger caught that is normally silent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12454) Setting didPerformCompaction early in HRegion#compact
[ https://issues.apache.org/jira/browse/HBASE-12454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-12454: --- Fix Version/s: 0.98.8 0.99.2 2.0.0 Status: Patch Available (was: Open) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12445) hbase is removing all remaining cells immediately after the cell marked with marker = KeyValue.Type.DeleteColumn via PUT
[ https://issues.apache.org/jira/browse/HBASE-12445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-12445: --- Fix Version/s: (was: 0.98.8) 0.98.9 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12445) hbase is removing all remaining cells immediately after the cell marked with marker = KeyValue.Type.DeleteColumn via PUT
[ https://issues.apache.org/jira/browse/HBASE-12445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14205663#comment-14205663 ] Andrew Purtell commented on HBASE-12445: No, let's put it in for 0.98.9-SNAPSHOT hbase is removing all remaining cells immediately after the cell marked with marker = KeyValue.Type.DeleteColumn via PUT Key: HBASE-12445 URL: https://issues.apache.org/jira/browse/HBASE-12445 Project: HBase Issue Type: Bug Reporter: sri Fix For: 2.0.0, 0.98.9, 0.99.2 Attachments: 0001-HBASE-11788-hbase-is-not-deleting-the-cell-when-a-Pu.patch, 12445-v2.patch, TestPutAfterDeleteColumn.java Code executed: {code}
@Test
public void testHbasePutDeleteCell() throws Exception {
  TableName tableName = TableName.valueOf("my_test");
  Configuration configuration = HBaseConfiguration.create();
  HTableInterface table = new HTable(configuration, tableName);
  final byte[] rowKey = Bytes.toBytes("12345");
  final byte[] family = Bytes.toBytes("default");
  // put one row
  Put put = new Put(rowKey);
  put.add(family, Bytes.toBytes("A"), Bytes.toBytes("a"));
  put.add(family, Bytes.toBytes("B"), Bytes.toBytes("b"));
  put.add(family, Bytes.toBytes("C"), Bytes.toBytes("c"));
  put.add(family, Bytes.toBytes("D"), Bytes.toBytes("d"));
  table.put(put);
  // get row back and assert the values
  Get get = new Get(rowKey);
  Result result = table.get(get);
  assertTrue("Column A value should be a",
      Bytes.toString(result.getValue(family, Bytes.toBytes("A"))).equals("a"));
  assertTrue("Column B value should be b",
      Bytes.toString(result.getValue(family, Bytes.toBytes("B"))).equals("b"));
  assertTrue("Column C value should be c",
      Bytes.toString(result.getValue(family, Bytes.toBytes("C"))).equals("c"));
  assertTrue("Column D value should be d",
      Bytes.toString(result.getValue(family, Bytes.toBytes("D"))).equals("d"));
  // put the same row again with C column deleted
  put = new Put(rowKey);
  put.add(family, Bytes.toBytes("A"), Bytes.toBytes("a1"));
  put.add(family, Bytes.toBytes("B"), Bytes.toBytes("b1"));
  KeyValue marker = new KeyValue(rowKey, family, Bytes.toBytes("C"),
      HConstants.LATEST_TIMESTAMP, KeyValue.Type.DeleteColumn);
  put.add(marker);
  put.add(family, Bytes.toBytes("D"), Bytes.toBytes("d1"));
  table.put(put);
  // get row back and assert the values
  get = new Get(rowKey);
  result = table.get(get);
  assertTrue("Column A value should be a1",
      Bytes.toString(result.getValue(family, Bytes.toBytes("A"))).equals("a1"));
  assertTrue("Column B value should be b1",
      Bytes.toString(result.getValue(family, Bytes.toBytes("B"))).equals("b1"));
  assertTrue("Column C should not exist",
      result.getValue(family, Bytes.toBytes("C")) == null);
  assertTrue("Column D value should be d1",
      Bytes.toString(result.getValue(family, Bytes.toBytes("D"))).equals("d1"));
}
{code} This last assertion fails; cell D is also deleted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
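The masking semantics the test expects can be modeled in a few lines of plain Java (a simplified sketch, not HBase internals: one column family, and a DeleteColumn marker masks only same-column cells with a timestamp at or below the marker's). Under these rules, when the second Put's cells and the marker all receive the same server timestamp, only column C should be hidden:

```java
import java.util.*;

// Simplified model of DeleteColumn masking: a marker hides cells of the
// SAME column with ts <= marker ts; other columns are untouched.
// Cell/Marker are illustrative stand-ins, not HBase classes.
public class DeleteColumnModel {
    record Cell(String qualifier, long ts, String value) {}
    record Marker(String qualifier, long ts) {}

    static Map<String, String> visible(List<Cell> cells, List<Marker> markers) {
        Map<String, String> out = new TreeMap<>();
        Map<String, Long> latest = new HashMap<>();
        for (Cell c : cells) {
            boolean masked = markers.stream()
                .anyMatch(m -> m.qualifier().equals(c.qualifier()) && c.ts() <= m.ts());
            if (!masked && c.ts() >= latest.getOrDefault(c.qualifier(), Long.MIN_VALUE)) {
                latest.put(c.qualifier(), c.ts());
                out.put(c.qualifier(), c.value());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        long t = 100L; // the second Put's server-assigned timestamp
        List<Cell> cells = List.of(
            new Cell("A", t, "a1"), new Cell("B", t, "b1"),
            new Cell("C", t - 1, "c"), new Cell("D", t, "d1"));
        // the DeleteColumn marker for C lands at the same timestamp t
        List<Marker> markers = List.of(new Marker("C", t));
        System.out.println(visible(cells, markers)); // C masked; A, B, D visible
    }
}
```

Per this model, column D must survive; the reported bug is that the server also dropped the cells following the marker.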
[jira] [Updated] (HBASE-12319) Inconsistencies during region recovery due to close/open of a region during recovery
[ https://issues.apache.org/jira/browse/HBASE-12319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-12319: --- Fix Version/s: (was: 0.98.9) 0.98.8 Inconsistencies during region recovery due to close/open of a region during recovery Key: HBASE-12319 URL: https://issues.apache.org/jira/browse/HBASE-12319 Project: HBase Issue Type: Bug Affects Versions: 0.98.7, 0.99.1 Reporter: Devaraj Das Assignee: Jeffrey Zhong Priority: Critical Fix For: 2.0.0, 0.98.8, 0.99.2 Attachments: HBASE-12319-v2.patch, HBASE-12319.patch In one of my test runs, I saw the following: {noformat}
2014-10-14 13:45:30,782 DEBUG [StoreOpener-51af4bd23dc32a940ad2dd5435f00e1d-1] regionserver.HStore: loaded hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d/test_cf/d6df5cfe15ca41d68c619489fbde4d04, isReference=false, isBulkLoadResult=false, seqid=141197, majorCompaction=true
2014-10-14 13:45:30,788 DEBUG [RS_OPEN_REGION-hor9n01:60020-1] regionserver.HRegion: Found 3 recovered edits file(s) under hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d
.
.
2014-10-14 13:45:31,916 WARN [RS_OPEN_REGION-hor9n01:60020-1] regionserver.HRegion: Null or non-existent edits file: hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d/recovered.edits/0198080
{noformat} The above logs are from a regionserver, say RS2. From the initial analysis, it seemed like the master asked a certain regionserver (let's say RS1) to open the region and for some reason asked it to close soon after. The open was still proceeding on RS1, but the master reassigned the region to RS2. This also started the recovery, but RS2 ended up seeing an inconsistent view of the recovered-edits files (it reports missing files, as per the logs above), since the first regionserver (RS1) deleted some files after it completed the recovery. When RS2 really opens the region, it might not see the recent data that was written by flushes on hor9n10 during the recovery process. Reads of that data would have inconsistencies. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12454) Setting didPerformCompaction early in HRegion#compact
[ https://issues.apache.org/jira/browse/HBASE-12454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-12454: --- Attachment: (was: HBASE-12454-0.98.patch) Setting didPerformCompaction early in HRegion#compact - Key: HBASE-12454 URL: https://issues.apache.org/jira/browse/HBASE-12454 Project: HBase Issue Type: Bug Affects Versions: 0.98.8 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 2.0.0, 0.98.8, 0.99.2 Attachments: HBASE-12454.patch, HBASE-12454.patch It appears we are setting 'didPerformCompaction' to true before attempting the compaction in HRegion#compact. If Store#compact throws an exception or is interrupted, we won't call Store#cancelRequestedCompaction in the last finally block of the method as it looks like we should. {code}
--- a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
+++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
@@ -1540,58 +1540,58 @@ public class HRegion implements HeapSize, PropagatingConfigurationObserver {
       return false;
     }
     status = TaskMonitor.get().createStatus("Compacting " + store + " in " + this);
     if (this.closed.get()) {
       String msg = "Skipping compaction on " + this + " because closed";
       LOG.debug(msg);
       status.abort(msg);
       return false;
     }
     boolean wasStateSet = false;
     try {
       synchronized (writestate) {
         if (writestate.writesEnabled) {
           wasStateSet = true;
           ++writestate.compacting;
         } else {
           String msg = "NOT compacting region " + this + ". Writes disabled.";
           LOG.info(msg);
           status.abort(msg);
           return false;
         }
       }
       LOG.info("Starting compaction on " + store + " in region " + this
           + (compaction.getRequest().isOffPeak() ? " as an off-peak compaction" : ""));
       doRegionCompactionPrep();
       try {
         status.setStatus("Compacting store " + store);
-        didPerformCompaction = true;
         store.compact(compaction);
+        didPerformCompaction = true;
       } catch (InterruptedIOException iioe) {
         String msg = "compaction interrupted";
         LOG.info(msg, iioe);
         status.abort(msg);
         return false;
       }
     } finally {
       if (wasStateSet) {
         synchronized (writestate) {
           --writestate.compacting;
           if (writestate.compacting == 0) {
             writestate.notifyAll();
           }
         }
       }
     }
     status.markComplete("Compaction complete");
     return true;
   } finally {
     try {
       if (!didPerformCompaction) store.cancelRequestedCompaction(compaction);
-      if (status != null) status.cleanup();
     } finally {
       lock.readLock().unlock();
     }
   }
 }
{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
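The fix above is the classic set-the-flag-only-after-success pattern. A minimal standalone sketch of why the ordering matters (plain Java; compact and cancelRequestedCompaction are illustrative stand-ins, not HBase calls):

```java
// If the flag is set BEFORE the risky call, the finally block can no
// longer tell failure from success and skips the cleanup.
public class FlagOrdering {
    static boolean cancelled;

    static boolean runCompaction(boolean flagBeforeCall) {
        cancelled = false;
        boolean didPerformCompaction = false;
        try {
            try {
                if (flagBeforeCall) {
                    didPerformCompaction = true; // buggy ordering
                }
                compact(); // always throws in this demo
                didPerformCompaction = true;     // fixed ordering
            } catch (RuntimeException e) {
                return false; // mirrors the "return false" in the catch block
            }
            return true;
        } finally {
            if (!didPerformCompaction) {
                cancelRequestedCompaction(); // cleanup only on failure
            }
        }
    }

    static void compact() { throw new RuntimeException("interrupted"); }
    static void cancelRequestedCompaction() { cancelled = true; }

    public static void main(String[] args) {
        runCompaction(true);
        System.out.println("buggy ordering, cleanup ran: " + cancelled);  // false
        runCompaction(false);
        System.out.println("fixed ordering, cleanup ran: " + cancelled);  // true
    }
}
```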
[jira] [Commented] (HBASE-12279) Generated thrift files were generated with the wrong parameters
[ https://issues.apache.org/jira/browse/HBASE-12279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14205668#comment-14205668 ] Andrew Purtell commented on HBASE-12279: Do you want this in 0.94 [~lhofhansl]? Generated thrift files were generated with the wrong parameters --- Key: HBASE-12279 URL: https://issues.apache.org/jira/browse/HBASE-12279 Project: HBase Issue Type: Bug Affects Versions: 0.94.0, 0.98.0, 0.99.0 Reporter: Niels Basjes Assignee: Niels Basjes Fix For: 2.0.0, 0.98.8, 0.94.26, 0.99.2 Attachments: HBASE-12279-2014-10-16-v1.patch, HBASE-12279-2014-11-07-v2.patch It turns out that the java code generated from the thrift files has been generated with the wrong settings. Instead of the documented ([thrift|http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/thrift/package-summary.html], [thrift2|http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/thrift2/package-summary.html]) {code} thrift -strict --gen java:hashcode {code} the current files seem to have been generated instead with {code} thrift -strict --gen java {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12454) Setting didPerformCompaction early in HRegion#compact
[ https://issues.apache.org/jira/browse/HBASE-12454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14205679#comment-14205679 ] Lars Hofhansl commented on HBASE-12454: --- Looks good. +1 Setting didPerformCompaction early in HRegion#compact - Key: HBASE-12454 URL: https://issues.apache.org/jira/browse/HBASE-12454 Project: HBase Issue Type: Bug Affects Versions: 0.98.8 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 2.0.0, 0.98.8, 0.99.2 Attachments: HBASE-12454.patch, HBASE-12454.patch It appears we are setting 'didPerformCompaction' to true before attempting the compaction in HRegion#compact. If Store#compact throws an exception or is interrupted, we won't call Store#cancelRequestedCompaction in the last finally block of the method as it looks like we should. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12454) Setting didPerformCompaction early in HRegion#compact
[ https://issues.apache.org/jira/browse/HBASE-12454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14205706#comment-14205706 ] Andrew Purtell commented on HBASE-12454: Waiting for Jenkins to come around again and give this another shot Setting didPerformCompaction early in HRegion#compact - Key: HBASE-12454 URL: https://issues.apache.org/jira/browse/HBASE-12454 Project: HBase Issue Type: Bug Affects Versions: 0.98.8 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 2.0.0, 0.98.8, 0.99.2 Attachments: HBASE-12454.patch, HBASE-12454.patch It appears we are setting 'didPerformCompaction' to true before attempting the compaction in HRegion#compact. If Store#compact throws an exception or is interrupted, we won't call Store#cancelRequestedCompaction in the last finally block of the method as it looks like we should. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-11099) Two situations where we could open a region with smaller sequence number
[ https://issues.apache.org/jira/browse/HBASE-11099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Yuan Jiang updated HBASE-11099: --- Assignee: Stephen Yuan Jiang Two situations where we could open a region with smaller sequence number Key: HBASE-11099 URL: https://issues.apache.org/jira/browse/HBASE-11099 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.99.0 Reporter: Jeffrey Zhong Assignee: Stephen Yuan Jiang Fix For: 0.99.2 Recently I happened to run into code where we could potentially open a region with a smaller sequence number: 1) Inside HRegion#internalFlushcache. This is because we changed the way WAL sync works: we now use late binding (the sequence number is assigned right before the WAL sync). The flushSeqId may be less than the sequence number of a change included in the flush, which may cause the later region-opening code to use a smaller than expected sequence number when we reopen the region. {code}
flushSeqId = this.sequenceId.incrementAndGet();
...
mvcc.waitForRead(w);
{code} 2) HRegion#replayRecoveredEdits, where we have the following code: {code}
...
if (coprocessorHost != null) {
  status.setStatus("Running pre-WAL-restore hook in coprocessors");
  if (coprocessorHost.preWALRestore(this.getRegionInfo(), key, val)) {
    // if bypass this log entry, ignore it
    ...
    continue;
  }
}
...
currentEditSeqId = key.getLogSeqNum();
{code} If a coprocessor skips some tail WALEdits, the function will return a smaller currentEditSeqId. In the end, the region may open with a smaller sequence number. This may cause data loss, because the Master may record a larger flushed sequence id and some WALEdits may be skipped during recovery if the region fails again. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
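Case 2 can be illustrated with a small standalone model of the replay loop (hypothetical names; the skip predicate stands in for a bypassing preWALRestore hook). When the hook bypasses the tail edits, their sequence numbers are never recorded and the loop returns an id smaller than the true maximum:

```java
import java.util.List;
import java.util.function.LongPredicate;

// Minimal model of replaying recovered edits: a filter may skip entries,
// and skipped entries never update currentEditSeqId. Skipping the TAIL
// edits therefore yields a smaller-than-actual final sequence id.
public class ReplayModel {
    static long replay(List<Long> editSeqNums, LongPredicate skip) {
        long currentEditSeqId = -1;
        for (long seq : editSeqNums) {
            if (skip.test(seq)) {
                continue; // bypassed entry: its seq id is never recorded
            }
            currentEditSeqId = seq;
        }
        return currentEditSeqId;
    }

    public static void main(String[] args) {
        List<Long> edits = List.of(10L, 11L, 12L, 13L);
        System.out.println(replay(edits, s -> false));  // no skips: 13
        System.out.println(replay(edits, s -> s > 11)); // tail skipped: 11
    }
}
```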
[jira] [Updated] (HBASE-12444) Total number of requests overflow because it's int
[ https://issues.apache.org/jira/browse/HBASE-12444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhaoyunjiong updated HBASE-12444: - Attachment: hbase-12444-v1.patch Updated the patch to include the two generated files. Total number of requests overflow because it's int -- Key: HBASE-12444 URL: https://issues.apache.org/jira/browse/HBASE-12444 Project: HBase Issue Type: Bug Components: hbck, master, regionserver Reporter: zhaoyunjiong Priority: Minor Attachments: hbase-12444-v1.patch, hbase-12444.patch When running hbck, I noticed the Number of requests was wrong: {noformat}
Average load: 466.41237113402065
Number of requests: -1835941345
Number of regions: 45242
Number of regions in transition: 0
{noformat} The root cause is that it uses an int, which clearly overflowed. I'll update a patch later. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
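A quick demonstration of the overflow (plain Java; the per-region numbers are made up, chosen only so the total passes Integer.MAX_VALUE):

```java
// Accumulating request counts in an int wraps to a negative number once
// the total passes Integer.MAX_VALUE; widening the accumulator to long
// fixes it, which is what the patch does for the cluster-wide counter.
public class RequestCounterOverflow {
    public static void main(String[] args) {
        int intTotal = 0;
        long longTotal = 0L;
        // e.g. 46,000 regions each reporting 50,000 requests
        for (int i = 0; i < 46_000; i++) {
            intTotal += 50_000;
            longTotal += 50_000;
        }
        System.out.println(intTotal);  // negative: wrapped to -1994967296
        System.out.println(longTotal); // 2300000000
    }
}
```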
[jira] [Commented] (HBASE-11979) Compaction progress reporting is wrong
[ https://issues.apache.org/jira/browse/HBASE-11979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14205792#comment-14205792 ] Stephen Yuan Jiang commented on HBASE-11979: [~esteban] are you actively working on it? One customer recently hit this issue and it would be nice to get this fixed in 1.0.0. Compaction progress reporting is wrong -- Key: HBASE-11979 URL: https://issues.apache.org/jira/browse/HBASE-11979 Project: HBase Issue Type: Bug Reporter: Andrew Purtell Assignee: Esteban Gutierrez Priority: Minor Fix For: 2.0.0, 0.98.9, 0.99.2 This is a long standing problem and previously could be observed in regionserver metrics, but, we recently added logging for long running compactions, and this has exposed the issue in a new way, e.g. {noformat} 2014-09-15 14:20:59,450 DEBUG [regionserver8120-largeCompactions-1410813534627] compactions.Compactor: Compaction progress: 22683625/6808179 (333.18%), rate=162.08 kB/sec {noformat} The 'rate' reported in such logging is consistent and what we were really after, but the progress indication is clearly broken and should be fixed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
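The broken progress figure falls out directly from the ratio: if the denominator (totalCompactingKVs) underestimates the real amount of work, the percentage sails past 100%. A small sketch reproducing the number in the quoted log line (field names mirror the log; the values are taken from it):

```java
// Progress is reported as currentCompactedKVs / totalCompactingKVs.
// With a stale or underestimated total, the ratio exceeds 1.0 and the
// log prints an impossible percentage like 333.18%.
public class CompactionProgress {
    public static void main(String[] args) {
        long currentCompactedKVs = 22_683_625L;
        long totalCompactingKVs = 6_808_179L; // underestimated total
        double pct = 100.0 * currentCompactedKVs / totalCompactingKVs;
        System.out.printf("Compaction progress: %d/%d (%.2f%%)%n",
            currentCompactedKVs, totalCompactingKVs, pct);
    }
}
```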
[jira] [Commented] (HBASE-12332) [mob] use filelink instead of retry when resolving an hfilelink.
[ https://issues.apache.org/jira/browse/HBASE-12332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14205814#comment-14205814 ] Jingcheng Du commented on HBASE-12332: -- Hi Jon, [~jmhsieh], do you have comments on this patch? Thanks. [mob] use filelink instead of retry when resolving an hfilelink. --- Key: HBASE-12332 URL: https://issues.apache.org/jira/browse/HBASE-12332 Project: HBase Issue Type: Sub-task Components: mob Affects Versions: hbase-11339 Reporter: Jonathan Hsieh Fix For: hbase-11339 Attachments: HBASE-12332-V1.diff In the snapshot code, HMobStore was modified to traverse an hfile link to a mob. Ideally this should use the transparent filelink code to read the data. Also, there will likely be some issues with the mob file cache with these links. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12454) Setting didPerformCompaction early in HRegion#compact
[ https://issues.apache.org/jira/browse/HBASE-12454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14205827#comment-14205827 ] Hadoop QA commented on HBASE-12454: --- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12680706/HBASE-12454.patch against trunk revision . ATTACHMENT ID: 12680706 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/11629//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11629//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11629//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11629//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11629//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11629//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11629//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11629//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11629//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11629//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11629//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11629//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/11629//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/11629//console This message is automatically generated. 
Setting didPerformCompaction early in HRegion#compact - Key: HBASE-12454 URL: https://issues.apache.org/jira/browse/HBASE-12454 Project: HBase Issue Type: Bug Affects Versions: 0.98.8 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 2.0.0, 0.98.8, 0.99.2 Attachments: HBASE-12454.patch, HBASE-12454.patch It appears we are setting 'didPerformCompaction' to true before attempting the compaction in HRegion#compact. If Store#compact throws an exception or is interrupted, we won't call Store#cancelRequestedCompaction in the last finally block of the method as it looks like we should.
[jira] [Commented] (HBASE-12451) IncreasingToUpperBoundRegionSplitPolicy may cause unnecessary region splits in rolling update of cluster
[ https://issues.apache.org/jira/browse/HBASE-12451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14205828#comment-14205828 ] Qiang Tian commented on HBASE-12451: A minimum split size is good and simple enough for me... the user could make a tradeoff between automatic tuning and customization based on knowledge of their workload... (many times we do not want to expose too many configuration parameters, but it looks really useful in some cases :-)) Basing it on total region count looks hard to control, e.g. if the user pre-splits many regions, e.g. in http://search-hadoop.com/m/DHED4aS08G1 with 240 regions, the size will be quite big unless hbase.increasing.policy.initial.size is also configured.. IncreasingToUpperBoundRegionSplitPolicy may cause unnecessary region splits in rolling update of cluster Key: HBASE-12451 URL: https://issues.apache.org/jira/browse/HBASE-12451 Project: HBase Issue Type: Bug Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Fix For: 2.0.0 Currently IncreasingToUpperBoundRegionSplitPolicy is the default region split policy. In this policy, the split size is the number of regions of the same table on this server, cubed, times 2x the region flush size. But when unloading regions from a regionserver using region_mover.rb, the number of regions of the same table on this server decreases, and the split size decreases too, which may cause the remaining regions on that regionserver to split. Region splits also happen when loading regions onto a regionserver. An improvement may be to set a minimum split size in IncreasingToUpperBoundRegionSplitPolicy. Suggestions are welcome. Thanks~ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12451) IncreasingToUpperBoundRegionSplitPolicy may cause unnecessary region splits in rolling update of cluster
[ https://issues.apache.org/jira/browse/HBASE-12451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14205866#comment-14205866 ] zhangduo commented on HBASE-12451: -- I think there is an upper limit also, so the size will not grow too large... {code}
protected long getSizeToCheck(final int tableRegionsCount) {
  // safety check for 100 to avoid numerical overflow in extreme cases
  return tableRegionsCount == 0 || tableRegionsCount > 100
      ? getDesiredMaxFileSize()
      : Math.min(getDesiredMaxFileSize(),
          this.initialSize * tableRegionsCount * tableRegionsCount * tableRegionsCount);
}

this.desiredMaxFileSize = conf.getLong(HConstants.HREGION_MAX_FILESIZE,
    HConstants.DEFAULT_MAX_FILE_SIZE);

/** Conf key for the max file size after which we split the region */
public static final String HREGION_MAX_FILESIZE = "hbase.hregion.max.filesize";

/** Default maximum file size */
public static final long DEFAULT_MAX_FILE_SIZE = 10 * 1024 * 1024 * 1024L;
{code} We have two reasons to split: the first is load balancing, which introduces an increasing region split size, and the second is compaction, which introduces a constant region split size (the upper limit). I think the first thing that needs to be done is to define "unnecessary region splits". If we already have 240 regions of a table, and there is only one region of this table on a regionserver, should that region have a small split size? IncreasingToUpperBoundRegionSplitPolicy may cause unnecessary region splits in rolling update of cluster Key: HBASE-12451 URL: https://issues.apache.org/jira/browse/HBASE-12451 Project: HBase Issue Type: Bug Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Fix For: 2.0.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
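The size check discussed in this thread can be sketched as a standalone program. Constants mirror the defaults mentioned in the discussion (2x a 128 MB flush size as the initial size, 10 GB max file size); they are illustrative, not read from a live configuration:

```java
// Sketch of the IncreasingToUpperBoundRegionSplitPolicy threshold: the
// split size grows with the cube of the table's per-server region count,
// capped at the desired max file size. Shows why a server left with a
// single region of a table (the rolling-update scenario) gets a tiny
// threshold and may split prematurely.
public class SplitSizeDemo {
    static final long INITIAL_SIZE = 2L * 128 * 1024 * 1024;        // 2 * flush size = 256 MB
    static final long DESIRED_MAX_FILE_SIZE = 10L * 1024 * 1024 * 1024; // 10 GB cap

    static long getSizeToCheck(int tableRegionsCount) {
        if (tableRegionsCount == 0 || tableRegionsCount > 100) {
            return DESIRED_MAX_FILE_SIZE; // safety check against overflow
        }
        long cubed = INITIAL_SIZE * tableRegionsCount * tableRegionsCount * tableRegionsCount;
        return Math.min(DESIRED_MAX_FILE_SIZE, cubed);
    }

    public static void main(String[] args) {
        System.out.println(getSizeToCheck(1)); // 268435456 (256 MB): splits early
        System.out.println(getSizeToCheck(3)); // 7247757312 (256 MB * 27)
        System.out.println(getSizeToCheck(4)); // 10737418240: already at the cap
    }
}
```

With one region left on the server the threshold collapses to 256 MB, which is the premature-split behavior the reporter describes during region moves.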
[jira] [Updated] (HBASE-10773) Make use of ByteRanges in HFileBlock instead of ByteBuffers
[ https://issues.apache.org/jira/browse/HBASE-10773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-10773: --- Attachment: HBASE-10772_2.patch Just parking an older patch for reference. It tries to bring PBR into the BlockCache also. Make use of ByteRanges in HFileBlock instead of ByteBuffers --- Key: HBASE-10773 URL: https://issues.apache.org/jira/browse/HBASE-10773 Project: HBase Issue Type: Sub-task Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Attachments: HBASE-10772_2.patch, HBASE-10772_3.patch Replacing BBs with ByteRanges in the block cache as part of HBASE-10772 would help in replacing BBs with BRs in HFileBlock also. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12444) Total number of requests overflow because it's int
[ https://issues.apache.org/jira/browse/HBASE-12444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14205900#comment-14205900 ] Hadoop QA commented on HBASE-12444: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12680722/hbase-12444-v1.patch against trunk revision . ATTACHMENT ID: 12680722 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/11630//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11630//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11630//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11630//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11630//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11630//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11630//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11630//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11630//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11630//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11630//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/11630//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/11630//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/11630//console This message is automatically generated. 
Total number of requests overflow because it's int -- Key: HBASE-12444 URL: https://issues.apache.org/jira/browse/HBASE-12444 Project: HBase Issue Type: Bug Components: hbck, master, regionserver Reporter: zhaoyunjiong Priority: Minor Attachments: hbase-12444-v1.patch, hbase-12444.patch When running hbck, I noticed the Number of requests was wrong: {noformat}
Average load: 466.41237113402065
Number of requests: -1835941345
Number of regions: 45242
Number of regions in transition: 0
{noformat} The root cause is that it uses an int, which clearly overflowed. I'll update a patch later. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12433) Coprocessors not dynamically reordered when reset priority
[ https://issues.apache.org/jira/browse/HBASE-12433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14205951#comment-14205951 ] Jingcheng Du commented on HBASE-12433: -- Hi James [~jamestaylor], did you change the coprocessor priority by altering the table in the HBase shell? Did you disable/enable the table at that time? Coprocessors not dynamically reordered when reset priority -- Key: HBASE-12433 URL: https://issues.apache.org/jira/browse/HBASE-12433 Project: HBase Issue Type: Bug Components: Coprocessors Affects Versions: 0.98.7 Reporter: James Taylor When modifying the coprocessor priority through the HBase shell, the order of the firing of the coprocessors wasn't changing. It probably would have with a cluster bounce, but if we can make it dynamic easily, that would be preferable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12433) Coprocessors not dynamically reordered when reset priority
[ https://issues.apache.org/jira/browse/HBASE-12433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14205967#comment-14205967 ] James Taylor commented on HBASE-12433: -- No, no disable/enable. I thought that wasn't necessary anymore? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-12455) Add 'description' to bean and attribute output when you do /jmx?description=true
stack created HBASE-12455: - Summary: Add 'description' to bean and attribute output when you do /jmx?description=true Key: HBASE-12455 URL: https://issues.apache.org/jira/browse/HBASE-12455 Project: HBase Issue Type: Improvement Components: metrics Reporter: stack It's hard to figure out what our metrics mean. Each attribute and bean actually has a description, but it's hard to get at. In mission control, etc., you have to click on each attribute to see the description. It's painful. Because the description is rarely read, they are not as informative as they could be. If you do /jmx in the UI, you get a dump of all beans associated with the server, but it's just the attribute/bean name + value. The description is there but it's not displayed. We should give the option to display descriptions. It would be good for those exploring what metrics are available. We currently point folks at jvisualvm in the refguide to figure out what metrics are available. It would be useful if we could point them at something easier to navigate. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12455) Add 'description' to bean and attribute output when you do /jmx?description=true
[ https://issues.apache.org/jira/browse/HBASE-12455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-12455: -- Attachment: 12455.txt I tried to make it so that if you added a '?description=true' param to your /jmx query, we'd print out descriptions. Unfortunately it will take more work than I've done here. Here it is working: {code} ... name : regionCount, description : Number of regions, value : 2, name : storeCount, description : Number of Stores, value : 2, name : hlogFileCount, description : Number of HLog Files, value : 2, name : hlogFileSize, description : Size of all HLog Files, value : 0, name : storeFileCount, description : Number of Store Files, value : 2, name : memStoreSize, description : Size of the memstore, value : 26880, name : storeFileSize, description : Size of storefiles being served., value : 14415, name : regionServerStartTime, description : RegionServer Start Time, value : 1415683550907, name : totalRequestCount, description : Total number of requests this RegionServer has answered., value : 53, name : readRequestCount, description : Number of read requests this region server has answered., value : 13, name : writeRequestCount, description : Number of mutation requests this region server has answered., value : 42, name : checkMutateFailedCount, description : Number of Check and Mutate calls that failed the checks., value : 0, name : checkMutatePassedCount, description : Number of Check and Mutate calls that passed the checks., value : 0, name : storeFileIndexSize, description : Size of indexes in storefiles on disk., value : 848, ... {code} But it then goes off the rails when beans and/or attributes are without a description: {code} ... 
name : java.lang:type=MemoryPool,name=CMS Old Gen, description : Information on the management interface of the MBean, modelerType : sun.management.MemoryPoolImpl, name : Name, description : Name, value : CMS Old Gen, name : Type, description : Type, value : HEAP, name : Valid, description : Valid, value : true, name : Usage, description : Usage, value : { }, name : PeakUsage, description : PeakUsage, value : { }, name : MemoryManagerNames, description : MemoryManagerNames, value : [ ConcurrentMarkSweep ], name : UsageThreshold, description : UsageThreshold, value : 0, ... {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
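The description lookup itself can be sketched against the standard JMX API. This is not the actual JMXJsonServlet change from the attachment, just a minimal illustration of where the per-attribute descriptions come from, and of the degenerate case above where platform beans reuse the attribute name as its description:

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanAttributeInfo;
import javax.management.MBeanInfo;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class JmxDescriptions {
    // Emits "name : X, description : Y" pairs for one MBean, mimicking the
    // /jmx?description=true output shown above. Returns "" on any error so
    // callers need not handle JMX checked exceptions.
    static String describe(String beanName) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            MBeanInfo info = server.getMBeanInfo(new ObjectName(beanName));
            StringBuilder sb = new StringBuilder();
            for (MBeanAttributeInfo attr : info.getAttributes()) {
                sb.append("name : ").append(attr.getName())
                  .append(", description : ").append(attr.getDescription())
                  .append('\n');
            }
            return sb.toString();
        } catch (Exception e) {
            return "";
        }
    }

    public static void main(String[] args) {
        // For platform beans the description often just repeats the name,
        // which is exactly the "off the rails" case noted in the comment.
        System.out.print(describe("java.lang:type=Memory"));
    }
}
```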
[jira] [Commented] (HBASE-12455) Add 'description' to bean and attribute output when you do /jmx?description=true
[ https://issues.apache.org/jira/browse/HBASE-12455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14205991#comment-14205991 ] stack commented on HBASE-12455: --- Something like this might be OTT https://code.google.com/p/purej-vminspect/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-12456) Update surefire from 2.18-SNAPSHOT to 2.18
stack created HBASE-12456: - Summary: Update surefire from 2.18-SNAPSHOT to 2.18 Key: HBASE-12456 URL: https://issues.apache.org/jira/browse/HBASE-12456 Project: HBase Issue Type: Task Components: test Reporter: stack Assignee: stack ...before SNAPSHOT evaporates -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12456) Update surefire from 2.18-SNAPSHOT to 2.18
[ https://issues.apache.org/jira/browse/HBASE-12456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-12456: -- Attachment: 12456.txt Simple patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HBASE-12456) Update surefire from 2.18-SNAPSHOT to 2.18
[ https://issues.apache.org/jira/browse/HBASE-12456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-12456. --- Resolution: Fixed Fix Version/s: 0.99.2, 2.0.0 Pushed to branch-1+ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-12457) Regions in transition for a long time when CLOSE interleaves with a slow compaction
Lars Hofhansl created HBASE-12457: - Summary: Regions in transition for a long time when CLOSE interleaves with a slow compaction Key: HBASE-12457 URL: https://issues.apache.org/jira/browse/HBASE-12457 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Under heavy load we have observed regions remaining in transition for 20 minutes when the master requests a close while a slow compaction is running. The pattern is always something like this: # RS starts a compaction # HM requests the region to be closed on this RS # Compaction is not aborted for another 20 minutes # The region is in transition and not usable. In every case I tracked down so far, the time between the requested CLOSE and the abort of the compaction is almost exactly 20 minutes, which is suspicious. Of course part of the issue is having compactions that take over 20 minutes, but maybe we can do better here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
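The underlying requirement is cooperative cancellation: a long-running compaction loop has to poll a stop flag often enough that a CLOSE request aborts it promptly instead of after 20 minutes. A generic sketch of that pattern (illustrative only; not HBase's actual compaction code, and the names are hypothetical):

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class CooperativeCancel {
    // Processes up to totalItems units of work, checking a shared stop flag
    // once per unit so a concurrent CLOSE can interrupt the loop quickly.
    // stopAfter lets this single-threaded demo simulate a CLOSE arriving
    // partway through.
    static long process(int totalItems, int stopAfter, AtomicBoolean stop) {
        long processed = 0;
        for (int i = 0; i < totalItems; i++) {
            if (stop.get()) {
                break; // abort promptly instead of finishing the whole run
            }
            processed++;
            if (processed == stopAfter) {
                stop.set(true); // simulate the close request being observed
            }
        }
        return processed;
    }
}
```

The key design point is the frequency of the `stop.get()` check: if the check only happens between large phases of work, cancellation latency degenerates to the length of a phase, which matches the 20-minute delays reported here.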
[jira] [Updated] (HBASE-12457) Regions in transition for a long time when CLOSE interleaves with a slow compaction
[ https://issues.apache.org/jira/browse/HBASE-12457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-12457: -- Affects Version/s: 0.98.7 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12457) Regions in transition for a long time when CLOSE interleaves with a slow compaction
[ https://issues.apache.org/jira/browse/HBASE-12457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206010#comment-14206010 ] Lars Hofhansl commented on HBASE-12457: --- Sometimes (but not always) we see splits interspersed with this. While I scoured the code I noticed the following: * SplitTransaction writes CREATE_SPLIT_DIR after it created the daughter dirs, and CLOSED_PARENT_REGION after the parent region is closed * Upon rollback, writestate.writesEnabled is set back to true unconditionally at the CREATE_SPLIT_DIR stage. It seems that should only be done when we journaled CLOSED_PARENT_REGION. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12457) Regions in transition for a long time when CLOSE interleaves with a slow compaction
[ https://issues.apache.org/jira/browse/HBASE-12457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-12457: -- Attachment: 12457-minifix.txt Here's a minifix for this issue. Note: This is *not* the core of the problem, just something I noticed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-6028) Implement a cancel for in-progress compactions
[ https://issues.apache.org/jira/browse/HBASE-6028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206039#comment-14206039 ] Stephen Yuan Jiang commented on HBASE-6028: --- [~esteban] are you actively working on this? If not, I can take over the investigation. Implement a cancel for in-progress compactions -- Key: HBASE-6028 URL: https://issues.apache.org/jira/browse/HBASE-6028 Project: HBase Issue Type: Bug Components: regionserver Reporter: Derek Wollenstein Assignee: Esteban Gutierrez Priority: Minor Labels: beginner Depending on current server load, it can be extremely expensive to run periodic minor / major compactions. It would be helpful to have a feature where a user could use the shell or a client tool to explicitly cancel an in-progress compaction. This would allow a system to recover when too many regions become eligible for compaction at once -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12451) IncreasingToUpperBoundRegionSplitPolicy may cause unnecessary region splits in rolling update of cluster
[ https://issues.apache.org/jira/browse/HBASE-12451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206040#comment-14206040 ] Qiang Tian commented on HBASE-12451: Thanks Duo, forgot that. With default values, it looks like the region split size will use the upper limit after 3 regions. According to http://hbase.apache.org/book/ops.capacity.html, region count and region size are the most important factors, but there is no clear answer for region count. bq. If we already have 240 regions of a table, and there is only one region of this table on a regionserver, should the region have a small split size? The regions should be evenly spread across RSes (8 RSes in that case). IncreasingToUpperBoundRegionSplitPolicy may cause unnecessary region splits in rolling update of cluster Key: HBASE-12451 URL: https://issues.apache.org/jira/browse/HBASE-12451 Project: HBase Issue Type: Bug Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Fix For: 2.0.0 Currently IncreasingToUpperBoundRegionSplitPolicy is the default region split policy. In this policy, the split size is the number of regions of the same table on this server, cubed, times 2x the region flush size. But when unloading regions of a regionserver in a cluster using region_mover.rb, the number of regions of the same table on this server will decrease, and the split size will decrease too, which may cause the remaining regions to split on the regionserver. Region splits also happen when loading regions onto a regionserver in a cluster. An improvement may be to set a minimum split size in IncreasingToUpperBoundRegionSplitPolicy. Suggestions are welcomed. Thanks~ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-6028) Implement a cancel for in-progress compactions
[ https://issues.apache.org/jira/browse/HBASE-6028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206048#comment-14206048 ] Esteban Gutierrez commented on HBASE-6028: -- Hi [~syuanjiang], yes, I'm working on this. I have a prototype for trunk; it only needs some cleanup and to expose this functionality via the hbase shell. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12454) Setting didPerformCompaction early in HRegion#compact
[ https://issues.apache.org/jira/browse/HBASE-12454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206054#comment-14206054 ] stack commented on HBASE-12454: --- Looks good Andrew. Could runner.compaction go null between check and usage on #560? 559 if (runner.compaction != null) { 560 runner.store.cancelRequestedCompaction(runner.compaction); 561 } Setting didPerformCompaction early in HRegion#compact - Key: HBASE-12454 URL: https://issues.apache.org/jira/browse/HBASE-12454 Project: HBase Issue Type: Bug Affects Versions: 0.98.8 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 2.0.0, 0.98.8, 0.99.2 Attachments: HBASE-12454.patch, HBASE-12454.patch It appears we are setting 'didPerformCompaction' to true before attempting the compaction in HRegion#compact. If Store#compact throws an exception or is interrupted, we won't call Store#cancelRequestedCompaction in the last finally block of the method as it looks like we should. {code}
--- a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
+++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
@@ -1540,58 +1540,58 @@ public class HRegion implements HeapSize, PropagatingConfigurationObserver {
       return false;
     }
     status = TaskMonitor.get().createStatus("Compacting " + store + " in " + this);
     if (this.closed.get()) {
       String msg = "Skipping compaction on " + this + " because closed";
       LOG.debug(msg);
       status.abort(msg);
       return false;
     }
     boolean wasStateSet = false;
     try {
       synchronized (writestate) {
         if (writestate.writesEnabled) {
           wasStateSet = true;
           ++writestate.compacting;
         } else {
           String msg = "NOT compacting region " + this + ". Writes disabled.";
           LOG.info(msg);
           status.abort(msg);
           return false;
         }
       }
       LOG.info("Starting compaction on " + store + " in region " + this
           + (compaction.getRequest().isOffPeak() ? " as an off-peak compaction" : ""));
       doRegionCompactionPrep();
       try {
         status.setStatus("Compacting store " + store);
-        didPerformCompaction = true;
         store.compact(compaction);
+        didPerformCompaction = true;
       } catch (InterruptedIOException iioe) {
         String msg = "compaction interrupted";
         LOG.info(msg, iioe);
         status.abort(msg);
         return false;
       }
     } finally {
       if (wasStateSet) {
         synchronized (writestate) {
           --writestate.compacting;
           if (writestate.compacting <= 0) {
             writestate.notifyAll();
           }
         }
       }
     }
     status.markComplete("Compaction complete");
     return true;
   } finally {
     try {
       if (!didPerformCompaction) store.cancelRequestedCompaction(compaction);
-      if (status != null) status.cleanup();
     } finally {
       lock.readLock().unlock();
     }
   }
 }
{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
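The ordering bug the patch fixes can be boiled down to a try/finally flag pattern. The following is a minimal standalone illustration with generic names (not the HBase ones): if the "done" flag is set before the risky call, the finally-block cleanup is skipped whenever the call throws.

```java
public class FlagOrdering {
    static boolean cleanedUp;

    // fail: whether the simulated compaction throws.
    // setFlagEarly: whether the "did perform" flag is set before the call
    // (the buggy ordering) or after it succeeds (the fixed ordering).
    static void run(boolean fail, boolean setFlagEarly) {
        cleanedUp = false;
        boolean didPerform = false;
        try {
            if (setFlagEarly) {
                didPerform = true; // buggy: flag set before the work is done
            }
            if (fail) {
                throw new RuntimeException("compaction failed");
            }
            didPerform = true;     // fixed: flag set only on success
        } catch (RuntimeException ignored) {
            // swallowed, mirroring the interrupted-compaction path
        } finally {
            if (!didPerform) {
                cleanedUp = true;  // stands in for cancelRequestedCompaction
            }
        }
    }
}
```

With the early flag, a failed run never cleans up; with the fixed ordering, it does.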
[jira] [Commented] (HBASE-12457) Regions in transition for a long time when CLOSE interleaves with a slow compaction
[ https://issues.apache.org/jira/browse/HBASE-12457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206059#comment-14206059 ] Lars Hofhansl commented on HBASE-12457: --- That all said, in the end I have observed this only on a single region server (so far), so it might be an environmental issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12454) Setting didPerformCompaction early in HRegion#compact
[ https://issues.apache.org/jira/browse/HBASE-12454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206062#comment-14206062 ] Andrew Purtell commented on HBASE-12454: I can move the null check closer to where it is used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12456) Update surefire from 2.18-SNAPSHOT to 2.18
[ https://issues.apache.org/jira/browse/HBASE-12456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206066#comment-14206066 ] Hudson commented on HBASE-12456: SUCCESS: Integrated in HBase-TRUNK #5758 (See [https://builds.apache.org/job/HBase-TRUNK/5758/]) HBASE-12456 Update surefire from 2.18-SNAPSHOT to 2.18 (stack: rev df8859d5a55d02c20c8f4234f65c97c7acfcaa7e) * pom.xml -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12451) IncreasingToUpperBoundRegionSplitPolicy may cause unnecessary region splits in rolling update of cluster
[ https://issues.apache.org/jira/browse/HBASE-12451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206072#comment-14206072 ] zhangduo commented on HBASE-12451: -- {quote} the regions should be evenly spread across RS(8 RS in that case) {quote} But there are exceptions; see the description of this issue... And I made a mistake above. Just using the total region count of the table is not enough; we also need the total regionserver count as a parameter. 240 regions is not necessary for 8 regionservers, but is OK for 80 regionservers, right? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
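The sizing rule under discussion, as described in the issue (regions of the table on this server, cubed, times 2x the flush size, capped at the max region size), can be sketched as follows. The constants are illustrative defaults (128 MB flush size, 10 GB max region size), not read from any actual configuration:

```java
public class SplitSizeSketch {
    // Illustrative defaults, stand-ins for hbase.hregion.memstore.flush.size
    // and hbase.hregion.max.filesize.
    static final long FLUSH_SIZE = 128L * 1024 * 1024;          // 128 MB
    static final long MAX_FILE_SIZE = 10L * 1024 * 1024 * 1024; // 10 GB

    // Split threshold per the description: min(count^3 * 2 * flushSize, max).
    static long splitSize(int regionsOfTableOnServer) {
        long count = regionsOfTableOnServer;
        long size = count * count * count * 2 * FLUSH_SIZE;
        return Math.min(size, MAX_FILE_SIZE);
    }

    public static void main(String[] args) {
        // With one region of the table left on the server (e.g. mid rolling
        // update), the threshold collapses to 256 MB, which is why unloading
        // regions can trigger unnecessary splits.
        System.out.println(splitSize(1)); // 268435456
        System.out.println(splitSize(4)); // capped at the 10 GB upper bound
    }
}
```

This also shows why the upper bound dominates quickly: at 4 regions the cubed term already exceeds 10 GB, matching the "upper limit after 3 regions" observation above.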
[jira] [Commented] (HBASE-12433) Coprocessors not dynamically reordered when reset priority
[ https://issues.apache.org/jira/browse/HBASE-12433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206094#comment-14206094 ] Jingcheng Du commented on HBASE-12433: -- No, there wasn't. The coprocessors are reloaded/reordered when the regions are instantiated. This means that if the coprocessor priority is changed, disabling/enabling the table would help reorder the coprocessors in regions. Is it possible to disable/enable tables in your case? Thanks! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12456) Update surefire from 2.18-SNAPSHOT to 2.18
[ https://issues.apache.org/jira/browse/HBASE-12456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206096#comment-14206096 ] Hudson commented on HBASE-12456: SUCCESS: Integrated in HBase-1.0 #449 (See [https://builds.apache.org/job/HBase-1.0/449/]) HBASE-12456 Update surefire from 2.18-SNAPSHOT to 2.18 (stack: rev c2a17ad6ea0dbd2bcbaa538e8db81a665053f58f) * pom.xml -- This message was sent by Atlassian JIRA (v6.3.4#6332)