[jira] [Commented] (HBASE-19861) Avoid using RPCs when querying table infos for master status pages

2018-06-18 Thread Guanghao Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516685#comment-16516685
 ] 

Guanghao Zhang commented on HBASE-19861:


bq. Is this behaviour intentional?
Yes. The backup master shows nothing about tables.
bq. I think we should fall back to rpc based listing in case of backup master 
and can have this improvement in case of active master.
The backup master may use a separate UI based on RPC. There is no need for the 
active master to fall back to RPC.


> Avoid using RPCs when querying table infos for master status pages
> --
>
> Key: HBASE-19861
> URL: https://issues.apache.org/jira/browse/HBASE-19861
> Project: HBase
>  Issue Type: Improvement
>  Components: UI
>Reporter: Xiaolin Ha
>Assignee: Xiaolin Ha
>Priority: Major
> Fix For: 2.0.0-beta-2, 2.0.0
>
> Attachments: 19861.4.patch, HBASE-19861.v1.patch, 
> HBASE-19861.v3.patch, HBASE-19861.v4.patch, errorMsgExample.png
>
>
> When querying table information for the master status pages, the current 
> method uses admin interfaces. For example, when listing user tables, the 
> code is as follows:
> {code}
> Connection connection = master.getConnection();
> Admin admin = connection.getAdmin();
> try {
>   tables = admin.listTables();
> } finally {
>   admin.close();
> }
> {code}
> But actually, we can get all user tables from the master's memory.
> Using admin interfaces means using RPCs, which is inefficient.
>  
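A minimal sketch of the in-memory alternative (assuming the active master's 
cached table descriptors are reachable from the status page code; the accessor 
names below follow the master code but may differ from the final patch):
{code}
// Served from the active master's cache -- no RPC round trip.
Map<String, TableDescriptor> tables = master.getTableDescriptors().getAll();
for (TableDescriptor descriptor : tables.values()) {
  // render descriptor.getTableName() etc. on the status page
}
{code}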



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20642) IntegrationTestDDLMasterFailover throws 'InvalidFamilyOperationException

2018-06-18 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516676#comment-16516676
 ] 

stack commented on HBASE-20642:
---

Doing nonce creation outside the call method is what makes it so we don't make 
a new nonce on each invocation?

Long nonceGroup = ng.getNonceGroup();
Long nonce = ng.newNonce();

If so thank you for fixing a bunch of mangled calls.

Bit of doc for this...
  public interface StepHook{ 
?

So, the patch here makes it so the call can complete even though the Master was 
restarted while the call was going on (as long as the new Master came up before 
the timeout)?

Patch looks good to me, [~an...@apache.org]. Nice work, sir. I 
learned/re-learned stuff reviewing this work. Thanks.
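
For reference, a hedged sketch of why hoisting the nonce matters 
(submitAddColumnFamily is an illustrative stand-in for the retried master 
call):
{code}
// Created once, outside the retried call method.
final long nonceGroup = ng.getNonceGroup();
final long nonce = ng.newNonce();
boolean done = false;
while (!done) {
  // Every attempt re-sends the same (nonceGroup, nonce) pair, so a master that
  // already ran the operation can detect the duplicate instead of failing the
  // retry with e.g. InvalidFamilyOperationException.
  done = submitAddColumnFamily(tableName, family, nonceGroup, nonce);
}
{code}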

> IntegrationTestDDLMasterFailover throws 'InvalidFamilyOperationException 
> -
>
> Key: HBASE-20642
> URL: https://issues.apache.org/jira/browse/HBASE-20642
> Project: HBase
>  Issue Type: Bug
>Reporter: Ankit Singhal
>Assignee: Ankit Singhal
>Priority: Major
> Attachments: HBASE-20642.001.patch, HBASE-20642.patch
>
>
> [~romil.choksi] reported that IntegrationTestDDLMasterFailover is failing 
> when adding a column family while the master is restarting.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20743) ASF License warnings for branch-1

2018-06-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516642#comment-16516642
 ] 

Ted Yu commented on HBASE-20743:


Yes.

Here is an excerpt from target/rat.txt:
{code}
  hbase-error-prone/target/checkstyle-result.xml
  hbase-error-prone/target/maven-archiver/pom.properties
  
hbase-error-prone/target/classes/META-INF/services/com.google.errorprone.bugpatterns.BugChecker
  
hbase-error-prone/target/maven-status/maven-compiler-plugin/compile/default-compile/inputFiles.lst
  
hbase-error-prone/target/maven-status/maven-compiler-plugin/compile/default-compile/createdFiles.lst
{code}
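
A hedged sketch of one way to exclude them (apache-rat-plugin configuration in 
the affected pom; the exact exclude patterns would need to match the build 
layout):
{code}
<plugin>
  <groupId>org.apache.rat</groupId>
  <artifactId>apache-rat-plugin</artifactId>
  <configuration>
    <excludes>
      <!-- generated build output; never carries license headers -->
      <exclude>**/target/**</exclude>
    </excludes>
  </configuration>
</plugin>
{code}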

> ASF License warnings for branch-1
> -
>
> Key: HBASE-20743
> URL: https://issues.apache.org/jira/browse/HBASE-20743
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Priority: Minor
>
> From 
> https://builds.apache.org/job/HBase%20Nightly/job/branch-1/350/artifact/output-general/patch-asflicense-problems.txt
>  :
> {code}
> Lines that start with ? in the ASF License  report indicate files that do 
> not have an Apache license header:
>  !? hbase-error-prone/target/checkstyle-result.xml
>  !? 
> hbase-error-prone/target/classes/META-INF/services/com.google.errorprone.bugpatterns.BugChecker
>  !? 
> hbase-error-prone/target/maven-status/maven-compiler-plugin/compile/default-compile/inputFiles.lst
>  !? 
> hbase-error-prone/target/maven-status/maven-compiler-plugin/compile/default-compile/createdFiles.lst
> {code}
> Looks like they should be excluded.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20542) Better heap utilization for IMC with MSLABs

2018-06-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516625#comment-16516625
 ] 

Ted Yu commented on HBASE-20542:


When I ran the test locally with the patch, I saw the following in the test 
output:
{code}
2018-06-18 20:43:32,244 ERROR [Time-limited test] regionserver.HRegion(1249): 
Asked to modify this region's 
(foobar,,1529379812191.c0c4ada07a3a9905699278a1b1fd63ff.) memStoreSizing  to a 
negative value which is incorrect. Current memStoreSizing=0, delta=-32
java.lang.Exception
  at 
org.apache.hadoop.hbase.regionserver.HRegion.checkNegativeMemStoreDataSize(HRegion.java:1249)
  at 
org.apache.hadoop.hbase.regionserver.HRegion.incMemStoreSize(HRegion.java:1229)
  at 
org.apache.hadoop.hbase.regionserver.RegionServicesForStores.addMemStoreSize(RegionServicesForStores.java:61)
  at 
org.apache.hadoop.hbase.regionserver.CompactionPipeline.swap(CompactionPipeline.java:153)
  at 
org.apache.hadoop.hbase.regionserver.CompactingMemStore.swapCompactedSegments(CompactingMemStore.java:332)
  at 
org.apache.hadoop.hbase.regionserver.MemStoreCompactor.doCompaction(MemStoreCompactor.java:177)
  at 
org.apache.hadoop.hbase.regionserver.MemStoreCompactor.start(MemStoreCompactor.java:110)
  at 
org.apache.hadoop.hbase.regionserver.CompactingMemStore.inMemoryCompaction(CompactingMemStore.java:459)
  at 
org.apache.hadoop.hbase.regionserver.CompactingMemStore.flushInMemory(CompactingMemStore.java:439)
  at 
org.apache.hadoop.hbase.regionserver.TestCompactingToCellFlatMapMemStore.testTimeRange(TestCompactingToCellFlatMapMemStore.java:409)
  at 
org.apache.hadoop.hbase.regionserver.TestCompactingToCellFlatMapMemStore.testTimeRangeAfterCompaction(TestCompactingToCellFlatMapMemStore.java:374)
{code}

> Better heap utilization for IMC with MSLABs
> ---
>
> Key: HBASE-20542
> URL: https://issues.apache.org/jira/browse/HBASE-20542
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Eshcar Hillel
>Assignee: Eshcar Hillel
>Priority: Major
> Attachments: HBASE-20542.branch-2.001.patch, run.sh, workloada, 
> workloadc, workloadx, workloady
>
>
> Following HBASE-20188 we realized in-memory compaction combined with MSLABs 
> may suffer from heap under-utilization due to internal fragmentation. This 
> jira presents a solution to circumvent this problem. The main idea is to have 
> each update operation check whether it would cause the active segment to 
> overflow *before* writing the new value (instead of checking the size after 
> the write is completed); if it would, the active segment is atomically 
> swapped with a new empty segment and pushed (full-yet-not-overflowed) to the 
> compaction pipeline. Later on the IMC daemon will run its compaction 
> operation (flatten index/merge indices/data compaction) in the background. 
> Some subtle concurrency issues should be handled with care. We elaborate on 
> them next.
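
A hedged sketch of the pre-write check described above (the field and method 
names are illustrative, not the actual CompactingMemStore internals):
{code}
// Check for overflow *before* writing, so the swapped-out segment is full but
// never overflowed by the incoming cell.
MutableSegment current = activeRef.get();
if (current.getDataSize() + cellSize > inMemoryFlushThreshold
    && activeRef.compareAndSet(current, newEmptySegment())) {
  pipeline.pushHead(current);    // full-yet-not-overflowed segment
  startInMemoryCompaction();     // IMC daemon flattens/merges in the background
}
activeRef.get().add(cell);
{code}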



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20708) Remove the usage of RecoverMetaProcedure in master startup

2018-06-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516622#comment-16516622
 ] 

Hadoop QA commented on HBASE-20708:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 21 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
21s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
39s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
30s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
 7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
28s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
13s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  3m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
10s{color} | {color:green} The patch hbase-protocol-shaded passed checkstyle 
{color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
31s{color} | {color:green} The patch hbase-client passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} hbase-procedure: The patch generated 0 new + 43 
unchanged - 1 fixed = 43 total (was 44) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
16s{color} | {color:red} hbase-server: The patch generated 1 new + 359 
unchanged - 17 fixed = 360 total (was 376) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
32s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
10m 22s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green}  
1m 40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
12s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
33s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m  
3s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
44s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}113m 

[jira] [Commented] (HBASE-18201) add UT and docs for DataBlockEncodingTool

2018-06-18 Thread Kuan-Po Tseng (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516567#comment-16516567
 ] 

Kuan-Po Tseng commented on HBASE-18201:
---

Reattach patch to trigger testing.

> add UT and docs for DataBlockEncodingTool
> -
>
> Key: HBASE-18201
> URL: https://issues.apache.org/jira/browse/HBASE-18201
> Project: HBase
>  Issue Type: Task
>  Components: tooling
>Reporter: Chia-Ping Tsai
>Assignee: Kuan-Po Tseng
>Priority: Minor
>  Labels: beginner
> Attachments: HBASE-18201.master.001.patch, 
> HBASE-18201.master.002.patch
>
>
> There are no examples, documents, or tests for DataBlockEncodingTool. We 
> should make it friendly if any use case exists. Otherwise, we should just get 
> rid of it, because DataBlockEncodingTool presumes that the cell 
> implementation returned from DataBlockEncoder is KeyValue. That presumption 
> may obstruct the cleanup of KeyValue references in the read/write path code 
> base.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20615) emphasize use of shaded client jars when they're present in an install

2018-06-18 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516562#comment-16516562
 ] 

Hudson commented on HBASE-20615:


Results for branch branch-2
[build #878 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/878/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/878//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/878//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/878//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> emphasize use of shaded client jars when they're present in an install
> --
>
> Key: HBASE-20615
> URL: https://issues.apache.org/jira/browse/HBASE-20615
> Project: HBase
>  Issue Type: Sub-task
>  Components: build, Client, Usability
>Affects Versions: 2.0.0
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Major
> Fix For: 3.0.0, 2.1.0
>
> Attachments: HBASE-20615.0.patch, HBASE-20615.1.patch, 
> HBASE-20615.2.patch
>
>
> Working through setting up an IT for our shaded artifacts in HBASE-20334 
> makes our lack of packaging seem like an oversight. While I could work around 
> it by pulling the shaded clients out of whatever build process built the 
> convenience binary that we're trying to test, that seems very awkward.
> After reflecting on it more, it makes more sense to me for there to be a 
> common place in the install that folks running jobs against the cluster can 
> rely on. If they need to run without a full hbase install, that should still 
> work fine via e.g. grabbing from the maven repo.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20478) move import checks from hbaseanti to checkstyle

2018-06-18 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516558#comment-16516558
 ] 

Hudson commented on HBASE-20478:


Results for branch branch-2
[build #878 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/878/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/878//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/878//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/878//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> move import checks from hbaseanti to checkstyle
> ---
>
> Key: HBASE-20478
> URL: https://issues.apache.org/jira/browse/HBASE-20478
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Reporter: Sean Busbey
>Assignee: Mike Drob
>Priority: Minor
> Fix For: 3.0.0, 2.1.0
>
> Attachments: HBASE-20478.0.patch, HBASE-20478.1.patch, 
> HBASE-20478.2.patch, HBASE-20478.3.patch, HBASE-20478.4.patch, 
> HBASE-20478.5.patch, HBASE-20478.6.patch, HBASE-20478.8.patch, 
> HBASE-20478.WIP.2.patch, HBASE-20478.WIP.2.patch, HBASE-20478.WIP.patch, 
> HBASE-anti-check.patch
>
>
> Came up in discussion on HBASE-20332: our check for "don't do this" things in 
> the codebase doesn't log the specifics of complaints anywhere, which forces 
> those who want to follow up to reverse engineer the check.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20333) break up shaded client into one with no Hadoop and one that's standalone

2018-06-18 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516560#comment-16516560
 ] 

Hudson commented on HBASE-20333:


Results for branch branch-2
[build #878 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/878/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/878//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/878//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/878//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> break up shaded client into one with no Hadoop and one that's standalone
> 
>
> Key: HBASE-20333
> URL: https://issues.apache.org/jira/browse/HBASE-20333
> Project: HBase
>  Issue Type: Sub-task
>  Components: shading
>Affects Versions: 2.0.0
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Critical
> Fix For: 3.0.0, 2.1.0
>
> Attachments: HBASE-20333.1.patch, HBASE-20333.WIP.0.patch
>
>
> There are contexts where we want to stay out of our downstream users' way wrt 
> dependencies, but they need more Hadoop classes than we provide, i.e. any 
> downstream client that wants to use both HBase and HDFS in their application, 
> or any non-MR YARN application.
> Now that Hadoop also has shaded client artifacts for Hadoop 3, we're 
> providing less incremental benefit by including our own rewritten Hadoop 
> classes to avoid downstream needing to pull in all of Hadoop's transitive 
> dependencies.
> Right now those users need to ensure that any jars from the Hadoop project 
> are loaded in the classpath prior to our shaded client jar. This is brittle 
> and prone to weird debugging trouble.
> Instead, we should have two artifacts: one that just lists Hadoop as a 
> prerequisite and one that still includes the rewritten-but-not-relocated 
> Hadoop classes.
> We can then use docs to emphasize when each of these is appropriate to use.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20332) shaded mapreduce module shouldn't include hadoop

2018-06-18 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516559#comment-16516559
 ] 

Hudson commented on HBASE-20332:


Results for branch branch-2
[build #878 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/878/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/878//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/878//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/878//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> shaded mapreduce module shouldn't include hadoop
> 
>
> Key: HBASE-20332
> URL: https://issues.apache.org/jira/browse/HBASE-20332
> Project: HBase
>  Issue Type: Sub-task
>  Components: mapreduce, shading
>Affects Versions: 2.0.0
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Critical
> Fix For: 3.0.0, 2.1.0
>
> Attachments: HBASE-20332.0.patch, HBASE-20332.1.WIP.patch, 
> HBASE-20332.2.WIP.patch, HBASE-20332.3.patch, HBASE-20332.4.patch, 
> HBASE-20332.5.patch, HBASE-20332.6.patch, HBASE-20332.7.patch
>
>
> AFAICT, we should just entirely skip including hadoop in our shaded mapreduce 
> module:
> 1) Folks expect to run yarn / mr apps via {{hadoop jar}} / {{yarn jar}}
> 2) those commands include all the needed Hadoop jars in your classpath by 
> default (both client side and in the containers)
> 3) If you try to use "user classpath first" for your job as a workaround 
> (e.g. for some library your application needs that hadoop provides), our 
> inclusion of *some but not all* hadoop classes causes everything to fall over 
> because of mixing rewritten and non-rewritten hadoop classes
> 4) if you don't use "user classpath first" then all of our 
> non-relocated-but-still-shaded hadoop classes are ignored anyway, so we're 
> just wasting space
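
For item 3 above, a hedged example of opting a job into "user classpath first" 
(this is the standard Hadoop MapReduce property, independent of anything 
HBase-specific):
{code}
// Ask the client and the YARN containers to put the job's own jars ahead of
// Hadoop's when resolving classes.
job.getConfiguration().setBoolean("mapreduce.job.user.classpath.first", true);
{code}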



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20334) add a test that expressly uses both our shaded client and the one from hadoop 3

2018-06-18 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516563#comment-16516563
 ] 

Hudson commented on HBASE-20334:


Results for branch branch-2
[build #878 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/878/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/878//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/878//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/878//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> add a test that expressly uses both our shaded client and the one from hadoop 
> 3
> ---
>
> Key: HBASE-20334
> URL: https://issues.apache.org/jira/browse/HBASE-20334
> Project: HBase
>  Issue Type: Sub-task
>  Components: hadoop3, shading
>Affects Versions: 2.0.0
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Major
> Fix For: 3.0.0, 2.1.0
>
> Attachments: HBASE-20334.0.patch, HBASE-20334.1.patch
>
>
> Since we're making a shaded client that bleeds out of our namespace and into 
> Hadoop's, we should ensure that we can show our clients coexisting. Even if 
> it's just an IT that successfully talks to both us and HDFS via our 
> respective shaded clients, that'd be a big help in keeping us proactive.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19735) Create a minimal "client" tarball installation

2018-06-18 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516561#comment-16516561
 ] 

Hudson commented on HBASE-19735:


Results for branch branch-2
[build #878 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/878/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/878//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/878//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/878//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Create a minimal "client" tarball installation
> --
>
> Key: HBASE-19735
> URL: https://issues.apache.org/jira/browse/HBASE-19735
> Project: HBase
>  Issue Type: New Feature
>  Components: build, Client
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 3.0.0, 2.1.0
>
> Attachments: HBASE-19735.000.patch, HBASE-19735.001.branch-2.patch, 
> HBASE-19735.002.branch-2.patch, HBASE-19735.003.patch, HBASE-19735.004.patch
>
>
> We're moving ourselves towards more controlled dependencies. A logical next 
> step is to try to do the same for our "binary" artifacts that we create 
> during releases.
> There is code (ours and our dependencies') which the HMaster and RegionServer 
> require and which, obviously, clients do not need.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-18201) add UT and docs for DataBlockEncodingTool

2018-06-18 Thread Kuan-Po Tseng (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-18201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kuan-Po Tseng updated HBASE-18201:
--
Attachment: HBASE-18201.master.002.patch

> add UT and docs for DataBlockEncodingTool
> -
>
> Key: HBASE-18201
> URL: https://issues.apache.org/jira/browse/HBASE-18201
> Project: HBase
>  Issue Type: Task
>  Components: tooling
>Reporter: Chia-Ping Tsai
>Assignee: Kuan-Po Tseng
>Priority: Minor
>  Labels: beginner
> Attachments: HBASE-18201.master.001.patch, 
> HBASE-18201.master.002.patch
>
>
> There are no examples, documents, or tests for DataBlockEncodingTool. We 
> should make it friendly if any use case exists. Otherwise, we should just get 
> rid of it, because DataBlockEncodingTool presumes that the cell 
> implementation returned from DataBlockEncoder is KeyValue. That presumption 
> may obstruct the cleanup of KeyValue references in the read/write path code 
> base.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20737) put collection into ArrayList instead of addAll function

2018-06-18 Thread taiyinglee (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516553#comment-16516553
 ] 

taiyinglee commented on HBASE-20737:


OK, I will push another patch for RetriesExhaustedWithDetailsException.

> put collection into ArrayList instead of addAll function
> 
>
> Key: HBASE-20737
> URL: https://issues.apache.org/jira/browse/HBASE-20737
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: taiyinglee
>Assignee: taiyinglee
>Priority: Trivial
> Fix For: 3.0.0, 2.1.0
>
> Attachments: HBASE-20737.v0.patch, HBASE-20737.v0.patch
>
>
> [https://docs.oracle.com/javase/7/docs/api/java/util/Collection.html]
> [https://docs.oracle.com/javase/7/docs/api/java/util/ArrayList.html]
> [https://docs.oracle.com/javase/7/docs/api/java/util/Set.html]
>  
> /hbase/hbase-server/src/main/java/org/apache/hadoop/hbase/master/ClusterStatusPublisher.java
> change
> {code}
> List<Map.Entry<ServerName, Integer>> entries = new ArrayList<>();
> entries.addAll(lastSent.entrySet());
> {code}
> to
> {code}
> List<Map.Entry<ServerName, Integer>> entries = new ArrayList<>(lastSent.entrySet());
> {code}
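
(Design note on the change: {{new ArrayList<>(collection)}} sizes the backing 
array to the collection up front, while an empty list followed by {{addAll}} 
allocates a default-capacity array first and then grows it, so the constructor 
form avoids an intermediate allocation and copy.)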



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (HBASE-20737) put collection into ArrayList instead of addAll function

2018-06-18 Thread taiyinglee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

taiyinglee reopened HBASE-20737:


> put collection into ArrayList instead of addAll function
> 
>
> Key: HBASE-20737
> URL: https://issues.apache.org/jira/browse/HBASE-20737
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: taiyinglee
>Assignee: taiyinglee
>Priority: Trivial
> Fix For: 3.0.0, 2.1.0
>
> Attachments: HBASE-20737.v0.patch, HBASE-20737.v0.patch
>
>
> [https://docs.oracle.com/javase/7/docs/api/java/util/Collection.html]
> [https://docs.oracle.com/javase/7/docs/api/java/util/ArrayList.html]
> [https://docs.oracle.com/javase/7/docs/api/java/util/Set.html]
>  
> /hbase/hbase-server/src/main/java/org/apache/hadoop/hbase/master/ClusterStatusPublisher.java
> change
> {code}
> List<Map.Entry<ServerName, Integer>> entries = new ArrayList<>();
> entries.addAll(lastSent.entrySet());
> {code}
> to
> {code}
> List<Map.Entry<ServerName, Integer>> entries = new ArrayList<>(lastSent.entrySet());
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20708) Remove the usage of RecoverMetaProcedure in master startup

2018-06-18 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-20708:
--
Attachment: HBASE-20708-v9.patch

> Remove the usage of RecoverMetaProcedure in master startup
> --
>
> Key: HBASE-20708
> URL: https://issues.apache.org/jira/browse/HBASE-20708
> Project: HBase
>  Issue Type: Bug
>  Components: proc-v2, Region Assignment
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Blocker
> Fix For: 3.0.0, 2.1.0
>
> Attachments: HBASE-20708-v1.patch, HBASE-20708-v2.patch, 
> HBASE-20708-v3.patch, HBASE-20708-v4.patch, HBASE-20708-v5.patch, 
> HBASE-20708-v6.patch, HBASE-20708-v7.patch, HBASE-20708-v8.patch, 
> HBASE-20708-v9.patch, HBASE-20708-v9.patch, HBASE-20708.patch
>
>
> In HBASE-20700, we made RecoverMetaProcedure use a special lock, which is 
> only used by RMP, to avoid deadlock with MoveRegionProcedure. But we will 
> always schedule an RMP when the master starts up, so we still need to make 
> sure that there is no race between this RMP and other RMPs and SCPs scheduled 
> before the master restarts.
> Please see the [accompanying design 
> document|https://docs.google.com/document/d/1_872oHzrhJq4ck7f6zmp1J--zMhsIFvXSZyX1Mxg5MA/edit#heading=h.xy1z4alsq7uy]
>  where we call out the problem being addressed by this issue in more detail 
> and in which we describe our new approach to Master startup.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-15887) Report Log Additions and Removals in Builds

2018-06-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516455#comment-16516455
 ] 

Hadoop QA commented on HBASE-15887:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
| {color:blue}0{color} | {color:blue} patch {color} | {color:blue}  0m  
2s{color} | {color:blue} The patch file was not named according to hbase's 
naming conventions. Please see 
https://yetus.apache.org/documentation/0.7.0/precommit-patchnames for 
instructions. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  3s{color} 
| {color:red} HBASE-15887 does not apply to master. Rebase required? Wrong 
Branch? See https://yetus.apache.org/documentation/0.7.0/precommit-patchnames 
for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-15887 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12806010/HBASE-15887-v1.txt |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13306/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> Report Log Additions and Removals in Builds
> ---
>
> Key: HBASE-15887
> URL: https://issues.apache.org/jira/browse/HBASE-15887
> Project: HBase
>  Issue Type: New Feature
>  Components: build
>Reporter: Clay B.
>Priority: Trivial
> Attachments: HBASE-15887-v1.txt
>
>
> It would be very nice for the Apache Yetus verifications of HBase patches to 
> report log item additions and deletions.
> This is not my idea but [~mbm] asked if we could modify the personality for 
> reporting log additions and removals yesterday at an [HBase meetup at Splice 
> machine|http://www.meetup.com/hbaseusergroup/events/230547750/] as [~aw] 
> presented Apache Yetus for building HBase.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-15887) Report Log Additions and Removals in Builds

2018-06-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516454#comment-16516454
 ] 

Hadoop QA commented on HBASE-15887:
---

(!) A patch to the testing environment has been detected. 
Re-executing against the patched versions to perform further tests. 
The console is at 
https://builds.apache.org/job/PreCommit-HBASE-Build/13306/console in case of 
problems.


> Report Log Additions and Removals in Builds
> ---
>
> Key: HBASE-15887
> URL: https://issues.apache.org/jira/browse/HBASE-15887
> Project: HBase
>  Issue Type: New Feature
>  Components: build
>Reporter: Clay B.
>Priority: Trivial
> Attachments: HBASE-15887-v1.txt
>
>
> It would be very nice for the Apache Yetus verifications of HBase patches to 
> report log item additions and deletions.
> This is not my idea but [~mbm] asked if we could modify the personality for 
> reporting log additions and removals yesterday at an [HBase meetup at Splice 
> machine|http://www.meetup.com/hbaseusergroup/events/230547750/] as [~aw] 
> presented Apache Yetus for building HBase.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-15887) Report Log Additions and Removals in Builds

2018-06-18 Thread Clay B. (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516451#comment-16516451
 ] 

Clay B. commented on HBASE-15887:
-

[~busbey] ping again; is this too hacky, or better suited for Yetus?

> Report Log Additions and Removals in Builds
> ---
>
> Key: HBASE-15887
> URL: https://issues.apache.org/jira/browse/HBASE-15887
> Project: HBase
>  Issue Type: New Feature
>  Components: build
>Reporter: Clay B.
>Priority: Trivial
> Attachments: HBASE-15887-v1.txt
>
>
> It would be very nice for the Apache Yetus verifications of HBase patches to 
> report log item additions and deletions.
> This is not my idea but [~mbm] asked if we could modify the personality for 
> reporting log additions and removals yesterday at an [HBase meetup at Splice 
> machine|http://www.meetup.com/hbaseusergroup/events/230547750/] as [~aw] 
> presented Apache Yetus for building HBase.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20334) add a test that expressly uses both our shaded client and the one from hadoop 3

2018-06-18 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-20334:

  Resolution: Fixed
Release Note: 


HBase now includes a helper script, in `dev-support`, that can be used to run a 
basic functionality test for a given HBase installation. The test 
can optionally be given an HBase client artifact to rely on and can optionally 
be given specific Hadoop client artifacts to use.

For usage information see 
`./dev-support/hbase_nightly_pseudo-distributed-test.sh --help`.

The project nightly tests now make use of this test to check running on top of 
Hadoop 2, Hadoop 3, and Hadoop 3 with shaded client artifacts.
  Status: Resolved  (was: Patch Available)

> add a test that expressly uses both our shaded client and the one from hadoop 
> 3
> ---
>
> Key: HBASE-20334
> URL: https://issues.apache.org/jira/browse/HBASE-20334
> Project: HBase
>  Issue Type: Sub-task
>  Components: hadoop3, shading
>Affects Versions: 2.0.0
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Major
> Fix For: 3.0.0, 2.1.0
>
> Attachments: HBASE-20334.0.patch, HBASE-20334.1.patch
>
>
> Since we're making a shaded client that bleed out of our namespace and into 
> Hadoop's, we should ensure that we can show our clients coexisting. Even if 
> it's just an IT that successfully talks to both us and HDFS via our 
> respective shaded clients, that'd be a big help in keeping us proactive.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20542) Better heap utilization for IMC with MSLABs

2018-06-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516365#comment-16516365
 ] 

Hadoop QA commented on HBASE-20542:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
30s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
1s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
30s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
51s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
16s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
35s{color} | {color:green} branch-2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
 6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
2s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  2m  2s{color} 
| {color:red} hbase-server generated 3 new + 185 unchanged - 3 fixed = 188 
total (was 188) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
26s{color} | {color:red} hbase-server: The patch generated 11 new + 109 
unchanged - 1 fixed = 120 total (was 110) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
52s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
9m 23s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}115m 37s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}157m 43s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hbase.regionserver.TestCompactingToCellFlatMapMemStore |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:42ca976 |
| JIRA Issue | HBASE-20542 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12928242/HBASE-20542.branch-2.001.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux c530045a11e7 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh
 |
| git revision | branch-2 / 8edd5d948a |
| maven | version: Apache Maven 3.5.3 
(3383c37e1f9e9b3bc3df5050c29c8aff9f295297; 2018-02-24T19:49:05Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
| javac | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13304/artifact/patchprocess/diff-compile-javac-hbase-server.txt
 |
| checkstyle | 

[jira] [Updated] (HBASE-19735) Create a minimal "client" tarball installation

2018-06-18 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-19735:

  Resolution: Fixed
Release Note: 


The HBase convenience binary artifacts now include a client-focused tarball 
that a) includes more docs and b) does not include scripts or jars needed only 
for running HBase cluster services.

The new artifact is made as a normal part of the `assembly:single` maven 
command.
  Status: Resolved  (was: Patch Available)
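
For example, a hedged sketch of producing the tarballs locally (exact flags may 
vary by branch and profile):
{code}
mvn -DskipTests package assembly:single
{code}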

> Create a minimal "client" tarball installation
> --
>
> Key: HBASE-19735
> URL: https://issues.apache.org/jira/browse/HBASE-19735
> Project: HBase
>  Issue Type: New Feature
>  Components: build, Client
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 3.0.0, 2.1.0
>
> Attachments: HBASE-19735.000.patch, HBASE-19735.001.branch-2.patch, 
> HBASE-19735.002.branch-2.patch, HBASE-19735.003.patch, HBASE-19735.004.patch
>
>
> We're moving ourselves towards more controlled dependencies. A logical next 
> step is to try to do the same for our "binary" artifacts that we create 
> during releases.
> There is code (ours and our dependencies') which the HMaster and RegionServer 
> require and which, obviously, clients do not need.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-19735) Create a minimal "client" tarball installation

2018-06-18 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-19735:

Component/s: Client
 build

> Create a minimal "client" tarball installation
> --
>
> Key: HBASE-19735
> URL: https://issues.apache.org/jira/browse/HBASE-19735
> Project: HBase
>  Issue Type: New Feature
>  Components: build, Client
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 3.0.0, 2.1.0
>
> Attachments: HBASE-19735.000.patch, HBASE-19735.001.branch-2.patch, 
> HBASE-19735.002.branch-2.patch, HBASE-19735.003.patch, HBASE-19735.004.patch
>
>
> We're moving ourselves towards more controlled dependencies. A logical next 
> step is to try to do the same for our "binary" artifacts that we create 
> during releases.
> There is code (ours and our dependencies') which the HMaster and RegionServer 
> require and which, obviously, clients do not need.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20615) emphasize use of shaded client jars when they're present in an install

2018-06-18 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-20615:

  Resolution: Fixed
Release Note: 


HBase's built-in scripts now rely on the downstream-facing shaded artifacts 
where possible. Of particular interest to downstream users, the `hbase 
classpath` and `hbase mapredcp` commands now return the relevant shaded client 
artifact and only those third party jars needed to make use of them (e.g. 
slf4j-api, commons-logging, htrace, etc).

Downstream users should note that by default the `hbase classpath` command will 
treat having `hadoop` on the shell's PATH as an implicit request to include the 
output of the `hadoop classpath` command in the returned classpath. This 
long-existing behavior can be opted out of by setting the environment variable 
`HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP` to the value "true". For example: 
`HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP="true" bin/hbase classpath`.
  Status: Resolved  (was: Patch Available)

pushed to master and branch-2. I missed signed-off-by lines on master. :/

> emphasize use of shaded client jars when they're present in an install
> --
>
> Key: HBASE-20615
> URL: https://issues.apache.org/jira/browse/HBASE-20615
> Project: HBase
>  Issue Type: Sub-task
>  Components: build, Client, Usability
>Affects Versions: 2.0.0
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Major
> Fix For: 3.0.0, 2.1.0
>
> Attachments: HBASE-20615.0.patch, HBASE-20615.1.patch, 
> HBASE-20615.2.patch
>
>
> Working through setting up an IT for our shaded artifacts in HBASE-20334 
> makes our lack of packaging seem like an oversight. While I could work around 
> it by pulling the shaded clients out of whatever build process built the 
> convenience binary that we're trying to test, that seems very awkward.
> After reflecting on it more, it makes more sense to me for there to be a 
> common place in the install that folks running jobs against the cluster can 
> rely on. If they need to run without a full hbase install, that should still 
> work fine via e.g. grabbing from the maven repo.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20333) break up shaded client into one with no Hadoop and one that's standalone

2018-06-18 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-20333:

  Resolution: Fixed
Release Note: 


Downstream users who need to use both HBase and Hadoop APIs should switch to 
relying on the new `hbase-shaded-client-byo-hadoop` artifact rather than the 
existing `hbase-shaded-client` artifact. The new artifact no longer includes 
any Hadoop classes.

It should work in combination with either the output of `hadoop classpath` or 
the Hadoop provided client-facing shaded artifacts in Hadoop 3+.
  Status: Resolved  (was: Patch Available)

pushed to master and branch-2. I missed signed-off-by lines on master. :/
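
A hedged sketch of the downstream pom change (the version shown is 
illustrative; use the HBase version you actually run against):
{code}
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-shaded-client-byo-hadoop</artifactId>
  <version>2.1.0</version>
</dependency>
<!-- Hadoop classes then come from `hadoop classpath` or, on Hadoop 3+, from
     Hadoop's own shaded hadoop-client-api / hadoop-client-runtime artifacts -->
{code}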

> break up shaded client into one with no Hadoop and one that's standalone
> 
>
> Key: HBASE-20333
> URL: https://issues.apache.org/jira/browse/HBASE-20333
> Project: HBase
>  Issue Type: Sub-task
>  Components: shading
>Affects Versions: 2.0.0
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Critical
> Fix For: 3.0.0, 2.1.0
>
> Attachments: HBASE-20333.1.patch, HBASE-20333.WIP.0.patch
>
>
> There are contexts where we want to stay out of our downstream users' way wrt 
> dependencies, but they need more Hadoop classes than we provide, i.e. any 
> downstream client that wants to use both HBase and HDFS in their application, 
> or any non-MR YARN application.
> Now that Hadoop also has shaded client artifacts for Hadoop 3, we're 
> providing less incremental benefit by including our own rewritten Hadoop 
> classes to avoid downstream needing to pull in all of Hadoop's transitive 
> dependencies.
> Right now those users need to ensure that any jars from the Hadoop project 
> are loaded in the classpath prior to our shaded client jar. This is brittle 
> and prone to weird debugging trouble.
> Instead, we should have two artifacts: one that just lists Hadoop as a 
> prerequisite and one that still includes the rewritten-but-not-relocated 
> Hadoop classes.
> We can then use docs to emphasize when each of these is appropriate to use.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20332) shaded mapreduce module shouldn't include hadoop

2018-06-18 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-20332:

  Resolution: Fixed
Release Note: 


The `hbase-shaded-mapreduce` artifact no longer includes its own copy of Hadoop 
classes. Users who make use of the artifact via YARN should be able to get 
these classes from YARN's classpath without having to make any changes.
  Status: Resolved  (was: Patch Available)

pushed to master and branch-2. I missed signed-off-by lines on master. :/

> shaded mapreduce module shouldn't include hadoop
> 
>
> Key: HBASE-20332
> URL: https://issues.apache.org/jira/browse/HBASE-20332
> Project: HBase
>  Issue Type: Sub-task
>  Components: mapreduce, shading
>Affects Versions: 2.0.0
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Critical
> Fix For: 3.0.0, 2.1.0
>
> Attachments: HBASE-20332.0.patch, HBASE-20332.1.WIP.patch, 
> HBASE-20332.2.WIP.patch, HBASE-20332.3.patch, HBASE-20332.4.patch, 
> HBASE-20332.5.patch, HBASE-20332.6.patch, HBASE-20332.7.patch
>
>
> AFAICT, we should just entirely skip including hadoop in our shaded mapreduce 
> module:
> 1) Folks expect to run yarn / mr apps via {{hadoop jar}} / {{yarn jar}}
> 2) those commands include all the needed Hadoop jars in your classpath by 
> default (both client side and in the containers)
> 3) If you try to use "user classpath first" for your job as a workaround 
> (e.g. for some library your application needs that hadoop provides), our 
> inclusion of *some but not all* hadoop classes causes everything to fall over 
> because of mixing rewritten and non-rewritten hadoop classes
> 4) if you don't use "user classpath first" then all of our 
> non-relocated-but-still-shaded hadoop classes are ignored anyway, so we're 
> just wasting space



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20478) move import checks from hbaseanti to checkstyle

2018-06-18 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-20478:

Fix Version/s: 2.1.0

> move import checks from hbaseanti to checkstyle
> ---
>
> Key: HBASE-20478
> URL: https://issues.apache.org/jira/browse/HBASE-20478
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Reporter: Sean Busbey
>Assignee: Mike Drob
>Priority: Minor
> Fix For: 3.0.0, 2.1.0
>
> Attachments: HBASE-20478.0.patch, HBASE-20478.1.patch, 
> HBASE-20478.2.patch, HBASE-20478.3.patch, HBASE-20478.4.patch, 
> HBASE-20478.5.patch, HBASE-20478.6.patch, HBASE-20478.8.patch, 
> HBASE-20478.WIP.2.patch, HBASE-20478.WIP.2.patch, HBASE-20478.WIP.patch, 
> HBASE-anti-check.patch
>
>
> Came up in discussion on HBASE-20332: our check for "don't do this" things in 
> the codebase doesn't log the specifics of complaints anywhere, which forces 
> those who want to follow up to reverse engineer the check.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20750) If shaded client artifacts are built without the -Prelease flag, make sure they'll fail loudly if used

2018-06-18 Thread Sean Busbey (JIRA)
Sean Busbey created HBASE-20750:
---

 Summary: If shaded client artifacts are built without the 
-Prelease flag, make sure they'll fail loudly if used
 Key: HBASE-20750
 URL: https://issues.apache.org/jira/browse/HBASE-20750
 Project: HBase
  Issue Type: Improvement
  Components: build, shading
Affects Versions: 3.0.0, 2.1.0
Reporter: Sean Busbey


If someone builds the shaded jars and doesn't pass the {{-Prelease}} flag, we 
get near-empty jars.

We should make sure that if they're loaded they fail loudly rather than acting 
as no-ops as they do now.

On a Hadoop 3 cluster without YARN-7190, this results in the following 
confusing output:
{code}
Busbey-MBA:hbase busbey$ 
./hbase-assembly/target/hbase-2.1.0-SNAPSHOT-client/bin/hbase version
HBase 1.2.6
Source code repository 
file:///home/busbey/projects/hbase/hbase-assembly/target/hbase-1.2.6 
revision=Unknown
Compiled by busbey on Mon May 29 02:25:32 CDT 2017
From source with checksum 7e8ce83a648e252758e9dae1fbe779c9
{code}

On a cluster not impacted by YARN-7190 we'll just get a confusing 
ClassNotFoundException.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20642) IntegrationTestDDLMasterFailover throws 'InvalidFamilyOperationException

2018-06-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516281#comment-16516281
 ] 

Hadoop QA commented on HBASE-20642:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
48s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
28s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
46s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
53s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
57s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
24s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  1m 47s{color} 
| {color:red} hbase-server generated 1 new + 187 unchanged - 1 fixed = 188 
total (was 188) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
12s{color} | {color:red} hbase-server: The patch generated 4 new + 159 
unchanged - 14 fixed = 163 total (was 173) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 5 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
58s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
10m 21s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m  
9s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}114m 
36s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
43s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}165m 59s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20642 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12928233/HBASE-20642.001.patch 
|
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 74f6b8e47935 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 

[jira] [Commented] (HBASE-20188) [TESTING] Performance

2018-06-18 Thread Eshcar Hillel (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516256#comment-16516256
 ] 

Eshcar Hillel commented on HBASE-20188:
---

Please note that the new patch and benchmark results are available in HBASE-20542

> [TESTING] Performance
> -
>
> Key: HBASE-20188
> URL: https://issues.apache.org/jira/browse/HBASE-20188
> Project: HBase
>  Issue Type: Umbrella
>  Components: Performance
>Reporter: stack
>Priority: Blocker
> Fix For: 3.0.0, 2.1.0
>
> Attachments: CAM-CONFIG-V01.patch, HBASE-20188-xac.sh, 
> HBASE-20188.sh, HBase 2.0 performance evaluation - 8GB(1).pdf, HBase 2.0 
> performance evaluation - 8GB.pdf, HBase 2.0 performance evaluation - Basic vs 
> None_ system settings.pdf, HBase 2.0 performance evaluation - throughput 
> SSD_HDD.pdf, ITBLL2.5B_1.2.7vs2.0.0_cpu.png, 
> ITBLL2.5B_1.2.7vs2.0.0_gctime.png, ITBLL2.5B_1.2.7vs2.0.0_iops.png, 
> ITBLL2.5B_1.2.7vs2.0.0_load.png, ITBLL2.5B_1.2.7vs2.0.0_memheap.png, 
> ITBLL2.5B_1.2.7vs2.0.0_memstore.png, ITBLL2.5B_1.2.7vs2.0.0_ops.png, 
> ITBLL2.5B_1.2.7vs2.0.0_ops_NOT_summing_regions.png, YCSB_CPU.png, 
> YCSB_GC_TIME.png, YCSB_IN_MEMORY_COMPACTION=NONE.ops.png, YCSB_MEMSTORE.png, 
> YCSB_OPs.png, YCSB_in-memory-compaction=NONE.ops.png, YCSB_load.png, 
> flamegraph-1072.1.svg, flamegraph-1072.2.svg, hbase-env.sh, hbase-site.xml, 
> hbase-site.xml, hits.png, hits_with_fp_scheduler.png, 
> lock.127.workloadc.20180402T200918Z.svg, 
> lock.2.memsize2.c.20180403T160257Z.svg, perregion.png, run_ycsb.sh, 
> total.png, tree.txt, workloadx, workloadx
>
>
> How does 2.0.0 compare to old versions? Is it faster, slower? There is a rumor 
> that it is much slower, and that the problem is the asyncwal writing. Does 
> in-memory compaction slow us down or speed us up? What happens when you 
> enable offheaping?
> Keep notes here in this umbrella issue. Need to be able to say something 
> about perf when 2.0.0 ships.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20542) Better heap utilization for IMC with MSLABs

2018-06-18 Thread Eshcar Hillel (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516253#comment-16516253
 ] 

Eshcar Hillel commented on HBASE-20542:
---

bq. Where is write lock released ?
The write lock is not released. Once the write lock is acquired, the segment is 
truly immutable: no update operation should read-lock it, and there is no need 
to release the write lock.

> Better heap utilization for IMC with MSLABs
> ---
>
> Key: HBASE-20542
> URL: https://issues.apache.org/jira/browse/HBASE-20542
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Eshcar Hillel
>Assignee: Eshcar Hillel
>Priority: Major
> Attachments: HBASE-20542.branch-2.001.patch, run.sh, workloada, 
> workloadc, workloadx, workloady
>
>
> Following HBASE-20188 we realized in-memory compaction combined with MSLABs 
> may suffer from heap under-utilization due to internal fragmentation. This 
> jira presents a solution to circumvent this problem. The main idea is to have 
> each update operation check if it will cause overflow in the active segment 
> *before* it is writing the new value (instead of checking the size after the 
> write is completed), and if it is then the active segment is atomically 
> swapped with a new empty segment, and is pushed (full-yet-not-overflowed) to 
> the compaction pipeline. Later on the IMC daemon will run its compaction 
> operation (flatten index/merge indices/data compaction) in the background. 
> Some subtle concurrency issues should be handled with care. We next elaborate 
> on them.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20542) Better heap utilization for IMC with MSLABs

2018-06-18 Thread Eshcar Hillel (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516187#comment-16516187
 ] 

Eshcar Hillel commented on HBASE-20542:
---

I'm having some trouble with RB; I will check this and see how I can upload the 
patch there

> Better heap utilization for IMC with MSLABs
> ---
>
> Key: HBASE-20542
> URL: https://issues.apache.org/jira/browse/HBASE-20542
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Eshcar Hillel
>Assignee: Eshcar Hillel
>Priority: Major
> Attachments: HBASE-20542.branch-2.001.patch, run.sh, workloada, 
> workloadc, workloadx, workloady
>
>
> Following HBASE-20188 we realized in-memory compaction combined with MSLABs 
> may suffer from heap under-utilization due to internal fragmentation. This 
> jira presents a solution to circumvent this problem. The main idea is to have 
> each update operation check if it will cause overflow in the active segment 
> *before* it is writing the new value (instead of checking the size after the 
> write is completed), and if it is then the active segment is atomically 
> swapped with a new empty segment, and is pushed (full-yet-not-overflowed) to 
> the compaction pipeline. Later on the IMC daemon will run its compaction 
> operation (flatten index/merge indices/data compaction) in the background. 
> Some subtle concurrency issues should be handled with care. We next elaborate 
> on them.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20542) Better heap utilization for IMC with MSLABs

2018-06-18 Thread Eshcar Hillel (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516186#comment-16516186
 ] 

Eshcar Hillel commented on HBASE-20542:
---

Attaching the YCSB scripts used for benchmarking.
Two sets of runs [^run.sh].
First, a write-only zipfian workload [^workloadx] with a 10-region pre-split, 
followed by a read-only zipfian workload [^workloady] reading only one column.
Second, the standard uniform load (a), mixed read-write [^workloada], and 
read-only [^workloadc] reading all columns.
This is a comparison of the average throughput and the lift vs. no-IMC:

|| comp || index || workloadx || workloady || load || workloada || workloadc ||
| NONE | - | 49,369 | 17,682 | 11,010 | 10,468 | 7,779 |
| BASIC | CAM | 57,965 | 17,132 | 11,854 | 10,318 | 7,552 |
| | | +17.41% | -3.11% | +7.67% | -1.44% | -2.91% |
| BASIC | CCM | 52,296 | 16,644 | 12,140 | 9,705 | 7,465 |
| | | +6% | -6% | +10% | -7% | -4% |



> Better heap utilization for IMC with MSLABs
> ---
>
> Key: HBASE-20542
> URL: https://issues.apache.org/jira/browse/HBASE-20542
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Eshcar Hillel
>Assignee: Eshcar Hillel
>Priority: Major
> Attachments: HBASE-20542.branch-2.001.patch, run.sh, workloada, 
> workloadc, workloadx, workloady
>
>
> Following HBASE-20188 we realized in-memory compaction combined with MSLABs 
> may suffer from heap under-utilization due to internal fragmentation. This 
> jira presents a solution to circumvent this problem. The main idea is to have 
> each update operation check if it will cause overflow in the active segment 
> *before* it is writing the new value (instead of checking the size after the 
> write is completed), and if it is then the active segment is atomically 
> swapped with a new empty segment, and is pushed (full-yet-not-overflowed) to 
> the compaction pipeline. Later on the IMC daemon will run its compaction 
> operation (flatten index/merge indices/data compaction) in the background. 
> Some subtle concurrency issues should be handled with care. We next elaborate 
> on them.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20542) Better heap utilization for IMC with MSLABs

2018-06-18 Thread Eshcar Hillel (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eshcar Hillel updated HBASE-20542:
--
Attachment: run.sh
workloadx
workloady
workloadc
workloada

> Better heap utilization for IMC with MSLABs
> ---
>
> Key: HBASE-20542
> URL: https://issues.apache.org/jira/browse/HBASE-20542
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Eshcar Hillel
>Assignee: Eshcar Hillel
>Priority: Major
> Attachments: HBASE-20542.branch-2.001.patch, run.sh, workloada, 
> workloadc, workloadx, workloady
>
>
> Following HBASE-20188 we realized in-memory compaction combined with MSLABs 
> may suffer from heap under-utilization due to internal fragmentation. This 
> jira presents a solution to circumvent this problem. The main idea is to have 
> each update operation check if it will cause overflow in the active segment 
> *before* it is writing the new value (instead of checking the size after the 
> write is completed), and if it is then the active segment is atomically 
> swapped with a new empty segment, and is pushed (full-yet-not-overflowed) to 
> the compaction pipeline. Later on the IMC daemon will run its compaction 
> operation (flatten index/merge indices/data compaction) in the background. 
> Some subtle concurrency issues should be handled with care. We next elaborate 
> on them.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20542) Better heap utilization for IMC with MSLABs

2018-06-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516172#comment-16516172
 ] 

Ted Yu commented on HBASE-20542:


{code}
+  public void waitForUpdates() {
+if(!updatesLock.isWriteLocked()) {
+  updatesLock.writeLock().lock();
{code}
Where is write lock released ?

> Better heap utilization for IMC with MSLABs
> ---
>
> Key: HBASE-20542
> URL: https://issues.apache.org/jira/browse/HBASE-20542
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Eshcar Hillel
>Assignee: Eshcar Hillel
>Priority: Major
> Attachments: HBASE-20542.branch-2.001.patch
>
>
> Following HBASE-20188 we realized in-memory compaction combined with MSLABs 
> may suffer from heap under-utilization due to internal fragmentation. This 
> jira presents a solution to circumvent this problem. The main idea is to have 
> each update operation check if it will cause overflow in the active segment 
> *before* it is writing the new value (instead of checking the size after the 
> write is completed), and if it is then the active segment is atomically 
> swapped with a new empty segment, and is pushed (full-yet-not-overflowed) to 
> the compaction pipeline. Later on the IMC daemon will run its compaction 
> operation (flatten index/merge indices/data compaction) in the background. 
> Some subtle concurrency issues should be handled with care. We next elaborate 
> on them.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20542) Better heap utilization for IMC with MSLABs

2018-06-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516158#comment-16516158
 ] 

Ted Yu commented on HBASE-20542:


Took a quick look.
{code}
+while(!succ) {
+  currentActive = getActive();
+  succ = preUpdate(currentActive, cell, memstoreSizing);
{code}
Potentially how many {{preUpdate}} calls would take place when there is 
contention?
{code}
+   * @return true if the cell can be added to the
*/
   @Override
-  protected void checkActiveSize() {
-return;
+  protected boolean checkAndAddToActiveSize(MutableSegment currActive, Cell 
cellToAdd,
{code}
The javadoc for @return is incomplete.


> Better heap utilization for IMC with MSLABs
> ---
>
> Key: HBASE-20542
> URL: https://issues.apache.org/jira/browse/HBASE-20542
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Eshcar Hillel
>Assignee: Eshcar Hillel
>Priority: Major
> Attachments: HBASE-20542.branch-2.001.patch
>
>
> Following HBASE-20188 we realized in-memory compaction combined with MSLABs 
> may suffer from heap under-utilization due to internal fragmentation. This 
> jira presents a solution to circumvent this problem. The main idea is to have 
> each update operation check if it will cause overflow in the active segment 
> *before* it is writing the new value (instead of checking the size after the 
> write is completed), and if it is then the active segment is atomically 
> swapped with a new empty segment, and is pushed (full-yet-not-overflowed) to 
> the compaction pipeline. Later on the IMC daemon will run its compaction 
> operation (flatten index/merge indices/data compaction) in the background. 
> Some subtle concurrency issues should be handled with care. We next elaborate 
> on them.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20542) Better heap utilization for IMC with MSLABs

2018-06-18 Thread Eshcar Hillel (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516156#comment-16516156
 ] 

Eshcar Hillel commented on HBASE-20542:
---

Patch is attached.
To reduce internal fragmentation, the size of the active segment is set to the 
size of one MSLAB chunk (by default 2MB).
An add operation is supplemented with pre-update and post-update procedures.
The pre-update procedure atomically increases the size of the segment if the 
increment does not exceed the segment size threshold, and then continues with 
the normal path of updating the memstore.
If the increment would exceed the segment size threshold, the size is not 
increased and instead 
(1) the segment is flushed into the compaction pipeline,
(2) a new active segment is created, 
(3) an IMC task is scheduled in the background,
(4) the operation re-runs the pre-update procedure, this time against the new 
active segment.

This change calls for an additional optimization.
The IMC no longer needs to acquire the region-level updates lock. Instead we 
use a segment-level read-write lock to synchronize IMC with concurrent update 
operations. This is better since, with the new solution, IMC only needs to wait 
for those few operations that have already updated the size of the segment in 
the pre-update procedure but are still updating the segment skip list, and it 
does not need to wait for operations on other stores. Moreover, update 
operations do not wait for an in-memory flush to complete as before.
To synchronize, update operations take the read lock of the segment they are 
updating in the pre-update procedure, and release it in the post-update 
procedure. The IMC thread takes the write lock of each segment it is 
compacting. This ensures all updates that started before the in-memory flush 
have completed.

I will upload the patch also in RB.
Feel free to ask questions and comment.
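
For readers following along, here is a minimal sketch of the 
pre-update/post-update protocol described above, assuming an AtomicLong size 
counter and a per-segment ReentrantReadWriteLock; all names and internals are 
illustrative, not the actual patch code:

{code:java}
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class SegmentSketch {
  final AtomicLong size = new AtomicLong();
  final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
}

class ActiveSegmentSketch {
  static final long THRESHOLD = 2L * 1024 * 1024; // one MSLAB chunk by default
  private volatile SegmentSketch active = new SegmentSketch();

  /** Pre-update: reserve room for the cell and read-lock the segment. */
  SegmentSketch preUpdate(long cellSize) {
    while (true) {
      SegmentSketch seg = active;
      long cur = seg.size.get();
      if (cur + cellSize > THRESHOLD) {
        inMemoryFlush(seg);         // push the full segment, swap in a new one
        continue;                   // retry against the new active segment
      }
      if (seg.size.compareAndSet(cur, cur + cellSize)) {
        seg.lock.readLock().lock(); // blocks the IMC write lock until post-update
        return seg;                 // caller now writes the cell into 'seg'
      }
    }
  }

  /** Post-update: the cell is in the segment; let IMC proceed. */
  void postUpdate(SegmentSketch seg) {
    seg.lock.readLock().unlock();
  }

  private synchronized void inMemoryFlush(SegmentSketch seg) {
    if (active == seg) {            // only the first thread performs the swap
      active = new SegmentSketch();
      // Here: push 'seg' to the compaction pipeline and schedule the IMC task.
      // The IMC thread later takes seg.lock.writeLock() and never releases it,
      // since the segment is immutable from then on (see the comment above).
    }
  }
}
{code}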


> Better heap utilization for IMC with MSLABs
> ---
>
> Key: HBASE-20542
> URL: https://issues.apache.org/jira/browse/HBASE-20542
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Eshcar Hillel
>Assignee: Eshcar Hillel
>Priority: Major
> Attachments: HBASE-20542.branch-2.001.patch
>
>
> Following HBASE-20188 we realized in-memory compaction combined with MSLABs 
> may suffer from heap under-utilization due to internal fragmentation. This 
> jira presents a solution to circumvent this problem. The main idea is to have 
> each update operation check if it will cause overflow in the active segment 
> *before* it is writing the new value (instead of checking the size after the 
> write is completed), and if it is then the active segment is atomically 
> swapped with a new empty segment, and is pushed (full-yet-not-overflowed) to 
> the compaction pipeline. Later on the IMC deamon will run its compaction 
> operation (flatten index/merge indices/data compaction) in the background. 
> Some subtle concurrency issues should be handled with care. We next elaborate 
> on them.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19682) Use Collections.emptyList() For Empty List Values

2018-06-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516148#comment-16516148
 ] 

Hadoop QA commented on HBASE-19682:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
40s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
43s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 4s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
22s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
55s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
59s{color} | {color:green} hbase-server: The patch generated 0 new + 157 
unchanged - 1 fixed = 157 total (was 158) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
21s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
9m 22s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}169m 
43s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}211m 32s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-19682 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12916901/HBASE-19682.5.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 83c7444d38d6 4.4.0-116-generic #140-Ubuntu SMP Mon Feb 12 
21:23:04 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / ac5bb8155b |
| maven | version: Apache Maven 3.5.3 
(3383c37e1f9e9b3bc3df5050c29c8aff9f295297; 2018-02-24T19:49:05Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13301/testReport/ |
| Max. process+thread count | 4528 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | 

[jira] [Commented] (HBASE-20749) Upgrade our use of checkstyle to 8.6+

2018-06-18 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516125#comment-16516125
 ] 

Sean Busbey commented on HBASE-20749:
-

The current checkstyle version is 5.10.1.

> Upgrade our use of checkstyle to 8.6+
> -
>
> Key: HBASE-20749
> URL: https://issues.apache.org/jira/browse/HBASE-20749
> Project: HBase
>  Issue Type: Improvement
>  Components: build, community
>Reporter: Sean Busbey
>Priority: Minor
>
> We should upgrade our checkstyle version to 8.6 or later so we can use the 
> "match violation message to this regex" feature for suppression. That will 
> allow us to make sure we don't regress on HTrace v3 vs v4 APIs (came up in 
> HBASE-20332).
> We're currently blocked on upgrading to 8.3+ by [checkstyle 
> #5279|https://github.com/checkstyle/checkstyle/issues/5279], a regression 
> that flags our use of both the "separate import groups" and "put static 
> imports over here" configs as an error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20749) Upgrade our use of checkstyle to 8.6+

2018-06-18 Thread Sean Busbey (JIRA)
Sean Busbey created HBASE-20749:
---

 Summary: Upgrade our use of checkstyle to 8.6+
 Key: HBASE-20749
 URL: https://issues.apache.org/jira/browse/HBASE-20749
 Project: HBase
  Issue Type: Improvement
  Components: build, community
Reporter: Sean Busbey


We should upgrade our checkstyle version to 8.6 or later so we can use the 
"match violation message to this regex" feature for suppression. That will 
allow us to make sure we don't regress on HTrace v3 vs v4 APIs (came up in 
HBASE-20332).

We're currently blocked on upgrading to 8.3+ by [checkstyle 
#5279|https://github.com/checkstyle/checkstyle/issues/5279], a regression that 
flags our use of both the "separate import groups" and "put static imports over 
here" configs as an error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20332) shaded mapreduce module shouldn't include hadoop

2018-06-18 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516122#comment-16516122
 ] 

Sean Busbey commented on HBASE-20332:
-

thanks!

filed HBASE-20749.

> shaded mapreduce module shouldn't include hadoop
> 
>
> Key: HBASE-20332
> URL: https://issues.apache.org/jira/browse/HBASE-20332
> Project: HBase
>  Issue Type: Sub-task
>  Components: mapreduce, shading
>Affects Versions: 2.0.0
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Critical
> Fix For: 3.0.0, 2.1.0
>
> Attachments: HBASE-20332.0.patch, HBASE-20332.1.WIP.patch, 
> HBASE-20332.2.WIP.patch, HBASE-20332.3.patch, HBASE-20332.4.patch, 
> HBASE-20332.5.patch, HBASE-20332.6.patch, HBASE-20332.7.patch
>
>
> AFAICT, we should just entirely skip including hadoop in our shaded mapreduce 
> module
> 1) Folks expect to run yarn / mr apps via {{hadoop jar}} / {{yarn jar}}
> 2) those commands include all the needed Hadoop jars in your classpath by 
> default (both client side and in the containers)
> 3) If you try to use "user classpath first" for your job as a workaround 
> (e.g. for some library your application needs that hadoop provides) then our 
> inclusion of *some but not all* hadoop classes causes everything to fall 
> over because of mixing rewritten and non-rewritten hadoop classes
> 4) if you don't use "user classpath first" then all of our 
> non-relocated-but-still-shaded hadoop classes are ignored anyway so we're 
> just wasting space



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20542) Better heap utilization for IMC with MSLABs

2018-06-18 Thread Eshcar Hillel (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eshcar Hillel updated HBASE-20542:
--
Attachment: HBASE-20542.branch-2.001.patch
Status: Patch Available  (was: Open)

> Better heap utilization for IMC with MSLABs
> ---
>
> Key: HBASE-20542
> URL: https://issues.apache.org/jira/browse/HBASE-20542
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Eshcar Hillel
>Assignee: Eshcar Hillel
>Priority: Major
> Attachments: HBASE-20542.branch-2.001.patch
>
>
> Following HBASE-20188 we realized in-memory compaction combined with MSLABs 
> may suffer from heap under-utilization due to internal fragmentation. This 
> jira presents a solution to circumvent this problem. The main idea is to have 
> each update operation check if it will cause overflow in the active segment 
> *before* it is writing the new value (instead of checking the size after the 
> write is completed), and if it is then the active segment is atomically 
> swapped with a new empty segment, and is pushed (full-yet-not-overflowed) to 
> the compaction pipeline. Later on the IMC daemon will run its compaction 
> operation (flatten index/merge indices/data compaction) in the background. 
> Some subtle concurrency issues should be handled with care. We next elaborate 
> on them.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20332) shaded mapreduce module shouldn't include hadoop

2018-06-18 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516109#comment-16516109
 ] 

Mike Drob commented on HBASE-20332:
---

Please file a follow on jira for upgrading our checkstyle version and let me 
know when that exists.

> shaded mapreduce module shouldn't include hadoop
> 
>
> Key: HBASE-20332
> URL: https://issues.apache.org/jira/browse/HBASE-20332
> Project: HBase
>  Issue Type: Sub-task
>  Components: mapreduce, shading
>Affects Versions: 2.0.0
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Critical
> Fix For: 3.0.0, 2.1.0
>
> Attachments: HBASE-20332.0.patch, HBASE-20332.1.WIP.patch, 
> HBASE-20332.2.WIP.patch, HBASE-20332.3.patch, HBASE-20332.4.patch, 
> HBASE-20332.5.patch, HBASE-20332.6.patch, HBASE-20332.7.patch
>
>
> AFAICT, we should just entirely skip including hadoop in our shaded mapreduce 
> module
> 1) Folks expect to run yarn / mr apps via {{hadoop jar}} / {{yarn jar}}
> 2) those commands include all the needed Hadoop jars in your classpath by 
> default (both client side and in the containers)
> 3) If you try to use "user classpath first" for your job as a workaround 
> (e.g. for some library your application needs that hadoop provides) then our 
> inclusion of *some but not all* hadoop classes causes everything to fall 
> over because of mixing rewritten and non-rewritten hadoop classes
> 4) if you don't use "user classpath first" then all of our 
> non-relocated-but-still-shaded hadoop classes are ignored anyway so we're 
> just wasting space



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20332) shaded mapreduce module shouldn't include hadoop

2018-06-18 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516108#comment-16516108
 ] 

Mike Drob commented on HBASE-20332:
---

{code:title=checkstyle-suppressions.xml}
+  TODO Update to use the message suppression filter once we cna update
{code}
s/cna/can

{code:title=checkstyle.xml}
+  TODO include the htrace package once we can upgrade
{code}
specifically call this out as o.a.htrace? Otherwise it's not clear since we 
already disallow o.htrace later.

Can fix on commit, +1 to the patch.

> shaded mapreduce module shouldn't include hadoop
> 
>
> Key: HBASE-20332
> URL: https://issues.apache.org/jira/browse/HBASE-20332
> Project: HBase
>  Issue Type: Sub-task
>  Components: mapreduce, shading
>Affects Versions: 2.0.0
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Critical
> Fix For: 3.0.0, 2.1.0
>
> Attachments: HBASE-20332.0.patch, HBASE-20332.1.WIP.patch, 
> HBASE-20332.2.WIP.patch, HBASE-20332.3.patch, HBASE-20332.4.patch, 
> HBASE-20332.5.patch, HBASE-20332.6.patch, HBASE-20332.7.patch
>
>
> AFAICT, we should just entirely skip including hadoop in our shaded mapreduce 
> module
> 1) Folks expect to run yarn / mr apps via {{hadoop jar}} / {{yarn jar}}
> 2) those commands include all the needed Hadoop jars in your classpath by 
> default (both client side and in the containers)
> 3) If you try to use "user classpath first" for your job as a workaround 
> (e.g. for some library your application needs that hadoop provides) then our 
> inclusion of *some but not all* hadoop classes causes everything to fall 
> over because of mixing rewritten and non-rewritten hadoop classes
> 4) if you don't use "user classpath first" then all of our 
> non-relocated-but-still-shaded hadoop classes are ignored anyway so we're 
> just wasting space



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20642) IntegrationTestDDLMasterFailover throws 'InvalidFamilyOperationException

2018-06-18 Thread Ankit Singhal (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516023#comment-16516023
 ] 

Ankit Singhal commented on HBASE-20642:
---

[~stack], [~elserj], [~mdrob], attaching a patch for your review, guys.
* Added a test case which reproduces the issue
* Fixed client-side nonce generation for retry calls
* Moved pre-checks on the server under nonce checks.

> IntegrationTestDDLMasterFailover throws 'InvalidFamilyOperationException 
> -
>
> Key: HBASE-20642
> URL: https://issues.apache.org/jira/browse/HBASE-20642
> Project: HBase
>  Issue Type: Bug
>Reporter: Ankit Singhal
>Assignee: Ankit Singhal
>Priority: Major
> Attachments: HBASE-20642.001.patch, HBASE-20642.patch
>
>
> [~romil.choksi] reported that IntegrationTestDDLMasterFailover is failing 
> while adding column family during the time master is restarting.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20642) IntegrationTestDDLMasterFailover throws 'InvalidFamilyOperationException

2018-06-18 Thread Ankit Singhal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankit Singhal updated HBASE-20642:
--
Attachment: HBASE-20642.001.patch

> IntegrationTestDDLMasterFailover throws 'InvalidFamilyOperationException 
> -
>
> Key: HBASE-20642
> URL: https://issues.apache.org/jira/browse/HBASE-20642
> Project: HBase
>  Issue Type: Bug
>Reporter: Ankit Singhal
>Assignee: Ankit Singhal
>Priority: Major
> Attachments: HBASE-20642.001.patch, HBASE-20642.patch
>
>
> [~romil.choksi] reported that IntegrationTestDDLMasterFailover is failing 
> while adding column family during the time master is restarting.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20618) Skip large rows instead of throwing an exception to client

2018-06-18 Thread churro morales (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516009#comment-16516009
 ] 

churro morales commented on HBASE-20618:


That would be ideal, but we don't have any nice way of sending the rowkey 
of the large row back to the client. I definitely don't want to parse the 
exception message for it.

> Skip large rows instead of throwing an exception to client
> --
>
> Key: HBASE-20618
> URL: https://issues.apache.org/jira/browse/HBASE-20618
> Project: HBase
>  Issue Type: New Feature
>Reporter: Swapna
>Priority: Minor
> Fix For: 3.0.0, 2.0.1, 1.4.6
>
> Attachments: HBASE-20618.hbasemaster.v01.patch, 
> HBASE-20618.hbasemaster.v02.patch, HBASE-20618.v1.branch-1.patch, 
> HBASE-20618.v1.branch-1.patch
>
>
> Currently HBase supports throwing RowTooBigException in case a row's data in 
> one of its column families exceeds the configured maximum
> https://issues.apache.org/jira/browse/HBASE-10925?attachmentOrder=desc
> We have some bad rows growing very large. We need a way to skip these rows 
> for most of our jobs.
> Some of the options we considered:
> Option 1:
> The HBase client handles the exception and restarts the scanner past the bad 
> row by capturing the row key where it failed. This can be done by adding the 
> rowkey to the exception stack trace, which seems brittle. The client would 
> ignore the setting if it is upgraded before the server.
> Option 2:
> Skip big rows on the server. Go with a server-level config similar to 
> "hbase.table.max.rowsize", or make it request-based by changing the scan 
> request API. If allowed per request, based on the scan request config, the 
> client will have to ignore the setting if it is upgraded before the server.
> {code}
> try {
>   populateResult(results, this.storeHeap, scannerContext, current);
> } catch (RowTooBigException e) {
>   LOG.info("Row exceeded the limit in storeheap. Skipping row with key: "
>       + Bytes.toString(current.getRowArray()));
>   this.storeHeap.reseek(PrivateCellUtil.createLastOnRow(current));
>   results.clear();
>   scannerContext.clearProgress();
>   continue;
> }
> {code}
> Prefer option 2 with the server-level config. Please share your inputs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20618) Skip large rows instead of throwing an exception to client

2018-06-18 Thread Swapna (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516001#comment-16516001
 ] 

Swapna commented on HBASE-20618:


[~elserj], [~eclark]

Any suggestions or alternatives? Do you prefer option 1 (handling the same on 
the client side)?

 

> Skip large rows instead of throwing an exception to client
> --
>
> Key: HBASE-20618
> URL: https://issues.apache.org/jira/browse/HBASE-20618
> Project: HBase
>  Issue Type: New Feature
>Reporter: Swapna
>Priority: Minor
> Fix For: 3.0.0, 2.0.1, 1.4.6
>
> Attachments: HBASE-20618.hbasemaster.v01.patch, 
> HBASE-20618.hbasemaster.v02.patch, HBASE-20618.v1.branch-1.patch, 
> HBASE-20618.v1.branch-1.patch
>
>
> Currently HBase supports throwing RowTooBigException in case a row's data in 
> one of its column families exceeds the configured maximum
> https://issues.apache.org/jira/browse/HBASE-10925?attachmentOrder=desc
> We have some bad rows growing very large. We need a way to skip these rows 
> for most of our jobs.
> Some of the options we considered:
> Option 1:
> The HBase client handles the exception and restarts the scanner past the bad 
> row by capturing the row key where it failed. This can be done by adding the 
> rowkey to the exception stack trace, which seems brittle. The client would 
> ignore the setting if it is upgraded before the server.
> Option 2:
> Skip big rows on the server. Go with a server-level config similar to 
> "hbase.table.max.rowsize", or make it request-based by changing the scan 
> request API. If allowed per request, based on the scan request config, the 
> client will have to ignore the setting if it is upgraded before the server.
> {code}
> try {
>   populateResult(results, this.storeHeap, scannerContext, current);
> } catch (RowTooBigException e) {
>   LOG.info("Row exceeded the limit in storeheap. Skipping row with key: "
>       + Bytes.toString(current.getRowArray()));
>   this.storeHeap.reseek(PrivateCellUtil.createLastOnRow(current));
>   results.clear();
>   scannerContext.clearProgress();
>   continue;
> }
> {code}
> Prefer option 2 with the server-level config. Please share your inputs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20734) Colocate recovered edits directory with hbase.wal.dir

2018-06-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515979#comment-16515979
 ] 

Ted Yu commented on HBASE-20734:


WALSplitter is annotated @InterfaceAudience.Private

It has access to both FileSystems (wal and root). By isolating the change 
to WALSplitter, there is a chance we don't need to modify the Region interface.

> Colocate recovered edits directory with hbase.wal.dir
> -
>
> Key: HBASE-20734
> URL: https://issues.apache.org/jira/browse/HBASE-20734
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR, Recovery, wal
>Reporter: Ted Yu
>Priority: Major
> Fix For: 3.0.0
>
>
> During investigation of HBASE-20723, I realized that we wouldn't get the best 
> performance w.r.t. recovered edits when hbase.wal.dir is configured to be on 
> different (fast) media than the hbase rootdir, since the recovered edits 
> directory currently lives under the rootdir.
> Such a setup may not result in fast recovery when there is a region server 
> failover.
> This issue is to find a proper (hopefully backward-compatible) way of 
> colocating the recovered edits directory with hbase.wal.dir.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20734) Colocate recovered edits directory with hbase.wal.dir

2018-06-18 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20734:
---
Fix Version/s: 3.0.0

> Colocate recovered edits directory with hbase.wal.dir
> -
>
> Key: HBASE-20734
> URL: https://issues.apache.org/jira/browse/HBASE-20734
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR, Recovery, wal
>Reporter: Ted Yu
>Priority: Major
> Fix For: 3.0.0
>
>
> During investigation of HBASE-20723, I realized that we wouldn't get the best 
> performance w.r.t. recovered edits when hbase.wal.dir is configured to be on 
> different (fast) media than the hbase rootdir, since the recovered edits 
> directory currently lives under the rootdir.
> Such a setup may not result in fast recovery when there is a region server 
> failover.
> This issue is to find a proper (hopefully backward-compatible) way of 
> colocating the recovered edits directory with hbase.wal.dir.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20704) Sometimes some compacted storefiles are not archived on region close

2018-06-18 Thread Anoop Sam John (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515960#comment-16515960
 ] 

Anoop Sam John commented on HBASE-20704:


If the client end does a retry, I believe that is acceptable.

> Sometimes some compacted storefiles are not archived on region close
> 
>
> Key: HBASE-20704
> URL: https://issues.apache.org/jira/browse/HBASE-20704
> Project: HBase
>  Issue Type: Bug
>  Components: Compaction
>Affects Versions: 3.0.0, 1.3.0, 1.4.0, 1.5.0, 2.0.0
>Reporter: Francis Liu
>Assignee: Francis Liu
>Priority: Critical
> Attachments: HBASE-20704.001.patch, HBASE-20704.002.patch
>
>
> During region close, compacted files which have not yet been archived by the 
> discharger are archived as part of the region closing process. It is 
> important that these files are wholly archived to ensure data consistency; 
> i.e., a storefile containing delete tombstones can be archived while older 
> storefiles containing cells that were supposed to be deleted are left 
> unarchived, thereby undeleting those cells. 
> On region close, a compacted storefile is skipped from archiving if it has 
> read references (i.e., open scanners). This behavior is correct for when the 
> discharger chore runs, but on region close consistency is of course more 
> important, so we should add a special case to ignore any references on the 
> storefile and go ahead and archive it. 
> The attached patch contains a unit test that reproduces the problem and the 
> proposed fix.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19064) Synchronous replication for HBase

2018-06-18 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515933#comment-16515933
 ] 

Hudson commented on HBASE-19064:


Results for branch HBASE-19064
[build #165 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-19064/165/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-19064/165//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-19064/165//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-19064/165//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Synchronous replication for HBase
> -
>
> Key: HBASE-19064
> URL: https://issues.apache.org/jira/browse/HBASE-19064
> Project: HBase
>  Issue Type: New Feature
>  Components: Replication
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0
>
>
> The guys from Alibaba made a presentation at HBaseCon Asia about 
> synchronous replication for HBase. We (Xiaomi) think this is a very useful 
> feature for HBase, so we want to bring it into the community version.
> This is a big feature, so we plan to do it in a feature branch.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19611) Review of QuotaRetriever Class

2018-06-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515918#comment-16515918
 ] 

Hadoop QA commented on HBASE-19611:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
42s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
35s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
30s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
48s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
52s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 34s{color} 
| {color:red} hbase-client generated 1 new + 102 unchanged - 1 fixed = 103 
total (was 103) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
29s{color} | {color:green} hbase-client: The patch generated 0 new + 0 
unchanged - 4 fixed = 0 total (was 4) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
50s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
9m 57s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
56s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
 9s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 37m 23s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-19611 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12903575/HBASE-19611.1.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux e01d9ca93d38 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh
 |
| git revision | master / ac5bb8155b |
| maven | version: Apache Maven 3.5.3 
(3383c37e1f9e9b3bc3df5050c29c8aff9f295297; 2018-02-24T19:49:05Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
| javac | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13293/artifact/patchprocess/diff-compile-javac-hbase-client.txt
 |
|  Test Results | 

[jira] [Commented] (HBASE-20208) Review of SequenceIdAccounting.java

2018-06-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515908#comment-16515908
 ] 

Ted Yu commented on HBASE-20208:


{code}
[ERROR] 
/testptch/hbase/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceIdAccounting.java:[31,39]
 package org.apache.commons.collections4 does not exist
[INFO] 1 error
{code}

> Review of SequenceIdAccounting.java
> ---
>
> Key: HBASE-20208
> URL: https://issues.apache.org/jira/browse/HBASE-20208
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: 20208.1.patch, HBASE-20208.1.patch
>
>
> # Fix checkstyle warnings
> # Use re-usable libraries where possible
> # Improve Map Access
> What got my attention on this class was:
> {code}
> for (Map.Entry<byte[], Long> e : sequenceids.entrySet()) {
>   long oldestFlushing = Long.MAX_VALUE;
>   long oldestUnflushed = Long.MAX_VALUE;
>   if (flushing != null && flushing.containsKey(e.getKey())) {
>     oldestFlushing = flushing.get(e.getKey());
>   }
>   if (unflushed != null && unflushed.containsKey(e.getKey())) {
>     oldestUnflushed = unflushed.get(e.getKey());
>   }
>   long min = Math.min(oldestFlushing, oldestUnflushed);
>   if (min <= e.getValue()) {
>     return false;
>   }
> }
> {code}
> Here, the code calls _containsKey_ and then _get_ on the two maps. It also 
> calls {{e.getKey()}} repeatedly.
> I propose changing this so that {{e.getKey()}} is called only once and, 
> instead of looking up an entry with _containsKey_ and then a _get_, simply 
> using _get_ once and checking for a null value to test for existence. It saves 
> two trips through each map on every loop iteration. See the sketch below.
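> A minimal sketch of the proposed rewrite (assuming the maps are 
> Map<byte[], Long>, as in the snippet above; the class and method names here 
> are made up for illustration):
> {code:java}
> import java.util.Map;
> 
> final class SequenceIdCheckSketch {
>   static boolean allLower(Map<byte[], Long> sequenceids,
>       Map<byte[], Long> flushing, Map<byte[], Long> unflushed) {
>     for (Map.Entry<byte[], Long> e : sequenceids.entrySet()) {
>       byte[] key = e.getKey();                                  // read the key once
>       Long f = (flushing != null) ? flushing.get(key) : null;   // one lookup, not two
>       Long u = (unflushed != null) ? unflushed.get(key) : null; // one lookup, not two
>       long min = Math.min(f != null ? f : Long.MAX_VALUE,
>           u != null ? u : Long.MAX_VALUE);
>       if (min <= e.getValue()) {
>         return false;
>       }
>     }
>     return true;
>   }
> }
> {code}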



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20208) Review of SequenceIdAccounting.java

2018-06-18 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20208:
---
Status: Open  (was: Patch Available)

> Review of SequenceIdAccounting.java
> ---
>
> Key: HBASE-20208
> URL: https://issues.apache.org/jira/browse/HBASE-20208
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: 20208.1.patch, HBASE-20208.1.patch
>
>
> # Fix checkstyle warnings
> # Use re-usable libraries where possible
> # Improve Map Access
> What got my attention on this class was:
> {code}
> for (Map.Entry<byte[], Long> e : sequenceids.entrySet()) {
>   long oldestFlushing = Long.MAX_VALUE;
>   long oldestUnflushed = Long.MAX_VALUE;
>   if (flushing != null && flushing.containsKey(e.getKey())) {
>     oldestFlushing = flushing.get(e.getKey());
>   }
>   if (unflushed != null && unflushed.containsKey(e.getKey())) {
>     oldestUnflushed = unflushed.get(e.getKey());
>   }
>   long min = Math.min(oldestFlushing, oldestUnflushed);
>   if (min <= e.getValue()) {
>     return false;
>   }
> }
> {code}
> Here, the code calls _containsKey_ and then _get_ on the two maps. It also 
> calls {{e.getKey()}} repeatedly.
> I propose changing this so that {{e.getKey()}} is called only once and, 
> instead of looking up an entry with _containsKey_ and then a _get_, simply 
> using _get_ once and checking for a null value to test for existence. It saves 
> two trips through each map on every loop iteration.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20748) HBaseContext bulkLoad: being able to use custom versions

2018-06-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515898#comment-16515898
 ] 

Ted Yu commented on HBASE-20748:


As Chia-ping commented on the PR, PRs are not integrated with the QA bot.
Look at the tail of 
https://builds.apache.org/job/PreCommit-HBASE-Build/13302/console to see what 
happened.

bulkLoadWithCustomVersions duplicates existing code. Please refactor the 
current bulkLoad method and include a unit test in the next patch.

> HBaseContext bulkLoad: being able to use custom versions
> 
>
> Key: HBASE-20748
> URL: https://issues.apache.org/jira/browse/HBASE-20748
> Project: HBase
>  Issue Type: Improvement
>  Components: spark
>Reporter: Charles PORROT
>Assignee: Charles PORROT
>Priority: Major
>  Labels: HBaseContext, bulkload, spark, versions
> Attachments: bulkLoadCustomVersions.scala
>
>
> The _bulkLoad_ methods of _class org.apache.hadoop.hbase.spark.HBaseContext_ 
> use the system's current time as the version of the cells to bulk-load. This 
> makes the method, and its twin _bulkLoadThinRows_, useless if you need to 
> use your own versioning system:
> {code:java}
> //Here is where we finally iterate through the data in this partition of the 
> //RDD that has been sorted and partitioned
> val wl = writeValueToHFile(
>   keyFamilyQualifier.rowKey, 
>   keyFamilyQualifier.family, 
>   keyFamilyQualifier.qualifier, 
>   cellValue, 
>   nowTimeStamp, 
>   fs, 
>   conn, 
>   localTableName, 
>   conf, 
>   familyHFileWriteOptionsMapInternal, 
>   hfileCompression, 
>   writerMap, 
>   stagingDir
> ){code}
>  
> Thus, I propose a third _bulkLoad_ method, based on the original method. 
> Instead of using an _Iterator(KeyFamilyQualifier, Array[Byte])_ as the basis 
> for the writes, this new method would use an _Iterator(KeyFamilyQualifier, 
> Array[Byte], Long_), with the _Long_ being the version.
>  
> Definition of _bulkLoad_:
> {code:java}
> def bulkLoad[T](
> rdd:RDD[T], 
> tableName: TableName, 
> flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte])], 
> stagingDir:String, 
> familyHFileWriteOptionsMap: util.Map[Array[Byte], FamilyHFileWriteOptions] = 
> new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
> compactionExclude: Boolean = false, 
> maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
> Definition of a _bulkLoadWithCustomVersions_ method:
> {code:java}
> def bulkLoadCustomVersions[T](rdd:RDD[T],
>   tableName: TableName,
>   flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte], 
> Long)],
>   stagingDir:String,
>   familyHFileWriteOptionsMap:
>   util.Map[Array[Byte], FamilyHFileWriteOptions] =
>   new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
>   compactionExclude: Boolean = false,
>   maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
> In case of illogical version (for instance, a negative version), the method 
> would throw back to the current timestamp.
> {code:java}
> val wl = writeValueToHFile(keyFamilyQualifier.rowKey,
>   keyFamilyQualifier.family,
>   keyFamilyQualifier.qualifier,
>   cellValue,
>   if (version > 0) version else nowTimeStamp,
>   fs,
>   conn,
>   localTableName,
>   conf,
>   familyHFileWriteOptionsMapInternal,
>   hfileCompression,
>   writerMap,
>   stagingDir){code}
> See the attached file for the file with the full proposed method.
>  
> +Edit:+
> The same could be done with bulkLoadThinRows: instead of a:
> {code:java}
> Iterator[Pair[ByteArrayWrapper, FamiliesQualifiersValues]]{code}
> We expect an:
> {code:java}
>  Iterator[Triple[ByteArrayWrapper, FamiliesQualifiersValues, Long]]{code}
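If the proposal lands with the signature above, a call might look like the following sketch (Scala, since HBaseContext is a Scala API; table name, staging directory, values and timestamps are invented; assumes a live SparkContext {{sc}} and an HBase {{config}}):
{code}
import org.apache.hadoop.hbase.TableName
import org.apache.hadoop.hbase.spark.{HBaseContext, KeyFamilyQualifier}
import org.apache.hadoop.hbase.util.Bytes

val hbaseContext = new HBaseContext(sc, config)

// Each record carries its own version as the trailing Long.
val rdd = sc.parallelize(Seq(
  (Bytes.toBytes("row1"), Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v1"), 1400000000000L),
  (Bytes.toBytes("row2"), Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v2"), 1400000001000L)))

hbaseContext.bulkLoadCustomVersions[(Array[Byte], Array[Byte], Array[Byte], Array[Byte], Long)](
  rdd,
  TableName.valueOf("exampleTable"),
  t => Iterator((new KeyFamilyQualifier(t._1, t._2, t._3), t._4, t._5)),
  "/tmp/bulkload-staging")
{code}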



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20208) Review of SequenceIdAccounting.java

2018-06-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515899#comment-16515899
 ] 

Hadoop QA commented on HBASE-20208:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
59s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
44s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
53s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
47s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  1m 
44s{color} | {color:red} root in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
24s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 24s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m  
7s{color} | {color:red} hbase-server: The patch generated 1 new + 0 unchanged - 
13 fixed = 1 total (was 13) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} shadedjars {color} | {color:red}  3m 
17s{color} | {color:red} patch has 30 errors when building our shaded 
downstream artifacts. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red}  1m 
42s{color} | {color:red} The patch causes 30 errors with Hadoop v2.7.4. {color} 
|
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red}  3m 
31s{color} | {color:red} The patch causes 30 errors with Hadoop v3.0.0. {color} 
|
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
27s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
29s{color} | {color:red} hbase-server generated 1 new + 0 unchanged - 0 fixed = 
1 total (was 0) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 28s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
10s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 27m 18s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20208 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12914961/20208.1.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux a706d9ea00c9 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / ac5bb8155b |
| maven | version: Apache Maven 3.5.3 
(3383c37e1f9e9b3bc3df5050c29c8aff9f295297; 2018-02-24T19:49:05Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
| mvninstall | 

[jira] [Commented] (HBASE-20738) failed with testGetPassword on Windows

2018-06-18 Thread star (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515893#comment-16515893
 ] 

star commented on HBASE-20738:
--

[~busbey] What should I do next? What do you think about the patch? Is it 
necessary to apply it somewhere?

> failed with testGetPassword on Windows 
> ---
>
> Key: HBASE-20738
> URL: https://issues.apache.org/jira/browse/HBASE-20738
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.0.0
>Reporter: star
>Assignee: star
>Priority: Minor
> Attachments: HBASE-20738.0.patch, test-hbase-configuration.patch
>
>
>      When running unit tests on Windows, testGetPassword from 
> TestHBaseConfiguration.java failed. The original code produces a URI path 
> like "jceks://file/root/hbase/others" on Unix systems, but produces an invalid 
> path like "jceks://fileD:\wordspace\hbase\others", which causes a 
> URISyntaxException later.
>      To solve the problem, just add a "/" prefix to the Windows local path and 
> replace all "\" with "/".
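The described normalization is tiny; a hedged sketch (the helper name is made up for illustration, not taken from the attached patch):
{code:java}
// Hypothetical illustration of the fix described above.
static String toJceksUri(String localPath) {
  String path = localPath.replace('\\', '/'); // "D:\wordspace\hbase\others" -> "D:/wordspace/hbase/others"
  if (!path.startsWith("/")) {
    path = "/" + path; // a Windows drive path needs a leading "/" to form a valid URI path
  }
  return "jceks://file" + path; // -> "jceks://file/D:/wordspace/hbase/others"
}
{code}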



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20748) HBaseContext bulkLoad: being able to use custom versions

2018-06-18 Thread Charles PORROT (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles PORROT updated HBASE-20748:
---
Release Note: 
PR #78: submitted the addition of a bulkLoadWithCustomVersions method to 
org.apache.hadoop.hbase.spark.HBaseContext.scala.

This will allow a bulkLoad with a custom version Long into HBase.
  Status: Patch Available  (was: Open)

> HBaseContext bulkLoad: being able to use custom versions
> 
>
> Key: HBASE-20748
> URL: https://issues.apache.org/jira/browse/HBASE-20748
> Project: HBase
>  Issue Type: Improvement
>  Components: spark
>Reporter: Charles PORROT
>Assignee: Charles PORROT
>Priority: Major
>  Labels: HBaseContext, bulkload, spark, versions
> Attachments: bulkLoadCustomVersions.scala
>
>
> The _bulkLoad_ methods of _class org.apache.hadoop.hbase.spark.HBaseContext_ 
> use the system's current time for the version of the cells to bulk-load. This 
> makes this method, and its twin _bulkLoadThinRows_, useless if you need to 
> use your own versioning system:
> {code:java}
> //Here is where we finally iterate through the data in this partition of the 
> //RDD that has been sorted and partitioned
> val wl = writeValueToHFile(
>   keyFamilyQualifier.rowKey, 
>   keyFamilyQualifier.family, 
>   keyFamilyQualifier.qualifier, 
>   cellValue, 
>   nowTimeStamp, 
>   fs, 
>   conn, 
>   localTableName, 
>   conf, 
>   familyHFileWriteOptionsMapInternal, 
>   hfileCompression, 
>   writerMap, 
>   stagingDir
> ){code}
>  
> Thus, I propose a third _bulkLoad_ method, based on the original method. 
> Instead of using an _Iterator(KeyFamilyQualifier, Array[Byte])_ as the basis 
> for the writes, this new method would use an _Iterator(KeyFamilyQualifier, 
> Array[Byte], Long_), with the _Long_ being the version.
>  
> Definition of _bulkLoad_:
> {code:java}
> def bulkLoad[T](
> rdd:RDD[T], 
> tableName: TableName, 
> flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte])], 
> stagingDir:String, 
> familyHFileWriteOptionsMap: util.Map[Array[Byte], FamilyHFileWriteOptions] = 
> new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
> compactionExclude: Boolean = false, 
> maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
> Definition of a _bulkLoadWithCustomVersions_ method:
> {code:java}
> def bulkLoadCustomVersions[T](rdd:RDD[T],
>   tableName: TableName,
>   flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte], 
> Long)],
>   stagingDir:String,
>   familyHFileWriteOptionsMap:
>   util.Map[Array[Byte], FamilyHFileWriteOptions] =
>   new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
>   compactionExclude: Boolean = false,
>   maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
> In case of illogical version (for instance, a negative version), the method 
> would throw back to the current timestamp.
> {code:java}
> val wl = writeValueToHFile(keyFamilyQualifier.rowKey,
>   keyFamilyQualifier.family,
>   keyFamilyQualifier.qualifier,
>   cellValue,
>   if (version > 0) version else nowTimeStamp,
>   fs,
>   conn,
>   localTableName,
>   conf,
>   familyHFileWriteOptionsMapInternal,
>   hfileCompression,
>   writerMap,
>   stagingDir){code}
> See the attached file for the file with the full proposed method.
>  
> +Edit:+
> The same could be done with bulkLoadThinRows: instead of a:
> {code:java}
> Iterator[Pair[ByteArrayWrapper, FamiliesQualifiersValues]]{code}
> We expect an:
> {code:java}
>  Iterator[Triple[ByteArrayWrapper, FamiliesQualifiersValues, Long]]{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20748) HBaseContext bulkLoad: being able to use custom versions

2018-06-18 Thread Charles PORROT (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515872#comment-16515872
 ] 

Charles PORROT commented on HBASE-20748:


PR has already been linked in the ticket (PR #78).

I will put the issue in "Patch Available."

Thank you for your time. I will try to follow the guidelines better if I have 
another contribution to make.

> HBaseContext bulkLoad: being able to use custom versions
> 
>
> Key: HBASE-20748
> URL: https://issues.apache.org/jira/browse/HBASE-20748
> Project: HBase
>  Issue Type: Improvement
>  Components: spark
>Reporter: Charles PORROT
>Assignee: Charles PORROT
>Priority: Major
>  Labels: HBaseContext, bulkload, spark, versions
> Attachments: bulkLoadCustomVersions.scala
>
>
> The _bulkLoad_ methods of _class org.apache.hadoop.hbase.spark.HBaseContext_ 
> use the system's current time for the version of the cells to bulk-load. This 
> makes this method, and its twin _bulkLoadThinRows_, useless if you need to 
> use your own versioning system:
> {code:java}
> //Here is where we finally iterate through the data in this partition of the 
> //RDD that has been sorted and partitioned
> val wl = writeValueToHFile(
>   keyFamilyQualifier.rowKey, 
>   keyFamilyQualifier.family, 
>   keyFamilyQualifier.qualifier, 
>   cellValue, 
>   nowTimeStamp, 
>   fs, 
>   conn, 
>   localTableName, 
>   conf, 
>   familyHFileWriteOptionsMapInternal, 
>   hfileCompression, 
>   writerMap, 
>   stagingDir
> ){code}
>  
> Thus, I propose a third _bulkLoad_ method, based on the original method. 
> Instead of using an _Iterator(KeyFamilyQualifier, Array[Byte])_ as the basis 
> for the writes, this new method would use an _Iterator(KeyFamilyQualifier, 
> Array[Byte], Long_), with the _Long_ being the version.
>  
> Definition of _bulkLoad_:
> {code:java}
> def bulkLoad[T](
> rdd:RDD[T], 
> tableName: TableName, 
> flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte])], 
> stagingDir:String, 
> familyHFileWriteOptionsMap: util.Map[Array[Byte], FamilyHFileWriteOptions] = 
> new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
> compactionExclude: Boolean = false, 
> maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
> Definition of a _bulkLoadWithCustomVersions_ method:
> {code:java}
> def bulkLoadCustomVersions[T](rdd:RDD[T],
>   tableName: TableName,
>   flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte], 
> Long)],
>   stagingDir:String,
>   familyHFileWriteOptionsMap:
>   util.Map[Array[Byte], FamilyHFileWriteOptions] =
>   new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
>   compactionExclude: Boolean = false,
>   maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
> In case of illogical version (for instance, a negative version), the method 
> would throw back to the current timestamp.
> {code:java}
> val wl = writeValueToHFile(keyFamilyQualifier.rowKey,
>   keyFamilyQualifier.family,
>   keyFamilyQualifier.qualifier,
>   cellValue,
>   if (version > 0) version else nowTimeStamp,
>   fs,
>   conn,
>   localTableName,
>   conf,
>   familyHFileWriteOptionsMapInternal,
>   hfileCompression,
>   writerMap,
>   stagingDir){code}
> See the attached file for the file with the full proposed method.
>  
> +Edit:+
> The same could be done with bulkLoadThinRows: instead of a:
> {code:java}
> Iterator[Pair[ByteArrayWrapper, FamiliesQualifiersValues]]{code}
> We expect an:
> {code:java}
>  Iterator[Triple[ByteArrayWrapper, FamiliesQualifiersValues, Long]]{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-18273) hbase_rotate_log in hbase-daemon.sh script not working for some JDK

2018-06-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515861#comment-16515861
 ] 

Hadoop QA commented on HBASE-18273:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
25s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} shelldocs {color} | {color:blue}  0m  
5s{color} | {color:blue} Shelldocs was not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} shellcheck {color} | {color:green}  0m 
 1s{color} | {color:green} There were no new shellcheck issues. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
45s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}  2m 13s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-18273 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12874799/HBASE-18273.2.patch |
| Optional Tests |  asflicense  shellcheck  shelldocs  |
| uname | Linux 15ec82c78e39 3.13.0-137-generic #186-Ubuntu SMP Mon Dec 4 
19:09:19 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / ac5bb8155b |
| maven | version: Apache Maven 3.5.3 
(3383c37e1f9e9b3bc3df5050c29c8aff9f295297; 2018-02-24T19:49:05Z) |
| shellcheck | v0.4.4 |
| Max. process+thread count | 47 (vs. ulimit of 1) |
| modules | C: . U: . |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13299/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> hbase_rotate_log in hbase-daemon.sh script not working for some JDK
> ---
>
> Key: HBASE-18273
> URL: https://issues.apache.org/jira/browse/HBASE-18273
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.2.6, 1.1.11, 2.0.0-alpha-1
>Reporter: Fangyuan Deng
>Assignee: Fangyuan Deng
>Priority: Major
> Attachments: HBASE-18273.0.patch, HBASE-18273.1.patch, 
> HBASE-18273.2.patch
>
>
> When restarting an HBase process, hbase_rotate_log $HBASE_LOGGC will rotate 
> GC logs.
> The code looks like this:
>  if [ -f "$log" ]; then # rotate logs
> while [ $num -gt 1 ]; do
> prev=`expr $num - 1`
> [ -f "$log.$prev" ] && mv -f "$log.$prev" "$log.$num"
> num=$prev
> done
> But some JDK versions add a suffix (.0) to the GC file, producing 
> hbase-xxx.gc.0 rather than hbase-xxx.gc.
> So I added a check before rotating:
>  if [ ! -f "$log" ]; then #for some jdk, gc log has a postfix 0
>   if [ -f "$log.0" ]; then
> mv -f "$log.0" "$log";
>   fi
> fi
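Put together, the rotation function might read as follows. This is a sketch that mirrors the snippets above; the surrounding variable handling is assumed, not copied from the committed script:
{code}
hbase_rotate_log ()
{
    log=$1
    num=5
    if [ -n "$2" ]; then
      num=$2
    fi
    # Some JDKs write the GC log as "$log.0"; fold it back to "$log" first.
    if [ ! -f "$log" ] && [ -f "$log.0" ]; then
      mv -f "$log.0" "$log"
    fi
    if [ -f "$log" ]; then # rotate logs
      while [ $num -gt 1 ]; do
        prev=`expr $num - 1`
        [ -f "$log.$prev" ] && mv -f "$log.$prev" "$log.$num"
        num=$prev
      done
      mv -f "$log" "$log.$num"
    fi
}
{code}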



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-15438) error: CF specified in importtsv.columns does not match with table CF

2018-06-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515857#comment-16515857
 ] 

Hadoop QA commented on HBASE-15438:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  4s{color} 
| {color:red} HBASE-15438 does not apply to master. Rebase required? Wrong 
Branch? See https://yetus.apache.org/documentation/0.7.0/precommit-patchnames 
for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HBASE-15438 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12799109/patches_2016-04-16.patch
 |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13298/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> error: CF specified in importtsv.columns does not match with table CF
> -
>
> Key: HBASE-15438
> URL: https://issues.apache.org/jira/browse/HBASE-15438
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.1.2
> Environment: HDP 2.3 Ubuntu 14
>Reporter: bob zhao
>Assignee: Mikhail Antonov
>Priority: Minor
>  Labels: easyfix, easytest
> Attachments: patches_2016-04-16.patch
>
>
> Trying to play with the HBase TSV import, I get this error:
> ERROR: Column Families [ Current,  Closing] specified in importtsv.columns 
> does not match with any of the table StocksB column families [Closing, 
> Current].
> The script is:
> hbase org.apache.hadoop.hbase.mapreduce.ImportTsv  
> -Dimporttsv.columns="HBASE_ROW_KEY, Current:Price, Closing:Price"   
> -Dimporttsv.bulk.output="/user/bob/storeDataFileOutput/"  StocksB   
> /user/bob/stocks.txt
> If I remove the space behind the comma and before the CF name, everything 
> is fine. 
> As a dev, I like to add a space there for easier reading and checking. 
> Please trim these CF names before processing, thanks!
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/ImportTsv.java
>  
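The requested fix amounts to trimming each entry of the comma-separated spec before matching families; a minimal sketch (illustration only, not the ImportTsv source):
{code:java}
String spec = "HBASE_ROW_KEY, Current:Price, Closing:Price";
String[] columns = spec.split(",");
for (int i = 0; i < columns.length; i++) {
  columns[i] = columns[i].trim(); // " Current:Price" -> "Current:Price"
}
{code}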



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-15832) memory leak in FSHLog.

2018-06-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515856#comment-16515856
 ] 

Hadoop QA commented on HBASE-15832:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  4s{color} 
| {color:red} HBASE-15832 does not apply to master. Rebase required? Wrong 
Branch? See https://yetus.apache.org/documentation/0.7.0/precommit-patchnames 
for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HBASE-15832 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12804772/HBASE-15832-v1.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13300/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> memory leak in FSHLog.
> --
>
> Key: HBASE-15832
> URL: https://issues.apache.org/jira/browse/HBASE-15832
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.1.2
>Reporter: Jeongdae Kim
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-15832-v1.patch, Screenshot-Java - 
> -home-jeongdae-work-regionserver_jmap_104p_sn5_20160509-sn5_heap.hprof - 
> Eclipse -1.png
>
>
> The FSHLog module uses a map to reuse SyncFuture objects, and assumes that 
> this map will be used by RPC Handler threads only. But in some cases this 
> assumption is wrong. 
> For example, if some coprocessors are registered, and these coprocessors use 
> CoprocessorHConnection instead of HConnection, and issue some puts or 
> deletes through CoprocessorHConnection, all mutations will be handled by the 
> hconnection's batchPool, not RPC Handlers. Because the hconnection's batchPool 
> grows and shrinks dynamically, every new thread in the hconnection is added 
> to the map in FSHLog, and this map grows continuously.
> In the attached image file, the map for reusing SyncFuture occupies about 
> 4GB of memory, and almost all of its entries hold an hconnection thread.
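For illustration only (a simplified stand-in, not the FSHLog source): a cache keyed by the calling thread leaks in exactly this way when callers come from a pool that grows and shrinks, because entries for retired threads are never evicted.
{code:java}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class SyncFutureCache {
  static final class SyncFuture { } // placeholder for the real HBase class

  // One entry per distinct caller thread; transient pool threads keep adding entries.
  private final ConcurrentMap<Thread, SyncFuture> cache = new ConcurrentHashMap<>();

  SyncFuture get() {
    return cache.computeIfAbsent(Thread.currentThread(), t -> new SyncFuture());
  }
}
{code}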



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19616) Review of LogCleaner Class

2018-06-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515852#comment-16515852
 ] 

Hadoop QA commented on HBASE-19616:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  4s{color} 
| {color:red} HBASE-19616 does not apply to master. Rebase required? Wrong 
Branch? See https://yetus.apache.org/documentation/0.7.0/precommit-patchnames 
for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HBASE-19616 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12903585/HBASE-19616.1.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13294/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> Review of LogCleaner Class
> --
>
> Key: HBASE-19616
> URL: https://issues.apache.org/jira/browse/HBASE-19616
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: HBASE-19616.1.patch
>
>
> * Parameterize logging
> * Remove compiler-reported dead code to re-enable useful logging
> * Use ArrayList instead of LinkedList



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-14069) Add the ability for RegionSplitter to rolling split without using a SplitAlgorithm

2018-06-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515853#comment-16515853
 ] 

Hadoop QA commented on HBASE-14069:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  5s{color} 
| {color:red} HBASE-14069 does not apply to master. Rebase required? Wrong 
Branch? See https://yetus.apache.org/documentation/0.7.0/precommit-patchnames 
for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HBASE-14069 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12747108/0001-Improve-RegionSplitter-v1.patch
 |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13297/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> Add the ability for RegionSplitter to rolling split without using a 
> SplitAlgorithm
> --
>
> Key: HBASE-14069
> URL: https://issues.apache.org/jira/browse/HBASE-14069
> Project: HBase
>  Issue Type: New Feature
>  Components: util
>Reporter: Elliott Clark
>Assignee: Lijun Tang
>Priority: Major
> Attachments: 0001-Improve-RegionSplitter-v1.patch, 
> 0001-Improve-RegionSplitter.patch
>
>
> RegionSplitter is the utility that can rolling-split regions. It would be 
> nice to be able to split regions and have the normal split points 
> computed for me so that I'm not reliant on knowing the data distribution.
> Tested manually in standalone mode for various test cases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19612) Review of ZKUtil Class

2018-06-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515851#comment-16515851
 ] 

Hadoop QA commented on HBASE-19612:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  4s{color} 
| {color:red} HBASE-19612 does not apply to master. Rebase required? Wrong 
Branch? See https://yetus.apache.org/documentation/0.7.0/precommit-patchnames 
for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HBASE-19612 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12903581/HBASE-19612.2.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13296/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> Review of ZKUtil Class
> --
>
> Key: HBASE-19612
> URL: https://issues.apache.org/jira/browse/HBASE-19612
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: HBASE-19612.1.patch, HBASE-19612.2.patch
>
>
> * Use Apache Commons where appropriate
> * Use parameterized logging of SLF4J
> * Fix Typos in comment
> * Use Arrays Instead of LinkedLists for performance sake



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20748) HBaseContext bulkLoad: being able to use custom versions

2018-06-18 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515834#comment-16515834
 ] 

Sean Busbey commented on HBASE-20748:
-

Hi! Please upload your contributions using format-patch, e.g. {{git 
format-patch --stdout origin/master > /some/path/to/HBASE-20748.2.patch}} 
(where the "2" is an incrementing count for the version of the patch). 
Alternatively, if you find a reviewer who doesn't mind using GitHub, linking a PR 
to this issue will work.

Please put the issue in "Patch Available" state once you have either a PR link 
or a patch uploaded.

> HBaseContext bulkLoad: being able to use custom versions
> 
>
> Key: HBASE-20748
> URL: https://issues.apache.org/jira/browse/HBASE-20748
> Project: HBase
>  Issue Type: Improvement
>  Components: spark
>Reporter: Charles PORROT
>Assignee: Charles PORROT
>Priority: Major
>  Labels: HBaseContext, bulkload, spark, versions
> Attachments: bulkLoadCustomVersions.scala
>
>
> The _bulkLoad_ methods of _class org.apache.hadoop.hbase.spark.HBaseContext_ 
> use the system's current time for the version of the cells to bulk-load. This 
> makes this method, and its twin _bulkLoadThinRows_, useless if you need to 
> use your own versioning system:
> {code:java}
> //Here is where we finally iterate through the data in this partition of the 
> //RDD that has been sorted and partitioned
> val wl = writeValueToHFile(
>   keyFamilyQualifier.rowKey, 
>   keyFamilyQualifier.family, 
>   keyFamilyQualifier.qualifier, 
>   cellValue, 
>   nowTimeStamp, 
>   fs, 
>   conn, 
>   localTableName, 
>   conf, 
>   familyHFileWriteOptionsMapInternal, 
>   hfileCompression, 
>   writerMap, 
>   stagingDir
> ){code}
>  
> Thus, I propose a third _bulkLoad_ method, based on the original method. 
> Instead of using an _Iterator(KeyFamilyQualifier, Array[Byte])_ as the basis 
> for the writes, this new method would use an _Iterator(KeyFamilyQualifier, 
> Array[Byte], Long_), with the _Long_ being the version.
>  
> Definition of _bulkLoad_:
> {code:java}
> def bulkLoad[T](
> rdd:RDD[T], 
> tableName: TableName, 
> flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte])], 
> stagingDir:String, 
> familyHFileWriteOptionsMap: util.Map[Array[Byte], FamilyHFileWriteOptions] = 
> new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
> compactionExclude: Boolean = false, 
> maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
> Definition of a _bulkLoadWithCustomVersions_ method:
> {code:java}
> def bulkLoadCustomVersions[T](rdd:RDD[T],
>   tableName: TableName,
>   flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte], 
> Long)],
>   stagingDir:String,
>   familyHFileWriteOptionsMap:
>   util.Map[Array[Byte], FamilyHFileWriteOptions] =
>   new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
>   compactionExclude: Boolean = false,
>   maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
> In case of illogical version (for instance, a negative version), the method 
> would throw back to the current timestamp.
> {code:java}
> val wl = writeValueToHFile(keyFamilyQualifier.rowKey,
>   keyFamilyQualifier.family,
>   keyFamilyQualifier.qualifier,
>   cellValue,
>   if (version > 0) version else nowTimeStamp,
>   fs,
>   conn,
>   localTableName,
>   conf,
>   familyHFileWriteOptionsMapInternal,
>   hfileCompression,
>   writerMap,
>   stagingDir){code}
> See the attached file for the file with the full proposed method.
>  
> +Edit:+
> The same could be done with bulkLoadThinRows: instead of a:
> {code:java}
> Iterator[Pair[ByteArrayWrapper, FamiliesQualifiersValues]]{code}
> We expect an:
> {code:java}
>  Iterator[Triple[ByteArrayWrapper, FamiliesQualifiersValues, Long]]{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HBASE-20748) HBaseContext bulkLoad: being able to use custom versions

2018-06-18 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey reassigned HBASE-20748:
---

Assignee: Charles PORROT

> HBaseContext bulkLoad: being able to use custom versions
> 
>
> Key: HBASE-20748
> URL: https://issues.apache.org/jira/browse/HBASE-20748
> Project: HBase
>  Issue Type: Improvement
>  Components: spark
>Reporter: Charles PORROT
>Assignee: Charles PORROT
>Priority: Major
>  Labels: HBaseContext, bulkload, spark, versions
> Attachments: bulkLoadCustomVersions.scala
>
>
> The _bulkLoad_ methods of _class org.apache.hadoop.hbase.spark.HBaseContext_ 
> use the system's current time for the version of the cells to bulk-load. This 
> makes this method, and its twin _bulkLoadThinRows_, useless if you need to 
> use your own versioning system:
> {code:java}
> //Here is where we finally iterate through the data in this partition of the 
> //RDD that has been sorted and partitioned
> val wl = writeValueToHFile(
>   keyFamilyQualifier.rowKey, 
>   keyFamilyQualifier.family, 
>   keyFamilyQualifier.qualifier, 
>   cellValue, 
>   nowTimeStamp, 
>   fs, 
>   conn, 
>   localTableName, 
>   conf, 
>   familyHFileWriteOptionsMapInternal, 
>   hfileCompression, 
>   writerMap, 
>   stagingDir
> ){code}
>  
> Thus, I propose a third _bulkLoad_ method, based on the original method. 
> Instead of using an _Iterator(KeyFamilyQualifier, Array[Byte])_ as the basis 
> for the writes, this new method would use an _Iterator(KeyFamilyQualifier, 
> Array[Byte], Long_), with the _Long_ being the version.
>  
> Definition of _bulkLoad_:
> {code:java}
> def bulkLoad[T](
> rdd:RDD[T], 
> tableName: TableName, 
> flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte])], 
> stagingDir:String, 
> familyHFileWriteOptionsMap: util.Map[Array[Byte], FamilyHFileWriteOptions] = 
> new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
> compactionExclude: Boolean = false, 
> maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
> Definition of a _bulkLoadWithCustomVersions_ method:
> {code:java}
> def bulkLoadCustomVersions[T](rdd:RDD[T],
>   tableName: TableName,
>   flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte], 
> Long)],
>   stagingDir:String,
>   familyHFileWriteOptionsMap:
>   util.Map[Array[Byte], FamilyHFileWriteOptions] =
>   new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
>   compactionExclude: Boolean = false,
>   maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
> In case of illogical version (for instance, a negative version), the method 
> would throw back to the current timestamp.
> {code:java}
> val wl = writeValueToHFile(keyFamilyQualifier.rowKey,
>   keyFamilyQualifier.family,
>   keyFamilyQualifier.qualifier,
>   cellValue,
>   if (version > 0) version else nowTimeStamp,
>   fs,
>   conn,
>   localTableName,
>   conf,
>   familyHFileWriteOptionsMapInternal,
>   hfileCompression,
>   writerMap,
>   stagingDir){code}
> See the attached file for the file with the full proposed method.
>  
> +Edit:+
> The same could be done with bulkLoadThinRows: instead of a:
> {code:java}
> Iterator[Pair[ByteArrayWrapper, FamiliesQualifiersValues]]{code}
> We expect an:
> {code:java}
>  Iterator[Triple[ByteArrayWrapper, FamiliesQualifiersValues, Long]]{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-19804) [hbase-indexer] Metrics source RegionServer,sub=Server already exists!

2018-06-18 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-19804:

Component/s: regionserver
 metrics

> [hbase-indexer] Metrics source RegionServer,sub=Server already exists!
> --
>
> Key: HBASE-19804
> URL: https://issues.apache.org/jira/browse/HBASE-19804
> Project: HBase
>  Issue Type: Improvement
>  Components: hbase-indexer, metrics, regionserver
>Affects Versions: 2.0.0-beta-1
>Reporter: stack
>Priority: Major
>  Labels: beginner
>
> The hbase-indexer runs multiple RegionServers per JVM. In the old 
> days, they had their own cut-down "RegionServer". In 2.0.0, we made it so 
> they could run an actual RegionServer but with services disabled. The latter 
> has an issue if you run more than one instance per JVM and it is NOT a 
> minihbasecluster instance. It fails with:
> {code:java}
> 1:09:13.371 PM  ERROR  HRegionServer  
> Failed init
> org.apache.hadoop.metrics2.MetricsException: Metrics source 
> RegionServer,sub=Server already exists!
>   at 
> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newSourceName(DefaultMetricsSystem.java:152)
>   at 
> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.sourceName(DefaultMetricsSystem.java:125)
>   at 
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:229)
>   at 
> org.apache.hadoop.hbase.metrics.BaseSourceImpl.<init>(BaseSourceImpl.java:115)
>   at 
> org.apache.hadoop.hbase.regionserver.MetricsRegionServerSourceImpl.<init>(MetricsRegionServerSourceImpl.java:101)
>   at 
> org.apache.hadoop.hbase.regionserver.MetricsRegionServerSourceImpl.<init>(MetricsRegionServerSourceImpl.java:93)
>   at 
> org.apache.hadoop.hbase.regionserver.MetricsRegionServerSourceFactoryImpl.createServer(MetricsRegionServerSourceFactoryImpl.java:69)
>   at 
> org.apache.hadoop.hbase.regionserver.MetricsRegionServer.<init>(MetricsRegionServer.java:56)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:1519)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:954)
>   at com.ngdata.sep.impl.SepConsumer$1.run(SepConsumer.java:203){code}
>  
> If you look in the DefaultMetricsSystem code (found 
> by [~whoschek]), you'll see this:
> {code:java}
> synchronized ObjectName newObjectName(String name) {
> try {
>   if (mBeanNames.map.containsKey(name) && !miniClusterMode) {
> throw new MetricsException(name +" already exists!");
>   }
>   return new ObjectName(mBeanNames.uniqueName(name));
> } catch (Exception e) {
>   throw new MetricsException(e);
> }
>   }{code}
> i.e. if we are in a mini cluster context, we will not fail registering the 
> second bean instance.
>  
> If you look in master startup in HMasterCommandLine, you will see:
>  
> {code:java}
> // If 'local', defer to LocalHBaseCluster instance.  Starts master
> // and regionserver both in the one JVM.
> if (LocalHBaseCluster.isLocal(conf)) {
>   DefaultMetricsSystem.setMiniClusterMode(true);
> {code}
> ... will ensure we don't get the above exception in minihbasecluster context.
>  
> So, the idea here is to make running more than one RS per JVM 
> cleaner than doing the above hack. It needs to be a config too: a 
> config which says don't fail startup on a second mbean registration just 
> because two RSs share the one context. (A later issue will be the accounting 
> of metrics per RS... If there is more than one RS, then we should make a 
> unique mbean per RS in the JVM.)
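One possible shape for the opt-out from the embedder's side, with a hypothetical config key (a sketch only; the actual mechanism is still to be decided):
{code:java}
// Hypothetical: let an embedder running several RSs in one JVM relax the check,
// mirroring what HMasterCommandLine already does for 'local' mode.
Configuration conf = HBaseConfiguration.create();
if (conf.getBoolean("hbase.metrics.allow.duplicate.sources", false)) {
  DefaultMetricsSystem.setMiniClusterMode(true); // skips the "already exists!" failure
}
{code}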



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20748) HBaseContext bulkLoad: being able to use custom versions

2018-06-18 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-20748:

Component/s: (was: hbase)

> HBaseContext bulkLoad: being able to use custom versions
> 
>
> Key: HBASE-20748
> URL: https://issues.apache.org/jira/browse/HBASE-20748
> Project: HBase
>  Issue Type: Improvement
>  Components: spark
>Reporter: Charles PORROT
>Priority: Major
>  Labels: HBaseContext, bulkload, spark, versions
> Attachments: bulkLoadCustomVersions.scala
>
>
> The _bulkLoad_ methods of _class org.apache.hadoop.hbase.spark.HBaseContext_ 
> use the system's current time for the version of the cells to bulk-load. This 
> makes this method, and its twin _bulkLoadThinRows_, useless if you need to 
> use your own versioning system:
> {code:java}
> //Here is where we finally iterate through the data in this partition of the 
> //RDD that has been sorted and partitioned
> val wl = writeValueToHFile(
>   keyFamilyQualifier.rowKey, 
>   keyFamilyQualifier.family, 
>   keyFamilyQualifier.qualifier, 
>   cellValue, 
>   nowTimeStamp, 
>   fs, 
>   conn, 
>   localTableName, 
>   conf, 
>   familyHFileWriteOptionsMapInternal, 
>   hfileCompression, 
>   writerMap, 
>   stagingDir
> ){code}
>  
> Thus, I propose a third _bulkLoad_ method, based on the original method. 
> Instead of using an _Iterator(KeyFamilyQualifier, Array[Byte])_ as the basis 
> for the writes, this new method would use an _Iterator(KeyFamilyQualifier, 
> Array[Byte], Long_), with the _Long_ being the version.
>  
> Definition of _bulkLoad_:
> {code:java}
> def bulkLoad[T](
> rdd:RDD[T], 
> tableName: TableName, 
> flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte])], 
> stagingDir:String, 
> familyHFileWriteOptionsMap: util.Map[Array[Byte], FamilyHFileWriteOptions] = 
> new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
> compactionExclude: Boolean = false, 
> maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
> Definition of a _bulkLoadWithCustomVersions_ method:
> {code:java}
> def bulkLoadCustomVersions[T](rdd:RDD[T],
>   tableName: TableName,
>   flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte], 
> Long)],
>   stagingDir:String,
>   familyHFileWriteOptionsMap:
>   util.Map[Array[Byte], FamilyHFileWriteOptions] =
>   new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
>   compactionExclude: Boolean = false,
>   maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
> In case of illogical version (for instance, a negative version), the method 
> would throw back to the current timestamp.
> {code:java}
> val wl = writeValueToHFile(keyFamilyQualifier.rowKey,
>   keyFamilyQualifier.family,
>   keyFamilyQualifier.qualifier,
>   cellValue,
>   if (version > 0) version else nowTimeStamp,
>   fs,
>   conn,
>   localTableName,
>   conf,
>   familyHFileWriteOptionsMapInternal,
>   hfileCompression,
>   writerMap,
>   stagingDir){code}
> See the attached file for the file with the full proposed method.
>  
> +Edit:+
> The same could be done with bulkLoadThinRows: instead of a:
> {code:java}
> Iterator[Pair[ByteArrayWrapper, FamiliesQualifiersValues]]{code}
> We expect an:
> {code:java}
>  Iterator[Triple[ByteArrayWrapper, FamiliesQualifiersValues, Long]]{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-19735) Create a minimal "client" tarball installation

2018-06-18 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-19735:

Fix Version/s: 2.1.0

> Create a minimal "client" tarball installation
> --
>
> Key: HBASE-19735
> URL: https://issues.apache.org/jira/browse/HBASE-19735
> Project: HBase
>  Issue Type: New Feature
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 3.0.0, 2.1.0
>
> Attachments: HBASE-19735.000.patch, HBASE-19735.001.branch-2.patch, 
> HBASE-19735.002.branch-2.patch, HBASE-19735.003.patch, HBASE-19735.004.patch
>
>
> We're moving ourselves towards more controlled dependencies. A logical next 
> step is to try to do the same for our "binary" artifacts that we create 
> during releases.
> There is code (ours and our dependencies') which the HMaster and RegionServer 
> require which, obviously, clients do not need.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20334) add a test that expressly uses both our shaded client and the one from hadoop 3

2018-06-18 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-20334:

Fix Version/s: 2.1.0
   3.0.0

> add a test that expressly uses both our shaded client and the one from hadoop 
> 3
> ---
>
> Key: HBASE-20334
> URL: https://issues.apache.org/jira/browse/HBASE-20334
> Project: HBase
>  Issue Type: Sub-task
>  Components: hadoop3, shading
>Affects Versions: 2.0.0
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Major
> Fix For: 3.0.0, 2.1.0
>
> Attachments: HBASE-20334.0.patch, HBASE-20334.1.patch
>
>
> Since we're making a shaded client that bleeds out of our namespace and into 
> Hadoop's, we should ensure that we can show our clients coexisting. Even if 
> it's just an IT that successfully talks to both us and HDFS via our 
> respective shaded clients, that'd be a big help in keeping us proactive.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20748) HBaseContext bulkLoad: being able to use custom versions

2018-06-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515728#comment-16515728
 ] 

Ted Yu commented on HBASE-20748:


Please also add a unit test utilizing the enhancement.

> HBaseContext bulkLoad: being able to use custom versions
> 
>
> Key: HBASE-20748
> URL: https://issues.apache.org/jira/browse/HBASE-20748
> Project: HBase
>  Issue Type: Improvement
>  Components: hbase, spark
>Reporter: Charles PORROT
>Priority: Major
>  Labels: HBaseContext, bulkload, spark, versions
> Attachments: bulkLoadCustomVersions.scala
>
>
> The _bulkLoad_ methods of _class org.apache.hadoop.hbase.spark.HBaseContext_ 
> use the system's current time for the version of the cells to bulk-load. This 
> makes this method, and its twin _bulkLoadThinRows_, useless if you need to 
> use your own versioning system:
> {code:java}
> //Here is where we finally iterate through the data in this partition of the 
> //RDD that has been sorted and partitioned
> val wl = writeValueToHFile(
>   keyFamilyQualifier.rowKey, 
>   keyFamilyQualifier.family, 
>   keyFamilyQualifier.qualifier, 
>   cellValue, 
>   nowTimeStamp, 
>   fs, 
>   conn, 
>   localTableName, 
>   conf, 
>   familyHFileWriteOptionsMapInternal, 
>   hfileCompression, 
>   writerMap, 
>   stagingDir
> ){code}
>  
> Thus, I propose a third _bulkLoad_ method, based on the original method. 
> Instead of using an _Iterator(KeyFamilyQualifier, Array[Byte])_ as the basis 
> for the writes, this new method would use an _Iterator(KeyFamilyQualifier, 
> Array[Byte], Long_), with the _Long_ being the version.
>  
> Definition of _bulkLoad_:
> {code:java}
> def bulkLoad[T](
> rdd:RDD[T], 
> tableName: TableName, 
> flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte])], 
> stagingDir:String, 
> familyHFileWriteOptionsMap: util.Map[Array[Byte], FamilyHFileWriteOptions] = 
> new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
> compactionExclude: Boolean = false, 
> maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
> Definition of a _bulkLoadWithCustomVersions_ method:
> {code:java}
> def bulkLoadCustomVersions[T](rdd:RDD[T],
>   tableName: TableName,
>   flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte], 
> Long)],
>   stagingDir:String,
>   familyHFileWriteOptionsMap:
>   util.Map[Array[Byte], FamilyHFileWriteOptions] =
>   new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
>   compactionExclude: Boolean = false,
>   maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
> In case of illogical version (for instance, a negative version), the method 
> would throw back to the current timestamp.
> {code:java}
> val wl = writeValueToHFile(keyFamilyQualifier.rowKey,
>   keyFamilyQualifier.family,
>   keyFamilyQualifier.qualifier,
>   cellValue,
>   if (version > 0) version else nowTimeStamp,
>   fs,
>   conn,
>   localTableName,
>   conf,
>   familyHFileWriteOptionsMapInternal,
>   hfileCompression,
>   writerMap,
>   stagingDir){code}
> See the attached file for the file with the full proposed method.
>  
> +Edit:+
> The same could be done with bulkLoadThinRows: instead of a:
> {code:java}
> Iterator[Pair[ByteArrayWrapper, FamiliesQualifiersValues]]{code}
> We expect an:
> {code:java}
>  Iterator[Triple[ByteArrayWrapper, FamiliesQualifiersValues, Long]]{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20748) HBaseContext bulkLoad: being able to use custom versions

2018-06-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515727#comment-16515727
 ] 

Ted Yu commented on HBASE-20748:


bq. method would throw back to

I think you meant 'fall back'

Your code is very similar to the current {{bulkLoad}} method.
Since the hbase-spark module isn't in any hbase release, you can customize the 
existing method.

Please upload the next patch as a diff.

> HBaseContext bulkLoad: being able to use custom versions
> 
>
> Key: HBASE-20748
> URL: https://issues.apache.org/jira/browse/HBASE-20748
> Project: HBase
>  Issue Type: Improvement
>  Components: hbase, spark
>Reporter: Charles PORROT
>Priority: Major
>  Labels: HBaseContext, bulkload, spark, versions
> Attachments: bulkLoadCustomVersions.scala
>
>
> The _bulkLoad_ methods of _class org.apache.hadoop.hbase.spark.HBaseContext_ 
> use the system's current time for the version of the cells to bulk-load. This 
> makes this method, and its twin _bulkLoadThinRows_, useless if you need to 
> use your own versioning system:
> {code:java}
> //Here is where we finally iterate through the data in this partition of the 
> //RDD that has been sorted and partitioned
> val wl = writeValueToHFile(
>   keyFamilyQualifier.rowKey, 
>   keyFamilyQualifier.family, 
>   keyFamilyQualifier.qualifier, 
>   cellValue, 
>   nowTimeStamp, 
>   fs, 
>   conn, 
>   localTableName, 
>   conf, 
>   familyHFileWriteOptionsMapInternal, 
>   hfileCompression, 
>   writerMap, 
>   stagingDir
> ){code}
>  
> Thus, I propose a third _bulkLoad_ method, based on the original method. 
> Instead of using an _Iterator(KeyFamilyQualifier, Array[Byte])_ as the basis 
> for the writes, this new method would use an _Iterator(KeyFamilyQualifier, 
> Array[Byte], Long_), with the _Long_ being the version.
>  
> Definition of _bulkLoad_:
> {code:java}
> def bulkLoad[T](
> rdd:RDD[T], 
> tableName: TableName, 
> flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte])], 
> stagingDir:String, 
> familyHFileWriteOptionsMap: util.Map[Array[Byte], FamilyHFileWriteOptions] = 
> new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
> compactionExclude: Boolean = false, 
> maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
> Definition of a _bulkLoadWithCustomVersions_ method:
> {code:java}
> def bulkLoadCustomVersions[T](rdd:RDD[T],
>   tableName: TableName,
>   flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte], 
> Long)],
>   stagingDir:String,
>   familyHFileWriteOptionsMap:
>   util.Map[Array[Byte], FamilyHFileWriteOptions] =
>   new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
>   compactionExclude: Boolean = false,
>   maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
> In case of illogical version (for instance, a negative version), the method 
> would throw back to the current timestamp.
> {code:java}
> val wl = writeValueToHFile(keyFamilyQualifier.rowKey,
>   keyFamilyQualifier.family,
>   keyFamilyQualifier.qualifier,
>   cellValue,
>   if (version > 0) version else nowTimeStamp,
>   fs,
>   conn,
>   localTableName,
>   conf,
>   familyHFileWriteOptionsMapInternal,
>   hfileCompression,
>   writerMap,
>   stagingDir){code}
> See the attached file for the file with the full proposed method.
>  
> +Edit:+
> The same could be done with bulkLoadThinRows: instead of a:
> {code:java}
> Iterator[Pair[ByteArrayWrapper, FamiliesQualifiersValues]]{code}
> We expect an:
> {code:java}
>  Iterator[Triple[ByteArrayWrapper, FamiliesQualifiersValues, Long]]{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20332) shaded mapreduce module shouldn't include hadoop

2018-06-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515673#comment-16515673
 ] 

Hadoop QA commented on HBASE-20332:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} shelldocs {color} | {color:blue}  0m  
0s{color} | {color:blue} Shelldocs was not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
22s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
12s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
50s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
18s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  6m 
46s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} shellcheck {color} | {color:green}  0m 
 0s{color} | {color:green} There were no new shellcheck issues. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m 
30s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
28s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
9m  0s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  6m 
59s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}233m 27s{color} 
| {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  8m 
59s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}301m  7s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.thrift.TestThriftHttpServer |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20332 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12928157/HBASE-20332.7.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  shadedjars  hadoopcheck  
xml  compile  checkstyle  shellcheck  shelldocs  |
| uname | Linux 72b240bc5637 4.4.0-104-generic #127-Ubuntu SMP Mon Dec 11 
12:16:42 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / ac5bb8155b |
| maven | version: Apache Maven 3.5.3 
(3383c37e1f9e9b3bc3df5050c29c8aff9f295297; 2018-02-24T19:49:05Z) |
| Default Java | 1.8.0_171 |
| shellcheck | v0.4.4 |
| unit | 

[jira] [Commented] (HBASE-20708) Remove the usage of RecoverMetaProcedure in master startup

2018-06-18 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515650#comment-16515650
 ] 

Duo Zhang commented on HBASE-20708:
---

Any other concerns, sir [~stack]? Thanks.

> Remove the usage of RecoverMetaProcedure in master startup
> --
>
> Key: HBASE-20708
> URL: https://issues.apache.org/jira/browse/HBASE-20708
> Project: HBase
>  Issue Type: Bug
>  Components: proc-v2, Region Assignment
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Blocker
> Fix For: 3.0.0, 2.1.0
>
> Attachments: HBASE-20708-v1.patch, HBASE-20708-v2.patch, 
> HBASE-20708-v3.patch, HBASE-20708-v4.patch, HBASE-20708-v5.patch, 
> HBASE-20708-v6.patch, HBASE-20708-v7.patch, HBASE-20708-v8.patch, 
> HBASE-20708-v9.patch, HBASE-20708.patch
>
>
> In HBASE-20700, we make RecoverMetaProcedure use a special lock which is only 
> used by RMP to avoid deadlock with MoveRegionProcedure. But we will always 
> schedule a RMP when the master starts up, so we still need to make sure that 
> there is no race between this RMP and other RMPs and SCPs scheduled before 
> the master restarts.
> Please see the [accompanying design document|https://docs.google.com/document/d/1_872oHzrhJq4ck7f6zmp1J--zMhsIFvXSZyX1Mxg5MA/edit#heading=h.xy1z4alsq7uy] 
> where we call out the problem being addressed by this issue in more detail 
> and in which we describe our new approach to Master startup.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20748) HBaseContext bulkLoad: being able to use custom versions

2018-06-18 Thread Charles PORROT (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles PORROT updated HBASE-20748:
---
Description: 
The _bulkLoad_ methods of _class org.apache.hadoop.hbase.spark.HBaseContext_ 
use the system's current time for the version of the cells to bulk-load. This 
makes this method, and its twin _bulkLoadThinRows_, useless if you need to use 
your own versioning system:
{code:java}
//Here is where we finally iterate through the data in this partition of the 
//RDD that has been sorted and partitioned
val wl = writeValueToHFile(
  keyFamilyQualifier.rowKey, 
  keyFamilyQualifier.family, 
  keyFamilyQualifier.qualifier, 
  cellValue, 
  nowTimeStamp, 
  fs, 
  conn, 
  localTableName, 
  conf, 
  familyHFileWriteOptionsMapInternal, 
  hfileCompression, 
  writerMap, 
  stagingDir
){code}
 

Thus, I propose a third _bulkLoad_ method, based on the original method. 
Instead of using an _Iterator(KeyFamilyQualifier, Array[Byte])_ as the basis 
for the writes, this new method would use an _Iterator(KeyFamilyQualifier, 
Array[Byte], Long_), with the _Long_ being the version.

 

Definition of _bulkLoad_:
{code:java}
def bulkLoad[T](
rdd:RDD[T], 
tableName: TableName, 
flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte])], 
stagingDir:String, 
familyHFileWriteOptionsMap: util.Map[Array[Byte], FamilyHFileWriteOptions] = 
new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
compactionExclude: Boolean = false, 
maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
Definition of a _bulkLoadWithCustomVersions_ method:
{code:java}
def bulkLoadCustomVersions[T](rdd:RDD[T],
  tableName: TableName,
  flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte], 
Long)],
  stagingDir:String,
  familyHFileWriteOptionsMap:
  util.Map[Array[Byte], FamilyHFileWriteOptions] =
  new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
  compactionExclude: Boolean = false,
  maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
In case of an illogical version (for instance, a negative version), the method 
would fall back to the current timestamp.
{code:java}
val wl = writeValueToHFile(keyFamilyQualifier.rowKey,
  keyFamilyQualifier.family,
  keyFamilyQualifier.qualifier,
  cellValue,
  if (version > 0) version else nowTimeStamp,
  fs,
  conn,
  localTableName,
  conf,
  familyHFileWriteOptionsMapInternal,
  hfileCompression,
  writerMap,
  stagingDir){code}
See the attached file for the full proposed method.

 

+Edit:+

The same could be done with bulkLoadThinRows: instead of a:
{code:java}
Iterator[Pair[ByteArrayWrapper, FamiliesQualifiersValues]]{code}
We expect an:
{code:java}
 Iterator[Triple[ByteArrayWrapper, FamiliesQualifiersValues, Long]]{code}
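
For illustration, a minimal caller-side sketch of how the proposed method might 
be invoked. This is an assumption-laden sketch, not part of the patch: the table 
name, column family, staging directory and sample data are made up, and 
{{sc}}/{{hbaseContext}} are assumed to already exist with the signature proposed 
above.
{code:java}
import org.apache.hadoop.hbase.TableName
import org.apache.hadoop.hbase.spark.KeyFamilyQualifier
import org.apache.hadoop.hbase.util.Bytes

// Hypothetical input: (rowKey, value, version) triples.
val input = sc.parallelize(Seq(
  ("row1", "value1", 1000L),
  ("row2", "value2", 2000L)))

hbaseContext.bulkLoadCustomVersions[(String, String, Long)](
  input,
  TableName.valueOf("myTable"),
  t => {
    val kfq = new KeyFamilyQualifier(
      Bytes.toBytes(t._1), Bytes.toBytes("cf"), Bytes.toBytes("q"))
    // The trailing Long carries the cell version, used instead of nowTimeStamp.
    Iterator((kfq, Bytes.toBytes(t._2), t._3))
  },
  "/tmp/bulkload-staging")
{code}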

  was:
The _bulkLoad_ methods of _class org.apache.hadoop.hbase.spark.HBaseContext_ 
use the system's current time for the version of the cells to bulk-load. This 
makes this method, and its twin _bulkLoadThinRows_, useless if you need to use 
your own versioning system:
{code:java}
//Here is where we finally iterate through the data in this partition of the 
//RDD that has been sorted and partitioned
val wl = writeValueToHFile(
  keyFamilyQualifier.rowKey, 
  keyFamilyQualifier.family, 
  keyFamilyQualifier.qualifier, 
  cellValue, 
  nowTimeStamp, 
  fs, 
  conn, 
  localTableName, 
  conf, 
  familyHFileWriteOptionsMapInternal, 
  hfileCompression, 
  writerMap, 
  stagingDir
){code}
 

Thus, I propose a third _bulkLoad_ method, based on the original method. 
Instead of using an _Iterator(KeyFamilyQualifier, Array[Byte])_ as the basis 
for the writes, this new method would use an _Iterator(KeyFamilyQualifier, 
Array[Byte], Long_), with the _Long_ being the version.

 

Definition of _bulkLoad_:
{code:java}
def bulkLoad[T](
rdd:RDD[T], 
tableName: TableName, 
flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte])], 
stagingDir:String, 
familyHFileWriteOptionsMap: util.Map[Array[Byte], FamilyHFileWriteOptions] = 
new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
compactionExclude: Boolean = false, 
maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
Definition of a _bulkLoadWithCustomVersions_ method:
{code:java}
def bulkLoadCustomVersions[T](rdd:RDD[T],
  tableName: TableName,
  flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte], 
Long)],
  stagingDir:String,
  familyHFileWriteOptionsMap:
  util.Map[Array[Byte], FamilyHFileWriteOptions] =
  new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
  compactionExclude: Boolean = false,

[jira] [Updated] (HBASE-20748) HBaseContext bulkLoad: being able to use custom versions

2018-06-18 Thread Charles PORROT (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles PORROT updated HBASE-20748:
---
Description: 
The _bulkLoad_ methods of _class org.apache.hadoop.hbase.spark.HBaseContext_ 
use the system's current time for the version of the cells to bulk-load. This 
makes this method, and its twin _bulkLoadThinRows_, useless if you need to use 
your own versioning system:
{code:java}
//Here is where we finally iterate through the data in this partition of the 
//RDD that has been sorted and partitioned
val wl = writeValueToHFile(
  keyFamilyQualifier.rowKey, 
  keyFamilyQualifier.family, 
  keyFamilyQualifier.qualifier, 
  cellValue, 
  nowTimeStamp, 
  fs, 
  conn, 
  localTableName, 
  conf, 
  familyHFileWriteOptionsMapInternal, 
  hfileCompression, 
  writerMap, 
  stagingDir
){code}
 

Thus, I propose a third _bulkLoad_ method, based on the original method. 
Instead of using an _Iterator(KeyFamilyQualifier, Array[Byte])_ as the basis 
for the writes, this new method would use an _Iterator(KeyFamilyQualifier, 
Array[Byte], Long_), with the _Long_ being the version.

 

Definition of _bulkLoad_:
{code:java}
def bulkLoad[T](
rdd:RDD[T], 
tableName: TableName, 
flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte])], 
stagingDir:String, 
familyHFileWriteOptionsMap: util.Map[Array[Byte], FamilyHFileWriteOptions] = 
new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
compactionExclude: Boolean = false, 
maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
Definition of a _bulkLoadWithCustomVersions_ method:
{code:java}
def bulkLoadCustomVersions[T](rdd:RDD[T],
  tableName: TableName,
  flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte], 
Long)],
  stagingDir:String,
  familyHFileWriteOptionsMap:
  util.Map[Array[Byte], FamilyHFileWriteOptions] =
  new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
  compactionExclude: Boolean = false,
  maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
In case of an illogical version (for instance, a negative version), the method 
would fall back to the current timestamp.
{code:java}
val wl = writeValueToHFile(keyFamilyQualifier.rowKey,
  keyFamilyQualifier.family,
  keyFamilyQualifier.qualifier,
  cellValue,
  if (version > 0) version else nowTimeStamp,
  fs,
  conn,
  localTableName,
  conf,
  familyHFileWriteOptionsMapInternal,
  hfileCompression,
  writerMap,
  stagingDir){code}
See the attached file for the full proposed method.

 

+Edit:+

The same could be done with bulkLoadThinRows: instead of a:
{code:java}
Iterator[Pair[ByteArrayWrapper, FamiliesQualifiersValues]]{code}
We expect an:
{code:java}
 Iterator[Triple[ByteArrayWrapper, FamiliesQualifiersValues, Long]]{code}

  was:
The _bulkLoad_ methods of _class org.apache.hadoop.hbase.spark.HBaseContext_ 
use the system's current time for the version of the cells to bulk-load. This 
makes this method, and its twin _bulkLoadThinRows_, useless if you need to use 
your own versioning system:
{code:java}
//Here is where we finally iterate through the data in this partition of the 
//RDD that has been sorted and partitioned
val wl = writeValueToHFile(
  keyFamilyQualifier.rowKey, 
  keyFamilyQualifier.family, 
  keyFamilyQualifier.qualifier, 
  cellValue, 
  nowTimeStamp, 
  fs, 
  conn, 
  localTableName, 
  conf, 
  familyHFileWriteOptionsMapInternal, 
  hfileCompression, 
  writerMap, 
  stagingDir
){code}
 

Thus, I propose a third _bulkLoad_ method, based on the original method. 
Instead of using an _Iterator(KeyFamilyQualifier, Array[Byte])_ as the basis 
for the writes, this new method would use an _Iterator(KeyFamilyQualifier, 
Array[Byte], Long_), with the _Long_ being the version.

 

Definition of _bulkLoad_:
{code:java}
def bulkLoad[T](
rdd:RDD[T], 
tableName: TableName, 
flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte])], 
stagingDir:String, 
familyHFileWriteOptionsMap: util.Map[Array[Byte], FamilyHFileWriteOptions] = 
new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
compactionExclude: Boolean = false, 
maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
Definition of a _bulkLoadWithCustomVersions_ method:
{code:java}
def bulkLoadCustomVersions[T](rdd:RDD[T],
  tableName: TableName,
  flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte], 
Long)],
  stagingDir:String,
  familyHFileWriteOptionsMap:
  util.Map[Array[Byte], FamilyHFileWriteOptions] =
  new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
  compactionExclude: Boolean = false,
  

[jira] [Updated] (HBASE-20708) Remove the usage of RecoverMetaProcedure in master startup

2018-06-18 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-20708:
--
Attachment: HBASE-20708-v9.patch

> Remove the usage of RecoverMetaProcedure in master startup
> --
>
> Key: HBASE-20708
> URL: https://issues.apache.org/jira/browse/HBASE-20708
> Project: HBase
>  Issue Type: Bug
>  Components: proc-v2, Region Assignment
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Blocker
> Fix For: 3.0.0, 2.1.0
>
> Attachments: HBASE-20708-v1.patch, HBASE-20708-v2.patch, 
> HBASE-20708-v3.patch, HBASE-20708-v4.patch, HBASE-20708-v5.patch, 
> HBASE-20708-v6.patch, HBASE-20708-v7.patch, HBASE-20708-v8.patch, 
> HBASE-20708-v9.patch, HBASE-20708.patch
>
>
> In HBASE-20700, we make RecoverMetaProcedure use a special lock which is only 
> used by RMP to avoid deadlock with MoveRegionProcedure. But we will always 
> schedule a RMP when the master starts up, so we still need to make sure that 
> there is no race between this RMP and other RMPs and SCPs scheduled before 
> the master restarts.
> Please see the [accompanying design document|https://docs.google.com/document/d/1_872oHzrhJq4ck7f6zmp1J--zMhsIFvXSZyX1Mxg5MA/edit#heading=h.xy1z4alsq7uy] 
> where we call out the problem being addressed by this issue in more detail 
> and in which we describe our new approach to Master startup.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20746) Release 2.1.0

2018-06-18 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515634#comment-16515634
 ] 

Duo Zhang commented on HBASE-20746:
---

It's fine. Just add 2.1.0 to the fix versions of the issue. I will review all 
the related issues again before the final release.

> Release 2.1.0
> -
>
> Key: HBASE-20746
> URL: https://issues.apache.org/jira/browse/HBASE-20746
> Project: HBase
>  Issue Type: Umbrella
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>
> After HBASE-20708 I do not think we will have unresolvable problems for the 
> 2.1.0 release any more. So let's create an issue to track the release process.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20742) Always create WAL directory for region server

2018-06-18 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515633#comment-16515633
 ] 

Duo Zhang commented on HBASE-20742:
---

There is a MasterWalManager on the master side and we will use its method to get 
the live servers. I think you can do something there.
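
For concreteness, a minimal sketch of what "always create the WAL directory" 
could look like at region server initialization. This is illustrative only, not 
the actual patch; the method name and the walRootDir/serverName parameters are 
assumptions.
{code:java}
import org.apache.hadoop.fs.{FileSystem, Path}

// Illustrative: ensure the per-server WAL directory exists even when the
// configured provider (e.g. DisabledWALProvider) never writes a WAL file.
def ensureWalDir(fs: FileSystem, walRootDir: Path, serverName: String): Unit = {
  val serverWalDir = new Path(walRootDir, serverName)
  if (!fs.exists(serverWalDir)) {
    fs.mkdirs(serverWalDir)
  }
}
{code}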

> Always create WAL directory for region server
> -
>
> Key: HBASE-20742
> URL: https://issues.apache.org/jira/browse/HBASE-20742
> Project: HBase
>  Issue Type: Bug
>Reporter: Duo Zhang
>Priority: Major
>
> After HBASE-20708, when the master restarts, we will scan the WAL directory to 
> find out the live servers. In most cases this is OK, as when we create an 
> HRegion instance on the RS side, we will create a WAL for it, and the directory 
> which contains the server name will be there, even if the user always uses 
> SKIP_WAL.
> But there could still be a problem, as the directory is created in the 
> implementation of the WAL, not in the initialization of the region server, so 
> if the user uses DisabledWALProvider then we will be in trouble.
> So let's fix it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20748) HBaseContext bulkLoad: being able to use custom versions

2018-06-18 Thread Charles PORROT (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles PORROT updated HBASE-20748:
---
Component/s: hbase

> HBaseContext bulkLoad: being able to use custom versions
> 
>
> Key: HBASE-20748
> URL: https://issues.apache.org/jira/browse/HBASE-20748
> Project: HBase
>  Issue Type: Improvement
>  Components: hbase, spark
>Reporter: Charles PORROT
>Priority: Major
>  Labels: HBaseContext, bulkload, spark, versions
> Attachments: bulkLoadCustomVersions.scala
>
>
> The _bulkLoad_ methods of _class org.apache.hadoop.hbase.spark.HBaseContext_ 
> use the system's current time for the version of the cells to bulk-load. This 
> makes this method, and its twin _bulkLoadThinRows_, useless if you need to 
> use your own versioning system:
> {code:java}
> //Here is where we finally iterate through the data in this partition of the 
> //RDD that has been sorted and partitioned
> val wl = writeValueToHFile(
>   keyFamilyQualifier.rowKey, 
>   keyFamilyQualifier.family, 
>   keyFamilyQualifier.qualifier, 
>   cellValue, 
>   nowTimeStamp, 
>   fs, 
>   conn, 
>   localTableName, 
>   conf, 
>   familyHFileWriteOptionsMapInternal, 
>   hfileCompression, 
>   writerMap, 
>   stagingDir
> ){code}
>  
> Thus, I propose a third _bulkLoad_ method, based on the original method. 
> Instead of using an _Iterator(KeyFamilyQualifier, Array[Byte])_ as the basis 
> for the writes, this new method would use an _Iterator(KeyFamilyQualifier, 
> Array[Byte], Long_), with the _Long_ being the version.
>  
> Definition of _bulkLoad_:
> {code:java}
> def bulkLoad[T](
> rdd:RDD[T], 
> tableName: TableName, 
> flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte])], 
> stagingDir:String, 
> familyHFileWriteOptionsMap: util.Map[Array[Byte], FamilyHFileWriteOptions] = 
> new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
> compactionExclude: Boolean = false, 
> maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
> Definition of a _bulkLoadWithCustomVersions_ method:
> {code:java}
> def bulkLoadCustomVersions[T](rdd:RDD[T],
>   tableName: TableName,
>   flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte], 
> Long)],
>   stagingDir:String,
>   familyHFileWriteOptionsMap:
>   util.Map[Array[Byte], FamilyHFileWriteOptions] =
>   new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
>   compactionExclude: Boolean = false,
>   maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
> In case of an illogical version (for instance, a negative version), the method 
> would fall back to the current timestamp.
> {code:java}
> val wl = writeValueToHFile(keyFamilyQualifier.rowKey,
>   keyFamilyQualifier.family,
>   keyFamilyQualifier.qualifier,
>   cellValue,
>   if (version > 0) version else nowTimeStamp,
>   fs,
>   conn,
>   localTableName,
>   conf,
>   familyHFileWriteOptionsMapInternal,
>   hfileCompression,
>   writerMap,
>   stagingDir){code}
> See the attached file for the full proposed method.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20748) HBaseContext bulkLoad: being able to use custom versions

2018-06-18 Thread Charles PORROT (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles PORROT updated HBASE-20748:
---
Description: 
The _bulkLoad_ methods of _class org.apache.hadoop.hbase.spark.HBaseContext_ 
use the system's current time for the version of the cells to bulk-load. This 
makes this method, and its twin _bulkLoadThinRows_, useless if you need to use 
your own versioning system:
{code:java}
//Here is where we finally iterate through the data in this partition of the 
//RDD that has been sorted and partitioned
val wl = writeValueToHFile(
  keyFamilyQualifier.rowKey, 
  keyFamilyQualifier.family, 
  keyFamilyQualifier.qualifier, 
  cellValue, 
  nowTimeStamp, 
  fs, 
  conn, 
  localTableName, 
  conf, 
  familyHFileWriteOptionsMapInternal, 
  hfileCompression, 
  writerMap, 
  stagingDir
){code}
 

Thus, I propose a third _bulkLoad_ method, based on the original method. 
Instead of using an _Iterator(KeyFamilyQualifier, Array[Byte])_ as the basis 
for the writes, this new method would use an _Iterator(KeyFamilyQualifier, 
Array[Byte], Long_), with the _Long_ being the version.

 

Definition of _bulkLoad_:
{code:java}
def bulkLoad[T](
rdd:RDD[T], 
tableName: TableName, 
flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte])], 
stagingDir:String, 
familyHFileWriteOptionsMap: util.Map[Array[Byte], FamilyHFileWriteOptions] = 
new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
compactionExclude: Boolean = false, 
maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
Definition of a _bulkLoadWithCustomVersions_ method:
{code:java}
def bulkLoadCustomVersions[T](rdd:RDD[T],
  tableName: TableName,
  flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte], 
Long)],
  stagingDir:String,
  familyHFileWriteOptionsMap:
  util.Map[Array[Byte], FamilyHFileWriteOptions] =
  new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
  compactionExclude: Boolean = false,
  maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
In case of an illogical version (for instance, a negative version), the method 
would fall back to the current timestamp.
{code:java}
val wl = writeValueToHFile(keyFamilyQualifier.rowKey,
  keyFamilyQualifier.family,
  keyFamilyQualifier.qualifier,
  cellValue,
  if (version > 0) version else nowTimeStamp,
  fs,
  conn,
  localTableName,
  conf,
  familyHFileWriteOptionsMapInternal,
  hfileCompression,
  writerMap,
  stagingDir){code}
See the attached file for the full proposed method.

  was:
The _bulkLoad_ methods of _class org.apache.hadoop.hbase.spark.HBaseContext_ 
use the system's current time for the version of the cells to bulk-load. This 
makes this method, and its twin _bulkLoadThinRows_, useless if you need to use 
your own versioning system:
{code:java}
//Here is where we finally iterate through the data in this partition of the 
//RDD that has been sorted and partitioned 
it.foreach{ 
case (keyFamilyQualifier, cellValue:Array[Byte]) => val wl = 
writeValueToHFile(
keyFamilyQualifier.rowKey, 
keyFamilyQualifier.family, 
keyFamilyQualifier.qualifier, 
cellValue, 
nowTimeStamp, 
fs, 
conn, 
localTableName, 
conf, 
familyHFileWriteOptionsMapInternal, 
hfileCompression, 
writerMap, 
stagingDir)
...{code}
 

Thus, I propose a third _bulkLoad_ method, based on the original method. 
Instead of using an _Iterator(KeyFamilyQualifier, Array[Byte])_ as the basis 
for the writes, this new method would use an _Iterator(KeyFamilyQualifier, 
Array[Byte], Long_), with the _Long_ being the version.

 

Definition of _bulkLoad_:
{code:java}
def bulkLoad[T](
rdd:RDD[T], 
tableName: TableName, 
flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte])], 
stagingDir:String, 
familyHFileWriteOptionsMap: util.Map[Array[Byte], FamilyHFileWriteOptions] = 
new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
compactionExclude: Boolean = false, 
maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
Definition of a _bulkLoadWithCustomVersions_ method:
{code:java}
def bulkLoadCustomVersions[T](rdd:RDD[T],
  tableName: TableName,
  flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte], 
Long)],
  stagingDir:String,
  familyHFileWriteOptionsMap:
  util.Map[Array[Byte], FamilyHFileWriteOptions] =
  new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
  compactionExclude: Boolean = false,
  maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
In case of an illogical version (for instance, a negative version), the method 
would fall back to the current timestamp.
{code:java}
val wl = 

[jira] [Updated] (HBASE-20748) HBaseContext bulkLoad: being able to use custom versions

2018-06-18 Thread Charles PORROT (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles PORROT updated HBASE-20748:
---
Description: 
The _bulkLoad_ methods of _class org.apache.hadoop.hbase.spark.HBaseContext_ 
use the system's current time for the version of the cells to bulk-load. This 
makes this method, and its twin _bulkLoadThinRows_, useless if you need to use 
your own versioning system:
{code:java}
//Here is where we finally iterate through the data in this partition of the 
//RDD that has been sorted and partitioned 
it.foreach{ 
case (keyFamilyQualifier, cellValue:Array[Byte]) => val wl = 
writeValueToHFile(
keyFamilyQualifier.rowKey, 
keyFamilyQualifier.family, 
keyFamilyQualifier.qualifier, 
cellValue, 
nowTimeStamp, 
fs, 
conn, 
localTableName, 
conf, 
familyHFileWriteOptionsMapInternal, 
hfileCompression, 
writerMap, 
stagingDir)
...{code}
 

Thus, I propose a third _bulkLoad_ method, based on the original method. 
Instead of using an _Iterator(KeyFamilyQualifier, Array[Byte])_ as the basis 
for the writes, this new method would use an _Iterator(KeyFamilyQualifier, 
Array[Byte], Long_), with the _Long_ being the version.

 

Definition of _bulkLoad_:
{code:java}
def bulkLoad[T](
rdd:RDD[T], 
tableName: TableName, 
flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte])], 
stagingDir:String, 
familyHFileWriteOptionsMap: util.Map[Array[Byte], FamilyHFileWriteOptions] = 
new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
compactionExclude: Boolean = false, 
maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
Definition of a _bulkLoadWithCustomVersions_ method:
{code:java}
def bulkLoadCustomVersions[T](rdd:RDD[T],
  tableName: TableName,
  flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte], 
Long)],
  stagingDir:String,
  familyHFileWriteOptionsMap:
  util.Map[Array[Byte], FamilyHFileWriteOptions] =
  new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
  compactionExclude: Boolean = false,
  maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
In case of an illogical version (for instance, a negative version), the method 
would fall back to the current timestamp.
{code:java}
val wl = writeValueToHFile(keyFamilyQualifier.rowKey,
  keyFamilyQualifier.family,
  keyFamilyQualifier.qualifier,
  cellValue,
  if (version > 0) version else nowTimeStamp,
  fs,
  conn,
  localTableName,
  conf,
  familyHFileWriteOptionsMapInternal,
  hfileCompression,
  writerMap,
  stagingDir){code}
See the attached file for the full proposed method.

  was:
The _bulkLoad_ methods of _class org.apache.hadoop.hbase.spark.HBaseContext_ 
use the system's current time for the version of the cells to bulk-load. This 
makes this method, and its twin _bulkLoadThinRows_, useless if you need to use 
your own versioning system:
{code:java}
//Here is where we finally iterate through the data in this partition of the 
//RDD that has been sorted and partitioned 
it.foreach{ 
case (keyFamilyQualifier, cellValue:Array[Byte]) => val wl = 
writeValueToHFile(
keyFamilyQualifier.rowKey, 
keyFamilyQualifier.family, 
keyFamilyQualifier.qualifier, 
cellValue, 
nowTimeStamp, 
fs, 
conn, 
localTableName, 
conf, 
familyHFileWriteOptionsMapInternal, 
hfileCompression, 
writerMap, 
stagingDir)
...{code}
 

Thus, I propose a third _bulkLoad_ method, based on the original method. 
Instead of using an _Iterator(KeyFamilyQualifier, Array[Byte])_ as the basis 
for the writes, this new method would use an _Iterator(KeyFamilyQualifier, 
Array[Byte], Long_), with the _Long_ being the version.

 

Definition of _bulkLoad_:
{code:java}
def bulkLoad[T](
rdd:RDD[T], 
tableName: TableName, 
flatMap: (T) => Iterator[(KeyFamilyQualifier, 
Array[Byte])], 
stagingDir:String, 
familyHFileWriteOptionsMap: util.Map[Array[Byte], FamilyHFileWriteOptions] = 
new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
compactionExclude: Boolean = false, 
maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
Definition of a _bulkLoadWithCustomVersions_ method:
{code:java}
def bulkLoadCustomVersions[T](rdd:RDD[T],
  tableName: TableName,
  flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte], 
Long)],
  stagingDir:String,
  familyHFileWriteOptionsMap:
  util.Map[Array[Byte], FamilyHFileWriteOptions] =
  new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
  compactionExclude: Boolean = false,
  maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
In case of an illogical version (for instance, a negative version), the method 
would fall back 

[jira] [Updated] (HBASE-20748) HBaseContext bulkLoad: being able to use custom versions

2018-06-18 Thread Charles PORROT (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles PORROT updated HBASE-20748:
---
Description: 
The _bulkLoad_ methods of _class org.apache.hadoop.hbase.spark.HBaseContext_ 
use the system's current time for the version of the cells to bulk-load. This 
makes this method, and its twin _bulkLoadThinRows_, useless if you need to use 
your own versioning system:
{code:java}
//Here is where we finally iterate through the data in this partition of the 
//RDD that has been sorted and partitioned 
it.foreach{ 
case (keyFamilyQualifier, cellValue:Array[Byte]) => val wl = 
writeValueToHFile(
keyFamilyQualifier.rowKey, 
keyFamilyQualifier.family, 
keyFamilyQualifier.qualifier, 
cellValue, 
nowTimeStamp, 
fs, 
conn, 
localTableName, 
conf, 
familyHFileWriteOptionsMapInternal, 
hfileCompression, 
writerMap, 
stagingDir)
...{code}
 

Thus, I propose a third _bulkLoad_ method, based on the original method. 
Instead of using an _Iterator(KeyFamilyQualifier, Array[Byte])_ as the basis 
for the writes, this new method would use an _Iterator(KeyFamilyQualifier, 
Array[Byte], Long_), with the _Long_ being the version.

 

Definition of _bulkLoad_:
{code:java}
def bulkLoad[T](
rdd:RDD[T], 
tableName: TableName, 
flatMap: (T) => Iterator[(KeyFamilyQualifier, 
Array[Byte])], 
stagingDir:String, 
familyHFileWriteOptionsMap: util.Map[Array[Byte], FamilyHFileWriteOptions] = 
new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
compactionExclude: Boolean = false, 
maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
Definition of a _bulkLoadWithCustomVersions_ method:
{code:java}
def bulkLoadCustomVersions[T](rdd:RDD[T],
  tableName: TableName,
  flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte], 
Long)],
  stagingDir:String,
  familyHFileWriteOptionsMap:
  util.Map[Array[Byte], FamilyHFileWriteOptions] =
  new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
  compactionExclude: Boolean = false,
  maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
In case of an illogical version (for instance, a negative version), the method 
would fall back to the current timestamp.
{code:java}
val wl = writeValueToHFile(keyFamilyQualifier.rowKey,
  keyFamilyQualifier.family,
  keyFamilyQualifier.qualifier,
  cellValue,
  if (version > 0) version else nowTimeStamp,
  fs,
  conn,
  localTableName,
  conf,
  familyHFileWriteOptionsMapInternal,
  hfileCompression,
  writerMap,
  stagingDir){code}
See the attached file for the full proposed method.

  was:
The _bulkLoad_ methods of _class org.apache.hadoop.hbase.spark.HBaseContext_ 
use the system's current time for the version of the cells to bulk-load. This 
makes this method, and its twin _bulkLoadThinRows_, useless if you need to use 
your own versioning system:
{code:java}
//Here is where we finally iterate through the data in this partition of the 
//RDD that has been sorted and partitioned 
it.foreach{ 
case (keyFamilyQualifier, cellValue:Array[Byte]) => val wl = 
writeValueToHFile(
keyFamilyQualifier.rowKey, 
keyFamilyQualifier.family, 
keyFamilyQualifier.qualifier, 
cellValue, 
nowTimeStamp, 
fs, 
conn, 
localTableName, 
conf, 
familyHFileWriteOptionsMapInternal, 
hfileCompression, 
writerMap, 
stagingDir)
...{code}
 

Thus, I propose a third _bulkLoad_ method, based on the original method. 
Instead of using an _Iterator(KeyFamilyQualifier, Array[Byte])_ as the basis 
for the writes, this new method would use an _Iterator(KeyFamilyQualifier, 
Array[Byte], Long_), with the _Long_ being the version.

In case of an illogical version (for instance, a negative version), the method 
would fall back to the current timestamp.

See the attached file for a proposal of this new _bulkLoad_ method.


> HBaseContext bulkLoad: being able to use custom versions
> 
>
> Key: HBASE-20748
> URL: https://issues.apache.org/jira/browse/HBASE-20748
> Project: HBase
>  Issue Type: Improvement
>  Components: spark
>Reporter: Charles PORROT
>Priority: Major
>  Labels: HBaseContext, bulkload, spark, versions
> Attachments: bulkLoadCustomVersions.scala
>
>
> The _bulkLoad_ methods of _class org.apache.hadoop.hbase.spark.HBaseContext_ 
> use the system's current time for the version of the cells to bulk-load. This 
> makes this method, and its twin _bulkLoadThinRows_, useless if you need to 
> use your own versioning system:
> {code:java}
> //Here is where we finally iterate through the data in this partition of the 
> //RDD that has 

[jira] [Updated] (HBASE-20748) HBaseContext bulkLoad: being able to use custom versions

2018-06-18 Thread Charles PORROT (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles PORROT updated HBASE-20748:
---
Description: 
The _bulkLoad_ methods of _class org.apache.hadoop.hbase.spark.HBaseContext_ 
use the system's current time for the version of the cells to bulk-load. This 
makes this method, and its twin _bulkLoadThinRows_, useless if you need to use 
your own versioning system:
{code:java}
//Here is where we finally iterate through the data in this partition of the 
//RDD that has been sorted and partitioned 
it.foreach{ 
case (keyFamilyQualifier, cellValue:Array[Byte]) => val wl = 
writeValueToHFile(
keyFamilyQualifier.rowKey, 
keyFamilyQualifier.family, 
keyFamilyQualifier.qualifier, 
cellValue, 
nowTimeStamp, 
fs, 
conn, 
localTableName, 
conf, 
familyHFileWriteOptionsMapInternal, 
hfileCompression, 
writerMap, 
stagingDir)
...{code}
 

Thus, I propose a third _bulkLoad_ method, based on the original method. 
Instead of using an _Iterator(KeyFamilyQualifier, Array[Byte])_ as the basis 
for the writes, this new method would use an _Iterator(KeyFamilyQualifier, 
Array[Byte], Long_), with the _Long_ being the version.

In case of an illogical version (for instance, a negative version), the method 
would fall back to the current timestamp.

See the attached file for a proposal of this new _bulkLoad_ method.

  was:
The _bulkLoad_ methods of _class org.apache.hadoop.hbase.spark.HBaseContext_ 
use the system's current time for the version of the cells to bulk-load. This 
makes this method, and its twin _bulkLoadThinRows_, useless if you need to use 
your own versionning system.

Thus, I propose a third _bulkLoad_ method, based on the original method. 
Instead of using an _Iterator(KeyFamilyQualifier, Array[Byte])_ as the basis 
for the writes, this new method would use an _Iterator(KeyFamilyQualifier, 
Array[Byte], Long_), with the _Long_ being the version.

In case of an illogical version (for instance, a negative version), the method 
would fall back to the current timestamp.

See the attached file for a proposal of this new _bulkLoad_ method.


> HBaseContext bulkLoad: being able to use custom versions
> 
>
> Key: HBASE-20748
> URL: https://issues.apache.org/jira/browse/HBASE-20748
> Project: HBase
>  Issue Type: Improvement
>  Components: spark
>Reporter: Charles PORROT
>Priority: Major
>  Labels: HBaseContext, bulkload, spark, versions
> Attachments: bulkLoadCustomVersions.scala
>
>
> The _bulkLoad_ methods of _class org.apache.hadoop.hbase.spark.HBaseContext_ 
> use the system's current time for the version of the cells to bulk-load. This 
> makes this method, and its twin _bulkLoadThinRows_, useless if you need to 
> use your own versioning system:
> {code:java}
> //Here is where we finally iterate through the data in this partition of the 
> //RDD that has been sorted and partitioned 
> it.foreach{ 
> case (keyFamilyQualifier, cellValue:Array[Byte]) => val wl = 
> writeValueToHFile(
> keyFamilyQualifier.rowKey, 
> keyFamilyQualifier.family, 
> keyFamilyQualifier.qualifier, 
> cellValue, 
> nowTimeStamp, 
> fs, 
> conn, 
> localTableName, 
> conf, 
> familyHFileWriteOptionsMapInternal, 
> hfileCompression, 
> writerMap, 
> stagingDir)
> ...{code}
>  
> Thus, I propose a third _bulkLoad_ method, based on the original method. 
> Instead of using an _Iterator(KeyFamilyQualifier, Array[Byte])_ as the basis 
> for the writes, this new method would use an _Iterator(KeyFamilyQualifier, 
> Array[Byte], Long_), with the _Long_ being the version.
> In case of an illogical version (for instance, a negative version), the method 
> would fall back to the current timestamp.
> See the attached file for a proposal of this new _bulkLoad_ method.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20748) HBaseContext bulkLoad: being able to use custom versions

2018-06-18 Thread Charles PORROT (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles PORROT updated HBASE-20748:
---
Description: 
The _bulkLoad_ methods of _class org.apache.hadoop.hbase.spark.HBaseContext_ 
use the system's current time for the version of the cells to bulk-load. This 
makes this method, and its twin _bulkLoadThinRows_, useless if you need to use 
your own versioning system.

Thus, I propose a third _bulkLoad_ method, based on the original method. 
Instead of using an _Iterator(KeyFamilyQualifier, Array[Byte])_ as the basis 
for the writes, this new method would use an _Iterator(KeyFamilyQualifier, 
Array[Byte], Long_), with the _Long_ being the version.

In case of an illogical version (for instance, a negative version), the method 
would fall back to the current timestamp.

See the attached file for a proposal of this new _bulkLoad_ method.

  was:
The _bulkLoad_ methods of _class org.apache.hadoop.hbase.spark.HBaseContext_ 
use the system's current time for the version of the cells to bulk-load.

This makes this method, and its twin _bulkLoadThinRows_, useless if you need to 
use your own versioning system.

Thus, I propose a third _bulkLoad_ method, based on the original method. 
Instead of using an _Iterator(KeyFamilyQualifier, Array[Byte])_ as the basis 
for the writes, this new method would use an _Iterator(KeyFamilyQualifier, 
Array[Byte], Long_), with the _Long_ being the version.

In case of an illogical version (for instance, a negative version), the method 
would fall back to the current timestamp.

See the attached file for a proposal of this new _bulkLoad_ method.


> HBaseContext bulkLoad: being able to use custom versions
> 
>
> Key: HBASE-20748
> URL: https://issues.apache.org/jira/browse/HBASE-20748
> Project: HBase
>  Issue Type: Improvement
>  Components: spark
>Reporter: Charles PORROT
>Priority: Major
>  Labels: HBaseContext, bulkload, spark, versions
> Attachments: bulkLoadCustomVersions.scala
>
>
> The _bulkLoad_ methods of _class org.apache.hadoop.hbase.spark.HBaseContext_ 
> use the system's current time for the version of the cells to bulk-load. This 
> makes this method, and its twin _bulkLoadThinRows_, useless if you need to 
> use your own versioning system.
> Thus, I propose a third _bulkLoad_ method, based on the original method. 
> Instead of using an _Iterator(KeyFamilyQualifier, Array[Byte])_ as the basis 
> for the writes, this new method would use an _Iterator(KeyFamilyQualifier, 
> Array[Byte], Long_), with the _Long_ being the version.
> In case of an illogical version (for instance, a negative version), the method 
> would fall back to the current timestamp.
> See the attached file for a proposal of this new _bulkLoad_ method.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20748) HBaseContext bulkLoad: being able to use custom versions

2018-06-18 Thread Charles PORROT (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles PORROT updated HBASE-20748:
---
Description: 
The _bulkLoad_ methods of _class org.apache.hadoop.hbase.spark.HBaseContext_ 
use the system's current time for the version of the cells to bulk-load.

This makes this method, and its twin _bulkLoadThinRows_, useless if you need to 
use your own versioning system.

Thus, I propose a third _bulkLoad_ method, based on the original method. 
Instead of using an _Iterator(KeyFamilyQualifier, Array[Byte])_ as the basis 
for the writes, this new method would use an _Iterator(KeyFamilyQualifier, 
Array[Byte], Long_), with the _Long_ being the version.

In case of an illogical version (for instance, a negative version), the method 
would fall back to the current timestamp.

See the attached file for a proposal of this new _bulkLoad_ method.

  was:
The _bulkLoad_ methods of _class org.apache.hadoop.hbase.spark_ use the 
system's current time for the version of the cells to bulk-load.

This makes this method, and its twin _bulkLoadThinRows_, useless if you need to 
use your own versioning system.

Thus, I propose a third _bulkLoad_ method, based on the original method. 
Instead of using an _Iterator(KeyFamilyQualifier, Array[Byte])_ as the basis 
for the writes, this new method would use an _Iterator(KeyFamilyQualifier, 
Array[Byte], Long_), with the _Long_ being the version.

In case of an illogical version (for instance, a negative version), the method 
would fall back to the current timestamp.

See the attached file for a proposal of this new _bulkLoad_ method.


> HBaseContext bulkLoad: being able to use custom versions
> 
>
> Key: HBASE-20748
> URL: https://issues.apache.org/jira/browse/HBASE-20748
> Project: HBase
>  Issue Type: Improvement
>  Components: spark
>Reporter: Charles PORROT
>Priority: Major
>  Labels: HBaseContext, bulkload, spark, versions
> Attachments: bulkLoadCustomVersions.scala
>
>
> The _bulkLoad_ methods of _class org.apache.hadoop.hbase.spark.HBaseContext_ 
> use the system's current time for the version of the cells to bulk-load.
> This makes this method, and its twin _bulkLoadThinRows_, useless if you need 
> to use your own versioning system.
> Thus, I propose a third _bulkLoad_ method, based on the original method. 
> Instead of using an _Iterator(KeyFamilyQualifier, Array[Byte])_ as the basis 
> for the writes, this new method would use an _Iterator(KeyFamilyQualifier, 
> Array[Byte], Long_), with the _Long_ being the version.
> In case of an illogical version (for instance, a negative version), the method 
> would fall back to the current timestamp.
> See the attached file for a proposal of this new _bulkLoad_ method.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20748) HBaseContext bulkLoad: being able to use custom versions

2018-06-18 Thread Charles PORROT (JIRA)
Charles PORROT created HBASE-20748:
--

 Summary: HBaseContext bulkLoad: being able to use custom versions
 Key: HBASE-20748
 URL: https://issues.apache.org/jira/browse/HBASE-20748
 Project: HBase
  Issue Type: Improvement
  Components: spark
Reporter: Charles PORROT
 Attachments: bulkLoadCustomVersions.scala

The _bulkLoad_ methods of _class org.apache.hadoop.hbase.spark_ use the 
system's current time for the version of the cells to bulk-load.

This makes this method, and its twin _bulkLoadThinRows_, useless if you need to 
use your own versioning system.

Thus, I propose a third _bulkLoad_ method, based on the original method. 
Instead of using an _Iterator(KeyFamilyQualifier, Array[Byte])_ as the basis 
for the writes, this new method would use an _Iterator(KeyFamilyQualifier, 
Array[Byte], Long_), with the _Long_ being the version.

In case of an illogical version (for instance, a negative version), the method 
would fall back to the current timestamp.

See the attached file for a proposal of this new _bulkLoad_ method.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20679) Add the ability to compile JSP dynamically in Jetty

2018-06-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515481#comment-16515481
 ] 

Hadoop QA commented on HBASE-20679:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
22s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
33s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
26s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
32s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
26s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: . {color} 
|
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
28s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
46s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  5m 
42s{color} | {color:red} root in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  8m  
3s{color} | {color:red} root in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  8m  3s{color} 
| {color:red} root in the patch failed. {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  2m 
30s{color} | {color:red} root: The patch generated 3 new + 43 unchanged - 0 
fixed = 46 total (was 43) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
3s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} shadedjars {color} | {color:red}  4m 
29s{color} | {color:red} patch has 10 errors when building our shaded 
downstream artifacts. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red}  5m  
2s{color} | {color:red} The patch causes 10 errors with Hadoop v2.7.4. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 10m 
19s{color} | {color:red} The patch causes 10 errors with Hadoop v3.0.0. {color} 
|
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: . {color} 
|
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
26s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}184m 20s{color} 
| {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
52s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}248m  5s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20679 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12928138/HBASE-20679.005.patch 
|
| Optional Tests |  asflicense  javac  javadoc  unit  shadedjars  hadoopcheck  
xml  

[jira] [Commented] (HBASE-20332) shaded mapreduce module shouldn't include hadoop

2018-06-18 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515471#comment-16515471
 ] 

Sean Busbey commented on HBASE-20332:
-

-v7
  - comment out the checkstyle based checking for HTrace v3

To make use of the message attribute on the suppression we'll need to update to 
[version 1.2 of the suppression 
DTD|https://checkstyle.org/dtds/suppressions_1_2.dtd] and move to a version of 
checkstyle that recognizes it, which is checkstyle 8.6+. We can't do that yet 
because [checkstyle #5279|https://github.com/checkstyle/checkstyle/issues/5279] 
is still open.

I've commented out the changes to use checkstyle to watch for htrace v3 and 
left them as TODO for when we can upgrade our checkstyle version. I think we're 
better off waiting for that than relying on the precommit 'hbaseanti' check 
that points out lines in the patch file. I am fine with doing both (the 
comments for later and the precommit patch grep) if anyone prefers.

bq. ugh. checkstyle in the precommit run broke with a complaint that the 
suppression isn't valid. but it works locally? trying to figure out the 
difference.

This was a cached hbase-checkstyle jar in my environment. The reason the 
checkstyle complaints weren't present was because the cached version had 
neither the new rule for {{org.apache.htrace}} nor the suppression that is 
causing the error in precommit.

> shaded mapreduce module shouldn't include hadoop
> 
>
> Key: HBASE-20332
> URL: https://issues.apache.org/jira/browse/HBASE-20332
> Project: HBase
>  Issue Type: Sub-task
>  Components: mapreduce, shading
>Affects Versions: 2.0.0
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Critical
> Fix For: 3.0.0, 2.1.0
>
> Attachments: HBASE-20332.0.patch, HBASE-20332.1.WIP.patch, 
> HBASE-20332.2.WIP.patch, HBASE-20332.3.patch, HBASE-20332.4.patch, 
> HBASE-20332.5.patch, HBASE-20332.6.patch, HBASE-20332.7.patch
>
>
> AFAICT, we should just entirely skip including hadoop in our shaded mapreduce 
> module
> 1) Folks expect to run yarn / mr apps via {{hadoop jar}} / {{yarn jar}}
> 2) those commands include all the needed Hadoop jars in your classpath by 
> default (both client side and in the containers)
> 3) If you try to use "user classpath first" for your job as a workaround 
> (e.g. for some library your application needs that hadoop provides) then our 
> inclusion of *some but not all* hadoop classes causes everything to fall 
> over because of mixing rewritten and non-rewritten hadoop classes
> 4) if you don't use "user classpath first" then all of our 
> non-relocated-but-still-shaded hadoop classes are ignored anyways so we're 
> just wasting space



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20332) shaded mapreduce module shouldn't include hadoop

2018-06-18 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-20332:

Attachment: HBASE-20332.7.patch

> shaded mapreduce module shouldn't include hadoop
> 
>
> Key: HBASE-20332
> URL: https://issues.apache.org/jira/browse/HBASE-20332
> Project: HBase
>  Issue Type: Sub-task
>  Components: mapreduce, shading
>Affects Versions: 2.0.0
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Critical
> Fix For: 3.0.0, 2.1.0
>
> Attachments: HBASE-20332.0.patch, HBASE-20332.1.WIP.patch, 
> HBASE-20332.2.WIP.patch, HBASE-20332.3.patch, HBASE-20332.4.patch, 
> HBASE-20332.5.patch, HBASE-20332.6.patch, HBASE-20332.7.patch
>
>
> AFAICT, we should just entirely skip including hadoop in our shaded mapreduce 
> module
> 1) Folks expect to run yarn / mr apps via {{hadoop jar}} / {{yarn jar}}
> 2) those commands include all the needed Hadoop jars in your classpath by 
> default (both client side and in the containers)
> 3) If you try to use "user classpath first" for your job as a workaround 
> (e.g. for some library your application needs that hadoop provides) then our 
> inclusion of *some but not all* hadoop classes causes everything to fall 
> over because of mixing rewritten and non-rewritten hadoop classes
> 4) if you don't use "user classpath first" then all of our 
> non-relocated-but-still-shaded hadoop classes are ignored anyways so we're 
> just wasting space



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20369) Document incompatibilities between HBase 1.x and HBase 2.0

2018-06-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515435#comment-16515435
 ] 

Hadoop QA commented on HBASE-20369:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
27s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
35s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} refguide {color} | {color:blue}  5m 
43s{color} | {color:blue} branch has no errors when building the reference 
guide. See footer for rendered docs, which you should manually inspect. {color} 
|
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch 4 line(s) with tabs. {color} |
| {color:blue}0{color} | {color:blue} refguide {color} | {color:blue}  5m  
7s{color} | {color:blue} patch has no errors when building the reference guide. 
See footer for rendered docs, which you should manually inspect. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 17m 22s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20369 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12928152/HBase-20369_v10.patch 
|
| Optional Tests |  asflicense  refguide  |
| uname | Linux 2e7791de1639 4.4.0-98-generic #121-Ubuntu SMP Tue Oct 10 
14:24:03 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / ac5bb8155b |
| maven | version: Apache Maven 3.5.3 
(3383c37e1f9e9b3bc3df5050c29c8aff9f295297; 2018-02-24T19:49:05Z) |
| refguide | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13291/artifact/patchprocess/branch-site/book.html
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13291/artifact/patchprocess/whitespace-tabs.txt
 |
| refguide | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13291/artifact/patchprocess/patch-site/book.html
 |
| Max. process+thread count | 93 (vs. ulimit of 1) |
| modules | C: . U: . |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13291/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> Document incompatibilities between HBase 1.x and HBase 2.0
> --
>
> Key: HBASE-20369
> URL: https://issues.apache.org/jira/browse/HBASE-20369
> Project: HBase
>  Issue Type: Bug
>  Components: documentation
>Reporter: Thiriguna Bharat Rao
>Assignee: Thiriguna Bharat Rao
>Priority: Critical
>  Labels: patch
> Attachments: HBASE-20369.patch, HBASE-20369_v1.patch, 
> HBASE-20369_v2.patch, HBASE-20369_v3.patch, HBASE-20369_v4.patch, 
> HBASE-20369_v5.patch, HBASE-20369_v6.patch, HBASE-20369_v7.patch, 
> HBASE-20369_v8.patch, HBASE-20369_v9.patch, HBase-20369_v10.patch, book.adoc
>
>
> Hi, 
> I compiled a draft document for the HBase incompatibilities from the raw 
> source content that was available on the HBase Beta 1 site. Can someone 
> please review and provide feedback or share your comments on this document?
> Appreciate your support and time.
>  
> Best Regards, 
> Triguna



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20704) Sometimes some compacted storefiles are not archived on region close

2018-06-18 Thread Francis Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515429#comment-16515429
 ] 

Francis Liu commented on HBASE-20704:
-

For 1.x, in the small window where the region hasn't yet been marked closed by 
the RS, the client will get an NPE when the scan tries to access the fs 
stream. The client will retry.

For the master branch it depends on whether the read is a pread or not. If it 
is a pread, the behavior is the same as 1.x. If not, the scan has its own 
reader, in which case the file will be removed while the reader is open. I 
have not tried this, but I believe it will end up getting a file-not-found 
exception once it hits the end of the hdfs block currently being read. In both 
cases the client should retry.

Are these acceptable outcomes? Or do you want the situation to be explicitly 
handled?
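
For illustration only, a minimal sketch of the outcome described above,
assuming the standard HBase client API; the configuration and table name are
made up, and the retry itself happens inside the client:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.client.Table;

    public class ScanRetrySketch {
      public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("t1"));
             ResultScanner scanner = table.getScanner(new Scan())) {
          // If the scan races with the close window above, the NPE /
          // file-not-found surfaces as a retryable failure and the client
          // retries against the re-opened region transparently.
          for (Result result : scanner) {
            // process rows
          }
        }
      }
    }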

> Sometimes some compacted storefiles are not archived on region close
> 
>
> Key: HBASE-20704
> URL: https://issues.apache.org/jira/browse/HBASE-20704
> Project: HBase
>  Issue Type: Bug
>  Components: Compaction
>Affects Versions: 3.0.0, 1.3.0, 1.4.0, 1.5.0, 2.0.0
>Reporter: Francis Liu
>Assignee: Francis Liu
>Priority: Critical
> Attachments: HBASE-20704.001.patch, HBASE-20704.002.patch
>
>
> During region close, compacted files which have not yet been archived by the 
> discharger are archived as part of the region closing process. It is 
> important that these files are wholly archived to ensure data consistency; 
> otherwise a storefile containing delete tombstones can be archived while 
> older storefiles containing cells that were supposed to be deleted are left 
> unarchived, thereby undeleting those cells. 
> On region close a compacted storefile is skipped from archiving if it has 
> read references (i.e. open scanners). This behavior is correct for when the 
> discharger chore runs, but on region close consistency is of course more 
> important, so we should add a special case to ignore any references on the 
> storefile and go ahead and archive it (sketched below). 
> Attached patch contains a unit test that reproduces the problem and the 
> proposed fix.
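
A minimal sketch of that special case, not the attached patch:
{{HStoreFile#isReferencedInReads()}} is a real reference check, while the
method itself and {{archive}} are hypothetical stand-ins for the store's
compacted-file bookkeeping and the HFile archiver:

    import java.io.IOException;
    import java.util.List;
    import org.apache.hadoop.hbase.regionserver.HStoreFile;

    // Hypothetical helper: when the region is closing, consistency wins
    // over open scanners, so read references are ignored and every
    // compacted storefile is archived.
    void archiveCompactedFiles(List<HStoreFile> compactedFiles,
        boolean regionClosing) throws IOException {
      for (HStoreFile file : compactedFiles) {
        if (regionClosing || !file.isReferencedInReads()) {
          archive(file); // hypothetical hand-off to the HFile archiver
        }
      }
    }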



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20369) Document incompatibilities between HBase 1.x and HBase 2.0

2018-06-18 Thread Thiriguna Bharat Rao (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515421#comment-16515421
 ] 

Thiriguna Bharat Rao commented on HBASE-20369:
--

[~elserj] Incorporated the feedback and uploaded the changes in the v10 patch. 
Many thanks. 

Best,

Triguna

 

> Document incompatibilities between HBase 1.x and HBase 2.0
> --
>
> Key: HBASE-20369
> URL: https://issues.apache.org/jira/browse/HBASE-20369
> Project: HBase
>  Issue Type: Bug
>  Components: documentation
>Reporter: Thiriguna Bharat Rao
>Assignee: Thiriguna Bharat Rao
>Priority: Critical
>  Labels: patch
> Attachments: HBASE-20369.patch, HBASE-20369_v1.patch, 
> HBASE-20369_v2.patch, HBASE-20369_v3.patch, HBASE-20369_v4.patch, 
> HBASE-20369_v5.patch, HBASE-20369_v6.patch, HBASE-20369_v7.patch, 
> HBASE-20369_v8.patch, HBASE-20369_v9.patch, HBase-20369_v10.patch, book.adoc
>
>
> Hi, 
> I compiled a draft document for the HBase incompatibilities from the raw 
> source content that was available on the HBase Beta 1 site. Can someone 
> please review and provide feedback or share your comments on this document?
> Appreciate your support and time.
>  
> Best Regards, 
> Triguna



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20369) Document incompatibilities between HBase 1.x and HBase 2.0

2018-06-18 Thread Thiriguna Bharat Rao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiriguna Bharat Rao updated HBASE-20369:
-
Attachment: HBase-20369_v10.patch

> Document incompatibilities between HBase 1.x and HBase 2.0
> --
>
> Key: HBASE-20369
> URL: https://issues.apache.org/jira/browse/HBASE-20369
> Project: HBase
>  Issue Type: Bug
>  Components: documentation
>Reporter: Thiriguna Bharat Rao
>Assignee: Thiriguna Bharat Rao
>Priority: Critical
>  Labels: patch
> Attachments: HBASE-20369.patch, HBASE-20369_v1.patch, 
> HBASE-20369_v2.patch, HBASE-20369_v3.patch, HBASE-20369_v4.patch, 
> HBASE-20369_v5.patch, HBASE-20369_v6.patch, HBASE-20369_v7.patch, 
> HBASE-20369_v8.patch, HBASE-20369_v9.patch, HBase-20369_v10.patch, book.adoc
>
>
> Hi, 
> I compiled a draft document for the HBase incompatibilities from the raw 
> source content that was available on the HBase Beta 1 site. Can someone 
> please review and provide feedback or share your comments on this document?
> Appreciate your support and time.
>  
> Best Regards, 
> Triguna



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20331) clean up shaded packaging for 2.1

2018-06-18 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515404#comment-16515404
 ] 

Hudson commented on HBASE-20331:


Results for branch HBASE-20331
[build #50 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20331/50/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20331/50//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20331/50//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20331/50//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> clean up shaded packaging for 2.1
> -
>
> Key: HBASE-20331
> URL: https://issues.apache.org/jira/browse/HBASE-20331
> Project: HBase
>  Issue Type: Umbrella
>  Components: Client, mapreduce, shading
>Affects Versions: 2.0.0
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Critical
> Fix For: 3.0.0, 2.1.0
>
>
> polishing pass on shaded modules for 2.0 based on trying to use them in more 
> contexts.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)