[jira] [Created] (HBASE-21879) Read HFile's block to ByteBuffer directly instead of to byte for reducing young gc purpose

2019-02-12 Thread Zheng Hu (JIRA)
Zheng Hu created HBASE-21879:


 Summary: Read HFile's block to ByteBuffer directly instead of to 
byte for reducing young gc purpose
 Key: HBASE-21879
 URL: https://issues.apache.org/jira/browse/HBASE-21879
 Project: HBase
  Issue Type: Improvement
Reporter: Zheng Hu
Assignee: Zheng Hu
 Fix For: 3.0.0, 2.2.0, 2.3.0, 2.1.4


In HFileBlock#readBlockDataInternal, we have the following:
{code}
@VisibleForTesting
protected HFileBlock readBlockDataInternal(FSDataInputStream is, long offset,
    long onDiskSizeWithHeaderL, boolean pread, boolean verifyChecksum, boolean updateMetrics)
    throws IOException {
  // ...
  // TODO: Make this ByteBuffer-based. Will make it easier to go to HDFS with BBPool (offheap).
  byte[] onDiskBlock = new byte[onDiskSizeWithHeader + hdrSize];
  int nextBlockOnDiskSize = readAtOffset(is, onDiskBlock, preReadHeaderSize,
      onDiskSizeWithHeader - preReadHeaderSize, true, offset + preReadHeaderSize, pread);
  if (headerBuf != null) {
    // ...
  }
  // ...
}
{code}

In the read path, we still read the block from the HFile into an on-heap
byte[], then copy that byte[] to the off-heap bucket cache asynchronously. In
my 100% get performance test I also observed frequent young GCs; the largest
memory footprint in the young gen should be the on-heap block byte[].

In fact, we can read the HFile's block into a ByteBuffer directly instead of
into a byte[], to reduce young GC. We did not implement this before because the
older HDFS client had no ByteBuffer reading interface, but 2.7+ supports this
now, so I think we can fix it.
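
A minimal sketch of the idea, not the patch itself. It assumes a plain
java.nio FileChannel as a stand-in for the HDFS input stream, and the names
DirectBlockRead and readBlock are made up for illustration. The point is only
that the block bytes land in a direct (off-heap) ByteBuffer without an
intermediate on-heap byte[]:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class DirectBlockRead {
  /**
   * Reads exactly {@code size} bytes starting at {@code offset} into a direct
   * (off-heap) buffer. Hypothetical helper, not an HBase or HDFS API.
   */
  public static ByteBuffer readBlock(FileChannel ch, long offset, int size)
      throws IOException {
    ByteBuffer buf = ByteBuffer.allocateDirect(size);
    long pos = offset;
    // A positional read may return fewer bytes than requested, so loop.
    while (buf.hasRemaining()) {
      int n = ch.read(buf, pos);
      if (n < 0) {
        throw new IOException("Premature EOF: wanted " + size + " bytes at offset " + offset);
      }
      pos += n;
    }
    buf.flip(); // make the buffer readable from position 0
    return buf;
  }

  public static void main(String[] args) throws IOException {
    Path tmp = Files.createTempFile("block", ".bin");
    try {
      Files.write(tmp, new byte[] {0, 1, 2, 3, 4, 5, 6, 7});
      try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
        ByteBuffer block = readBlock(ch, 2, 4);
        // prints: true 4 2
        System.out.println(block.isDirect() + " " + block.remaining() + " " + block.get(0));
      }
    } finally {
      Files.deleteIfExists(tmp);
    }
  }
}
```

In a real read path the buffer would presumably come from a pool rather than
being allocated per block, otherwise the allocation cost just moves off-heap.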

Will provide a patch and some perf comparison for this.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (HBASE-21870) Remove /0.94 content and add redirect rule

2019-02-12 Thread Peter Somogyi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Somogyi resolved HBASE-21870.
---
Resolution: Fixed

> Remove /0.94 content and add redirect rule
> --
>
> Key: HBASE-21870
> URL: https://issues.apache.org/jira/browse/HBASE-21870
> Project: HBase
>  Issue Type: Sub-task
>  Components: website
>Reporter: Peter Somogyi
>Assignee: Peter Somogyi
>Priority: Minor
>
> The 0.94 release is almost 4 years old, so it can be removed from
> hbase.apache.org. To fix broken links, add a redirect rule to the .htaccess file.





[jira] [Created] (HBASE-21881) Use Forbidden API Checker to prevent future usages of forbidden api's

2019-02-12 Thread Nihal Jain (JIRA)
Nihal Jain created HBASE-21881:
--

 Summary: Use Forbidden API Checker to prevent future usages of 
forbidden api's
 Key: HBASE-21881
 URL: https://issues.apache.org/jira/browse/HBASE-21881
 Project: HBase
  Issue Type: Improvement
  Components: build
Reporter: Nihal Jain








[jira] [Reopened] (HBASE-21854) Race condition in TestProcedureSkipPersistence

2019-02-12 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack reopened HBASE-21854:
---

Reopen so I can pull back this nice fix to branch-2.0.

> Race condition in TestProcedureSkipPersistence 
> ---
>
> Key: HBASE-21854
> URL: https://issues.apache.org/jira/browse/HBASE-21854
> Project: HBase
>  Issue Type: Bug
>  Components: proc-v2
>Affects Versions: 2.1.3
>Reporter: Peter Somogyi
>Assignee: Peter Somogyi
>Priority: Minor
> Fix For: 3.0.0, 2.2.0, 2.3.0, 2.1.4
>
> Attachments: HBASE-21854.patch
>
>
> There is a race condition in TestProcedureSkipPersistence. After the 
> procedure is added, the test stops ProcedureExecutor. In some cases the 
> procedure is not added to the queue in time.
> Failing execution:
> {noformat}
> 2019-02-06 14:18:11,133 INFO  [Time-limited test] 
> procedure2.TimeoutExecutorThread(82): ADDED pid=-1, state=WAITING_TIMEOUT; 
> org.apache.hadoop.hbase.procedure2.CompletedProcedureCleaner; timeout=3, 
> timestamp=1549491521133
> 2019-02-06 14:18:11,135 INFO  [PEWorker-1] 
> procedure2.TimeoutExecutorThread(82): ADDED pid=1, state=WAITING_TIMEOUT, 
> locked=true; 
> org.apache.hadoop.hbase.procedure2.TestProcedureSkipPersistence$TestProcedure;
>  timeout=2000, timestamp=1549491493135
> 2019-02-06 14:18:11,137 INFO  [Time-limited test] hbase.Waiter(189): Waiting 
> up to [30,000] milli-secs(wait.for.ratio=[1])
> 2019-02-06 14:18:11,139 INFO  [Time-limited test] 
> procedure2.ProcedureTestingUtility(125): RESTART - Stop
> 2019-02-06 14:18:11,139 INFO  [Time-limited test] 
> procedure2.ProcedureExecutor(702): Stopping
> 2019-02-06 14:18:11,139 INFO  [Time-limited test] wal.WALProcedureStore(331): 
> Stopping the WAL Procedure Store, isAbort=false
> 2019-02-06 14:18:11,140 DEBUG [PEWorker-1] 
> procedure2.RootProcedureState(153): Add procedure pid=1, 
> state=WAITING_TIMEOUT, locked=true; 
> org.apache.hadoop.hbase.procedure2.TestProcedureSkipPersistence$TestProcedure 
> as the 0th rollback step
> 2019-02-06 14:18:11,141 WARN  [PEWorker-1] 
> procedure2.ProcedureExecutor$WorkerThread(2074): Worker terminating 
> UNNATURALLY null
> java.lang.RuntimeException: the store must be running before inserting data
>at 
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.pushData(WALProcedureStore.java:710)
>at 
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.update(WALProcedureStore.java:603)
>at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.updateStoreOnExec(ProcedureExecutor.java:1943)
>at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1809)
>at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1481)
>at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1200(ProcedureExecutor.java:78)
>at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:2058)
> 2019-02-06 14:18:11,145 INFO  [Time-limited test] 
> procedure2.ProcedureTestingUtility(137): RESTART - Start{noformat}
> In a successful run the ProcExecutor is stopped AFTER the procedure is 
> actually in the queue.
> Successful:
> {noformat}
> 2019-02-07 15:48:08,731 INFO  [Time-limited test] 
> procedure2.TimeoutExecutorThread(82): ADDED pid=-1, state=WAITING_TIMEOUT; 
> org.apache.hadoop.hbase.procedure2.CompletedProcedureCleaner; timeout=3, 
> timestamp=1549550918731
> 2019-02-07 15:48:08,731 INFO  [PEWorker-1] 
> procedure2.TimeoutExecutorThread(82): ADDED pid=1, state=WAITING_TIMEOUT, 
> locked=true; 
> org.apache.hadoop.hbase.procedure2.TestProcedureSkipPersistence$TestProcedure;
>  timeout=2000, timestamp=1549550890731
> 2019-02-07 15:48:08,732 DEBUG [PEWorker-1] 
> procedure2.RootProcedureState(153): Add procedure pid=1, 
> state=WAITING_TIMEOUT, locked=true; 
> org.apache.hadoop.hbase.procedure2.TestProcedureSkipPersistence$TestProcedure 
> as the 0th rollback step
> 2019-02-07 15:48:08,732 INFO  [WALProcedureStoreSyncThread] 
> wal.WALProcedureStore(1217): Remove all state logs with ID less than 1, since 
> all the active procedures are in the latest log
> 2019-02-07 15:48:08,733 DEBUG [WALProcedureStoreSyncThread] 
> wal.WALProcedureStore(1239): Removed 
> log=file:/Users/peter.somogyi/Cloudera/hbase/hbase-procedure/target/test-data/b9a1969a-85a4-15e8-7da5-6198f5acf2de/proc-logs/pv2-0001.log,
>  
> activeLogs=[/Users/peter.somogyi/Cloudera/hbase/hbase-procedure/target/test-data/b9a1969a-85a4-15e8-7da5-6198f5acf2de/proc-logs/pv2-0002.log]
> 2019-02-07 15:48:08,734 INFO  [Time-limited test] hbase.Waiter(189): Waiting 
> up to [30,000] milli-secs(wait.for.ratio=[1])
> 2019-02-07 15:48:08,734 INFO  [Time-limited test] 
> 

Failure: HBase Generate Website

2019-02-12 Thread Apache Jenkins Server
Build status: Failure

The HBase website has not been updated to incorporate HBase commit 
${CURRENT_HBASE_COMMIT}.

See https://builds.apache.org/job/hbase_generate_website/1585/console

[jira] [Created] (HBASE-21880) [hbase-connectors] clean up site target

2019-02-12 Thread Peter Somogyi (JIRA)
Peter Somogyi created HBASE-21880:
-

 Summary: [hbase-connectors] clean up site target
 Key: HBASE-21880
 URL: https://issues.apache.org/jira/browse/HBASE-21880
 Project: HBase
  Issue Type: Improvement
  Components: hbase-connectors
Affects Versions: connector-1.0.0
Reporter: Peter Somogyi
Assignee: Peter Somogyi
 Fix For: connector-1.0.0


The site target in hbase-connectors complains when creating the Dependencies
report for the hbase-spark module.
{noformat}
[INFO] Generating "Dependencies" report --- 
maven-project-info-reports-plugin:2.9:dependencies
[WARNING] Artifact jdk.tools:jdk.tools:jar:1.8 has no file and won't be listed 
in dependency files details.
[WARNING] The repository url 'http://repository.codehaus.org' is invalid - 
Repository 'codehaus' will be blacklisted.
[WARNING] The repository url 'file:${project.basedir}/src/site/resources/repo' 
is invalid - Repository 'project.local' will be blacklisted.
[WARNING] The repository url 
'https://nexus.codehaus.org/content/repositories/snapshots/' is invalid - 
Repository 'codehaus-nexus-snapshots' will be blacklisted.
[WARNING] The repository url 
'http://repository.springsource.com/maven/bundles/external' is invalid - 
Repository 'spring-external' will be blacklisted.
[WARNING] The repository url 'http://snapshots.repository.codehaus.org' is 
invalid - Repository 'codehaus.org' will be blacklisted.
[ERROR] Unable to determine if resource antlr:antlr:jar:2.7.7:test exists in 
http://maven.glassfish.org/content/groups/glassfish
[ERROR] Unable to determine if resource antlr:antlr:jar:2.7.7:test exists in 
http://download.java.net/maven/glassfish
[ERROR] Unable to determine if resource antlr:antlr:jar:2.7.7:test exists in 
http://download.java.net/maven/2{noformat}
Because of the many ERRORs the build takes 22 minutes. Configuring
maven-project-info-reports-plugin brings the build time down to 1:30 minutes.





[jira] [Resolved] (HBASE-21854) Race condition in TestProcedureSkipPersistence

2019-02-12 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-21854.
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.0.5

Pushed to branch-2.0 also.

(Thanks [~psomogyi] -- I just pushed it...)

> Race condition in TestProcedureSkipPersistence 
> ---
>
> Key: HBASE-21854
> URL: https://issues.apache.org/jira/browse/HBASE-21854
> Project: HBase
>  Issue Type: Bug
>  Components: proc-v2
>Affects Versions: 2.1.3
>Reporter: Peter Somogyi
>Assignee: Peter Somogyi
>Priority: Minor
> Fix For: 3.0.0, 2.2.0, 2.0.5, 2.3.0, 2.1.4
>
> Attachments: HBASE-21854.patch
>
>

Re: [VOTE] Second release candidate for HBase 2.1.3 is available for download

2019-02-12 Thread Peter Somogyi
+1 (binding)

Checked signatures, checksums: ok
Apache Rat check: ok
Build source: ok (jdk1.8.0_201)
Unit tests: ok
LTT 1M rows: ok
Run in pseudo distributed mode: ok (Hadoop 2.7.7)
Shell basic commands: ok
Web UI: ok
Tested RS Group: ok

On Tue, Feb 12, 2019 at 7:48 AM Guanghao Zhang  wrote:

> +1 (binding)
>
> hbase-2.1.3-bin.tar.gz (jdk1.8.0_171)
> - Verified sha512: ok
> - Start HBase in standalone mode: ok
> - Verified with shell, create/disable/enable/drop/get/put/scan/delete: ok
> - Checked master/regionserver/region Web UI: ok
> - PE write/read 100K rows: good
>
> hbase-2.1.3-src.tar.gz (jdk1.8.0_171)
> - Verified sha512: ok
> - Build tarball: ok
> - Start HBase in standalone mode: ok
> - Verified with shell, create/disable/enable/drop/get/put/scan/delete: ok
> - Checked master/regionserver/region Web UI: ok
> - PE write/read 100K rows: good
>
> On Mon, Feb 11, 2019 at 8:15 PM 张铎(Duo Zhang) wrote:
>
> > The second release candidate for HBase 2.1.3 is available for download:
> >
> >   https://dist.apache.org/repos/dist/dev/hbase/hbase-2.1.3RC1
> >
> > Maven artifacts are also available in a staging repository at:
> >
> >   https://repository.apache.org/content/repositories/orgapachehbase-1254/
> >
> > Artifacts are signed with my key (9AD2AE49) published in our KEYS file at
> >
> >  http://www.apache.org/dist/hbase/KEYS
> >
> > The RC corresponds to the signed tag 2.1.3RC1, which currently points to
> > commit
> >
> >   da5ec9e4c06c537213883cca8f3cc9a7c19daf67
> >
> > HBase 2.1.3 is the fourth maintenance release in the HBase 2.1 line,
> > continuing on the theme of bringing a stable, reliable database to the
> > Hadoop and NoSQL communities. It fixes CVE-2018-1320 by upgrading thrift
> > dependency from 0.9.3 to 0.12.0, all hbase users who use thrift are highly
> > recommended to upgrade.
> >
> > 2.1.3 includes ~60 bug and improvement fixes done since the 2.1.2. There is
> > an incompatible change, HBASE-21684, where we change the superclass of
> > StoppedRpcClientException from HBaseIOException to DoNotRetryIOException,
> > should be low risk, and feel free to contact us if this breaks anything for
> > you.
> >
> > The detailed source and binary compatibility report vs 2.1.2 has been
> > published for your review, at:
> >
> >   https://dist.apache.org/repos/dist/dev/hbase/hbase-2.1.3RC1/compatibility_report_2.1.2vs2.1.3RC1.html
> >
> > The report shows no incompatibilities.
> >
> > The full list of fixes included in this release is available in the
> > CHANGES.md that ships as part of the release also available here:
> >
> >   https://dist.apache.org/repos/dist/dev/hbase/hbase-2.1.3RC1/CHANGES.md
> >
> > The RELEASENOTES.md are here:
> >
> >   https://dist.apache.org/repos/dist/dev/hbase/hbase-2.1.3RC1/RELEASENOTES.md
> >
> > Please try out this candidate and vote +1/-1 on whether we should release
> > these artifacts as HBase 2.1.3.
> >
> > The VOTE will remain open for at least 72 hours.
> >
> > Thanks
> >
>


Re: Failure: HBase Generate Website

2019-02-12 Thread Peter Somogyi
A pull request was just merged to hbase-site repo while this build was
running so it failed to push changes. Restarted the job now.

On Tue, Feb 12, 2019 at 3:47 PM Apache Jenkins Server <
jenk...@builds.apache.org> wrote:

> Build status: Failure
>
> The HBase website has not been updated to incorporate HBase commit
> ${CURRENT_HBASE_COMMIT}.
>
> See https://builds.apache.org/job/hbase_generate_website/1585/console


[jira] [Resolved] (HBASE-21855) Backport HBASE-21838 (Create a special ReplicationEndpoint just for verifying the WAL entries are fine) to branch-1

2019-02-12 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-21855.

   Resolution: Fixed
Fix Version/s: (was: 1.5.0)

Just picked back the original patch with minor fixups.

> Backport HBASE-21838 (Create a special ReplicationEndpoint just for verifying 
> the WAL entries are fine) to branch-1
> ---
>
> Key: HBASE-21855
> URL: https://issues.apache.org/jira/browse/HBASE-21855
> Project: HBase
>  Issue Type: Test
>  Components: Replication, test
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
>
> HBASE-21838 is a good idea and I want to enable it during ITBLL testing of 
> branch-1, so make an equivalent for that branch.





[jira] [Created] (HBASE-21882) NEW

2019-02-12 Thread Andrey Elenskiy (JIRA)
Andrey Elenskiy created HBASE-21882:
---

 Summary: NEW
 Key: HBASE-21882
 URL: https://issues.apache.org/jira/browse/HBASE-21882
 Project: HBase
  Issue Type: Umbrella
Reporter: Andrey Elenskiy








[jira] [Created] (HBASE-21888) Add a isClosed method to AsyncConnection

2019-02-12 Thread Duo Zhang (JIRA)
Duo Zhang created HBASE-21888:
-

 Summary: Add a isClosed method to AsyncConnection
 Key: HBASE-21888
 URL: https://issues.apache.org/jira/browse/HBASE-21888
 Project: HBase
  Issue Type: Task
  Components: asyncclient, Client
Reporter: Duo Zhang
 Fix For: 3.0.0, 2.2.0, 2.3.0


Align with Connection interface.





[jira] [Created] (HBASE-21885) Cancel remote procedure call if the remote procedure is succeeded

2019-02-12 Thread Duo Zhang (JIRA)
Duo Zhang created HBASE-21885:
-

 Summary: Cancel remote procedure call if the remote procedure is 
succeeded
 Key: HBASE-21885
 URL: https://issues.apache.org/jira/browse/HBASE-21885
 Project: HBase
  Issue Type: Improvement
  Components: proc-v2
Reporter: Duo Zhang


I used to think it could only rarely happen that a region server can report
back to master while master cannot get the response from the region server,
and only if there are strange network errors. But when implementing
HBASE-21875, I found a way to reproduce the problem without any strange
network issues.

The first time, we send the request to the region server and it accepts the
request, but before it returns, a network error breaks the connection, so
master will try to send the request to the region server again. But then the
region server gets too busy and always returns CallQueueTooBigException, so
the master will retry forever, even if the region has already been opened on
the region server.

And this does not only waste resources: later we may close the region on the
region server, and if the region server is back, we will receive an open
region request and a close region request at the same time. Not sure if this
will cause any problems, but at least we haven't thought about this condition
yet.
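
To make the proposal concrete, here is a rough sketch of a retry loop that
gives up once the remote side has already reported success through another
path. Everything in it is hypothetical: CancellableRetry, dispatch and the two
suppliers are illustration names, not RSProcedureDispatcher code, and the
"always busy" region server is simulated by a send that never succeeds.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

public class CancellableRetry {
  /**
   * Keeps sending until the send succeeds, the remote side has reported
   * success through another path, or maxAttempts is reached.
   * Returns the number of sends that were actually issued.
   */
  public static int dispatch(BooleanSupplier send, BooleanSupplier remoteSucceeded,
      int maxAttempts) {
    int attempts = 0;
    while (attempts < maxAttempts) {
      if (remoteSucceeded.getAsBoolean()) {
        break; // the procedure already finished remotely: cancel the call
      }
      attempts++;
      if (send.getAsBoolean()) {
        break; // got a normal response
      }
    }
    return attempts;
  }

  public static void main(String[] args) {
    AtomicBoolean reported = new AtomicBoolean(false);
    int[] calls = {0};
    // Every send fails (connection broken, then call queue too big), but the
    // region server reports the region as opened around the second attempt.
    BooleanSupplier send = () -> {
      calls[0]++;
      if (calls[0] == 2) {
        reported.set(true);
      }
      return false;
    };
    // prints: 2 (without the remoteSucceeded check it would be 1000)
    System.out.println(dispatch(send, reported::get, 1000));
  }
}
```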





[jira] [Created] (HBASE-21887) HBaseTestingUtility should not be IA.Public

2019-02-12 Thread Duo Zhang (JIRA)
Duo Zhang created HBASE-21887:
-

 Summary: HBaseTestingUtility should not be IA.Public
 Key: HBASE-21887
 URL: https://issues.apache.org/jira/browse/HBASE-21887
 Project: HBase
  Issue Type: Task
  Components: Client, test
Reporter: Duo Zhang


It exposes too much internal stuff to end users, and it is not easy to keep
the API stable, as we also use it to implement UTs.

For end users, we should only expose an API to start an in-process mini HBase
cluster, and then users can create Connections to communicate with the cluster.
We could also expose several APIs to start/stop/restart the master and
regionservers. I think this is enough?





[jira] [Created] (HBASE-21886) Run ITBLL for branch-2.2

2019-02-12 Thread Guanghao Zhang (JIRA)
Guanghao Zhang created HBASE-21886:
--

 Summary: Run ITBLL for branch-2.2
 Key: HBASE-21886
 URL: https://issues.apache.org/jira/browse/HBASE-21886
 Project: HBase
  Issue Type: Sub-task
Reporter: Guanghao Zhang








[jira] [Reopened] (HBASE-21374) Backport HBASE-21342 to branch-1

2019-02-12 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey reopened HBASE-21374:
-

My backport of this to branch-1.2 caused a findbugs failure. Reopening to
address it.

> Backport HBASE-21342 to branch-1
> 
>
> Key: HBASE-21374
> URL: https://issues.apache.org/jira/browse/HBASE-21374
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Mike Drob
>Assignee: mazhenlin
>Priority: Major
> Fix For: 1.5.0, 1.4.10, 1.3.4, 1.2.11
>
> Attachments: HBASE-21374.branch-1.001.patch, 
> HBASE-21374.branch-1.002.patch, HBASE-21374.branch-1.003.patch, 
> HBASE-21374.branch-1.003.patch, HBASE-21374.branch-1.003.patch
>
>






[jira] [Created] (HBASE-21883) Enhancements to Major Compaction tool from HBASE-19528

2019-02-12 Thread Thiruvel Thirumoolan (JIRA)
Thiruvel Thirumoolan created HBASE-21883:


 Summary: Enhancements to Major Compaction tool from HBASE-19528
 Key: HBASE-21883
 URL: https://issues.apache.org/jira/browse/HBASE-21883
 Project: HBase
  Issue Type: Improvement
  Components: Client, Compaction, tooling
Affects Versions: 1.5.0
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
 Fix For: 1.5.1


I would like to add new compaction tools based on [~churromorales]'s tool at 
HBASE-19528.

We internally have tools that pick and compact regions based on multiple
criteria. Since Rahul already has a version in the community, we would like to
build on top of it instead of pushing yet another tool.

With this jira, I would like to add a tool which looks at regions beyond their
TTL and compacts them in an rsgroup. We have time series data and those regions
become dead after a while, so we compact them to save disk space. We also merge
those empty regions to reduce load, but that tool comes later.





[jira] [Created] (HBASE-21884) Fix box/unbox findbugs warning in secure bulk load

2019-02-12 Thread Sean Busbey (JIRA)
Sean Busbey created HBASE-21884:
---

 Summary: Fix box/unbox findbugs warning in secure bulk load
 Key: HBASE-21884
 URL: https://issues.apache.org/jira/browse/HBASE-21884
 Project: HBase
  Issue Type: Task
Affects Versions: 1.5.0, 1.4.10, 1.3.4, 1.2.11
Reporter: Sean Busbey
Assignee: Sean Busbey


{code}
Reason    Tests
FindBugs  module:hbase-server
          Boxed value is unboxed and then immediately reboxed in org.apache.hadoop.hbase.security.access.SecureBulkLoadEndpoint.incrementUgiReference(UserGroupInformation) At SecureBulkLoadEndpoint.java:[line 268]
{code}

Looking at branch-2 and master I suspect we're doing the same wasteful 
operation but findbugs can't see it through the lambda definition.
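
For readers who have not met this warning: a generic illustration of the
pattern, not the SecureBulkLoadEndpoint code. RefCounters, incrementBoxed and
incrementMutable are made-up names, and String keys stand in for
UserGroupInformation. When a ternary mixes an Integer and an int, its type is
int, so the boxed value is unboxed and the result is boxed again right away.
A mutable counter is one way to avoid boxing altogether:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

public class RefCounters {
  /** Boxed counter showing the unbox-then-rebox pattern. */
  public static int incrementBoxed(Map<String, Integer> refs, String key) {
    Integer old = refs.get(key);
    // The ternary has type int: 'old' is unboxed here, then the result is
    // boxed again on assignment to an Integer.
    Integer current = (old != null) ? old : 0;
    int next = current + 1;
    refs.put(key, next); // boxed once more for storage
    return next;
  }

  /** Mutable counter: no Integer objects are created on increment. */
  public static int incrementMutable(Map<String, AtomicInteger> refs, String key) {
    return refs.computeIfAbsent(key, k -> new AtomicInteger()).incrementAndGet();
  }

  public static void main(String[] args) {
    Map<String, Integer> boxed = new HashMap<>();
    Map<String, AtomicInteger> mutable = new HashMap<>();
    incrementBoxed(boxed, "ugi");
    incrementMutable(mutable, "ugi");
    // prints: 2 2
    System.out.println(incrementBoxed(boxed, "ugi") + " " + incrementMutable(mutable, "ugi"));
  }
}
```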





[jira] [Resolved] (HBASE-21374) Backport HBASE-21342 to branch-1

2019-02-12 Thread Sean Busbey (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey resolved HBASE-21374.
-
Resolution: Fixed

I'm going to fix this in a follow-on, since there are fix version(s) nearing RC.

> Backport HBASE-21342 to branch-1
> 
>
> Key: HBASE-21374
> URL: https://issues.apache.org/jira/browse/HBASE-21374
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Mike Drob
>Assignee: mazhenlin
>Priority: Major
> Fix For: 1.4.10, 1.3.4, 1.2.11, 1.5.0
>
> Attachments: HBASE-21374.branch-1.001.patch, 
> HBASE-21374.branch-1.002.patch, HBASE-21374.branch-1.003.patch, 
> HBASE-21374.branch-1.003.patch, HBASE-21374.branch-1.003.patch
>
>






Add to HBase slack channel

2019-02-12 Thread Birinder Tiwana
Hi,

I was wondering if I could get access to the 'users' group in  HBase user
slack channel

https://apache-hbase.slack.com

Thanks
Birinder Tiwana


[jira] [Created] (HBASE-21878) [hbase-connectors] Fix hbase-checkstyle version reference

2019-02-12 Thread Peter Somogyi (JIRA)
Peter Somogyi created HBASE-21878:
-

 Summary: [hbase-connectors] Fix hbase-checkstyle version reference
 Key: HBASE-21878
 URL: https://issues.apache.org/jira/browse/HBASE-21878
 Project: HBase
  Issue Type: Bug
  Components: hbase-connectors
Affects Versions: connector-1.0.0
Reporter: Peter Somogyi
Assignee: Peter Somogyi
 Fix For: connector-1.0.0


The connectors repo refers to hbase-checkstyle with ${project.version}. Inside
the hbase-connectors repo this is incorrect; it has to use the hbase version
instead.





[jira] [Created] (HBASE-21875) Change the retry logic in RSProcedureDispatcher to 'retry by default, only if xxx'

2019-02-12 Thread Duo Zhang (JIRA)
Duo Zhang created HBASE-21875:
-

 Summary: Change the retry logic in RSProcedureDispatcher to 'retry 
by default, only if xxx'
 Key: HBASE-21875
 URL: https://issues.apache.org/jira/browse/HBASE-21875
 Project: HBase
  Issue Type: Improvement
Reporter: Duo Zhang


For now the logic is 'do not retry by default, retry only if xxx'.

In executeProcedures, we will only throw a fixed set of exceptions, so we
should change to retry by default, and check for the exceptions which we do
not need to retry.
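
A small sketch of what 'retry by default' could look like. The exception
classes and the RetryByDefault name are placeholders invented for this
example; the real set would be whatever executeProcedures can actually throw.

```java
import java.io.IOException;

public class RetryByDefault {
  // Hypothetical exception types, stand-ins for the fixed set mentioned above.
  public static class NoRetryException extends IOException {}
  public static class ServerGoneException extends IOException {}

  // The only errors that stop retrying; anything else is retried.
  private static final Class<?>[] DO_NOT_RETRY = {
    NoRetryException.class, ServerGoneException.class
  };

  /** Returns true unless the error is an instance of a listed no-retry type. */
  public static boolean shouldRetry(Throwable error) {
    for (Class<?> type : DO_NOT_RETRY) {
      if (type.isInstance(error)) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // prints: true false
    System.out.println(shouldRetry(new IOException("connection reset"))
        + " " + shouldRetry(new NoRetryException()));
  }
}
```

With this shape, an exception type nobody anticipated gets retried rather than
silently treated as fatal, which is the behaviour change being asked for.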





[jira] [Reopened] (HBASE-21873) IPCUtil.wrapException should keep the original exception types for all the connection exceptions

2019-02-12 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang reopened HBASE-21873:
---
  Assignee: Sergey Shelukhin  (was: Duo Zhang)

> IPCUtil.wrapException should keep the original exception types for all the 
> connection exceptions
> 
>
> Key: HBASE-21873
> URL: https://issues.apache.org/jira/browse/HBASE-21873
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.2.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Blocker
> Fix For: 3.0.0, 2.2.0, 2.1.3, 2.0.5, 2.3.0
>
> Attachments: HBASE-21862-forUT.patch, HBASE-21862-v1.patch, 
> HBASE-21862-v2.patch, HBASE-21862.patch
>
>
> It's a classic bug, sort of... the call times out to open the region, but RS 
> actually processes it alright. It could also happen if the response didn't 
> make it back due to a network issue.
> As a result region is opened on two servers.
> There are some mitigations possible to narrow down the race window.
> 1) Don't process expired open calls, fail them. Won't help for network issues.
> 2) Don't ignore invalid RS state, kill it (YouAreDead exception) - but that 
> will require fixing other network races where master kills RS, which would 
> require adding state versioning to the protocol.
> The fundamental fix though would require either
> 1) an unknown failure from open to ascertain the state of the region from the 
> server. Again, this would probably require protocol changes to make sure we 
> ascertain the region is not opened, and also that the 
> already-failed-on-master open is NOT going to be processed if it's some queue 
> or even in transit on the network (via a nonce-like mechanism)?
> 2) some form of a distributed lock per region, e.g. in ZK
> 3) some form of 2PC? but the participant list cannot be determined in a 
> manner that's both scalable and guaranteed correct. Theoretically it could be 
> all RSes.
> {noformat}
> 2019-02-08 03:21:31,715 INFO  [PEWorker-7] 
> procedure.MasterProcedureScheduler: Took xlock for pid=260626, ppid=260595, 
> state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=false; 
> TransitRegionStateProcedure table=table, 
> region=d0214809147e43dc6870005742d5d204, ASSIGN
> 2019-02-08 03:21:31,758 INFO  [PEWorker-7] 
> assignment.TransitRegionStateProcedure: Starting pid=260626, ppid=260595, 
> state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=true; 
> TransitRegionStateProcedure table=table, 
> region=d0214809147e43dc6870005742d5d204, ASSIGN; rit=OPEN, 
> location=server1,17020,1549567999303; forceNewPlan=false, retain=true
> 2019-02-08 03:21:31,984 INFO  [PEWorker-13] assignment.RegionStateStore: 
> pid=260626 updating hbase:meta row=d0214809147e43dc6870005742d5d204, 
> regionState=OPENING, regionLocation=server1,17020,1549623714617
> 2019-02-08 03:22:32,552 WARN  [RSProcedureDispatcher-pool4-t3451] 
> assignment.RegionRemoteProcedureBase: The remote operation pid=260637, 
> ppid=260626, state=RUNNABLE, hasLock=false; 
> org.apache.hadoop.hbase.master.assignment.OpenRegionProcedure for region ... 
> to server server1,17020,1549623714617 failed
> java.io.IOException: Call to server1/...:17020 failed on local exception: 
> org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=27191, 
> waitTime=60145, rpcTimeout=6
> at org.apache.hadoop.hbase.ipc.IPCUtil.wrapException(IPCUtil.java:185)
> at org.apache.hadoop.hbase.ipc.AbstractRpcClient.onCallFinished(AbstractRpcClient.java:391)
> ...
> Caused by: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=27191, 
> waitTime=60145, rpcTimeout=6
> at org.apache.hadoop.hbase.ipc.RpcConnection$1.run(RpcConnection.java:200)
> ... 4 more
> {noformat}
> RS:
> {noformat}
> hbase-regionserver.log:2019-02-08 03:22:41,131 INFO  
> [RS_OPEN_REGION-regionserver/server1:17020-2] handler.AssignRegionHandler: 
> Open ...d0214809147e43dc6870005742d5d204.
> ...
> hbase-regionserver.log:2019-02-08 03:25:44,751 INFO  
> [RS_OPEN_REGION-regionserver/server1:17020-2] handler.AssignRegionHandler: 
> Opened ...d0214809147e43dc6870005742d5d204.
> {noformat}
> Retry:
> {noformat}
> 2019-02-08 03:22:32,967 INFO  [PEWorker-6] 
> assignment.TransitRegionStateProcedure: Retry=1 of max=2147483647; 
> pid=260626, ppid=260595, 
> state=RUNNABLE:REGION_STATE_TRANSITION_CONFIRM_OPENED, hasLock=true; 
> TransitRegionStateProcedure table=table, 
> region=d0214809147e43dc6870005742d5d204, ASSIGN; rit=OPENING, 
> location=server1,17020,1549623714617
> 2019-02-08 03:22:33,084 INFO  [PEWorker-6] 
> assignment.TransitRegionStateProcedure: Starting pid=260626, ppid=260595, 
> state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, 

[jira] [Resolved] (HBASE-21862) IPCUtil.wrapException should keep the original exception types for all the connection exceptions

2019-02-12 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-21862.
---
Resolution: Fixed

> IPCUtil.wrapException should keep the original exception types for all the 
> connection exceptions
> 
>
> Key: HBASE-21862
> URL: https://issues.apache.org/jira/browse/HBASE-21862
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.2.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Blocker
> Fix For: 3.0.0, 2.2.0, 2.1.3, 2.0.5, 2.3.0
>
>
> It's a classic bug, sort of: the call to open the region times out, but the 
> RS actually processes it successfully. It could also happen if the response 
> didn't make it back due to a network issue.
> As a result, the region is opened on two servers.
> There are some possible mitigations to narrow the race window.
> 1) Don't process expired open calls; fail them. This won't help with network 
> issues.
> 2) Don't ignore invalid RS state, kill it (YouAreDead exception) - but that 
> will require fixing other network races where master kills RS, which would 
> require adding state versioning to the protocol.
> The fundamental fix, though, would require one of the following:
> 1) on an unknown failure from open, ascertaining the state of the region from 
> the server. Again, this would probably require protocol changes to make sure 
> we ascertain that the region is not opened, and also that the 
> already-failed-on-master open is NOT going to be processed if it's in some 
> queue or even in transit on the network (via a nonce-like mechanism)?
> 2) some form of a distributed lock per region, e.g. in ZK
> 3) some form of 2PC? but the participant list cannot be determined in a 
> manner that's both scalable and guaranteed correct. Theoretically it could be 
> all RSes.
> {noformat}
> 2019-02-08 03:21:31,715 INFO  [PEWorker-7] 
> procedure.MasterProcedureScheduler: Took xlock for pid=260626, ppid=260595, 
> state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=false; 
> TransitRegionStateProcedure table=table, 
> region=d0214809147e43dc6870005742d5d204, ASSIGN
> 2019-02-08 03:21:31,758 INFO  [PEWorker-7] 
> assignment.TransitRegionStateProcedure: Starting pid=260626, ppid=260595, 
> state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=true; 
> TransitRegionStateProcedure table=table, 
> region=d0214809147e43dc6870005742d5d204, ASSIGN; rit=OPEN, 
> location=server1,17020,1549567999303; forceNewPlan=false, retain=true
> 2019-02-08 03:21:31,984 INFO  [PEWorker-13] assignment.RegionStateStore: 
> pid=260626 updating hbase:meta row=d0214809147e43dc6870005742d5d204, 
> regionState=OPENING, regionLocation=server1,17020,1549623714617
> 2019-02-08 03:22:32,552 WARN  [RSProcedureDispatcher-pool4-t3451] 
> assignment.RegionRemoteProcedureBase: The remote operation pid=260637, 
> ppid=260626, state=RUNNABLE, hasLock=false; 
> org.apache.hadoop.hbase.master.assignment.OpenRegionProcedure for region ... 
> to server server1,17020,1549623714617 failed
> java.io.IOException: Call to server1/...:17020 failed on local exception: 
> org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=27191, 
> waitTime=60145, rpcTimeout=6
> at org.apache.hadoop.hbase.ipc.IPCUtil.wrapException(IPCUtil.java:185)
> at org.apache.hadoop.hbase.ipc.AbstractRpcClient.onCallFinished(AbstractRpcClient.java:391)
> ...
> Caused by: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=27191, 
> waitTime=60145, rpcTimeout=6
> at org.apache.hadoop.hbase.ipc.RpcConnection$1.run(RpcConnection.java:200)
> ... 4 more
> {noformat}
> RS:
> {noformat}
> hbase-regionserver.log:2019-02-08 03:22:41,131 INFO  
> [RS_OPEN_REGION-regionserver/server1:17020-2] handler.AssignRegionHandler: 
> Open ...d0214809147e43dc6870005742d5d204.
> ...
> hbase-regionserver.log:2019-02-08 03:25:44,751 INFO  
> [RS_OPEN_REGION-regionserver/server1:17020-2] handler.AssignRegionHandler: 
> Opened ...d0214809147e43dc6870005742d5d204.
> {noformat}
> Retry:
> {noformat}
> 2019-02-08 03:22:32,967 INFO  [PEWorker-6] 
> assignment.TransitRegionStateProcedure: Retry=1 of max=2147483647; 
> pid=260626, ppid=260595, 
> state=RUNNABLE:REGION_STATE_TRANSITION_CONFIRM_OPENED, hasLock=true; 
> TransitRegionStateProcedure table=table, 
> region=d0214809147e43dc6870005742d5d204, ASSIGN; rit=OPENING, 
> location=server1,17020,1549623714617
> 2019-02-08 03:22:33,084 INFO  [PEWorker-6] 
> assignment.TransitRegionStateProcedure: Starting pid=260626, ppid=260595, 
> state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=true; 
> TransitRegionStateProcedure table=table, 
> region=d0214809147e43dc6870005742d5d204, ASSIGN; rit=OPENING, location=null; 
> 
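
The quoted HBASE-21862 description mentions two ideas that are easy to picture in code: failing open calls that have already expired, and a nonce-like mechanism so that an open the master has given up on is not processed later. The following is only a rough sketch of that idea, not HBase code; the class OpenRequestGuard, its methods, and the Decision enum are hypothetical names, and the deadline is assumed to be carried with the request.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical region-server-side guard for open requests; not part of HBase. */
final class OpenRequestGuard {
  enum Decision { PROCESS, REJECT_EXPIRED, REJECT_DUPLICATE, REJECT_CANCELLED }

  // nonce -> TRUE if the request was accepted, FALSE if the master cancelled it first.
  private final Map<Long, Boolean> seen = new ConcurrentHashMap<>();

  /**
   * Called on behalf of the master after a timeout or unknown failure.
   * Returns true if the nonce had not been accepted, meaning a late-arriving
   * request with this nonce will be refused; false if it was already accepted.
   */
  boolean cancel(long nonce) {
    Boolean prev = seen.putIfAbsent(nonce, Boolean.FALSE);
    return prev == null || !prev;
  }

  /** Decides whether an incoming open request may be processed. */
  Decision admit(long nonce, long deadlineMillis, long nowMillis) {
    if (nowMillis >= deadlineMillis) {
      // Mitigation 1 in the description: fail calls that have already expired.
      return Decision.REJECT_EXPIRED;
    }
    Boolean prev = seen.putIfAbsent(nonce, Boolean.TRUE);
    if (prev == null) {
      return Decision.PROCESS;
    }
    return prev ? Decision.REJECT_DUPLICATE : Decision.REJECT_CANCELLED;
  }
}
```

As the description itself notes, an expiry check alone does not cover the case where the response is lost on the network; the cancel step is what would let the master learn whether the open was, or ever will be, processed. A real implementation would also need to expire old nonces.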

[jira] [Created] (HBASE-21876) Kill the region server if we can not send request to it for a long time in RSProcedureDispatcher

2019-02-12 Thread Duo Zhang (JIRA)
Duo Zhang created HBASE-21876:
-

 Summary: Kill the region server if we can not send request to it 
for a long time in RSProcedureDispatcher 
 Key: HBASE-21876
 URL: https://issues.apache.org/jira/browse/HBASE-21876
 Project: HBase
  Issue Type: Improvement
  Components: proc-v2
Reporter: Duo Zhang








[jira] [Created] (HBASE-21877) Should not expose replication source metrics to ReplicationEndpoint

2019-02-12 Thread Guanghao Zhang (JIRA)
Guanghao Zhang created HBASE-21877:
--

 Summary: Should not expose replication source metrics to 
ReplicationEndpoint
 Key: HBASE-21877
 URL: https://issues.apache.org/jira/browse/HBASE-21877
 Project: HBase
  Issue Type: Bug
Reporter: Guanghao Zhang
Assignee: Guanghao Zhang


ReplicationEndpoint is a plugin that implements replication to other HBase 
clusters or to other systems. Currently, some replication source metrics are 
updated inside ReplicationEndpoint. As a result, it is easy to forget to update 
the replication source metrics, or to misuse them, when implementing a custom 
ReplicationEndpoint.


