[jira] [Created] (HBASE-21879) Read HFile's block to ByteBuffer directly instead of to byte for reducing young gc purpose
Zheng Hu created HBASE-21879: Summary: Read HFile's block to ByteBuffer directly instead of to byte for reducing young gc purpose Key: HBASE-21879 URL: https://issues.apache.org/jira/browse/HBASE-21879 Project: HBase Issue Type: Improvement Reporter: Zheng Hu Assignee: Zheng Hu Fix For: 3.0.0, 2.2.0, 2.3.0, 2.1.4 In HFileBlock#readBlockDataInternal, we have the following: {code} @VisibleForTesting protected HFileBlock readBlockDataInternal(FSDataInputStream is, long offset, long onDiskSizeWithHeaderL, boolean pread, boolean verifyChecksum, boolean updateMetrics) throws IOException { // . // TODO: Make this ByteBuffer-based. Will make it easier to go to HDFS with BBPool (offheap). byte [] onDiskBlock = new byte[onDiskSizeWithHeader + hdrSize]; int nextBlockOnDiskSize = readAtOffset(is, onDiskBlock, preReadHeaderSize, onDiskSizeWithHeader - preReadHeaderSize, true, offset + preReadHeaderSize, pread); if (headerBuf != null) { // ... } // ... } {code} In the read path, we still read the block from the HFile into an on-heap byte[], then copy that on-heap byte[] to the off-heap bucket cache asynchronously. In my 100% get performance test, I also observed frequent young GCs; the largest memory footprint in the young gen should be the on-heap block byte[]. In fact, we can read the HFile's block into a ByteBuffer directly instead of into a byte[] to reduce young GC. We did not implement this before because the older HDFS client had no ByteBuffer reading interface, but 2.7+ supports this now, so I think we can fix it. I will provide a patch and a perf comparison for this. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
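As a rough illustration of the idea in HBASE-21879 above, the following sketch reads a block at a given offset straight into an off-heap (direct) ByteBuffer, so that no intermediate on-heap byte[] is allocated. It uses plain java.nio FileChannel on a local temp file, not the HDFS client or HFileBlock; the class DirectBlockReadSketch and the method readBlock are hypothetical names invented for this example, and the real patch may look quite different.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch: not HBase code, only the general "read into ByteBuffer" idea.
final class DirectBlockReadSketch {

  /** Reads exactly length bytes starting at offset into a direct (off-heap) buffer. */
  static ByteBuffer readBlock(FileChannel ch, long offset, int length) throws IOException {
    ByteBuffer buf = ByteBuffer.allocateDirect(length);
    long pos = offset;
    while (buf.hasRemaining()) {
      // Positional read: fills the buffer without moving the channel's own position.
      int n = ch.read(buf, pos);
      if (n < 0) {
        throw new IOException("Reached EOF before reading " + length + " bytes");
      }
      pos += n;
    }
    buf.flip(); // switch the buffer from writing to reading
    return buf;
  }

  public static void main(String[] args) throws IOException {
    Path tmp = Files.createTempFile("block", ".bin");
    try {
      Files.write(tmp, new byte[] {0, 1, 2, 3, 4, 5, 6, 7, 8, 9});
      try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
        ByteBuffer block = readBlock(ch, 4, 3);
        System.out.println(block.isDirect() + " " + block.remaining() + " " + block.get(0));
      }
    } finally {
      Files.deleteIfExists(tmp);
    }
  }
}
```

Running main should print "true 3 4": the buffer is off-heap, holds three bytes, and starts at the byte stored at offset 4. In a real read path the direct buffer would presumably come from a pool rather than being allocated per read.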
[jira] [Resolved] (HBASE-21870) Remove /0.94 content and add redirect rule
[ https://issues.apache.org/jira/browse/HBASE-21870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Somogyi resolved HBASE-21870. --- Resolution: Fixed > Remove /0.94 content and add redirect rule > -- > > Key: HBASE-21870 > URL: https://issues.apache.org/jira/browse/HBASE-21870 > Project: HBase > Issue Type: Sub-task > Components: website >Reporter: Peter Somogyi >Assignee: Peter Somogyi >Priority: Minor > > The 0.94 release is almost 4 years old, so it can be removed from > hbase.apache.org. To fix broken links, add a redirect rule to the .htaccess file. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21881) Use Forbidden API Checker to prevent future usages of forbidden api's
Nihal Jain created HBASE-21881: -- Summary: Use Forbidden API Checker to prevent future usages of forbidden api's Key: HBASE-21881 URL: https://issues.apache.org/jira/browse/HBASE-21881 Project: HBase Issue Type: Improvement Components: build Reporter: Nihal Jain -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Reopened] (HBASE-21854) Race condition in TestProcedureSkipPersistence
[ https://issues.apache.org/jira/browse/HBASE-21854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack reopened HBASE-21854: --- Reopen so I can pull back this nice fix to branch-2.0. > Race condition in TestProcedureSkipPersistence > --- > > Key: HBASE-21854 > URL: https://issues.apache.org/jira/browse/HBASE-21854 > Project: HBase > Issue Type: Bug > Components: proc-v2 >Affects Versions: 2.1.3 >Reporter: Peter Somogyi >Assignee: Peter Somogyi >Priority: Minor > Fix For: 3.0.0, 2.2.0, 2.3.0, 2.1.4 > > Attachments: HBASE-21854.patch > > > There is a race condition in TestProcedureSkipPersistence. After the > procedure is added, the test stops ProcedureExecutor. In some cases the > procedure is not added to the queue in time. > Failing execution: > {noformat} > 2019-02-06 14:18:11,133 INFO [Time-limited test] > procedure2.TimeoutExecutorThread(82): ADDED pid=-1, state=WAITING_TIMEOUT; > org.apache.hadoop.hbase.procedure2.CompletedProcedureCleaner; timeout=3, > timestamp=1549491521133 > 2019-02-06 14:18:11,135 INFO [PEWorker-1] > procedure2.TimeoutExecutorThread(82): ADDED pid=1, state=WAITING_TIMEOUT, > locked=true; > org.apache.hadoop.hbase.procedure2.TestProcedureSkipPersistence$TestProcedure; > timeout=2000, timestamp=1549491493135 > 2019-02-06 14:18:11,137 INFO [Time-limited test] hbase.Waiter(189): Waiting > up to [30,000] milli-secs(wait.for.ratio=[1]) > 2019-02-06 14:18:11,139 INFO [Time-limited test] > procedure2.ProcedureTestingUtility(125): RESTART - Stop > 2019-02-06 14:18:11,139 INFO [Time-limited test] > procedure2.ProcedureExecutor(702): Stopping > 2019-02-06 14:18:11,139 INFO [Time-limited test] wal.WALProcedureStore(331): > Stopping the WAL Procedure Store, isAbort=false > 2019-02-06 14:18:11,140 DEBUG [PEWorker-1] > procedure2.RootProcedureState(153): Add procedure pid=1, > state=WAITING_TIMEOUT, locked=true; > org.apache.hadoop.hbase.procedure2.TestProcedureSkipPersistence$TestProcedure > as the 0th rollback step > 2019-02-06 
14:18:11,141 WARN [PEWorker-1] > procedure2.ProcedureExecutor$WorkerThread(2074): Worker terminating > UNNATURALLY null > java.lang.RuntimeException: the store must be running before inserting data >at > org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.pushData(WALProcedureStore.java:710) >at > org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.update(WALProcedureStore.java:603) >at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.updateStoreOnExec(ProcedureExecutor.java:1943) >at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1809) >at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1481) >at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1200(ProcedureExecutor.java:78) >at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:2058) > 2019-02-06 14:18:11,145 INFO [Time-limited test] > procedure2.ProcedureTestingUtility(137): RESTART - Start{noformat} > In a successful run the ProcExecutor is stopped AFTER the procedure is > actually in the queue. 
> Successful: > {noformat} > 2019-02-07 15:48:08,731 INFO [Time-limited test] > procedure2.TimeoutExecutorThread(82): ADDED pid=-1, state=WAITING_TIMEOUT; > org.apache.hadoop.hbase.procedure2.CompletedProcedureCleaner; timeout=3, > timestamp=1549550918731 > 2019-02-07 15:48:08,731 INFO [PEWorker-1] > procedure2.TimeoutExecutorThread(82): ADDED pid=1, state=WAITING_TIMEOUT, > locked=true; > org.apache.hadoop.hbase.procedure2.TestProcedureSkipPersistence$TestProcedure; > timeout=2000, timestamp=1549550890731 > 2019-02-07 15:48:08,732 DEBUG [PEWorker-1] > procedure2.RootProcedureState(153): Add procedure pid=1, > state=WAITING_TIMEOUT, locked=true; > org.apache.hadoop.hbase.procedure2.TestProcedureSkipPersistence$TestProcedure > as the 0th rollback step > 2019-02-07 15:48:08,732 INFO [WALProcedureStoreSyncThread] > wal.WALProcedureStore(1217): Remove all state logs with ID less than 1, since > all the active procedures are in the latest log > 2019-02-07 15:48:08,733 DEBUG [WALProcedureStoreSyncThread] > wal.WALProcedureStore(1239): Removed > log=file:/Users/peter.somogyi/Cloudera/hbase/hbase-procedure/target/test-data/b9a1969a-85a4-15e8-7da5-6198f5acf2de/proc-logs/pv2-0001.log, > > activeLogs=[/Users/peter.somogyi/Cloudera/hbase/hbase-procedure/target/test-data/b9a1969a-85a4-15e8-7da5-6198f5acf2de/proc-logs/pv2-0002.log] > 2019-02-07 15:48:08,734 INFO [Time-limited test] hbase.Waiter(189): Waiting > up to [30,000] milli-secs(wait.for.ratio=[1]) > 2019-02-07 15:48:08,734 INFO [Time-limited test] >
Failure: HBase Generate Website
Build status: Failure The HBase website has not been updated to incorporate HBase commit ${CURRENT_HBASE_COMMIT}. See https://builds.apache.org/job/hbase_generate_website/1585/console
[jira] [Created] (HBASE-21880) [hbase-connectors] clean up site target
Peter Somogyi created HBASE-21880: - Summary: [hbase-connectors] clean up site target Key: HBASE-21880 URL: https://issues.apache.org/jira/browse/HBASE-21880 Project: HBase Issue Type: Improvement Components: hbase-connectors Affects Versions: connector-1.0.0 Reporter: Peter Somogyi Assignee: Peter Somogyi Fix For: connector-1.0.0 Site target in hbase-connectors complains when creating Dependency report for hbase-spark module. {noformat} [INFO] Generating "Dependencies" report --- maven-project-info-reports-plugin:2.9:dependencies [WARNING] Artifact jdk.tools:jdk.tools:jar:1.8 has no file and won't be listed in dependency files details. [WARNING] The repository url 'http://repository.codehaus.org' is invalid - Repository 'codehaus' will be blacklisted. [WARNING] The repository url 'file:${project.basedir}/src/site/resources/repo' is invalid - Repository 'project.local' will be blacklisted. [WARNING] The repository url 'https://nexus.codehaus.org/content/repositories/snapshots/' is invalid - Repository 'codehaus-nexus-snapshots' will be blacklisted. [WARNING] The repository url 'http://repository.springsource.com/maven/bundles/external' is invalid - Repository 'spring-external' will be blacklisted. [WARNING] The repository url 'http://snapshots.repository.codehaus.org' is invalid - Repository 'codehaus.org' will be blacklisted. [ERROR] Unable to determine if resource antlr:antlr:jar:2.7.7:test exists in http://maven.glassfish.org/content/groups/glassfish [ERROR] Unable to determine if resource antlr:antlr:jar:2.7.7:test exists in http://download.java.net/maven/glassfish [ERROR] Unable to determine if resource antlr:antlr:jar:2.7.7:test exists in http://download.java.net/maven/2{noformat} Due to a lot of ERRORS the build time is 22 minutes. By configuring maven-project-info-reports-plugin the build time goes down to 1:30 minutes. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (HBASE-21854) Race condition in TestProcedureSkipPersistence
[ https://issues.apache.org/jira/browse/HBASE-21854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-21854. --- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.0.5 Pushed to branch-2.0 also. (Thanks [~psomogyi] -- I just pushed it...) > Race condition in TestProcedureSkipPersistence > --- > > Key: HBASE-21854 > URL: https://issues.apache.org/jira/browse/HBASE-21854 > Project: HBase > Issue Type: Bug > Components: proc-v2 >Affects Versions: 2.1.3 >Reporter: Peter Somogyi >Assignee: Peter Somogyi >Priority: Minor > Fix For: 3.0.0, 2.2.0, 2.0.5, 2.3.0, 2.1.4 > > Attachments: HBASE-21854.patch > > > There is a race condition in TestProcedureSkipPersistence. After the > procedure is added, the test stops ProcedureExecutor. In some cases the > procedure is not added to the queue in time. > Failing execution: > {noformat} > 2019-02-06 14:18:11,133 INFO [Time-limited test] > procedure2.TimeoutExecutorThread(82): ADDED pid=-1, state=WAITING_TIMEOUT; > org.apache.hadoop.hbase.procedure2.CompletedProcedureCleaner; timeout=3, > timestamp=1549491521133 > 2019-02-06 14:18:11,135 INFO [PEWorker-1] > procedure2.TimeoutExecutorThread(82): ADDED pid=1, state=WAITING_TIMEOUT, > locked=true; > org.apache.hadoop.hbase.procedure2.TestProcedureSkipPersistence$TestProcedure; > timeout=2000, timestamp=1549491493135 > 2019-02-06 14:18:11,137 INFO [Time-limited test] hbase.Waiter(189): Waiting > up to [30,000] milli-secs(wait.for.ratio=[1]) > 2019-02-06 14:18:11,139 INFO [Time-limited test] > procedure2.ProcedureTestingUtility(125): RESTART - Stop > 2019-02-06 14:18:11,139 INFO [Time-limited test] > procedure2.ProcedureExecutor(702): Stopping > 2019-02-06 14:18:11,139 INFO [Time-limited test] wal.WALProcedureStore(331): > Stopping the WAL Procedure Store, isAbort=false > 2019-02-06 14:18:11,140 DEBUG [PEWorker-1] > procedure2.RootProcedureState(153): Add procedure pid=1, > state=WAITING_TIMEOUT, locked=true; > 
org.apache.hadoop.hbase.procedure2.TestProcedureSkipPersistence$TestProcedure > as the 0th rollback step > 2019-02-06 14:18:11,141 WARN [PEWorker-1] > procedure2.ProcedureExecutor$WorkerThread(2074): Worker terminating > UNNATURALLY null > java.lang.RuntimeException: the store must be running before inserting data >at > org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.pushData(WALProcedureStore.java:710) >at > org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.update(WALProcedureStore.java:603) >at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.updateStoreOnExec(ProcedureExecutor.java:1943) >at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1809) >at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1481) >at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1200(ProcedureExecutor.java:78) >at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:2058) > 2019-02-06 14:18:11,145 INFO [Time-limited test] > procedure2.ProcedureTestingUtility(137): RESTART - Start{noformat} > In a successful run the ProcExecutor is stopped AFTER the procedure is > actually in the queue. 
> Successful: > {noformat} > 2019-02-07 15:48:08,731 INFO [Time-limited test] > procedure2.TimeoutExecutorThread(82): ADDED pid=-1, state=WAITING_TIMEOUT; > org.apache.hadoop.hbase.procedure2.CompletedProcedureCleaner; timeout=3, > timestamp=1549550918731 > 2019-02-07 15:48:08,731 INFO [PEWorker-1] > procedure2.TimeoutExecutorThread(82): ADDED pid=1, state=WAITING_TIMEOUT, > locked=true; > org.apache.hadoop.hbase.procedure2.TestProcedureSkipPersistence$TestProcedure; > timeout=2000, timestamp=1549550890731 > 2019-02-07 15:48:08,732 DEBUG [PEWorker-1] > procedure2.RootProcedureState(153): Add procedure pid=1, > state=WAITING_TIMEOUT, locked=true; > org.apache.hadoop.hbase.procedure2.TestProcedureSkipPersistence$TestProcedure > as the 0th rollback step > 2019-02-07 15:48:08,732 INFO [WALProcedureStoreSyncThread] > wal.WALProcedureStore(1217): Remove all state logs with ID less than 1, since > all the active procedures are in the latest log > 2019-02-07 15:48:08,733 DEBUG [WALProcedureStoreSyncThread] > wal.WALProcedureStore(1239): Removed > log=file:/Users/peter.somogyi/Cloudera/hbase/hbase-procedure/target/test-data/b9a1969a-85a4-15e8-7da5-6198f5acf2de/proc-logs/pv2-0001.log, > > activeLogs=[/Users/peter.somogyi/Cloudera/hbase/hbase-procedure/target/test-data/b9a1969a-85a4-15e8-7da5-6198f5acf2de/proc-logs/pv2-0002.log] > 2019-02-07 15:48:08,734 INFO [Time-limited test] hbase.Waiter(189): Waiting > up to
Re: [VOTE] Second release candidate for HBase 2.1.3 is available for download
+1 (binding) Checked signatures, checksums: ok Apache Rat check: ok Build source: ok (jdk1.8.0_201) Unit tests: ok LTT 1M rows: ok Run in pseudo distributed mode: ok (Hadoop 2.7.7) Shell basic commands: ok Web UI: ok Tested RS Group: ok On Tue, Feb 12, 2019 at 7:48 AM Guanghao Zhang wrote: > +1 (binding) > > hbase-2.1.3-bin.tar.gz (jdk1.8.0_171) > - Verified sha512: ok > - Start HBase in standalone mode: ok > - Verified with shell, create/disable/enable/drop/get/put/scan/delete: ok > - Checked master/regionserver/region Web UI: ok > - PE write/read 100K rows: good > > hbase-2.1.3-src.tar.gz (jdk1.8.0_171) > - Verified sha512: ok > - Build tarball: ok > - Start HBase in standalone mode: ok > - Verified with shell, create/disable/enable/drop/get/put/scan/delete: ok > - Checked master/regionserver/region Web UI: ok > - PE write/read 100K rows: good > > 张铎(Duo Zhang) wrote on Mon, Feb 11, 2019 at 8:15 PM: > > > The second release candidate for HBase 2.1.3 is available for download: > > > > https://dist.apache.org/repos/dist/dev/hbase/hbase-2.1.3RC1 > > > > Maven artifacts are also available in a staging repository at: > > > > > https://repository.apache.org/content/repositories/orgapachehbase-1254/ > > > > Artifacts are signed with my key (9AD2AE49) published in our KEYS file at > > > > http://www.apache.org/dist/hbase/KEYS > > > > The RC corresponds to the signed tag 2.1.3RC1, which currently points to > > commit > > > > da5ec9e4c06c537213883cca8f3cc9a7c19daf67 > > > > HBase 2.1.3 is the fourth maintenance release in the HBase 2.1 line, > > continuing on the theme of bringing a stable, reliable database to the > > Hadoop and NoSQL communities. It fixes CVE-2018-1320 by upgrading thrift > > dependency from 0.9.3 to 0.12.0, all hbase users who use thrift are > highly > > recommended to upgrade. > > > > 2.1.3 includes ~60 bug and improvement fixes done since the 2.1.2. 
There > is > > an incompatible change, HBASE-21684, where we change the superclass of > > StoppedRpcClientException from HBaseIOException to DoNotRetryIOException, > > should be low risk, and feel free to contact us if this breaks anything > for > > you. > > > > The detailed source and binary compatibility report vs 2.1.2 has been > > published for your review, at: > > > > > > > > > https://dist.apache.org/repos/dist/dev/hbase/hbase-2.1.3RC1/compatibility_report_2.1.2vs2.1.3RC1.html > > > > The report shows no incompatibilities. > > > > The full list of fixes included in this release is available in the > > CHANGES.md that ships as part of the release also available here: > > > > https://dist.apache.org/repos/dist/dev/hbase/hbase-2.1.3RC1/CHANGES.md > > > > The RELEASENOTES.md are here: > > > > > > > https://dist.apache.org/repos/dist/dev/hbase/hbase-2.1.3RC1/RELEASENOTES.md > > > > Please try out this candidate and vote +1/-1 on whether we should release > > these artifacts as HBase 2.1.3. > > > > The VOTE will remain open for at least 72 hours. > > > > Thanks > > >
Re: Failure: HBase Generate Website
A pull request was just merged to hbase-site repo while this build was running so it failed to push changes. Restarted the job now. On Tue, Feb 12, 2019 at 3:47 PM Apache Jenkins Server < jenk...@builds.apache.org> wrote: > Build status: Failure > > The HBase website has not been updated to incorporate HBase commit > ${CURRENT_HBASE_COMMIT}. > > See https://builds.apache.org/job/hbase_generate_website/1585/console
[jira] [Resolved] (HBASE-21855) Backport HBASE-21838 (Create a special ReplicationEndpoint just for verifying the WAL entries are fine) to branch-1
[ https://issues.apache.org/jira/browse/HBASE-21855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell resolved HBASE-21855. Resolution: Fixed Fix Version/s: (was: 1.5.0) Just picked back the original patch with minor fixups. > Backport HBASE-21838 (Create a special ReplicationEndpoint just for verifying > the WAL entries are fine) to branch-1 > --- > > Key: HBASE-21855 > URL: https://issues.apache.org/jira/browse/HBASE-21855 > Project: HBase > Issue Type: Test > Components: Replication, test >Reporter: Andrew Purtell >Assignee: Andrew Purtell >Priority: Minor > > HBASE-21838 is a good idea and I want to enable it during ITBLL testing of > branch-1, so make an equivalent for that branch. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21882) NEW
Andrey Elenskiy created HBASE-21882: --- Summary: NEW Key: HBASE-21882 URL: https://issues.apache.org/jira/browse/HBASE-21882 Project: HBase Issue Type: Umbrella Reporter: Andrey Elenskiy -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21888) Add a isClosed method to AsyncConnection
Duo Zhang created HBASE-21888: - Summary: Add a isClosed method to AsyncConnection Key: HBASE-21888 URL: https://issues.apache.org/jira/browse/HBASE-21888 Project: HBase Issue Type: Task Components: asyncclient, Client Reporter: Duo Zhang Fix For: 3.0.0, 2.2.0, 2.3.0 Align with Connection interface. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21885) Cancel remote procedure call if the remote procedure is succeeded
Duo Zhang created HBASE-21885: - Summary: Cancel remote procedure call if the remote procedure is succeeded Key: HBASE-21885 URL: https://issues.apache.org/jira/browse/HBASE-21885 Project: HBase Issue Type: Improvement Components: proc-v2 Reporter: Duo Zhang I used to think it could only very rarely happen that a region server reports back to the master while the master cannot get the response from the region server, and only if there are strange network errors. But when implementing HBASE-21875, I found a way to reproduce the problem without any strange network issues. The first time, we send the request to the region server and it accepts the request, but before it returns, a network error breaks the connection, so the master tries to send the request to the region server again. But then the region server gets too busy and always returns CallQueueTooBigException, so the master retries forever, even though the region has already been opened on the region server. This does not only waste resources: later we may close the region on the region server, and once the region server recovers, it will receive an open region request and a close region request at the same time. I am not sure whether this causes any problems, but at least we have not thought about this condition yet. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
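The scenario in HBASE-21885 above can be sketched as a retry loop that also checks whether the remote side has already reported success, and stops resending once it has. This is only a toy model under stated assumptions: CancellableRetrySketch, markRemoteSucceeded and sendWithRetry are hypothetical names, the "send" is a plain callback rather than an RPC, and nothing here reflects how RSProcedureDispatcher is actually written.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

// Hypothetical sketch: stop retrying a remote call once the remote side reported success.
final class CancellableRetrySketch {
  // Set when the (simulated) region server reports that the procedure finished.
  private final AtomicBoolean remoteSucceeded = new AtomicBoolean(false);

  void markRemoteSucceeded() {
    remoteSucceeded.set(true);
  }

  /**
   * Calls sendOnce until it returns true, the remote side reports success,
   * or maxAttempts is reached. Returns the number of send attempts made.
   */
  int sendWithRetry(BooleanSupplier sendOnce, int maxAttempts) {
    int attempts = 0;
    while (attempts < maxAttempts && !remoteSucceeded.get()) {
      attempts++;
      if (sendOnce.getAsBoolean()) {
        break; // the call itself got through
      }
    }
    return attempts;
  }

  public static void main(String[] args) {
    CancellableRetrySketch sketch = new CancellableRetrySketch();
    int[] calls = {0};
    // Every send fails (think: server too busy), but on the third call the
    // server's success report arrives through another path.
    int attempts = sketch.sendWithRetry(() -> {
      if (++calls[0] == 3) {
        sketch.markRemoteSucceeded();
      }
      return false;
    }, 100);
    System.out.println(attempts);
  }
}
```

Running main should print 3: without the success check the loop would use all 100 attempts even though the work was already done remotely.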
[jira] [Created] (HBASE-21887) HBaseTestingUtility should not be IA.Public
Duo Zhang created HBASE-21887: - Summary: HBaseTestingUtility should not be IA.Public Key: HBASE-21887 URL: https://issues.apache.org/jira/browse/HBASE-21887 Project: HBase Issue Type: Task Components: Client, test Reporter: Duo Zhang It exposes too much internal stuff to end users, and it is not easy to keep the API stable, as we also use it to implement UTs. For end users, we should only expose an API to start an in-process mini HBase cluster, after which the user can create Connections to communicate with the cluster. We could also expose several APIs to start/stop/restart the master and region servers. I think this is enough? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21886) Run ITBLL for branch-2.2
Guanghao Zhang created HBASE-21886: -- Summary: Run ITBLL for branch-2.2 Key: HBASE-21886 URL: https://issues.apache.org/jira/browse/HBASE-21886 Project: HBase Issue Type: Sub-task Reporter: Guanghao Zhang -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Reopened] (HBASE-21374) Backport HBASE-21342 to branch-1
[ https://issues.apache.org/jira/browse/HBASE-21374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey reopened HBASE-21374: - my backport of this to branch-1.2 caused a findbugs failure. reopening to address. > Backport HBASE-21342 to branch-1 > > > Key: HBASE-21374 > URL: https://issues.apache.org/jira/browse/HBASE-21374 > Project: HBase > Issue Type: Sub-task >Reporter: Mike Drob >Assignee: mazhenlin >Priority: Major > Fix For: 1.5.0, 1.4.10, 1.3.4, 1.2.11 > > Attachments: HBASE-21374.branch-1.001.patch, > HBASE-21374.branch-1.002.patch, HBASE-21374.branch-1.003.patch, > HBASE-21374.branch-1.003.patch, HBASE-21374.branch-1.003.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21883) Enhancements to Major Compaction tool from HBASE-19528
Thiruvel Thirumoolan created HBASE-21883: Summary: Enhancements to Major Compaction tool from HBASE-19528 Key: HBASE-21883 URL: https://issues.apache.org/jira/browse/HBASE-21883 Project: HBase Issue Type: Improvement Components: Client, Compaction, tooling Affects Versions: 1.5.0 Reporter: Thiruvel Thirumoolan Assignee: Thiruvel Thirumoolan Fix For: 1.5.1 I would like to add new compaction tools based on [~churromorales]'s tool at HBASE-19528. We internally have tools that pick and compact regions based on multiple criteria. Since Rahul already has a version in the community, we would like to build on top of it instead of pushing yet another tool. With this jira, I would like to add a tool that looks at regions beyond their TTL and compacts them in an rsgroup. We have time series data, and those regions become dead after a while, so we compact them to save disk space. We also merge those empty regions to reduce load, but that tool comes later. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21884) Fix box/unbox findbugs warning in secure bulk load
Sean Busbey created HBASE-21884: --- Summary: Fix box/unbox findbugs warning in secure bulk load Key: HBASE-21884 URL: https://issues.apache.org/jira/browse/HBASE-21884 Project: HBase Issue Type: Task Affects Versions: 1.5.0, 1.4.10, 1.3.4, 1.2.11 Reporter: Sean Busbey Assignee: Sean Busbey {code} Reason Tests FindBugsmodule:hbase-server Boxed value is unboxed and then immediately reboxed in org.apache.hadoop.hbase.security.access.SecureBulkLoadEndpoint.incrementUgiReference(UserGroupInformation) At SecureBulkLoadEndpoint.java:then immediately reboxed in org.apache.hadoop.hbase.security.access.SecureBulkLoadEndpoint.incrementUgiReference(UserGroupInformation) At SecureBulkLoadEndpoint.java:[line 268] {code} Looking at branch-2 and master I suspect we're doing the same wasteful operation but findbugs can't see it through the lambda definition. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
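For readers unfamiliar with this kind of FindBugs warning, here is a small sketch of one way a boxed value ends up unboxed and immediately reboxed, next to a simpler alternative. It is not the code from SecureBulkLoadEndpoint, which is not shown in this message; RefCountSketch and its methods are hypothetical names, and whether FindBugs flags this exact shape is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical reference counter, only to illustrate the unbox-then-rebox pattern.
final class RefCountSketch {
  private final Map<String, Integer> refs = new HashMap<>();

  // A ternary mixing int (0) and Integer (old) has type int, so old is unboxed
  // by the ternary and the result is boxed again when assigned to an Integer.
  int incrementWithRebox(String key) {
    Integer old = refs.get(key);
    Integer current = (old == null) ? 0 : old; // unboxed, then immediately reboxed
    int next = current + 1;
    refs.put(key, next);
    return next;
  }

  // Alternative: let Map.merge handle the missing-key case and the addition.
  int incrementWithMerge(String key) {
    return refs.merge(key, 1, Integer::sum);
  }

  public static void main(String[] args) {
    RefCountSketch counter = new RefCountSketch();
    int a = counter.incrementWithRebox("ugi");
    int b = counter.incrementWithRebox("ugi");
    int c = counter.incrementWithMerge("ugi");
    System.out.println(a + " " + b + " " + c);
  }
}
```

Running main should print "1 2 3"; both methods behave the same, and the merge form simply avoids the explicit intermediate Integer.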
[jira] [Resolved] (HBASE-21374) Backport HBASE-21342 to branch-1
[ https://issues.apache.org/jira/browse/HBASE-21374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey resolved HBASE-21374. - Resolution: Fixed I'm going to fix this in a follow-on, since there are fix version(s) nearing RC. > Backport HBASE-21342 to branch-1 > > > Key: HBASE-21374 > URL: https://issues.apache.org/jira/browse/HBASE-21374 > Project: HBase > Issue Type: Sub-task >Reporter: Mike Drob >Assignee: mazhenlin >Priority: Major > Fix For: 1.4.10, 1.3.4, 1.2.11, 1.5.0 > > Attachments: HBASE-21374.branch-1.001.patch, > HBASE-21374.branch-1.002.patch, HBASE-21374.branch-1.003.patch, > HBASE-21374.branch-1.003.patch, HBASE-21374.branch-1.003.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Add to HBase slack channel
Hi, I was wondering if I could get access to the 'users' group in HBase user slack channel https://apache-hbase.slack.com Thanks Birinder Tiwana
[jira] [Created] (HBASE-21878) [hbase-connectors] Fix hbase-checkstyle version reference
Peter Somogyi created HBASE-21878: - Summary: [hbase-connectors] Fix hbase-checkstyle version reference Key: HBASE-21878 URL: https://issues.apache.org/jira/browse/HBASE-21878 Project: HBase Issue Type: Bug Components: hbase-connectors Affects Versions: connector-1.0.0 Reporter: Peter Somogyi Assignee: Peter Somogyi Fix For: connector-1.0.0 The connectors repo refers to hbase-checkstyle with ${project.version}. Inside the hbase-connectors repo this is incorrect; it has to use the HBase version instead. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21875) Change the retry logic in RSProcedureDispatcher to 'retry by default, only if xxx'
Duo Zhang created HBASE-21875: - Summary: Change the retry logic in RSProcedureDispatcher to 'retry by default, only if xxx' Key: HBASE-21875 URL: https://issues.apache.org/jira/browse/HBASE-21875 Project: HBase Issue Type: Improvement Reporter: Duo Zhang For now the logic is 'do not retry by default, retry only if xxx'. In executeProcedures we only throw a fixed set of exceptions, so we should change to retrying by default and check for the exceptions that we do not need to retry. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
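The policy change proposed in HBASE-21875 above can be sketched as a decision function that retries unless the failure belongs to a fixed set of non-retryable exception types. This is a minimal sketch under stated assumptions: RetryPolicySketch, DoNotRetrySketchException and the contents of the no-retry set are hypothetical, chosen only for illustration, and do not reflect the actual exceptions handled by RSProcedureDispatcher.

```java
import java.io.IOException;

// Hypothetical sketch of a "retry by default" policy with an explicit no-retry list.
final class RetryPolicySketch {

  // Stand-in for a failure type that must never be retried (assumed for this example).
  static class DoNotRetrySketchException extends IOException {
    DoNotRetrySketchException(String msg) {
      super(msg);
    }
  }

  // The fixed set of exception types for which retrying is pointless.
  private static final Class<?>[] NO_RETRY = {
    DoNotRetrySketchException.class,
    IllegalArgumentException.class
  };

  /** Returns false only for the listed types (and their subclasses); retries otherwise. */
  static boolean shouldRetry(Throwable t) {
    for (Class<?> c : NO_RETRY) {
      if (c.isInstance(t)) {
        return false;
      }
    }
    return true; // default: retry
  }

  public static void main(String[] args) {
    System.out.println(shouldRetry(new IOException("connection reset")));
    System.out.println(shouldRetry(new DoNotRetrySketchException("server is dead")));
  }
}
```

Running main should print true and then false. The point of inverting the default is that an unexpected exception type leads to another attempt instead of silently giving up.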
[jira] [Reopened] (HBASE-21873) IPCUtil.wrapException should keep the original exception types for all the connection exceptions
[ https://issues.apache.org/jira/browse/HBASE-21873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang reopened HBASE-21873: --- Assignee: Sergey Shelukhin (was: Duo Zhang) > IPCUtil.wrapException should keep the original exception types for all the > connection exceptions > > > Key: HBASE-21873 > URL: https://issues.apache.org/jira/browse/HBASE-21873 > Project: HBase > Issue Type: Bug >Affects Versions: 3.0.0, 2.2.0 >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Blocker > Fix For: 3.0.0, 2.2.0, 2.1.3, 2.0.5, 2.3.0 > > Attachments: HBASE-21862-forUT.patch, HBASE-21862-v1.patch, > HBASE-21862-v2.patch, HBASE-21862.patch > > > It's a classic bug, sort of... the call times out to open the region, but RS > actually processes it alright. It could also happen if the response didn't > make it back due to a network issue. > As a result region is opened on two servers. > There are some mitigations possible to narrow down the race window. > 1) Don't process expired open calls, fail them. Won't help for network issues. > 2) Don't ignore invalid RS state, kill it (YouAreDead exception) - but that > will require fixing other network races where master kills RS, which would > require adding state versioning to the protocol. > The fundamental fix though would require either > 1) an unknown failure from open to ascertain the state of the region from the > server. Again, this would probably require protocol changes to make sure we > ascertain the region is not opened, and also that the > already-failed-on-master open is NOT going to be processed if it's some queue > or even in transit on the network (via a nonce-like mechanism)? > 2) some form of a distributed lock per region, e.g. in ZK > 3) some form of 2PC? but the participant list cannot be determined in a > manner that's both scalable and guaranteed correct. Theoretically it could be > all RSes. 
> {noformat} > 2019-02-08 03:21:31,715 INFO [PEWorker-7] > procedure.MasterProcedureScheduler: Took xlock for pid=260626, ppid=260595, > state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=false; > TransitRegionStateProcedure table=table, > region=d0214809147e43dc6870005742d5d204, ASSIGN > 2019-02-08 03:21:31,758 INFO [PEWorker-7] > assignment.TransitRegionStateProcedure: Starting pid=260626, ppid=260595, > state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=true; > TransitRegionStateProcedure table=table, > region=d0214809147e43dc6870005742d5d204, ASSIGN; rit=OPEN, > location=server1,17020,1549567999303; forceNewPlan=false, retain=true > 2019-02-08 03:21:31,984 INFO [PEWorker-13] assignment.RegionStateStore: > pid=260626 updating hbase:meta row=d0214809147e43dc6870005742d5d204, > regionState=OPENING, regionLocation=server1,17020,1549623714617 > 2019-02-08 03:22:32,552 WARN [RSProcedureDispatcher-pool4-t3451] > assignment.RegionRemoteProcedureBase: The remote operation pid=260637, > ppid=260626, state=RUNNABLE, hasLock=false; > org.apache.hadoop.hbase.master.assignment.OpenRegionProcedure for region ... > to server server1,17020,1549623714617 failed > java.io.IOException: Call to server1/...:17020 failed on local exception: > org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=27191, > waitTime=60145, rpcTimeout=6 > at > org.apache.hadoop.hbase.ipc.IPCUtil.wrapException(IPCUtil.java:185) > at > org.apache.hadoop.hbase.ipc.AbstractRpcClient.onCallFinished(AbstractRpcClient.java:391) > ... > Caused by: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=27191, > waitTime=60145, rpcTimeout=6 > at > org.apache.hadoop.hbase.ipc.RpcConnection$1.run(RpcConnection.java:200) > ... 4 more > {noformat} > RS: > {noformat} > hbase-regionserver.log:2019-02-08 03:22:41,131 INFO > [RS_OPEN_REGION-regionserver/server1:17020-2] handler.AssignRegionHandler: > Open ...d0214809147e43dc6870005742d5d204. > ... 
> hbase-regionserver.log:2019-02-08 03:25:44,751 INFO > [RS_OPEN_REGION-regionserver/server1:17020-2] handler.AssignRegionHandler: > Opened ...d0214809147e43dc6870005742d5d204. > {noformat} > Retry: > {noformat} > 2019-02-08 03:22:32,967 INFO [PEWorker-6] > assignment.TransitRegionStateProcedure: Retry=1 of max=2147483647; > pid=260626, ppid=260595, > state=RUNNABLE:REGION_STATE_TRANSITION_CONFIRM_OPENED, hasLock=true; > TransitRegionStateProcedure table=table, > region=d0214809147e43dc6870005742d5d204, ASSIGN; rit=OPENING, > location=server1,17020,1549623714617 > 2019-02-08 03:22:33,084 INFO [PEWorker-6] > assignment.TransitRegionStateProcedure: Starting pid=260626, ppid=260595, > state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE,
[jira] [Resolved] (HBASE-21862) IPCUtil.wrapException should keep the original exception types for all the connection exceptions
[ https://issues.apache.org/jira/browse/HBASE-21862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Duo Zhang resolved HBASE-21862.
------------------------------
    Resolution: Fixed

> IPCUtil.wrapException should keep the original exception types for all the connection exceptions
> ------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-21862
>                 URL: https://issues.apache.org/jira/browse/HBASE-21862
>             Project: HBase
>          Issue Type: Bug
>   Affects Versions: 3.0.0, 2.2.0
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>            Priority: Blocker
>             Fix For: 3.0.0, 2.2.0, 2.1.3, 2.0.5, 2.3.0
>
>
> It's a classic bug, sort of... the call times out to open the region, but RS actually processes it alright. It could also happen if the response didn't make it back due to a network issue.
> As a result region is opened on two servers.
> There are some mitigations possible to narrow down the race window.
> 1) Don't process expired open calls, fail them. Won't help for network issues.
> 2) Don't ignore invalid RS state, kill it (YouAreDead exception) - but that will require fixing other network races where master kills RS, which would require adding state versioning to the protocol.
> The fundamental fix though would require either
> 1) an unknown failure from open to ascertain the state of the region from the server. Again, this would probably require protocol changes to make sure we ascertain the region is not opened, and also that the already-failed-on-master open is NOT going to be processed if it's some queue or even in transit on the network (via a nonce-like mechanism)?
> 2) some form of a distributed lock per region, e.g. in ZK
> 3) some form of 2PC? but the participant list cannot be determined in a manner that's both scalable and guaranteed correct. Theoretically it could be all RSes.
> {noformat}
> 2019-02-08 03:21:31,715 INFO [PEWorker-7] procedure.MasterProcedureScheduler: Took xlock for pid=260626, ppid=260595, state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=false; TransitRegionStateProcedure table=table, region=d0214809147e43dc6870005742d5d204, ASSIGN
> 2019-02-08 03:21:31,758 INFO [PEWorker-7] assignment.TransitRegionStateProcedure: Starting pid=260626, ppid=260595, state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=true; TransitRegionStateProcedure table=table, region=d0214809147e43dc6870005742d5d204, ASSIGN; rit=OPEN, location=server1,17020,1549567999303; forceNewPlan=false, retain=true
> 2019-02-08 03:21:31,984 INFO [PEWorker-13] assignment.RegionStateStore: pid=260626 updating hbase:meta row=d0214809147e43dc6870005742d5d204, regionState=OPENING, regionLocation=server1,17020,1549623714617
> 2019-02-08 03:22:32,552 WARN [RSProcedureDispatcher-pool4-t3451] assignment.RegionRemoteProcedureBase: The remote operation pid=260637, ppid=260626, state=RUNNABLE, hasLock=false; org.apache.hadoop.hbase.master.assignment.OpenRegionProcedure for region ... to server server1,17020,1549623714617 failed
> java.io.IOException: Call to server1/...:17020 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=27191, waitTime=60145, rpcTimeout=6
>     at org.apache.hadoop.hbase.ipc.IPCUtil.wrapException(IPCUtil.java:185)
>     at org.apache.hadoop.hbase.ipc.AbstractRpcClient.onCallFinished(AbstractRpcClient.java:391)
>     ...
> Caused by: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=27191, waitTime=60145, rpcTimeout=6
>     at org.apache.hadoop.hbase.ipc.RpcConnection$1.run(RpcConnection.java:200)
>     ... 4 more
> {noformat}
> RS:
> {noformat}
> hbase-regionserver.log:2019-02-08 03:22:41,131 INFO [RS_OPEN_REGION-regionserver/server1:17020-2] handler.AssignRegionHandler: Open ...d0214809147e43dc6870005742d5d204.
> ...
> hbase-regionserver.log:2019-02-08 03:25:44,751 INFO [RS_OPEN_REGION-regionserver/server1:17020-2] handler.AssignRegionHandler: Opened ...d0214809147e43dc6870005742d5d204.
> {noformat}
> Retry:
> {noformat}
> 2019-02-08 03:22:32,967 INFO [PEWorker-6] assignment.TransitRegionStateProcedure: Retry=1 of max=2147483647; pid=260626, ppid=260595, state=RUNNABLE:REGION_STATE_TRANSITION_CONFIRM_OPENED, hasLock=true; TransitRegionStateProcedure table=table, region=d0214809147e43dc6870005742d5d204, ASSIGN; rit=OPENING, location=server1,17020,1549623714617
> 2019-02-08 03:22:33,084 INFO [PEWorker-6] assignment.TransitRegionStateProcedure: Starting pid=260626, ppid=260595, state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=true; TransitRegionStateProcedure table=table, region=d0214809147e43dc6870005742d5d204, ASSIGN; rit=OPENING, location=null;
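
Mitigation 1 in the HBASE-21862 description ("Don't process expired open calls, fail them") could look roughly like the sketch below. It is only an illustration: the class, the exception, and the idea of passing the caller's RPC timeout alongside the request are all assumptions made here, not HBase code. The check uses the time the region server received the request, so, as the description notes, it does nothing for a response lost on the network.

```java
/**
 * Hypothetical sketch only: none of these names exist in HBase.
 * Illustrates "don't process expired open calls, fail them".
 */
public final class ExpiredOpenCallGuard {

  /** Raised instead of opening the region once the caller has given up. */
  public static final class CallExpiredException extends Exception {
    public CallExpiredException(String message) {
      super(message);
    }
  }

  private ExpiredOpenCallGuard() {
  }

  /** True if the request has waited at least as long as the caller's RPC timeout. */
  public static boolean isExpired(long receivedAtMs, long rpcTimeoutMs, long nowMs) {
    // A non-positive timeout is treated as "no timeout" (an assumption of this sketch).
    return rpcTimeoutMs > 0 && nowMs - receivedAtMs >= rpcTimeoutMs;
  }

  /** Intended to run right before the region server would start opening the region. */
  public static void checkNotExpired(String encodedRegionName, long receivedAtMs,
      long rpcTimeoutMs, long nowMs) throws CallExpiredException {
    if (isExpired(receivedAtMs, rpcTimeoutMs, nowMs)) {
      throw new CallExpiredException("Open of " + encodedRegionName + " waited "
          + (nowMs - receivedAtMs) + " ms, caller timeout was " + rpcTimeoutMs + " ms");
    }
  }
}
```

In the logs above the master gives up at 03:22:32 (waitTime=60145) and the region server logs the open at 03:22:41, roughly nine seconds later. If that delay was spent queued on the region server, a call such as `checkNotExpired(region, 0L, 60_000L, 69_000L)` would throw instead of opening the region. The 60 000 ms timeout is an assumption, since the log prints a truncated rpcTimeout value.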
[jira] [Created] (HBASE-21876) Kill the region server if we can not send request to it for a long time in RSProcedureDispatcher
Duo Zhang created HBASE-21876:
---------------------------------

             Summary: Kill the region server if we can not send request to it for a long time in RSProcedureDispatcher
                 Key: HBASE-21876
                 URL: https://issues.apache.org/jira/browse/HBASE-21876
             Project: HBase
          Issue Type: Improvement
          Components: proc-v2
            Reporter: Duo Zhang

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21877) Should not expose replication source metrics to ReplicationEndpoint
Guanghao Zhang created HBASE-21877:
--------------------------------------

             Summary: Should not expose replication source metrics to ReplicationEndpoint
                 Key: HBASE-21877
                 URL: https://issues.apache.org/jira/browse/HBASE-21877
             Project: HBase
          Issue Type: Bug
            Reporter: Guanghao Zhang
            Assignee: Guanghao Zhang

ReplicationEndpoint is a plugin that implements replication to other HBase clusters or to other systems. Currently, some replication source metrics are updated inside ReplicationEndpoint. As a result, it is easy to forget to update the replication source metrics, or to misuse them, when implementing a custom ReplicationEndpoint.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
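
One way to read the HBASE-21877 proposal is that the replication source, not the plugin, should own the metric updates. The sketch below illustrates that shape under stated assumptions: `MetricsOwningSource`, its nested `Endpoint` interface, and the three counters are hypothetical stand-ins invented for this example and are not the HBase ReplicationEndpoint API.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch; these are not HBase classes. */
public final class MetricsOwningSource {

  /** Stand-in for a replication plugin: it only ships entries and reports success. */
  public interface Endpoint {
    boolean replicate(List<String> entries);
  }

  private final Endpoint endpoint;
  private final AtomicLong shippedBatches = new AtomicLong();
  private final AtomicLong shippedEntries = new AtomicLong();
  private final AtomicLong failedBatches = new AtomicLong();

  public MetricsOwningSource(Endpoint endpoint) {
    this.endpoint = endpoint;
  }

  /** Ships one batch and records the outcome; the endpoint never sees the counters. */
  public boolean ship(List<String> entries) {
    boolean ok = endpoint.replicate(entries);
    if (ok) {
      shippedBatches.incrementAndGet();
      shippedEntries.addAndGet(entries.size());
    } else {
      failedBatches.incrementAndGet();
    }
    return ok;
  }

  public long getShippedBatches() {
    return shippedBatches.get();
  }

  public long getShippedEntries() {
    return shippedEntries.get();
  }

  public long getFailedBatches() {
    return failedBatches.get();
  }
}
```

Because the counters are private to the source, a custom endpoint written against this shape cannot forget to update them or update them incorrectly; it only returns whether the batch was shipped.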