Re: Moving 2.0 forward
I set HBASE-18169 as a Blocker because I found some critical problems in our current CPs. The semantics are broken. Although we are allowed to break CPs in a major release, I think we need to provide the same ability (in another way).

2017-07-25 1:25 GMT+08:00 Josh Elser :
>
> On 7/21/17 12:03 PM, Stack wrote:
>> Status update girls and boys!
>>
>> hbase-2.0.0-alpha1 went out June 22nd.
>>
>> alpha2 has been a bit slow to follow (holidays) though there has been
>> steady progress closing out blockers and criticals by a bunch of you all.
>> The plan is for a release in the first week or so of August. It should be
>> fully up on hbase-thirdparty using updated (and relocated) versions of
>> netty, guava, and protobuf as well as a default deploy that has
>> master-carrying-no-regions.
>>
>> alpha3 will follow soon after and will focus on making sure our
>> user-facing APIs are clean (branch-1 compatible, no illicit
>> removals/mods, and so on) and that basic upgrade 'works'.
>>
>> betas start in September?
>>
>> I've been keeping a rough general state here [1] (please update any
>> section that is lagging actuality) but for details on what blockers and
>> criticals remain, see the JIRA 2.0 view [2]. Recent issue-gardening has
>> brought 2.0 into better focus. Feel free to review and punt items you
>> think can wait till 3.0 or 2.1. If you want to pull in more stuff,
>> please ask first.
>
> Chia-Ping (I think? -- JIRA is being a pain) had asked on the space-quota
> phase2 work (include size of hbase snapshots in a table's "quota usage") if
> we should try to also include that work in 2.0.
>
> I like the idea of this also hitting 2.0 as it would make the feature a
> bit more "real", but am obviously a little nervous (I have no reason to be
> nervous though). I am pretty happy with the feature in terms of how much it
> is covered via testing.
>
> https://issues.apache.org/jira/browse/HBASE-17748
>
>> Thanks,
>> St.Ack
>>
>> 1. https://docs.google.com/document/d/1WCsVlnHjJeKUcl7wHwqb4z9iEu_ktczrlKHK8N4SZzs/edit#
>> 2. https://issues.apache.org/jira/projects/HBASE/versions/12327188
>
> - Josh
[jira] [Created] (HBASE-18453) CompactionRequest should not be exposed to user directly
Duo Zhang created HBASE-18453:
Summary: CompactionRequest should not be exposed to user directly
Key: HBASE-18453
URL: https://issues.apache.org/jira/browse/HBASE-18453
Project: HBase
Issue Type: Sub-task
Reporter: Duo Zhang

It is an implementation class, and we need to find another way to let users know when a compaction starts and ends.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
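The issue does not specify what the replacement will looklike; as a purely hypothetical sketch (these names are illustrative, not the actual HBase coprocessor API), the idea of "let users know when a compaction starts and ends" without leaking the implementation class could take the shape of a narrow callback interface:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a stable, minimal callback that exposes only the
// facts a coprocessor user needs at compaction start/end, instead of the
// IA.Private CompactionRequest implementation class.
public class CompactionEvents {

    interface CompactionLifecycleObserver {
        // Invoked when a compaction is about to run on a store.
        void preCompact(String tableName, String family, int filesToCompact);

        // Invoked after the compaction finishes.
        void postCompact(String tableName, String family, boolean succeeded);
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        CompactionLifecycleObserver observer = new CompactionLifecycleObserver() {
            @Override
            public void preCompact(String table, String family, int files) {
                log.add("pre:" + table + ":" + family + ":" + files);
            }
            @Override
            public void postCompact(String table, String family, boolean ok) {
                log.add("post:" + table + ":" + family + ":" + ok);
            }
        };

        // In HBase the region server would invoke these hooks around the
        // real compaction; here we simulate one compaction of 3 files.
        observer.preCompact("t1", "f", 3);
        observer.postCompact("t1", "f", true);

        System.out.println(String.join(",", log));
    }
}
```

The point of the sketch is that the observer sees only stable value types (strings, counts, a success flag), so the internals of the request object can change freely between releases.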
[jira] [Created] (HBASE-18452) VerifyReplication by Snapshot should cache HDFS token before submit job for kerberos env.
Zheng Hu created HBASE-18452:
Summary: VerifyReplication by Snapshot should cache HDFS token before submit job for kerberos env.
Key: HBASE-18452
URL: https://issues.apache.org/jira/browse/HBASE-18452
Project: HBase
Issue Type: Bug
Reporter: Zheng Hu
Assignee: Zheng Hu

I've ported HBASE-16466 to our internal hbase branch, and tested the feature under our kerberos cluster. The problem we encountered is:

{code}
17/07/25 21:21:23 INFO mapreduce.Job: Task Id : attempt_1500987232138_0004_m_03_2, Status : FAILED
Error: java.io.IOException: Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]; Host Details : local host is: "hadoop-yarn-host"; destination host is: "hadoop-namenode-host":15200;
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:775)
    at org.apache.hadoop.ipc.Client.call(Client.java:1481)
    at org.apache.hadoop.ipc.Client.call(Client.java:1408)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
    at com.sun.proxy.$Proxy13.getFileInfo(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:807)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2029)
    at org.apache.hadoop.hdfs.DistributedFileSystem$20.doCall(DistributedFileSystem.java:1195)
    at org.apache.hadoop.hdfs.DistributedFileSystem$20.doCall(DistributedFileSystem.java:1191)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1207)
    at org.apache.hadoop.hbase.regionserver.HRegionFileSystem.checkRegionInfoOnFilesystem(HRegionFileSystem.java:778)
    at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:769)
    at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:748)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:5188)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:5153)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:5125)
    at org.apache.hadoop.hbase.client.ClientSideRegionScanner.<init>(ClientSideRegionScanner.java:60)
    at org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormatImpl$RecordReader.initialize(TableSnapshotInputFormatImpl.java:191)
    at org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormat$TableSnapshotRegionRecordReader.initialize(TableSnapshotInputFormat.java:148)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:552)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:790)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1885)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
    at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:688)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1885)
    at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:651)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:738)
    at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:370)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1530)
    at org.apache.hadoop.ipc.Client.call(Client.java:1447)
    ... 33 more
Caused by: org.apache.hadoop.security.AccessControlException: Client c
{code}
[jira] [Created] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
Jean-Marc Spaggiari created HBASE-18451:
Summary: PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request
Key: HBASE-18451
URL: https://issues.apache.org/jira/browse/HBASE-18451
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 2.0.0-alpha-1
Reporter: Jean-Marc Spaggiari

If you run a big job every 4 hours, impacting many tables (they have 150 regions per server), at the end all the regions might have some data to be flushed, and we want, after one hour, to trigger a periodic flush. That's totally fine.

Now, to avoid a flush storm, when we detect a region to be flushed, we add a "randomDelay" to the delayed flush; that way we spread them out. RANGE_OF_DELAY is 5 minutes, so we spread the flushes over the next 5 minutes, which is very good.

However, because we don't check whether there is already a request in the queue, 10 seconds later we create a new request with a new randomDelay. If you generate a randomDelay every 10 seconds, at some point you will end up with a small one, and the flush will be triggered almost immediately. As a result, instead of spreading all the flushes over the next 5 minutes, you end up getting them all much more quickly, like within the first minute, which defeats the purpose of the randomDelay.

{code}
@Override
protected void chore() {
  final StringBuffer whyFlush = new StringBuffer();
  for (Region r : this.server.onlineRegions.values()) {
    if (r == null) continue;
    if (((HRegion)r).shouldFlush(whyFlush)) {
      FlushRequester requester = server.getFlushRequester();
      if (requester != null) {
        long randomDelay = RandomUtils.nextInt(RANGE_OF_DELAY) + MIN_DELAY_TIME;
        LOG.info(getName() + " requesting flush of " +
          r.getRegionInfo().getRegionNameAsString() + " because " +
          whyFlush.toString() + " after random delay " + randomDelay + "ms");
        // Throttle the flushes by putting a delay. If we don't throttle, and there
        // is a balanced write-load on the regions in a table, we might end up
        // overwhelming the filesystem with too many flushes at once.
        requester.requestDelayedFlush(r, randomDelay, false);
      }
    }
  }
}
{code}

{code}
2017-07-24 18:44:33,338 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 270785ms
2017-07-24 18:44:43,328 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 200143ms
2017-07-24 18:44:53,954 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 191082ms
2017-07-24 18:45:03,528 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 92532ms
2017-07-24 18:45:14,201 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 238780ms
2017-07-24 18:45:24,195 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 35390ms
2017-07-24 18:45:33,362 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 283034ms
2017-07-24 18:45:43,933 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 84328ms
2017-07-24 18:45:53,866 INFO org.apache.hadoop.hbase.regionserver.HRegio
{code}
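The fix the report implies — skip enqueuing a new delayed flush when one is already pending for the region, so the random delay is rolled only once — can be sketched outside of HBase. The class and method names below are illustrative, not the actual FlushRequester API:

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

// Minimal sketch of the proposed behavior: the chore tracks regions that
// already have a delayed flush queued, so re-running the chore every 10
// seconds cannot re-roll the random delay and defeat the spreading.
public class DelayedFlushQueue {
    private static final int MIN_DELAY_MS = 0;
    private static final int RANGE_OF_DELAY_MS = 5 * 60 * 1000; // spread over 5 minutes

    private final Set<String> pending = new HashSet<>();
    private final Random random = new Random();

    // Returns the delay chosen for the region, or -1 if a request is
    // already queued (the original delay stays in effect).
    public long requestDelayedFlush(String regionName) {
        if (!pending.add(regionName)) {
            return -1;
        }
        return MIN_DELAY_MS + random.nextInt(RANGE_OF_DELAY_MS);
    }

    // The flush thread calls this once the flush actually runs,
    // allowing a future periodic flush to be scheduled.
    public void onFlushed(String regionName) {
        pending.remove(regionName);
    }

    public static void main(String[] args) {
        DelayedFlushQueue queue = new DelayedFlushQueue();
        long first = queue.requestDelayedFlush("region-1");
        long second = queue.requestDelayedFlush("region-1"); // chore fires again
        System.out.println(first >= 0 && second == -1);      // original delay preserved
        queue.onFlushed("region-1");
        System.out.println(queue.requestDelayedFlush("region-1") >= 0); // can queue again
    }
}
```

With this guard, each region draws its random delay exactly once per flush cycle, so the flushes stay spread across the full 5-minute window instead of collapsing onto the smallest delay drawn.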
[jira] [Created] (HBASE-18450) Add test for HBASE-18247
Appy created HBASE-18450:
Summary: Add test for HBASE-18247
Key: HBASE-18450
URL: https://issues.apache.org/jira/browse/HBASE-18450
Project: HBase
Issue Type: Task
Reporter: Appy
Assignee: huaxiang sun

ref: https://issues.apache.org/jira/browse/HBASE-18247?focusedCommentId=16100472&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16100472
Re: user_permission output
OK, found that no parameter is treated as asking for the ACLs table. Maybe the shell command should say so? Just a minor nit though, there are bigger tofu blocks to fry.

Sent from my iPhone

> On 21. Jul 2017, at 13:33, Lars George wrote:
>
> Hi,
>
> I am running the shell's "user_permission" command without any
> parameter, and with a ".*" wildcard expression and get two different
> results back:
>
> hbase(main):003:0> user_permission
> User        Namespace,Table,Family,Qualifier:Permission
> hbasebook   hbase,hbase:acl,,: [Permission: actions=READ,WRITE,EXEC,CREATE,ADMIN]
> hbase       hbase,hbase:acl,,: [Permission: actions=READ,WRITE,EXEC,CREATE,ADMIN]
> 2 row(s) in 0.5110 seconds
>
> hbase(main):005:0> user_permission ".*"
> User        Namespace,Table,Family,Qualifier:Permission
> hbasebook   hbase,hbase:acl,,: [Permission: actions=READ,WRITE,EXEC,CREATE,ADMIN]
> hbasebook   default,testtable,,: [Permission: actions=READ,WRITE,EXEC,CREATE,ADMIN]
> 2 row(s) in 0.4880 seconds
>
> See how the second row differs? What is the proper behaviour of each
> command? Should it not give the same output?
>
> Cheers,
> Lars
Re: Ident String
Don't think it is necessary; it does not do any harm.

Sent from my iPhone

> On 21. Jul 2017, at 00:10, Stack wrote:
>
>> On Thu, Jul 20, 2017 at 9:20 PM, Lars George wrote:
>>
>> Ah ok, I missed that HBASE_IDENT_STR is used to personalize the log and
>> pid file names. Hadoop does the same, so that makes sense. But the
>> -Dhbase.id.str is not used, neither is it in Hadoop. No worries, just
>> wanted to see if anyone had an idea if that was ever used. Does not seem to
>> be the case.
>>
> Should we remove it then boss?
> St.Ack
>
>> Cheers,
>> Lars
>>
>> Sent from my iPad
>>
>>> On 19. Jul 2017, at 22:42, Lars George wrote:
>>>
>>> It dates back to Hadoop:
>>>
>>> https://github.com/apache/hbase/commit/24b065cc91f7bcdab25fc3634699657ac2f27104
>>>
>>> See this https://github.com/apache/hadoop/blame/trunk/hadoop-common-project/hadoop-common/src/main/conf/hadoop-env.sh#L202
>>>
>>> It is used there for logs (according to the comment, haven't checked if
>>> it really does). Are we doing that? Will check...
>>>
>>> Sent from my iPad
>>>
>>>> On 19. Jul 2017, at 16:43, Stack wrote:
>>>> It shows when you do a long process listing. It is a convention from
>>>> hadoop. It does similar. Here's a few example snippets Lars: For hbase
>>>> master process listing, you'll find this in the long line...: -Dproc_master
>>>> Etc: -Dproc_zookeeper -Dproc_resourcemanager
>>>> St.Ack
>>>>
>>>>> On Fri, Jul 14, 2017 at 11:55 AM, Lars George wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> Was coming across `HBASE_IDENT_STR` (in hbase-env.sh) which sets
>>>>> `hbase.id.str` as a command line parameter to the daemon. But I cannot
>>>>> find where that is used. Could someone point me in the right
>>>>> direction?
>>>>>
>>>>> Cheers,
>>>>> Lars
[jira] [Created] (HBASE-18449) Fix client.locking.TestEntityLocks
Chia-Ping Tsai created HBASE-18449:
Summary: Fix client.locking.TestEntityLocks
Key: HBASE-18449
URL: https://issues.apache.org/jira/browse/HBASE-18449
Project: HBase
Issue Type: Bug
Components: test
Affects Versions: 2.0.0-alpha-1, 3.0.0
Reporter: Chia-Ping Tsai
Assignee: Chia-Ping Tsai
Priority: Minor
Fix For: 3.0.0, 2.0.0-alpha-2

{noformat}
Wanted but not invoked:
abortable.abort(
    ,
    isA(org.apache.hadoop.hbase.HBaseIOException)
);
-> at org.apache.hadoop.hbase.client.locking.TestEntityLocks.testHeartbeatException(TestEntityLocks.java:195)
Actually, there were zero interactions with this mock.
{noformat}
[jira] [Created] (HBASE-18448) Added support for refreshing HFiles through API and shell
Ajay Jadhav created HBASE-18448:
Summary: Added support for refreshing HFiles through API and shell
Key: HBASE-18448
URL: https://issues.apache.org/jira/browse/HBASE-18448
Project: HBase
Issue Type: Improvement
Affects Versions: 1.3.1, 2.0.0
Reporter: Ajay Jadhav
Assignee: Ajay Jadhav
Priority: Minor
Fix For: 2.0.0, 1.4.0

In the case where multiple HBase clusters share a common rootDir, flushing data from one cluster does not mean that the other clusters (replicas) will automatically pick up the new HFile. This patch exposes a refresh-HFiles API which, when issued from a replica, updates the in-memory file-handle list with the newly added file. This allows replicas to stay consistent with the data written through the primary cluster.
[jira] [Resolved] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot
[ https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vladimir Rodionov resolved HBASE-7912.
Resolution: Fixed

Closing this one. Refer to https://issues.apache.org/jira/browse/HBASE-14414 for any further updates.

> HBase Backup/Restore Based on HBase Snapshot
>
> Key: HBASE-7912
> URL: https://issues.apache.org/jira/browse/HBASE-7912
> Project: HBase
> Issue Type: Sub-task
> Reporter: Richard Ding
> Assignee: Vladimir Rodionov
> Labels: backup
> Fix For: 2.0.0
>
> Attachments: Backup-and-Restore-Apache_19Sep2016.pdf,
> Backup-and-Restore-Apache_9Sep2016.pdf, HBaseBackupAndRestore -0.91.pdf,
> HBaseBackupAndRestore.pdf, HBaseBackupAndRestore - v0.8.pdf,
> HBaseBackupAndRestore-v0.9.pdf, HBase_BackupRestore-Jira-7912-CLI-v1.pdf,
> HBaseBackupRestore-Jira-7912-DesignDoc-v1.pdf,
> HBaseBackupRestore-Jira-7912-DesignDoc-v2.pdf,
> HBaseBackupRestore-Jira-7912-v4.pdf, HBaseBackupRestore-Jira-7912-v5 .pdf,
> HBaseBackupRestore-Jira-7912-v6.pdf
>
> Finally, we completed the implementation of our backup/restore solution, and
> would like to share it with the community through this jira.
> We are leveraging the existing hbase snapshot feature and provide a general
> solution for common users. Our full backup uses snapshots to capture metadata
> locally and exportsnapshot to move data to another cluster; the incremental
> backup uses an offline WALPlayer to back up HLogs; we also leverage
> distributed log roll and flush to improve performance; other added values
> include convert, merge, progress report, and CLI commands, so that a common
> user can back up hbase data without in-depth knowledge of hbase.
> Our solution also contains some usability features for enterprise users.
> The detailed design document and CLI commands will be attached in this jira.
> We plan to use 10~12 subtasks to share each of the following features, and
> document the detailed implementation in the subtasks:
> * *Full Backup*: provide local and remote backup/restore for a list of tables
> * *offline-WALPlayer* to convert HLogs to HFiles offline (for incremental backup)
> * *distributed* log roll and distributed flush
> * Backup *Manifest* and history
> * *Incremental* backup: built on top of full backup as daily/weekly backup
> * *Convert* incremental backup WAL files into hfiles
> * *Merge* several backup images into one (like merging weekly into monthly)
> * *Add and remove* tables to and from a backup image
> * *Cancel* a backup process
> * Backup progress *status*
> * Full backup based on an *existing snapshot*
> *-*
> *Below is the original description, kept here as the history of the design
> and discussion back in 2013:*
> There have been attempts in the past to come up with a viable HBase
> backup/restore solution (e.g., HBASE-4618). Recently, there have been many
> advancements and new features in HBase, for example, FileLink, Snapshot, and
> Distributed Barrier Procedure. This is a proposal for a backup/restore
> solution that utilizes these new features to achieve better performance and
> consistency.
>
> A common practice of backup and restore in databases is to first take a full
> baseline backup, and then periodically take incremental backups that capture
> the changes since the full baseline backup. An HBase cluster can store a
> massive amount of data. The combination of full backups with incremental
> backups has tremendous benefit for HBase as well. The following is a typical
> scenario for full and incremental backup:
> # The user takes a full backup of a table or a set of tables in HBase.
> # The user schedules periodic incremental backups to capture the changes
> from the full backup, or from the last incremental backup.
> # The user needs to restore table data to a past point in time.
> # The full backup is restored to the table(s) or to different table name(s).
> Then the incremental backups that are up to the desired point in time are
> applied on top of the full backup.
> We would support the following key features and capabilities:
> * Full backup uses HBase snapshots to capture HFiles.
> * Use HBase WALs to capture incremental changes, but use bulk load of
> HFiles for fast incremental restore.
> * Support single table or a set of tables, and column-family-level backup
> and restore.
> * Restore to different table names.
> * Support adding additional tables or CFs to a backup set without
> interruption of the incremental backup schedule.
> * Support rollup/combining of incremental backups into longer-period and
> bigger incremental backups.
> * Unified command line interface for all of the above.
> The solution will support
[jira] [Created] (HBASE-18447) MetricRegistryInfo#hashCode uses hashCode instead of toHashCode
Peter Somogyi created HBASE-18447:
Summary: MetricRegistryInfo#hashCode uses hashCode instead of toHashCode
Key: HBASE-18447
URL: https://issues.apache.org/jira/browse/HBASE-18447
Project: HBase
Issue Type: Bug
Affects Versions: 2.0.0, 3.0.0, 1.4.0
Reporter: Peter Somogyi
Assignee: Peter Somogyi
Priority: Minor

With commons-lang 2.6, .hashCode() and .toHashCode() return the same result, but with version 2.4, .hashCode() returns the hash of the HashCodeBuilder object itself rather than the built value.
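The pitfall generalizes to any builder that accumulates a hash: calling `hashCode()` on the builder object may return the builder's own identity hash rather than the accumulated value. A minimal stand-in (plain Java, not commons-lang itself) makes the difference visible:

```java
// Stand-in for a HashCodeBuilder-style class whose hashCode() is NOT
// overridden (the commons-lang 2.4 situation described above): hashCode()
// falls back to Object's identity hash, while toHashCode() returns the
// accumulated value.
public class BuilderHashPitfall {
    static class SimpleHashBuilder {
        private int total = 17;

        SimpleHashBuilder append(int value) {
            total = total * 37 + value;
            return this;
        }

        int toHashCode() {
            return total;
        }
        // deliberately no hashCode() override
    }

    public static void main(String[] args) {
        SimpleHashBuilder a = new SimpleHashBuilder().append(42);
        SimpleHashBuilder b = new SimpleHashBuilder().append(42);

        // Accumulated hashes of equal inputs agree.
        System.out.println(a.toHashCode() == b.toHashCode()); // true

        // Identity-based hashCode() of two distinct builders almost
        // certainly differs, even for equal inputs.
        System.out.println("identity hashCode() matches: " + (a.hashCode() == b.hashCode()));
    }
}
```

This is why a `hashCode()` implementation built with HashCodeBuilder should return `builder.toHashCode()` explicitly instead of relying on the builder's own `hashCode()`.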
[jira] [Created] (HBASE-18446) Mark StoreFileScanner as IA.Private
Duo Zhang created HBASE-18446:
Summary: Mark StoreFileScanner as IA.Private
Key: HBASE-18446
URL: https://issues.apache.org/jira/browse/HBASE-18446
Project: HBase
Issue Type: Sub-task
Reporter: Duo Zhang

I do not see any reason why it is marked as IA.LimitedPrivate. It is not referenced in any CPs.