[jira] [Updated] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId
[ https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeffrey Zhong updated HBASE-8763: - Attachment: hbase-8763-v1.patch The v1 patch runs fine in my local test env. Trigger one more QA run. [BRAINSTORM] Combine MVCC and SeqId --- Key: HBASE-8763 URL: https://issues.apache.org/jira/browse/HBASE-8763 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Enis Soztutar Assignee: Jeffrey Zhong Priority: Critical Attachments: hbase-8736-poc.patch, hbase-8763-poc-v1.patch, hbase-8763-v1.patch, hbase-8763_wip1.patch HBASE-8701 and a lot of recent issues include good discussions about mvcc + seqId semantics. It seems that having mvcc and the seqId complicates the comparator semantics a lot in regards to flush + WAL replay + compactions + delete markers and out of order puts. Thinking more about it I don't think we need a MVCC write number which is different than the seqId. We can keep the MVCC semantics, read point and smallest read points intact, but combine mvcc write number and seqId. This will allow cleaner semantics + implementation + smaller data files. We can do some brainstorming for 0.98. We still have to verify that this would be semantically correct, it should be so by my current understanding. -- This message was sent by Atlassian JIRA (v6.2#6252)
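To make the proposal concrete, here is a minimal sketch of the unification idea: the sequence id handed out by the WAL at append time doubles as the edit's MVCC write number, so a single counter serves both roles. Method names below (setMvccVersion, advanceMemstoreReadPointIfNeeded) are illustrative stand-ins, not the final API:
{code}
// Sketch only: one number provides both durable ordering and read visibility.
long seqId = wal.append(regionInfo, walKey, walEdit);  // seqId assigned by the WAL
for (KeyValue kv : walEdit.getKeyValues()) {
  kv.setMvccVersion(seqId);                            // reuse seqId as mvcc write number
}
mvcc.advanceMemstoreReadPointIfNeeded(seqId);          // readers may now see the edit
{code}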
[jira] [Updated] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId
[ https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeffrey Zhong updated HBASE-8763: - Attachment: (was: hbase-8763-v1.patch) [BRAINSTORM] Combine MVCC and SeqId --- Key: HBASE-8763 URL: https://issues.apache.org/jira/browse/HBASE-8763 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Enis Soztutar Assignee: Jeffrey Zhong Priority: Critical Attachments: hbase-8736-poc.patch, hbase-8763-poc-v1.patch, hbase-8763-v1.patch, hbase-8763_wip1.patch HBASE-8701 and a lot of recent issues include good discussions about mvcc + seqId semantics. It seems that having mvcc and the seqId complicates the comparator semantics a lot in regards to flush + WAL replay + compactions + delete markers and out of order puts. Thinking more about it I don't think we need a MVCC write number which is different than the seqId. We can keep the MVCC semantics, read point and smallest read points intact, but combine mvcc write number and seqId. This will allow cleaner semantics + implementation + smaller data files. We can do some brainstorming for 0.98. We still have to verify that this would be semantically correct, it should be so by my current understanding. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-10336) Remove deprecated usage of Hadoop HttpServer in InfoServer
[ https://issues.apache.org/jira/browse/HBASE-10336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Charles updated HBASE-10336: - Attachment: HBASE-10569-10.patch HBASE-10336-10.patch should work with Jenkins. It builds/tests fine on my jdk7. Remove deprecated usage of Hadoop HttpServer in InfoServer -- Key: HBASE-10336 URL: https://issues.apache.org/jira/browse/HBASE-10336 Project: HBase Issue Type: Bug Affects Versions: 0.99.0 Reporter: Eric Charles Assignee: Eric Charles Attachments: HBASE-10336-1.patch, HBASE-10336-2.patch, HBASE-10336-3.patch, HBASE-10336-4.patch, HBASE-10336-5.patch, HBASE-10336-6.patch, HBASE-10336-7.patch, HBASE-10336-8.patch, HBASE-10336-9.patch, HBASE-10569-10.patch Recent changes in Hadoop HttpServer give an NPE when running on hadoop 3.0.0-SNAPSHOT. The way we use HttpServer is deprecated and will probably not be fixed (see HDFS-5760). We'd better move to the new proposed builder pattern, which means we can no longer use inheritance to build our nice InfoServer. -- This message was sent by Atlassian JIRA (v6.2#6252)
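For reference, construction via the builder looks roughly like this against Hadoop 2's HttpServer2 (setter names can differ between Hadoop versions, and the port shown is just an example):
{code}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.http.HttpServer2;

// Builder-based construction replaces subclassing the old HttpServer.
Configuration conf = new Configuration();
HttpServer2 infoServer = new HttpServer2.Builder()
    .setName("master")
    .addEndpoint(URI.create("http://0.0.0.0:16010"))  // example endpoint
    .setFindPort(true)   // probe upwards if the port is taken
    .setConf(conf)
    .build();
infoServer.start();
{code}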
[jira] [Commented] (HBASE-11095) Add ip restriction in user permissions
[ https://issues.apache.org/jira/browse/HBASE-11095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13988941#comment-13988941 ] Liu Shaohui commented on HBASE-11095: - [~apurtell] Thanks for your suggestions. Agreed. We may leave this issue open here and keep an eye on the refactor of AccessController and the notifications bus. Add ip restriction in user permissions -- Key: HBASE-11095 URL: https://issues.apache.org/jira/browse/HBASE-11095 Project: HBase Issue Type: New Feature Components: security Reporter: Liu Shaohui Priority: Minor For some sensitive data, users want to restrict the source IPs of HBase users, like MySQL's access control. One direct solution is to add the candidate IPs when granting user permissions. {quote} grant user|@group\[@ip-regular expression\] [ table [ column family [ column qualifier ] ] ] {quote} Any comments and suggestions are welcome. [~apurtell] -- This message was sent by Atlassian JIRA (v6.2#6252)
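Under the proposed syntax, a grant might look like this in the shell (hypothetical form under the proposal, not an existing command):
{code}
# Hypothetical: restrict user 'bob' to clients in 10.18.40.* for table t1, family f1
hbase> grant 'bob@10\.18\.40\..*', 'RW', 't1', 'f1'
{code}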
[jira] [Assigned] (HBASE-11107) Provide utility method equivalent to 0.92's Result.getBytes().getSize()
[ https://issues.apache.org/jira/browse/HBASE-11107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rekha Joshi reassigned HBASE-11107: --- Assignee: Rekha Joshi Provide utility method equivalent to 0.92's Result.getBytes().getSize() --- Key: HBASE-11107 URL: https://issues.apache.org/jira/browse/HBASE-11107 Project: HBase Issue Type: Task Reporter: Ted Yu Assignee: Rekha Joshi Priority: Trivial Currently user has to write code similar to the following for replacement of Result.getBytes().getSize() :
{code}
Cell[] cellValues = resultRow.rawCells();

long size = 0L;
if (null != cellValues) {
  for (Cell cellValue : cellValues) {
    size += KeyValueUtil.ensureKeyValue(cellValue).heapSize();
  }
}
{code}
In ClientScanner, we have:
{code}
for (Cell kv : rs.rawCells()) {
  // TODO make method in Cell or CellUtil
  remainingResultSize -= KeyValueUtil.ensureKeyValue(kv).heapSize();
}
{code}
A utility method should be provided which computes summation of Cell sizes in a Result. -- This message was sent by Atlassian JIRA (v6.2#6252)
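A sketch of the requested helper, assuming it lands as a static utility (the method name and final location are assumptions here):
{code}
// Sums the heap size of all Cells in a Result, replacing the old
// Result.getBytes().getSize() idiom from 0.92.
public static long getTotalSizeOfCells(Result result) {
  long size = 0L;
  if (result == null || result.isEmpty()) {
    return size;
  }
  for (Cell c : result.rawCells()) {
    size += KeyValueUtil.ensureKeyValue(c).heapSize();
  }
  return size;
}
{code}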
[jira] [Assigned] (HBASE-11113) clone_snapshot command prints wrong name upon error
[ https://issues.apache.org/jira/browse/HBASE-11113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rekha Joshi reassigned HBASE-11113: --- Assignee: Rekha Joshi clone_snapshot command prints wrong name upon error --- Key: HBASE-11113 URL: https://issues.apache.org/jira/browse/HBASE-11113 Project: HBase Issue Type: Bug Affects Versions: 0.98.2 Reporter: Andrew Purtell Assignee: Rekha Joshi Priority: Trivial Fix For: 0.99.0, 0.98.3
hbase> clone_snapshot 'snapshot', 'existing_table_name'
ERROR: Table already exists: snapshot!
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (HBASE-11064) Odd behaviors of TableName for empty namespace
[ https://issues.apache.org/jira/browse/HBASE-11064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rekha Joshi reassigned HBASE-11064: --- Assignee: Rekha Joshi Odd behaviors of TableName for empty namespace -- Key: HBASE-11064 URL: https://issues.apache.org/jira/browse/HBASE-11064 Project: HBase Issue Type: Bug Reporter: Hiroshi Ikeda Assignee: Rekha Joshi Priority: Trivial In the class TableName,
{code}
public static byte [] isLegalFullyQualifiedTableName(final byte[] tableName) {
  ...
  int namespaceDelimIndex = ...
  if (namespaceDelimIndex == 0 || namespaceDelimIndex == -1){
    isLegalTableQualifierName(tableName);
  } else {
  ...
{code}
That means, for example, giving :a as the argument throws an exception which says invalid qualifier, instead of invalid namespace. Also, TableName.valueOf(String) and valueOf(byte[]) can create an instance with empty namespace, which is inconsistent. -- This message was sent by Atlassian JIRA (v6.2#6252)
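The reported behavior, spelled out (this illustrates the description above, not new findings):
{code}
// ":a" has an empty namespace, but the error reported concerns the qualifier:
TableName.isLegalFullyQualifiedTableName(Bytes.toBytes(":a"));
// -> complains about an illegal qualifier rather than an illegal namespace

// Meanwhile valueOf() accepts the same input, yielding an instance with an
// empty namespace -- inconsistent with the check above (per the report).
TableName tn = TableName.valueOf(":a");
{code}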
[jira] [Updated] (HBASE-11113) clone_snapshot command prints wrong name upon error
[ https://issues.apache.org/jira/browse/HBASE-11113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] rajeshbabu updated HBASE-11113: --- Assignee: (was: Rekha Joshi) clone_snapshot command prints wrong name upon error --- Key: HBASE-11113 URL: https://issues.apache.org/jira/browse/HBASE-11113 Project: HBase Issue Type: Bug Affects Versions: 0.98.2 Reporter: Andrew Purtell Priority: Trivial Fix For: 0.99.0, 0.98.3
hbase> clone_snapshot 'snapshot', 'existing_table_name'
ERROR: Table already exists: snapshot!
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HBASE-11113) clone_snapshot command prints wrong name upon error
[ https://issues.apache.org/jira/browse/HBASE-11113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] rajeshbabu resolved HBASE-11113. Resolution: Duplicate [~apurtell] This is fixed as part of HBASE-10533 in all the branches, hence closing as duplicate.
{code}
hbase(main):001:0> clone_snapshot 'mySnap','SYSTEM.CATALOG'

ERROR: Table already exists: SYSTEM.CATALOG!
{code}
Thanks. clone_snapshot command prints wrong name upon error --- Key: HBASE-11113 URL: https://issues.apache.org/jira/browse/HBASE-11113 Project: HBase Issue Type: Bug Affects Versions: 0.98.2 Reporter: Andrew Purtell Priority: Trivial Fix For: 0.99.0, 0.98.3
hbase> clone_snapshot 'snapshot', 'existing_table_name'
ERROR: Table already exists: snapshot!
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11090) Backport HBASE-11083 ExportSnapshot should provide capability to limit bandwidth consumption
[ https://issues.apache.org/jira/browse/HBASE-11090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13988966#comment-13988966 ] Ted Yu commented on HBASE-11090: [~apurtell]: Is the trunk patch good to go ? Backport HBASE-11083 ExportSnapshot should provide capability to limit bandwidth consumption Key: HBASE-11090 URL: https://issues.apache.org/jira/browse/HBASE-11090 Project: HBase Issue Type: Task Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.99.0, 0.98.3 Attachments: 11090-0.98-v1.txt, 11090-trunk.txt HBASE-11083 allows ExportSnapshot to limit bandwidth usage. Here is *one* approach for backporting: Create the following classes (class name is tentative): hbase-hadoop1-compat/src/main/java/org/apache/hadoop/hbase/util/ThrottledInputStream.java hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/util/ThrottledInputStream.java each of which extends the corresponding ThrottledInputStream in hadoop-1 / hadoop-2 ExportSnapshot would reference util.ThrottledInputStream, depending on which compatibility module gets bundled. ThrottledInputStream.java in hadoop-1 branch was backported through MAPREDUCE-5081 which went into 1.2.0 release. We need to decide how hadoop releases earlier than 1.2.0 should be supported. *Second* approach for backporting is to make a copy of ThrottledInputStream and include it in hbase codebase. -- This message was sent by Atlassian JIRA (v6.2#6252)
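For the second approach, a minimal sketch of a standalone copy is below (field and method names here are illustrative, not the distcp original): cap the observed read rate by sleeping whenever it exceeds the configured bytes-per-second limit.
{code}
import java.io.IOException;
import java.io.InputStream;
import java.io.InterruptedIOException;

// Wraps an InputStream and throttles reads to at most maxBytesPerSec.
public class ThrottledInputStream extends InputStream {
  private final InputStream in;
  private final long maxBytesPerSec;
  private final long startTime = System.currentTimeMillis();
  private long bytesRead = 0;

  public ThrottledInputStream(InputStream in, long maxBytesPerSec) {
    this.in = in;
    this.maxBytesPerSec = maxBytesPerSec;
  }

  @Override
  public int read() throws IOException {
    throttle();
    int b = in.read();
    if (b >= 0) {
      bytesRead++;
    }
    return b;
  }

  // Sleep until the observed rate drops below the configured cap.
  private void throttle() throws IOException {
    while (getBytesPerSec() > maxBytesPerSec) {
      try {
        Thread.sleep(50);
      } catch (InterruptedException e) {
        throw new InterruptedIOException("throttle interrupted");
      }
    }
  }

  private long getBytesPerSec() {
    long elapsedSec = Math.max(1, (System.currentTimeMillis() - startTime) / 1000);
    return bytesRead / elapsedSec;
  }
}
{code}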
[jira] [Commented] (HBASE-10926) Use global procedure to flush table memstore cache
[ https://issues.apache.org/jira/browse/HBASE-10926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13989027#comment-13989027 ] Ted Yu commented on HBASE-10926: Test suite passed:
{code}
[INFO] HBase ............................................. SUCCESS [1.588s]
[INFO] HBase - Common ..................................... SUCCESS [30.163s]
[INFO] HBase - Protocol ................................... SUCCESS [0.295s]
[INFO] HBase - Client ..................................... SUCCESS [35.068s]
[INFO] HBase - Hadoop Compatibility ....................... SUCCESS [5.062s]
[INFO] HBase - Hadoop Two Compatibility ................... SUCCESS [1.288s]
[INFO] HBase - Prefix Tree ................................ SUCCESS [2.613s]
[INFO] HBase - Server ..................................... SUCCESS [1:12:31.923s]
[INFO] HBase - Testing Util ............................... SUCCESS [1.206s]
[INFO] HBase - Thrift ..................................... SUCCESS [2:03.521s]
[INFO] HBase - Shell ...................................... SUCCESS [1:44.361s]
[INFO] HBase - Integration Tests .......................... SUCCESS [0.887s]
[INFO] HBase - Examples ................................... SUCCESS [5.497s]
[INFO] HBase - Assembly ................................... SUCCESS [0.806s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1:17:44.897s
[INFO] Finished at: Sun May 04 16:04:56 UTC 2014
[INFO] Final Memory: 49M/474M
{code}
Use global procedure to flush table memstore cache -- Key: HBASE-10926 URL: https://issues.apache.org/jira/browse/HBASE-10926 Project: HBase Issue Type: Improvement Components: Admin Affects Versions: 0.96.2, 0.98.1 Reporter: Jerry He Assignee: Jerry He Fix For: 0.99.0 Attachments: HBASE-10926-trunk-v1.patch, HBASE-10926-trunk-v2.patch, HBASE-10926-trunk-v3.patch, HBASE-10926-trunk-v4.patch Currently, user can trigger table flush through hbase shell or HBaseAdmin API. To flush the table cache, each region server hosting the regions is contacted and flushed sequentially, which is less efficient. In HBase snapshot global procedure is used to coordinate and flush the regions in a distributed way. Let's provide a distributed table flush for general use. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-10926) Use global procedure to flush table memstore cache
[ https://issues.apache.org/jira/browse/HBASE-10926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-10926: --- Hadoop Flags: Reviewed Integrated to trunk. Thanks for the patch, Jerry. Use global procedure to flush table memstore cache -- Key: HBASE-10926 URL: https://issues.apache.org/jira/browse/HBASE-10926 Project: HBase Issue Type: Improvement Components: Admin Affects Versions: 0.96.2, 0.98.1 Reporter: Jerry He Assignee: Jerry He Fix For: 0.99.0 Attachments: HBASE-10926-trunk-v1.patch, HBASE-10926-trunk-v2.patch, HBASE-10926-trunk-v3.patch, HBASE-10926-trunk-v4.patch Currently, user can trigger table flush through hbase shell or HBaseAdmin API. To flush the table cache, each region server hosting the regions is contacted and flushed sequentially, which is less efficient. In HBase snapshot global procedure is used to coordinate and flush the regions in a distributed way. Let's provide a distributed table flush for general use. -- This message was sent by Atlassian JIRA (v6.2#6252)
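Client-side, the call stays the same and only the coordination changes. A sketch, assuming the committed behavior routes admin-initiated table flushes through the new master procedure:
{code}
// The admin API is unchanged; the master now coordinates the flush
// across region servers via the global procedure framework.
HBaseAdmin admin = new HBaseAdmin(conf);
try {
  admin.flush("mytable");
} finally {
  admin.close();
}
{code}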
[jira] [Reopened] (HBASE-11113) clone_snapshot command prints wrong name upon error
[ https://issues.apache.org/jira/browse/HBASE-11113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell reopened HBASE-11113: This isn't fixed in 0.98 (at least), I found it while testing the RC. clone_snapshot command prints wrong name upon error --- Key: HBASE-11113 URL: https://issues.apache.org/jira/browse/HBASE-11113 Project: HBase Issue Type: Bug Reporter: Andrew Purtell Priority: Trivial
hbase> clone_snapshot 'snapshot', 'existing_table_name'
ERROR: Table already exists: snapshot!
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11113) clone_snapshot command prints wrong name upon error
[ https://issues.apache.org/jira/browse/HBASE-11113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-11113: --- Affects Version/s: (was: 0.98.2) clone_snapshot command prints wrong name upon error --- Key: HBASE-11113 URL: https://issues.apache.org/jira/browse/HBASE-11113 Project: HBase Issue Type: Bug Reporter: Andrew Purtell Priority: Trivial
hbase> clone_snapshot 'snapshot', 'existing_table_name'
ERROR: Table already exists: snapshot!
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HBASE-11113) clone_snapshot command prints wrong name upon error
[ https://issues.apache.org/jira/browse/HBASE-11113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell resolved HBASE-11113. Resolution: Duplicate Fix Version/s: (was: 0.98.3) (was: 0.99.0) Actually let me close this again, it's possible I launched the shell out of dirs for 0.98.0. Will check at next opportunity. clone_snapshot command prints wrong name upon error --- Key: HBASE-11113 URL: https://issues.apache.org/jira/browse/HBASE-11113 Project: HBase Issue Type: Bug Reporter: Andrew Purtell Priority: Trivial
hbase> clone_snapshot 'snapshot', 'existing_table_name'
ERROR: Table already exists: snapshot!
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11090) Backport HBASE-11083 ExportSnapshot should provide capability to limit bandwidth consumption
[ https://issues.apache.org/jira/browse/HBASE-11090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13989064#comment-13989064 ] Andrew Purtell commented on HBASE-11090: The port of ThrottledInputStream looks fine. Should it go into hbase-common? Is the POM change intentional? Backport HBASE-11083 ExportSnapshot should provide capability to limit bandwidth consumption Key: HBASE-11090 URL: https://issues.apache.org/jira/browse/HBASE-11090 Project: HBase Issue Type: Task Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.99.0, 0.98.3 Attachments: 11090-0.98-v1.txt, 11090-trunk.txt HBASE-11083 allows ExportSnapshot to limit bandwidth usage. Here is *one* approach for backporting: Create the following classes (class name is tentative): hbase-hadoop1-compat/src/main/java/org/apache/hadoop/hbase/util/ThrottledInputStream.java hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/util/ThrottledInputStream.java each of which extends the corresponding ThrottledInputStream in hadoop-1 / hadoop-2 ExportSnapshot would reference util.ThrottledInputStream, depending on which compatibility module gets bundled. ThrottledInputStream.java in hadoop-1 branch was backported through MAPREDUCE-5081 which went into 1.2.0 release. We need to decide how hadoop releases earlier than 1.2.0 should be supported. *Second* approach for backporting is to make a copy of ThrottledInputStream and include it in hbase codebase. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-10926) Use global procedure to flush table memstore cache
[ https://issues.apache.org/jira/browse/HBASE-10926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13989065#comment-13989065 ] Jerry He commented on HBASE-10926: -- Thanks, Ted, Andrew. I spent some time yesterday trying to fix javadoc warnings Ted mentioned. I could not see the location of the warnings. I tried to run 'mvn clean package javadoc:javadoc -DskipTests -DHBasePatchProcess'. But couldn't find details in the output either. I had to turn on javadoc warning in my Eclipse, which shows a bunch of warnings for existing code as well. Was still thinking whether I should fix them or not yesterday ... Use global procedure to flush table memstore cache -- Key: HBASE-10926 URL: https://issues.apache.org/jira/browse/HBASE-10926 Project: HBase Issue Type: Improvement Components: Admin Affects Versions: 0.96.2, 0.98.1 Reporter: Jerry He Assignee: Jerry He Fix For: 0.99.0 Attachments: HBASE-10926-trunk-v1.patch, HBASE-10926-trunk-v2.patch, HBASE-10926-trunk-v3.patch, HBASE-10926-trunk-v4.patch Currently, user can trigger table flush through hbase shell or HBaseAdmin API. To flush the table cache, each region server hosting the regions is contacted and flushed sequentially, which is less efficient. In HBase snapshot global procedure is used to coordinate and flush the regions in a distributed way. Let's provide a distributed table flush for general use. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (HBASE-11095) Add ip restriction in user permissions
[ https://issues.apache.org/jira/browse/HBASE-11095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13989068#comment-13989068 ] Andrew Purtell edited comment on HBASE-11095 at 5/4/14 5:58 PM: I linked this issue to HBASE-7123, please see https://issues.apache.org/jira/browse/HBASE-7123?focusedCommentId=13989067&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13989067 was (Author: apurtell): I liked this issue to HBASE-7123, please see https://issues.apache.org/jira/browse/HBASE-7123?focusedCommentId=13989067&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13989067 Add ip restriction in user permissions -- Key: HBASE-11095 URL: https://issues.apache.org/jira/browse/HBASE-11095 Project: HBase Issue Type: New Feature Components: security Reporter: Liu Shaohui Priority: Minor For some sensitive data, users want to restrict the source IPs of HBase users, like MySQL's access control. One direct solution is to add the candidate IPs when granting user permissions. {quote} grant user|@group\[@ip-regular expression\] [ table [ column family [ column qualifier ] ] ] {quote} Any comments and suggestions are welcome. [~apurtell] -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11095) Add ip restriction in user permissions
[ https://issues.apache.org/jira/browse/HBASE-11095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13989068#comment-13989068 ] Andrew Purtell commented on HBASE-11095: I liked this issue to HBASE-7123, please see https://issues.apache.org/jira/browse/HBASE-7123?focusedCommentId=13989067&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13989067 Add ip restriction in user permissions -- Key: HBASE-11095 URL: https://issues.apache.org/jira/browse/HBASE-11095 Project: HBase Issue Type: New Feature Components: security Reporter: Liu Shaohui Priority: Minor For some sensitive data, users want to restrict the source IPs of HBase users, like MySQL's access control. One direct solution is to add the candidate IPs when granting user permissions. {quote} grant user|@group\[@ip-regular expression\] [ table [ column family [ column qualifier ] ] ] {quote} Any comments and suggestions are welcome. [~apurtell] -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-7123) Refactor internal methods in AccessController
[ https://issues.apache.org/jira/browse/HBASE-7123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13989067#comment-13989067 ] Andrew Purtell commented on HBASE-7123: --- When refactoring permissionGranted, requirePermission, and related functions, make the decision-making the evaluation of a chain of predicates. The chain can be configured by site configuration or perhaps a security policy file. We can incorporate HBASE-11095 as a predicate implementation. Refactor internal methods in AccessController - Key: HBASE-7123 URL: https://issues.apache.org/jira/browse/HBASE-7123 Project: HBase Issue Type: Improvement Components: security Reporter: Andrew Purtell The authorize(), permissionGranted(), and requirePermission() methods in AccessController have organically grown as both the HBase client API and the AccessController itself have evolved, and now have several problems: - Code duplication (minor) - Unused variants (minor) - Signatures optimized for checking certain operations that have a familyMap. Unfortunately different operations have different ideas of what type a familyMap should be. This leads to runtime type checking and the need to convert one family map to another (e.g. {{Map<byte[], NavigableMap<byte[],Object>>}} to {{Map<byte[], Set<byte[]>>}}). (That kind of conversion code in a hot path hurts to look at.) There are too many Java collection type combinations floating around. Some of this should be approached at the client API level too, for example with HBASE-7114. - Only one Permission.Action can be checked at a time. We should really convert these into a bitmap if multiple actions need checking and pass that around instead. -- This message was sent by Atlassian JIRA (v6.2#6252)
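A sketch of the predicate-chain shape (the interface and wiring here are hypothetical, not existing AccessController API):
{code}
// Hypothetical: each site-configured predicate can veto an access decision.
interface AuthorizationPredicate {
  boolean evaluate(User user, TableName table, byte[] family,
      byte[] qualifier, Permission.Action action);
}

// requirePermission() walks a configurable chain; any veto denies access.
boolean authorize(List<AuthorizationPredicate> chain, User user,
    TableName table, byte[] family, byte[] qualifier,
    Permission.Action action) {
  for (AuthorizationPredicate p : chain) {
    if (!p.evaluate(user, table, family, qualifier, action)) {
      return false;  // e.g. an IP-restriction predicate (HBASE-11095)
    }
  }
  return true;
}
{code}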
[jira] [Updated] (HBASE-10926) Use global procedure to flush table memstore cache
[ https://issues.apache.org/jira/browse/HBASE-10926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-10926: --- Resolution: Fixed Status: Resolved (was: Patch Available) Ted already committed this to trunk [~jerryjch], so we can fix the javadoc issues with an addendum or follow up issue. I think this could be ok to go into 0.98 also. The coprocessor API changes only add new methods. We should consider a bit more if the flush behavior changes are ok in the 0.98 context so will open a backport issue for that discussion. Resolving this issue. Use global procedure to flush table memstore cache -- Key: HBASE-10926 URL: https://issues.apache.org/jira/browse/HBASE-10926 Project: HBase Issue Type: Improvement Components: Admin Affects Versions: 0.96.2, 0.98.1 Reporter: Jerry He Assignee: Jerry He Fix For: 0.99.0 Attachments: HBASE-10926-trunk-v1.patch, HBASE-10926-trunk-v2.patch, HBASE-10926-trunk-v3.patch, HBASE-10926-trunk-v4.patch Currently, user can trigger table flush through hbase shell or HBaseAdmin API. To flush the table cache, each region server hosting the regions is contacted and flushed sequentially, which is less efficient. In HBase snapshot global procedure is used to coordinate and flush the regions in a distributed way. Let's provide a distributed table flush for general use. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HBASE-11114) Backport HBASE-10926 (Use global procedure to flush table memstore cache) to 0.98
Andrew Purtell created HBASE-11114: -- Summary: Backport HBASE-10926 (Use global procedure to flush table memstore cache) to 0.98 Key: HBASE-11114 URL: https://issues.apache.org/jira/browse/HBASE-11114 Project: HBase Issue Type: Task Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.98.3 Backport HBASE-10926 to 0.98. Description from original issue: Currently, user can trigger table flush through hbase shell or HBaseAdmin API. To flush the table cache, each region server hosting the regions is contacted and flushed sequentially, which is less efficient. In HBase snapshot global procedure is used to coordinate and flush the regions in a distributed way. Let's provide a distributed table flush for general use. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-10926) Use global procedure to flush table memstore cache
[ https://issues.apache.org/jira/browse/HBASE-10926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13989073#comment-13989073 ] Andrew Purtell commented on HBASE-10926: See HBASE-11114 Use global procedure to flush table memstore cache -- Key: HBASE-10926 URL: https://issues.apache.org/jira/browse/HBASE-10926 Project: HBase Issue Type: Improvement Components: Admin Affects Versions: 0.96.2, 0.98.1 Reporter: Jerry He Assignee: Jerry He Fix For: 0.99.0 Attachments: HBASE-10926-trunk-v1.patch, HBASE-10926-trunk-v2.patch, HBASE-10926-trunk-v3.patch, HBASE-10926-trunk-v4.patch Currently, user can trigger table flush through hbase shell or HBaseAdmin API. To flush the table cache, each region server hosting the regions is contacted and flushed sequentially, which is less efficient. In HBase snapshot global procedure is used to coordinate and flush the regions in a distributed way. Let's provide a distributed table flush for general use. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11114) Backport HBASE-10926 (Use global procedure to flush table memstore cache) to 0.98
[ https://issues.apache.org/jira/browse/HBASE-11114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13989075#comment-13989075 ] Andrew Purtell commented on HBASE-11114: The coprocessor API changes in the committed patch for HBASE-11114 add methods only, so would be ok to port back. I think the procedure framework in 0.98 can support the changes also. I think the one question is if the flush behavior change is ok to make from one minor release to another. In my opinion we should take a liberal attitude to changes that improve efficiency without changing client facing API, and when security is active flushing is a restricted activity already. A release note should be sufficient. Backport HBASE-10926 (Use global procedure to flush table memstore cache) to 0.98 - Key: HBASE-11114 URL: https://issues.apache.org/jira/browse/HBASE-11114 Project: HBase Issue Type: Task Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.98.3 Backport HBASE-10926 to 0.98. Description from original issue: Currently, user can trigger table flush through hbase shell or HBaseAdmin API. To flush the table cache, each region server hosting the regions is contacted and flushed sequentially, which is less efficient. In HBase snapshot global procedure is used to coordinate and flush the regions in a distributed way. Let's provide a distributed table flush for general use. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11112) PerformanceEvaluation should document --multiGet option on its printUsage.
[ https://issues.apache.org/jira/browse/HBASE-11112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13989132#comment-13989132 ] Nick Dimiduk commented on HBASE-11112: -- I caught this and slipped it in with HBASE-10791, but that's on the 10070 branch that isn't finished. We can commit this one and deal with the merge on the other branch later; fine by me. My only nit is that we don't need the --multiGet in the usage information; extended options aren't exposed there. PerformanceEvaluation should document --multiGet option on its printUsage. -- Key: HBASE-11112 URL: https://issues.apache.org/jira/browse/HBASE-11112 Project: HBase Issue Type: Bug Components: documentation, Performance Affects Versions: 0.98.2 Reporter: Jean-Marc Spaggiari Assignee: Jean-Marc Spaggiari Attachments: HBASE-11112-v0-trunk.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11111) Bulk load of very wide rows can go OOM
[ https://issues.apache.org/jira/browse/HBASE-11111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13989134#comment-13989134 ] Nick Dimiduk commented on HBASE-11111: -- Where does the failure happen? Is it in generating HFiles or in LoadIncrementalHFiles? I filed HBASE-7743 some time back for addressing the former. A failure in the latter is news to me! Bulk load of very wide rows can go OOM -- Key: HBASE-11111 URL: https://issues.apache.org/jira/browse/HBASE-11111 Project: HBase Issue Type: Bug Reporter: Jean-Marc Spaggiari When doing a bulk load of very large rows (2M columns), the application will stop with an OOME. We should have an option to use the local disk as temporary storage for sorted rows, warning the user about the performance degradation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11008) Align bulk load, flush, and compact to require Action.CREATE
[ https://issues.apache.org/jira/browse/HBASE-11008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13989157#comment-13989157 ] Hudson commented on HBASE-11008: FAILURE: Integrated in HBase-0.94-security #480 (See [https://builds.apache.org/job/HBase-0.94-security/480/]) HBASE-11008 Align bulk load, flush, and compact to require Action.CREATE (jdcryans: rev 1591494) * /hbase/branches/0.94/security/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java * /hbase/branches/0.94/security/src/test/java/org/apache/hadoop/hbase/security/access/TestAccessController.java Align bulk load, flush, and compact to require Action.CREATE Key: HBASE-11008 URL: https://issues.apache.org/jira/browse/HBASE-11008 Project: HBase Issue Type: Improvement Components: security Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Fix For: 0.99.0, 0.98.2, 0.96.3, 0.94.20 Attachments: HBASE-11008-0.94.patch, HBASE-11008-v2.patch, HBASE-11008-v3.patch, HBASE-11008.patch Over in HBASE-10958 we noticed that it might make sense to require Action.CREATE for bulk load, flush, and compact since it is also required for things like enable and disable. This means the following changes: - preBulkLoadHFile goes from WRITE to CREATE - compact/flush go from ADMIN to ADMIN or CREATE -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-10958) [dataloss] Bulk loading with seqids can prevent some log entries from being replayed
[ https://issues.apache.org/jira/browse/HBASE-10958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13989158#comment-13989158 ] Hudson commented on HBASE-10958: FAILURE: Integrated in HBase-0.94-security #480 (See [https://builds.apache.org/job/HBase-0.94-security/480/]) HBASE-10958 [dataloss] Bulk loading with seqids can prevent some log entries from being replayed (jdcryans: rev 1591495) * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestLoadIncrementalHFiles.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/util/HFileTestUtil.java [dataloss] Bulk loading with seqids can prevent some log entries from being replayed Key: HBASE-10958 URL: https://issues.apache.org/jira/browse/HBASE-10958 Project: HBase Issue Type: Bug Affects Versions: 0.96.2, 0.98.1, 0.94.18 Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Priority: Blocker Fix For: 0.99.0, 0.98.2, 0.96.3, 0.94.20 Attachments: HBASE-10958-0.94.patch, HBASE-10958-less-intrusive-hack-0.96.patch, HBASE-10958-quick-hack-0.96.patch, HBASE-10958-v2.patch, HBASE-10958-v3.patch, HBASE-10958.patch We found an issue with bulk loads causing data loss when assigning sequence ids (HBASE-6630) that is triggered when replaying recovered edits. We're nicknaming this issue *Blindspot*. The problem is that the sequence id given to a bulk loaded file is higher than those of the edits in the region's memstore. When replaying recovered edits, the rule to skip some of them is that they have to be _lower than the highest sequence id_. In other words, the edits that have a sequence id lower than the highest one in the store files *should* have also been flushed. This is not the case with bulk loaded files since we now have an HFile with a sequence id higher than unflushed edits. The log recovery code takes this into account by simply skipping the bulk loaded files, but this bulk loaded status is *lost* on compaction. The edits in the logs that have a sequence id lower than the bulk loaded file that got compacted are put in a blind spot and are skipped during replay. Here's the easiest way to recreate this issue: - Create an empty table - Put one row in it (let's say it gets seqid 1) - Bulk load one file (it gets seqid 2). I used ImporTsv and set hbase.mapreduce.bulkload.assign.sequenceNumbers. - Bulk load a second file the same way (it gets seqid 3). - Major compact the table (the new file has seqid 3 and isn't considered bulk loaded). - Kill the region server that holds the table's region. - Scan the table once the region is made available again. The first row, at seqid 1, will be missing since the HFile with seqid 3 makes us believe that everything that came before it was flushed. -- This message was sent by Atlassian JIRA (v6.2#6252)
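The reproduction above, roughly, as shell/CLI steps (the commands and options here are illustrative, not a verbatim transcript):
{code}
hbase> create 't1', 'f'
hbase> put 't1', 'r1', 'f:q', 'v'      # memstore edit, seqid 1
# bulk load two files via ImportTsv with
#   -Dhbase.mapreduce.bulkload.assign.sequenceNumbers=true   (seqids 2, then 3)
hbase> major_compact 't1'              # compaction drops the bulk-loaded marker
# kill -9 the region server hosting t1's region, wait for reassignment, then:
hbase> scan 't1'                       # r1 (seqid 1) is missing: replay skipped it
{code}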
[jira] [Commented] (HBASE-10965) Automate detection of presence of Filter#filterRow()
[ https://issues.apache.org/jira/browse/HBASE-10965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13989175#comment-13989175 ] Hadoop QA commented on HBASE-10965: --- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12643213/10965-v7.txt against trunk revision . ATTACHMENT ID: 12643213 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/9457//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9457//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9457//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9457//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9457//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9457//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9457//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9457//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9457//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9457//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/9457//console This message is automatically generated. Automate detection of presence of Filter#filterRow() Key: HBASE-10965 URL: https://issues.apache.org/jira/browse/HBASE-10965 Project: HBase Issue Type: Task Components: Filters Reporter: Ted Yu Assignee: Ted Yu Attachments: 10965-v1.txt, 10965-v2.txt, 10965-v3.txt, 10965-v4.txt, 10965-v6.txt, 10965-v7.txt There is potential inconsistency between the return value of Filter#hasFilterRow() and presence of Filter#filterRow(). Filters may override Filter#filterRow() while leaving return value of Filter#hasFilterRow() being false (inherited from FilterBase). 
Downside to purely depending on hasFilterRow() telling us whether custom filter overrides filterRow(List) or filterRow() is that the check below may be rendered ineffective:
{code}
if (nextKv == KV_LIMIT) {
  if (this.filter != null && filter.hasFilterRow()) {
    throw new IncompatibleFilterException(
      "Filter whose hasFilterRow() returns true is incompatible with scan with limit!");
  }
{code}
When the user forgets to override hasFilterRow(), the above check becomes useless. Another limitation is that we cannot optimize FilterList#filterRow() through short circuit when FilterList#hasFilterRow() returns false. See https://issues.apache.org/jira/browse/HBASE-11093?focusedCommentId=13985149&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13985149 This JIRA aims to remove the inconsistency by automatically detecting the presence of overridden Filter#filterRow(). If filterRow() is implemented and not inherited from FilterBase, it is equivalent to having hasFilterRow() return true. -- This message was sent by Atlassian JIRA (v6.2#6252)
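A minimal sketch of the auto-detection, assuming a reflection check against FilterBase at filter-setup time is acceptable:
{code}
// Treat a filter as having filterRow() semantics when the method is
// declared anywhere below FilterBase in the class hierarchy.
static boolean overridesFilterRow(Filter f) {
  try {
    return f.getClass().getMethod("filterRow").getDeclaringClass()
        != FilterBase.class;
  } catch (NoSuchMethodException e) {
    return false;  // raw Filter implementations without filterRow()
  }
}
{code}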
[jira] [Commented] (HBASE-11051) checkJavacWarnings in test-patch.sh should bail out early if there is compilation error
[ https://issues.apache.org/jira/browse/HBASE-11051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13989177#comment-13989177 ] Hudson commented on HBASE-11051: FAILURE: Integrated in HBase-TRUNK #5127 (See [https://builds.apache.org/job/HBase-TRUNK/5127/]) HBASE-11051 checkJavacWarnings in test-patch.sh should bail out early if there is compilation error (Gustavo) (tedyu: rev 1592063) * /hbase/trunk/dev-support/test-patch.sh checkJavacWarnings in test-patch.sh should bail out early if there is compilation error --- Key: HBASE-11051 URL: https://issues.apache.org/jira/browse/HBASE-11051 Project: HBase Issue Type: Test Reporter: Ted Yu Assignee: Gustavo Anatoly Priority: Minor Fix For: 0.99.0 Attachments: HBASE-11051-v1.patch, HBASE-11051-v2.patch, HBASE-11051-v3.patch, HBASE-11051-v4.patch, HBASE-11051.patch Currently checkJavacWarnings doesn't exit QA script in the presence of compilation error. Here is one example: https://builds.apache.org/job/PreCommit-HBASE-Build/9360/console . checkJavacWarnings should do the following so that it is clear what caused the QA run to fail:
{code}
if [[ $? != 0 ]] ; then
  ERR=`$GREP -A 5 'Compilation failure' $PATCH_DIR/trunkJavacWarnings.txt`
  echo "Trunk compilation is broken?
  \{code\}$ERR\{code\}"
  cleanupAndExit 1
fi
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-10926) Use global procedure to flush table memstore cache
[ https://issues.apache.org/jira/browse/HBASE-10926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13989178#comment-13989178 ] Hudson commented on HBASE-10926: FAILURE: Integrated in HBase-TRUNK #5127 (See [https://builds.apache.org/job/HBase-TRUNK/5127/]) HBASE-10926 Use global procedure to flush table memstore cache (Jerry He) (tedyu: rev 1592368) * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/BaseMasterObserver.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/MasterObserver.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/procedure/Procedure.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/procedure/ProcedureCoordinator.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/procedure/RegionServerProcedureManagerHost.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/procedure/Subprocedure.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/procedure/flush * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/procedure/flush/FlushTableSubprocedure.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/procedure/flush/MasterFlushTableProcedureManager.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/procedure/flush/RegionServerFlushTableProcedureManager.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/security/visibility/VisibilityController.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterObserver.java Use global procedure to flush table memstore cache -- Key: HBASE-10926 URL: https://issues.apache.org/jira/browse/HBASE-10926 Project: HBase Issue Type: Improvement Components: Admin Affects Versions: 0.96.2, 0.98.1 Reporter: Jerry He Assignee: Jerry He Fix For: 0.99.0 Attachments: HBASE-10926-trunk-v1.patch, HBASE-10926-trunk-v2.patch, HBASE-10926-trunk-v3.patch, HBASE-10926-trunk-v4.patch Currently, user can trigger table flush through hbase shell or HBaseAdmin API. To flush the table cache, each region server hosting the regions is contacted and flushed sequentially, which is less efficient. In HBase snapshot global procedure is used to coordinate and flush the regions in a distributed way. Let's provide a distributed table flush for general use. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-10993) Deprioritize long-running scanners
[ https://issues.apache.org/jira/browse/HBASE-10993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi updated HBASE-10993: Attachment: HBASE-10993-v4.patch Deprioritize long-running scanners -- Key: HBASE-10993 URL: https://issues.apache.org/jira/browse/HBASE-10993 Project: HBase Issue Type: Sub-task Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Priority: Minor Fix For: 0.99.0 Attachments: HBASE-10993-v0.patch, HBASE-10993-v1.patch, HBASE-10993-v2.patch, HBASE-10993-v3.patch, HBASE-10993-v4.patch, HBASE-10993-v4.patch Currently we have a single call queue that serves all the normal user requests, and the requests are executed in FIFO. When running map-reduce jobs and user-queries on the same machine, we want to prioritize the user-queries. Without changing too much code, and not having the user giving hints, we can add a “vtime” field to the scanner, to keep track from how long is running. And we can replace the callQueue with a priorityQueue. In this way we can deprioritize long-running scans, the longer a scan request lives the less priority it gets. -- This message was sent by Atlassian JIRA (v6.2#6252)
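A sketch of the vtime idea (the CallRunner accessor below is a hypothetical stand-in for however the scanner's accumulated lifetime gets tracked):
{code}
import java.util.Comparator;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.PriorityBlockingQueue;

// Hypothetical: calls carrying a longer-lived scanner sort last, so fresh
// user queries overtake long-running scans in the call queue.
Comparator<CallRunner> byScannerVtime = new Comparator<CallRunner>() {
  @Override
  public int compare(CallRunner a, CallRunner b) {
    // getScannerVtime() is an invented accessor for the scanner's lifetime.
    return Long.compare(a.getScannerVtime(), b.getScannerVtime());
  }
};
BlockingQueue<CallRunner> callQueue =
    new PriorityBlockingQueue<CallRunner>(64, byScannerVtime);
{code}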
[jira] [Updated] (HBASE-7987) Snapshot Manifest file instead of multiple empty files
[ https://issues.apache.org/jira/browse/HBASE-7987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi updated HBASE-7987: --- Attachment: HBASE-7987-v6.patch Snapshot Manifest file instead of multiple empty files -- Key: HBASE-7987 URL: https://issues.apache.org/jira/browse/HBASE-7987 Project: HBase Issue Type: Improvement Components: snapshots Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Fix For: 0.99.0 Attachments: HBASE-7987-v0.patch, HBASE-7987-v1.patch, HBASE-7987-v2.patch, HBASE-7987-v2.sketch, HBASE-7987-v3.patch, HBASE-7987-v4.patch, HBASE-7987-v5.patch, HBASE-7987-v6.patch, HBASE-7987.sketch Currently taking a snapshot means creating one empty file for each file in the source table directory, plus copying the .regioninfo file for each region, the table descriptor file and a snapshotInfo file. During the restore or snapshot verification we traverse the filesystem (fs.listStatus()) to find the snapshot files, and we open the .regioninfo files to get the information. To avoid hammering the NameNode and having lots of empty files, we can use a manifest file that contains the list of files and information that we need. To keep the RS parallelism that we have, each RS can write its own manifest.
{code}
message SnapshotDescriptor {
  required string name;
  optional string table;
  optional int64 creationTime;
  optional Type type;
  optional int32 version;
}

message SnapshotRegionManifest {
  optional int32 version;
  required RegionInfo regionInfo;
  repeated FamilyFiles familyFiles;

  message StoreFile {
    required string name;
    optional Reference reference;
  }

  message FamilyFiles {
    required bytes familyName;
    repeated StoreFile storeFiles;
  }
}
{code}
{code}
/hbase/.snapshot/<snapshotName>
/hbase/.snapshot/<snapshotName>/snapshotInfo
/hbase/.snapshot/<snapshotName>/<tableName>
/hbase/.snapshot/<snapshotName>/<tableName>/tableInfo
/hbase/.snapshot/<snapshotName>/<tableName>/regionManifest(.n)
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11090) Backport HBASE-11083 ExportSnapshot should provide capability to limit bandwidth consumption
[ https://issues.apache.org/jira/browse/HBASE-11090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13989188#comment-13989188 ] Ted Yu commented on HBASE-11090: Hbase-common would be better for the backport. The change in Pom was intentional - the dependency was solely for this class. Backport HBASE-11083 ExportSnapshot should provide capability to limit bandwidth consumption Key: HBASE-11090 URL: https://issues.apache.org/jira/browse/HBASE-11090 Project: HBase Issue Type: Task Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.99.0, 0.98.3 Attachments: 11090-0.98-v1.txt, 11090-trunk.txt HBASE-11083 allows ExportSnapshot to limit bandwidth usage. Here is *one* approach for backporting: Create the following classes (class name is tentative): hbase-hadoop1-compat/src/main/java/org/apache/hadoop/hbase/util/ThrottledInputStream.java hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/util/ThrottledInputStream.java each of which extends the corresponding ThrottledInputStream in hadoop-1 / hadoop-2 ExportSnapshot would reference util.ThrottledInputStream, depending on which compatibility module gets bundled. ThrottledInputStream.java in hadoop-1 branch was backported through MAPREDUCE-5081 which went into 1.2.0 release. We need to decide how hadoop releases earlier than 1.2.0 should be supported. *Second* approach for backporting is to make a copy of ThrottledInputStream and include it in hbase codebase. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11109) flush region sequence id may not be larger than all edits flushed
[ https://issues.apache.org/jira/browse/HBASE-11109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13989209#comment-13989209 ] Hadoop QA commented on HBASE-11109: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12643243/11109v2.txt against trunk revision . ATTACHMENT ID: 12643243 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 9 new or modified tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.wal.TestLogRolling Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/9458//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9458//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9458//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9458//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9458//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9458//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9458//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9458//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9458//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9458//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/9458//console This message is automatically generated. flush region sequence id may not be larger than all edits flushed - Key: HBASE-11109 URL: https://issues.apache.org/jira/browse/HBASE-11109 Project: HBase Issue Type: Sub-task Components: wal Affects Versions: 0.99.0 Reporter: stack Assignee: stack Priority: Critical Fix For: 0.99.0 Attachments: 11109.txt, 11109v2.txt, 11109v2.txt This was found by [~jeffreyz] See parent issue. We have this issue since we put the ring buffer/disrupter into the WAL (HBASE-10156). An edits region sequence id is set only after the edit has traversed the ring buffer. Flushing, we just up whatever the current region sequence id is. Crossing the ring buffer may take some time and is done by background threads. 
The flusher may be taking the region sequence id though edits have not yet made it across the ringbuffer: i.e. edits that are actually scoped by the flush may have region sequence ids in excess of that of the flush sequence id reported. The consequences are not exactly clear. Would rather not have to find out so lets fix this here. -- This message was sent by Atlassian JIRA (v6.2#6252)
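One possible shape of a fix, sketched here with invented helper names (this is illustration, not the committed change): obtain the flush sequence id by pushing a marker through the ring buffer, so the answer cannot precede any edit that is still waiting to be stamped.
{code}
// Sketch: ask for the flush sequence id via the ring buffer itself, so by
// the time we get an answer every earlier edit has crossed it and been
// assigned its region sequence id.
long getSafeFlushSeqId(FSHLog wal) throws IOException {
  SyncFuture marker = wal.publishEmptyMarkerEntry();  // invented helper
  return marker.getRingBufferSequence();              // >= all stamped edits
}
{code}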
[jira] [Created] (HBASE-11115) Support setting max version per column family in Get
Liu Shaohui created HBASE-11115: --- Summary: Support setting max version per column family in Get Key: HBASE-11115 URL: https://issues.apache.org/jira/browse/HBASE-11115 Project: HBase Issue Type: Improvement Reporter: Liu Shaohui Priority: Minor The Get operation only supports setting the max versions for all column families. But different column families may hold different numbers of versions, and users may want to get data with different versions from different column families in a single Get operation. Though we can translate this kind of Get into multiple single-column-family Gets, those Gets are sequential in the regionserver and have different mvccs. Comments and suggestions are welcomed. Thx -- This message was sent by Atlassian JIRA (v6.2#6252)
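The proposal, as a hypothetical API extension (the per-family setter below does not exist today):
{code}
Get get = new Get(row);
get.setMaxVersions(3);                          // today: one limit for all families
get.setMaxVersions(Bytes.toBytes("cf1"), 10);   // proposed: per-family override
{code}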
[jira] [Updated] (HBASE-11090) Backport HBASE-11083 ExportSnapshot should provide capability to limit bandwidth consumption
[ https://issues.apache.org/jira/browse/HBASE-11090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-11090: --- Attachment: 11090-trunk-v2.txt Patch v2 moves the new class to hbase-common module Backport HBASE-11083 ExportSnapshot should provide capability to limit bandwidth consumption Key: HBASE-11090 URL: https://issues.apache.org/jira/browse/HBASE-11090 Project: HBase Issue Type: Task Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.99.0, 0.98.3 Attachments: 11090-0.98-v1.txt, 11090-trunk-v2.txt, 11090-trunk.txt HBASE-11083 allows ExportSnapshot to limit bandwidth usage. Here is *one* approach for backporting: Create the following classes (class name is tentative): hbase-hadoop1-compat/src/main/java/org/apache/hadoop/hbase/util/ThrottledInputStream.java hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/util/ThrottledInputStream.java each of which extends the corresponding ThrottledInputStream in hadoop-1 / hadoop-2 ExportSnapshot would reference util.ThrottledInputStream, depending on which compatibility module gets bundled. ThrottledInputStream.java in hadoop-1 branch was backported through MAPREDUCE-5081 which went into 1.2.0 release. We need to decide how hadoop releases earlier than 1.2.0 should be supported. *Second* approach for backporting is to make a copy of ThrottledInputStream and include it in hbase codebase. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-10933) hbck -fixHdfsOrphans is not working properly it throws null pointer exception
[ https://issues.apache.org/jira/browse/HBASE-10933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13989272#comment-13989272 ] Kashif J S commented on HBASE-10933: I have the patch ready for the 0.94.19 version. Probably this problem of Null Pointer Exception does not exist for 0.98.* versions and the trunk. Deepak Sharma: Can you please confirm for 0.98.2? Also, there is a problem with HBaseFsck when a table's regions contain no data, or the data is only in the memstore: HBaseFsck will fail to resolve the INCONSISTENCY. I am working on the fix for that. I will update the patch for review soon. hbck -fixHdfsOrphans is not working properly it throws null pointer exception - Key: HBASE-10933 URL: https://issues.apache.org/jira/browse/HBASE-10933 Project: HBase Issue Type: Bug Components: hbck Affects Versions: 0.94.16, 0.98.2 Reporter: Deepak Sharma Assignee: Y. SREENIVASULU REDDY Priority: Critical If the .regioninfo file does not exist for an hbase region, then running hbck repair or hbck -fixHdfsOrphans cannot resolve the problem; it throws a NullPointerException:
{code}
2014-04-08 20:11:49,750 INFO [main] util.HBaseFsck (HBaseFsck.java:adoptHdfsOrphans(470)) - Attempting to handle orphan hdfs dir: hdfs://10.18.40.28:54310/hbase/TestHdfsOrphans1/5a3de9ca65e587cb05c9384a3981c950
java.lang.NullPointerException
at org.apache.hadoop.hbase.util.HBaseFsck$TableInfo.access$000(HBaseFsck.java:1939)
at org.apache.hadoop.hbase.util.HBaseFsck.adoptHdfsOrphan(HBaseFsck.java:497)
at org.apache.hadoop.hbase.util.HBaseFsck.adoptHdfsOrphans(HBaseFsck.java:471)
at org.apache.hadoop.hbase.util.HBaseFsck.restoreHdfsIntegrity(HBaseFsck.java:591)
at org.apache.hadoop.hbase.util.HBaseFsck.offlineHdfsIntegrityRepair(HBaseFsck.java:369)
at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:447)
at org.apache.hadoop.hbase.util.HBaseFsck.exec(HBaseFsck.java:3769)
at org.apache.hadoop.hbase.util.HBaseFsck.run(HBaseFsck.java:3587)
at com.huawei.isap.test.smartump.hadoop.hbase.HbaseHbckRepair.repairToFixHdfsOrphans(HbaseHbckRepair.java:244)
at com.huawei.isap.test.smartump.hadoop.hbase.HbaseHbckRepair.setUp(HbaseHbckRepair.java:84)
at junit.framework.TestCase.runBare(TestCase.java:132)
at junit.framework.TestResult$1.protect(TestResult.java:110)
at junit.framework.TestResult.runProtected(TestResult.java:128)
at junit.framework.TestResult.run(TestResult.java:113)
at junit.framework.TestCase.run(TestCase.java:124)
at junit.framework.TestSuite.runTest(TestSuite.java:243)
at junit.framework.TestSuite.run(TestSuite.java:238)
at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
{code}
The problem arises because in the HBaseFsck class
{code}
private void adoptHdfsOrphan(HbckInfo hi)
{code}
we initialize tableInfo from the SortedMap<String, TableInfo> tablesInfo object
{code}
TableInfo tableInfo = tablesInfo.get(tableName);
{code}
but in private SortedMap<String, TableInfo> loadHdfsRegionInfos()
{code}
for (HbckInfo hbi: hbckInfos) {
  if (hbi.getHdfsHRI() == null) {
    // was an orphan
    continue;
  }
{code}
there is a check that skips orphan regions, so a table whose regions are all orphans never gets added to the SortedMap<String, TableInfo> tablesInfo, and we later hit the NullPointerException. -- This message was sent by Atlassian JIRA (v6.2#6252)
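The immediate NPE can be guarded roughly like this (a sketch, not the final patch; the accessor on HbckInfo is assumed from the surrounding code):
{code}
// In adoptHdfsOrphan(): tables whose every region is an orphan never made
// it into tablesInfo, so guard the lookup instead of dereferencing null.
TableInfo tableInfo = tablesInfo.get(tableName);
if (tableInfo == null) {
  LOG.warn("Table " + tableName + " has no regions loaded from HDFS; "
      + "cannot adopt orphan region " + hi.getHdfsRegionDir());
  return;
}
{code}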
[jira] [Assigned] (HBASE-10933) hbck -fixHdfsOrphans is not working properly it throws null pointer exception
[ https://issues.apache.org/jira/browse/HBASE-10933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kashif J S reassigned HBASE-10933: -- Assignee: Kashif J S (was: Y. SREENIVASULU REDDY) hbck -fixHdfsOrphans is not working properly it throws null pointer exception - Key: HBASE-10933 URL: https://issues.apache.org/jira/browse/HBASE-10933 Project: HBase Issue Type: Bug Components: hbck Affects Versions: 0.94.16, 0.98.2 Reporter: Deepak Sharma Assignee: Kashif J S Priority: Critical If the .regioninfo file does not exist for an hbase region, then running hbck repair or hbck -fixHdfsOrphans cannot resolve the problem; it throws a NullPointerException:
{code}
2014-04-08 20:11:49,750 INFO [main] util.HBaseFsck (HBaseFsck.java:adoptHdfsOrphans(470)) - Attempting to handle orphan hdfs dir: hdfs://10.18.40.28:54310/hbase/TestHdfsOrphans1/5a3de9ca65e587cb05c9384a3981c950
java.lang.NullPointerException
at org.apache.hadoop.hbase.util.HBaseFsck$TableInfo.access$000(HBaseFsck.java:1939)
at org.apache.hadoop.hbase.util.HBaseFsck.adoptHdfsOrphan(HBaseFsck.java:497)
at org.apache.hadoop.hbase.util.HBaseFsck.adoptHdfsOrphans(HBaseFsck.java:471)
at org.apache.hadoop.hbase.util.HBaseFsck.restoreHdfsIntegrity(HBaseFsck.java:591)
at org.apache.hadoop.hbase.util.HBaseFsck.offlineHdfsIntegrityRepair(HBaseFsck.java:369)
at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:447)
at org.apache.hadoop.hbase.util.HBaseFsck.exec(HBaseFsck.java:3769)
at org.apache.hadoop.hbase.util.HBaseFsck.run(HBaseFsck.java:3587)
at com.huawei.isap.test.smartump.hadoop.hbase.HbaseHbckRepair.repairToFixHdfsOrphans(HbaseHbckRepair.java:244)
at com.huawei.isap.test.smartump.hadoop.hbase.HbaseHbckRepair.setUp(HbaseHbckRepair.java:84)
at junit.framework.TestCase.runBare(TestCase.java:132)
at junit.framework.TestResult$1.protect(TestResult.java:110)
at junit.framework.TestResult.runProtected(TestResult.java:128)
at junit.framework.TestResult.run(TestResult.java:113)
at junit.framework.TestCase.run(TestCase.java:124)
at junit.framework.TestSuite.runTest(TestSuite.java:243)
at junit.framework.TestSuite.run(TestSuite.java:238)
at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
{code}
The problem arises because in the HBaseFsck class
{code}
private void adoptHdfsOrphan(HbckInfo hi)
{code}
we initialize tableInfo from the SortedMap<String, TableInfo> tablesInfo object
{code}
TableInfo tableInfo = tablesInfo.get(tableName);
{code}
but in private SortedMap<String, TableInfo> loadHdfsRegionInfos()
{code}
for (HbckInfo hbi: hbckInfos) {
  if (hbi.getHdfsHRI() == null) {
    // was an orphan
    continue;
  }
{code}
there is a check that skips orphan regions, so a table whose regions are all orphans never gets added to the SortedMap<String, TableInfo> tablesInfo, and we later hit the NullPointerException. -- This message was sent by Atlassian JIRA (v6.2#6252)