[jira] [Updated] (HBASE-14041) Client MetaCache is cleared if a ThrottlingException is thrown
[ https://issues.apache.org/jira/browse/HBASE-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eungsop Yoo updated HBASE-14041:
--------------------------------
Attachment: 0001-Do-not-clear-MetaCache-if-a-ThrottlingException-is-t-v2.patch

I have attached the new version of the patch.

Client MetaCache is cleared if a ThrottlingException is thrown
--------------------------------------------------------------
Key: HBASE-14041
URL: https://issues.apache.org/jira/browse/HBASE-14041
Project: HBase
Issue Type: Bug
Components: Client
Affects Versions: 1.1.0
Reporter: Eungsop Yoo
Priority: Minor
Attachments: 0001-Do-not-clear-MetaCache-if-a-ThrottlingException-is-t-v2.patch, 0001-Do-not-clear-MetaCache-if-a-ThrottlingException-is-t.patch

During a performance test with request throttling, I saw that the hbase:meta table was being read heavily. Currently the client's MetaCache is cleared if a ThrottlingException is thrown. This does not seem to be necessary.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
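The change the attachment describes can be sketched as follows: a throttling rejection says nothing about where a region lives, so the client should keep its cached region locations and only invalidate them for errors that imply a stale location. This is an illustrative stand-in, not the patch itself; the `ThrottlingException` class and the cache shape here are local stubs, not HBase's own types.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Local stub standing in for HBase's throttling error.
class ThrottlingException extends RuntimeException {}

class MetaCacheSketch {
    // tableName -> cached region location, simplified to a String
    static final Map<String, String> cache = new ConcurrentHashMap<>();

    // Throttling does not invalidate the cached location; anything else might.
    static boolean shouldClearCache(Throwable t) {
        return !(t instanceof ThrottlingException);
    }

    static void onRequestFailure(String table, Throwable t) {
        if (shouldClearCache(t)) {
            cache.remove(table); // forces a re-read of hbase:meta on retry
        }
    }
}
```

Under throttling, repeated retries then hit the cache instead of re-reading hbase:meta, which is the hot-meta symptom the reporter observed.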
[jira] [Commented] (HBASE-14041) Client MetaCache is cleared if a ThrottlingException is thrown
[ https://issues.apache.org/jira/browse/HBASE-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14618230#comment-14618230 ]

Ashish Singhi commented on HBASE-14041:
---------------------------------------
lgtm
[jira] [Commented] (HBASE-13387) Add ByteBufferedCell an extension to Cell
[ https://issues.apache.org/jira/browse/HBASE-13387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14618255#comment-14618255 ]

ramkrishna.s.vasudevan commented on HBASE-13387:
------------------------------------------------
Regarding the first comment: I got why you are inverting here. Fine with that change.

Add ByteBufferedCell an extension to Cell
-----------------------------------------
Key: HBASE-13387
URL: https://issues.apache.org/jira/browse/HBASE-13387
Project: HBase
Issue Type: Sub-task
Components: regionserver, Scanners
Reporter: Anoop Sam John
Assignee: Anoop Sam John
Fix For: 2.0.0
Attachments: ByteBufferedCell.docx, HBASE-13387_v1.patch, HBASE-13387_v2.patch, WIP_HBASE-13387_V2.patch, WIP_ServerCell.patch, benchmark.zip

This came up in the discussion about the parent Jira, and recently Stack added it as a comment on the E2E patch on the parent Jira. The idea is to add a new interface 'ByteBufferedCell' in which we can add new buffer-based getter APIs and getters for the position of components in the BB. We will keep this interface @InterfaceAudience.Private. When the Cell is backed by a DBB, we can create an object implementing this new interface. The comparators have to be aware of this new Cell extension and have to use the BB-based APIs rather than getXXXArray(). Also provide util APIs in CellUtil to abstract the checks for the new Cell type (like matchingXXX APIs, getValueAsType APIs, etc.).
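A minimal sketch of the interface shape the description proposes: buffer-based getters plus position accessors per Cell component, so comparators can read straight from an off-heap ByteBuffer instead of going through getXXXArray() copies. The method names are guesses inferred from the description, not the committed API, and only the row and value components are shown.

```java
import java.nio.ByteBuffer;

// Illustrative shape of the proposed extension; names are assumptions.
interface ByteBufferedCellSketch {
    ByteBuffer getRowByteBuffer();
    int getRowPosition();
    short getRowLength();

    ByteBuffer getValueByteBuffer();
    int getValuePosition();
    int getValueLength();
}

// A cell backed by a (possibly direct) ByteBuffer: no byte[] copies made.
class BBCell implements ByteBufferedCellSketch {
    private final ByteBuffer buf;
    private final int rowPos;
    private final short rowLen;
    private final int valPos;
    private final int valLen;

    BBCell(ByteBuffer buf, int rowPos, short rowLen, int valPos, int valLen) {
        this.buf = buf;
        this.rowPos = rowPos;
        this.rowLen = rowLen;
        this.valPos = valPos;
        this.valLen = valLen;
    }

    public ByteBuffer getRowByteBuffer() { return buf; }
    public int getRowPosition() { return rowPos; }
    public short getRowLength() { return rowLen; }
    public ByteBuffer getValueByteBuffer() { return buf; }
    public int getValuePosition() { return valPos; }
    public int getValueLength() { return valLen; }
}
```

A comparator aware of this interface can use absolute `ByteBuffer.get(index)` reads over `(position, length)` ranges, which is the point of keeping positions explicit rather than slicing.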
[jira] [Commented] (HBASE-14041) Client MetaCache is cleared if a ThrottlingException is thrown
[ https://issues.apache.org/jira/browse/HBASE-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14618248#comment-14618248 ]

Hadoop QA commented on HBASE-14041:
-----------------------------------
{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12744166/0001-Do-not-clear-MetaCache-if-a-ThrottlingException-is-t-v2.patch
against master branch at commit f5ad736282c8c9c27b14131919d60b72834ec9e4.
ATTACHMENT ID: 12744166

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:red}-1 patch{color}. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14700//console

This message is automatically generated.
[jira] [Commented] (HBASE-12596) bulkload needs to follow locality
[ https://issues.apache.org/jira/browse/HBASE-12596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14618251#comment-14618251 ]

Ashish Singhi commented on HBASE-12596:
---------------------------------------
Looks good overall to me, apart from some minor nits.

bq. Admin admin = util.getConnection().getAdmin();
Close the admin at the end, or better, include it within a try declaration.

bq. Table table = util.createTable(TABLE_NAME, FAMILIES, splitKeys);
Drop the table at the end.

bq. Path testDir = util.getDataTestDirOnTestFS(testLocalMRIncrementalLoad);
Delete it at the end.

{{doIncrementalLoadTest}} and the new test case have some code duplication; why not extract it to a local method and refer to it?

(Not related to the patch) Can you remove the unused import from the test class?

bulkload needs to follow locality
---------------------------------
Key: HBASE-12596
URL: https://issues.apache.org/jira/browse/HBASE-12596
Project: HBase
Issue Type: Improvement
Components: HFile, regionserver
Affects Versions: 0.98.8
Environment: hadoop-2.3.0, hbase-0.98.8, jdk1.7
Reporter: Victor Xu
Assignee: Victor Xu
Fix For: 0.98.14
Attachments: HBASE-12596-0.98-v1.patch, HBASE-12596-0.98-v2.patch, HBASE-12596-0.98-v3.patch, HBASE-12596-0.98-v4.patch, HBASE-12596-0.98-v5.patch, HBASE-12596-master-v1.patch, HBASE-12596-master-v2.patch, HBASE-12596-master-v3.patch, HBASE-12596-master-v4.patch, HBASE-12596-master-v5.patch, HBASE-12596.patch

Normally, we have 2 steps to perform a bulkload: 1. use a job to write the HFiles to be loaded; 2. move these HFiles to the right HDFS directory. However, the locality could be lost during the first step. Why not just write the HFiles directly into the right place? We can do this easily because StoreFile.WriterBuilder has the withFavoredNodes method, and we just need to call it in HFileOutputFormat's getNewWriter(). This feature is enabled by default, and we can use 'hbase.bulkload.locality.sensitive.enabled=false' to disable it.
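The idea in the description can be sketched as follows: when opening a new HFile writer for a region's key range, pass that region's hosting server as a favored node so the HDFS blocks land on the server that will serve them. `WriterBuilder` below is a simplified stand-in for HBase's `StoreFile.WriterBuilder`, and the region-location lookup is reduced to a map; only the shape of the wiring is real.

```java
import java.net.InetSocketAddress;
import java.util.HashMap;
import java.util.Map;

// Stand-in for StoreFile.WriterBuilder; records favored nodes only.
class WriterBuilder {
    InetSocketAddress[] favoredNodes;

    WriterBuilder withFavoredNodes(InetSocketAddress[] nodes) {
        this.favoredNodes = nodes;
        return this;
    }
}

class LocalityAwareBulkloadSketch {
    // region start key -> hosting region server, as a location lookup would report
    static final Map<String, String> regionHosts = new HashMap<>();
    // mirrors hbase.bulkload.locality.sensitive.enabled
    static boolean localitySensitive = true;

    // Analogue of HFileOutputFormat's getNewWriter(): hint the region's
    // host as a favored node so the written blocks are local to it.
    static WriterBuilder getNewWriter(String regionStartKey) {
        WriterBuilder builder = new WriterBuilder();
        String host = regionHosts.get(regionStartKey);
        if (localitySensitive && host != null) {
            builder.withFavoredNodes(new InetSocketAddress[] {
                InetSocketAddress.createUnresolved(host, 0) });
        }
        return builder;
    }
}
```

With the hint in place, step 2 of the bulkload (renaming the HFiles into the region directory) no longer destroys locality, since the blocks were written on the right datanode to begin with.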
[jira] [Commented] (HBASE-12596) bulkload needs to follow locality
[ https://issues.apache.org/jira/browse/HBASE-12596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14618167#comment-14618167 ]

Hadoop QA commented on HBASE-12596:
-----------------------------------
{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12744138/HBASE-12596-master-v5.patch
against master branch at commit f5ad736282c8c9c27b14131919d60b72834ec9e4.
ATTACHMENT ID: 12744138

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 6 new or modified tests.
{color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0)
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100.
{color:green}+1 site{color}. The mvn post-site goal succeeds with this patch.
{color:red}-1 core tests{color}. The patch failed these unit tests:
org.apache.hadoop.hbase.master.TestDistributedLogSplitting

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14698//testReport/
Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14698//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14698//artifact/patchprocess/checkstyle-aggregate.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14698//console

This message is automatically generated.
[jira] [Updated] (HBASE-14041) Client MetaCache is cleared if a ThrottlingException is thrown
[ https://issues.apache.org/jira/browse/HBASE-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashish Singhi updated HBASE-14041:
----------------------------------
Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-13387) Add ByteBufferedCell an extension to Cell
[ https://issues.apache.org/jira/browse/HBASE-13387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14618287#comment-14618287 ]

Anoop Sam John commented on HBASE-13387:
----------------------------------------
bq. the name of the second API can be changed? createFirstOnRowColTS or something like that?
I thought it was implicit when we pass the ts also. If you strongly feel so, I can change it on commit.

bq. So for now you are passing the array part alone to the fake keys? and not the BB based API? The fake Cells do not override the BufferedCell? Later improvement?
These fake key changes depend on what we will do with the blooms and the hashes. Yes, later if needed.
[jira] [Commented] (HBASE-13387) Add ByteBufferedCell an extension to Cell
[ https://issues.apache.org/jira/browse/HBASE-13387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14618294#comment-14618294 ]

Anoop Sam John commented on HBASE-13387:
----------------------------------------
bq. -1 javac. The applied patch generated 20 javac compiler warnings (more than the master's current 16 warnings).
Because of more references to Unsafe.

bq. -1 javadoc. The javadoc tool appears to have generated 1 warning messages.
A new reference to the sun.unsafe package in ByteBufferUtils.java. Have to add it to OK_COUNT.

bq. -1 checkstyle. The applied patch generated 1899 checkstyle errors (more than the master's current 1898 errors).
An unused import in the patch.

Can correct these on commit. Will commit tonight my time.
[jira] [Updated] (HBASE-13965) Stochastic Load Balancer JMX Metrics
[ https://issues.apache.org/jira/browse/HBASE-13965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lei Chen updated HBASE-13965:
-----------------------------
Attachment: HBASE-13965-v6.patch

Stochastic Load Balancer JMX Metrics
------------------------------------
Key: HBASE-13965
URL: https://issues.apache.org/jira/browse/HBASE-13965
Project: HBase
Issue Type: Improvement
Components: Balancer, metrics
Reporter: Lei Chen
Assignee: Lei Chen
Attachments: HBASE-13965-v3.patch, HBASE-13965-v4.patch, HBASE-13965-v5.patch, HBASE-13965-v6.patch, HBASE-13965_v2.patch, HBase-13965-v1.patch, stochasticloadbalancerclasses_v2.png

Today's default HBase load balancer (the Stochastic load balancer) is cost function based. The cost function weights are tunable, but no visibility into those cost function results is directly provided. A driving example is a cluster we have been tuning which has skewed rack sizes (one rack has half the nodes of the other few racks). We are tuning the cluster for uniform response time from all region servers with the ability to tolerate a rack failure. Balancing LocalityCost, RegionReplicaRackCost and RegionCountSkewCost is difficult without a way to attribute each cost function's contribution to the overall cost. What this jira proposes is to provide visibility via JMX into each cost function of the stochastic load balancer, as well as the overall cost of the balancing plan.
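The kind of visibility the issue asks for can be sketched with a plain MXBean exposing one attribute per cost function. Everything here is invented for illustration, including the bean name, attribute set, and registration path; the actual patch wires into HBase's metrics framework rather than registering a bean by hand.

```java
import java.lang.management.ManagementFactory;
import javax.management.ObjectName;

// Hypothetical read-only view of per-cost-function values; names assumed.
public interface BalancerCostMXBean {
    double getLocalityCost();
    double getRegionCountSkewCost();
    double getOverallCost();
}

class BalancerCostMetrics implements BalancerCostMXBean {
    // The balancer would update these after computing a plan.
    volatile double localityCost;
    volatile double regionCountSkewCost;
    volatile double overallCost;

    public double getLocalityCost() { return localityCost; }
    public double getRegionCountSkewCost() { return regionCountSkewCost; }
    public double getOverallCost() { return overallCost; }

    // Registers the bean so jconsole/jmx clients can read the attributes.
    static void register(BalancerCostMetrics metrics) throws Exception {
        ManagementFactory.getPlatformMBeanServer().registerMBean(
            metrics, new ObjectName("sketch:type=BalancerCost"));
    }
}
```

Once registered, an operator can watch how each cost term moves as weights are tuned, which is exactly the attribution problem the description raises for the skewed-rack cluster.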
[jira] [Updated] (HBASE-13965) Stochastic Load Balancer JMX Metrics
[ https://issues.apache.org/jira/browse/HBASE-13965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lei Chen updated HBASE-13965:
-----------------------------
Attachment: (was: HBASE-13965-v6.patch)
[jira] [Resolved] (HBASE-14019) Hbase table import throws RetriesExhaustedException
[ https://issues.apache.org/jira/browse/HBASE-14019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wesley Connor resolved HBASE-14019.
-----------------------------------
Resolution: Not A Problem

After much searching and Stack Overflow, I realise you need to create the table beforehand.

Hbase table import throws RetriesExhaustedException
---------------------------------------------------
Key: HBASE-14019
URL: https://issues.apache.org/jira/browse/HBASE-14019
Project: HBase
Issue Type: Bug
Components: hadoop2, hbase
Affects Versions: 0.98.9
Environment: hbase-0.98.9-hadoop2, hadoop-2.6, Linux 3.2.0-4-amd64 #1 SMP Debian 3.2.63-2+deb7u1 x86_64 GNU/Linux, Oracle jdk1.8.0_45
Reporter: Wesley Connor
Attachments: error.txt

hbase-0.98.9-hadoop2/bin/hbase org.apache.hadoop.hbase.mapreduce.Import item_restore /data/item_backup
fails with numerous RetriesExhaustedExceptions. The export process, e.g.
hbase-0.98.9-hadoop2/bin/hbase org.apache.hadoop.hbase.mapreduce.Export item /data/item_backup
works flawlessly and the file item_backup is created. Importing the same file to a table of a different name fails. Please see the attached job log.
[jira] [Commented] (HBASE-14022) TestMultiTableSnapshotInputFormatImpl uses a class only available in JRE 1.7+
[ https://issues.apache.org/jira/browse/HBASE-14022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14618868#comment-14618868 ]

Andrew Purtell commented on HBASE-14022:
----------------------------------------
Bad QA run; it thought a Phoenix UT was an HBase zombie. I need to commit this to unbreak the 0.98 Hadoop 1 build, so will do so using CTR momentarily to make forward progress.

TestMultiTableSnapshotInputFormatImpl uses a class only available in JRE 1.7+
-----------------------------------------------------------------------------
Key: HBASE-14022
URL: https://issues.apache.org/jira/browse/HBASE-14022
Project: HBase
Issue Type: Bug
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Priority: Minor
Fix For: 0.98.14
Attachments: HBASE-14022-0.98.patch

Only applicable to 0.98. Another instance where the minimum supported versions of the JRE/JDK and Hadoop lag far behind current committer dev tooling. Fix.
[jira] [Commented] (HBASE-14000) Region server failed to report Master and stuck in reportForDuty retry loop
[ https://issues.apache.org/jira/browse/HBASE-14000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14618818#comment-14618818 ]

Pankaj Kumar commented on HBASE-14000:
--------------------------------------
Thanks [~jerryhe] for looking into this. ServerNotRunningYetException was reported while master HM1 was initializing, but by the time HM1 finished initialization, another master (HM2) had become active. Since rssStub still refers to HM1, which is now in standby mode, the region server is stuck in a loop, always trying to connect to the standby master HM1.

Region server failed to report Master and stuck in reportForDuty retry loop
---------------------------------------------------------------------------
Key: HBASE-14000
URL: https://issues.apache.org/jira/browse/HBASE-14000
Project: HBase
Issue Type: Bug
Reporter: Pankaj Kumar
Assignee: Pankaj Kumar
Attachments: HBASE-14000.patch

In an HA cluster, the region server gets stuck in the reportForDuty retry loop if the active master is restarting and a master switch happens before the region server reports successfully. The root cause is the same as HBASE-13317, but here the region server tried to connect to the master while it was starting, so the rssStub reset didn't happen:
{code}
if (ioe instanceof ServerNotRunningYetException) {
  LOG.debug("Master is not running yet");
}
{code}
While the master was starting, a master switch happened, so the RS kept trying to connect to the standby master.
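The failure mode described above can be sketched in miniature: if the cached master stub is only discarded for some exception types, a retry loop can keep reporting to a master that has since gone standby. Dropping the stub on every failed attempt forces re-resolution of the active master on the next retry. All names here are illustrative stand-ins, not HBase's actual fields, and master discovery is reduced to a static string.

```java
// Toy model of the reportForDuty retry loop and the stale-stub problem.
class ReportForDutySketch {
    static String activeMaster = "hm1"; // what ZooKeeper would currently report
    static String cachedStub = null;    // analogue of rssStub

    // One attempt: resolve the master if needed, then "report" to it.
    // Returns true if we reached the active master.
    static boolean reportForDuty() {
        if (cachedStub == null) {
            cachedStub = activeMaster; // re-resolve the active master
        }
        boolean reachedActive = cachedStub.equals(activeMaster);
        if (!reachedActive) {
            // The fix: always drop the stale stub so the next retry
            // re-resolves, instead of looping on the standby master forever.
            cachedStub = null;
        }
        return reachedActive;
    }
}
```

Without the reset in the failure branch, every retry would reuse the stub pointing at the standby master, which is exactly the stuck loop the report describes.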
[jira] [Commented] (HBASE-13415) Procedure V2 - Use nonces for double submits from client
[ https://issues.apache.org/jira/browse/HBASE-13415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14619107#comment-14619107 ]

Hadoop QA commented on HBASE-13415:
-----------------------------------
{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12744269/HBASE-13415.v2-master.patch
against master branch at commit f5ad736282c8c9c27b14131919d60b72834ec9e4.
ATTACHMENT ID: 12744269

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 45 new or modified tests.
{color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0)
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 lineLengths{color}. The patch introduces lines longer than 100 (escaped string constants and field lists in generated protobuf code).
{color:green}+1 site{color}. The mvn post-site goal succeeds with this patch.
{color:red}-1 core tests{color}. The patch failed these unit tests:

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14704//testReport/
Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14704//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14704//artifact/patchprocess/checkstyle-aggregate.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14704//console

This message is automatically generated.

Procedure V2 - Use nonces for double submits from client
--------------------------------------------------------
Key: HBASE-13415
URL: https://issues.apache.org/jira/browse/HBASE-13415
Project: HBase
Issue Type: Sub-task
Components: master
Reporter: Enis Soztutar
Assignee: Stephen Yuan Jiang
Priority: Blocker
Fix For: 2.0.0, 1.2.0, 1.3.0
Attachments: HBASE-13415.v1-master.patch, HBASE-13415.v2-master.patch

The client can submit a procedure, but before getting the procId back, the master might fail. In this case, the client request will fail and the client will re-submit the request. With a 1.1 client, or if there is no contention for the table lock, the time window is pretty small, but it still might happen. If the proc was accepted and stored in the procedure store, a re-submit from the client will add another procedure, which will execute after the first one. The first one will likely succeed, and the second one will fail (for example in the case of create table, the second one will throw TableExistsException). One idea is to use client generated nonces (that we already have) to guard against these cases. The client will submit the request with the nonce and the nonce will be saved together with the procedure in the store. In case of a double submit, the nonce-cache is checked and the procId of the original request is returned.
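The nonce-cache idea from the description can be sketched as follows: a resubmitted request carrying the same nonce gets the original procId back instead of spawning a second procedure. The store and id generator are simplified to in-memory structures, and the nonce is a single long here, whereas HBase's nonces are (nonceGroup, nonce) pairs; this is a shape illustration, not the patch.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative nonce cache for deduplicating double submits.
class ProcedureExecutorSketch {
    private final Map<Long, Long> nonceToProcId = new ConcurrentHashMap<>();
    private final AtomicLong procIds = new AtomicLong();

    // First submit with a nonce allocates a procId; any resubmit with the
    // same nonce (e.g. after a master failover) returns the original procId.
    long submitProcedure(long nonce) {
        return nonceToProcId.computeIfAbsent(nonce,
            n -> procIds.incrementAndGet());
    }
}
```

This is why the second create-table in the description no longer throws TableExistsException: the duplicate is never turned into a new procedure in the first place.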
[jira] [Commented] (HBASE-13415) Procedure V2 - Use nonces for double submits from client
[ https://issues.apache.org/jira/browse/HBASE-13415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14619138#comment-14619138 ]

Stephen Yuan Jiang commented on HBASE-13415:
--------------------------------------------
The lineLengths warnings are from generated files, and hence I have no control over them. I could not find any test failure in the testReport (https://builds.apache.org/job/PreCommit-HBASE-Build/14704//testReport/), but the medium and large class tests were not run.
[jira] [Commented] (HBASE-12848) Utilize Flash storage for WAL
[ https://issues.apache.org/jira/browse/HBASE-12848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14619147#comment-14619147 ]

Andrew Purtell commented on HBASE-12848:
----------------------------------------
bq. HDFS won't do the movement of block data across diff devices when the rename happens. There is a mover CLI tool for doing so which has to be called explicitly.
A manual CLI tool? That's a bit lame. I made a proposal once for a scheme where the NN would store the storage policy for various paths as xattrs, provide device class hints to the datanode during block allocation and replication according to the defined policy, and automatically coordinate block movement from one storage tier to another should the storage policy xattr be changed for a given path. Not worth reviving that particular JIRA, but much of the groundwork for this is now in place in HDFS. Maybe someone could propose the remaining missing pieces?

Utilize Flash storage for WAL
-----------------------------
Key: HBASE-12848
URL: https://issues.apache.org/jira/browse/HBASE-12848
Project: HBase
Issue Type: Sub-task
Reporter: Ted Yu
Assignee: Ted Yu
Fix For: 2.0.0, 1.1.0
Attachments: 12848-v1.patch, 12848-v2.patch, 12848-v3.patch, 12848-v4.patch, 12848-v4.patch

One way to improve the data ingestion rate is to make use of Flash storage. HDFS does the heavy lifting - see HDFS-7228. We assume an environment where:
1. Some servers have a mix of flash, e.g. 2 flash drives and 4 traditional drives.
2. Some servers have all traditional storage.
3. RegionServers are deployed on both profiles within one HBase cluster.
This JIRA allows the WAL to be managed on flash in a mixed-profile environment.
[jira] [Commented] (HBASE-13415) Procedure V2 - Use nonces for double submits from client
[ https://issues.apache.org/jira/browse/HBASE-13415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14619153#comment-14619153 ]

Matteo Bertozzi commented on HBASE-13415:
-----------------------------------------
Looks good to me, let me commit it. I'll re-run the procedure related tests to confirm that everything is ok.
[jira] [Created] (HBASE-14042) Fix FATAL level logging in FSHLog where logged for non fatal exceptions
Andrew Purtell created HBASE-14042:
-----------------------------------
Summary: Fix FATAL level logging in FSHLog where logged for non fatal exceptions
Key: HBASE-14042
URL: https://issues.apache.org/jira/browse/HBASE-14042
Project: HBase
Issue Type: Bug
Affects Versions: 1.0.1.1, 1.1.1, 0.98.13
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Fix For: 2.0.0, 0.98.14, 1.0.2, 1.2.0, 1.1.2, 1.3.0

We FATAL level logging in FSHLog where an IOException causes a log roll to be requested. It isn't a fatal event. Drop the log level to WARN. (Could even be INFO.)
[jira] [Updated] (HBASE-14042) Fix FATAL level logging in FSHLog where logged for non fatal exceptions
[ https://issues.apache.org/jira/browse/HBASE-14042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-14042: --- Description: We have FATAL level logging in FSHLog where an IOException causes a log roll to be requested. It isn't a fatal event. Drop the log level to WARN. (Could even be INFO.) (was: We FATAL level logging in FSHLog where an IOException causes a log roll to be requested. It isn't a fatal event. Drop the log level to WARN. (Could even be INFO.)) Fix FATAL level logging in FSHLog where logged for non fatal exceptions --- Key: HBASE-14042 URL: https://issues.apache.org/jira/browse/HBASE-14042 Project: HBase Issue Type: Bug Affects Versions: 0.98.13, 1.1.1, 1.0.1.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 2.0.0, 0.98.14, 1.0.2, 1.2.0, 1.1.2, 1.3.0 We have FATAL level logging in FSHLog where an IOException causes a log roll to be requested. It isn't a fatal event. Drop the log level to WARN. (Could even be INFO.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12015) Not cleaning Mob data when Mob CF is removed from table
[ https://issues.apache.org/jira/browse/HBASE-12015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14618930#comment-14618930 ] Hadoop QA commented on HBASE-12015: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12744247/HBASE-12015.patch against master branch at commit f5ad736282c8c9c27b14131919d60b72834ec9e4. ATTACHMENT ID: 12744247 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified tests. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14702//console This message is automatically generated. Not cleaning Mob data when Mob CF is removed from table --- Key: HBASE-12015 URL: https://issues.apache.org/jira/browse/HBASE-12015 Project: HBase Issue Type: Bug Affects Versions: hbase-11339 Reporter: Anoop Sam John Assignee: Pankaj Kumar Fix For: hbase-11339 Attachments: HBASE-12015.patch During modifyTable, if a MOB CF is removed from a table, the corresponding mob data also should get removed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12015) Not cleaning Mob data when Mob CF is removed from table
[ https://issues.apache.org/jira/browse/HBASE-12015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pankaj Kumar updated HBASE-12015: - Attachment: HBASE-12015.patch Not cleaning Mob data when Mob CF is removed from table --- Key: HBASE-12015 URL: https://issues.apache.org/jira/browse/HBASE-12015 Project: HBase Issue Type: Bug Affects Versions: hbase-11339 Reporter: Anoop Sam John Assignee: Pankaj Kumar Fix For: hbase-11339 Attachments: HBASE-12015.patch During modifyTable, if a MOB CF is removed from a table, the corresponding mob data also should get removed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13897) OOM may occur when Import imports a row with too many KeyValues
[ https://issues.apache.org/jira/browse/HBASE-13897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14618931#comment-14618931 ] Hadoop QA commented on HBASE-13897: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12744250/HBASE-13897-branch_1-20150709.patch against master branch at commit f5ad736282c8c9c27b14131919d60b72834ec9e4. ATTACHMENT ID: 12744250 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14703//console This message is automatically generated. OOM may occur when Import imports a row with too many KeyValues --- Key: HBASE-13897 URL: https://issues.apache.org/jira/browse/HBASE-13897 Project: HBase Issue Type: Bug Affects Versions: 0.98.13 Reporter: Liu Junhong Assignee: Liu Junhong Fix For: 2.0.0, 0.98.14, 1.3.0 Attachments: 13897-v2.txt, HBASE-13897-0.98.patch, HBASE-13897-branch_1-20150709.patch, HBASE-13897-master-20150629.patch, HBASE-13897-master-20150630.patch, HBASE-13897-master-20150707.patch, HBASE-13897-master.patch When importing a row with too many KeyValues (may have too many columns or versions),KeyValueReducer will incur OOM. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12015) Not cleaning Mob data when Mob CF is removed from table
[ https://issues.apache.org/jira/browse/HBASE-12015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14618952#comment-14618952 ] Ashish Singhi commented on HBASE-12015: --- Pankaj, suffix the patch name with branch name. For example, HBASE-12015-hbase-11339.patch Not cleaning Mob data when Mob CF is removed from table --- Key: HBASE-12015 URL: https://issues.apache.org/jira/browse/HBASE-12015 Project: HBase Issue Type: Bug Affects Versions: hbase-11339 Reporter: Anoop Sam John Assignee: Pankaj Kumar Fix For: hbase-11339 Attachments: HBASE-12015.patch During modifyTable, if a MOB CF is removed from a table, the corresponding mob data also should get removed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13965) Stochastic Load Balancer JMX Metrics
[ https://issues.apache.org/jira/browse/HBASE-13965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14618960#comment-14618960 ] Ted Yu commented on HBASE-13965: {code} + if (tableName != null) break; +} + if (tableName != null) break; {code} You can return tableName on line of the first break. This way the second return won't be needed. For CostFunctions whose value is 0, can we omit them in the metrics ? Stochastic Load Balancer JMX Metrics Key: HBASE-13965 URL: https://issues.apache.org/jira/browse/HBASE-13965 Project: HBase Issue Type: Improvement Components: Balancer, metrics Reporter: Lei Chen Assignee: Lei Chen Attachments: HBASE-13965-v3.patch, HBASE-13965-v4.patch, HBASE-13965-v5.patch, HBASE-13965-v6.patch, HBASE-13965_v2.patch, HBase-13965-v1.patch, stochasticloadbalancerclasses_v2.png Today’s default HBase load balancer (the Stochastic load balancer) is cost function based. The cost function weights are tunable but no visibility into those cost function results is directly provided. A driving example is a cluster we have been tuning which has skewed rack size (one rack has half the nodes of the other few racks). We are tuning the cluster for uniform response time from all region servers with the ability to tolerate a rack failure. Balancing LocalityCost, RegionReplicaRack Cost and RegionCountSkew Cost is difficult without a way to attribute each cost function’s contribution to overall cost. What this jira proposes is to provide visibility via JMX into each cost function of the stochastic load balancer, as well as the overall cost of the balancing plan. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
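Ted's suggestion — return from inside the nested loop instead of breaking out of each level with a null check — can be sketched generically; the data structures here are illustrative stand-ins, not the balancer's actual types:

```java
import java.util.List;

// Sketch of replacing paired "if (tableName != null) break;" statements
// with a direct return from the inner loop.
public class FirstMatch {
    // Before: each loop level repeated the null check and broke out.
    // After: return as soon as a match is found; no flag checks needed.
    static String findFirstNonNull(List<List<String>> groups) {
        for (List<String> group : groups) {
            for (String name : group) {
                if (name != null) {
                    return name;  // replaces both break statements
                }
            }
        }
        return null;
    }
}
```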
[jira] [Updated] (HBASE-12015) Not cleaning Mob data when Mob CF is removed from table
[ https://issues.apache.org/jira/browse/HBASE-12015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashish Singhi updated HBASE-12015: -- Attachment: HBASE-12015-hbase-11339.patch On behalf of Pankaj, attaching the same patch with proper naming so that it can trigger Hadoop QA. Not cleaning Mob data when Mob CF is removed from table --- Key: HBASE-12015 URL: https://issues.apache.org/jira/browse/HBASE-12015 Project: HBase Issue Type: Bug Affects Versions: hbase-11339 Reporter: Anoop Sam John Assignee: Pankaj Kumar Fix For: hbase-11339 Attachments: HBASE-12015-hbase-11339.patch, HBASE-12015.patch During modifyTable, if a MOB CF is removed from a table, the corresponding mob data also should get removed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14000) Region server failed to report Master and stuck in reportForDuty retry loop
[ https://issues.apache.org/jira/browse/HBASE-14000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14618978#comment-14618978 ] Jerry He commented on HBASE-14000: -- bq. but by the time master (HM1) finish initialization another master (HM2) became active. When HM1 is initializing, it would continue to hold onto the master znode on ZK. HM2 cannot become the active master, unless there is a bug somewhere in there. Can you paste the HM1, HM2 and the region server logs around the time of the master failover? Region server failed to report Master and stuck in reportForDuty retry loop --- Key: HBASE-14000 URL: https://issues.apache.org/jira/browse/HBASE-14000 Project: HBase Issue Type: Bug Reporter: Pankaj Kumar Assignee: Pankaj Kumar Attachments: HBASE-14000.patch In an HA cluster, a region server got stuck in the reportForDuty retry loop when the active master was restarting and a master switch happened before the region server reported successfully. The root cause is the same as HBASE-13317, but the region server tried to connect to the master while it was starting, so the rssStub reset didn't happen because of {code} if (ioe instanceof ServerNotRunningYetException) { LOG.debug("Master is not running yet"); } {code} When the master started, a master switch happened, so the RS always tried to connect to the standby master. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
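The retry problem can be illustrated with a small sketch: if the cached master stub is only reset for some exception types, a server-not-running response leaves the region server pinned to a stale master after a switch. All names here are illustrative, not the actual HRegionServer code:

```java
// Illustrative sketch of the reportForDuty retry target selection.
// The bug: on ServerNotRunningYetException the cached stub was kept,
// so after a master switch the RS kept retrying the old master.
public class ReportForDutySketch {
    static class ServerNotRunningYetException extends Exception {}

    // Buggy behavior: keep the cached stub for this exception type,
    // so retries keep targeting the old (now standby) master.
    static String nextTargetBuggy(Exception e, String cachedMaster, String activeMaster) {
        if (e instanceof ServerNotRunningYetException) {
            return cachedMaster;  // stub not reset -> stuck in retry loop
        }
        return activeMaster;      // other failures reset the stub
    }

    // Fixed behavior: re-resolve the active master on every failure.
    static String nextTargetFixed(Exception e, String cachedMaster, String activeMaster) {
        return activeMaster;
    }
}
```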
[jira] [Commented] (HBASE-13965) Stochastic Load Balancer JMX Metrics
[ https://issues.apache.org/jira/browse/HBASE-13965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14618991#comment-14618991 ] Hadoop QA commented on HBASE-13965: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12744232/HBASE-13965-v6.patch against master branch at commit f5ad736282c8c9c27b14131919d60b72834ec9e4. ATTACHMENT ID: 12744232 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.TestJMXListener Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14701//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14701//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14701//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14701//console This message is automatically generated. Stochastic Load Balancer JMX Metrics Key: HBASE-13965 URL: https://issues.apache.org/jira/browse/HBASE-13965 Project: HBase Issue Type: Improvement Components: Balancer, metrics Reporter: Lei Chen Assignee: Lei Chen Attachments: HBASE-13965-v3.patch, HBASE-13965-v4.patch, HBASE-13965-v5.patch, HBASE-13965-v6.patch, HBASE-13965_v2.patch, HBase-13965-v1.patch, stochasticloadbalancerclasses_v2.png Today’s default HBase load balancer (the Stochastic load balancer) is cost function based. The cost function weights are tunable but no visibility into those cost function results is directly provided. A driving example is a cluster we have been tuning which has skewed rack size (one rack has half the nodes of the other few racks). We are tuning the cluster for uniform response time from all region servers with the ability to tolerate a rack failure. Balancing LocalityCost, RegionReplicaRack Cost and RegionCountSkew Cost is difficult without a way to attribute each cost function’s contribution to overall cost. What this jira proposes is to provide visibility via JMX into each cost function of the stochastic load balancer, as well as the overall cost of the balancing plan. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13965) Stochastic Load Balancer JMX Metrics
[ https://issues.apache.org/jira/browse/HBASE-13965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14618997#comment-14618997 ] Lei Chen commented on HBASE-13965: -- Yes, sounds good, since the full list of cost functions should be known to the user. Stochastic Load Balancer JMX Metrics Key: HBASE-13965 URL: https://issues.apache.org/jira/browse/HBASE-13965 Project: HBase Issue Type: Improvement Components: Balancer, metrics Reporter: Lei Chen Assignee: Lei Chen Attachments: HBASE-13965-v3.patch, HBASE-13965-v4.patch, HBASE-13965-v5.patch, HBASE-13965-v6.patch, HBASE-13965_v2.patch, HBase-13965-v1.patch, stochasticloadbalancerclasses_v2.png Today’s default HBase load balancer (the Stochastic load balancer) is cost function based. The cost function weights are tunable but no visibility into those cost function results is directly provided. A driving example is a cluster we have been tuning which has skewed rack size (one rack has half the nodes of the other few racks). We are tuning the cluster for uniform response time from all region servers with the ability to tolerate a rack failure. Balancing LocalityCost, RegionReplicaRack Cost and RegionCountSkew Cost is difficult without a way to attribute each cost function’s contribution to overall cost. What this jira proposes is to provide visibility via JMX into each cost function of the stochastic load balancer, as well as the overall cost of the balancing plan. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12015) Not cleaning Mob data when Mob CF is removed from table
[ https://issues.apache.org/jira/browse/HBASE-12015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pankaj Kumar updated HBASE-12015: - Status: Patch Available (was: Open) In DeleteColumnFamilyProcedure, the state execution sequence is modified to delete the store files first and then modify the table descriptor. We are following the same approach in DeleteTableProcedure. Patch for review. Not cleaning Mob data when Mob CF is removed from table --- Key: HBASE-12015 URL: https://issues.apache.org/jira/browse/HBASE-12015 Project: HBase Issue Type: Bug Affects Versions: hbase-11339 Reporter: Anoop Sam John Assignee: Pankaj Kumar Fix For: hbase-11339 Attachments: HBASE-12015.patch During modifyTable, if a MOB CF is removed from a table, the corresponding mob data also should get removed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13897) OOM may occur when Import imports a row with too many KeyValues
[ https://issues.apache.org/jira/browse/HBASE-13897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Junhong updated HBASE-13897: Attachment: HBASE-13897-branch_1-20150709.patch Provided a patch for branch-1. OOM may occur when Import imports a row with too many KeyValues --- Key: HBASE-13897 URL: https://issues.apache.org/jira/browse/HBASE-13897 Project: HBase Issue Type: Bug Affects Versions: 0.98.13 Reporter: Liu Junhong Assignee: Liu Junhong Fix For: 2.0.0, 0.98.14, 1.3.0 Attachments: 13897-v2.txt, HBASE-13897-0.98.patch, HBASE-13897-branch_1-20150709.patch, HBASE-13897-master-20150629.patch, HBASE-13897-master-20150630.patch, HBASE-13897-master-20150707.patch, HBASE-13897-master.patch When importing a row with too many KeyValues (may have too many columns or versions), KeyValueReducer will incur OOM. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
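The failure mode — a reducer buffering every KeyValue of a wide row before writing — and the general shape of the remedy can be sketched without the MapReduce machinery: emit in bounded batches instead of accumulating the whole row. The batch size and names below are illustrative, not what the patch actually uses:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Sketch of bounding memory in a reducer that receives all values for
// one row key: flush in fixed-size batches rather than collecting the
// entire row into one in-memory list.
public class BatchingReducerSketch {
    static final int BATCH = 3;  // illustrative; real code would use a config

    // Returns the number of flushes performed; each flush handles at
    // most BATCH values, so peak memory is O(BATCH), not O(row size).
    static int reduce(Iterator<String> valuesForRow, List<String> sink) {
        List<String> batch = new ArrayList<>();
        int flushes = 0;
        while (valuesForRow.hasNext()) {
            batch.add(valuesForRow.next());
            if (batch.size() == BATCH) {
                sink.addAll(batch);  // stand-in for context.write(...)
                batch.clear();
                flushes++;
            }
        }
        if (!batch.isEmpty()) { sink.addAll(batch); flushes++; }
        return flushes;
    }
}
```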
[jira] [Commented] (HBASE-14028) DistributedLogReplay drops edits when ITBLL 125M
[ https://issues.apache.org/jira/browse/HBASE-14028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14618938#comment-14618938 ] stack commented on HBASE-14028: --- Latest: I see a swath of edits that were posted below not showing up on the far end though they were apparently successfully digested on the remote end (found by aligning count of edits and some extra logging of sequenceids added in testbed): 2015-07-07 07:16:35,728 DEBUG [RS_LOG_REPLAY_OPS-c2024:16020-0-Writer-2] wal.WALEditsReplaySink: Replayed 231 edits in 2458ms into region=IntegrationTestBigLinkedList,\x7F\xFF\xFF\xFF\xFF\xFF\xFF\xF8,1436277280607.bb166b99140bcd32df68676b4e1b60b2., hostname=c2025.halxg.cloudera.com,16020,1436278565173, seqNum=320072187, lastSequenceId=280072763 At the time, the recovering region is flushing -- a few logs are being replayed into this recovering region concurrently -- which is what is unusual around this event. I don't really see filtering going on sink-side (except if not the primary replica). Adding more logging and retrying. DistributedLogReplay drops edits when ITBLL 125M Key: HBASE-14028 URL: https://issues.apache.org/jira/browse/HBASE-14028 Project: HBase Issue Type: Bug Components: Recovery Affects Versions: 1.2.0 Reporter: stack Testing DLR before 1.2.0RC gets cut, we are dropping edits. Issue seems to be around replay into a deployed region that is on a server that dies before all edits have finished replaying. Logging is sparse on sequenceid accounting so can't tell for sure how it is happening (and if our now accounting by Store is messing up DLR). Digging. I notice also that DLR does not refresh its cache of region location on error -- it just keeps trying till whole WAL fails 8 retries...about 30 seconds. We could do a bit of refactor and have the replay find region in new location if moved during DLR replay. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13867) Add endpoint coprocessor guide to HBase book
[ https://issues.apache.org/jira/browse/HBASE-13867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Rodionov updated HBASE-13867: -- Fix Version/s: 1.1.2 1.2.0 1.0.2 2.0.0 Add endpoint coprocessor guide to HBase book Key: HBASE-13867 URL: https://issues.apache.org/jira/browse/HBASE-13867 Project: HBase Issue Type: Task Components: Coprocessors, documentation Reporter: Vladimir Rodionov Assignee: Gaurav Bhardwaj Fix For: 2.0.0, 1.0.2, 1.2.0, 1.1.2 Attachments: HBASE-13867.1.patch Endpoint coprocessors are very poorly documented. Coprocessor section of HBase book must be updated either with its own endpoint coprocessors HOW-TO guide or, at least, with the link(s) to some other guides. There is good description here: http://www.3pillarglobal.com/insights/hbase-coprocessors -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13415) Procedure V2 - Use nonces for double submits from client
[ https://issues.apache.org/jira/browse/HBASE-13415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Yuan Jiang updated HBASE-13415: --- Attachment: HBASE-13415.v2-master.patch Procedure V2 - Use nonces for double submits from client Key: HBASE-13415 URL: https://issues.apache.org/jira/browse/HBASE-13415 Project: HBase Issue Type: Sub-task Components: master Reporter: Enis Soztutar Assignee: Stephen Yuan Jiang Priority: Blocker Fix For: 2.0.0, 1.2.0, 1.3.0 Attachments: HBASE-13415.v1-master.patch, HBASE-13415.v2-master.patch The client can submit a procedure, but before getting the procId back, the master might fail. In this case, the client request will fail and the client will re-submit the request. If 1.1 client or if there is no contention for the table lock, the time window is pretty small, but still might happen. If the proc was accepted and stored in the procedure store, a re-submit from the client will add another procedure, which will execute after the first one. The first one will likely succeed, and the second one will fail (for example in the case of create table, the second one will throw TableExistsException). One idea is to use client generated nonces (that we already have) to guard against these cases. The client will submit the request with the nonce and the nonce will be saved together with the procedure in the store. In case of a double submit, the nonce-cache is checked and the procId of the original request is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14042) Fix FATAL level logging in FSHLog where logged for non fatal exceptions
[ https://issues.apache.org/jira/browse/HBASE-14042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-14042: --- Status: Patch Available (was: Open) Fix FATAL level logging in FSHLog where logged for non fatal exceptions --- Key: HBASE-14042 URL: https://issues.apache.org/jira/browse/HBASE-14042 Project: HBase Issue Type: Bug Affects Versions: 1.0.1.1, 1.1.1, 0.98.13 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 2.0.0, 0.98.14, 1.0.2, 1.2.0, 1.1.2, 1.3.0 Attachments: HBASE-14042.patch We have FATAL level logging in FSHLog where an IOException causes a log roll to be requested. It isn't a fatal event. Drop the log level to WARN. (Could even be INFO.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14041) Client MetaCache is cleared if a ThrottlingException is thrown
[ https://issues.apache.org/jira/browse/HBASE-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619061#comment-14619061 ] stack commented on HBASE-14041: --- +1 on patch. Lets get hadoopqa to pass so we can commit (patch failed to apply to master). I wonder if there are more cases of our clearing cache beyond this nice find? Client MetaCache is cleared if a ThrottlingException is thrown -- Key: HBASE-14041 URL: https://issues.apache.org/jira/browse/HBASE-14041 Project: HBase Issue Type: Bug Components: Client Affects Versions: 1.1.0 Reporter: Eungsop Yoo Priority: Minor Attachments: 0001-Do-not-clear-MetaCache-if-a-ThrottlingException-is-t-v2.patch, 0001-Do-not-clear-MetaCache-if-a-ThrottlingException-is-t.patch During performance test with the request throttling, I saw that hbase:meta table had been read a lot. Currently the MetaCache of the client is cleared, if a ThrottlingException is thrown. It seems to be not needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
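The fix amounts to a guard at the point where the client decides whether to invalidate its cached region locations: a throttling rejection says nothing about region placement, so the cache should be kept. The exception names below mirror HBase's, but the surrounding class is an illustrative stand-in, not the patched client code:

```java
// Illustrative guard: only clear the client's meta cache for exceptions
// that imply the cached region location may be stale. A throttling
// rejection means "slow down", not "your location is wrong".
public class MetaCacheGuard {
    static class ThrottlingException extends RuntimeException {}
    static class RegionMovedException extends RuntimeException {}

    static boolean shouldClearMetaCache(Throwable cause) {
        return !(cause instanceof ThrottlingException);
    }
}
```

Without such a guard, every throttled request forces a re-read of hbase:meta, which matches the heavy meta traffic observed in the performance test.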
[jira] [Updated] (HBASE-14042) Fix FATAL level logging in FSHLog where logged for non fatal exceptions
[ https://issues.apache.org/jira/browse/HBASE-14042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-14042: --- Attachment: HBASE-14042.patch Fix FATAL level logging in FSHLog where logged for non fatal exceptions --- Key: HBASE-14042 URL: https://issues.apache.org/jira/browse/HBASE-14042 Project: HBase Issue Type: Bug Affects Versions: 0.98.13, 1.1.1, 1.0.1.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 2.0.0, 0.98.14, 1.0.2, 1.2.0, 1.1.2, 1.3.0 Attachments: HBASE-14042.patch We have FATAL level logging in FSHLog where an IOException causes a log roll to be requested. It isn't a fatal event. Drop the log level to WARN. (Could even be INFO.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13965) Stochastic Load Balancer JMX Metrics
[ https://issues.apache.org/jira/browse/HBASE-13965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619044#comment-14619044 ] stack commented on HBASE-13965: --- bq. For CostFunctions whose value is 0, can we omit them in the metrics ? So metrics can come and go (not show if zero then show when non-zero)? That will be confusing I'd say. Stochastic Load Balancer JMX Metrics Key: HBASE-13965 URL: https://issues.apache.org/jira/browse/HBASE-13965 Project: HBase Issue Type: Improvement Components: Balancer, metrics Reporter: Lei Chen Assignee: Lei Chen Attachments: HBASE-13965-v3.patch, HBASE-13965-v4.patch, HBASE-13965-v5.patch, HBASE-13965-v6.patch, HBASE-13965_v2.patch, HBase-13965-v1.patch, stochasticloadbalancerclasses_v2.png Today’s default HBase load balancer (the Stochastic load balancer) is cost function based. The cost function weights are tunable but no visibility into those cost function results is directly provided. A driving example is a cluster we have been tuning which has skewed rack size (one rack has half the nodes of the other few racks). We are tuning the cluster for uniform response time from all region servers with the ability to tolerate a rack failure. Balancing LocalityCost, RegionReplicaRack Cost and RegionCountSkew Cost is difficult without a way to attribute each cost function’s contribution to overall cost. What this jira proposes is to provide visibility via JMX into each cost function of the stochastic load balancer, as well as the overall cost of the balancing plan. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13867) Add endpoint coprocessor guide to HBase book
[ https://issues.apache.org/jira/browse/HBASE-13867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619055#comment-14619055 ] stack commented on HBASE-13867: --- [~gliptak] It'd be for the best. Just trying to review this patch on my wide screen is tough given the lines running off the side. Thanks (patch looks great going by what I've seen so far). Add endpoint coprocessor guide to HBase book Key: HBASE-13867 URL: https://issues.apache.org/jira/browse/HBASE-13867 Project: HBase Issue Type: Task Components: Coprocessors, documentation Reporter: Vladimir Rodionov Assignee: Gaurav Bhardwaj Fix For: 2.0.0, 1.0.2, 1.2.0, 1.1.2 Attachments: HBASE-13867.1.patch Endpoint coprocessors are very poorly documented. Coprocessor section of HBase book must be updated either with its own endpoint coprocessors HOW-TO guide or, at least, with the link(s) to some other guides. There is good description here: http://www.3pillarglobal.com/insights/hbase-coprocessors -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13965) Stochastic Load Balancer JMX Metrics
[ https://issues.apache.org/jira/browse/HBASE-13965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619064#comment-14619064 ] Clay B. commented on HBASE-13965: - [~tedyu] I agree with [~stack] as for Graphite there which I load my metrics into, there is a difference between a null (no value) and a zero. Often we run arithmetic functions on our metrics for alarming for example. Stochastic Load Balancer JMX Metrics Key: HBASE-13965 URL: https://issues.apache.org/jira/browse/HBASE-13965 Project: HBase Issue Type: Improvement Components: Balancer, metrics Reporter: Lei Chen Assignee: Lei Chen Attachments: HBASE-13965-v3.patch, HBASE-13965-v4.patch, HBASE-13965-v5.patch, HBASE-13965-v6.patch, HBASE-13965_v2.patch, HBase-13965-v1.patch, stochasticloadbalancerclasses_v2.png Today’s default HBase load balancer (the Stochastic load balancer) is cost function based. The cost function weights are tunable but no visibility into those cost function results is directly provided. A driving example is a cluster we have been tuning which has skewed rack size (one rack has half the nodes of the other few racks). We are tuning the cluster for uniform response time from all region servers with the ability to tolerate a rack failure. Balancing LocalityCost, RegionReplicaRack Cost and RegionCountSkew Cost is difficult without a way to attribute each cost function’s contribution to overall cost. What this jira proposes is to provide visibility via JMX into each cost function of the stochastic load balancer, as well as the overall cost of the balancing plan. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
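The null-versus-zero point can be made concrete with a small sketch: if zero-valued cost functions are omitted, a consumer like Graphite sees a shifting key set and cannot distinguish "cost is zero" from "metric missing". The class and method names are illustrative, not the balancer's metrics code:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: always publish every cost function, even at 0.0, so a metrics
// consumer sees a stable key set and downstream arithmetic keeps working.
public class BalancerMetricsSketch {
    static Map<String, Double> publish(Map<String, Double> costs, boolean omitZeros) {
        Map<String, Double> out = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : costs.entrySet()) {
            if (omitZeros && e.getValue() == 0.0) {
                continue;  // the rejected proposal: metric disappears at zero
            }
            out.put(e.getKey(), e.getValue());
        }
        return out;
    }
}
```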
[jira] [Commented] (HBASE-14042) Fix FATAL level logging in FSHLog where logged for non fatal exceptions
[ https://issues.apache.org/jira/browse/HBASE-14042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619383#comment-14619383 ] Andrew Purtell commented on HBASE-14042: Bogus precommit. Looks like a Camel UT was erroneously flagged as a zombie. This patch only changes log levels. No unit tests needed. Fix FATAL level logging in FSHLog where logged for non fatal exceptions --- Key: HBASE-14042 URL: https://issues.apache.org/jira/browse/HBASE-14042 Project: HBase Issue Type: Bug Affects Versions: 0.98.13, 1.1.1, 1.0.1.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 2.0.0, 0.98.14, 1.0.2, 1.2.0, 1.1.2, 1.3.0 Attachments: HBASE-14042-0.98.patch, HBASE-14042.patch We have FATAL level logging in FSHLog where an IOException causes a log roll to be requested. It isn't a fatal event. Drop the log level to WARN. (Could even be INFO.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12015) Not cleaning Mob data when Mob CF is removed from table
[ https://issues.apache.org/jira/browse/HBASE-12015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619297#comment-14619297 ] Hadoop QA commented on HBASE-12015: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12744268/HBASE-12015-hbase-11339.patch against hbase-11339 branch at commit f5ad736282c8c9c27b14131919d60b72834ec9e4. ATTACHMENT ID: 12744268 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:red}-1 findbugs{color}. The patch appears to cause Findbugs (version 2.0.3) to fail. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.master.TestTableLockManager org.apache.hadoop.hbase.util.TestProcessBasedCluster org.apache.hadoop.hbase.mapreduce.TestImportExport org.apache.hadoop.hbase.master.procedure.TestModifyTableProcedure org.apache.hadoop.hbase.master.TestAssignmentManagerOnCluster org.apache.hadoop.hbase.master.TestRegionPlacement2 org.apache.hadoop.hbase.master.procedure.TestDeleteColumnFamilyProcedure org.apache.hadoop.hbase.TestClusterBootOrder {color:red}-1 core zombie tests{color}. There are 5 zombie test(s): at org.apache.hadoop.hbase.io.encoding.TestChangingEncoding.testChangingEncodingWithCompaction(TestChangingEncoding.java:212) at org.apache.hadoop.hbase.io.encoding.TestEncodedSeekers.testEncodedSeeker(TestEncodedSeekers.java:122) at org.apache.hadoop.hbase.io.encoding.TestDataBlockEncoders.testSeekingOnSample(TestDataBlockEncoders.java:206) at org.apache.hadoop.hbase.io.hfile.TestCacheOnWrite.testNotCachingDataBlocksDuringCompactionInternals(TestCacheOnWrite.java:454) at org.apache.hadoop.hbase.io.hfile.TestCacheOnWrite.testNotCachingDataBlocksDuringCompaction(TestCacheOnWrite.java:478) Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14705//testReport/ Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14705//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14705//console This message is automatically generated. Not cleaning Mob data when Mob CF is removed from table --- Key: HBASE-12015 URL: https://issues.apache.org/jira/browse/HBASE-12015 Project: HBase Issue Type: Bug Affects Versions: hbase-11339 Reporter: Anoop Sam John Assignee: Pankaj Kumar Fix For: hbase-11339 Attachments: HBASE-12015-hbase-11339.patch, HBASE-12015.patch During modifyTable, if a MOB CF is removed from a table, the corresponding mob data also should get removed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13897) OOM may occur when Import imports a row with too many KeyValues
[ https://issues.apache.org/jira/browse/HBASE-13897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619303#comment-14619303 ] Hudson commented on HBASE-13897: FAILURE: Integrated in HBase-1.3 #42 (See [https://builds.apache.org/job/HBase-1.3/42/]) HBASE-13897 OOM may occur when Import imports a row with too many KeyValues (Liu Junhong) (tedyu: rev 7ab78d9ddfc93949bf221560e1a2ff255d39535c) * hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/Import.java OOM may occur when Import imports a row with too many KeyValues --- Key: HBASE-13897 URL: https://issues.apache.org/jira/browse/HBASE-13897 Project: HBase Issue Type: Bug Affects Versions: 0.98.13 Reporter: Liu Junhong Assignee: Liu Junhong Fix For: 2.0.0, 0.98.14, 1.3.0 Attachments: 13897-v2.txt, HBASE-13897-0.98.patch, HBASE-13897-branch_1-20150709.patch, HBASE-13897-master-20150629.patch, HBASE-13897-master-20150630.patch, HBASE-13897-master-20150707.patch, HBASE-13897-master.patch When importing a row with too many KeyValues (it may have too many columns or versions), KeyValueReducer will incur OOM. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14022) TestMultiTableSnapshotInputFormatImpl uses a class only available in JRE 1.7+
[ https://issues.apache.org/jira/browse/HBASE-14022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619203#comment-14619203 ] Hudson commented on HBASE-14022: FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #1005 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/1005/]) HBASE-14022 TestMultiTableSnapshotInputFormatImpl uses a class only available in JRE 1.7+ (apurtell: rev 64b33b8a388a6be4c97b5164131f31b628166474) * hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestMultiTableSnapshotInputFormatImpl.java TestMultiTableSnapshotInputFormatImpl uses a class only available in JRE 1.7+ - Key: HBASE-14022 URL: https://issues.apache.org/jira/browse/HBASE-14022 Project: HBase Issue Type: Bug Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Fix For: 0.98.14 Attachments: HBASE-14022-0.98.patch Only applicable to 0.98. Another instance where minimum supported versions of the JRE/JDK and Hadoop lag far behind current committer dev tooling. Fix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14042) Fix FATAL level logging in FSHLog where logged for non fatal exceptions
[ https://issues.apache.org/jira/browse/HBASE-14042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619450#comment-14619450 ] Hadoop QA commented on HBASE-14042: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12744294/HBASE-14042-0.98.patch against 0.98 branch at commit f5ad736282c8c9c27b14131919d60b72834ec9e4. ATTACHMENT ID: 12744294 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 24 warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14707//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14707//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14707//artifact/patchprocess/checkstyle-aggregate.html Javadoc warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14707//artifact/patchprocess/patchJavadocWarnings.txt Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14707//console This message is automatically generated. Fix FATAL level logging in FSHLog where logged for non fatal exceptions --- Key: HBASE-14042 URL: https://issues.apache.org/jira/browse/HBASE-14042 Project: HBase Issue Type: Bug Affects Versions: 0.98.13, 1.1.1, 1.0.1.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 2.0.0, 0.98.14, 1.0.2, 1.2.0, 1.1.2, 1.3.0 Attachments: HBASE-14042-0.98.patch, HBASE-14042.patch We have FATAL level logging in FSHLog where an IOException causes a log roll to be requested. It isn't a fatal event. Drop the log level to WARN. (Could even be INFO.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14042) Fix FATAL level logging in FSHLog where logged for non fatal exceptions
[ https://issues.apache.org/jira/browse/HBASE-14042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619364#comment-14619364 ] Hadoop QA commented on HBASE-14042: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12744286/HBASE-14042.patch against master branch at commit f5ad736282c8c9c27b14131919d60b72834ec9e4. ATTACHMENT ID: 12744286 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . {color:red}-1 core zombie tests{color}. 
There are 2 zombie test(s): at org.apache.camel.component.jetty.jettyproducer.HttpJettyProducerRecipientListCustomThreadPoolTest.testRecipientList(HttpJettyProducerRecipientListCustomThreadPoolTest.java:40) Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14706//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14706//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14706//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14706//console This message is automatically generated. Fix FATAL level logging in FSHLog where logged for non fatal exceptions --- Key: HBASE-14042 URL: https://issues.apache.org/jira/browse/HBASE-14042 Project: HBase Issue Type: Bug Affects Versions: 0.98.13, 1.1.1, 1.0.1.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 2.0.0, 0.98.14, 1.0.2, 1.2.0, 1.1.2, 1.3.0 Attachments: HBASE-14042-0.98.patch, HBASE-14042.patch We have FATAL level logging in FSHLog where an IOException causes a log roll to be requested. It isn't a fatal event. Drop the log level to WARN. (Could even be INFO.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12015) Not cleaning Mob data when Mob CF is removed from table
[ https://issues.apache.org/jira/browse/HBASE-12015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-12015: --- Attachment: 12015-hbase-11339.patch Same patch for rerun. Not cleaning Mob data when Mob CF is removed from table --- Key: HBASE-12015 URL: https://issues.apache.org/jira/browse/HBASE-12015 Project: HBase Issue Type: Bug Affects Versions: hbase-11339 Reporter: Anoop Sam John Assignee: Pankaj Kumar Fix For: hbase-11339 Attachments: 12015-hbase-11339.patch, HBASE-12015-hbase-11339.patch, HBASE-12015.patch During modifyTable, if a MOB CF is removed from a table, the corresponding mob data also should get removed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12015) Not cleaning Mob data when Mob CF is removed from table
[ https://issues.apache.org/jira/browse/HBASE-12015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619447#comment-14619447 ] Ted Yu commented on HBASE-12015: {code} + * @param mobFamilyDir The table directory. {code} Correct description for the parameter. For testMobFamilyDelete, please close table at the end. Not cleaning Mob data when Mob CF is removed from table --- Key: HBASE-12015 URL: https://issues.apache.org/jira/browse/HBASE-12015 Project: HBase Issue Type: Bug Affects Versions: hbase-11339 Reporter: Anoop Sam John Assignee: Pankaj Kumar Fix For: hbase-11339 Attachments: 12015-hbase-11339.patch, HBASE-12015-hbase-11339.patch, HBASE-12015.patch During modifyTable, if a MOB CF is removed from a table, the corresponding mob data also should get removed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13897) OOM may occur when Import imports a row with too many KeyValues
[ https://issues.apache.org/jira/browse/HBASE-13897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619351#comment-14619351 ] Hudson commented on HBASE-13897: SUCCESS: Integrated in HBase-1.3-IT #27 (See [https://builds.apache.org/job/HBase-1.3-IT/27/]) HBASE-13897 OOM may occur when Import imports a row with too many KeyValues (Liu Junhong) (tedyu: rev 7ab78d9ddfc93949bf221560e1a2ff255d39535c) * hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/Import.java OOM may occur when Import imports a row with too many KeyValues --- Key: HBASE-13897 URL: https://issues.apache.org/jira/browse/HBASE-13897 Project: HBase Issue Type: Bug Affects Versions: 0.98.13 Reporter: Liu Junhong Assignee: Liu Junhong Fix For: 2.0.0, 0.98.14, 1.3.0 Attachments: 13897-v2.txt, HBASE-13897-0.98.patch, HBASE-13897-branch_1-20150709.patch, HBASE-13897-master-20150629.patch, HBASE-13897-master-20150630.patch, HBASE-13897-master-20150707.patch, HBASE-13897-master.patch When importing a row with too many KeyValues (it may have too many columns or versions), KeyValueReducer will incur OOM. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
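The failure mode described in this issue can be sketched in isolation. This is a hypothetical illustration (the class and method names below are invented, not the actual Import.java code): a reducer that buffers every KeyValue of a row before writing needs heap proportional to the row's width, while writing each value as it arrives keeps memory bounded.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Illustrative sketch of the OOM discussed in HBASE-13897. A row with millions
// of columns/versions arrives as an iterator of KeyValues; collecting them all
// before writing exhausts the heap, while streaming them out does not.
public class RowReducerSketch {
    interface Writer { void write(String kv); }

    // Buffering variant: heap usage grows with the number of KeyValues in the row.
    static int reduceBuffered(Iterator<String> kvs, Writer out) {
        List<String> buffered = new ArrayList<>();
        while (kvs.hasNext()) buffered.add(kvs.next()); // O(row width) heap
        for (String kv : buffered) out.write(kv);
        return buffered.size();
    }

    // Streaming variant: constant heap regardless of row width.
    static int reduceStreaming(Iterator<String> kvs, Writer out) {
        int n = 0;
        while (kvs.hasNext()) { out.write(kvs.next()); n++; }
        return n;
    }
}
```

Both variants emit the same output; only the peak memory differs, which is why a very wide row trips the buffering form first.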
[jira] [Commented] (HBASE-13415) Procedure V2 - Use nonces for double submits from client
[ https://issues.apache.org/jira/browse/HBASE-13415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14618134#comment-14618134 ] Stephen Yuan Jiang commented on HBASE-13415: The checkstyle and test failure are related to the patch. Will post a new patch soon. Procedure V2 - Use nonces for double submits from client Key: HBASE-13415 URL: https://issues.apache.org/jira/browse/HBASE-13415 Project: HBase Issue Type: Sub-task Components: master Reporter: Enis Soztutar Assignee: Stephen Yuan Jiang Priority: Blocker Fix For: 2.0.0, 1.2.0, 1.3.0 Attachments: HBASE-13415.v1-master.patch The client can submit a procedure, but before getting the procId back, the master might fail. In this case, the client request will fail and the client will re-submit the request. If 1.1 client or if there is no contention for the table lock, the time window is pretty small, but still might happen. If the proc was accepted and stored in the procedure store, a re-submit from the client will add another procedure, which will execute after the first one. The first one will likely succeed, and the second one will fail (for example in the case of create table, the second one will throw TableExistsException). One idea is to use client generated nonces (that we already have) to guard against these cases. The client will submit the request with the nonce and the nonce will be saved together with the procedure in the store. In case of a double submit, the nonce-cache is checked and the procId of the original request is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
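The nonce idea in the issue description can be sketched as a small standalone class. This is illustrative only (hypothetical names, not the actual master or ProcedureStore code): the first submission of a (nonceGroup, nonce) pair registers a procedure, and a re-submission with the same pair returns the original procId instead of creating a second procedure.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the double-submit guard proposed in HBASE-13415 (invented names,
// not the real implementation). The nonce is stored with the procedure, so a
// client retry after a master failover maps back to the original procId.
public class NonceCacheSketch {
    private final Map<String, Long> nonceToProcId = new HashMap<>();
    private long nextProcId = 1;

    // Returns the procId for this nonce, registering a new procedure only on
    // the first submission; a double submit reuses the recorded procId.
    public synchronized long submit(long nonceGroup, long nonce) {
        String key = nonceGroup + ":" + nonce;
        Long existing = nonceToProcId.get(key);
        if (existing != null) return existing;
        long procId = nextProcId++;
        nonceToProcId.put(key, procId);
        return procId;
    }
}
```

In the real system the nonce-to-procId mapping would have to be persisted alongside the procedure and rebuilt on master restart; the in-memory map here only shows the lookup contract.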
[jira] [Commented] (HBASE-14041) Client MetaCache is cleared if a ThrottlingException is thrown
[ https://issues.apache.org/jira/browse/HBASE-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14618047#comment-14618047 ] Ashish Singhi commented on HBASE-14041: --- Nice find. bq. if (cause instanceof RegionTooBusyException || cause instanceof RegionOpeningException || cause instanceof ThrottlingException) You need to format the code; it is crossing more than 100 characters. Can you update the javadoc for the findException method? Client MetaCache is cleared if a ThrottlingException is thrown -- Key: HBASE-14041 URL: https://issues.apache.org/jira/browse/HBASE-14041 Project: HBase Issue Type: Bug Components: Client Affects Versions: 1.1.0 Reporter: Eungsop Yoo Priority: Minor Attachments: 0001-Do-not-clear-MetaCache-if-a-ThrottlingException-is-t.patch During a performance test with request throttling, I saw that the hbase:meta table was being read heavily. Currently the MetaCache of the client is cleared if a ThrottlingException is thrown. This does not seem to be needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
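The intent behind the quoted instanceof check can be sketched as follows. All class names here are stand-ins, not the real HBase client source: exceptions that merely signal back-pressure (throttling) or a transient state (region opening, region too busy) do not mean the region moved, so the cached hbase:meta location is still valid and should not be cleared.

```java
import java.io.IOException;

// Hypothetical stand-ins for the exception types named in the review comment;
// not the real org.apache.hadoop.hbase classes.
public class MetaCacheSketch {
    static class RegionTooBusyException extends IOException {}
    static class RegionOpeningException extends IOException {}
    static class ThrottlingException extends IOException {}

    // Sketch of the patch's intent: only clear the client's cached region
    // location for exceptions that could imply the location is stale.
    static boolean shouldClearMetaCache(Throwable cause) {
        return !(cause instanceof RegionTooBusyException
            || cause instanceof RegionOpeningException
            || cause instanceof ThrottlingException);
    }
}
```

Without the ThrottlingException branch, every throttled request would invalidate the cache and force a re-read of hbase:meta, which matches the heavy meta reads observed in the performance test.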
[jira] [Commented] (HBASE-13415) Procedure V2 - Use nonces for double submits from client
[ https://issues.apache.org/jira/browse/HBASE-13415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14618060#comment-14618060 ] Hadoop QA commented on HBASE-13415: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12744134/HBASE-13415.v1-master.patch against master branch at commit f5ad736282c8c9c27b14131919d60b72834ec9e4. ATTACHMENT ID: 12744134 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 42 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:red}-1 checkstyle{color}. The applied patch generated 1906 checkstyle errors (more than the master's current 1898 errors). {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 lineLengths{color}. 
The patch introduces the following lines longer than 100: + final TableName tableName, final HColumnDescriptor column, final long nonceGroup, final long nonce) { + final TableName tableName, final byte [] columnName, final long nonceGroup, final long nonce) { + final TableName tableName, final HColumnDescriptor column, final long nonceGroup, final long nonce) { + public static DeleteTableRequest buildDeleteTableRequest(final TableName tableName, final long nonceGroup, final long nonce) { + public static EnableTableRequest buildEnableTableRequest(final TableName tableName, final long nonceGroup, final long nonce) { + public static DisableTableRequest buildDisableTableRequest(final TableName tableName, final long nonceGroup, final long nonce) { + final HTableDescriptor hTableDesc, final byte [][] splitKeys, final long nonceGroup, final long nonce) { + final TableName tableName, final HTableDescriptor hTableDesc, final long nonceGroup, final long nonce) { + (\r\022\014\n\004type\030\002 \002(\r\022\016\n\006log_id\030\003 \002(\004\022\023\n\013min_ + + \030\003 \001(\004\J\n\004Type\022\007\n\003EOF\020\001\022\010\n\004INIT\020\002\022\n\n\006INS + {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: org.apache.hadoop.hbase.client.TestHBaseAdminNoCluster Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14699//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14699//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14699//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14699//console This message is automatically generated. 
Procedure V2 - Use nonces for double submits from client Key: HBASE-13415 URL: https://issues.apache.org/jira/browse/HBASE-13415 Project: HBase Issue Type: Sub-task Components: master Reporter: Enis Soztutar Assignee: Stephen Yuan Jiang Priority: Blocker Fix For: 2.0.0, 1.2.0, 1.3.0 Attachments: HBASE-13415.v1-master.patch The client can submit a procedure, but before getting the procId back, the master might fail. In this case, the client request will fail and the client will re-submit the request. If 1.1 client or if there is no contention for the table lock, the time window is pretty small, but still might happen. If the proc was accepted and stored in the procedure store, a re-submit from the client will add another procedure, which will execute after the first one. The first one will likely succeed, and the second one will fail (for example in the case of create table, the second one will throw TableExistsException). One idea is to use client generated nonces (that we already have) to guard against these cases. The client will submit the request with the nonce and the nonce will be saved together with the procedure in the store. In case of a double submit, the nonce-cache is checked and the procId of the original request is returned.
[jira] [Commented] (HBASE-13387) Add ByteBufferedCell an extension to Cell
[ https://issues.apache.org/jira/browse/HBASE-13387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14618066#comment-14618066 ] ramkrishna.s.vasudevan commented on HBASE-13387: {code} public static Cell createFirstOnRowCol(final Cell cell, byte[] qArray, int qoffest, int qlength) { public static Cell createFirstOnRowCol(Cell cell, long ts) { {code} Could the name of the second API be changed? createFirstOnRowColTS or something like that? So for now you are passing only the array part to the fake keys, and not the BB-based API? The fake Cells do not override ByteBufferedCell? A later improvement? These fake key changes decide what we will do with the blooms and the hashes. {code} if (left instanceof ByteBufferedCell) { + return ByteBufferUtils.compareTo(((ByteBufferedCell) left).getRowByteBuffer(), + ((ByteBufferedCell) left).getRowPositionInByteBuffer(), left.getRowLength(), + right.getRowArray(), right.getRowOffset(), right.getRowLength()); +} +if (right instanceof ByteBufferedCell) { + return -(ByteBufferUtils.compareTo(((ByteBufferedCell) right).getRowByteBuffer(), + ((ByteBufferedCell) right).getRowPositionInByteBuffer(), right.getRowLength(), + left.getRowArray(), left.getRowOffset(), left.getRowLength())); +} {code} Any specific reason for negating? Using left.getXXXArray and right.getXXXBuffer would also work, right? Then there would be no need for negating. Rest looks good to me. +1. The abstract vs. interface difference in JMH is very interesting, and I tried that out too. Nice work there. Something to learn, and some things are puzzling too. 
Add ByteBufferedCell an extension to Cell - Key: HBASE-13387 URL: https://issues.apache.org/jira/browse/HBASE-13387 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Reporter: Anoop Sam John Assignee: Anoop Sam John Fix For: 2.0.0 Attachments: ByteBufferedCell.docx, HBASE-13387_v1.patch, HBASE-13387_v2.patch, WIP_HBASE-13387_V2.patch, WIP_ServerCell.patch, benchmark.zip This came up in the discussion about the parent Jira, and Stack recently added it as a comment on the E2E patch there. The idea is to add a new interface 'ByteBufferedCell' in which we can add new buffer-based getter APIs and getters for the positions of the components in the BB. We will keep this interface @InterfaceAudience.Private. When the Cell is backed by a DBB, we can create an object implementing this new interface. The comparators have to be aware of this new Cell extension and have to use the BB-based APIs rather than getXXXArray(). Also provide util APIs in CellUtil to abstract the checks for the new Cell type. (Like matchingXXX APIs, getValueAsType APIs, etc.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
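A standalone example for the negation question in the review comment above (illustrative code, not the actual ByteBufferUtils/CellComparator source): when only a buffer-vs-array comparison primitive is available, the array-vs-buffer direction is obtained by swapping the arguments and negating the result, which preserves the comparator contract sign(compare(a, b)) == -sign(compare(b, a)).

```java
import java.nio.ByteBuffer;

// Worked example of why the negation is needed: the primitive only accepts
// (buffer, array) in that order, so the (array, buffer) case reuses it with
// swapped arguments and a negated result. Invented helper, not HBase code.
public class CompareSketch {
    // Compare a ByteBuffer slice against a byte[] slice, unsigned bytewise.
    static int compare(ByteBuffer buf, int pos, int len, byte[] arr, int off, int alen) {
        int n = Math.min(len, alen);
        for (int i = 0; i < n; i++) {
            int a = buf.get(pos + i) & 0xff;
            int b = arr[off + i] & 0xff;
            if (a != b) return a - b;
        }
        return len - alen; // shorter slice sorts first on a common prefix
    }

    // byte[]-first comparison expressed via the buffer-first primitive:
    // swapping the operands flips the sign, so negating restores it.
    static int compare(byte[] arr, int off, int alen, ByteBuffer buf, int pos, int len) {
        return -compare(buf, pos, len, arr, off, alen);
    }
}
```

An overload taking the array on the left (as the reviewer suggests) would avoid the negation at the cost of a second primitive; both designs give the same ordering.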
[jira] [Created] (HBASE-14043) Syntax error in Section 26.2 of Reference Guide
Joe McCarthy created HBASE-14043: Summary: Syntax error in Section 26.2 of Reference Guide Key: HBASE-14043 URL: https://issues.apache.org/jira/browse/HBASE-14043 Project: HBase Issue Type: Bug Components: documentation Reporter: Joe McCarthy Priority: Trivial The following string does not appear rendered, unlike the preceding string describing Table.put: link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#batch(java.util.List, java.lang.Object[])[Table.batch] (non-writeBuffer) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12015) Not cleaning Mob data when Mob CF is removed from table
[ https://issues.apache.org/jira/browse/HBASE-12015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619394#comment-14619394 ] Ted Yu commented on HBASE-12015: Found the following in some of the failed tests: {code} Caused by: java.lang.NoClassDefFoundError: org/apache/htrace/HTraceConfiguration at org.apache.hadoop.hbase.regionserver.HRegionServer.init(HRegionServer.java:577) at org.apache.hadoop.hbase.master.HMaster.init(HMaster.java:357) {code} Test environment seems to have had some issue. Not cleaning Mob data when Mob CF is removed from table --- Key: HBASE-12015 URL: https://issues.apache.org/jira/browse/HBASE-12015 Project: HBase Issue Type: Bug Affects Versions: hbase-11339 Reporter: Anoop Sam John Assignee: Pankaj Kumar Fix For: hbase-11339 Attachments: HBASE-12015-hbase-11339.patch, HBASE-12015.patch During modifyTable, if a MOB CF is removed from a table, the corresponding mob data also should get removed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13337) Table regions are not assigning back, after restarting all regionservers at once.
[ https://issues.apache.org/jira/browse/HBASE-13337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619784#comment-14619784 ] Hudson commented on HBASE-13337: SUCCESS: Integrated in HBase-1.2 #58 (See [https://builds.apache.org/job/HBase-1.2/58/]) HBASE-13337 Table regions are not assigning back, after restarting all regionservers at once (Samir Ahmic) (stack: rev 035d882c7b77669c931cf5090bace2e1ce1c5c6c) * hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/AsyncRpcClient.java Table regions are not assigning back, after restarting all regionservers at once. - Key: HBASE-13337 URL: https://issues.apache.org/jira/browse/HBASE-13337 Project: HBase Issue Type: Bug Components: Region Assignment Affects Versions: 2.0.0 Reporter: Y. SREENIVASULU REDDY Assignee: Samir Ahmic Priority: Blocker Fix For: 1.2.0, 1.1.2, 1.3.0 Attachments: HBASE-13337-v2.patch, HBASE-13337-v3.patch, HBASE-13337.patch Regions of the table are continuously in state=FAILED_CLOSE. {noformat} RegionState RIT time (ms) 8f62e819b356736053e06240f7f7c6fd t1,,1427362431330.8f62e819b356736053e06240f7f7c6fd. state=FAILED_CLOSE, ts=Thu Mar 26 15:05:36 IST 2015 (113s ago), server=VM1,16040,1427362531818 113929 caf59209ae65ea80fca6bdc6996a7d68 t1,,1427362431330.caf59209ae65ea80fca6bdc6996a7d68. state=FAILED_CLOSE, ts=Thu Mar 26 15:05:36 IST 2015 (113s ago), server=VM2,16040,1427362533691 113929 db52a74988f71e5cf257bbabf31f26f3 t1,,1427362431330.db52a74988f71e5cf257bbabf31f26f3. state=FAILED_CLOSE, ts=Thu Mar 26 15:05:36 IST 2015 (113s ago), server=VM3,16040,1427362533691 113920 43f3a65b9f9ff283f598c5450feab1f8 t1,,1427362431330.43f3a65b9f9ff283f598c5450feab1f8. state=FAILED_CLOSE, ts=Thu Mar 26 15:05:36 IST 2015 (113s ago), server=VM1,16040,1427362531818 113920 {noformat} *Steps to reproduce:* 1. Start HBase cluster with more than one regionserver. 2. Create a table with precreated regions. (let's say 15 regions) 3. Make sure the regions are well balanced. 4. 
Restart all the Regionserver processes at once across the cluster, except the HMaster process. 5. After restarting, the Regionservers will successfully connect to the HMaster. *Bug:* But no regions are assigned back to the Regionservers. *Master log shows the following:* {noformat} 2015-03-26 15:05:36,201 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.RegionStates: Transition {8f62e819b356736053e06240f7f7c6fd state=OFFLINE, ts=1427362536106, server=VM2,16040,1427362242602} to {8f62e819b356736053e06240f7f7c6fd state=PENDING_OPEN, ts=1427362536201, server=VM1,16040,1427362531818} 2015-03-26 15:05:36,202 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.RegionStateStore: Updating row t1,,1427362431330.8f62e819b356736053e06240f7f7c6fd. with state=PENDING_OPENsn=VM1,16040,1427362531818 2015-03-26 15:05:36,244 DEBUG [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.AssignmentManager: Force region state offline {8f62e819b356736053e06240f7f7c6fd state=PENDING_OPEN, ts=1427362536201, server=VM1,16040,1427362531818} 2015-03-26 15:05:36,244 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.RegionStates: Transition {8f62e819b356736053e06240f7f7c6fd state=PENDING_OPEN, ts=1427362536201, server=VM1,16040,1427362531818} to {8f62e819b356736053e06240f7f7c6fd state=PENDING_CLOSE, ts=1427362536244, server=VM1,16040,1427362531818} 2015-03-26 15:05:36,244 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.RegionStateStore: Updating row t1,,1427362431330.8f62e819b356736053e06240f7f7c6fd. 
with state=PENDING_CLOSE 2015-03-26 15:05:36,248 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.AssignmentManager: Server VM1,16040,1427362531818 returned java.nio.channels.ClosedChannelException for t1,,1427362431330.8f62e819b356736053e06240f7f7c6fd., try=1 of 10 2015-03-26 15:05:36,248 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.AssignmentManager: Server VM1,16040,1427362531818 returned java.nio.channels.ClosedChannelException for t1,,1427362431330.8f62e819b356736053e06240f7f7c6fd., try=2 of 10 2015-03-26 15:05:36,249 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.AssignmentManager: Server VM1,16040,1427362531818 returned java.nio.channels.ClosedChannelException for t1,,1427362431330.8f62e819b356736053e06240f7f7c6fd., try=3 of 10
[jira] [Commented] (HBASE-14042) Fix FATAL level logging in FSHLog where logged for non fatal exceptions
[ https://issues.apache.org/jira/browse/HBASE-14042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619783#comment-14619783 ] Hudson commented on HBASE-14042: SUCCESS: Integrated in HBase-1.2 #58 (See [https://builds.apache.org/job/HBase-1.2/58/]) HBASE-14042 Fix FATAL level logging in FSHLog where logged for non fatal exceptions (apurtell: rev be536ada0a0b6ed22c236cf0987cbc74f6eece8a) * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java Fix FATAL level logging in FSHLog where logged for non fatal exceptions --- Key: HBASE-14042 URL: https://issues.apache.org/jira/browse/HBASE-14042 Project: HBase Issue Type: Bug Affects Versions: 0.98.13, 1.1.1, 1.0.1.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 2.0.0, 0.98.14, 1.2.0, 1.1.2, 1.3.0, 1.0.3 Attachments: HBASE-14042-0.98.patch, HBASE-14042.patch We have FATAL level logging in FSHLog where an IOException causes a log roll to be requested. It isn't a fatal event. Drop the log level to WARN. (Could even be INFO.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13415) Procedure V2 - Use nonces for double submits from client
[ https://issues.apache.org/jira/browse/HBASE-13415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Yuan Jiang updated HBASE-13415: --- Attachment: HBASE-13415.v3-master.patch Procedure V2 - Use nonces for double submits from client Key: HBASE-13415 URL: https://issues.apache.org/jira/browse/HBASE-13415 Project: HBase Issue Type: Sub-task Components: master Reporter: Enis Soztutar Assignee: Stephen Yuan Jiang Priority: Blocker Fix For: 2.0.0, 1.2.0, 1.3.0 Attachments: HBASE-13415.v1-master.patch, HBASE-13415.v2-master.patch, HBASE-13415.v3-master.patch The client can submit a procedure, but before getting the procId back, the master might fail. In this case, the client request will fail and the client will re-submit the request. If 1.1 client or if there is no contention for the table lock, the time window is pretty small, but still might happen. If the proc was accepted and stored in the procedure store, a re-submit from the client will add another procedure, which will execute after the first one. The first one will likely succeed, and the second one will fail (for example in the case of create table, the second one will throw TableExistsException). One idea is to use client generated nonces (that we already have) to guard against these cases. The client will submit the request with the nonce and the nonce will be saved together with the procedure in the store. In case of a double submit, the nonce-cache is checked and the procId of the original request is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14029) getting started for standalone still references hadoop-version-specific binary artifacts
[ https://issues.apache.org/jira/browse/HBASE-14029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619796#comment-14619796 ] Hadoop QA commented on HBASE-14029: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12744371/HBASE-14029.1.patch against master branch at commit f5ad736282c8c9c27b14131919d60b72834ec9e4. ATTACHMENT ID: 12744371 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+0 tests included{color}. The patch appears to be a documentation patch that doesn't require tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100: + Prior to 1.x version, be sure to choose the version that corresponds with the version of Hadoop you are likely to use later + (in most cases, you should choose the file for Hadoop 2, which will be called something like _hbase-0.98.13-hadoop2-bin.tar.gz_). {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: {color:red}-1 core zombie tests{color}. 
There are 2 zombie test(s): at org.apache.hadoop.hbase.regionserver.TestTags.testFlushAndCompactionwithCombinations(TestTags.java:295) at org.apache.hadoop.hbase.regionserver.TestRegionReplicaFailover.testSecondaryRegionKillWhilePrimaryIsAcceptingWrites(TestRegionReplicaFailover.java:279) Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14710//testReport/ Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14710//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14710//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14710//console This message is automatically generated. getting started for standalone still references hadoop-version-specific binary artifacts Key: HBASE-14029 URL: https://issues.apache.org/jira/browse/HBASE-14029 Project: HBase Issue Type: Bug Components: documentation Affects Versions: 1.0.0 Reporter: Sean Busbey Assignee: Gabor Liptak Labels: beginner Attachments: HBASE-14029.1.patch As of HBase 1.0 we no longer have binary artifacts that are tied to a particular hadoop release. The current section of the ref guide for getting started with standalone mode still refers to them: {quote} Choose a download site from this list of Apache Download Mirrors. Click on the suggested top link. This will take you to a mirror of HBase Releases. Click on the folder named stable and then download the binary file that ends in .tar.gz to your local filesystem. Be sure to choose the version that corresponds with the version of Hadoop you are likely to use later. In most cases, you should choose the file for Hadoop 2, which will be called something like hbase-0.98.3-hadoop2-bin.tar.gz. Do not download the file ending in src.tar.gz for now. {quote} Either remove the reference or turn it into a note call-out for versions 0.98 and earlier. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14041) Client MetaCache is cleared if a ThrottlingException is thrown
[ https://issues.apache.org/jira/browse/HBASE-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619852#comment-14619852 ] Anoop Sam John commented on HBASE-14041: [~Eungsop Yoo] can you make the patch so that it applies cleanly to the master branch? Client MetaCache is cleared if a ThrottlingException is thrown -- Key: HBASE-14041 URL: https://issues.apache.org/jira/browse/HBASE-14041 Project: HBase Issue Type: Bug Components: Client Affects Versions: 1.1.0 Reporter: Eungsop Yoo Priority: Minor Attachments: 0001-Do-not-clear-MetaCache-if-a-ThrottlingException-is-t-v2.patch, 0001-Do-not-clear-MetaCache-if-a-ThrottlingException-is-t.patch During a performance test with request throttling, I saw that the hbase:meta table was being read a lot. Currently the client's MetaCache is cleared if a ThrottlingException is thrown. This does not seem necessary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14041) Client MetaCache is cleared if a ThrottlingException is thrown
[ https://issues.apache.org/jira/browse/HBASE-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anoop Sam John updated HBASE-14041: --- Assignee: Eungsop Yoo Client MetaCache is cleared if a ThrottlingException is thrown -- Key: HBASE-14041 URL: https://issues.apache.org/jira/browse/HBASE-14041 Project: HBase Issue Type: Bug Components: Client Affects Versions: 1.1.0 Reporter: Eungsop Yoo Assignee: Eungsop Yoo Priority: Minor Attachments: 0001-Do-not-clear-MetaCache-if-a-ThrottlingException-is-t-v2.patch, 0001-Do-not-clear-MetaCache-if-a-ThrottlingException-is-t.patch During a performance test with request throttling, I saw that the hbase:meta table was being read a lot. Currently the client's MetaCache is cleared if a ThrottlingException is thrown. This does not seem necessary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14042) Fix FATAL level logging in FSHLog where logged for non fatal exceptions
[ https://issues.apache.org/jira/browse/HBASE-14042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619886#comment-14619886 ] Hudson commented on HBASE-14042: SUCCESS: Integrated in HBase-1.2-IT #44 (See [https://builds.apache.org/job/HBase-1.2-IT/44/]) HBASE-14042 Fix FATAL level logging in FSHLog where logged for non fatal exceptions (apurtell: rev be536ada0a0b6ed22c236cf0987cbc74f6eece8a) * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java Fix FATAL level logging in FSHLog where logged for non fatal exceptions --- Key: HBASE-14042 URL: https://issues.apache.org/jira/browse/HBASE-14042 Project: HBase Issue Type: Bug Affects Versions: 0.98.13, 1.1.1, 1.0.1.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 2.0.0, 0.98.14, 1.2.0, 1.1.2, 1.3.0, 1.0.3 Attachments: HBASE-14042-0.98.patch, HBASE-14042.patch We have FATAL level logging in FSHLog where an IOException causes a log roll to be requested. It isn't a fatal event. Drop the log level to WARN. (Could even be INFO.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13743) Backport HBASE-13709 (Updates to meta table server columns may be eclipsed) to 0.98
[ https://issues.apache.org/jira/browse/HBASE-13743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619511#comment-14619511 ] Dave Latham commented on HBASE-13743: - Would be great - we got bit by this when ntpd was shut off for a couple of days to avoid the leap second and a clock drifted a few seconds. Backport HBASE-13709 (Updates to meta table server columns may be eclipsed) to 0.98 --- Key: HBASE-13743 URL: https://issues.apache.org/jira/browse/HBASE-13743 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.98.14 The problem addressed with HBASE-13709 is more likely on branch-1 and later but is still an issue with the 0.98 code. The backport doesn't look too difficult, but it is nontrivial due to the number of fix-ups needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14022) TestMultiTableSnapshotInputFormatImpl uses a class only available in JRE 1.7+
[ https://issues.apache.org/jira/browse/HBASE-14022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619587#comment-14619587 ] Hudson commented on HBASE-14022: SUCCESS: Integrated in HBase-0.98 #1051 (See [https://builds.apache.org/job/HBase-0.98/1051/]) HBASE-14022 TestMultiTableSnapshotInputFormatImpl uses a class only available in JRE 1.7+ (apurtell: rev 64b33b8a388a6be4c97b5164131f31b628166474) * hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestMultiTableSnapshotInputFormatImpl.java TestMultiTableSnapshotInputFormatImpl uses a class only available in JRE 1.7+ - Key: HBASE-14022 URL: https://issues.apache.org/jira/browse/HBASE-14022 Project: HBase Issue Type: Bug Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Fix For: 0.98.14 Attachments: HBASE-14022-0.98.patch Only applicable to 0.98. Another instance where minimum supported versions of the JRE/JDK and Hadoop lag far behind current committer dev tooling. Fix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13832) Procedure V2: master fail to start due to WALProcedureStore sync failures when HDFS data nodes count is low
[ https://issues.apache.org/jira/browse/HBASE-13832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi updated HBASE-13832: Attachment: HBASE-13832-v6.patch Procedure V2: master fail to start due to WALProcedureStore sync failures when HDFS data nodes count is low --- Key: HBASE-13832 URL: https://issues.apache.org/jira/browse/HBASE-13832 Project: HBase Issue Type: Sub-task Components: master, proc-v2 Affects Versions: 2.0.0, 1.1.0, 1.2.0 Reporter: Stephen Yuan Jiang Assignee: Matteo Bertozzi Priority: Blocker Fix For: 2.0.0, 1.2.0, 1.1.2, 1.3.0 Attachments: HBASE-13832-v0.patch, HBASE-13832-v1.patch, HBASE-13832-v2.patch, HBASE-13832-v4.patch, HBASE-13832-v5.patch, HBASE-13832-v6.patch, HDFSPipeline.java, hbase-13832-test-hang.patch, hbase-13832-v3.patch When the data node count is 3, we got a failure in WALProcedureStore#syncLoop() during master start. The failure prevents the master from starting. {noformat} 2015-05-29 13:27:16,625 ERROR [WALProcedureStoreSyncThread] wal.WALProcedureStore: Sync slot failed, abort. java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[10.333.444.555:50010,DS-3ced-93f4-47b6-9c23-1426f7a6acdc,DISK], DatanodeInfoWithStorage[10.222.666.777:50010,DS-f9c983b4-1f10-4d5e-8983-490ece56c772,DISK]], original=[DatanodeInfoWithStorage[10.333.444.555:50010,DS-3ced-93f4-47b6-9c23-1426f7a6acdc,DISK], DatanodeInfoWithStorage[10.222.666.777:50010,DS-f9c983b4-1f10-4d5e-8983-490ece56c772,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration. 
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:951) {noformat} One proposal is to implement logic similar to FSHLog's: if an IOException is thrown during syncLoop in WALProcedureStore#start(), instead of aborting immediately, we could try to roll the log and see whether that resolves the issue; if the new log cannot be created, or rolling the log throws more exceptions, we then abort. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
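The proposal above (on a sync failure, roll to a fresh log and retry, aborting only if the roll itself fails or the failures persist) can be sketched in plain Java. The Syncer interface and the roll limit below are illustrative stand-ins, not the actual WALProcedureStore API:

```java
import java.io.IOException;

// Sketch of sync-with-roll: retry a failed sync on a freshly rolled log
// instead of aborting the master on the first IOException.
class SyncWithRollSketch {

    interface Syncer {
        void sync() throws IOException;   // may fail, e.g. bad datanode pipeline
        boolean rollLog();                // open a fresh log; false if that fails too
    }

    /** Returns true if the slot was synced, false if the master should abort. */
    static boolean syncSlot(Syncer s, int maxRolls) {
        int rolls = 0;
        while (true) {
            try {
                s.sync();
                return true;
            } catch (IOException e) {
                // Instead of aborting immediately, roll to a new log and retry.
                if (rolls++ >= maxRolls || !s.rollLog()) {
                    return false;  // roll failed or retries exhausted: abort as before
                }
            }
        }
    }
}
```

A transient pipeline failure is absorbed by the roll; a persistent one still ends in an abort, matching the "if the new log cannot be created ... we then abort" fallback described above.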
[jira] [Resolved] (HBASE-14034) HBase Backup/Restore Phase 1: Abstract Coordination manager (Zk) operations
[ https://issues.apache.org/jira/browse/HBASE-14034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Rodionov resolved HBASE-14034. --- Resolution: Invalid Closing this as 'invalid'. There will be no CM-related code; all Zk operations will be moved into HBase (the *hbase:backup* table). HBase Backup/Restore Phase 1: Abstract Coordination manager (Zk) operations --- Key: HBASE-14034 URL: https://issues.apache.org/jira/browse/HBASE-14034 Project: HBase Issue Type: Task Reporter: Vladimir Rodionov Assignee: Vladimir Rodionov Fix For: 2.0.0 Abstract Coordination manager (Zk) operations. See the org.apache.hadoop.hbase.coordination package for references. Provide a Zookeeper implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14044) #keyvalue / #keyvalue.example anchor(s) in Reference Guide
[ https://issues.apache.org/jira/browse/HBASE-14044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619626#comment-14619626 ] stack commented on HBASE-14044: --- Do you know where it should be pointing [~gumption]? You have a patch sir? Thanks. #keyvalue / #keyvalue.example anchor(s) in Reference Guide -- Key: HBASE-14044 URL: https://issues.apache.org/jira/browse/HBASE-14044 Project: HBase Issue Type: Bug Components: documentation Reporter: Joe McCarthy Priority: Trivial There are several references to a #keyvalue anchor in the Reference Guide, but I do not see any definition of that anchor. There is a #keyvalue.example definition. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13890) Get/Scan from MemStore only (Client API)
[ https://issues.apache.org/jira/browse/HBASE-13890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619649#comment-14619649 ] stack commented on HBASE-13890: --- bq. This is mostly to improve high performance counters (HPC) Just reading them? Or is it to speed increments? Get/Scan from MemStore only (Client API) Key: HBASE-13890 URL: https://issues.apache.org/jira/browse/HBASE-13890 Project: HBase Issue Type: New Feature Components: API, Client, Scanners Reporter: Vladimir Rodionov Assignee: Vladimir Rodionov Attachments: HBASE-13890-v1.patch This is a short-circuit read for get/scan when the most recent data (version) of a cell can be found only in the MemStore (with very high probability). Good examples are atomic counters and appends. This feature will allow store file scanners to be bypassed completely, improving performance and latency. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13852) Replace master-slave terminology in book, site, and javadoc with a more modern vocabulary
[ https://issues.apache.org/jira/browse/HBASE-13852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619683#comment-14619683 ] Gabor Liptak commented on HBASE-13852: -- Based on a quick review, the master/slave terminology is exposed in interfaces (there is a master command, there are environment variables like HBASE_SLAVE_SLEEP), in the API (stopMaster()), in file names, etc. How extensive may incompatible changes around this be? Also, would coordinator/regionserver be the preferred rename? Replace master-slave terminology in book, site, and javadoc with a more modern vocabulary - Key: HBASE-13852 URL: https://issues.apache.org/jira/browse/HBASE-13852 Project: HBase Issue Type: Task Components: documentation, site Reporter: Andrew Purtell Priority: Minor Fix For: 2.0.0 We should reconsider our use of historical master-slave terminology everywhere - book, site, and javadoc - and replace it with a more modern vocabulary. There was a conversation in the background at HBaseCon about this (I was involved in one on Twitter). Out of some of the suggestions, I like coordinator as a replacement for master, and worker as one replacement for slave, with the other being the more descriptive regionserver or region server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13415) Procedure V2 - Use nonces for double submits from client
[ https://issues.apache.org/jira/browse/HBASE-13415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619754#comment-14619754 ] Enis Soztutar commented on HBASE-13415: --- v2 patch looks good. I have only one comment in RB. Procedure V2 - Use nonces for double submits from client Key: HBASE-13415 URL: https://issues.apache.org/jira/browse/HBASE-13415 Project: HBase Issue Type: Sub-task Components: master Reporter: Enis Soztutar Assignee: Stephen Yuan Jiang Priority: Blocker Fix For: 2.0.0, 1.2.0, 1.3.0 Attachments: HBASE-13415.v1-master.patch, HBASE-13415.v2-master.patch The client can submit a procedure, but before getting the procId back, the master might fail. In this case, the client request will fail and the client will re-submit the request. With a 1.1 client, or if there is no contention for the table lock, the time window is pretty small, but it still might happen. If the proc was accepted and stored in the procedure store, a re-submit from the client will add another procedure, which will execute after the first one. The first one will likely succeed, and the second one will fail (for example, in the case of create table, the second one will throw TableExistsException). One idea is to use client-generated nonces (which we already have) to guard against these cases. The client will submit the request with the nonce, and the nonce will be saved together with the procedure in the store. In case of a double submit, the nonce-cache is checked and the procId of the original request is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
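The nonce scheme described above amounts to an idempotency cache keyed by the client-generated nonce: a resubmit carrying the same nonce gets the original procId back instead of registering a second procedure. A toy model, with illustrative types rather than the real master-side code:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Toy nonce cache: one procId per client-generated nonce.
class NonceCacheSketch {
    private final ConcurrentHashMap<Long, Long> nonceToProcId = new ConcurrentHashMap<>();
    private final AtomicLong nextProcId = new AtomicLong(1);

    /** Register the procedure, or return the procId recorded for this nonce. */
    long submitProcedure(long nonce) {
        // computeIfAbsent is atomic, so two racing submits with the same
        // nonce can never allocate two procIds.
        return nonceToProcId.computeIfAbsent(nonce, n -> nextProcId.getAndIncrement());
    }
}
```

As the description notes, the nonce would also be saved together with the procedure in the store, so a newly elected master can rebuild this cache before accepting resubmits.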
[jira] [Commented] (HBASE-14042) Fix FATAL level logging in FSHLog where logged for non fatal exceptions
[ https://issues.apache.org/jira/browse/HBASE-14042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619809#comment-14619809 ] Hudson commented on HBASE-14042: FAILURE: Integrated in HBase-1.0 #990 (See [https://builds.apache.org/job/HBase-1.0/990/]) HBASE-14042 Fix FATAL level logging in FSHLog where logged for non fatal exceptions (apurtell: rev 8bb2a0223f6def1540f3fcfd517fc69cccba7e9a) * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java Fix FATAL level logging in FSHLog where logged for non fatal exceptions --- Key: HBASE-14042 URL: https://issues.apache.org/jira/browse/HBASE-14042 Project: HBase Issue Type: Bug Affects Versions: 0.98.13, 1.1.1, 1.0.1.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 2.0.0, 0.98.14, 1.2.0, 1.1.2, 1.3.0, 1.0.3 Attachments: HBASE-14042-0.98.patch, HBASE-14042.patch We have FATAL level logging in FSHLog where an IOException causes a log roll to be requested. It isn't a fatal event. Drop the log level to WARN. (Could even be INFO.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-14045) Bumping thrift version to 0.9.2.
Srikanth Srungarapu created HBASE-14045: --- Summary: Bumping thrift version to 0.9.2. Key: HBASE-14045 URL: https://issues.apache.org/jira/browse/HBASE-14045 Project: HBase Issue Type: Improvement Reporter: Srikanth Srungarapu Assignee: Srikanth Srungarapu Fix For: 2.0.0, 1.3.0 From a mailing list conversation: {quote} Currently, HBase is using Thrift version 0.9.0, with the latest version being 0.9.2. Currently, the HBase Thrift gateway is vulnerable to crashes due to THRIFT-2660 when used with the default transport, and the workaround for this problem is switching to framed transport. Unfortunately, the recently added impersonation support \[1\] doesn't work with framed transport, leaving thrift gateways using this feature susceptible to crashes. Updating the thrift version to 0.9.2 will help us mitigate this problem. Given that security is one of the key requirements for production clusters, it would be good to assure our users that the security features in the thrift gateway can be used without any major concerns. Aside from this, there are also some nice fixes pertaining to leaky resources in 0.9.2, like \[2\] and \[3\]. As far as compatibility guarantees are concerned, thrift assures 100% wire compatibility. However, it is my understanding that there were some minor additions (new API) in 0.9.2 \[4\] which won't work in 0.9.0, but that won't affect us since we are not using those features. And I tried running the test suite and did manual testing with the thrift version set to 0.9.2, and things ran smoothly. If there are no objections to this change, I would be more than happy to file a jira and follow this up. \[1\] https://issues.apache.org/jira/browse/HBASE-11349 \[2\] https://issues.apache.org/jira/browse/THRIFT-2274 \[3\] https://issues.apache.org/jira/browse/THRIFT-2359 \[4\] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310800version=12324954 {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14041) Client MetaCache is cleared if a ThrottlingException is thrown
[ https://issues.apache.org/jira/browse/HBASE-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619851#comment-14619851 ] Anoop Sam John commented on HBASE-14041: We are starting to do more Exception type comparisons (this or that or that)... Maybe a check against some mid-level parent Exception type would be enough? Surely another jira, as an IA... Just pointing it out here. +1 and nice catch. Client MetaCache is cleared if a ThrottlingException is thrown -- Key: HBASE-14041 URL: https://issues.apache.org/jira/browse/HBASE-14041 Project: HBase Issue Type: Bug Components: Client Affects Versions: 1.1.0 Reporter: Eungsop Yoo Priority: Minor Attachments: 0001-Do-not-clear-MetaCache-if-a-ThrottlingException-is-t-v2.patch, 0001-Do-not-clear-MetaCache-if-a-ThrottlingException-is-t.patch During a performance test with request throttling, I saw that the hbase:meta table was being read a lot. Currently the client's MetaCache is cleared if a ThrottlingException is thrown. This does not seem necessary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
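The mid-level parent exception idea in the comment above can be sketched in plain Java. The exception classes and the cache-clearing rule below are illustrative stand-ins, not the actual HBase client exception hierarchy:

```java
// Toy model: one shared parent marks exceptions that should NOT
// invalidate cached region locations.
class MetaCacheSketch {

    /** Parent type for failures that leave cached region locations valid. */
    static class DoNotRetryRegionException extends Exception {}

    /** Quota rejection: the cached region location is still correct. */
    static class ThrottlingException extends DoNotRetryRegionException {}

    /** Region moved: the cached location really is stale. */
    static class NotServingRegionException extends Exception {}

    /** One instanceof check against the parent replaces a growing
     *  chain of per-class comparisons ("this or that or that"). */
    static boolean shouldClearMetaCache(Throwable t) {
        return !(t instanceof DoNotRetryRegionException);
    }
}
```

With this shape, a new throttling-style exception opts out of MetaCache invalidation simply by extending the parent, with no new comparison added to the client.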
[jira] [Commented] (HBASE-12848) Utilize Flash storage for WAL
[ https://issues.apache.org/jira/browse/HBASE-12848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619859#comment-14619859 ] Anoop Sam John commented on HBASE-12848: bq. Allow the inode to be moved atomically in the NN, and then have the DNs lazily (and atomically per block) migration block by block between storage classes Agree. Same thinking and discussion here as well. For WAL as such it might not be that critical, as anyway the archived WALs will get removed by the LogCleaner (as long as there is no usage by replication etc.). We will see more on this on the HDFS and HBase sides. Thanks. cc [~jingcheng...@intel.com] Utilize Flash storage for WAL - Key: HBASE-12848 URL: https://issues.apache.org/jira/browse/HBASE-12848 Project: HBase Issue Type: Sub-task Reporter: Ted Yu Assignee: Ted Yu Fix For: 2.0.0, 1.1.0 Attachments: 12848-v1.patch, 12848-v2.patch, 12848-v3.patch, 12848-v4.patch, 12848-v4.patch One way to improve the data ingestion rate is to make use of Flash storage. HDFS is doing the heavy lifting - see HDFS-7228. We assume an environment where: 1. Some servers have a mix of flash, e.g. 2 flash drives and 4 traditional drives. 2. Some servers have all traditional storage. 3. RegionServers are deployed on both profiles within one HBase cluster. This JIRA allows the WAL to be managed on flash in a mixed-profile environment. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14045) Bumping thrift version to 0.9.2.
[ https://issues.apache.org/jira/browse/HBASE-14045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619870#comment-14619870 ] Sean Busbey commented on HBASE-14045: - could we get a japi-compliance-checker report for the library? Just want to be cognizant of any obvious gotchas for downstreamers. Bumping thrift version to 0.9.2. Key: HBASE-14045 URL: https://issues.apache.org/jira/browse/HBASE-14045 Project: HBase Issue Type: Improvement Reporter: Srikanth Srungarapu Assignee: Srikanth Srungarapu Fix For: 2.0.0, 1.3.0 From a mailing list conversation: {quote} Currently, HBase is using Thrift version 0.9.0, with the latest version being 0.9.2. Currently, the HBase Thrift gateway is vulnerable to crashes due to THRIFT-2660 when used with the default transport, and the workaround for this problem is switching to framed transport. Unfortunately, the recently added impersonation support \[1\] doesn't work with framed transport, leaving thrift gateways using this feature susceptible to crashes. Updating the thrift version to 0.9.2 will help us mitigate this problem. Given that security is one of the key requirements for production clusters, it would be good to assure our users that the security features in the thrift gateway can be used without any major concerns. Aside from this, there are also some nice fixes pertaining to leaky resources in 0.9.2, like \[2\] and \[3\]. As far as compatibility guarantees are concerned, thrift assures 100% wire compatibility. However, it is my understanding that there were some minor additions (new API) in 0.9.2 \[4\] which won't work in 0.9.0, but that won't affect us since we are not using those features. And I tried running the test suite and did manual testing with the thrift version set to 0.9.2, and things ran smoothly. If there are no objections to this change, I would be more than happy to file a jira and follow this up. 
\[1\] https://issues.apache.org/jira/browse/HBASE-11349 \[2\] https://issues.apache.org/jira/browse/THRIFT-2274 \[3\] https://issues.apache.org/jira/browse/THRIFT-2359 \[4\] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310800version=12324954 {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13337) Table regions are not assigning back, after restarting all regionservers at once.
[ https://issues.apache.org/jira/browse/HBASE-13337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619750#comment-14619750 ] Hudson commented on HBASE-13337: FAILURE: Integrated in HBase-1.3 #43 (See [https://builds.apache.org/job/HBase-1.3/43/]) HBASE-13337 Table regions are not assigning back, after restarting all regionservers at once (Samir Ahmic) (stack: rev 24cf287df44d1b6d84d5f14fbf5cf254f3df3bcb) * hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/AsyncRpcClient.java Table regions are not assigning back, after restarting all regionservers at once. - Key: HBASE-13337 URL: https://issues.apache.org/jira/browse/HBASE-13337 Project: HBase Issue Type: Bug Components: Region Assignment Affects Versions: 2.0.0 Reporter: Y. SREENIVASULU REDDY Assignee: Samir Ahmic Priority: Blocker Fix For: 1.2.0, 1.1.2, 1.3.0 Attachments: HBASE-13337-v2.patch, HBASE-13337-v3.patch, HBASE-13337.patch Regions of the table are continuously in state=FAILED_CLOSE. {noformat} RegionState RIT time (ms) 8f62e819b356736053e06240f7f7c6fd t1,,1427362431330.8f62e819b356736053e06240f7f7c6fd. state=FAILED_CLOSE, ts=Thu Mar 26 15:05:36 IST 2015 (113s ago), server=VM1,16040,1427362531818 113929 caf59209ae65ea80fca6bdc6996a7d68 t1,,1427362431330.caf59209ae65ea80fca6bdc6996a7d68. state=FAILED_CLOSE, ts=Thu Mar 26 15:05:36 IST 2015 (113s ago), server=VM2,16040,1427362533691 113929 db52a74988f71e5cf257bbabf31f26f3 t1,,1427362431330.db52a74988f71e5cf257bbabf31f26f3. state=FAILED_CLOSE, ts=Thu Mar 26 15:05:36 IST 2015 (113s ago), server=VM3,16040,1427362533691 113920 43f3a65b9f9ff283f598c5450feab1f8 t1,,1427362431330.43f3a65b9f9ff283f598c5450feab1f8. state=FAILED_CLOSE, ts=Thu Mar 26 15:05:36 IST 2015 (113s ago), server=VM1,16040,1427362531818 113920 {noformat} *Steps to reproduce:* 1. Start an HBase cluster with more than one regionserver. 2. Create a table with precreated regions. (let's say 15 regions) 3. Make sure the regions are well balanced. 4. 
Restart all the Regionserver processes at once across the cluster, except the HMaster process. 5. After restarting, the Regionservers successfully connect to the HMaster. *Bug:* But no regions are assigned back to the Regionservers. *The master log shows the following:* {noformat} 2015-03-26 15:05:36,201 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.RegionStates: Transition {8f62e819b356736053e06240f7f7c6fd state=OFFLINE, ts=1427362536106, server=VM2,16040,1427362242602} to {8f62e819b356736053e06240f7f7c6fd state=PENDING_OPEN, ts=1427362536201, server=VM1,16040,1427362531818} 2015-03-26 15:05:36,202 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.RegionStateStore: Updating row t1,,1427362431330.8f62e819b356736053e06240f7f7c6fd. with state=PENDING_OPENsn=VM1,16040,1427362531818 2015-03-26 15:05:36,244 DEBUG [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.AssignmentManager: Force region state offline {8f62e819b356736053e06240f7f7c6fd state=PENDING_OPEN, ts=1427362536201, server=VM1,16040,1427362531818} 2015-03-26 15:05:36,244 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.RegionStates: Transition {8f62e819b356736053e06240f7f7c6fd state=PENDING_OPEN, ts=1427362536201, server=VM1,16040,1427362531818} to {8f62e819b356736053e06240f7f7c6fd state=PENDING_CLOSE, ts=1427362536244, server=VM1,16040,1427362531818} 2015-03-26 15:05:36,244 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.RegionStateStore: Updating row t1,,1427362431330.8f62e819b356736053e06240f7f7c6fd. 
with state=PENDING_CLOSE 2015-03-26 15:05:36,248 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.AssignmentManager: Server VM1,16040,1427362531818 returned java.nio.channels.ClosedChannelException for t1,,1427362431330.8f62e819b356736053e06240f7f7c6fd., try=1 of 10 2015-03-26 15:05:36,248 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.AssignmentManager: Server VM1,16040,1427362531818 returned java.nio.channels.ClosedChannelException for t1,,1427362431330.8f62e819b356736053e06240f7f7c6fd., try=2 of 10 2015-03-26 15:05:36,249 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.AssignmentManager: Server VM1,16040,1427362531818 returned java.nio.channels.ClosedChannelException for t1,,1427362431330.8f62e819b356736053e06240f7f7c6fd., try=3 of 10
[jira] [Commented] (HBASE-13415) Procedure V2 - Use nonces for double submits from client
[ https://issues.apache.org/jira/browse/HBASE-13415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619790#comment-14619790 ] Stephen Yuan Jiang commented on HBASE-13415: The V3 patch addressed the comment in RB. Procedure V2 - Use nonces for double submits from client Key: HBASE-13415 URL: https://issues.apache.org/jira/browse/HBASE-13415 Project: HBase Issue Type: Sub-task Components: master Reporter: Enis Soztutar Assignee: Stephen Yuan Jiang Priority: Blocker Fix For: 2.0.0, 1.2.0, 1.3.0 Attachments: HBASE-13415.v1-master.patch, HBASE-13415.v2-master.patch, HBASE-13415.v3-master.patch The client can submit a procedure, but before getting the procId back, the master might fail. In this case, the client request will fail and the client will re-submit the request. With a 1.1 client, or if there is no contention for the table lock, the time window is pretty small, but it still might happen. If the proc was accepted and stored in the procedure store, a re-submit from the client will add another procedure, which will execute after the first one. The first one will likely succeed, and the second one will fail (for example, in the case of create table, the second one will throw TableExistsException). One idea is to use client-generated nonces (which we already have) to guard against these cases. The client will submit the request with the nonce, and the nonce will be saved together with the procedure in the store. In case of a double submit, the nonce-cache is checked and the procId of the original request is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12596) bulkload needs to follow locality
[ https://issues.apache.org/jira/browse/HBASE-12596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Victor Xu updated HBASE-12596: -- Attachment: HBASE-12596-master-v6.patch Committed a new version for the master branch. bulkload needs to follow locality - Key: HBASE-12596 URL: https://issues.apache.org/jira/browse/HBASE-12596 Project: HBase Issue Type: Improvement Components: HFile, regionserver Affects Versions: 0.98.8 Environment: hadoop-2.3.0, hbase-0.98.8, jdk1.7 Reporter: Victor Xu Assignee: Victor Xu Fix For: 0.98.14 Attachments: HBASE-12596-0.98-v1.patch, HBASE-12596-0.98-v2.patch, HBASE-12596-0.98-v3.patch, HBASE-12596-0.98-v4.patch, HBASE-12596-0.98-v5.patch, HBASE-12596-master-v1.patch, HBASE-12596-master-v2.patch, HBASE-12596-master-v3.patch, HBASE-12596-master-v4.patch, HBASE-12596-master-v5.patch, HBASE-12596-master-v6.patch, HBASE-12596.patch Normally, we have 2 steps to perform a bulkload: 1. Use a job to write the HFiles to be loaded; 2. move these HFiles to the right HDFS directory. However, the locality could be lost during the first step. Why not just write the HFiles directly into the right place? We can do this easily because StoreFile.WriterBuilder has the withFavoredNodes method, and we just need to call it in HFileOutputFormat's getNewWriter(). This feature is enabled by default, and we could use 'hbase.bulkload.locality.sensitive.enabled=false' to disable it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
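The locality idea above can be modeled with a toy region map: for each row being written, look up which hosts serve the enclosing region and hand those hosts to the HFile writer as favored nodes. The map and method names below are illustrative; per the description, the real change passes these hosts to StoreFile.WriterBuilder's withFavoredNodes from HFileOutputFormat's getNewWriter():

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy lookup from row key to the hosts serving the enclosing region.
class FavoredNodesSketch {
    /** region start key -> hosts serving that region */
    private final NavigableMap<String, List<String>> regionHosts = new TreeMap<>();

    void addRegion(String startKey, List<String> hosts) {
        regionHosts.put(startKey, hosts);
    }

    /** Favored nodes for the region containing rowKey: the greatest
     *  region start key that is <= rowKey wins. */
    List<String> favoredNodesFor(String rowKey) {
        Map.Entry<String, List<String>> e = regionHosts.floorEntry(rowKey);
        return e == null ? Collections.<String>emptyList() : e.getValue();
    }
}
```

Writing each HFile with the region's own hosts as favored nodes means the blocks land on the datanodes local to the serving region server, so locality survives the bulkload instead of being lost in step 1.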
[jira] [Commented] (HBASE-13890) Get/Scan from MemStore only (Client API)
[ https://issues.apache.org/jira/browse/HBASE-13890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619863#comment-14619863 ] Anoop Sam John commented on HBASE-13890: If the increment tries to find the latest counter in the memstore only (ya, mostly it will be there), it will improve the increment latency greatly. The read is the most time-consuming part of the op. bq. This is mostly to improve high performance counters (HPC), not Get, not Append (is anybody using them anyway) and not Scan operations. Most recent versions of HPCs are always in the Memstore (99.99% of the time), but each store file in this region/cf has its version as well (before major compaction) That said, can we limit the feature to increment only, then? There might be use cases for other reads as well, but keep those for later IAs? Get/Scan from MemStore only (Client API) Key: HBASE-13890 URL: https://issues.apache.org/jira/browse/HBASE-13890 Project: HBase Issue Type: New Feature Components: API, Client, Scanners Reporter: Vladimir Rodionov Assignee: Vladimir Rodionov Attachments: HBASE-13890-v1.patch This is a short-circuit read for get/scan when the most recent data (version) of a cell can be found only in the MemStore (with very high probability). Good examples are atomic counters and appends. This feature will allow store file scanners to be bypassed completely, improving performance and latency. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
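The short-circuit read under discussion can be modeled with a toy two-level store, where increments keep the latest counter memstore-resident and a memstore-only get never consults store files. These are illustrative stand-ins, not the real MemStore or StoreScanner classes:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model: two maps stand in for the MemStore and the store files.
class MemStoreOnlyReadSketch {
    private final Map<String, Long> memstore = new HashMap<>();
    private final Map<String, Long> storeFiles = new HashMap<>();

    /** Increment rewrites the counter, so the newest value stays memstore-resident. */
    long increment(String row, long delta) {
        long current = memstore.getOrDefault(row, storeFiles.getOrDefault(row, 0L));
        long next = current + delta;
        memstore.put(row, next);
        return next;
    }

    /** Memstore-only get: never opens store file scanners; null if not resident. */
    Long getFromMemStoreOnly(String row) {
        return memstore.get(row);
    }
}
```

Because every increment rewrites the counter, the newest version is memstore-resident almost all of the time, which is why limiting the short-circuit to increment, as suggested in the comment above, covers the main use case.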
[jira] [Commented] (HBASE-14042) Fix FATAL level logging in FSHLog where logged for non fatal exceptions
[ https://issues.apache.org/jira/browse/HBASE-14042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619898#comment-14619898 ] Hudson commented on HBASE-14042: FAILURE: Integrated in HBase-1.3 #44 (See [https://builds.apache.org/job/HBase-1.3/44/]) HBASE-14042 Fix FATAL level logging in FSHLog where logged for non fatal exceptions (apurtell: rev c66ff887e63a9da4f6c24c92dae917fb21260948) * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java Fix FATAL level logging in FSHLog where logged for non fatal exceptions --- Key: HBASE-14042 URL: https://issues.apache.org/jira/browse/HBASE-14042 Project: HBase Issue Type: Bug Affects Versions: 0.98.13, 1.1.1, 1.0.1.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 2.0.0, 0.98.14, 1.2.0, 1.1.2, 1.3.0, 1.0.3 Attachments: HBASE-14042-0.98.patch, HBASE-14042.patch We have FATAL level logging in FSHLog where an IOException causes a log roll to be requested. It isn't a fatal event. Drop the log level to WARN. (Could even be INFO.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13832) Procedure V2: master fail to start due to WALProcedureStore sync failures when HDFS data nodes count is low
[ https://issues.apache.org/jira/browse/HBASE-13832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619752#comment-14619752 ] Hadoop QA commented on HBASE-13832: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12744362/HBASE-13832-v6.patch against master branch at commit f5ad736282c8c9c27b14131919d60b72834ec9e4. ATTACHMENT ID: 12744362 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 12 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: {color:red}-1 core zombie tests{color}. 
There are 1 zombie test(s): at org.apache.qpid.server.queue.ProducerFlowControlTest.testCapacityExceededCausesBlock(ProducerFlowControlTest.java:123) at org.apache.qpid.test.utils.QpidBrokerTestCase.runBare(QpidBrokerTestCase.java:323) at org.apache.qpid.test.utils.QpidTestCase.run(QpidTestCase.java:155) Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14709//testReport/ Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14709//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14709//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14709//console This message is automatically generated. Procedure V2: master fail to start due to WALProcedureStore sync failures when HDFS data nodes count is low --- Key: HBASE-13832 URL: https://issues.apache.org/jira/browse/HBASE-13832 Project: HBase Issue Type: Sub-task Components: master, proc-v2 Affects Versions: 2.0.0, 1.1.0, 1.2.0 Reporter: Stephen Yuan Jiang Assignee: Matteo Bertozzi Priority: Blocker Fix For: 2.0.0, 1.2.0, 1.1.2, 1.3.0 Attachments: HBASE-13832-v0.patch, HBASE-13832-v1.patch, HBASE-13832-v2.patch, HBASE-13832-v4.patch, HBASE-13832-v5.patch, HBASE-13832-v6.patch, HDFSPipeline.java, hbase-13832-test-hang.patch, hbase-13832-v3.patch When the data node count is < 3, we got a failure in WALProcedureStore#syncLoop() during master start. The failure prevents the master from getting started. {noformat} 2015-05-29 13:27:16,625 ERROR [WALProcedureStoreSyncThread] wal.WALProcedureStore: Sync slot failed, abort. java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. 
(Nodes: current=[DatanodeInfoWithStorage[10.333.444.555:50010,DS-3ced-93f4-47b6-9c23-1426f7a6acdc,DISK], DatanodeInfoWithStorage[10.222.666.777:50010,DS-f9c983b4-1f10-4d5e-8983-490ece56c772,DISK]], original=[DatanodeInfoWithStorage[10.333.444.555:50010,DS-3ced-93f4-47b6-9c23-1426f7a6acdc,DISK], DatanodeInfoWithStorage[10.222.666.777:50010,DS-f9c983b4-1f10-4d5e-8983-490ece56c772,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration. at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:951) {noformat} One proposal is to implement logic similar to FSHLog's: if an IOException is thrown during syncLoop in WALProcedureStore#start(), instead of an immediate abort, we could try to roll the log and see whether this resolves the issue; if the new log cannot be created or more exceptions come from rolling the log, we then abort. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
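The proposed recovery above — roll once on a sync failure, retry, and abort only if that also fails — can be sketched as a small state machine. This is a hypothetical shape, not WALProcedureStore's actual API; the `Wal` interface and method names are invented for illustration.

```java
import java.io.IOException;

// Sketch of "roll the log before aborting": a sync IOException triggers one
// roll-and-retry; abort only when the roll fails or the retried sync fails too.
class RollOnSyncFailure {
    interface Wal { void sync() throws IOException; }

    /** Returns true if the edit was durably synced, false if we must abort. */
    static boolean syncWithOneRoll(Wal current, Wal freshAfterRoll) {
        try {
            current.sync();
            return true;                     // normal path: no roll needed
        } catch (IOException first) {
            if (freshAfterRoll == null) {
                return false;                // rolling the log failed: abort
            }
            try {
                freshAfterRoll.sync();       // retry on the freshly rolled log
                return true;
            } catch (IOException second) {
                return false;                // pipeline still broken: abort
            }
        }
    }
}
```

In the three-datanode scenario from the stack trace, a roll gives HDFS a chance to build a fresh pipeline instead of replacing a bad datanode in the old one, which is exactly the case the DEFAULT replacement policy cannot handle with too few nodes.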
[jira] [Commented] (HBASE-13997) ScannerCallableWithReplicas cause Infinitely blocking
[ https://issues.apache.org/jira/browse/HBASE-13997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619803#comment-14619803 ] Hadoop QA commented on HBASE-13997: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12744374/hbase-13997_v2.patch against master branch at commit f5ad736282c8c9c27b14131919d60b72834ec9e4. ATTACHMENT ID: 12744374 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.client.TestFastFail Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14711//testReport/ Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14711//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14711//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14711//console This message is automatically generated. ScannerCallableWithReplicas cause Infinitely blocking - Key: HBASE-13997 URL: https://issues.apache.org/jira/browse/HBASE-13997 Project: HBase Issue Type: Bug Components: Client Affects Versions: 1.0.1.1 Reporter: Zephyr Guo Assignee: Zephyr Guo Priority: Minor Attachments: HBASE-13997.patch, hbase-13997_v2.patch Bug in ScannerCallableWithReplicas.addCallsForOtherReplicas method {code:title=code in ScannerCallableWithReplicas.addCallsForOtherReplicas|borderStyle=solid}
private int addCallsForOtherReplicas(
    BoundedCompletionService<Pair<Result[], ScannerCallable>> cs, RegionLocations rl, int min, int max) {
  if (scan.getConsistency() == Consistency.STRONG) {
    return 0; // not scheduling on other replicas for strong consistency
  }
  for (int id = min; id <= max; id++) {
    if (currentScannerCallable.getHRegionInfo().getReplicaId() == id) {
      continue; // this was already scheduled earlier
    }
    ScannerCallable s = currentScannerCallable.getScannerCallableForReplica(id);
    if (this.lastResult != null) {
      s.getScan().setStartRow(this.lastResult.getRow());
    }
    outstandingCallables.add(s);
    RetryingRPC retryingOnReplica = new RetryingRPC(s);
    cs.submit(retryingOnReplica);
  }
  return max - min + 1; // bug? should be max - min, because the continue always happens once
}
{code} It can leave completed < submitted forever, so the following code will block infinitely. 
{code:title=code in ScannerCallableWithReplicas.call|borderStyle=solid}
// submitted larger than the actual one
submitted += addCallsForOtherReplicas(cs, rl, 0, rl.size() - 1);
try {
  // here will be affected
  while (completed < submitted) {
    try {
      Future<Pair<Result[], ScannerCallable>> f = cs.take();
      Pair<Result[], ScannerCallable> r = f.get();
      if (r != null && r.getSecond() != null) {
        updateCurrentlyServingReplica(r.getSecond(), r.getFirst(), done, pool);
      }
      return r == null ? null : r.getFirst(); // great we got an answer
    } catch (ExecutionException e) {
      // if not cancel or interrupt, wait until all RPCs are done
      // one of the tasks failed. Save the exception for later.
      if (exceptions == null) exceptions = new ArrayList<ExecutionException>(rl.size());
      exceptions.add(e);
{code}
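The off-by-one Zephyr describes can be distilled into a standalone check (plain Java, names invented): the loop visits replica ids min..max but always skips the current replica, so it submits max - min calls, while the method reports max - min + 1. The caller then waits for one completion that can never arrive.

```java
// Distilled model of the addCallsForOtherReplicas accounting bug.
class ReplicaSubmitCount {
    // Counts what the loop actually submits: every id in [min, max]
    // except the current replica, which was scheduled earlier.
    static int actuallySubmitted(int min, int max, int currentReplicaId) {
        int submitted = 0;
        for (int id = min; id <= max; id++) {
            if (id == currentReplicaId) continue;  // skipped exactly once
            submitted++;
        }
        return submitted;
    }

    // What the buggy method returns to the caller.
    static int reported(int min, int max) {
        return max - min + 1;
    }
}
```

With min=0, max=2 and the current replica inside that range, two RPCs are submitted but three are reported, so `while (completed < submitted)` never terminates once the two real futures complete.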
[jira] [Commented] (HBASE-14042) Fix FATAL level logging in FSHLog where logged for non fatal exceptions
[ https://issues.apache.org/jira/browse/HBASE-14042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619824#comment-14619824 ] Hudson commented on HBASE-14042: FAILURE: Integrated in HBase-TRUNK #6637 (See [https://builds.apache.org/job/HBase-TRUNK/6637/]) HBASE-14042 Fix FATAL level logging in FSHLog where logged for non fatal exceptions (apurtell: rev 41c8ec7aeae859808a217bd7a561e81be7e3c7ac) * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java Fix FATAL level logging in FSHLog where logged for non fatal exceptions --- Key: HBASE-14042 URL: https://issues.apache.org/jira/browse/HBASE-14042 Project: HBase Issue Type: Bug Affects Versions: 0.98.13, 1.1.1, 1.0.1.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 2.0.0, 0.98.14, 1.2.0, 1.1.2, 1.3.0, 1.0.3 Attachments: HBASE-14042-0.98.patch, HBASE-14042.patch We have FATAL level logging in FSHLog where an IOException causes a log roll to be requested. It isn't a fatal event. Drop the log level to WARN. (Could even be INFO.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13890) Get/Scan from MemStore only (Client API)
[ https://issues.apache.org/jira/browse/HBASE-13890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619866#comment-14619866 ] Anoop Sam John commented on HBASE-13890: bq.public OperationWithAttributes setMemstoreOnly(boolean v){ We will have to override this method in Get, Scan, Append and Increment (wherever we want the feature) and return those types rather than OperationWithAttributes. Also, Put, Delete etc. will have this method available. Do we need to throw an Exception when someone calls this on Put, Delete etc.? Get/Scan from MemStore only (Client API) Key: HBASE-13890 URL: https://issues.apache.org/jira/browse/HBASE-13890 Project: HBase Issue Type: New Feature Components: API, Client, Scanners Reporter: Vladimir Rodionov Assignee: Vladimir Rodionov Attachments: HBASE-13890-v1.patch This is a short-circuit read for get/scan when the recent data (version) of a cell can be found only in the MemStore (with very high probability). Good examples are atomic counters and appends. This feature will allow bypassing store file scanners completely and improve performance and latency. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
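The API wrinkle Anoop raises is the classic fluent-setter-on-a-base-class problem: a setter declared on the base type returns the base type, breaking method chaining in subclasses, so every subclass that wants the feature must override it with a covariant return type, while subclasses where it makes no sense inherit it anyway. A standalone sketch (class names mimic HBase's but this is not HBase code):

```java
// Base class declares the fluent setter, returning the base type.
class OperationWithAttributes {
    boolean memstoreOnly;
    OperationWithAttributes setMemstoreOnly(boolean v) {
        this.memstoreOnly = v;
        return this;
    }
}

// Get wants the feature: covariant override keeps chaining typed as Get.
class Get extends OperationWithAttributes {
    @Override
    Get setMemstoreOnly(boolean v) {
        super.setMemstoreOnly(v);
        return this;
    }
    Get setRow(String row) { return this; }  // chaining would break without the override
}

// Put inherits the method but the feature is meaningless for writes;
// one option from the comment is to reject the call outright.
class Put extends OperationWithAttributes {
    @Override
    Put setMemstoreOnly(boolean v) {
        throw new UnsupportedOperationException("memstore-only has no meaning for Put");
    }
}
```

The alternative HBase often uses for cases like this is an attribute (`setAttribute`-style key/value) read only by the operations that care, which avoids both the covariant overrides and the need to throw from Put/Delete.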
[jira] [Commented] (HBASE-14042) Fix FATAL level logging in FSHLog where logged for non fatal exceptions
[ https://issues.apache.org/jira/browse/HBASE-14042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619878#comment-14619878 ] Hudson commented on HBASE-14042: SUCCESS: Integrated in HBase-1.1 #579 (See [https://builds.apache.org/job/HBase-1.1/579/]) HBASE-14042 Fix FATAL level logging in FSHLog where logged for non fatal exceptions (apurtell: rev 49ee8ce4cafdcb9cbe4e9705eab1074defb28859) * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java Fix FATAL level logging in FSHLog where logged for non fatal exceptions --- Key: HBASE-14042 URL: https://issues.apache.org/jira/browse/HBASE-14042 Project: HBase Issue Type: Bug Affects Versions: 0.98.13, 1.1.1, 1.0.1.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 2.0.0, 0.98.14, 1.2.0, 1.1.2, 1.3.0, 1.0.3 Attachments: HBASE-14042-0.98.patch, HBASE-14042.patch We have FATAL level logging in FSHLog where an IOException causes a log roll to be requested. It isn't a fatal event. Drop the log level to WARN. (Could even be INFO.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13387) Add ByteBufferedCell an extension to Cell
[ https://issues.apache.org/jira/browse/HBASE-13387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anoop Sam John updated HBASE-13387: --- Attachment: HBASE-13387_v3.patch Add ByteBufferedCell an extension to Cell - Key: HBASE-13387 URL: https://issues.apache.org/jira/browse/HBASE-13387 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Reporter: Anoop Sam John Assignee: Anoop Sam John Fix For: 2.0.0 Attachments: ByteBufferedCell.docx, HBASE-13387_v1.patch, HBASE-13387_v2.patch, HBASE-13387_v3.patch, WIP_HBASE-13387_V2.patch, WIP_ServerCell.patch, benchmark.zip This came in btw the discussion abt the parent Jira and recently Stack added as a comment on the E2E patch on the parent Jira. The idea is to add a new Interface 'ByteBufferedCell' in which we can add new buffer based getter APIs and getters for position in components in BB. We will keep this interface @InterfaceAudience.Private. When the Cell is backed by a DBB, we can create an Object implementing this new interface. The Comparators has to be aware abt this new Cell extension and has to use the BB based APIs rather than getXXXArray(). Also give util APIs in CellUtil to abstract the checks for new Cell type. (Like matchingXXX APIs, getValueAstype APIs etc) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14041) Client MetaCache is cleared if a ThrottlingException is thrown
[ https://issues.apache.org/jira/browse/HBASE-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eungsop Yoo updated HBASE-14041: Attachment: 0001-Do-not-clear-MetaCache-if-a-ThrottlingException-is-t-v3.patch I made the new patch on the master branch; the previous patches were not based on master. Client MetaCache is cleared if a ThrottlingException is thrown -- Key: HBASE-14041 URL: https://issues.apache.org/jira/browse/HBASE-14041 Project: HBase Issue Type: Bug Components: Client Affects Versions: 1.1.0 Reporter: Eungsop Yoo Assignee: Eungsop Yoo Priority: Minor Attachments: 0001-Do-not-clear-MetaCache-if-a-ThrottlingException-is-t-v2.patch, 0001-Do-not-clear-MetaCache-if-a-ThrottlingException-is-t-v3.patch, 0001-Do-not-clear-MetaCache-if-a-ThrottlingException-is-t.patch During a performance test with request throttling, I saw that the hbase:meta table had been read a lot. Currently the MetaCache of the client is cleared if a ThrottlingException is thrown. This seems to be unnecessary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
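The core of the fix is a predicate: only clear a cached region location for exceptions that imply the location is stale. A throttling rejection means the server is alive and correctly located, merely refusing work, so the cache entry is still good. A minimal sketch with stand-in exception classes (not HBase's real hierarchy):

```java
// Stand-ins for the real HBase exception types, defined here so the
// sketch is self-contained.
class ThrottlingException extends RuntimeException {}
class NotServingRegionException extends RuntimeException {}

class MetaCachePolicy {
    // Clear the cached location only when the failure suggests the region
    // moved; a throttled request says nothing about where the region lives.
    static boolean shouldClearCache(Throwable t) {
        return !(t instanceof ThrottlingException);
    }
}
```

Without such a filter, every throttled request invalidates the location and forces a re-read of hbase:meta, which is exactly the meta-read storm observed in the performance test.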
[jira] [Commented] (HBASE-13965) Stochastic Load Balancer JMX Metrics
[ https://issues.apache.org/jira/browse/HBASE-13965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619554#comment-14619554 ] Lei Chen commented on HBASE-13965: -- Thanks [~clayb] for the suggestion. I have found that the stochastic load balancer holds a reference to HMaster, which can be used to get the number of tables, so the size of the map can be determined; there is no need for a configurable value. I will update the patch soon. Stochastic Load Balancer JMX Metrics Key: HBASE-13965 URL: https://issues.apache.org/jira/browse/HBASE-13965 Project: HBase Issue Type: Improvement Components: Balancer, metrics Reporter: Lei Chen Assignee: Lei Chen Attachments: HBASE-13965-v3.patch, HBASE-13965-v4.patch, HBASE-13965-v5.patch, HBASE-13965-v6.patch, HBASE-13965_v2.patch, HBase-13965-v1.patch, stochasticloadbalancerclasses_v2.png Today’s default HBase load balancer (the Stochastic load balancer) is cost function based. The cost function weights are tunable, but no visibility into those cost function results is directly provided. A driving example is a cluster we have been tuning which has skewed rack size (one rack has half the nodes of the other few racks). We are tuning the cluster for uniform response time from all region servers with the ability to tolerate a rack failure. Balancing LocalityCost, RegionReplicaRackCost and RegionCountSkewCost is difficult without a way to attribute each cost function’s contribution to the overall cost. What this JIRA proposes is to provide visibility via JMX into each cost function of the stochastic load balancer, as well as the overall cost of the balancing plan. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
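The visibility the JIRA asks for follows from how the stochastic balancer scores a plan: the overall cost is a weighted sum of individual cost functions, so publishing each function's weighted contribution (e.g. as a JMX attribute per function) lets an operator see which term dominates. A minimal sketch of that bookkeeping, with made-up function names and weights:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Records each cost function's weighted contribution so it can be
// exposed individually (e.g. via JMX) alongside the overall cost.
class BalancerCostMetrics {
    final Map<String, Double> weighted = new LinkedHashMap<>();

    void record(String costFunction, double weight, double cost) {
        weighted.put(costFunction, weight * cost);  // the per-function JMX value
    }

    // The balancer's objective: sum of all weighted contributions.
    double overallCost() {
        return weighted.values().stream().mapToDouble(Double::doubleValue).sum();
    }
}
```

On the skewed-rack cluster described above, seeing that (say) the rack-replica term contributes 50 of an overall 60 would immediately point the tuning effort at that weight, which is impossible when only the total is visible.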
[jira] [Updated] (HBASE-14042) Fix FATAL level logging in FSHLog where logged for non fatal exceptions
[ https://issues.apache.org/jira/browse/HBASE-14042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-14042: --- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: (was: 1.0.2) 1.0.3 Status: Resolved (was: Patch Available) Pushed to 0.98 and up. Thanks for the review [~stack] Fix FATAL level logging in FSHLog where logged for non fatal exceptions --- Key: HBASE-14042 URL: https://issues.apache.org/jira/browse/HBASE-14042 Project: HBase Issue Type: Bug Affects Versions: 0.98.13, 1.1.1, 1.0.1.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 2.0.0, 0.98.14, 1.2.0, 1.1.2, 1.3.0, 1.0.3 Attachments: HBASE-14042-0.98.patch, HBASE-14042.patch We have FATAL level logging in FSHLog where an IOException causes a log roll to be requested. It isn't a fatal event. Drop the log level to WARN. (Could even be INFO.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12848) Utilize Flash storage for WAL
[ https://issues.apache.org/jira/browse/HBASE-12848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619685#comment-14619685 ] Lars Hofhansl commented on HBASE-12848: --- I think it's just what I mentioned above. Allow the inode to be moved atomically in the NN, and then have the DNs lazily (and atomically per block) migrate block by block between storage classes (i.e. make a copy of the block, then delete the old copy). At any given time all data is accessible, and eventually it will be at the right storage class. Utilize Flash storage for WAL - Key: HBASE-12848 URL: https://issues.apache.org/jira/browse/HBASE-12848 Project: HBase Issue Type: Sub-task Reporter: Ted Yu Assignee: Ted Yu Fix For: 2.0.0, 1.1.0 Attachments: 12848-v1.patch, 12848-v2.patch, 12848-v3.patch, 12848-v4.patch, 12848-v4.patch One way to improve the data ingestion rate is to make use of Flash storage. HDFS is doing the heavy lifting - see HDFS-7228. We assume an environment where: 1. Some servers have a mix of flash, e.g. 2 flash drives and 4 traditional drives. 2. Some servers have all traditional storage. 3. RegionServers are deployed on both profiles within one HBase cluster. This JIRA allows WAL to be managed on flash in a mixed-profile environment. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
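Lars's copy-first, delete-second scheme can be modeled as a tiny invariant: during migration a block always has at least one readable copy, because the new storage-class copy is created before the old one is removed. A toy model (not HDFS code; block ids and class names are illustrative):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of per-block lazy migration between storage classes.
class LazyBlockMigrator {
    // For each block id, the set of storage classes currently holding a copy.
    final Map<String, Set<String>> copies = new HashMap<>();

    void addBlock(String block, String storageClass) {
        copies.computeIfAbsent(block, k -> new HashSet<>()).add(storageClass);
    }

    // Copy to the target class first, then delete the source copy,
    // so the block is never without a readable location.
    void migrate(String block, String from, String to) {
        Set<String> locations = copies.get(block);
        locations.add(to);       // step 1: new copy exists
        locations.remove(from);  // step 2: old copy removed only afterwards
    }
}
```

Between the two steps the block is readable from both classes, which is what makes the per-block migration safe to do lazily while the inode has already moved atomically in the NameNode.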
[jira] [Updated] (HBASE-14028) DistributedLogReplay drops edits when ITBLL 125M
[ https://issues.apache.org/jira/browse/HBASE-14028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-14028: -- Attachment: 14028.logging.txt Patch w/ extra logging used digging in on DLR. DistributedLogReplay drops edits when ITBLL 125M Key: HBASE-14028 URL: https://issues.apache.org/jira/browse/HBASE-14028 Project: HBase Issue Type: Bug Components: Recovery Affects Versions: 1.2.0 Reporter: stack Attachments: 14028.logging.txt Testing DLR before 1.2.0RC gets cut, we are dropping edits. Issue seems to be around replay into a deployed region that is on a server that dies before all edits have finished replaying. Logging is sparse on sequenceid accounting so can't tell for sure how it is happening (and if our now accounting by Store is messing up DLR). Digging. I notice also that DLR does not refresh its cache of region location on error -- it just keeps trying till whole WAL fails 8 retries...about 30 seconds. We could do a bit of refactor and have the replay find region in new location if moved during DLR replay. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14028) DistributedLogReplay drops edits when ITBLL 125M
[ https://issues.apache.org/jira/browse/HBASE-14028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619615#comment-14619615 ] stack commented on HBASE-14028: --- I added logging and reran. Found another failure type beyond the above described replay over a coincident flush. Highlevel, region opens, we start to replay edits but well before the replay can finish, the server hosting the newly opened region crashes. Edits in the WAL we were replaying get skipped on second attempt. Here is open before crash: 2015-07-08 12:45:38,317 DEBUG [RS_OPEN_REGION-c2023:16020-0] wal.WALSplitter: Wrote region seqId=hdfs://c2020.halxg.cloudera.com:8020/hbase/data/default/IntegrationTestBigLinkedList/467eaf13c7ce1f2e1afb1c567322c9e7/recovered.edits/760185051.seqid to file, newSeqId=760185051, maxSeqId=720162792 Here is open after crash: 2015-07-08 12:45:49,920 DEBUG [RS_OPEN_REGION-c2025:16020-1] wal.WALSplitter: Wrote region seqId=hdfs://c2020.halxg.cloudera.com:8020/hbase/data/default/IntegrationTestBigLinkedList/467eaf13c7ce1f2e1afb1c567322c9e7/recovered.edits/800185051.seqid to file, newSeqId=800185051, maxSeqId=760185051 See how newSeqId the first time around becomes the maxSeqId the second time we open. This is broke (this is the well-padded sequence id set well in advance of any edits that could come in during replay). See how on subsequent replay we end up skipping most of the edits: 2015-07-08 12:46:25,103 INFO [RS_LOG_REPLAY_OPS-c2025:16020-1] wal.WALSplitter: Processed 80 edits across 0 regions; edits skipped=1583; log file=hdfs://c2020.halxg.cloudera.com:8020/hbase/WALs/c2021.halxg.cloudera.com,16020,1436383987497-splitting/c2021.halxg.cloudera.com%2C16020%2C1436383987497.default.1436384632799, length=72993715, corrupted=false, progress failed=false (Says 80 edits for ZERO regions... ) The maximum sequence id in the WAL to replay is 720185601 even though we did not replay all edits. So, at least two issues. 
Let me put this aside since it looks like it won't make hbase-1.2.0 at this late stage. DistributedLogReplay drops edits when ITBLL 125M Key: HBASE-14028 URL: https://issues.apache.org/jira/browse/HBASE-14028 Project: HBase Issue Type: Bug Components: Recovery Affects Versions: 1.2.0 Reporter: stack Testing DLR before 1.2.0RC gets cut, we are dropping edits. Issue seems to be around replay into a deployed region that is on a server that dies before all edits have finished replaying. Logging is sparse on sequenceid accounting so can't tell for sure how it is happening (and if our now accounting by Store is messing up DLR). Digging. I notice also that DLR does not refresh its cache of region location on error -- it just keeps trying till whole WAL fails 8 retries...about 30 seconds. We could do a bit of refactor and have the replay find region in new location if moved during DLR replay. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14027) Clean up netty dependencies
[ https://issues.apache.org/jira/browse/HBASE-14027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619634#comment-14619634 ] stack commented on HBASE-14027: --- Patch looks good to me. Did you get a chance to verify works for IT? +1 Clean up netty dependencies --- Key: HBASE-14027 URL: https://issues.apache.org/jira/browse/HBASE-14027 Project: HBase Issue Type: Improvement Components: build Affects Versions: 1.0.0 Reporter: Sean Busbey Assignee: Sean Busbey Fix For: 2.0.0, 1.2.0 Attachments: HBASE-14027.1.patch, HBASE-14027.2.patch, HBASE-14027.3.patch We have multiple copies of Netty (3?) getting shipped around. clean some up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14042) Fix FATAL level logging in FSHLog where logged for non fatal exceptions
[ https://issues.apache.org/jira/browse/HBASE-14042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619624#comment-14619624 ] stack commented on HBASE-14042: --- +1 Fix FATAL level logging in FSHLog where logged for non fatal exceptions --- Key: HBASE-14042 URL: https://issues.apache.org/jira/browse/HBASE-14042 Project: HBase Issue Type: Bug Affects Versions: 0.98.13, 1.1.1, 1.0.1.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 2.0.0, 0.98.14, 1.0.2, 1.2.0, 1.1.2, 1.3.0 Attachments: HBASE-14042-0.98.patch, HBASE-14042.patch We have FATAL level logging in FSHLog where an IOException causes a log roll to be requested. It isn't a fatal event. Drop the log level to WARN. (Could even be INFO.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13743) Backport HBASE-13709 (Updates to meta table server columns may be eclipsed) to 0.98
[ https://issues.apache.org/jira/browse/HBASE-13743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619654#comment-14619654 ] Andrew Purtell commented on HBASE-13743: Thanks for reminding me this is open. Let me work on it today. Backport HBASE-13709 (Updates to meta table server columns may be eclipsed) to 0.98 --- Key: HBASE-13743 URL: https://issues.apache.org/jira/browse/HBASE-13743 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.98.14 The problem addressed with HBASE-13709 is more likely on branch-1 and later but still an issue with the 0.98 code. Backport doesn't look too difficult but nontrivial due to the number of fix ups needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13337) Table regions are not assigning back, after restarting all regionservers at once.
[ https://issues.apache.org/jira/browse/HBASE-13337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619677#comment-14619677 ] Hudson commented on HBASE-13337: SUCCESS: Integrated in HBase-1.3-IT #28 (See [https://builds.apache.org/job/HBase-1.3-IT/28/]) HBASE-13337 Table regions are not assigning back, after restarting all regionservers at once (Samir Ahmic) (stack: rev 24cf287df44d1b6d84d5f14fbf5cf254f3df3bcb) * hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/AsyncRpcClient.java Table regions are not assigning back, after restarting all regionservers at once. - Key: HBASE-13337 URL: https://issues.apache.org/jira/browse/HBASE-13337 Project: HBase Issue Type: Bug Components: Region Assignment Affects Versions: 2.0.0 Reporter: Y. SREENIVASULU REDDY Assignee: Samir Ahmic Priority: Blocker Fix For: 1.2.0, 1.1.2, 1.3.0 Attachments: HBASE-13337-v2.patch, HBASE-13337-v3.patch, HBASE-13337.patch Regions of the table are continually in state=FAILED_CLOSE. {noformat} RegionState RIT time (ms) 8f62e819b356736053e06240f7f7c6fd t1,,1427362431330.8f62e819b356736053e06240f7f7c6fd. state=FAILED_CLOSE, ts=Thu Mar 26 15:05:36 IST 2015 (113s ago), server=VM1,16040,1427362531818 113929 caf59209ae65ea80fca6bdc6996a7d68 t1,,1427362431330.caf59209ae65ea80fca6bdc6996a7d68. state=FAILED_CLOSE, ts=Thu Mar 26 15:05:36 IST 2015 (113s ago), server=VM2,16040,1427362533691 113929 db52a74988f71e5cf257bbabf31f26f3 t1,,1427362431330.db52a74988f71e5cf257bbabf31f26f3. state=FAILED_CLOSE, ts=Thu Mar 26 15:05:36 IST 2015 (113s ago), server=VM3,16040,1427362533691 113920 43f3a65b9f9ff283f598c5450feab1f8 t1,,1427362431330.43f3a65b9f9ff283f598c5450feab1f8. state=FAILED_CLOSE, ts=Thu Mar 26 15:05:36 IST 2015 (113s ago), server=VM1,16040,1427362531818 113920 {noformat} *Steps to reproduce:* 1. Start HBase cluster with more than one regionserver. 2. Create a table with precreated regions. (let's say 15 regions) 3. Make sure the regions are well balanced. 4. 
Restart all the RegionServer processes at once across the cluster, except the HMaster process. 5. After restarting, the RegionServers successfully connect to the HMaster. *Bug:* But no regions are assigned back to the RegionServers. *Master log shows as follows:* {noformat} 2015-03-26 15:05:36,201 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.RegionStates: Transition {8f62e819b356736053e06240f7f7c6fd state=OFFLINE, ts=1427362536106, server=VM2,16040,1427362242602} to {8f62e819b356736053e06240f7f7c6fd state=PENDING_OPEN, ts=1427362536201, server=VM1,16040,1427362531818} 2015-03-26 15:05:36,202 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.RegionStateStore: Updating row t1,,1427362431330.8f62e819b356736053e06240f7f7c6fd. with state=PENDING_OPENsn=VM1,16040,1427362531818 2015-03-26 15:05:36,244 DEBUG [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.AssignmentManager: Force region state offline {8f62e819b356736053e06240f7f7c6fd state=PENDING_OPEN, ts=1427362536201, server=VM1,16040,1427362531818} 2015-03-26 15:05:36,244 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.RegionStates: Transition {8f62e819b356736053e06240f7f7c6fd state=PENDING_OPEN, ts=1427362536201, server=VM1,16040,1427362531818} to {8f62e819b356736053e06240f7f7c6fd state=PENDING_CLOSE, ts=1427362536244, server=VM1,16040,1427362531818} 2015-03-26 15:05:36,244 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.RegionStateStore: Updating row t1,,1427362431330.8f62e819b356736053e06240f7f7c6fd. 
with state=PENDING_CLOSE 2015-03-26 15:05:36,248 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.AssignmentManager: Server VM1,16040,1427362531818 returned java.nio.channels.ClosedChannelException for t1,,1427362431330.8f62e819b356736053e06240f7f7c6fd., try=1 of 10 2015-03-26 15:05:36,248 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.AssignmentManager: Server VM1,16040,1427362531818 returned java.nio.channels.ClosedChannelException for t1,,1427362431330.8f62e819b356736053e06240f7f7c6fd., try=2 of 10 2015-03-26 15:05:36,249 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.AssignmentManager: Server VM1,16040,1427362531818 returned java.nio.channels.ClosedChannelException for t1,,1427362431330.8f62e819b356736053e06240f7f7c6fd., try=3 of 10
[jira] [Assigned] (HBASE-14029) getting started for standalone still references hadoop-version-specific binary artifacts
[ https://issues.apache.org/jira/browse/HBASE-14029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Liptak reassigned HBASE-14029: Assignee: Gabor Liptak getting started for standalone still references hadoop-version-specific binary artifacts Key: HBASE-14029 URL: https://issues.apache.org/jira/browse/HBASE-14029 Project: HBase Issue Type: Bug Components: documentation Affects Versions: 1.0.0 Reporter: Sean Busbey Assignee: Gabor Liptak Labels: beginner Attachments: HBASE-14029.1.patch As of HBase 1.0 we no longer have binary artifacts that are tied to a particular hadoop release. The current section of the ref guide for getting started with standalone mode still refers to them: {quote} Choose a download site from this list of Apache Download Mirrors. Click on the suggested top link. This will take you to a mirror of HBase Releases. Click on the folder named stable and then download the binary file that ends in .tar.gz to your local filesystem. Be sure to choose the version that corresponds with the version of Hadoop you are likely to use later. In most cases, you should choose the file for Hadoop 2, which will be called something like hbase-0.98.3-hadoop2-bin.tar.gz. Do not download the file ending in src.tar.gz for now. {quote} Either remove the reference or turn it into a note call-out for versions 0.98 and earlier. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14029) getting started for standalone still references hadoop-version-specific binary artifacts
[ https://issues.apache.org/jira/browse/HBASE-14029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Liptak updated HBASE-14029: - Attachment: HBASE-14029.1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14029) getting started for standalone still references hadoop-version-specific binary artifacts
[ https://issues.apache.org/jira/browse/HBASE-14029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Liptak updated HBASE-14029: - Release Note: HBASE-14029 Correct documentation for Hadoop version specific artifacts Status: Patch Available (was: Open) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13997) ScannerCallableWithReplicas cause Infinitely blocking
[ https://issues.apache.org/jira/browse/HBASE-13997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-13997: -- Attachment: hbase-13997_v2.patch Thanks [~gzh1992n] for the patch. I was writing a unit test for this, but it turns out that part was working and will not cause a client hang. The off-by-one error is definitely there, but it was not causing a problem because of a related but different issue. Some time ago (HBASE-11564), the semantics of {{ResultBoundedCompletionService}} changed from a blocking-queue kind of data structure, where you submit multiple tasks and call take() multiple times, into one where you submit multiple tasks and only take once. The completed list does not get cleaned when {{take()}} returns. HBASE-11564 made the changes in the Get code path, but not in the scan code path, it seems. For example, we are submitting 3 calls to the {{ResultBoundedCompletionService}}, but because of the off-by-one, {{submitted}} is 4. Since the completed list is not cleaned, once the first result comes in, if it is an exception, we can call {{cs.take()}} 4 times and each time it will return the same exception. This does not in fact cause a hang, but a clean-up in the code is still needed. The attached v2 patch brings the scanner code path in line with the get code path ({{RpcRetryingCallerWithReadReplicas}}). [~devaraj] do you mind taking a look?
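To make the off-by-one concrete, here is a minimal standalone sketch (the class and method names are illustrative, not the HBase API): with replica ids min..max inclusive and exactly one of them being the current replica that gets skipped, the number of calls actually submitted is max - min, not max - min + 1.

```java
// Illustrative sketch of the counting bug; not HBase code.
public class ReplicaCountSketch {
    // Mirrors the loop in addCallsForOtherReplicas: one id in [min, max]
    // is the current replica and is skipped with `continue`.
    static int submittedCalls(int min, int max, int currentReplicaId) {
        int submitted = 0;
        for (int id = min; id <= max; id++) {
            if (id == currentReplicaId) {
                continue; // already scheduled earlier, as in the real method
            }
            submitted++;
        }
        return submitted;
    }

    public static void main(String[] args) {
        // 3 replicas (ids 0..2), replica 0 already scheduled:
        // only 2 extra calls go out, but the buggy return value was 3.
        System.out.println(submittedCalls(0, 2, 0)); // prints 2
    }
}
```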
ScannerCallableWithReplicas cause Infinitely blocking - Key: HBASE-13997 URL: https://issues.apache.org/jira/browse/HBASE-13997 Project: HBase Issue Type: Bug Components: Client Affects Versions: 1.0.1.1 Reporter: Zephyr Guo Assignee: Zephyr Guo Priority: Minor Attachments: HBASE-13997.patch, hbase-13997_v2.patch Bug in ScannerCallableWithReplicas.addCallsForOtherReplicas method
{code:title=code in ScannerCallableWithReplicas.addCallsForOtherReplicas|borderStyle=solid}
private int addCallsForOtherReplicas(
    BoundedCompletionService<Pair<Result[], ScannerCallable>> cs, RegionLocations rl,
    int min, int max) {
  if (scan.getConsistency() == Consistency.STRONG) {
    return 0; // not scheduling on other replicas for strong consistency
  }
  for (int id = min; id <= max; id++) {
    if (currentScannerCallable.getHRegionInfo().getReplicaId() == id) {
      continue; // this was already scheduled earlier
    }
    ScannerCallable s = currentScannerCallable.getScannerCallableForReplica(id);
    if (this.lastResult != null) {
      s.getScan().setStartRow(this.lastResult.getRow());
    }
    outstandingCallables.add(s);
    RetryingRPC retryingOnReplica = new RetryingRPC(s);
    cs.submit(retryingOnReplica);
  }
  return max - min + 1; // bug? should be max - min, because continue always happens once
}
{code}
It can cause completed < submitted to always hold, so the following code blocks infinitely.
{code:title=code in ScannerCallableWithReplicas.call|borderStyle=solid}
// submitted larger than the actual one
submitted += addCallsForOtherReplicas(cs, rl, 0, rl.size() - 1);
try {
  // here will be affected
  while (completed < submitted) {
    try {
      Future<Pair<Result[], ScannerCallable>> f = cs.take();
      Pair<Result[], ScannerCallable> r = f.get();
      if (r != null && r.getSecond() != null) {
        updateCurrentlyServingReplica(r.getSecond(), r.getFirst(), done, pool);
      }
      return r == null ? null : r.getFirst(); // great we got an answer
    } catch (ExecutionException e) {
      // if not cancel or interrupt, wait until all RPC's are done
      // one of the tasks failed. Save the exception for later.
      if (exceptions == null) exceptions = new ArrayList<ExecutionException>(rl.size());
      exceptions.add(e);
      completed++;
    }
  }
} catch (CancellationException e) {
  throw new InterruptedIOException(e.getMessage());
} catch (InterruptedException e) {
  throw new InterruptedIOException(e.getMessage());
} finally {
  // We get there because we were interrupted or because one or more of the
  // calls succeeded or failed. In all case, we stop all our tasks.
  cs.cancelAll(true);
}
{code}
If all replica RegionServers throw ExecutionException, it will block infinitely in cs.take(). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
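The hang scenario the report describes can be sketched with the stock java.util.concurrent {{CompletionService}} (which, unlike {{ResultBoundedCompletionService}}, does drain each result on take()): if the caller over-counts {{submitted}} by one and every task fails, the loop calls take() one time too many on an empty service and blocks forever. The sketch below is illustrative, not HBase code, and uses poll() with a timeout so the would-be hang is observable instead of actually blocking.

```java
import java.util.concurrent.*;

public class TakeHangSketch {
    // Returns true if the consumer loop would have blocked forever in take(),
    // i.e. the poll timed out with the loop condition still unsatisfied.
    static boolean wouldHang(int actuallySubmitted, int overCount) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            CompletionService<Integer> cs = new ExecutorCompletionService<>(pool);
            // Every replica call fails, mirroring "all replica-RS throw".
            for (int i = 0; i < actuallySubmitted; i++) {
                cs.submit(() -> { throw new RuntimeException("replica failed"); });
            }
            int submitted = actuallySubmitted + overCount; // the over-count
            int completed = 0;
            while (completed < submitted) {
                Future<Integer> f = cs.poll(200, TimeUnit.MILLISECONDS);
                if (f == null) {
                    return true; // a plain cs.take() would block here forever
                }
                try {
                    f.get();
                } catch (ExecutionException e) {
                    completed++; // save-and-count path from the report
                }
            }
            return false;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(wouldHang(2, 1)); // over-counted by one: would hang
        System.out.println(wouldHang(2, 0)); // correct count: loop terminates
    }
}
```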