[jira] [Commented] (HBASE-5291) Add Kerberos HTTP SPNEGO authentication support to HBase web consoles
[ https://issues.apache.org/jira/browse/HBASE-5291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16182603#comment-16182603 ] Lars George commented on HBASE-5291: [~mantonov] It looks like this was also committed to 1.3. Should we update the JIRA's fix versions? > Add Kerberos HTTP SPNEGO authentication support to HBase web consoles > - > > Key: HBASE-5291 > URL: https://issues.apache.org/jira/browse/HBASE-5291 > Project: HBase > Issue Type: Improvement > Components: master, regionserver, security >Reporter: Andrew Purtell >Assignee: Josh Elser > Fix For: 2.0.0, 1.4.0 > > Attachments: 5291-addendum.2, HBASE-5291.001.patch, > HBASE-5291.002.patch, HBASE-5291.003.patch, HBASE-5291.004.patch, > HBASE-5291.005-0.98.patch, HBASE-5291.005-branch-1.patch, > HBASE-5291.005.patch, HBASE-5291-addendum.patch > > > Like HADOOP-7119, the same motivations: > {quote} > Hadoop RPC already supports Kerberos authentication. > {quote} > As does the HBase secure RPC engine. > {quote} > Kerberos enables single sign-on. > Popular browsers (Firefox and Internet Explorer) have support for Kerberos > HTTP SPNEGO. > Adding support for Kerberos HTTP SPNEGO to [HBase] web consoles would provide > a unified authentication mechanism and single sign-on for web UI and RPC. > {quote} > Also like HADOOP-7119, the same solution: > A servlet filter is configured in front of all Hadoop web consoles for > authentication. > This filter verifies whether the incoming request is already authenticated by the > presence of a signed HTTP cookie. If the cookie is present, its signature is > valid, and its value has not expired, the request continues on to the > page it invoked. If the cookie is not present, invalid, or > expired, the request is delegated to an authenticator handler. The > authenticator handler is then responsible for requesting and validating the > user credentials from the user-agent. 
This may require one or more additional > interactions between the authenticator handler and the user-agent (which will > be multiple HTTP requests). Once the authenticator handler verifies the > credentials and generates an authentication token, a signed cookie is > returned to the user-agent for all subsequent invocations. > The authenticator handler is pluggable and two implementations are provided out > of the box: pseudo/simple and kerberos. > 1. The pseudo/simple authenticator handler is equivalent to the Hadoop > pseudo/simple authentication. It trusts the value of the user.name query > string parameter. The pseudo/simple authenticator handler supports an > anonymous mode which accepts any request without requiring the user.name > query string parameter to create the token. This is the default behavior, > preserving the behavior of the HBase web consoles before this patch. > 2. The kerberos authenticator handler implements Kerberos HTTP SPNEGO. > This authenticator handler will generate a token only if a > successful Kerberos HTTP SPNEGO interaction is performed between the > user-agent and the authenticator. Browsers like Firefox and Internet Explorer > support Kerberos HTTP SPNEGO. > We can build on the support added to Hadoop via HADOOP-7119. It should just be a > matter of wiring up the filter to our infoservers in a similar manner. > And from > https://issues.apache.org/jira/browse/HBASE-5050?focusedCommentId=13171086&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13171086 > {quote} > Hadoop 0.23 onwards has a hadoop-auth artifact that provides SPNEGO/Kerberos > authentication for webapps via a filter. You should consider using it. You > don't have to move Hbase to 0.23 for that, just consume the hadoop-auth > artifact, which has no dependencies on the rest of Hadoop 0.23 artifacts. > {quote} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
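The cookie fast path described above (present, correctly signed, unexpired: pass through; otherwise delegate to an authenticator handler) can be sketched self-contained. The class name and cookie layout below are illustrative assumptions, not hadoop-auth's actual AuthenticationFilter; only the HMAC-signed-cookie idea is taken from the text:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch of the filter's cookie fast path: a request is let
// through only if it carries a cookie whose HMAC signature is valid and
// whose expiry has not passed; otherwise it would be handed off to the
// authenticator handler (SPNEGO or pseudo/simple).
public class AuthCookieCheck {
  private final byte[] secret;

  public AuthCookieCheck(byte[] secret) { this.secret = secret; }

  /** Issues a cookie of the form user&expiresMillis&signature. */
  public String sign(String user, long expiresMillis) throws Exception {
    String payload = user + "&" + expiresMillis;
    return payload + "&" + hmac(payload);
  }

  /** Returns true only if the cookie is present, correctly signed, and unexpired. */
  public boolean isAuthenticated(String cookie, long nowMillis) throws Exception {
    if (cookie == null) return false;                       // no cookie at all
    int sep = cookie.lastIndexOf('&');
    if (sep < 0) return false;                              // malformed
    String payload = cookie.substring(0, sep);
    if (!hmac(payload).equals(cookie.substring(sep + 1))) {
      return false;                                         // bad signature
    }
    long expires = Long.parseLong(payload.substring(payload.indexOf('&') + 1));
    return nowMillis < expires;                             // reject expired cookies
  }

  private String hmac(String data) throws Exception {
    Mac mac = Mac.getInstance("HmacSHA256");
    mac.init(new SecretKeySpec(secret, "HmacSHA256"));
    return Base64.getEncoder().encodeToString(
        mac.doFinal(data.getBytes(StandardCharsets.UTF_8)));
  }
}
```

A request failing any of the three checks would then fall through to the pluggable authenticator handler, exactly as the description lays out.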
[jira] [Commented] (HBASE-13333) Renew Scanner Lease without advancing the RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16159855#comment-16159855 ] Lars George commented on HBASE-13333: - Ah! Thank you [~chia7712], that is hella confusing but I guess correct then? Sorry for the noise and thanks again for clarifying. > Renew Scanner Lease without advancing the RegionScanner > --- > > Key: HBASE-13333 > URL: https://issues.apache.org/jira/browse/HBASE-13333 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: Lars Hofhansl > Fix For: 2.0.0, 0.98.13, 1.0.2, 1.2.0, 1.1.1 > > Attachments: 13333-0.98.txt, 13333-master.txt > > > We have a usecase (for Phoenix) where we want to let the server know that the > client is still around. Like a client-side heartbeat. > Doing a full heartbeat is complicated, but we could add the ability to make a > scanner call with caching set to 0. The server already does the right thing > (it renews the lease, but does not advance the scanner). > It looks like the client (ScannerCallable) also does the right thing. We > cannot break ResultScanner before HBase 2.0, but we can add a renewLease() > method to AbstractClientScanner. Phoenix (or any other caller) can then cast > to ClientScanner and call that method to ensure we renew the lease on the > server. > It would be a simple and fully backwards compatible change. [~giacomotaylor] > Comments? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-13333) Renew Scanner Lease without advancing the RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16159781#comment-16159781 ] Lars George commented on HBASE-13333: - [~chia7712] yes indeed, but it is missing now? See: https://github.com/apache/hbase/blob/branch-1.3/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ResultScanner.java The patch for HBASE-13333 includes: {code} diff --git a/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ResultScanner.java b/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ResultScanner.java index 381505c..6b7f1dd 100644 --- a/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ResultScanner.java +++ b/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ResultScanner.java @@ -52,4 +52,10 @@ public interface ResultScanner extends Closeable, Iterable<Result> { */ @Override void close(); + + /** + * Allow the client to renew the scanner's lease on the server. + * @return true if the lease was successfully renewed, false otherwise. + */ + boolean renewLease(); } {code} I think I am just dull here, so could someone please point out my mistake? > Renew Scanner Lease without advancing the RegionScanner > --- > > Key: HBASE-13333 > URL: https://issues.apache.org/jira/browse/HBASE-13333 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: Lars Hofhansl > Fix For: 2.0.0, 0.98.13, 1.0.2, 1.2.0, 1.1.1 > > Attachments: 13333-0.98.txt, 13333-master.txt > > > We have a usecase (for Phoenix) where we want to let the server know that the > client is still around. Like a client-side heartbeat. > Doing a full heartbeat is complicated, but we could add the ability to make a > scanner call with caching set to 0. The server already does the right thing > (it renews the lease, but does not advance the scanner). > It looks like the client (ScannerCallable) also does the right thing. We > cannot break ResultScanner before HBase 2.0, but we can add a renewLease() > method to AbstractClientScanner. 
Phoenix (or any other caller) can then cast > to ClientScanner and call that method to ensure we renew the lease on the > server. > It would be a simple and fully backwards compatible change. [~giacomotaylor] > Comments? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
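The client-side heartbeat pattern the description proposes (cast the scanner and renew the lease without fetching rows) can be sketched as below. The ResultScanner/AbstractClientScanner types here are simplified stand-ins so the example is self-contained, not the real org.apache.hadoop.hbase.client classes:

```java
// Minimal stand-ins for the HBase client types, so the calling pattern can be
// shown self-contained; in real code these come from org.apache.hadoop.hbase.client.
interface ResultScanner extends AutoCloseable {
  void close();
}

abstract class AbstractClientScanner implements ResultScanner {
  /** Renews the scanner lease on the server without advancing the scanner. */
  public abstract boolean renewLease();
}

/** Stub that records renewals instead of talking to a region server. */
class StubScanner extends AbstractClientScanner {
  int renewals = 0;
  @Override public boolean renewLease() { renewals++; return true; }
  @Override public void close() {}
}

public class RenewLeaseExample {
  /**
   * The heartbeat pattern from the issue: a long-running caller (e.g. Phoenix)
   * periodically renews the lease so the server does not expire the scanner
   * while no rows are being fetched.
   */
  static boolean heartbeat(ResultScanner scanner) {
    if (scanner instanceof AbstractClientScanner) {
      return ((AbstractClientScanner) scanner).renewLease();
    }
    return false; // this scanner implementation does not support lease renewal
  }
}
```

Server-side, this corresponds to a scanner call with caching set to 0: the lease is renewed but the RegionScanner does not advance.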
[jira] [Commented] (HBASE-13312) SmallScannerCallable does not increment scan metrics
[ https://issues.apache.org/jira/browse/HBASE-13312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16158401#comment-16158401 ] Lars George commented on HBASE-13312: - Was that not committed to branch-1.3? > SmallScannerCallable does not increment scan metrics > > > Key: HBASE-13312 > URL: https://issues.apache.org/jira/browse/HBASE-13312 > Project: HBase > Issue Type: Bug > Components: Client, Scanners >Affects Versions: 1.0.0 >Reporter: Lars George >Assignee: Andrew Purtell > Fix For: 2.0.0, 0.98.13, 1.0.2, 1.2.0, 1.1.1 > > Attachments: HBASE-13312-0.98.patch, HBASE-13312.patch > > > The subclass of {{ScannerCallable}}, called {{SmallScannerCallable}}, seems to > miss calling any of the increment methods of the {{ScanMetrics}}. The > superclass does so, but the super methods are not invoked. It emits the > metrics dutifully at the end of {{next()}}, but there are no useful numbers > in it. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
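The shape of this bug and its fix can be sketched with simplified stand-ins (class and counter names below are illustrative, not the real HBase client code): a subclass that replaces the superclass's call path wholesale must still invoke the same metric-increment methods, or the metrics object emitted at the end of {{next()}} stays at zero.

```java
public class MetricsCallableSketch {
  /** Stand-in for the ScanMetrics holder emitted at the end of next(). */
  static class ScanMetrics { long rpcCalls = 0; }

  /** Base callable: increments scan metrics around each RPC it makes. */
  static class ScannerCallable {
    final ScanMetrics metrics;
    ScannerCallable(ScanMetrics m) { this.metrics = m; }
    void incRPCcallsMetrics() { metrics.rpcCalls++; }
    Object call() { incRPCcallsMetrics(); /* ...do the RPC... */ return null; }
  }

  /**
   * The small-scanner subclass overrode call() with its own RPC path and
   * never invoked the superclass metric updates, so counters stayed at zero.
   * The fix is for the override to call the same increment methods.
   */
  static class SmallScannerCallable extends ScannerCallable {
    SmallScannerCallable(ScanMetrics m) { super(m); }
    @Override Object call() {
      incRPCcallsMetrics(); // the increment the original code was missing
      /* ...do the small-scan RPC... */
      return null;
    }
  }
}
```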
[jira] [Commented] (HBASE-13333) Renew Scanner Lease without advancing the RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16158344#comment-16158344 ] Lars George commented on HBASE-13333: - [~mantonov] [~lhofhansl] This was not committed to branch-1.3, it seems; was that an oversight? All other versions of HBase 1.x have it, as does 2.0. It should be in 1.3 too, no? > Renew Scanner Lease without advancing the RegionScanner > --- > > Key: HBASE-13333 > URL: https://issues.apache.org/jira/browse/HBASE-13333 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: Lars Hofhansl > Fix For: 2.0.0, 0.98.13, 1.0.2, 1.2.0, 1.1.1 > > Attachments: 13333-0.98.txt, 13333-master.txt > > > We have a usecase (for Phoenix) where we want to let the server know that the > client is still around. Like a client-side heartbeat. > Doing a full heartbeat is complicated, but we could add the ability to make a > scanner call with caching set to 0. The server already does the right thing > (it renews the lease, but does not advance the scanner). > It looks like the client (ScannerCallable) also does the right thing. We > cannot break ResultScanner before HBase 2.0, but we can add a renewLease() > method to AbstractClientScanner. Phoenix (or any other caller) can then cast > to ClientScanner and call that method to ensure we renew the lease on the > server. > It would be a simple and fully backwards compatible change. [~giacomotaylor] > Comments? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-13312) SmallScannerCallable does not increment scan metrics
[ https://issues.apache.org/jira/browse/HBASE-13312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16158282#comment-16158282 ] Lars George commented on HBASE-13312: - [~apurtell] When running my book example (see https://github.com/larsgeorge/hbase-book/blob/master/ch03/src/main/java/client/ScanCacheBatchExample.java, just run it against a locally started HBase instance) against 1.3.1 today, I noticed the small scanner still reports zero RPCs, but it should say one, no? Am I missing something? Do we need to revisit? > SmallScannerCallable does not increment scan metrics > > > Key: HBASE-13312 > URL: https://issues.apache.org/jira/browse/HBASE-13312 > Project: HBase > Issue Type: Bug > Components: Client, Scanners >Affects Versions: 1.0.0 >Reporter: Lars George >Assignee: Andrew Purtell > Fix For: 2.0.0, 0.98.13, 1.0.2, 1.2.0, 1.1.1 > > Attachments: HBASE-13312-0.98.patch, HBASE-13312.patch > > > The subclass of {{ScannerCallable}}, called {{SmallScannerCallable}}, seems to > miss calling any of the increment methods of the {{ScanMetrics}}. The > superclass does so, but the super methods are not invoked. It emits the > metrics dutifully at the end of {{next()}}, but there are no useful numbers > in it. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-17584) Expose ScanMetrics with ResultScanner rather than Scan
[ https://issues.apache.org/jira/browse/HBASE-17584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16158238#comment-16158238 ] Lars George commented on HBASE-17584: - [~Apache9] If this is pushed to branch-1, could you please update the fix versions in this JIRA? > Expose ScanMetrics with ResultScanner rather than Scan > -- > > Key: HBASE-17584 > URL: https://issues.apache.org/jira/browse/HBASE-17584 > Project: HBase > Issue Type: Sub-task > Components: Client, mapreduce, scan >Affects Versions: 2.0.0 >Reporter: Duo Zhang >Assignee: Duo Zhang > Fix For: 2.0.0 > > Attachments: HBASE-17584-branch-1.patch, HBASE-17584.patch, > HBASE-17584-v1.patch > > > I think this has been discussed many times... It is a bad practice to > directly modify the Scan object passed in when calling getScanner. The reason > we cannot use a copy is that we need the Scan object to expose scan > metrics. So we need to find another way to expose the metrics. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-18526) FIFOCompactionPolicy pre-check uses wrong scope
[ https://issues.apache.org/jira/browse/HBASE-18526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16124565#comment-16124565 ] Lars George commented on HBASE-18526: - lgtm, +1, thanks Vlad > FIFOCompactionPolicy pre-check uses wrong scope > --- > > Key: HBASE-18526 > URL: https://issues.apache.org/jira/browse/HBASE-18526 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 1.3.1 >Reporter: Lars George >Assignee: Vladimir Rodionov > Attachments: HBASE-18526-v1.patch > > > See https://issues.apache.org/jira/browse/HBASE-14468 > It adds this check to {{HMaster.checkCompactionPolicy()}}: > {code} > // 1. Check TTL > if (hcd.getTimeToLive() == HColumnDescriptor.DEFAULT_TTL) { > message = "Default TTL is not supported for FIFO compaction"; > throw new IOException(message); > } > // 2. Check min versions > if (hcd.getMinVersions() > 0) { > message = "MIN_VERSION > 0 is not supported for FIFO compaction"; > throw new IOException(message); > } > // 3. blocking file count > String sbfc = htd.getConfigurationValue(HStore.BLOCKING_STOREFILES_KEY); > if (sbfc != null) { > blockingFileCount = Integer.parseInt(sbfc); > } > if (blockingFileCount < 1000) { > message = > "blocking file count '" + HStore.BLOCKING_STOREFILES_KEY + "' " > + blockingFileCount > + " is below recommended minimum of 1000"; > throw new IOException(message); > } > {code} > Why does it only check the blocking file count on the HTD level, while > others are checked on the HCD level? 
Doing this for example fails > because of it: > {noformat} > hbase(main):008:0> create 'ttltable', { NAME => 'cf1', TTL => 300, > CONFIGURATION => { 'hbase.hstore.defaultengine.compactionpolicy.class' > => 'org.apache.hadoop.hbase.regionserver.compactions.FIFOCompactionPolicy', > 'hbase.hstore.blockingStoreFiles' => 2000 } } > ERROR: org.apache.hadoop.hbase.DoNotRetryIOException: blocking file > count 'hbase.hstore.blockingStoreFiles' 10 is below recommended > minimum of 1000 Set hbase.table.sanity.checks to false at conf or > table descriptor if you want to bypass sanity checks > at > org.apache.hadoop.hbase.master.HMaster.warnOrThrowExceptionForFailure(HMaster.java:1782) > at > org.apache.hadoop.hbase.master.HMaster.sanityCheckTableDescriptor(HMaster.java:1663) > at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1545) > at > org.apache.hadoop.hbase.master.MasterRpcServices.createTable(MasterRpcServices.java:469) > at > org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:58549) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2339) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123) > at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188) > at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168) > Caused by: java.io.IOException: blocking file count > 'hbase.hstore.blockingStoreFiles' 10 is below recommended minimum of > 1000 > at > org.apache.hadoop.hbase.master.HMaster.checkCompactionPolicy(HMaster.java:1773) > at > org.apache.hadoop.hbase.master.HMaster.sanityCheckTableDescriptor(HMaster.java:1661) > ... 7 more > {noformat} > The check should be performed on the column family level instead. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-18526) FIFOCompactionPolicy pre-check uses wrong scope
[ https://issues.apache.org/jira/browse/HBASE-18526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16115709#comment-16115709 ] Lars George commented on HBASE-18526: - Pinging [~vrodionov] as per the mailing list... > FIFOCompactionPolicy pre-check uses wrong scope > --- > > Key: HBASE-18526 > URL: https://issues.apache.org/jira/browse/HBASE-18526 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 1.3.1 >Reporter: Lars George > > See https://issues.apache.org/jira/browse/HBASE-14468 > It adds this check to {{HMaster.checkCompactionPolicy()}}: > {code} > // 1. Check TTL > if (hcd.getTimeToLive() == HColumnDescriptor.DEFAULT_TTL) { > message = "Default TTL is not supported for FIFO compaction"; > throw new IOException(message); > } > // 2. Check min versions > if (hcd.getMinVersions() > 0) { > message = "MIN_VERSION > 0 is not supported for FIFO compaction"; > throw new IOException(message); > } > // 3. blocking file count > String sbfc = htd.getConfigurationValue(HStore.BLOCKING_STOREFILES_KEY); > if (sbfc != null) { > blockingFileCount = Integer.parseInt(sbfc); > } > if (blockingFileCount < 1000) { > message = > "blocking file count '" + HStore.BLOCKING_STOREFILES_KEY + "' " > + blockingFileCount > + " is below recommended minimum of 1000"; > throw new IOException(message); > } > {code} > Why does it only check the blocking file count on the HTD level, while > others are checked on the HCD level? 
Doing this for example fails > because of it: > {noformat} > hbase(main):008:0> create 'ttltable', { NAME => 'cf1', TTL => 300, > CONFIGURATION => { 'hbase.hstore.defaultengine.compactionpolicy.class' > => 'org.apache.hadoop.hbase.regionserver.compactions.FIFOCompactionPolicy', > 'hbase.hstore.blockingStoreFiles' => 2000 } } > ERROR: org.apache.hadoop.hbase.DoNotRetryIOException: blocking file > count 'hbase.hstore.blockingStoreFiles' 10 is below recommended > minimum of 1000 Set hbase.table.sanity.checks to false at conf or > table descriptor if you want to bypass sanity checks > at > org.apache.hadoop.hbase.master.HMaster.warnOrThrowExceptionForFailure(HMaster.java:1782) > at > org.apache.hadoop.hbase.master.HMaster.sanityCheckTableDescriptor(HMaster.java:1663) > at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1545) > at > org.apache.hadoop.hbase.master.MasterRpcServices.createTable(MasterRpcServices.java:469) > at > org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:58549) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2339) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123) > at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188) > at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168) > Caused by: java.io.IOException: blocking file count > 'hbase.hstore.blockingStoreFiles' 10 is below recommended minimum of > 1000 > at > org.apache.hadoop.hbase.master.HMaster.checkCompactionPolicy(HMaster.java:1773) > at > org.apache.hadoop.hbase.master.HMaster.sanityCheckTableDescriptor(HMaster.java:1661) > ... 7 more > {noformat} > The check should be performed on the column family level instead. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HBASE-18526) FIFOCompactionPolicy pre-check uses wrong scope
Lars George created HBASE-18526: --- Summary: FIFOCompactionPolicy pre-check uses wrong scope Key: HBASE-18526 URL: https://issues.apache.org/jira/browse/HBASE-18526 Project: HBase Issue Type: Bug Components: master Affects Versions: 1.3.1 Reporter: Lars George See https://issues.apache.org/jira/browse/HBASE-14468 It adds this check to {{HMaster.checkCompactionPolicy()}}: {code} // 1. Check TTL if (hcd.getTimeToLive() == HColumnDescriptor.DEFAULT_TTL) { message = "Default TTL is not supported for FIFO compaction"; throw new IOException(message); } // 2. Check min versions if (hcd.getMinVersions() > 0) { message = "MIN_VERSION > 0 is not supported for FIFO compaction"; throw new IOException(message); } // 3. blocking file count String sbfc = htd.getConfigurationValue(HStore.BLOCKING_STOREFILES_KEY); if (sbfc != null) { blockingFileCount = Integer.parseInt(sbfc); } if (blockingFileCount < 1000) { message = "blocking file count '" + HStore.BLOCKING_STOREFILES_KEY + "' " + blockingFileCount + " is below recommended minimum of 1000"; throw new IOException(message); } {code} Why does it only check the blocking file count on the HTD level, while others are checked on the HCD level? 
Doing this for example fails because of it: {noformat} hbase(main):008:0> create 'ttltable', { NAME => 'cf1', TTL => 300, CONFIGURATION => { 'hbase.hstore.defaultengine.compactionpolicy.class' => 'org.apache.hadoop.hbase.regionserver.compactions.FIFOCompactionPolicy', 'hbase.hstore.blockingStoreFiles' => 2000 } } ERROR: org.apache.hadoop.hbase.DoNotRetryIOException: blocking file count 'hbase.hstore.blockingStoreFiles' 10 is below recommended minimum of 1000 Set hbase.table.sanity.checks to false at conf or table descriptor if you want to bypass sanity checks at org.apache.hadoop.hbase.master.HMaster.warnOrThrowExceptionForFailure(HMaster.java:1782) at org.apache.hadoop.hbase.master.HMaster.sanityCheckTableDescriptor(HMaster.java:1663) at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1545) at org.apache.hadoop.hbase.master.MasterRpcServices.createTable(MasterRpcServices.java:469) at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:58549) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2339) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123) at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188) at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168) Caused by: java.io.IOException: blocking file count 'hbase.hstore.blockingStoreFiles' 10 is below recommended minimum of 1000 at org.apache.hadoop.hbase.master.HMaster.checkCompactionPolicy(HMaster.java:1773) at org.apache.hadoop.hbase.master.HMaster.sanityCheckTableDescriptor(HMaster.java:1661) ... 7 more {noformat} The check should be performed on the column family level instead. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
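The fix the issue asks for, resolving the blocking file count at the column-family scope before falling back to the table scope, can be sketched as below. Plain maps stand in for HColumnDescriptor/HTableDescriptor configuration; this is an illustration of the suggested lookup order, not the committed patch:

```java
import java.util.Map;

public class FifoPreCheck {
  static final String BLOCKING_STOREFILES_KEY = "hbase.hstore.blockingStoreFiles";

  /**
   * Resolve the blocking-file-count setting at the column-family scope first,
   * then the table scope, then the default, mirroring how the TTL and
   * min-versions checks already operate per family. The maps stand in for
   * HCD/HTD configuration values.
   */
  static int resolveBlockingFileCount(Map<String, String> familyConf,
                                      Map<String, String> tableConf,
                                      int defaultCount) {
    String v = familyConf.get(BLOCKING_STOREFILES_KEY);        // CF-level wins
    if (v == null) v = tableConf.get(BLOCKING_STOREFILES_KEY); // table fallback
    return v != null ? Integer.parseInt(v) : defaultCount;
  }

  /** Same sanity threshold the pre-check enforces. */
  static void check(int blockingFileCount) {
    if (blockingFileCount < 1000) {
      throw new IllegalStateException("blocking file count '"
          + BLOCKING_STOREFILES_KEY + "' " + blockingFileCount
          + " is below recommended minimum of 1000");
    }
  }
}
```

With this lookup order, the shell example above (CONFIGURATION set on the column family) would see 2000 instead of the default 10 and pass the sanity check.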
[jira] [Created] (HBASE-18473) VC.listLabels() erroneously closes any connection
Lars George created HBASE-18473: --- Summary: VC.listLabels() erroneously closes any connection Key: HBASE-18473 URL: https://issues.apache.org/jira/browse/HBASE-18473 Project: HBase Issue Type: Bug Components: Client Affects Versions: 1.1.11, 1.2.6, 1.3.1 Reporter: Lars George In HBASE-13358 the {{VisibilityClient.listLabels()}} was amended to take in a connection from the caller, which totally makes sense. But the patch forgot to remove the unconditional call to {{connection.close()}} in the {{finally}} block: {code} finally { if (table != null) { table.close(); } if (connection != null) { connection.close(); } } {code} Remove the second {{if}} completely. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
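The ownership rule behind the requested fix can be shown with a small self-contained sketch (the Resource class is a stand-in; the real code deals with HBase Table and Connection objects): close what the method opened, never what the caller passed in.

```java
// Illustrative stand-in: the point is resource ownership, not the real Table API.
class Resource implements AutoCloseable {
  boolean closed = false;
  @Override public void close() { closed = true; }
}

public class ListLabelsCleanup {
  /**
   * Corrected cleanup shape for listLabels(): the method closes the table it
   * opened itself, but must NOT close the connection the caller passed in --
   * the caller owns it and may still be using it.
   */
  static void useTable(Resource connection, Resource table) {
    try {
      // ... issue the RPC through the table here ...
    } finally {
      if (table != null) {
        table.close(); // owned by this method
      }
      // connection intentionally left open: it belongs to the caller
    }
  }
}
```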
[jira] [Created] (HBASE-18388) Fix description on region page, explaining what a region name is made of
Lars George created HBASE-18388: --- Summary: Fix description on region page, explaining what a region name is made of Key: HBASE-18388 URL: https://issues.apache.org/jira/browse/HBASE-18388 Project: HBase Issue Type: Improvement Components: master, regionserver, UI Affects Versions: 2.0.0-alpha-1, 1.3.1 Reporter: Lars George Priority: Minor In the {{RegionListTmpl.jamon}} we have this: {code} Region names are made of the containing table's name, a comma, the start key, a comma, and a randomly generated region id. To illustrate, the region named domains,apache.org,5464829424211263407 is party to the table domains, has an id of 5464829424211263407 and the first key in the region is apache.org.The hbase:meta 'table' is an internal system table (or a 'catalog' table in db-speak). The hbase:meta table keeps a list of all regions in the system. The empty key is used to denote table start and table end. A region with an empty start key is the first region in a table. If a region has both an empty start key and an empty end key, it's the only region in the table. See <a href="http://hbase.org">HBase Home</a> for further explication. {code} This is wrong and worded oddly. What needs to be fixed facts wise is: - Region names contain (separated by commas) the full table name (including the namespace), the start key, the time the region was created, and finally a dot with an MD5 hash of everything before the dot. For example: {{test,,1499410125885.1544f69aeaf787755caa11d3567a9621.}} - The trailing dot is to distinguish legacy region names (like those used by the {{hbase:meta}} table) - The MD5 hash is used as the directory name within the HBase storage directories - The names for the meta table use a Jenkins hash instead, also leaving out the trailing dot, for example {{hbase:meta,,1.1588230740}}. The time is always set to {{1}}. 
- The start key is printed in safe characters, escaping unprintable characters - The link to the HBase home page to explain more is useless and should be removed. - Also, for region replicas, the replica ID is inserted into the name, like so {{replicatable,,1486289678486_0001.3e8b7655299b21b3038ff8d39062467f.}}, see the {{_0001}} part. As for the wording, I would just make this all flow a little better, that "is party of" sounds weird to me (IMHO). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
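The corrected naming scheme from the list above (table, start key, creation time, then a dot-delimited MD5 hex of the preceding bytes) can be sketched as follows. This is an illustration of the described format only, assuming UTF-8 name bytes; the real logic lives in HRegionInfo/RegionInfo and is not reproduced here:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class RegionNameSketch {
  /**
   * Builds a region name shaped like the corrected description:
   * table,startKey,creationTime.<md5-of-prefix>.
   * The MD5 hex of everything before the first dot serves as the encoded
   * region name (and on-disk directory name); the trailing dot marks the
   * new-style format, distinguishing it from legacy names.
   */
  static String regionName(String table, String startKey, long creationTime)
      throws NoSuchAlgorithmException {
    String prefix = table + "," + startKey + "," + creationTime;
    MessageDigest md5 = MessageDigest.getInstance("MD5");
    byte[] digest = md5.digest(prefix.getBytes(StandardCharsets.UTF_8));
    StringBuilder hex = new StringBuilder();
    for (byte b : digest) {
      hex.append(String.format("%02x", b)); // MD5 -> 32 lowercase hex chars
    }
    return prefix + "." + hex + ".";
  }
}
```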
[jira] [Created] (HBASE-18387) [Thrift] Make principal configurable in DemoClient.java
Lars George created HBASE-18387: --- Summary: [Thrift] Make principal configurable in DemoClient.java Key: HBASE-18387 URL: https://issues.apache.org/jira/browse/HBASE-18387 Project: HBase Issue Type: Improvement Reporter: Lars George Priority: Minor In the Thrift1 demo client we have this code: {code} transport = new TSaslClientTransport("GSSAPI", null, "hbase", // Thrift server user name, should be an authorized proxy user. host, // Thrift server domain saslProperties, null, transport); {code} This will only work when the Thrift server is started with the {{hbase}} principal. Often this may deviate, for example I am using {{hbase-thrift}} to separate the names from those of backend servers. What we need is either an additional command line option to specify the name, or a property that can be set with -D and can be passed at runtime. I prefer the former, as the latter is making this a little convoluted. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
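One way the preferred command line option could look is sketched below; the {{--principal}} flag name is a hypothetical choice, and the demo client would pass the result as the service name to TSaslClientTransport instead of hard-coding "hbase":

```java
public class PrincipalOption {
  /**
   * Sketch of the suggested option: scan the args for "--principal <name>"
   * and fall back to the historical default "hbase". The flag name is an
   * assumption, not something the demo client currently supports.
   */
  static String serverPrincipal(String[] args) {
    for (int i = 0; i < args.length - 1; i++) {
      if ("--principal".equals(args[i])) {
        return args[i + 1];
      }
    }
    return "hbase"; // current hard-coded default
  }
}
```

A deployment using a separate principal would then invoke the client with, e.g., {{--principal hbase-thrift}}.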
[jira] [Updated] (HBASE-18382) [Thrift] Add transport type info to info server
[ https://issues.apache.org/jira/browse/HBASE-18382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars George updated HBASE-18382: Description: It would be really helpful to know if the Thrift server was started using the HTTP or binary transport. Any additional info, like QOP settings for SASL etc. would be great too. Right now the UI is very limited and shows {{true/false}} for, for example, {{Compact Transport}}. I'd suggest changing this to show something more useful, like this: {noformat} Thrift Impl Type: non-blocking Protocol: Binary Transport: Framed QOP: Authentication & Confidential {noformat} or {noformat} Protocol: Binary + HTTP Transport: Standard QOP: none {noformat} was: It would be really helpful to know if the Thrift server was started using the HTTP or binary transport. Any additional info, like QOP settings for SASL etc. would be great too. Right now the UI is very limited and shows {{true/false}} for, for example, {{Compact Transport}}. I'd suggest changing this to show something more useful, like this: {noformat} Thrift Impl Type: non-blocking Protocol: Binary Transport: Framed QOP: Authentication & Confidential {noformat} or {noformat} Protocol: Binary + HTTP Transport: Standard QOP: none {noformat}} > [Thrift] Add transport type info to info server > > > Key: HBASE-18382 > URL: https://issues.apache.org/jira/browse/HBASE-18382 > Project: HBase > Issue Type: Improvement > Components: Thrift >Reporter: Lars George >Priority: Minor > Labels: beginner > > It would be really helpful to know if the Thrift server was started using the > HTTP or binary transport. Any additional info, like QOP settings for SASL > etc. would be great too. Right now the UI is very limited and shows > {{true/false}} for, for example, {{Compact Transport}}. 
I'd suggest > changing this to show something more useful, like this: > {noformat} > Thrift Impl Type: non-blocking > Protocol: Binary > Transport: Framed > QOP: Authentication & Confidential > {noformat} > or > {noformat} > Protocol: Binary + HTTP > Transport: Standard > QOP: none > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HBASE-18382) [Thrift] Add transport type info to info server
Lars George created HBASE-18382: --- Summary: [Thrift] Add transport type info to info server Key: HBASE-18382 URL: https://issues.apache.org/jira/browse/HBASE-18382 Project: HBase Issue Type: Improvement Components: Thrift Reporter: Lars George Priority: Minor It would be really helpful to know if the Thrift server was started using the HTTP or binary transport. Any additional info, like QOP settings for SASL etc. would be great too. Right now the UI is very limited and shows {{true/false}} for, for example, {{Compact Transport}}. I'd suggest changing this to show something more useful, like this: {noformat} Thrift Impl Type: non-blocking Protocol: Binary Transport: Framed QOP: Authentication & Confidential {noformat}} or {noformat} Protocol: Binary + HTTP Transport: Standard QOP: none {noformat}} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HBASE-18382) [Thrift] Add transport type info to info server
[ https://issues.apache.org/jira/browse/HBASE-18382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars George updated HBASE-18382: Description: It would be really helpful to know if the Thrift server was started using the HTTP or binary transport. Any additional info, like QOP settings for SASL etc. would be great too. Right now the UI is very limited and shows {{true/false}} for, for example, {{Compact Transport}}. I'd suggest changing this to show something more useful, like this: {noformat} Thrift Impl Type: non-blocking Protocol: Binary Transport: Framed QOP: Authentication & Confidential {noformat} or {noformat} Protocol: Binary + HTTP Transport: Standard QOP: none {noformat}} was: It would be really helpful to know if the Thrift server was started using the HTTP or binary transport. Any additional info, like QOP settings for SASL etc. would be great too. Right now the UI is very limited and shows {{true/false}} for, for example, {{Compact Transport}}. I'd suggest changing this to show something more useful, like this: {noformat} Thrift Impl Type: non-blocking Protocol: Binary Transport: Framed QOP: Authentication & Confidential {noformat}} or {noformat} Protocol: Binary + HTTP Transport: Standard QOP: none {noformat}} > [Thrift] Add transport type info to info server > > > Key: HBASE-18382 > URL: https://issues.apache.org/jira/browse/HBASE-18382 > Project: HBase > Issue Type: Improvement > Components: Thrift >Reporter: Lars George >Priority: Minor > Labels: beginner > > It would be really helpful to know if the Thrift server was started using the > HTTP or binary transport. Any additional info, like QOP settings for SASL > etc. would be great too. Right now the UI is very limited and shows > {{true/false}} for, for example, {{Compact Transport}}. 
I'd suggest > changing this to show something more useful, like: > {noformat} > Thrift Impl Type: non-blocking > Protocol: Binary > Transport: Framed > QOP: Authentication & Confidential > {noformat} > or > {noformat} > Protocol: Binary + HTTP > Transport: Standard > QOP: none > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
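The mapping from the existing boolean flags to the suggested labels could be sketched roughly like this (a hypothetical helper in Ruby, matching the shell's JRuby; the flag names and the derivation are assumptions for illustration, not the actual Thrift server code):

```ruby
# Turn the raw true/false transport flags into the labeled lines proposed
# above. The three flags (framed, compact, http) are illustrative stand-ins
# for the Thrift server's configuration.
def describe_transport(framed, compact, http)
  protocol = compact ? "Compact" : "Binary"
  protocol += " + HTTP" if http
  transport = framed ? "Framed" : "Standard"
  ["Protocol: #{protocol}", "Transport: #{transport}"]
end
```

With this, the two examples in the description fall out of the flag combinations directly instead of showing bare booleans.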
[jira] [Commented] (HBASE-17708) Expose config to set two-way auth over TLS in HttpServer and add a test
[ https://issues.apache.org/jira/browse/HBASE-17708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15964003#comment-15964003 ] Lars George commented on HBASE-17708: - Hey [~apurtell], I was referring to
{noformat}
<property>
  <name>hadoop.ssl.require.client.cert</name>
  <value>false</value>
  <description>Whether client certificates are required</description>
</property>
{noformat}
> Expose config to set two-way auth over TLS in HttpServer and add a test > --- > > Key: HBASE-17708 > URL: https://issues.apache.org/jira/browse/HBASE-17708 > Project: HBase > Issue Type: Bug > Components: security >Reporter: stack > > Up on dev mailing list [~larsgeorge] reports: > {code} > On Mon, Feb 27, 2017 at 5:57 PM, Lars George wrote: > Hi, > We have support for two-way authentication over TLS in the HttpServer > class, but never use it, nor have a config property that could set it. > Hadoop has the same option but exposes it via config. Should we not do > the same? > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HBASE-17791) Locality should not be affected for non-faulty region servers at startup
Lars George created HBASE-17791: --- Summary: Locality should not be affected for non-faulty region servers at startup Key: HBASE-17791 URL: https://issues.apache.org/jira/browse/HBASE-17791 Project: HBase Issue Type: Improvement Components: Balancer, Region Assignment Affects Versions: 1.1.8, 1.2.4, 1.0.3, 1.3.0 Reporter: Lars George Priority: Blocker We seem to have an issue with store file locality as soon as a single server is missing or faulty upon restart. HBASE-15251 is addressing a subset of the problem, ensuring that some remaining files or an empty server do not trigger a full recovery, but is missing further, similar cases. In particular, it misses the case where a server fails to report in before the active master is finished waiting for it to do so, where servers have been decommissioned (but not removed from the {{regionservers}} file), and finally the true case of a dead node. If a single node is faulty, the user regions are _not_ assigned as saved in the {{hbase:meta}} table, but completely randomized in a round-robin fashion. An additional factor is that in this case the regions are _not_ assigned to the best matching node (the one with a copy of the data locally), but to any node, leaving the locality in shambles. What is also bad: if the {{hbase.hstore.min.locality.to.skip.major.compact}} property is left at the default {{0.0f}}, then an older region that had no writes since the last major compaction is simply skipped (as expected, usually) and locality stays bad as-is. All reads for those aged-out regions will be network reads. But in any event, having to run a major compaction after a restart is not good anyway. The issue is the code in {{AssignmentManager.processDeadServersAndRegionsInTransition()}}, which is handed a list of dead servers. But it immediately sets the {{failover}} flag and the code {code} failoverCleanupDone(); if (!failover) { // Fresh cluster startup. LOG.info("Clean cluster startup. 
Don't reassign user regions"); assignAllUserRegions(allRegions); } else { LOG.info("Failover! Reassign user regions"); } {code} does not trigger the assignment of the regions to those servers that are still present and have all their region data local. What should happen is that only the missing regions are reassigned, just as when a server fails while the cluster is running. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
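The desired behavior described above amounts to partitioning the saved assignments into those whose server came back (keep them, preserving locality) and those whose server is truly gone (reassign only these). A hedged Ruby sketch, with plain hashes and arrays standing in for hbase:meta and the live-server list rather than the real AssignmentManager types:

```ruby
# Split the last-known region assignments: keep regions whose server is
# alive again, collect only the orphaned regions for reassignment.
def split_assignments(meta_assignments, live_servers)
  keep = {}
  reassign = []
  meta_assignments.each do |region, server|
    if live_servers.include?(server)
      keep[region] = server   # server is back: restore the old placement
    else
      reassign << region      # server missing: only these go round-robin
    end
  end
  [keep, reassign]
end
```

Only the `reassign` list would need balancer involvement; everything in `keep` goes back where its store files already are.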
[jira] [Commented] (HBASE-15251) During a cluster restart, Hmaster thinks it is a failover by mistake
[ https://issues.apache.org/jira/browse/HBASE-15251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15924349#comment-15924349 ] Lars George commented on HBASE-15251: - I believe this is not enough, as a failed server should be handled better, regardless of whether it had regions. Also, this should be ported to all 1.x branches. > During a cluster restart, Hmaster thinks it is a failover by mistake > > > Key: HBASE-15251 > URL: https://issues.apache.org/jira/browse/HBASE-15251 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 2.0.0 >Reporter: Clara Xiong >Assignee: Clara Xiong > Fix For: 2.0.0 > > Attachments: HBASE-15251-master.patch, HBASE-15251-master-v1.patch > > > We often need to do cluster restart as part of release for a cluster of > > 1000 nodes. We have tried our best to get clean shutdown but 50% of the time, > hmaster still thinks it is a failover. This increases the restart time from 5 > min to 30 min and decreases locality from 99% to 5% since we didn't use a > locality-aware balancer. We had a bug HBASE-14129 but the fix didn't work. > After adding more logging and inspecting the logs, we identified two things > that trigger the failover handling: > 1. When Hmaster.AssignmentManager detects any dead servers on service > manager during joinCluster(), it determines this is a failover without > further check. I added a check whether there is even any region assigned to > these servers. During a clean restart, the regions are not even assigned. > 2. When there are some leftover empty folders for log and split directories > or empty wal files, it is also treated as a failover. I added a check for > that. Although this can be resolved by manual cleanup, it is still too > tedious for restarting a large cluster. > Patch will follow shortly. The fix is tested and used in production now. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (HBASE-14129) If any regionserver gets shutdown uncleanly during full cluster restart, locality looks to be lost
[ https://issues.apache.org/jira/browse/HBASE-14129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars George resolved HBASE-14129. - Resolution: Won't Fix Closing as "won't fix" since the hardcoded flag is too intrusive. The cluster should be able to handle this by fixing the logic in the {{AssignmentManager}}. > If any regionserver gets shutdown uncleanly during full cluster restart, > locality looks to be lost > -- > > Key: HBASE-14129 > URL: https://issues.apache.org/jira/browse/HBASE-14129 > Project: HBase > Issue Type: Bug >Reporter: churro morales > Fix For: 2.0.0, 1.4.0 > > Attachments: HBASE-14129.patch > > > We were doing a cluster restart the other day. Some regionservers did not > shut down cleanly. Upon restart our locality went from 99% to 5%. Upon > looking at the AssignmentManager.joinCluster() code it calls > AssignmentManager.processDeadServersAndRegionsInTransition(). > If the failover flag gets set for any reason it seems we don't call > assignAllUserRegions(). Then it looks like the balancer does the work in > assigning those regions, we don't use a locality aware balancer and we lost > our region locality. > I don't have a solid grasp on the reasoning for these checks but there could > be some potential workarounds here. > 1. After shutting down your cluster, move your WALs aside (replay later). > 2. Clean up your zNodes > That seems to work, but requires a lot of manual labor. Another solution > which I prefer would be to have a flag for ./start-hbase.sh --clean > If we start master with that flag then we do a check in > AssignmentManager.processDeadServersAndRegionsInTransition() thus if this > flag is set we call: assignAllUserRegions() regardless of the failover state. > I have a patch for the latter solution, that is if I am understanding the > logic correctly. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-17774) Improve locality info in table details page
[ https://issues.apache.org/jira/browse/HBASE-17774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15906538#comment-15906538 ] Lars George commented on HBASE-17774: - Sorry, I saw HBASE-15675 only later. I will consolidate what is missing here soon. But the locality and empty store files may still be an issue. Let me try out master and update this JIRA accordingly. > Improve locality info in table details page > --- > > Key: HBASE-17774 > URL: https://issues.apache.org/jira/browse/HBASE-17774 > Project: HBase > Issue Type: Improvement > Components: UI >Affects Versions: 1.3.0 >Reporter: Lars George > > The {{table.jsp}} page does list the locality info of each region, but is > missing some vital other details, which are: > - An extra column listing store and store file counts (could be two separate > columns too like in Master and RS UI, but a single column saves space, for > example "3/6", meaning three stores and six store files in total) > - Hide locality "0.0" if the stores are all empty, i.e. no store file is > present. It makes little sense to penalize an empty store when it comes to the > total locality of the table. > - Make region name clickable like on RS status page, linking to > {{region.jsp}} page showing store details. > - Summary row at the end, showing the total locality (see note above), the number > of stores, and the number of store files. That is, compute the _real_ locality, which > considers only the actual store files. > I also wish we had some simple but effective charts/diagrams here, like a > heatmap for data distribution within the table (since we have all of this > info), or another heatmap for current request distribution. Strong +1 to add > that here too. > I have reasoned about this at customers way too often to not have this > improved somehow. For the enterprise level admin, a good UI goes a long way! -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HBASE-17774) Improve locality info in table details page
Lars George created HBASE-17774: --- Summary: Improve locality info in table details page Key: HBASE-17774 URL: https://issues.apache.org/jira/browse/HBASE-17774 Project: HBase Issue Type: Improvement Components: UI Affects Versions: 1.3.0 Reporter: Lars George The {{table.jsp}} page does list the locality info of each region, but is missing some vital other details, which are: - An extra column listing store and store file counts (could be two separate columns too like in Master and RS UI, but a single column saves space, for example "3/6", meaning three stores and six store files in total) - Hide locality "0.0" if the stores are all empty, i.e. no store file is present. It makes little sense to penalize an empty store when it comes to the total locality of the table. - Make region name clickable like on RS status page, linking to {{region.jsp}} page showing store details. - Summary row at the end, showing the total locality (see note above), the number of stores, and the number of store files. That is, compute the _real_ locality, which considers only the actual store files. I also wish we had some simple but effective charts/diagrams here, like a heatmap for data distribution within the table (since we have all of this info), or another heatmap for current request distribution. Strong +1 to add that here too. I have reasoned about this at customers way too often to not have this improved somehow. For the enterprise level admin, a good UI goes a long way! -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (HBASE-17635) enable_table_replication script cannot handle replication scope
[ https://issues.apache.org/jira/browse/HBASE-17635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars George resolved HBASE-17635. - Resolution: Duplicate Sorry, this was opened just a few weeks earlier in HBASE-17460, closing this JIRA as a dupe. > enable_table_replication script cannot handle replication scope > --- > > Key: HBASE-17635 > URL: https://issues.apache.org/jira/browse/HBASE-17635 > Project: HBase > Issue Type: Bug > Components: Replication >Affects Versions: 1.3.1 >Reporter: Lars George > > When you add a peer, then enable a table for replication using > {{enable_table_replication}}, the script will create the table on the peer > cluster, but with one difference: > _Master Cluster_: > {noformat} > hbase(main):027:0> describe 'testtable' > Table testtable is ENABLED > > > testtable > > > COLUMN FAMILIES DESCRIPTION > > > {NAME => 'cf1', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', > REPLICATION_SCOPE => '1', VERSIONS => '1', COMPRESSION => 'NONE', > MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', > BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'} > > > 1 row(s) in 0.0700 seconds > {noformat} > _Peer Cluster_: > {noformat} > hbase(main):003:0> describe 'testtable' > Table testtable is ENABLED > > > testtable > > > COLUMN FAMILIES DESCRIPTION > > > {NAME => 'cf1', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', > REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '1', TTL => > 'FOREVER', MIN_VERSIONS => '0', KEEP_DELETED_CELLS => 'FALSE', > BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'} > > > 1 row(s) in 0.1260 seconds > {noformat} > Note that the replication scope is different. Removing the peer, adding it > again and enabling the table gives this now: > {noformat} > hbase(main):026:0> enable_table_replication 'testtable' > ERROR: Table testtable exists in peer cluster 1, but the table descriptors > are not same when compared with source cluster. 
Thus can not enable the > table's replication switch. > {noformat} > That is dumb, as it was the same script that enabled the replication scope in > the first place. It should skip that particular attribute when comparing the > cluster schemas. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-17722) Metrics subsystem stop/start messages add a lot of useless bulk to operational logging
[ https://issues.apache.org/jira/browse/HBASE-17722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893808#comment-15893808 ] Lars George commented on HBASE-17722: - +1 on setting those lower, they are annoying as heck. Ideally we should vet log output changes while reviewing patches, so we don't have to fix them afterwards, like now. > Metrics subsystem stop/start messages add a lot of useless bulk to > operational logging > -- > > Key: HBASE-17722 > URL: https://issues.apache.org/jira/browse/HBASE-17722 > Project: HBase > Issue Type: Bug > Components: metrics >Affects Versions: 1.3.0, 1.2.4 >Reporter: Andrew Purtell >Assignee: Andrew Purtell >Priority: Trivial > Fix For: 2.0.0, 1.4.0 > > Attachments: HBASE-17722.patch > > > Metrics subsystem stop/start messages add a lot of useless bulk to > operational logging. Say you are collecting logs from a fleet of thousands of > servers and want to have them around for ~month or longer. It adds up. > I think these should at least be at DEBUG level and ideally at TRACE. They > don't offer much utility. Unfortunately they are Hadoop classes, so all we can > do is tweak log4j.properties defaults instead. We do this in test resources but not > in what we ship in conf/ . > {noformat} > INFO [] impl.MetricsSystemImpl: HBase metrics system started > INFO [] impl.MetricsSystemImpl: Stopping HBase metrics > system... > INFO [] impl.MetricsSystemImpl: HBase metrics system stopped. > INFO [] impl.MetricsConfig: loaded properties from > hadoop-metrics2-hbase.properties > INFO [] impl.MetricsSystemImpl: Scheduled snapshot period at > 10 second(s). > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
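Assuming the logger names match the classes visible in the quoted log (the org.apache.hadoop.metrics2.impl package), a conf/log4j.properties tweak along these lines would quiet the messages. Treat it as a sketch: the chosen level (WARN here; DEBUG would also work) is a judgment call, not a shipped default.

```properties
# Quiet the Hadoop metrics2 start/stop chatter. Class names are taken
# from the log lines quoted above; the WARN level is illustrative.
log4j.logger.org.apache.hadoop.metrics2.impl.MetricsSystemImpl=WARN
log4j.logger.org.apache.hadoop.metrics2.impl.MetricsConfig=WARN
```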
[jira] [Updated] (HBASE-6449) Dapper like tracing
[ https://issues.apache.org/jira/browse/HBASE-6449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars George updated HBASE-6449: --- Resolution: Fixed Assignee: Jonathan Leavitt Fix Version/s: 0.95.2 0.96.0 Status: Resolved (was: Patch Available) Closing this, as it has been in HBase for quite some time now. I tried it out and it worked as advertised. Not sure why this is still open. > Dapper like tracing > --- > > Key: HBASE-6449 > URL: https://issues.apache.org/jira/browse/HBASE-6449 > Project: HBase > Issue Type: New Feature > Components: Client, IPC/RPC >Affects Versions: 0.95.2 >Reporter: Jonathan Leavitt >Assignee: Jonathan Leavitt > Labels: tracing > Fix For: 0.96.0, 0.95.2 > > Attachments: htrace1.diff, htrace2.diff, trace.png > > > Add [Dapper|http://research.google.com/pubs/pub36356.html] like tracing to > HBase. [Accumulo|http://accumulo.apache.org] added something similar with > their cloudtrace package. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HBASE-17636) Fix speling [sic] error in enable replication script output
Lars George created HBASE-17636: --- Summary: Fix speling [sic] error in enable replication script output Key: HBASE-17636 URL: https://issues.apache.org/jira/browse/HBASE-17636 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 1.3.1 Reporter: Lars George When enabling replication for a table: {noformat} hbase(main):012:0> enable_table_replication 'repltest' 0 row(s) in 7.6080 seconds The replication swith of table 'repltest' successfully enabled {noformat} See {{swith}} as opposed to {{switch}}. Also, that sentence is somewhat too complicated. Maybe {{Replication for table successfully enabled.}} would be better? -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HBASE-17635) enable_table_replication script cannot handle replication scope
Lars George created HBASE-17635: --- Summary: enable_table_replication script cannot handle replication scope Key: HBASE-17635 URL: https://issues.apache.org/jira/browse/HBASE-17635 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 1.3.1 Reporter: Lars George When you add a peer, then enable a table for replication using {{enable_table_replication}}, the script will create the table on the peer cluster, but with one difference: _Master Cluster_: {noformat} hbase(main):027:0> describe 'testtable' Table testtable is ENABLED testtable COLUMN FAMILIES DESCRIPTION {NAME => 'cf1', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '1', VERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'} 1 row(s) in 0.0700 seconds {noformat} _Peer Cluster_: {noformat} hbase(main):003:0> describe 'testtable' Table testtable is ENABLED testtable COLUMN FAMILIES DESCRIPTION {NAME => 'cf1', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '1', TTL => 'FOREVER', MIN_VERSIONS => '0', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'} 1 row(s) in 0.1260 seconds {noformat} Note that the replication scope is different. Removing the peer, adding it again and enabling the table gives this now: {noformat} hbase(main):026:0> enable_table_replication 'testtable' ERROR: Table testtable exists in peer cluster 1, but the table descriptors are not same when compared with source cluster. Thus can not enable the table's replication switch. {noformat} That is dumb, as it was the same script that enabled the replication scope in the first place. It should skip that particular attribute when comparing the cluster schemas. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
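The suggested fix — skip {{REPLICATION_SCOPE}} when comparing the schemas — could look roughly like this (a hedged Ruby sketch; the descriptors are modeled as plain attribute hashes, not the real table descriptor objects, and the helper name is illustrative):

```ruby
# Compare two column family descriptors, ignoring attributes that the
# enable_table_replication flow itself changes (REPLICATION_SCOPE).
def descriptors_match_for_replication?(source, peer, ignore = ["REPLICATION_SCOPE"])
  source.reject { |k, _| ignore.include?(k) } ==
    peer.reject { |k, _| ignore.include?(k) }
end
```

With this, the master/peer pair shown above (scope '1' vs '0', everything else equal) would compare as equal, and re-enabling replication would no longer error out.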
[jira] [Updated] (HBASE-17632) Modify example health script to work on CentOS 6 etc.
[ https://issues.apache.org/jira/browse/HBASE-17632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars George updated HBASE-17632: Fix Version/s: 1.3.1 Description: The current example uses {{snmpwalk}} which for starters requires that {{snmpd}} is running (kinda makes sense). But on CentOS 6 I was not able to get its anticipated results, i.e. the snmpwalk of the interfaces returns nothing, causing the test to always fail. Even though this is just an example, we should be able to devise a script that _basically_ works out of the box. > Modify example health script to work on CentOS 6 etc. > - > > Key: HBASE-17632 > URL: https://issues.apache.org/jira/browse/HBASE-17632 > Project: HBase > Issue Type: Bug > Components: master, regionserver >Reporter: Lars George > Fix For: 1.3.1 > > > The current example uses {{snmpwalk}} which for starters requires that > {{snmpd}} is running (kinda makes sense). But on CentOS 6 I was not able to > get its anticipated results, i.e. the snmpwalk of the interfaces returns > nothing, causing the test to always fail. Even though this is just an example, we > should be able to devise a script that _basically_ works out of the box. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HBASE-17632) Modify example health script to work on CentOS 6 etc.
Lars George created HBASE-17632: --- Summary: Modify example health script to work on CentOS 6 etc. Key: HBASE-17632 URL: https://issues.apache.org/jira/browse/HBASE-17632 Project: HBase Issue Type: Bug Components: master, regionserver Reporter: Lars George -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HBASE-17631) Canary interval too low
Lars George created HBASE-17631: --- Summary: Canary interval too low Key: HBASE-17631 URL: https://issues.apache.org/jira/browse/HBASE-17631 Project: HBase Issue Type: Bug Components: canary Affects Versions: 1.3.1 Reporter: Lars George The interval currently is {{6000}} milliseconds, or six seconds; it makes little sense to test that often in succession. We should set the default to at least 60 seconds, or maybe even 5 minutes. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HBASE-17630) Health Script not shutting down server process with certain script behavior
Lars George created HBASE-17630: --- Summary: Health Script not shutting down server process with certain script behavior Key: HBASE-17630 URL: https://issues.apache.org/jira/browse/HBASE-17630 Project: HBase Issue Type: Bug Components: master, regionserver Affects Versions: 1.3.1 Reporter: Lars George As discussed on dev@... I tried the supplied {{healthcheck.sh}}, but did not have {{snmpd}} running. That caused the script to take a long time to error out, which exceeded the 10 seconds the check was meant to run. That resets the check and it keeps reporting the error, but never stops the servers: {noformat} 2017-02-04 05:55:08,962 INFO [regionserver/slave-1.internal.larsgeorge.com/10.0.10.10:16020] hbase.HealthCheckChore: Health Check Chore runs every 10sec 2017-02-04 05:55:08,975 INFO [regionserver/slave-1.internal.larsgeorge.com/10.0.10.10:16020] hbase.HealthChecker: HealthChecker initialized with script at /opt/hbase/bin/healthcheck.sh, timeout=6 ... 2017-02-04 05:55:50,435 INFO [slave-1.internal.larsgeorge.com,16020,1486216506007_ChoreService_1] hbase.HealthCheckChore: Health status at 412837hrs, 55mins, 50sec : ERROR check link, OK: disks ok, 2017-02-04 05:55:50,436 INFO [slave-1.internal.larsgeorge.com,16020,1486216506007_ChoreService_1] hbase.ScheduledChore: Chore: CompactionChecker missed its start time 2017-02-04 05:55:50,437 INFO [slave-1.internal.larsgeorge.com,16020,1486216506007_ChoreService_1] hbase.ScheduledChore: Chore: slave-1.internal.larsgeorge.com,16020,1486216506007-MemstoreFlusherChore missed its start time 2017-02-04 05:55:50,438 INFO [slave-1.internal.larsgeorge.com,16020,1486216506007_ChoreService_2] hbase.ScheduledChore: Chore: HealthChecker missed its start time 2017-02-04 05:56:20,522 INFO [slave-1.internal.larsgeorge.com,16020,1486216506007_ChoreService_2] hbase.HealthCheckChore: Health status at 412837hrs, 56mins, 20sec : ERROR check link, OK: disks ok, 2017-02-04 05:56:20,523 INFO 
[slave-1.internal.larsgeorge.com,16020,1486216506007_ChoreService_2] hbase.ScheduledChore: Chore: HealthChecker missed its start time 2017-02-04 05:56:50,600 INFO [slave-1.internal.larsgeorge.com,16020,1486216506007_ChoreService_2] hbase.HealthCheckChore: Health status at 412837hrs, 56mins, 50sec : ERROR check link, OK: disks ok, 2017-02-04 05:56:50,600 INFO [slave-1.internal.larsgeorge.com,16020,1486216506007_ChoreService_2] hbase.ScheduledChore: Chore: HealthChecker missed its start time 2017-02-04 05:57:20,681 INFO [slave-1.internal.larsgeorge.com,16020,1486216506007_ChoreService_1] hbase.HealthCheckChore: Health status at 412837hrs, 57mins, 20sec : ERROR check link, OK: disks ok, 2017-02-04 05:57:20,681 INFO [slave-1.internal.larsgeorge.com,16020,1486216506007_ChoreService_1] hbase.ScheduledChore: Chore: HealthChecker missed its start time 2017-02-04 05:57:50,763 INFO [slave-1.internal.larsgeorge.com,16020,1486216506007_ChoreService_1] hbase.HealthCheckChore: Health status at 412837hrs, 57mins, 50sec : ERROR check link, OK: disks ok, 2017-02-04 05:57:50,764 INFO [slave-1.internal.larsgeorge.com,16020,1486216506007_ChoreService_1] hbase.ScheduledChore: Chore: HealthChecker missed its start time 2017-02-04 05:58:20,844 INFO [slave-1.internal.larsgeorge.com,16020,1486216506007_ChoreService_1] hbase.HealthCheckChore: Health status at 412837hrs, 58mins, 20sec : ERROR check link, OK: disks ok, 2017-02-04 05:58:20,844 INFO [slave-1.internal.larsgeorge.com,16020,1486216506007_ChoreService_1] hbase.ScheduledChore: Chore: HealthChecker missed its start time 2017-02-04 05:58:50,923 INFO [slave-1.internal.larsgeorge.com,16020,1486216506007_ChoreService_1] hbase.HealthCheckChore: Health status at 412837hrs, 58mins, 50sec : ERROR check link, OK: disks ok, 2017-02-04 05:58:50,923 INFO [slave-1.internal.larsgeorge.com,16020,1486216506007_ChoreService_1] hbase.ScheduledChore: Chore: HealthChecker missed its start time 2017-02-04 05:59:21,017 INFO 
[slave-1.internal.larsgeorge.com,16020,1486216506007_ChoreService_2] hbase.HealthCheckChore: Health status at 412837hrs, 59mins, 21sec : ERROR check link, OK: disks ok, 2017-02-04 05:59:21,018 INFO [slave-1.internal.larsgeorge.com,16020,1486216506007_ChoreService_2] hbase.ScheduledChore: Chore: HealthChecker missed its start time {noformat} We need to fix the handling of the timeout of the health check script and how the chore treats that to shut down the server process. The current settings of check frequency and timeout overlap and cause the above. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
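The invariant being violated above — the script timeout must stay below the chore period, or a hung script makes the chore "miss its start time" forever — can be expressed as a small sketch (illustrative Ruby; the helper name and the clamping policy are assumptions, not existing HBase code):

```ruby
# Keep the health-script timeout strictly below the chore period so a
# stuck script cannot starve the next scheduled check. The clamp to half
# the period is an arbitrary illustrative choice.
def effective_timeout(chore_period_ms, script_timeout_ms)
  if script_timeout_ms >= chore_period_ms
    chore_period_ms / 2
  else
    script_timeout_ms
  end
end
```

In the scenario above (10s chore period, script hanging for roughly a minute), the clamp would cap the wait and let the failure counter actually advance instead of resetting.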
[jira] [Commented] (HBASE-17609) Allow for region merging in the UI
[ https://issues.apache.org/jira/browse/HBASE-17609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15856926#comment-15856926 ] Lars George commented on HBASE-17609: - What would be nice is to have checkboxes next to the regions: tick what you want (for merging, usually adjacent regions) and then merge (or split or compact) the selected regions. The current page is sooo 1999. > Allow for region merging in the UI > --- > > Key: HBASE-17609 > URL: https://issues.apache.org/jira/browse/HBASE-17609 > Project: HBase > Issue Type: Task >Affects Versions: 2.0.0, 1.4.0 >Reporter: churro morales >Assignee: churro morales > Attachments: HBASE-17609-branch-1.3.patch, HBASE-17609.patch > > > HBASE-49 discussed having the ability to merge regions through the HBase UI, > but online region merging wasn't around back then. > I have created additional form fields for the table.jsp where you can pass in > two encoded region names (must be adjacent regions) and a merge can be called > through the UI. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-15347) Update CHANGES.txt for 1.3
[ https://issues.apache.org/jira/browse/HBASE-15347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15825965#comment-15825965 ] Lars George commented on HBASE-15347: - And we have no {{hbase-spark}} module in 1.3. This adoc is 2.0/master only, no? > Update CHANGES.txt for 1.3 > -- > > Key: HBASE-15347 > URL: https://issues.apache.org/jira/browse/HBASE-15347 > Project: HBase > Issue Type: Sub-task > Components: documentation >Affects Versions: 1.3.0 >Reporter: Mikhail Antonov >Assignee: Mikhail Antonov > Fix For: 1.3.0 > > Attachments: HBASE-15347-branch-1.3.v1.patch, HBASE-15347.patch, > HBASE-15347-v2.branch-1.3.patch > > > Going to post the steps in preparing changes file for 1.3 here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15347) Update CHANGES.txt for 1.3
[ https://issues.apache.org/jira/browse/HBASE-15347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15825935#comment-15825935 ] Lars George commented on HBASE-15347: - [~mantonov] sorry for being thick here, but this JIRA adds {{spark.adoc}}, but it is *not* in the attached patch. Where did it come from? > Update CHANGES.txt for 1.3 > -- > > Key: HBASE-15347 > URL: https://issues.apache.org/jira/browse/HBASE-15347 > Project: HBase > Issue Type: Sub-task > Components: documentation >Affects Versions: 1.3.0 >Reporter: Mikhail Antonov >Assignee: Mikhail Antonov > Fix For: 1.3.0 > > Attachments: HBASE-15347-branch-1.3.v1.patch, HBASE-15347.patch, > HBASE-15347-v2.branch-1.3.patch > > > Going to post the steps in preparing changes file for 1.3 here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-17399) region_mover.rb uses different default filename, also needs slight change
Lars George created HBASE-17399: --- Summary: region_mover.rb uses different default filename, also needs slight change Key: HBASE-17399 URL: https://issues.apache.org/jira/browse/HBASE-17399 Project: HBase Issue Type: Bug Affects Versions: 1.3.0 Reporter: Lars George The command line help prints: {noformat} -f, --filename=FILE File to save regions list into unloading, \ or read from loading; default /tmp/hostname:port {noformat} while in reality, the code does this: {code} def getFilename(options, targetServer, port) filename = options[:file] if not filename filename = "/tmp/" + ENV['USER'] + targetServer + ":" + port end return filename end {code} An example for a generated file is: {noformat} /tmp/larsgeorgeslave-3.internal.larsgeorge.com\:16020 {noformat} I suggest we fix the command line help explanation. But first, we should also fix how the name is generated, *adding* a divider between the user name and the host name, and also change how the port is attached to the host name. Currently this results in a rather strange {{\:}} which could be hard to handle. Maybe we simply use an exclamation mark or hash for both? For example: {noformat} /tmp/larsgeorge!slave-3.internal.larsgeorge.com!16020 /tmp/larsgeorge#slave-3.internal.larsgeorge.com#16020 {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
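The proposed naming scheme could be sketched like this (illustrative Ruby; the helper name follows the suggestion above, with the "#" divider picked from the two candidates, and the fixed /tmp prefix kept from the original script):

```ruby
# Build the regions-list filename with an explicit divider between user,
# host, and port, avoiding the awkward "\:" pair the current code emits.
# The "#" divider and helper name are assumptions from the proposal above.
def region_list_filename(user, target_server, port, divider = "#")
  "/tmp/" + [user, target_server, port.to_s].join(divider)
end
```

This also keeps the command line help honest, since the generated name is now a predictable user#host#port triple.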
[jira] [Created] (HBASE-17391) [Shell] Add shell command to get list of servers, with filters
Lars George created HBASE-17391: --- Summary: [Shell] Add shell command to get list of servers, with filters Key: HBASE-17391 URL: https://issues.apache.org/jira/browse/HBASE-17391 Project: HBase Issue Type: Improvement Components: shell Affects Versions: 1.3.0 Reporter: Lars George For some operations, for example calling {{update_config}}, the user needs to specify the full server name. For region servers that is easier to find, but not so much for the master (using {{zk_dump}} works but is noisy). It would be good to add a utility call that lists the servers, preferably with an optional filter (a regexp, server type, or globbing style format) that allows whittling down the potentially long list of servers. For example: {noformat} hbase(main):001:0> list_servers "master" master-1.internal.larsgeorge.com,16000,1483018890074 hbase(main):002:0> list_servers "rs" slave-1.internal.larsgeorge.com,16020,1482996572051 slave-3.internal.larsgeorge.com,16020,1482996572481 slave-2.internal.larsgeorge.com,16020,1482996570909 hbase(main):003:0> list_servers "rs:s.*\.com.*" slave-1.internal.larsgeorge.com,16020,1482996572051 slave-3.internal.larsgeorge.com,16020,1482996572481 slave-2.internal.larsgeorge.com,16020,1482996570909 hbase(main):004:0> list_servers ":.*160?0.*" master-1.internal.larsgeorge.com,16000,1483018890074 slave-1.internal.larsgeorge.com,16020,1482996572051 slave-3.internal.larsgeorge.com,16020,1482996572481 slave-2.internal.larsgeorge.com,16020,1482996570909 {noformat} I could imagine having {{master}}, {{backup-master}}, {{rs}}, and maybe even {{zk}} too. The optional regexp shown uses a colon as a divider. This combines the "by-type" selection with a filter. Example #4 skips the type and uses only the filter. Of course, you could also implement this differently, say with two parameters... just suggesting. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
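A rough sketch of the proposed filtering, using the "type:regexp" syntax from the examples above (hypothetical Ruby; the server map is a stand-in for real cluster state, not an existing shell API):

```ruby
# list_servers with an optional "type:regexp" filter. Either part may be
# omitted: "master" selects by type only, ":pattern" filters all types.
def list_servers(servers_by_type, filter = "")
  type, _, pattern = filter.partition(":")
  matched = type.empty? ? servers_by_type.values.flatten
                        : (servers_by_type[type] || [])
  pattern.empty? ? matched : matched.grep(Regexp.new(pattern))
end
```

The colon divider keeps it to a single argument, matching the shell's usual one-string call style; a two-parameter variant would work just as well.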
[jira] [Created] (HBASE-17390) Online update of configuration for all servers leaves out masters
Lars George created HBASE-17390: --- Summary: Online update of configuration for all servers leaves out masters Key: HBASE-17390 URL: https://issues.apache.org/jira/browse/HBASE-17390 Project: HBase Issue Type: Bug Affects Versions: 1.3.0 Reporter: Lars George Looking at the admin API and this method {code}
public void updateConfiguration() throws IOException {
  for (ServerName server : this.getClusterStatus().getServers()) {
    updateConfiguration(server);
  }
}
{code} you can see that it calls {{getServers()}}, which only returns the region servers. What is missing is also calling {{getMaster()}} and {{getBackupMasters()}} to send them the signal as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
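A self-contained sketch of the fix follows. Plain strings stand in for {{ServerName}} and the method name is made up; the point is only that the signal list should be region servers plus the active master plus any backup masters:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Illustrative sketch: the current loop covers region servers only; the
// fix appends the active master and the backup masters before calling
// updateConfiguration(server) on each entry.
public class UpdateAllServers {
    static List<String> serversToSignal(Collection<String> regionServers,
                                        String master,
                                        Collection<String> backupMasters) {
        List<String> all = new ArrayList<>(regionServers);
        all.add(master);           // the call to getMaster() missing today
        all.addAll(backupMasters); // likewise getBackupMasters()
        return all;
    }

    public static void main(String[] args) {
        List<String> all = serversToSignal(
            Arrays.asList("slave-1,16020,1", "slave-2,16020,2"),
            "master-1,16000,3",
            Arrays.asList("master-2,16000,4"));
        System.out.println(all.size() + " servers would get the signal");
    }
}
```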
[jira] [Commented] (HBASE-10367) RegionServer graceful stop / decommissioning
[ https://issues.apache.org/jira/browse/HBASE-10367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15784771#comment-15784771 ] Lars George commented on HBASE-10367: - The state has not changed and I have users I talk to struggle with this mess way too often. We need to have a clear story on this. Voting up! > RegionServer graceful stop / decommissioning > > > Key: HBASE-10367 > URL: https://issues.apache.org/jira/browse/HBASE-10367 > Project: HBase > Issue Type: Improvement >Reporter: Enis Soztutar > > Right now, we have a weird way of node decommissioning / graceful stop, which > is a graceful_stop.sh bash script, and a region_mover ruby script, and some > draining server support which you have to manually write to a znode > (really!). Also draining servers is only partially supported in LB operations > (LB does take that into account for roundRobin assignment, but not for normal > balance) > See > http://hbase.apache.org/book/node.management.html and HBASE-3071 > I think we should support graceful stop as a first class citizen. Thinking > about it, it seems that the difference between regionserver stop and graceful > stop is that regionserver stop will close the regions, but the master will > only assign them after the znode is deleted. > In the new master design (or even before), if we allow RS to be able to close > regions on its own (without master initiating it), then graceful stop becomes > regular stop. The RS already closes the regions cleanly, and will reject new > region assignments, so that we don't need much of the balancer or draining > server trickery. > This ties into the new master/AM redesign (HBASE-5487), but still deserves > it's own jira. Let's use this to brainstorm on the design. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-17084) Delete HMerge (dead code)
[ https://issues.apache.org/jira/browse/HBASE-17084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664542#comment-15664542 ] Lars George commented on HBASE-17084: - Wait! Please consider the discussion on dev@ with the subject "Merge and HMerge". > Delete HMerge (dead code) > - > > Key: HBASE-17084 > URL: https://issues.apache.org/jira/browse/HBASE-17084 > Project: HBase > Issue Type: Bug >Reporter: Appy >Assignee: Appy >Priority: Trivial > Attachments: HBASE-17084.master.001.patch > > > HMerge isn't used anywhere. Can we delete it? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HBASE-15445) Add support for ACLs for web based UIs
[ https://issues.apache.org/jira/browse/HBASE-15445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars George reassigned HBASE-15445: --- Assignee: Lars George > Add support for ACLs for web based UIs > -- > > Key: HBASE-15445 > URL: https://issues.apache.org/jira/browse/HBASE-15445 > Project: HBase > Issue Type: Bug > Components: master, regionserver, REST, Thrift >Affects Versions: 1.2.0, 1.0.3, 1.1.3 >Reporter: Lars George >Assignee: Lars George > > Since 0.99 and HBASE-10336 we have our own HttpServer class that (like the > counterpart in Hadoop) supports setting an ACL to allow only named users to > access the web based UIs of the server processes. In secure mode we should > support this as it works hand-in-hand with Kerberos authorization and the UGI > class. It seems all we have to do is add a property allowing to set the ACL > property as a list of users and/or groups that have access to the UIs if > needed. > As an add-on, we could combine this with the {{read-only}} flag, so that some > users can only access the UIs with any option to trigger, for example, > splits. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-15445) Add support for ACLs for web based UIs
[ https://issues.apache.org/jira/browse/HBASE-15445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars George updated HBASE-15445: Assignee: Robert Neumann (was: Lars George) > Add support for ACLs for web based UIs > -- > > Key: HBASE-15445 > URL: https://issues.apache.org/jira/browse/HBASE-15445 > Project: HBase > Issue Type: Bug > Components: master, regionserver, REST, Thrift >Affects Versions: 1.2.0, 1.0.3, 1.1.3 >Reporter: Lars George >Assignee: Robert Neumann > > Since 0.99 and HBASE-10336 we have our own HttpServer class that (like the > counterpart in Hadoop) supports setting an ACL to allow only named users to > access the web based UIs of the server processes. In secure mode we should > support this as it works hand-in-hand with Kerberos authorization and the UGI > class. It seems all we have to do is add a property allowing to set the ACL > property as a list of users and/or groups that have access to the UIs if > needed. > As an add-on, we could combine this with the {{read-only}} flag, so that some > users can only access the UIs with any option to trigger, for example, > splits. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16815) Low scan ratio in RPC queue tuning triggers divide by zero exception
[ https://issues.apache.org/jira/browse/HBASE-16815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15578669#comment-15578669 ] Lars George commented on HBASE-16815: - +1 on the patch, good idea to up the level since this is printed only once anyways. Good on you [~zghaobac]. > Low scan ratio in RPC queue tuning triggers divide by zero exception > > > Key: HBASE-16815 > URL: https://issues.apache.org/jira/browse/HBASE-16815 > Project: HBase > Issue Type: Bug > Components: regionserver, rpc >Affects Versions: 2.0.0, 1.3.0 >Reporter: Lars George >Assignee: Guanghao Zhang > Attachments: HBASE-16815.patch > > > Trying the following settings: > {noformat} > > hbase.ipc.server.callqueue.handler.factor > 0.5 > > > hbase.ipc.server.callqueue.read.ratio > 0.5 > > > hbase.ipc.server.callqueue.scan.ratio > 0.1 > > {noformat} > With 30 default handlers, this means 15 queues. Further, it means 8 write > queues and 7 read queues. 10% of that is {{0.7}} which is then floor'ed to > {{0}}. The debug log confirms it, as the tertiary check omits the scan > details when they are zero: > {noformat} > 2016-10-12 12:50:27,305 INFO [main] ipc.SimpleRpcScheduler: Using fifo as > user call queue, count=15 > 2016-10-12 12:50:27,311 DEBUG [main] ipc.RWQueueRpcExecutor: FifoRWQ.default > writeQueues=7 writeHandlers=15 readQueues=8 readHandlers=14 > {noformat} > But the code in {{RWQueueRpcExecutor}} calls {{RpcExecutor.startHandler()}} > nevertheless and that does this: > {code} > for (int i = 0; i < numHandlers; i++) { > final int index = qindex + (i % qsize); > String name = "RpcServer." 
+ threadPrefix + ".handler=" + > handlers.size() + ",queue=" + > index + ",port=" + port; > {code} > The modulo triggers then > {noformat} > 2016-10-12 11:41:22,810 ERROR [main] master.HMasterCommandLine: Master exiting > java.lang.RuntimeException: Failed construction of Master: class > org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMaster > at > org.apache.hadoop.hbase.util.JVMClusterUtil.createMasterThread(JVMClusterUtil.java:145) > at > org.apache.hadoop.hbase.LocalHBaseCluster.addMaster(LocalHBaseCluster.java:220) > at > org.apache.hadoop.hbase.LocalHBaseCluster.(LocalHBaseCluster.java:155) > at > org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:222) > at > org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:137) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at > org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126) > at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2524) > Caused by: java.lang.ArithmeticException: / by zero > at > org.apache.hadoop.hbase.ipc.RpcExecutor.startHandlers(RpcExecutor.java:125) > at > org.apache.hadoop.hbase.ipc.RWQueueRpcExecutor.startHandlers(RWQueueRpcExecutor.java:178) > at org.apache.hadoop.hbase.ipc.RpcExecutor.start(RpcExecutor.java:78) > at > org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.start(SimpleRpcScheduler.java:272) > at org.apache.hadoop.hbase.ipc.RpcServer.start(RpcServer.java:2212) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.start(RSRpcServices.java:1143) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.(HRegionServer.java:615) > at org.apache.hadoop.hbase.master.HMaster.(HMaster.java:396) > at > org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMaster.(HMasterCommandLine.java:312) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > 
at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:422) > at > org.apache.hadoop.hbase.util.JVMClusterUtil.createMasterThread(JVMClusterUtil.java:140) > ... 7 more > {noformat} > That causes the server to not even start. I would suggest we either skip the > {{startHandler()}} call altogether, or make it zero aware. > Another possible option is to reserve at least _one_ scan handler/queue when > the scan ratio is greater than zero, but only of there is more than one read > handler/queue to begin with. Otherwise the scan handler/queue should be zero > and share the one read handler/queue. > Makes sense? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-16815) Low scan ratio in RPC queue tuning triggers divide by zero exception
Lars George created HBASE-16815: --- Summary: Low scan ratio in RPC queue tuning triggers divide by zero exception Key: HBASE-16815 URL: https://issues.apache.org/jira/browse/HBASE-16815 Project: HBase Issue Type: Bug Components: regionserver, rpc Affects Versions: 2.0.0, 1.3.0 Reporter: Lars George Trying the following settings: {noformat} hbase.ipc.server.callqueue.handler.factor 0.5 hbase.ipc.server.callqueue.read.ratio 0.5 hbase.ipc.server.callqueue.scan.ratio 0.1 {noformat} With 30 default handlers, this means 15 queues. Further, it means 8 write queues and 7 read queues. 10% of that is {{0.7}} which is then floor'ed to {{0}}. The debug log confirms it, as the tertiary check omits the scan details when they are zero: {noformat} 2016-10-12 12:50:27,305 INFO [main] ipc.SimpleRpcScheduler: Using fifo as user call queue, count=15 2016-10-12 12:50:27,311 DEBUG [main] ipc.RWQueueRpcExecutor: FifoRWQ.default writeQueues=7 writeHandlers=15 readQueues=8 readHandlers=14 {noformat} But the code in {{RWQueueRpcExecutor}} calls {{RpcExecutor.startHandler()}} nevertheless and that does this: {code} for (int i = 0; i < numHandlers; i++) { final int index = qindex + (i % qsize); String name = "RpcServer." 
+ threadPrefix + ".handler=" + handlers.size() + ",queue=" + index + ",port=" + port; {code} The modulo triggers then {noformat} 2016-10-12 11:41:22,810 ERROR [main] master.HMasterCommandLine: Master exiting java.lang.RuntimeException: Failed construction of Master: class org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMaster at org.apache.hadoop.hbase.util.JVMClusterUtil.createMasterThread(JVMClusterUtil.java:145) at org.apache.hadoop.hbase.LocalHBaseCluster.addMaster(LocalHBaseCluster.java:220) at org.apache.hadoop.hbase.LocalHBaseCluster.(LocalHBaseCluster.java:155) at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:222) at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:137) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126) at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2524) Caused by: java.lang.ArithmeticException: / by zero at org.apache.hadoop.hbase.ipc.RpcExecutor.startHandlers(RpcExecutor.java:125) at org.apache.hadoop.hbase.ipc.RWQueueRpcExecutor.startHandlers(RWQueueRpcExecutor.java:178) at org.apache.hadoop.hbase.ipc.RpcExecutor.start(RpcExecutor.java:78) at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.start(SimpleRpcScheduler.java:272) at org.apache.hadoop.hbase.ipc.RpcServer.start(RpcServer.java:2212) at org.apache.hadoop.hbase.regionserver.RSRpcServices.start(RSRpcServices.java:1143) at org.apache.hadoop.hbase.regionserver.HRegionServer.(HRegionServer.java:615) at org.apache.hadoop.hbase.master.HMaster.(HMaster.java:396) at org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMaster.(HMasterCommandLine.java:312) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:422) at org.apache.hadoop.hbase.util.JVMClusterUtil.createMasterThread(JVMClusterUtil.java:140) ... 7 more {noformat} That causes the server to not even start. I would suggest we either skip the {{startHandler()}} call altogether, or make it zero aware. Another possible option is to reserve at least _one_ scan handler/queue when the scan ratio is greater than zero, but only if there is more than one read handler/queue to begin with. Otherwise the scan handler/queue should be zero and share the one read handler/queue. Makes sense? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
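The "reserve at least one scan queue" option from the last paragraph can be sketched as follows. This is a hedged stand-in for the real computation, not the actual {{RWQueueRpcExecutor}} code:

```java
// Sketch of the zero-aware rule: floor the product as the current code
// does, but round up to one scan queue whenever the scan ratio is
// non-zero and there is more than one read queue to split.
public class ScanQueues {
    static int numScanQueues(int readQueues, float scanRatio) {
        int scan = (int) Math.floor(readQueues * scanRatio);
        if (scan == 0 && scanRatio > 0f && readQueues > 1) {
            scan = 1; // never hand a zero queue count to startHandlers()
        }
        return scan;
    }

    public static void main(String[] args) {
        // the configuration from the report: 7 read queues, 10% scan ratio
        // floor(0.7) is 0 today; the guard bumps it to 1
        System.out.println(numScanQueues(7, 0.1f));
    }
}
```

With a single read queue, or a ratio of exactly zero, the result stays zero and scans share the read queue, matching the fallback described above.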
[jira] [Commented] (HBASE-16401) Enable HeapMemoryManager by default in 2.0
[ https://issues.apache.org/jira/browse/HBASE-16401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15420987#comment-15420987 ] Lars George commented on HBASE-16401: - Whoops, I meant 20%-60% for each, so that it makes 80% in total. > Enable HeapMemoryManager by default in 2.0 > -- > > Key: HBASE-16401 > URL: https://issues.apache.org/jira/browse/HBASE-16401 > Project: HBase > Issue Type: Improvement >Reporter: stack >Priority: Critical > Fix For: 2.0.0 > > > Back in HBASE-5349, on the end of the issue, we talked about enabling > HeapMemoryManager by default. Lets do it for 2.0 with some conservative > boundaries. Do it now so we have some experience running it before release. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16401) Enable HeapMemoryManager by default in 2.0
[ https://issues.apache.org/jira/browse/HBASE-16401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15420058#comment-15420058 ] Lars George commented on HBASE-16401: - I'd suggest something like 20-40% for both. We now have 40% + 40% as default. I would say a sensible min range would be 20% for both, and leave the 40% as maximum? > Enable HeapMemoryManager by default in 2.0 > -- > > Key: HBASE-16401 > URL: https://issues.apache.org/jira/browse/HBASE-16401 > Project: HBase > Issue Type: Improvement >Reporter: stack >Priority: Critical > Fix For: 2.0.0 > > > Back in HBASE-5349, on the end of the issue, we talked about enabling > HeapMemoryManager by default. Lets do it for 2.0 with some conservative > boundaries. Do it now so we have some experience running it before release. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HBASE-13928) Correct doc bug introduced in HBASE-11735
[ https://issues.apache.org/jira/browse/HBASE-13928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars George resolved HBASE-13928. - Resolution: Duplicate Was fixed in HBASE-15528. > Correct doc bug introduced in HBASE-11735 > - > > Key: HBASE-13928 > URL: https://issues.apache.org/jira/browse/HBASE-13928 > Project: HBase > Issue Type: Task > Components: documentation >Affects Versions: 0.99.0, 0.98.4, 0.98.5 >Reporter: Misty Stanley-Jones >Assignee: Misty Stanley-Jones > Fix For: 2.0.0, 0.98.6, 0.99.0 > > > {quote}Biju Nair added a comment - 09/Jun/15 04:53 > I think the parameter hbase.bucketcache.sizes is used in the document patch > instead of hbase.bucketcache.bucket.sizes to configure bucket sizes. > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13928) Correct doc bug introduced in HBASE-11735
[ https://issues.apache.org/jira/browse/HBASE-13928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15370869#comment-15370869 ] Lars George commented on HBASE-13928: - Oh and what he is saying is - just to clarify - that the {{hbase-default.xml}} is wrong. > Correct doc bug introduced in HBASE-11735 > - > > Key: HBASE-13928 > URL: https://issues.apache.org/jira/browse/HBASE-13928 > Project: HBase > Issue Type: Task > Components: documentation >Affects Versions: 0.99.0, 0.98.4, 0.98.5 >Reporter: Misty Stanley-Jones >Assignee: Misty Stanley-Jones > Fix For: 0.99.0, 2.0.0, 0.98.6 > > > {quote}Biju Nair added a comment - 09/Jun/15 04:53 > I think the parameter hbase.bucketcache.sizes is used in the document patch > instead of hbase.bucketcache.bucket.sizes to configure bucket sizes. > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HBASE-13928) Correct doc bug introduced in HBASE-11735
[ https://issues.apache.org/jira/browse/HBASE-13928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars George reopened HBASE-13928: - [~gsbiju] is right, the added {{hbase-default.xml}} key is wrong, and never used. It is missing the {{.bucket.}} as mentioned and should read {{hbase.bucketcache.bucket.sizes}}. Only then can an operator set the sizes. The wrong property is merely misleading, but needs fixing anyway. > Correct doc bug introduced in HBASE-11735 > - > > Key: HBASE-13928 > URL: https://issues.apache.org/jira/browse/HBASE-13928 > Project: HBase > Issue Type: Task > Components: documentation >Affects Versions: 0.99.0, 0.98.4, 0.98.5 >Reporter: Misty Stanley-Jones >Assignee: Misty Stanley-Jones > Fix For: 0.99.0, 2.0.0, 0.98.6 > > > {quote}Biju Nair added a comment - 09/Jun/15 04:53 > I think the parameter hbase.bucketcache.sizes is used in the document patch > instead of hbase.bucketcache.bucket.sizes to configure bucket sizes. > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-16188) Add EventCounter information to log4j properties file
Lars George created HBASE-16188: --- Summary: Add EventCounter information to log4j properties file Key: HBASE-16188 URL: https://issues.apache.org/jira/browse/HBASE-16188 Project: HBase Issue Type: Improvement Affects Versions: 1.2.1 Reporter: Lars George Priority: Minor Hadoop's {{JvmMetrics}}, which HBase also uses in Metrics2 and exposes as an MBean, has the ability to count log4j log calls. This is tracked by a special {{Appender}} class, also provided by Hadoop, called {{EventCounter}}. We should add some info on how to enable this (or maybe even enable it by default?). The appender needs to be added in two places, shown here: {noformat}
hbase.root.logger=INFO,console
...
# Define the root logger to the system property "hbase.root.logger".
log4j.rootLogger=${hbase.root.logger}, EventCounter
log4j.appender.EventCounter=org.apache.hadoop.log.metrics.EventCounter
{noformat} We could simply add this commented out, akin to {{hbase-env.sh}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-16187) Fix typo in blog post for metrics2
Lars George created HBASE-16187: --- Summary: Fix typo in blog post for metrics2 Key: HBASE-16187 URL: https://issues.apache.org/jira/browse/HBASE-16187 Project: HBase Issue Type: Bug Components: website Reporter: Lars George Assignee: Sean Busbey See https://blogs.apache.org/hbase/entry/migration_to_the_new_metrics s/sudo/pseudo -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-16186) Fix AssignmentManager MBean name
Lars George created HBASE-16186: --- Summary: Fix AssignmentManager MBean name Key: HBASE-16186 URL: https://issues.apache.org/jira/browse/HBASE-16186 Project: HBase Issue Type: Bug Components: master Affects Versions: 1.2.1 Reporter: Lars George Fix For: 2.0.0 The MBean has a spelling error, listed as "AssignmentManger" (note the missing "a"). This is a publicly available name that tools might already use to filter metrics etc. We should change this across major versions only? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-15445) Add support for ACLs for web based UIs
Lars George created HBASE-15445: --- Summary: Add support for ACLs for web based UIs Key: HBASE-15445 URL: https://issues.apache.org/jira/browse/HBASE-15445 Project: HBase Issue Type: Bug Components: master, regionserver, REST, Thrift Affects Versions: 1.1.3, 1.0.3, 1.2.0 Reporter: Lars George Since 0.99 and HBASE-10336 we have our own HttpServer class that (like the counterpart in Hadoop) supports setting an ACL to allow only named users to access the web based UIs of the server processes. In secure mode we should support this as it works hand-in-hand with Kerberos authorization and the UGI class. It seems all we have to do is add a property allowing to set the ACL property as a list of users and/or groups that have access to the UIs if needed. As an add-on, we could combine this with the {{read-only}} flag, so that some users can only access the UIs with any option to trigger, for example, splits. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15098) Normalizer switch in configuration is not used
[ https://issues.apache.org/jira/browse/HBASE-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15109554#comment-15109554 ] Lars George commented on HBASE-15098: - [~stack] thanks, and indeed, I am scrambling... I trust from the resolve and above Hudson messages that you also pushed it into all other listed targets. Appreciated. > Normalizer switch in configuration is not used > -- > > Key: HBASE-15098 > URL: https://issues.apache.org/jira/browse/HBASE-15098 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 1.2.0 >Reporter: Lars George >Assignee: Ted Yu >Priority: Blocker > Fix For: 2.0.0, 1.2.0, 1.3.0 > > Attachments: HBASE-15098.v1.patch > > > The newly added global switch to enable the new normalizer functionality is > never used apparently, meaning it is always on. The {{hbase-default.xml}} has > this: > {noformat} > > hbase.normalizer.enabled > false > If set to true, Master will try to keep region size > within each table approximately the same. > > {noformat} > But only a test class uses it to set the switch to "true". We should > implement a proper {{if}} statement that checks this value and properly > disables the feature cluster wide if not wanted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15098) Normalizer switch in configuration is not used
[ https://issues.apache.org/jira/browse/HBASE-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15108204#comment-15108204 ] Lars George commented on HBASE-15098: - +1, lgtm. Shall I commit or [~tedyu], since he did the patch. Happy to otherwise. > Normalizer switch in configuration is not used > -- > > Key: HBASE-15098 > URL: https://issues.apache.org/jira/browse/HBASE-15098 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 1.2.0 >Reporter: Lars George >Priority: Blocker > Fix For: 2.0.0, 1.2.0, 1.3.0 > > Attachments: HBASE-15098.v1.patch > > > The newly added global switch to enable the new normalizer functionality is > never used apparently, meaning it is always on. The {{hbase-default.xml}} has > this: > {noformat} > > hbase.normalizer.enabled > false > If set to true, Master will try to keep region size > within each table approximately the same. > > {noformat} > But only a test class uses it to set the switch to "true". We should > implement a proper {{if}} statement that checks this value and properly > disables the feature cluster wide if not wanted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-15098) Normalizer switch in configuration is not used
[ https://issues.apache.org/jira/browse/HBASE-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars George updated HBASE-15098: Fix Version/s: 1.3.0 > Normalizer switch in configuration is not used > -- > > Key: HBASE-15098 > URL: https://issues.apache.org/jira/browse/HBASE-15098 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 1.2.0 >Reporter: Lars George > Fix For: 2.0.0, 1.2.0, 1.3.0, 1.2.1 > > > The newly added global switch to enable the new normalizer functionality is > never used apparently, meaning it is always on. The {{hbase-default.xml}} has > this: > {noformat} > > hbase.normalizer.enabled > false > If set to true, Master will try to keep region size > within each table approximately the same. > > {noformat} > But only a test class uses it to set the switch to "true". We should > implement a proper {{if}} statement that checks this value and properly > disables the feature cluster wide if not wanted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-15098) Normalizer switch in configuration is not used
Lars George created HBASE-15098: --- Summary: Normalizer switch in configuration is not used Key: HBASE-15098 URL: https://issues.apache.org/jira/browse/HBASE-15098 Project: HBase Issue Type: Bug Components: master Affects Versions: 1.2.0 Reporter: Lars George Fix For: 2.0.0, 1.2.0, 1.2.1 The newly added global switch to enable the new normalizer functionality is never used apparently, meaning it is always on. The {{hbase-default.xml}} has this: {noformat} hbase.normalizer.enabled false If set to true, Master will try to keep region size within each table approximately the same. {noformat} But only a test class uses it to set the switch to "true". We should implement a proper {{if}} statement that checks this value and properly disables the feature cluster wide if not wanted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
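The "proper {{if}} statement" asked for above could look roughly like this. A plain map stands in for the HBase {{Configuration}} object and the helper name is made up:

```java
import java.util.Collections;
import java.util.Map;

// Illustrative sketch of the missing guard: read the switch, default to
// false as hbase-default.xml documents, and skip the normalizer chore
// cluster-wide when it is disabled.
public class NormalizerSwitch {
    static boolean normalizerEnabled(Map<String, String> conf) {
        return Boolean.parseBoolean(
            conf.getOrDefault("hbase.normalizer.enabled", "false"));
    }

    public static void main(String[] args) {
        if (!normalizerEnabled(Collections.emptyMap())) {
            System.out.println("normalizer disabled, skipping chore");
        }
    }
}
```

The chore scheduling in the master would consult this flag before running, so the documented default of {{false}} actually takes effect.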
[jira] [Commented] (HBASE-15098) Normalizer switch in configuration is not used
[ https://issues.apache.org/jira/browse/HBASE-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096149#comment-15096149 ] Lars George commented on HBASE-15098: - Weird thing is, this was meant to be in there? See https://github.com/apache/hbase/commit/fd37ccb63c545850c08c132b2f6470354a6629f9#diff-910fe86f307ab33e4e946c666e739972R557 Apparently https://issues.apache.org/jira/browse/HBASE-14367 removed the switch in favor of the task always running, but then be able to enable it per table using the shell? Does this mean we just have to remove the configuration property from {{hbase-default.xml}}? > Normalizer switch in configuration is not used > -- > > Key: HBASE-15098 > URL: https://issues.apache.org/jira/browse/HBASE-15098 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 1.2.0 >Reporter: Lars George > Fix For: 2.0.0, 1.2.0, 1.3.0, 1.2.1 > > > The newly added global switch to enable the new normalizer functionality is > never used apparently, meaning it is always on. The {{hbase-default.xml}} has > this: > {noformat} > > hbase.normalizer.enabled > false > If set to true, Master will try to keep region size > within each table approximately the same. > > {noformat} > But only a test class uses it to set the switch to "true". We should > implement a proper {{if}} statement that checks this value and properly > disables the feature cluster wide if not wanted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14367) Add normalization support to shell
[ https://issues.apache.org/jira/browse/HBASE-14367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096165#comment-15096165 ] Lars George commented on HBASE-14367: - Was the defect filed? What is the JIRA? Could it be linked here? I dislike dangling comments for some reason. :( > Add normalization support to shell > -- > > Key: HBASE-14367 > URL: https://issues.apache.org/jira/browse/HBASE-14367 > Project: HBase > Issue Type: Bug > Components: Balancer, shell >Affects Versions: 1.1.2 >Reporter: Lars George >Assignee: Mikhail Antonov > Fix For: 2.0.0, 1.2.0, 1.3.0 > > Attachments: HBASE-14367-branch-1.2.v1.patch, > HBASE-14367-branch-1.2.v2.patch, HBASE-14367-branch-1.2.v3.patch, > HBASE-14367-branch-1.v1.patch, HBASE-14367-v1.patch, HBASE-14367.patch > > > https://issues.apache.org/jira/browse/HBASE-13103 adds support for setting a > normalization flag per {{HTableDescriptor}}, along with the server side chore > to do the work. > What is lacking is to easily set this from the shell, right now you need to > use the Java API to modify the descriptor. This issue is to add the flag as a > known attribute key and/or other means to toggle this per table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14975) Don't color the total RIT line yellow if it's zero
[ https://issues.apache.org/jira/browse/HBASE-14975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15091781#comment-15091781 ] Lars George commented on HBASE-14975: - This was a (partial) dupe of https://issues.apache.org/jira/browse/HBASE-13839. Bummer to now have missed some other fixes. > Don't color the total RIT line yellow if it's zero > -- > > Key: HBASE-14975 > URL: https://issues.apache.org/jira/browse/HBASE-14975 > Project: HBase > Issue Type: Bug > Components: UI >Reporter: Elliott Clark >Assignee: Pallavi Adusumilli > Fix For: 2.0.0, 1.3.0 > > Attachments: HBASE-14975.patch, Screen Shot 2015-12-14 at 11.37.13 > AM.png, Screenshot 2016-01-04.png > > > Right now if there are regions in transition, sometimes the RIT over 60 > seconds line is colored yellow. It shouldn't be colored yellow if there are > no regions that have been in transition too long. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13839) Fix AssgnmentManagerTmpl.jamon issues (coloring, content etc.)
[ https://issues.apache.org/jira/browse/HBASE-13839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15091783#comment-15091783 ] Lars George commented on HBASE-13839: - HBASE-14975 has partial fix of issues. This here needs to be rebased now. [~mwarhaftig], would you mind checking again, I am happy to push it to commit. > Fix AssgnmentManagerTmpl.jamon issues (coloring, content etc.) > -- > > Key: HBASE-13839 > URL: https://issues.apache.org/jira/browse/HBASE-13839 > Project: HBase > Issue Type: Bug > Components: master, UI >Affects Versions: 1.1.0 >Reporter: Lars George >Assignee: Matt Warhaftig > Labels: beginner > Fix For: 2.0.0, 1.3.0 > > Attachments: HBASE-13838_post.tiff, HBASE-13838_pre.tiff, > hbase-13839-v1.patch > > > The template for the RIT in the Master status page, > AssignmentManagerTmpl.jamon) has a few issues: > - The oldest RIT should not be _red_, looks like a failed entry > The RIT entries should be for example yellow/amber when over the threshold > time, and red if 2x the threshold - or red for the oldest once over the > threshold. > - Region count over RIT threshold should only be colored if > 0 > The summary line (first of two) should not be colored unless there is a value > > 0 in it. > - Color is overriden by table-stripped CSS style! > The Bootstrap stylesheet cancels out the hardcoded coloring! The > table-stripped resets the conditional coloring and should be fixed. Best is > to use "alert-warning" etc. that come from the Bootstrap theme stylesheet. > That should maybe already work in combination with the "table-stripped" from > the same. > - Should sort descending by time > Currently the list of regions is sorted by encoded region name. Better is to > have the table sorted by RIT time descending. > We should also think about a pagination option for the currently hardcoded > 100 entries max. Maybe a separate issue? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14864) Add support for bucketing of keys into client library
[ https://issues.apache.org/jira/browse/HBASE-14864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars George updated HBASE-14864: Description: This has been discussed and taught so many times, I believe it is time to support it properly. The idea is to be able to assign an optional _bucketing_ strategy to a table, which translates the user-given row keys into a bucketed version. This is done either by a simple count, or by parts of the key. Possibly some simple functionality should help _compute_ bucket keys. For example, given a key {{\-\--...}} you could imagine that a rule can be defined that takes the _epoch_ part and chunks it into, for example, 5 minute buckets. This allows storing small time series together and makes reading (especially over many servers) much more efficient. The client also supports the proper scan logic to fan a scan out over the buckets as needed. There may be an executor service (implicitly or explicitly provided) that is used to fetch the original data with user-visible ordering from the distributed buckets. Note that this has been attempted a few times to various extents out in the field, but then withered away. This is an essential feature that, when present in the API, will make users consider it earlier, instead of when it is too late (when hot spotting occurs, for example). The selected bucketing strategy and settings could be stored in the table descriptor key/value pairs. This will allow any client to observe the strategy transparently. If not set, the behaviour is the same as today, so the new feature does not touch any critical path in terms of code, and is fully client side. (But it could be considered for, say, UI support as well - if needed.) The strategies are pluggable using classes, but a few default implementations are supplied. was: This has been discussed and taught so many times, I believe it is time to support it properly. 
The idea is to be able to assign an optional _bucketing_ strategy to a table, which translates the user given row keys into a bucketed version. This is done by either simple count, or by parts of the key. Possibly some simple functionality should help _compute_ bucket keys. For example, given a key {{---...}} you could imagine that a rule can be defined that takes the _epoch_ part and chunks it into, for example, 5 minute buckets. This allows to store small time series together and make reading (especially over many servers) much more efficient. The client also supports the proper scan logic to fan a scan over the buckets as needed. There may be an executor service (implicitly or explicitly provided) that is used to fetch the original data with user visible ordering from the distributed buckets. Note that this has been attempted a few times to various extends out in the field, but then withered away. This is an essential feature that when present in the API will make users consider this earlier, instead of when it is too late (when hot spotting occurs for example). The selected bucketing strategy and settings could be stored in the table descriptor key/value pairs. This will allow any client to observe the strategy transparently. If not set the behaviour is the same as today, so the new feature is not touching any critical path in terms of code, and is fully client side. (But could be considered for say UI support as well - if needed). The strategies are pluggable using classes, but a few default implementations are supplied. > Add support for bucketing of keys into client library > - > > Key: HBASE-14864 > URL: https://issues.apache.org/jira/browse/HBASE-14864 > Project: HBase > Issue Type: New Feature > Components: Client >Reporter: Lars George > > This has been discussed and taught so many times, I believe it is time to > support it properly. 
The idea is to be able to assign an optional _bucketing_ > strategy to a table, which translates the user given row keys into a bucketed > version. This is done by either simple count, or by parts of the key. > Possibly some simple functionality should help _compute_ bucket keys. > For example, given a key {{\-\--...}} you could > imagine that a rule can be defined that takes the _epoch_ part and chunks it > into, for example, 5 minute buckets. This allows to store small time series > together and make reading (especially over many servers) much more efficient. > The client also supports the proper scan logic to fan a scan over the buckets > as needed. There may be an executor service (implicitly or explicitly > provided) that is used to fetch the original data with user visible ordering > from the distributed buckets. > Note that this has been attempted a few times to various extends out in the > field, but then withered away. This is an essential
[jira] [Created] (HBASE-14864) Add support for bucketing of keys into client library
Lars George created HBASE-14864: --- Summary: Add support for bucketing of keys into client library Key: HBASE-14864 URL: https://issues.apache.org/jira/browse/HBASE-14864 Project: HBase Issue Type: New Feature Components: Client Reporter: Lars George This has been discussed and taught so many times, I believe it is time to support it properly. The idea is to be able to assign an optional _bucketing_ strategy to a table, which translates the user-given row keys into a bucketed version. This is done either by a simple count, or by parts of the key. Possibly some simple functionality should help _compute_ bucket keys. For example, given a key {{---...}} you could imagine that a rule can be defined that takes the _epoch_ part and chunks it into, for example, 5 minute buckets. This allows storing small time series together and makes reading (especially over many servers) much more efficient. The client also supports the proper scan logic to fan a scan out over the buckets as needed. There may be an executor service (implicitly or explicitly provided) that is used to fetch the original data with user-visible ordering from the distributed buckets. Note that this has been attempted a few times to various extents out in the field, but then withered away. This is an essential feature that, when present in the API, will make users consider it earlier, instead of when it is too late (when hot spotting occurs, for example). The selected bucketing strategy and settings could be stored in the table descriptor key/value pairs. This will allow any client to observe the strategy transparently. If not set, the behaviour is the same as today, so the new feature does not touch any critical path in terms of code, and is fully client side. (But it could be considered for, say, UI support as well - if needed.) The strategies are pluggable using classes, but a few default implementations are supplied. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
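The pluggable-strategy idea above can be sketched in plain Java. The interface and class names below are purely illustrative (this is not a proposed HBase API); the sketch assumes a simple hash-modulo strategy that prefixes the user key with a deterministic bucket id:

```java
// Illustrative sketch only: a pluggable bucketing strategy as described in
// the issue. Interface and class names are hypothetical, not an HBase API.
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

interface BucketingStrategy {
    byte[] toBucketedKey(byte[] userKey);
}

/** Prefixes the user key with a deterministic (hash % numBuckets) bucket id. */
class ModuloBucketingStrategy implements BucketingStrategy {
    private final int numBuckets;

    ModuloBucketingStrategy(int numBuckets) {
        this.numBuckets = numBuckets;
    }

    @Override
    public byte[] toBucketedKey(byte[] userKey) {
        int bucket = (Arrays.hashCode(userKey) & Integer.MAX_VALUE) % numBuckets;
        byte[] prefix = String.format("%02d-", bucket).getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[prefix.length + userKey.length];
        System.arraycopy(prefix, 0, out, 0, prefix.length);
        System.arraycopy(userKey, 0, out, prefix.length, userKey.length);
        return out;
    }
}

public class BucketingDemo {
    public static void main(String[] args) {
        BucketingStrategy strategy = new ModuloBucketingStrategy(8);
        byte[] bucketed = strategy.toBucketedKey("row-1433763479".getBytes(StandardCharsets.UTF_8));
        // The mapping is deterministic, so a scan can fan out over all 8
        // prefixes and merge results back into user-visible order.
        System.out.println(new String(bucketed, StandardCharsets.UTF_8));
    }
}
```

A table-level setting would then only need to name such a strategy class (plus its parameters) in the descriptor, so any client can apply it transparently.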
[jira] [Commented] (HBASE-14864) Add support for bucketing of keys into client library
[ https://issues.apache.org/jira/browse/HBASE-14864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15019180#comment-15019180 ] Lars George commented on HBASE-14864: - Yes, say you have an epoch like {noformat} $ date -j -f %s 1433763479 Mon Jun 8 04:37:59 PDT 2015 {noformat} or the date it corresponds to, given as {{20150608043759}}; instead of sending this to a _random_, hashed location per timestamp, it would be good to _round_ it to every five minutes. That way all keys from, say, {{201506080435}} to {{201506080440}} arrive in the same bucket. Now when you read the data you can fetch it as a block per server, and therefore IO is more efficient (per server). > Add support for bucketing of keys into client library > - > > Key: HBASE-14864 > URL: https://issues.apache.org/jira/browse/HBASE-14864 > Project: HBase > Issue Type: New Feature > Components: Client >Reporter: Lars George > > This has been discussed and taught so many times, I believe it is time to > support it properly. The idea is to be able to assign an optional _bucketing_ > strategy to a table, which translates the user given row keys into a bucketed > version. This is done by either simple count, or by parts of the key. > Possibly some simple functionality should help _compute_ bucket keys. > For example, given a key {{\-\--...}} you could > imagine that a rule can be defined that takes the _epoch_ part and chunks it > into, for example, 5 minute buckets. This allows to store small time series > together and make reading (especially over many servers) much more efficient. > The client also supports the proper scan logic to fan a scan over the buckets > as needed. There may be an executor service (implicitly or explicitly > provided) that is used to fetch the original data with user visible ordering > from the distributed buckets. > Note that this has been attempted a few times to various extends out in the > field, but then withered away. 
This is an essential feature that when present > in the API will make users consider this earlier, instead of when it is too > late (when hot spotting occurs for example). > The selected bucketing strategy and settings could be stored in the table > descriptor key/value pairs. This will allow any client to observe the > strategy transparently. If not set the behaviour is the same as today, so the > new feature is not touching any critical path in terms of code, and is fully > client side. (But could be considered for say UI support as well - if needed). > The strategies are pluggable using classes, but a few default implementations > are supplied. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
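The five-minute rounding in the comment above can be expressed directly on the epoch value. A minimal sketch (the class name is hypothetical): epoch seconds are truncated to the enclosing 300-second window, so every key from 04:35:00 through 04:39:59 shares one bucket timestamp.

```java
// Hypothetical helper: truncate epoch seconds to the enclosing bucket window.
public class EpochBucket {
    static long bucketOf(long epochSeconds, long bucketSeconds) {
        return epochSeconds - (epochSeconds % bucketSeconds);
    }

    public static void main(String[] args) {
        long epoch = 1433763479L; // Mon Jun 8 04:37:59 PDT 2015, from the comment
        System.out.println(bucketOf(epoch, 300)); // prints 1433763300 (04:35:00)
    }
}
```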
[jira] [Commented] (HBASE-14864) Add support for bucketing of keys into client library
[ https://issues.apache.org/jira/browse/HBASE-14864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15019181#comment-15019181 ] Lars George commented on HBASE-14864: - This is certainly related and could be included in the support. > Add support for bucketing of keys into client library > - > > Key: HBASE-14864 > URL: https://issues.apache.org/jira/browse/HBASE-14864 > Project: HBase > Issue Type: New Feature > Components: Client >Reporter: Lars George > > This has been discussed and taught so many times, I believe it is time to > support it properly. The idea is to be able to assign an optional _bucketing_ > strategy to a table, which translates the user given row keys into a bucketed > version. This is done by either simple count, or by parts of the key. > Possibly some simple functionality should help _compute_ bucket keys. > For example, given a key {{\-\--...}} you could > imagine that a rule can be defined that takes the _epoch_ part and chunks it > into, for example, 5 minute buckets. This allows to store small time series > together and make reading (especially over many servers) much more efficient. > The client also supports the proper scan logic to fan a scan over the buckets > as needed. There may be an executor service (implicitly or explicitly > provided) that is used to fetch the original data with user visible ordering > from the distributed buckets. > Note that this has been attempted a few times to various extends out in the > field, but then withered away. This is an essential feature that when present > in the API will make users consider this earlier, instead of when it is too > late (when hot spotting occurs for example). > The selected bucketing strategy and settings could be stored in the table > descriptor key/value pairs. This will allow any client to observe the > strategy transparently. 
If not set the behaviour is the same as today, so the > new feature is not touching any critical path in terms of code, and is fully > client side. (But could be considered for say UI support as well - if needed). > The strategies are pluggable using classes, but a few default implementations > are supplied. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14343) Fix debug message in SimpleRegionNormalizer for small regions
[ https://issues.apache.org/jira/browse/HBASE-14343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14972484#comment-14972484 ] Lars George commented on HBASE-14343: - No worries. Looks good to me, committing... > Fix debug message in SimpleRegionNormalizer for small regions > - > > Key: HBASE-14343 > URL: https://issues.apache.org/jira/browse/HBASE-14343 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 1.1.1 >Reporter: Lars George >Assignee: Lars Francke >Priority: Trivial > Labels: beginner > Attachments: HBASE-14343-extended.patch, HBASE-14343.patch > > > The {{SimpleRegionNormalizer}} has this: > {code} > if ((smallestRegion.getSecond() + > smallestNeighborOfSmallestRegion.getSecond() > < avgRegionSize)) { > LOG.debug("Table " + table + ", smallest region size: " + > smallestRegion.getSecond() > + " and its smallest neighbor size: " + > smallestNeighborOfSmallestRegion.getSecond() > + ", less than half the avg size, merging them"); > {code} > It does *not* check for "less than half the avg size" but only "less than the > avg size", that is, drop the "half". Fix message. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14343) Fix debug message in SimpleRegionNormalizer for small regions
[ https://issues.apache.org/jira/browse/HBASE-14343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars George updated HBASE-14343: Resolution: Fixed Status: Resolved (was: Patch Available) Trivial patch. Applied to branch-1, branch-1.2, and master. Thanks Lars! > Fix debug message in SimpleRegionNormalizer for small regions > - > > Key: HBASE-14343 > URL: https://issues.apache.org/jira/browse/HBASE-14343 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 1.1.1 >Reporter: Lars George >Assignee: Lars Francke >Priority: Trivial > Labels: beginner > Fix For: 2.0.0, 1.2.0, 1.3.0 > > Attachments: HBASE-14343-extended.patch, HBASE-14343.patch > > > The {{SimpleRegionNormalizer}} has this: > {code} > if ((smallestRegion.getSecond() + > smallestNeighborOfSmallestRegion.getSecond() > < avgRegionSize)) { > LOG.debug("Table " + table + ", smallest region size: " + > smallestRegion.getSecond() > + " and its smallest neighbor size: " + > smallestNeighborOfSmallestRegion.getSecond() > + ", less than half the avg size, merging them"); > {code} > It does *not* check for "less than half the avg size" but only "less than the > avg size", that is, drop the "half". Fix message. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14343) Fix debug message in SimpleRegionNormalizer for small regions
[ https://issues.apache.org/jira/browse/HBASE-14343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars George updated HBASE-14343: Fix Version/s: 1.3.0 1.2.0 2.0.0 > Fix debug message in SimpleRegionNormalizer for small regions > - > > Key: HBASE-14343 > URL: https://issues.apache.org/jira/browse/HBASE-14343 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 1.1.1 >Reporter: Lars George >Assignee: Lars Francke >Priority: Trivial > Labels: beginner > Fix For: 2.0.0, 1.2.0, 1.3.0 > > Attachments: HBASE-14343-extended.patch, HBASE-14343.patch > > > The {{SimpleRegionNormalizer}} has this: > {code} > if ((smallestRegion.getSecond() + > smallestNeighborOfSmallestRegion.getSecond() > < avgRegionSize)) { > LOG.debug("Table " + table + ", smallest region size: " + > smallestRegion.getSecond() > + " and its smallest neighbor size: " + > smallestNeighborOfSmallestRegion.getSecond() > + ", less than half the avg size, merging them"); > {code} > It does *not* check for "less than half the avg size" but only "less than the > avg size", that is, drop the "half". Fix message. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
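Since the quoted condition compares the *sum* of the two smallest regions against the full average, only the message text needs to change. A simplified, runnable sketch of the corrected wording (not the actual HBase code; variable names mirror the quoted snippet):

```java
// Simplified sketch of the corrected debug message; the real fix only edits
// the string inside LOG.debug(...). Names mirror the quoted code.
public class NormalizerMessage {
    static String mergeMessage(String table, long smallest, long neighbor) {
        return "Table " + table + ", smallest region size: " + smallest
            + " and its smallest neighbor size: " + neighbor
            + ", less than the avg size, merging them"; // "half" dropped
    }

    public static void main(String[] args) {
        long smallestRegion = 10, smallestNeighbor = 15, avgRegionSize = 40;
        // The actual check: sum of the two smallest regions vs. the average.
        if (smallestRegion + smallestNeighbor < avgRegionSize) {
            System.out.println(mergeMessage("testtable", smallestRegion, smallestNeighbor));
        }
    }
}
```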
[jira] [Created] (HBASE-14556) Make prefetchOnOpen configurable for index and/or data blocks
Lars George created HBASE-14556: --- Summary: Make prefetchOnOpen configurable for index and/or data blocks Key: HBASE-14556 URL: https://issues.apache.org/jira/browse/HBASE-14556 Project: HBase Issue Type: Bug Components: BlockCache, regionserver Reporter: Lars George This came up in user discussions. It would be great to add an extra option to the {{CacheConfig}} that allows specifying which blocks are cached during region/file opening. This should allow setting {{BlockIndexOnly}}, {{BloomFilterOnly}}, {{AllIndexesOnly}}, {{DataOnly}}, and {{AllBlocks}}. For large datasets it is not viable to load all blocks into memory, but to speed up access it still makes sense to prefetch the index blocks (that is, the block index and Bloom filter blocks). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
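The proposed options could look like the following sketch. The enum and helper are purely illustrative and do not exist in {{CacheConfig}}; they only model the five scopes named in the issue:

```java
// Illustrative only: models the five prefetch scopes from the description.
enum PrefetchScope {
    BLOCK_INDEX_ONLY, BLOOM_FILTER_ONLY, ALL_INDEXES_ONLY, DATA_ONLY, ALL_BLOCKS
}

public class PrefetchDemo {
    static boolean shouldPrefetch(PrefetchScope scope, String blockType) {
        switch (scope) {
            case ALL_BLOCKS:        return true;
            case DATA_ONLY:         return blockType.equals("DATA");
            case BLOCK_INDEX_ONLY:  return blockType.equals("INDEX");
            case BLOOM_FILTER_ONLY: return blockType.equals("BLOOM");
            case ALL_INDEXES_ONLY:  return blockType.equals("INDEX") || blockType.equals("BLOOM");
        }
        return false;
    }

    public static void main(String[] args) {
        // With ALL_INDEXES_ONLY, index and Bloom blocks are prefetched on open,
        // while data blocks are loaded lazily on first access.
        System.out.println(shouldPrefetch(PrefetchScope.ALL_INDEXES_ONLY, "BLOOM")); // true
        System.out.println(shouldPrefetch(PrefetchScope.ALL_INDEXES_ONLY, "DATA"));  // false
    }
}
```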
[jira] [Commented] (HBASE-14287) Bootstrapping a cluster leaves temporary WAL directory laying around
[ https://issues.apache.org/jira/browse/HBASE-14287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14742414#comment-14742414 ] Lars George commented on HBASE-14287: - For quick reference, the missing change is this: {noformat} HRegion meta = HRegion.createHRegion(metaHRI, rd, c, - HTableDescriptor.META_TABLEDESC); + HTableDescriptor.META_TABLEDESC, null, true, true); setInfoFamilyCachingForMeta(true); {noformat} So the patch should be something like this: {noformat} diff --git a/hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java index bcf9ba0..e8b26e6 100644 --- a/hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java +++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java @@ -537,7 +537,8 @@ public class MasterFileSystem { HRegionInfo metaHRI = new HRegionInfo(HRegionInfo.FIRST_META_REGIONINFO); HTableDescriptor metaDescriptor = new FSTableDescriptors(c).get(TableName.META_TABLE_NAME); setInfoFamilyCachingForMeta(metaDescriptor, false); - HRegion meta = HRegion.createHRegion(metaHRI, rd, c, metaDescriptor); + HRegion meta = HRegion.createHRegion(metaHRI, rd, c, metaDescriptor, +null, true, true); setInfoFamilyCachingForMeta(metaDescriptor, true); HRegion.closeHRegion(meta); } catch (IOException e) { {noformat} > Bootstrapping a cluster leaves temporary WAL directory laying around > > > Key: HBASE-14287 > URL: https://issues.apache.org/jira/browse/HBASE-14287 > Project: HBase > Issue Type: Bug > Components: master, regionserver >Affects Versions: 1.0.2, 1.1.1 >Reporter: Lars George >Priority: Minor > > When a new cluster is started, it creates a temporary WAL as {{hbase:meta}} > is created during bootstrapping the system. Then this log is closed before > properly opened on a region server. The temp WAL file is scheduled for > removal, moved to oldWALs and eventually claimed. 
Issue is that the WAL > directory with the temp region is not removed. For example: > {noformat} > drwxr-xr-x - hadoop hadoop 0 2015-05-28 10:21 > /hbase/WALs/hregion-65589555 > {noformat} > The directory is empty and does not harm, but on the other hand it is not > needed anymore and should be removed. Cosmetic and good housekeeping. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14287) Bootstrapping a cluster leaves temporary WAL directory laying around
[ https://issues.apache.org/jira/browse/HBASE-14287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14742411#comment-14742411 ] Lars George commented on HBASE-14287: - Indeed! HBASE-12204 misses the updated line from HBASE-11982 and reverts it back. [~tedyu], would you mind fixing this with an addendum? {noformat} - setInfoFamilyCachingForMeta(false); - HRegion meta = HRegion.createHRegion(metaHRI, rd, c, - HTableDescriptor.META_TABLEDESC, null, true, true); - setInfoFamilyCachingForMeta(true); + HTableDescriptor metaDescriptor = new FSTableDescriptors(c).get(TableName.META_TABLE_NAME); + setInfoFamilyCachingForMeta(metaDescriptor, false); + HRegion meta = HRegion.createHRegion(metaHRI, rd, c, metaDescriptor); + setInfoFamilyCachingForMeta(metaDescriptor, true) {noformat} > Bootstrapping a cluster leaves temporary WAL directory laying around > > > Key: HBASE-14287 > URL: https://issues.apache.org/jira/browse/HBASE-14287 > Project: HBase > Issue Type: Bug > Components: master, regionserver >Affects Versions: 1.0.2, 1.1.1 >Reporter: Lars George >Priority: Minor > > When a new cluster is started, it creates a temporary WAL as {{hbase:meta}} > is created during bootstrapping the system. Then this log is closed before > properly opened on a region server. The temp WAL file is scheduled for > removal, moved to oldWALs and eventually claimed. Issue is that the WAL > directory with the temp region is not removed. For example: > {noformat} > drwxr-xr-x - hadoop hadoop 0 2015-05-28 10:21 > /hbase/WALs/hregion-65589555 > {noformat} > The directory is empty and does not harm, but on the other hand it is not > needed anymore and should be removed. Cosmetic and good housekeeping. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-14409) Clarify use of hbase.hstore.compaction.max.size in ExploringCompactionPolicy
Lars George created HBASE-14409: --- Summary: Clarify use of hbase.hstore.compaction.max.size in ExploringCompactionPolicy Key: HBASE-14409 URL: https://issues.apache.org/jira/browse/HBASE-14409 Project: HBase Issue Type: Bug Components: Compaction Affects Versions: 1.1.2, 1.0.2 Reporter: Lars George Assignee: stack As discussed in https://issues.apache.org/jira/browse/HBASE-7842: Why is the {{ExploringCompactionPolicy}} overloading the {{hbase.hstore.compaction.max.size}} parameter, which is used in the original ratio-based policy _just_ to exclude store files that are larger than this threshold? The ECP does the same, but later on uses the same threshold (if set) to drop a possible selection when the sum of all store files in the selection exceeds this limit. Why? Here is the code: {code} if (size > comConf.getMaxCompactSize()) { continue; } {code} The ref guide says this: {noformat} * Do size-based sanity checks against each StoreFile in this set of StoreFiles. ** If the size of this StoreFile is larger than `hbase.hstore.compaction.max.size`, take it out of consideration. ** If the size is greater than or equal to `hbase.hstore.compaction.min.size`, sanity-check it against the file-based ratio to see whether it is too large to be considered. {noformat} This seems wrong, no? It does not do this per store file, but for the current selection candidate. It still speaks of the max size key, but in the traditional sense, i.e. eliminating single store files that exceed the limit. That is not what the code does at this spot, though. We should either remove that check, since larger files are already removed in {{selectCompaction()}} of the base class, or we should see what was meant to happen here and clarify/fix the code and/or description. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
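The difference between the two readings can be made concrete with a small sketch (hypothetical helper methods, not HBase code): the ref guide describes a per-file check, while the quoted ECP code effectively applies the threshold to the sum of the selected files.

```java
import java.util.List;

// Hypothetical helpers contrasting the two interpretations of
// hbase.hstore.compaction.max.size discussed in the issue.
public class MaxSizeCheckDemo {
    static boolean anyFileExceedsMax(List<Long> files, long maxCompactSize) {
        // Per-file exclusion, as the ref guide describes it.
        return files.stream().anyMatch(size -> size > maxCompactSize);
    }

    static boolean selectionExceedsMax(List<Long> selection, long maxCompactSize) {
        // Selection-sum check, as the quoted ECP snippet actually behaves.
        long sum = 0;
        for (long size : selection) sum += size;
        return sum > maxCompactSize;
    }

    public static void main(String[] args) {
        List<Long> files = List.of(40L, 50L, 60L);
        System.out.println(anyFileExceedsMax(files, 100L));   // false: no single file over 100
        System.out.println(selectionExceedsMax(files, 100L)); // true: 40+50+60 = 150 > 100
    }
}
```

The same selection thus passes one check and fails the other, which is exactly the documentation/behaviour mismatch the issue asks to clarify.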
[jira] [Commented] (HBASE-7842) Add compaction policy that explores more storefile groups
[ https://issues.apache.org/jira/browse/HBASE-7842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14739407#comment-14739407 ] Lars George commented on HBASE-7842: Hey [~eclark], could you help me understand why the {{ExploringCompactionPolicy}} is overloading the {{hbase.hstore.compaction.max.size}} parameter, which is used in the original ratio-based policy _just_ to exclude store files that are larger than this threshold? The ECP does the same, but later on uses the same threshold (if set) to drop a possible selection when the sum of all store files in the selection exceeds this limit. Why? Here is the code: {code} if (size > comConf.getMaxCompactSize()) { continue; } {code} The ref guide says this: {noformat} * Do size-based sanity checks against each StoreFile in this set of StoreFiles. ** If the size of this StoreFile is larger than `hbase.hstore.compaction.max.size`, take it out of consideration. ** If the size is greater than or equal to `hbase.hstore.compaction.min.size`, sanity-check it against the file-based ratio to see whether it is too large to be considered. {noformat} This seems wrong, no? It does not do this per store file, but for the current selection candidate. It still speaks of the max size key, but in the traditional sense, i.e. eliminating single store files that exceed the limit. That is not what the code does at this spot, though. Please advise. 
> Add compaction policy that explores more storefile groups > - > > Key: HBASE-7842 > URL: https://issues.apache.org/jira/browse/HBASE-7842 > Project: HBase > Issue Type: New Feature > Components: Compaction >Reporter: Elliott Clark >Assignee: Elliott Clark > Fix For: 0.98.0, 0.95.1 > > Attachments: HBASE-7842-0.patch, HBASE-7842-2.patch, > HBASE-7842-3.patch, HBASE-7842-4.patch, HBASE-7842-5.patch, > HBASE-7842-6.patch, HBASE-7842-7.patch, HBASE-7842-ADD.patch > > > Some workloads that are not as stable can have compactions that are too large > or too small using the current storefile selection algorithm. > Currently: > * Find the first file that Size(fi) <= Sum(0, i-1, FileSize(fx)) > * Ensure that there are the min number of files (if there aren't then bail > out) > * If there are too many files keep the larger ones. > I would propose something like: > * Find all sets of storefiles where every file satisfies > ** FileSize(fi) <= Sum(0, i-1, FileSize(fx)) > ** Num files in set =< max > ** Num Files in set >= min > * Then pick the set of files that maximizes ((# storefiles in set) / > Sum(FileSize(fx))) > The thinking is that the above algorithm is pretty easy reason about, all > files satisfy the ratio, and should rewrite the least amount of data to get > the biggest impact in seeks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14730821#comment-14730821 ] Lars George commented on HBASE-13103: - Added linked issue HBASE-14367 for the shell work. It is an easy one but needs a little insight into how to do this best. [~mantonov], do you want to take a stab? > [ergonomics] add region size balancing as a feature of master > - > > Key: HBASE-13103 > URL: https://issues.apache.org/jira/browse/HBASE-13103 > Project: HBase > Issue Type: Improvement > Components: Balancer, Usability >Reporter: Nick Dimiduk >Assignee: Mikhail Antonov > Fix For: 2.0.0, 1.2.0, 1.3.0 > > Attachments: HBASE-13103-branch-1.v3.patch, HBASE-13103-v0.patch, > HBASE-13103-v1.patch, HBASE-13103-v2.patch, HBASE-13103-v3.patch > > > Often enough, folks misjudge split points or otherwise end up with a > suboptimal number of regions. We should have an automated, reliable way to > "reshape" or "balance" a table's region boundaries. This would be for tables > that contain existing data. This might look like: > {noformat} > Admin#reshapeTable(TableName, int numSplits); > {noformat} > or from the shell: > {noformat} > > reshape TABLE, numSplits > {noformat} > Better still would be to have a maintenance process, similar to the existing > Balancer that runs AssignmentManager on an interval, to run the above > "reshape" operation on an interval. That way, the cluster will automatically > self-correct toward a desirable state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-14367) Add normalization support to shell
Lars George created HBASE-14367: --- Summary: Add normalization support to shell Key: HBASE-14367 URL: https://issues.apache.org/jira/browse/HBASE-14367 Project: HBase Issue Type: Bug Components: shell Affects Versions: 1.1.2 Reporter: Lars George Fix For: 2.0.0, 1.2.0, 1.3.0 https://issues.apache.org/jira/browse/HBASE-13103 adds support for setting a normalization flag per {{HTableDescriptor}}, along with the server-side chore to do the work. What is lacking is a way to easily set this from the shell; right now you need to use the Java API to modify the descriptor. This issue is to add the flag as a known attribute key and/or other means to toggle it per table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14725094#comment-14725094 ] Lars George commented on HBASE-13103: - Nope: {noformat} hbase(main):028:0> alter 'testtable', {NORMALIZATION_ENABLED => 'true'} NameError: uninitialized constant NORMALIZATION_ENABLED {noformat} And even if so, it requires knowledge about the internal key name (says in the Java doc for the key in HTD). > [ergonomics] add region size balancing as a feature of master > - > > Key: HBASE-13103 > URL: https://issues.apache.org/jira/browse/HBASE-13103 > Project: HBase > Issue Type: Improvement > Components: Balancer, Usability >Reporter: Nick Dimiduk >Assignee: Mikhail Antonov > Fix For: 2.0.0, 1.2.0, 1.3.0 > > Attachments: HBASE-13103-branch-1.v3.patch, HBASE-13103-v0.patch, > HBASE-13103-v1.patch, HBASE-13103-v2.patch, HBASE-13103-v3.patch > > > Often enough, folks miss-judge split points or otherwise end up with a > suboptimal number of regions. We should have an automated, reliable way to > "reshape" or "balance" a table's region boundaries. This would be for tables > that contain existing data. This might look like: > {noformat} > Admin#reshapeTable(TableName, int numSplits); > {noformat} > or from the shell: > {noformat} > > reshape TABLE, numSplits > {noformat} > Better still would be to have a maintenance process, similar to the existing > Balancer that runs AssignmentManager on an interval, to run the above > "reshape" operation on an interval. That way, the cluster will automatically > self-correct toward a desirable state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14725107#comment-14725107 ] Lars George commented on HBASE-13103: - You may be able to force it like so: {noformat} hbase(main):035:0> alter 'normtable', {CONFIGURATION => {'NORMALIZATION_ENABLED' => 'true'}} Updating all regions with the new schema... 1/1 regions updated. Done. {noformat} but that is error-prone as you could easily misspell the arbitrary key string. I vote for proper shell support. > [ergonomics] add region size balancing as a feature of master > - > > Key: HBASE-13103 > URL: https://issues.apache.org/jira/browse/HBASE-13103 > Project: HBase > Issue Type: Improvement > Components: Balancer, Usability >Reporter: Nick Dimiduk >Assignee: Mikhail Antonov > Fix For: 2.0.0, 1.2.0, 1.3.0 > > Attachments: HBASE-13103-branch-1.v3.patch, HBASE-13103-v0.patch, > HBASE-13103-v1.patch, HBASE-13103-v2.patch, HBASE-13103-v3.patch > > > Often enough, folks miss-judge split points or otherwise end up with a > suboptimal number of regions. We should have an automated, reliable way to > "reshape" or "balance" a table's region boundaries. This would be for tables > that contain existing data. This might look like: > {noformat} > Admin#reshapeTable(TableName, int numSplits); > {noformat} > or from the shell: > {noformat} > > reshape TABLE, numSplits > {noformat} > Better still would be to have a maintenance process, similar to the existing > Balancer that runs AssignmentManager on an interval, to run the above > "reshape" operation on an interval. That way, the cluster will automatically > self-correct toward a desirable state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-14342) Recursive call in RegionMergeTransactionImpl.getJournal()
Lars George created HBASE-14342: --- Summary: Recursive call in RegionMergeTransactionImpl.getJournal() Key: HBASE-14342 URL: https://issues.apache.org/jira/browse/HBASE-14342 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 1.1.1 Reporter: Lars George Fix For: 1.2.0, 1.1.2 HBASE-12975 in its branch-1 patch (https://issues.apache.org/jira/secure/attachment/12708578/HBASE-12975-branch-1.patch) introduced a recursive call for {{getJournal()}}. It needs to return just the {{journal}} variable, like the master patch does.
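The bug is easy to demonstrate in isolation. The sketch below is plain Java with a hypothetical stand-in class, not the actual RegionMergeTransactionImpl: a getter that calls itself recurses until a StackOverflowError, while the fix is simply returning the field, as the master patch does.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for RegionMergeTransactionImpl, reduced to the getter.
public class JournalSketch {
    private final List<String> journal = new ArrayList<>();

    // Buggy variant, as in the branch-1 patch: the getter calls itself,
    // so any caller recurses until the stack is exhausted.
    public List<String> getJournalBuggy() {
        return getJournalBuggy();
    }

    // Fixed variant, matching the master patch: just return the field.
    public List<String> getJournal() {
        return journal;
    }

    // Demonstrates that the buggy getter blows the stack.
    public static boolean buggyVariantOverflows() {
        try {
            new JournalSketch().getJournalBuggy();
            return false;
        } catch (StackOverflowError expected) {
            return true;
        }
    }
}
```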
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14723616#comment-14723616 ] Lars George commented on HBASE-13103: - Is there follow-up work or a JIRA tracking adding this to the shell? Is using the Java API the only way to enable this per table?
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14723619#comment-14723619 ] Lars George commented on HBASE-13103: - Sorry, above was for [~mantonov] I guess. :) Please advise.
[jira] [Commented] (HBASE-10003) OnlineMerge should be extended to allow bulk merging
[ https://issues.apache.org/jira/browse/HBASE-10003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14723238#comment-14723238 ] Lars George commented on HBASE-10003: - Would HBASE-13103 supersede this issue? > OnlineMerge should be extended to allow bulk merging > > > Key: HBASE-10003 > URL: https://issues.apache.org/jira/browse/HBASE-10003 > Project: HBase > Issue Type: Improvement > Components: Admin, Usability >Affects Versions: 0.98.0, 0.94.6 >Reporter: Clint Heath >Assignee: takeshi.miao >Priority: Critical > Labels: beginner > > Now that we have Online Merge capabilities, the function of that tool should > be extended to make it much easier for HBase operations folks to use. > Currently it is a very manual process (one fraught with confusion) to hand > pick two regions that are contiguous to each other in the META table such > that the admin can manually request those two regions to be merged. > In the real world, when admins find themselves wanting to merge regions, it's > usually because they've greatly increased their hbase.hregion.max.filesize > property and they have way too many regions on a table and want to reduce the > region count for that entire table quickly and easily. > Why can't the OnlineMerge command just take a "-max" argument along with a > table name which tells it to go ahead and merge all regions of said table > until the resulting regions are all of max size? This takes the voodoo out > of the process and quickly gets the admin what they're looking for. > As part of this improvement, I also suggest a "-regioncount" argument for > OnlineMerge, which will attempt to reduce the table's region count down to > the specified #. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-6613) Automatically merge empty regions
[ https://issues.apache.org/jira/browse/HBASE-6613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14723214#comment-14723214 ] Lars George commented on HBASE-6613: Should we close this as superseded by HBASE-13103? > Automatically merge empty regions > - > > Key: HBASE-6613 > URL: https://issues.apache.org/jira/browse/HBASE-6613 > Project: HBase > Issue Type: New Feature > Components: master, regionserver, util >Affects Versions: 0.94.1 >Reporter: Ionut Ignatescu > > Consider a use case where row keys have an increasing value (time-series data > for example) and data retention is set to a concrete value (60 days for > example). > After a period of time, longer than the retention, empty regions will appear. > This will cause high memory use on region servers. > In my opinion, region merges could be part of major compaction, or another > tool should be provided. From my understanding, it is possible to merge 2 > empty regions without taking the table offline, but it's not possible to merge one > empty region with a non-empty region without closing/unassigning these regions.
[jira] [Created] (HBASE-14343) Fix debug message in SimpleRegionNormalizer for small regions
Lars George created HBASE-14343: --- Summary: Fix debug message in SimpleRegionNormalizer for small regions Key: HBASE-14343 URL: https://issues.apache.org/jira/browse/HBASE-14343 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 1.1.1 Reporter: Lars George Priority: Trivial The {{SimpleRegionNormalizer}} has this: {code} if ((smallestRegion.getSecond() + smallestNeighborOfSmallestRegion.getSecond() < avgRegionSize)) { LOG.debug("Table " + table + ", smallest region size: " + smallestRegion.getSecond() + " and its smallest neighbor size: " + smallestNeighborOfSmallestRegion.getSecond() + ", less than half the avg size, merging them"); {code} It does *not* check for "less than half the avg size" but only "less than the avg size", that is, drop the "half". Fix message. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
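To make the mismatch concrete, here is a minimal plain-Java sketch (class and method names are hypothetical, not the actual SimpleRegionNormalizer code) of the merge condition and a corrected message: the check compares the combined size against the full average, so the log text should say "less than the avg size".

```java
// Illustrative sketch of the normalizer's merge condition and a corrected
// debug message; names are hypothetical stand-ins.
public class NormalizerMessageSketch {

    // Merge when the two smallest adjacent regions together are smaller
    // than the average region size (the full average, not half of it).
    public static boolean shouldMerge(long smallest, long neighbor, long avgRegionSize) {
        return smallest + neighbor < avgRegionSize;
    }

    // Corrected wording: "less than the avg size", with the "half" dropped.
    public static String mergeMessage(String table, long smallest, long neighbor) {
        return "Table " + table + ", smallest region size: " + smallest
            + " and its smallest neighbor size: " + neighbor
            + ", less than the avg size, merging them";
    }
}
```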
[jira] [Commented] (HBASE-14343) Fix debug message in SimpleRegionNormalizer for small regions
[ https://issues.apache.org/jira/browse/HBASE-14343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14723374#comment-14723374 ] Lars George commented on HBASE-14343: - {code} if (tableRegions == null || tableRegions.size() < 3) { LOG.debug("Table " + table + " has " + tableRegions.size() + " regions, required min number" + " of regions for normalizer to run is 3, not running normalizer"); return EmptyNormalizationPlan.getInstance(); } {code} Also, the debug message above will throw an NPE if {{tableRegions}} is {{null}}, even though the condition explicitly checks for that. The debug message construction is missing the extra null check.
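A null-safe way to build that log line is to guard the size() call; the sketch below is plain Java with hypothetical names, not a patch against the actual class.

```java
import java.util.List;

// Illustrative null-safe construction of the normalizer's skip message.
public class TableRegionsCheckSketch {

    public static boolean shouldSkip(List<?> tableRegions) {
        return tableRegions == null || tableRegions.size() < 3;
    }

    // Guard the size() call so that logging the skip reason cannot NPE
    // when tableRegions is null, which the condition explicitly allows.
    public static String skipMessage(String table, List<?> tableRegions) {
        int count = (tableRegions == null) ? 0 : tableRegions.size();
        return "Table " + table + " has " + count
            + " regions, required min number of regions for normalizer to run is 3,"
            + " not running normalizer";
    }
}
```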
[jira] [Commented] (HBASE-14286) Fix spelling error in safetyBumper parameter in WALSplitter
[ https://issues.apache.org/jira/browse/HBASE-14286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14711763#comment-14711763 ] Lars George commented on HBASE-14286: - lgtm, +1 Fix spelling error in safetyBumper parameter in WALSplitter - Key: HBASE-14286 URL: https://issues.apache.org/jira/browse/HBASE-14286 Project: HBase Issue Type: Bug Affects Versions: 2.0.0, 1.2.0 Reporter: Lars George Assignee: Gabor Liptak Priority: Trivial Attachments: HBASE-14286.1.patch In {{WALSplitter}} we have this code: {code} public static long writeRegionSequenceIdFile(final FileSystem fs, final Path regiondir, long newSeqId, long saftyBumper) throws IOException { {code} We should fix the parameter name to be {{safetyBumper}}. Same for the JavaDoc above the method.
[jira] [Commented] (HBASE-14273) Rename MVCC to MVCC: From MultiVersionConsistencyControl to MultiVersionConcurrencyControl
[ https://issues.apache.org/jira/browse/HBASE-14273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14706559#comment-14706559 ] Lars George commented on HBASE-14273: - What versions should we push this into? I'd suggest the unreleased major and minor only, i.e. 1.2 and 2.0 branches. I mean it does not hurt anyone else, and no one has really questioned it. It is internal too, so no API impact. Right? Rename MVCC to MVCC: From MultiVersionConsistencyControl to MultiVersionConcurrencyControl -- Key: HBASE-14273 URL: https://issues.apache.org/jira/browse/HBASE-14273 Project: HBase Issue Type: Bug Reporter: stack Assignee: Lars Francke Labels: beginner Attachments: HBASE-14273.patch [~larsgeorge] noticed that our MVCC class has Consistency as the first 'C' when it should be 'Concurrency'. The issue that named this class, HBASE-4544 talks about 'Concurrency' but then it went in as Consistency (Why has no one noticed this before now? Thanks [~larsgeorge]) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14255) Simplify Cell creation post 1.0
[ https://issues.apache.org/jira/browse/HBASE-14255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14707367#comment-14707367 ] Lars George commented on HBASE-14255: - Hmmm, thanks [~ndimiduk], I looked there and was not able to do what I had tried. I am at a loss now to replicate it. Let me check tomorrow, and if I cannot reproduce it I will close this issue. Sorry for the (possible) noise. Simplify Cell creation post 1.0 --- Key: HBASE-14255 URL: https://issues.apache.org/jira/browse/HBASE-14255 Project: HBase Issue Type: Improvement Components: Client Affects Versions: 1.0.0, 2.0.0 Reporter: Lars George Priority: Critical After the switch to the new Cell based client API, and making KeyValue private (but especially as soon as DBB backed Cells land) it is rather difficult to create a {{Cell}} instance. I am using this now: {code} @Override public void postGetOp(ObserverContext<RegionCoprocessorEnvironment> e, Get get, List<Cell> results) throws IOException { Put put = new Put(get.getRow()); put.addColumn(get.getRow(), FIXED_COLUMN, Bytes.toBytes(counter.get())); CellScanner scanner = put.cellScanner(); scanner.advance(); Cell cell = scanner.current(); LOG.debug("Adding fake cell: " + cell); results.add(cell); } {code} That is, I have to create a {{Put}} instance to add a cell and then retrieve its instance. The {{KeyValue}} methods are private now and should not be used. Create a CellBuilder helper?
[jira] [Commented] (HBASE-14256) Flush task message may be confusing when region is recovered
[ https://issues.apache.org/jira/browse/HBASE-14256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14707343#comment-14707343 ] Lars George commented on HBASE-14256: - Read the patch, looks good to me, +1. Thanks for also fixing the typo, I noticed that too but forgot to mention it. :) Flush task message may be confusing when region is recovered Key: HBASE-14256 URL: https://issues.apache.org/jira/browse/HBASE-14256 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 2.0.0, 1.2.0 Reporter: Lars George Assignee: Gabor Liptak Labels: beginner Attachments: HBASE-14256.1.patch In {{HRegion.setRecovering()}} we have this code: {code} // force a flush only if region replication is set up for this region. Otherwise no need. boolean forceFlush = getTableDesc().getRegionReplication() > 1; // force a flush first MonitoredTask status = TaskMonitor.get().createStatus("Flushing region " + this + " because recovery is finished"); try { if (forceFlush) { internalFlushcache(status); } {code} So we only optionally force a flush after the recovery of a region, but the message is always set to "Flushing region ...", which might be confusing. We should change the message based on {{forceFlush}}.
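One way to fix it is to pick the status message based on {{forceFlush}}; the sketch below is plain Java with made-up message text for the non-flush case, not the actual HRegion code.

```java
// Illustrative choice of the MonitoredTask message based on forceFlush;
// the non-flush wording is a suggestion, not taken from HBase.
public class RecoveryStatusSketch {

    public static String statusMessage(String region, boolean forceFlush) {
        return forceFlush
            ? "Flushing region " + region + " because recovery is finished"
            : "Finishing recovery of region " + region + " (no flush needed)";
    }
}
```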
[jira] [Created] (HBASE-14287) Bootstrapping a cluster leaves temporary WAL directory laying around
Lars George created HBASE-14287: --- Summary: Bootstrapping a cluster leaves temporary WAL directory laying around Key: HBASE-14287 URL: https://issues.apache.org/jira/browse/HBASE-14287 Project: HBase Issue Type: Bug Components: master, regionserver Affects Versions: 1.1.1, 1.0.2 Reporter: Lars George Priority: Minor When a new cluster is started, it creates a temporary WAL as {{hbase:meta}} is created while bootstrapping the system. This log is then closed before being properly opened on a region server. The temp WAL file is scheduled for removal, moved to oldWALs, and eventually claimed. The issue is that the WAL directory with the temp region is not removed. For example: {noformat} drwxr-xr-x - hadoop hadoop 0 2015-05-28 10:21 /hbase/WALs/hregion-65589555 {noformat} The directory is empty and does no harm, but on the other hand it is not needed anymore and should be removed. Cosmetic, and good housekeeping.
[jira] [Created] (HBASE-14286) Fix spelling error in safetyBumper parameter in WALSplitter
Lars George created HBASE-14286: --- Summary: Fix spelling error in safetyBumper parameter in WALSplitter Key: HBASE-14286 URL: https://issues.apache.org/jira/browse/HBASE-14286 Project: HBase Issue Type: Bug Affects Versions: 2.0.0, 1.2.0 Reporter: Lars George Priority: Trivial In {{WALSplitter}} we have this code: {code} public static long writeRegionSequenceIdFile(final FileSystem fs, final Path regiondir, long newSeqId, long saftyBumper) throws IOException { {code} We should fix the parameter name to be {{safetyBumper}}. Same for the JavaDoc above the method.
[jira] [Created] (HBASE-14285) Improve local log servlet to show last n-KB etc.
Lars George created HBASE-14285: --- Summary: Improve local log servlet to show last n-KB etc. Key: HBASE-14285 URL: https://issues.apache.org/jira/browse/HBASE-14285 Project: HBase Issue Type: Improvement Components: master, regionserver, REST, Thrift Affects Versions: 2.0.0, 1.2.0 Reporter: Lars George Most of the time servers have very large logs lying around, and displaying them with the current log servlet is not useful, as it downloads the entire log to the web browser. YARN has a better version, which we could use, or we could simply implement our own. It would be nice if it could paginate, to be able to go through a log from the front or the end in pages.
[jira] [Commented] (HBASE-14224) Fix coprocessor handling of duplicate classes
[ https://issues.apache.org/jira/browse/HBASE-14224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14706295#comment-14706295 ] Lars George commented on HBASE-14224: - Didn't try it (yet) but read the patch. Looks great, I like the tests, I was going to do/say the same. +1 from reading the patch. Fix coprocessor handling of duplicate classes - Key: HBASE-14224 URL: https://issues.apache.org/jira/browse/HBASE-14224 Project: HBase Issue Type: Bug Components: Coprocessors Affects Versions: 2.0.0, 1.0.1, 1.2.0, 1.1.1 Reporter: Lars George Assignee: stack Priority: Critical Attachments: 14224.txt, 14224v2.txt, 14224v3.txt, 14224v4.txt, 14224v5.txt, problem.pdf While discussing with [~misty] over on HBASE-13907 we noticed some inconsistency when copros are loaded. Sometimes you can load them more than once, sometimes you can not. Need to consolidate.
[jira] [Created] (HBASE-14256) Flush task message may be confusing when region is recovered
Lars George created HBASE-14256: --- Summary: Flush task message may be confusing when region is recovered Key: HBASE-14256 URL: https://issues.apache.org/jira/browse/HBASE-14256 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 2.0.0, 1.2.0 Reporter: Lars George In {{HRegion.setRecovering()}} we have this code: {code} // force a flush only if region replication is set up for this region. Otherwise no need. boolean forceFlush = getTableDesc().getRegionReplication() > 1; // force a flush first MonitoredTask status = TaskMonitor.get().createStatus("Flushing region " + this + " because recovery is finished"); try { if (forceFlush) { internalFlushcache(status); } {code} So we only optionally force a flush after the recovery of a region, but the message is always set to "Flushing region ...", which might be confusing. We should change the message based on {{forceFlush}}.
[jira] [Created] (HBASE-14255) Simplify Cell creation post 1.0
Lars George created HBASE-14255: --- Summary: Simplify Cell creation post 1.0 Key: HBASE-14255 URL: https://issues.apache.org/jira/browse/HBASE-14255 Project: HBase Issue Type: Improvement Components: Client Affects Versions: 1.0.0, 2.0.0 Reporter: Lars George After the switch to the new Cell based client API, and making KeyValue private (but especially as soon as DBB backed Cells land) it is rather difficult to create a {{Cell}} instance. I am using this now: {code} @Override public void postGetOp(ObserverContext<RegionCoprocessorEnvironment> e, Get get, List<Cell> results) throws IOException { Put put = new Put(get.getRow()); put.addColumn(get.getRow(), FIXED_COLUMN, Bytes.toBytes(counter.get())); CellScanner scanner = put.cellScanner(); scanner.advance(); Cell cell = scanner.current(); LOG.debug("Adding fake cell: " + cell); results.add(cell); } {code} That is, I have to create a {{Put}} instance to add a cell and then retrieve its instance. The {{KeyValue}} methods are private now and should not be used. Create a CellBuilder helper?
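For illustration, a helper along the lines of the proposed CellBuilder could look like the following plain-Java sketch; the field set and names are hypothetical and do not reflect the HBase Cell contract.

```java
// Hypothetical minimal builder, sketching the proposed "CellBuilder helper";
// a real one would produce an org.apache.hadoop.hbase.Cell.
public class CellSketch {
    public final byte[] row, family, qualifier, value;

    private CellSketch(byte[] row, byte[] family, byte[] qualifier, byte[] value) {
        this.row = row;
        this.family = family;
        this.qualifier = qualifier;
        this.value = value;
    }

    public static final class Builder {
        private byte[] row, family, qualifier, value;

        public Builder row(byte[] r) { this.row = r; return this; }
        public Builder family(byte[] f) { this.family = f; return this; }
        public Builder qualifier(byte[] q) { this.qualifier = q; return this; }
        public Builder value(byte[] v) { this.value = v; return this; }

        public CellSketch build() { return new CellSketch(row, family, qualifier, value); }
    }
}
```

With a helper of this shape, a coprocessor could build its fake cell directly instead of routing it through a throwaway {{Put}} and a {{CellScanner}}.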
[jira] [Created] (HBASE-14257) Periodic flusher only handles hbase:meta, not other system tables
Lars George created HBASE-14257: --- Summary: Periodic flusher only handles hbase:meta, not other system tables Key: HBASE-14257 URL: https://issues.apache.org/jira/browse/HBASE-14257 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 2.0.0, 1.2.0 Reporter: Lars George In {{HRegion.shouldFlush}} we have {code} long modifiedFlushCheckInterval = flushCheckInterval; if (getRegionInfo().isMetaRegion() && getRegionInfo().getReplicaId() == HRegionInfo.DEFAULT_REPLICA_ID) { modifiedFlushCheckInterval = META_CACHE_FLUSH_INTERVAL; } {code} That method is called by the {{PeriodicMemstoreFlusher}} thread, and prefers only {{hbase:meta}} for faster flushing. It should be doing the same for other system tables. I suggest to use {{HRI.isSystemTable()}}.
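The suggested change amounts to widening the meta-only check to any system table; a plain-Java sketch of the interval selection (constants and names are illustrative, not the actual HRegion code) could be:

```java
// Illustrative interval selection: any system table on the default replica
// gets the shorter flush-check interval, not just hbase:meta.
public class FlushIntervalSketch {
    static final long FLUSH_CHECK_INTERVAL = 3600000L;       // made-up default
    static final long SYSTEM_CACHE_FLUSH_INTERVAL = 300000L; // made-up default

    public static long flushCheckInterval(boolean isSystemTable, boolean isDefaultReplica) {
        return (isSystemTable && isDefaultReplica)
            ? SYSTEM_CACHE_FLUSH_INTERVAL
            : FLUSH_CHECK_INTERVAL;
    }
}
```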
[jira] [Commented] (HBASE-14224) Fix coprocessor handling of duplicate classes
[ https://issues.apache.org/jira/browse/HBASE-14224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14698757#comment-14698757 ] Lars George commented on HBASE-14224: - Yes, and I hope I have described that completely in the attached PDF (or the linked note). If not, please add it here. Fix coprocessor handling of duplicate classes - Key: HBASE-14224 URL: https://issues.apache.org/jira/browse/HBASE-14224 Project: HBase Issue Type: Bug Components: Coprocessors Affects Versions: 2.0.0, 1.0.1, 1.2.0, 1.1.1 Reporter: Lars George Priority: Critical Attachments: problem.pdf While discussing with [~misty] over on HBASE-13907 we noticed some inconsistency when copros are loaded. Sometimes you can load them more than once, sometimes you can not. Need to consolidate.
[jira] [Commented] (HBASE-14224) Fix coprocessor handling of duplicate classes
[ https://issues.apache.org/jira/browse/HBASE-14224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14698755#comment-14698755 ] Lars George commented on HBASE-14224: - Whoops, here is the online version, if that helps: https://www.evernote.com/l/ACFO6OrjlNNHeZDPxhubGw8uDUSwAaOgxQU Fix coprocessor handling of duplicate classes - Key: HBASE-14224 URL: https://issues.apache.org/jira/browse/HBASE-14224 Project: HBase Issue Type: Bug Components: Coprocessors Affects Versions: 2.0.0, 1.0.1, 1.2.0, 1.1.1 Reporter: Lars George Priority: Critical Attachments: problem.pdf While discussing with [~misty] over on HBASE-13907 we noticed some inconsistency when copros are loaded. Sometimes you can load them more than once, sometimes you can not. Need to consolidate.
[jira] [Created] (HBASE-14224) Fix coprocessor handling of duplicate classes
Lars George created HBASE-14224: --- Summary: Fix coprocessor handling of duplicate classes Key: HBASE-14224 URL: https://issues.apache.org/jira/browse/HBASE-14224 Project: HBase Issue Type: Bug Components: Coprocessors Affects Versions: 1.1.1, 1.0.1, 2.0.0, 1.2.0 Reporter: Lars George While discussing with [~misty] over on HBASE-13907 we noticed some inconsistency when copros are loaded. Sometimes you can load them more than once, sometimes you can not. Need to consolidate. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14224) Fix coprocessor handling of duplicate classes
[ https://issues.apache.org/jira/browse/HBASE-14224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14697183#comment-14697183 ] Lars George commented on HBASE-14224: - Easy to fix: in {{CoprocessorHost.loadSystemCoprocessors()}} change this {code} if (findCoprocessor(className) != null) { continue; } {code} to something like this: {code} if (findCoprocessor(className) != null || configured.contains(className)) { continue; } {code} and in the shell's {{admin.rb}} change the code in {{alter}} to use addCoprocessor - the whole finding of a new ID is already taken care of in that method anyway. So this also simplifies the code. Then check the other {{loadCoprocessors()}} methods to be safe too. Fix coprocessor handling of duplicate classes - Key: HBASE-14224 URL: https://issues.apache.org/jira/browse/HBASE-14224 Project: HBase Issue Type: Bug Components: Coprocessors Affects Versions: 2.0.0, 1.0.1, 1.2.0, 1.1.1 Reporter: Lars George Attachments: problem.pdf While discussing with [~misty] over on HBASE-13907 we noticed some inconsistency when copros are loaded. Sometimes you can load them more than once, sometimes you can not. Need to consolidate.
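The dedup logic of that guard can be sketched in isolation; the class below is plain Java with hypothetical names, mirroring the "already loaded or already configured" check rather than the actual CoprocessorHost code.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative duplicate guard: skip a coprocessor class that is either
// already loaded or already listed in the configuration.
public class CoprocessorDedupSketch {
    private final Set<String> loaded = new HashSet<>();

    public boolean tryLoad(String className, Set<String> configured) {
        if (loaded.contains(className) || configured.contains(className)) {
            return false; // duplicate, skip it
        }
        loaded.add(className);
        return true;
    }
}
```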
[jira] [Updated] (HBASE-14224) Fix coprocessor handling of duplicate classes
[ https://issues.apache.org/jira/browse/HBASE-14224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars George updated HBASE-14224: Attachment: problem.pdf Attached a PDF that describes the issue in detail. Fix coprocessor handling of duplicate classes - Key: HBASE-14224 URL: https://issues.apache.org/jira/browse/HBASE-14224 Project: HBase Issue Type: Bug Components: Coprocessors Affects Versions: 2.0.0, 1.0.1, 1.2.0, 1.1.1 Reporter: Lars George Attachments: problem.pdf While discussing with [~misty] over on HBASE-13907 we noticed some inconsistency when copros are loaded. Sometimes you can load them more than once, sometimes you can not. Need to consolidate. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13907) Document how to deploy a coprocessor
[ https://issues.apache.org/jira/browse/HBASE-13907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693552#comment-14693552 ] Lars George commented on HBASE-13907: - [~misty] So the open question #3 is, what if a CP was loaded using the site file, using custom config values for it, and then an admin tries to override them in the CLI? How would that be possible, the CP is already loaded, no? I tried once and did issue the same CP load command on the CLI, which added the same CP twice, just with a higher ID. So, if you have loaded a system CP using the site XML file, you can load another on the CLI, same class, but it will have lower priority. Document how to deploy a coprocessor Key: HBASE-13907 URL: https://issues.apache.org/jira/browse/HBASE-13907 Project: HBase Issue Type: Bug Components: documentation Reporter: Misty Stanley-Jones Assignee: Misty Stanley-Jones Attachments: HBASE-13907-1.patch, HBASE-13907-2.patch, HBASE-13907-v3.patch, HBASE-13907.patch Capture this information: Where are the dependencies located for these classes? Is there a path on HDFS or local disk where dependencies need to be placed so that each RegionServer has access to them? It is suggested to bundle them as a single jar so that the RS can load the whole jar and resolve dependencies. If you are not able to do that, you need to place the dependencies in the region servers' class path so that they are loaded during RS startup. Do either of these options work for you? Btw, you can load the coprocessors/filters into the path specified by hbase.dynamic.jars.dir [1], so that they are loaded dynamically by region servers when the class is accessed (or you can place them in the RS class path too, so that they are loaded during RS JVM startup). How would one deploy these using an automated system? (puppet/chef/ansible/etc) You can probably use these tools to automate shipping the jars to the above locations?
Tests our developers have done suggest that simply disabling a coprocessor, replacing the jar with a different version, and enabling the coprocessor again does not load the newest version. With that in mind, how does one know which version is currently deployed and enabled without resorting to parsing `hbase shell` output or restarting HBase? Actually this is a design issue with the current classloader. You can't reload a class in a JVM unless you delete all the current references to it. Since the current JVM (classloader) has a reference to it, you can't overwrite it unless you kill the JVM, which is equivalent to restarting it. So you still have the older class loaded in place. For this to work, the classloader design would have to change. If it works for you, you can rename the coprocessor class in the new version of the jar and the RS loads it properly. Where does logging go, and how does one access it? Does logging need to be configured in a certain way? Can you please specify which logging you are referring to? Where is a good location to place configuration files? Same as above, are these hbase configs or something else? If hbase configs, are these gateway configs/server side?