[jira] [Resolved] (HBASE-20951) Ratis LogService backed WALs

2022-06-13 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-20951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-20951.

Resolution: Later

> Ratis LogService backed WALs
> 
>
> Key: HBASE-20951
> URL: https://issues.apache.org/jira/browse/HBASE-20951
> Project: HBase
>  Issue Type: New Feature
>  Components: wal
>Reporter: Josh Elser
>Priority: Major
>
> Umbrella issue for the Ratis+WAL work:
> Design doc: 
> [https://docs.google.com/document/d/1Su5py_T5Ytfh9RoTTX2s20KbSJwBHVxbO7ge5ORqbCk/edit#|https://docs.google.com/document/d/1Su5py_T5Ytfh9RoTTX2s20KbSJwBHVxbO7ge5ORqbCk/edit]
> The (over-simplified) goal is to re-think the current WAL APIs we have now, 
> ensure that they are de-coupled from the notion of being backed by HDFS, swap 
> the current implementations over to the new API, and then wire up the Ratis 
> LogService to the new WAL API.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Resolved] (HBASE-27042) hboss doesn't compile against hadoop branch-3.3 now that s3guard is cut

2022-05-23 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-27042.

Hadoop Flags: Reviewed
Release Note: Adds support for Apache Hadoop 3.3.3 and removes S3Guard 
vestiges.
  Resolution: Fixed

Thanks Steve!

> hboss doesn't compile against hadoop branch-3.3 now that s3guard is cut
> ---
>
> Key: HBASE-27042
> URL: https://issues.apache.org/jira/browse/HBASE-27042
> Project: HBase
>  Issue Type: Bug
>  Components: hboss
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Fix For: hbase-filesystem-1.0.0-alpha2
>
>
> HBoss doesn't compile against hadoop builds containing HADOOP-17409, "remove 
> s3guard", as test setup tries to turn it off.
> there's no need for s3guard any more, so hboss can just avoid all settings 
> and expect it to be disabled (hadoop 3.3.3 or earlier) or removed (3.4+)
> (hboss version is 1.0.0-alpha2-SNAPSHOT)





[jira] [Created] (HBASE-27044) Serialized procedures which point to users from other Kerberos domains can prevent master startup

2022-05-16 Thread Josh Elser (Jira)
Josh Elser created HBASE-27044:
--

 Summary: Serialized procedures which point to users from other 
Kerberos domains can prevent master startup
 Key: HBASE-27044
 URL: https://issues.apache.org/jira/browse/HBASE-27044
 Project: HBase
  Issue Type: Bug
  Components: proc-v2
Reporter: Josh Elser


We ran into an interesting bug when test teams were running HBase against cloud 
storage without ensuring that the previous location was cleaned. This resulted 
in an hbase.rootdir that had:
 * A valid HBase MasterData Region
 * A valid hbase:meta
 * A valid collection of HBase tables
 * An empty ZooKeeper

Through the changes we've worked on previously, described in HBASE-24286, we 
were able to get everything _except_ the Procedures back online without 
issue. Parsing the existing procedures produced an interesting error:
{noformat}
java.lang.IllegalArgumentException: Illegal principal name 
hbase/wrong-hostname.domain@WRONG_REALM: 
org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: No 
rules applied to hbase/wrong-hostname.domain@WRONG_REALM
at org.apache.hadoop.security.User.(User.java:51)
at org.apache.hadoop.security.User.(User.java:43)
at 
org.apache.hadoop.security.UserGroupInformation.createRemoteUser(UserGroupInformation.java:1418)
at 
org.apache.hadoop.security.UserGroupInformation.createRemoteUser(UserGroupInformation.java:1402)
at 
org.apache.hadoop.hbase.master.procedure.MasterProcedureUtil.toUserInfo(MasterProcedureUtil.java:60)
at 
org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.deserializeStateData(ModifyTableProcedure.java:262)
at 
org.apache.hadoop.hbase.procedure2.ProcedureUtil.convertToProcedure(ProcedureUtil.java:294)
at 
org.apache.hadoop.hbase.procedure2.store.ProtoAndProcedure.getProcedure(ProtoAndProcedure.java:43)
at 
org.apache.hadoop.hbase.procedure2.store.InMemoryProcedureIterator.next(InMemoryProcedureIterator.java:90)
at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.loadProcedures(ProcedureExecutor.java:411)
at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$400(ProcedureExecutor.java:78)
at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor$2.load(ProcedureExecutor.java:339)
at 
org.apache.hadoop.hbase.procedure2.store.region.RegionProcedureStore.load(RegionProcedureStore.java:285)
at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.load(ProcedureExecutor.java:330)
at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.init(ProcedureExecutor.java:600)
at 
org.apache.hadoop.hbase.master.HMaster.createProcedureExecutor(HMaster.java:1581)
at 
org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:835)
at 
org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2205)
at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:514)
at java.lang.Thread.run(Thread.java:750) {noformat}
What's actually happening is that we are storing the {{User}} into the 
procedure and then relying on UserGroupInformation to parse the {{User}} 
protobuf into a UGI to get the "short" username.

When the serialized procedure (whether in the MasterData region or via PV2 
WAL files, I think) gets loaded, we end up needing the Hadoop auth_to_local 
configuration to be able to parse that Kerberos principal back to a name. 
However, Hadoop's KerberosName will only unwrap Kerberos principals which match 
the local Kerberos realm (defined by the krb5.conf's default_realm, 
[ref|https://github.com/frohoff/jdk8u-jdk/blob/master/src/share/classes/sun/security/krb5/Config.java#L978-L983]).
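To illustrate the failure mode, here is a small, self-contained sketch (hypothetical code, not Hadoop's actual KerberosName implementation; the class and method names are invented for illustration) that approximates the realm check: a principal from a realm other than the local default_realm gets no rule applied and is rejected.

```java
// Hypothetical sketch: approximates how Hadoop's KerberosName shortens a
// principal only when its realm matches the local default_realm and
// otherwise fails with "No rules applied to ...".
public class RealmCheckSketch {
    public static String toShortName(String principal, String defaultRealm) {
        int at = principal.lastIndexOf('@');
        if (at < 0) {
            return principal; // no realm component; already a short name
        }
        String realm = principal.substring(at + 1);
        if (!realm.equals(defaultRealm)) {
            // Mirrors the NoMatchingRule failure seen in the stack trace above
            throw new IllegalArgumentException("No rules applied to " + principal);
        }
        // Default rule: strip the realm and any host component
        String base = principal.substring(0, at);
        int slash = base.indexOf('/');
        return slash < 0 ? base : base.substring(0, slash);
    }

    public static void main(String[] args) {
        // Matching realm: shortens cleanly to "hbase"
        System.out.println(toShortName("hbase/host.example.com@EXAMPLE.COM", "EXAMPLE.COM"));
        // Foreign realm: rejected, just like the serialized procedure's owner
        try {
            toShortName("hbase/wrong-hostname.domain@WRONG_REALM", "EXAMPLE.COM");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```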

The interesting part is that we don't seem to ever use the user for anything 
_other_ than displaying the {{owner}} attribute for procedures on the HBase UI. 
There is a method in hbase-procedure which can filter procedures based on 
owner, but I didn't see any usages of that method.

Given the pushback against HBASE-24286, I assume that, for the same reasons, we 
would see pushback against fixing this issue. However, I wanted to call it out 
for posterity. The expectation of users is that HBase _should_ implicitly 
handle this case.





[jira] [Resolved] (HBASE-26588) Implement a migration tool to help users migrate SFT implementation for a large set of tables

2022-04-04 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-26588.

Resolution: Later

Closing since we have HBASE-26673. We can re-open this if we find a reason that 
HBASE-26673 is insufficient.

> Implement a migration tool to help users migrate SFT implementation for a 
> large set of tables
> -
>
> Key: HBASE-26588
> URL: https://issues.apache.org/jira/browse/HBASE-26588
> Project: HBase
>  Issue Type: Sub-task
>  Components: tooling
>Reporter: Duo Zhang
>Priority: Major
>
> It will be very useful for our users who deploy HBase on S3 like systems.





[jira] [Resolved] (HBASE-26767) Rest server should not use a large Header Cache.

2022-02-23 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-26767.

Hadoop Flags: Reviewed
  Resolution: Fixed

Pushed! Thanks for the great work, Sergey.

> Rest server should not use a large Header Cache.
> 
>
> Key: HBASE-26767
> URL: https://issues.apache.org/jira/browse/HBASE-26767
> Project: HBase
>  Issue Type: Bug
>  Components: REST
>Affects Versions: 2.4.9
>Reporter: Sergey Soldatov
>Assignee: Sergey Soldatov
>Priority: Major
> Fix For: 2.5.0, 3.0.0-alpha-3, 2.4.10
>
>
> In the RESTServer we set the HeaderCache size to DEFAULT_HTTP_MAX_HEADER_SIZE 
> (65536). That's not compatible with jetty-9.4.x because the cache size is 
> limited by Character.MAX_VALUE - 1  (65534) there. According to the Jetty 
> source code comments, it's possible to have a buffer overflow in the cache 
> for higher values, and that might lead to wrong/incomplete values returned 
> by the cache and subsequently incorrect header handling.  
> There are a couple of ways to fix it:
> 1. change the value of DEFAULT_HTTP_MAX_HEADER_SIZE to 65534
> 2. make header cache size configurable and set its size separately from the 
> header size. 
> I believe that the second would give us more flexibility.





[jira] [Resolved] (HBASE-26644) Spurious compaction failures with file tracker

2022-02-10 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-26644.

Resolution: Not A Problem

Yep, all good. I believe you fixed this in HBASE-26675

> Spurious compaction failures with file tracker
> --
>
> Key: HBASE-26644
> URL: https://issues.apache.org/jira/browse/HBASE-26644
> Project: HBase
>  Issue Type: Sub-task
>  Components: Compaction
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
>
> Noticed when running a basic {{{}hbase pe randomWrite{}}} that we'll see 
> compactions failing at various points.
> One example:
> {noformat}
> 2022-01-03 17:41:18,319 ERROR 
> [regionserver/localhost:16020-shortCompactions-0] 
> regionserver.CompactSplit(670): Compaction failed 
> region=TestTable,0004054490,1641249249856.2dc7251c6eceb660b9c7bb0b587db913.,
>  storeName=2dc7251c6eceb660b9c7bb0b587db913/info0,       priority=6, 
> startTime=1641249666161
> java.io.IOException: Root-level entries already added in single-level mode
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeSingleLevelIndex(HFileBlockIndex.java:1136)
>   at 
> org.apache.hadoop.hbase.io.hfile.CompoundBloomFilterWriter$MetaWriter.write(CompoundBloomFilterWriter.java:279)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileWriterImpl$1.writeToBlock(HFileWriterImpl.java:713)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.writeBlock(HFileBlock.java:1205)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileWriterImpl.close(HFileWriterImpl.java:660)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFileWriter.close(StoreFileWriter.java:377)
>   at 
> org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.commitWriter(DefaultCompactor.java:70)
>   at 
> org.apache.hadoop.hbase.regionserver.compactions.Compactor.compact(Compactor.java:386)
>   at 
> org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:62)
>   at 
> org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:125)
>   at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1141)
>   at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:2388)
>   at 
> org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.doCompaction(CompactSplit.java:654)
>   at 
> org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.run(CompactSplit.java:697)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)  {noformat}
> This isn't a super-critical issue because compactions will be retried 
> automatically and they appear to eventually succeed. However, when the max 
> storefiles limit is reached, this does cause ingest to hang (as I was seeing 
> with my modest configuration).
> We had seen a similar kind of problem in our testing when backporting to 
> HBase 2.4 (not upstream, as the decision was to not do this), which we 
> eventually tracked down to a bad merge-conflict resolution in the new HFile 
> Cleaner. However, initial investigation does not point to the same exact 
> problem.
> It seems that we have some kind of generic race condition. Would be good to 
> add more logging to catch this in the future (since we have two separate 
> instances of this category of bug already).





[jira] [Resolved] (HBASE-26655) Initial commit with basic functionality and example code

2022-01-20 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-26655.

Hadoop Flags: Reviewed
  Resolution: Fixed

> Initial commit with basic functionality and example code
> 
>
> Key: HBASE-26655
> URL: https://issues.apache.org/jira/browse/HBASE-26655
> Project: HBase
>  Issue Type: Sub-task
>  Components: security
>Reporter: Andor Molnar
>Assignee: Andor Molnar
>Priority: Major
> Fix For: HBASE-26553
>
>






[jira] [Resolved] (HBASE-26687) Account for HBASE-24500 in regionInfoMismatch tool

2022-01-19 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-26687.

Hadoop Flags: Reviewed
  Resolution: Fixed

Thanks for the speedy review, Peter!

> Account for HBASE-24500 in regionInfoMismatch tool
> --
>
> Key: HBASE-26687
> URL: https://issues.apache.org/jira/browse/HBASE-26687
> Project: HBase
>  Issue Type: Bug
>  Components: hbck2
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Minor
> Fix For: hbase-operator-tools-1.3.0
>
>
> Had a coworker try to use the RegionInfoMismatch tool I added in HBASE-26656. 
> Curiously, the tool failed on the sanity check I added.
> {noformat}
> Aborting: sanity-check failed on updated RegionInfo. Expected encoded region 
> name 736ee6186975de6967cd9e9e242423f0 but got 
> 323748c77dde5b05982df0285b013232.
> Incorrectly created RegionInfo was: {ENCODED => 
> 323748c77dde5b05982df0285b013232, NAME => 
> 'test4,,1642405560420_0002.323748c77dde5b05982df0285b013232.', STARTKEY => 
> '', ENDKEY => ''}
> {noformat}
> I couldn't understand why the tool wasn't working until I hooked up a 
> debugger and realized that the problem wasn't in my code :). The version of 
> HBase on the system did not have the fix from HBASE-24500 included which 
> meant that I was hitting the same "strange behavior", as Duo put it, in the 
> RegionInfoBuilder "copy constructor".
> While the versions of HBase which do not have this fix are EOL in terms of 
> Apache releases, we can easily work around this in operator-tools (which may 
> be used by any hbase 2.x release still in the wild).





[jira] [Created] (HBASE-26687) Account for HBASE-24500 in regionInfoMismatch tool

2022-01-19 Thread Josh Elser (Jira)
Josh Elser created HBASE-26687:
--

 Summary: Account for HBASE-24500 in regionInfoMismatch tool
 Key: HBASE-26687
 URL: https://issues.apache.org/jira/browse/HBASE-26687
 Project: HBase
  Issue Type: Bug
  Components: hbck2
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: hbase-operator-tools-1.3.0


Had a coworker try to use the RegionInfoMismatch tool I added in HBASE-26656. 
Curiously, the tool failed on the sanity check I added.
{noformat}
Aborting: sanity-check failed on updated RegionInfo. Expected encoded region 
name 736ee6186975de6967cd9e9e242423f0 but got 323748c77dde5b05982df0285b013232.
Incorrectly created RegionInfo was: {ENCODED => 
323748c77dde5b05982df0285b013232, NAME => 
'test4,,1642405560420_0002.323748c77dde5b05982df0285b013232.', STARTKEY => '', 
ENDKEY => ''}

{noformat}
I couldn't understand why the tool wasn't working until I hooked up a debugger 
and realized that the problem wasn't in my code :). The version of HBase on the 
system did not have the fix from HBASE-24500 included which meant that I was 
hitting the same "strange behavior", as Duo put it, in the RegionInfoBuilder 
"copy constructor".

While the versions of HBase which do not have this fix are EOL in terms of 
Apache releases, we can easily work around this in operator-tools (which may be 
used by any hbase 2.x release still in the wild).





[jira] [Created] (HBASE-26669) Add JWT section to HBase book

2022-01-13 Thread Josh Elser (Jira)
Josh Elser created HBASE-26669:
--

 Summary: Add JWT section to HBase book
 Key: HBASE-26669
 URL: https://issues.apache.org/jira/browse/HBASE-26669
 Project: HBase
  Issue Type: Sub-task
  Components: documentation
Reporter: Josh Elser
 Fix For: HBASE-26553


Add a chapter to the hbase book about JWT authentication and everything that 
users and admins need to know.





[jira] [Created] (HBASE-26668) Define user experience for JWT renewal

2022-01-13 Thread Josh Elser (Jira)
Josh Elser created HBASE-26668:
--

 Summary: Define user experience for JWT renewal
 Key: HBASE-26668
 URL: https://issues.apache.org/jira/browse/HBASE-26668
 Project: HBase
  Issue Type: Sub-task
Reporter: Josh Elser
 Fix For: HBASE-26553


We need to define what our level of support will be for an HBase application 
which must run longer than the lifetime of a JWT token.

The OAuth 2.0 RFCs mention different kinds of tokens; notably, a Refresh Token 
may be helpful: [https://datatracker.ietf.org/doc/html/rfc8693]

This is intertwined with HBASE-26667. For example, if we maintained a Refresh 
Token in the client, we would have to build in logic (like we have for Kerberos 
credentials) to automatically launch a thread and know where to obtain a new 
JWT from.





[jira] [Created] (HBASE-26667) Integrate user-experience for hbase-client

2022-01-13 Thread Josh Elser (Jira)
Josh Elser created HBASE-26667:
--

 Summary: Integrate user-experience for hbase-client
 Key: HBASE-26667
 URL: https://issues.apache.org/jira/browse/HBASE-26667
 Project: HBase
  Issue Type: Sub-task
Reporter: Josh Elser
 Fix For: HBASE-26553


Today, we have two mechanisms to obtain the tokens needed to authenticate:
 # Kerberos: we rely on a Kerberos ticket being present in a well-known 
location (defined by JVM properties) or via programmatic invocation of 
UserGroupInformation
 # Delegation tokens: we rely on a special API being called (our MapReduce 
API) which loads the token into the current UserGroupInformation "context" 
(the JAAS PrivilegedAction).

The JWT bearer token approach is very similar to the delegation token 
mechanism, but HBase does not generate this JWT (as we do with delegation 
tokens). How does a client provide this token to the hbase-client (i.e. 
{{ConnectionFactory.getConnection()}} or a {{UserGroupInformation}} call)? We 
should be mindful of all of the different "entrypoints" to HBase ({{{}hbase 
...{}}} commands, {{java -cp}} commands, Phoenix commands, Spark commands, etc). 
Our solution should be effective for all of these approaches and not require 
downstream changes.





[jira] [Created] (HBASE-26666) Address bearer token being sent over wire before RPC encryption is enabled

2022-01-13 Thread Josh Elser (Jira)
Josh Elser created HBASE-26666:
--

 Summary: Address bearer token being sent over wire before RPC 
encryption is enabled
 Key: HBASE-26666
 URL: https://issues.apache.org/jira/browse/HBASE-26666
 Project: HBase
  Issue Type: Sub-task
Reporter: Josh Elser
 Fix For: HBASE-26553


Today, HBase must complete the SASL handshake (saslClient.complete()) prior to 
turning on any RPC encryption (hbase.rpc.protection=privacy, 
sasl.QOP=auth-conf).

This is a problem because we have to transmit the bearer token to the server 
before we can complete the SASL handshake, which means we would transmit the 
bearer token (which is equivalent to any other password) insecurely. That is a 
bad smell.

Ideally, if we can solve this problem for the OAuth bearer mechanism, we could 
also apply it to our delegation token interface for DIGEST-MD5 (which, I 
believe, suffers from the same problem).





[jira] [Created] (HBASE-26665) Standalone unit test in hbase-examples

2022-01-13 Thread Josh Elser (Jira)
Josh Elser created HBASE-26665:
--

 Summary: Standalone unit test in hbase-examples
 Key: HBASE-26665
 URL: https://issues.apache.org/jira/browse/HBASE-26665
 Project: HBase
  Issue Type: Sub-task
Reporter: Josh Elser
Assignee: Andor Molnar


Andor is already working on this with Nimbus, but filing this for him.

We should have a unit test which exercises the OAuth bearer authentication 
mechanism so that we know whether the feature is functional at a basic level 
(without having to set up an OAuth server).





[jira] [Created] (HBASE-26656) [operator-tools] Provide a utility to detect and correct incorrect RegionInfo's in hbase:meta

2022-01-10 Thread Josh Elser (Jira)
Josh Elser created HBASE-26656:
--

 Summary: [operator-tools] Provide a utility to detect and correct 
incorrect RegionInfo's in hbase:meta
 Key: HBASE-26656
 URL: https://issues.apache.org/jira/browse/HBASE-26656
 Project: HBase
  Issue Type: Improvement
  Components: hbase-operator-tools, hbck2
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: hbase-operator-tools-2.0.0


HBASE-23328 describes a problem in which the serialized RegionInfo in the 
value of hbase:meta cells has an encoded region name which doesn't match the 
encoded region name in the rowkey for that cell.

This problem is normally harmless as assignment only consults the rowkey to get 
the encoded region name. However, this problem does break other HBCK2 tooling, 
like {{{}extraRegionsInMeta{}}}. 

Rather than try to update each tool to account for when this problem may be 
present, create a new tool which an operator can run to correct meta and then 
use any subsequent tools as originally intended.





[jira] [Created] (HBASE-26644) Spurious compaction failures with file tracker

2022-01-04 Thread Josh Elser (Jira)
Josh Elser created HBASE-26644:
--

 Summary: Spurious compaction failures with file tracker
 Key: HBASE-26644
 URL: https://issues.apache.org/jira/browse/HBASE-26644
 Project: HBase
  Issue Type: Sub-task
Reporter: Josh Elser


Noticed when running a basic {{{}hbase pe randomWrite{}}} that we'll see 
compactions failing at various points.

One example:
{noformat}
2022-01-03 17:41:18,319 ERROR [regionserver/localhost:16020-shortCompactions-0] 
regionserver.CompactSplit(670): Compaction failed 
region=TestTable,0004054490,1641249249856.2dc7251c6eceb660b9c7bb0b587db913.,
 storeName=2dc7251c6eceb660b9c7bb0b587db913/info0,       priority=6, 
startTime=1641249666161
java.io.IOException: Root-level entries already added in single-level mode
  at 
org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeSingleLevelIndex(HFileBlockIndex.java:1136)
  at 
org.apache.hadoop.hbase.io.hfile.CompoundBloomFilterWriter$MetaWriter.write(CompoundBloomFilterWriter.java:279)
  at 
org.apache.hadoop.hbase.io.hfile.HFileWriterImpl$1.writeToBlock(HFileWriterImpl.java:713)
  at 
org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.writeBlock(HFileBlock.java:1205)
  at 
org.apache.hadoop.hbase.io.hfile.HFileWriterImpl.close(HFileWriterImpl.java:660)
  at 
org.apache.hadoop.hbase.regionserver.StoreFileWriter.close(StoreFileWriter.java:377)
  at 
org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.commitWriter(DefaultCompactor.java:70)
  at 
org.apache.hadoop.hbase.regionserver.compactions.Compactor.compact(Compactor.java:386)
  at 
org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:62)
  at 
org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:125)
  at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1141)
  at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:2388)
  at 
org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.doCompaction(CompactSplit.java:654)
  at 
org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.run(CompactSplit.java:697)
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)  {noformat}
This isn't a super-critical issue because compactions will be retried 
automatically and they appear to eventually succeed. However, when the max 
storefiles limit is reached, this does cause ingest to hang (as I was seeing 
with my modest configuration).

We had seen a similar kind of problem in our testing when backporting to HBase 
2.4 (not upstream, as the decision was to not do this), which we eventually 
tracked down to a bad merge-conflict resolution in the new HFile Cleaner. 
However, initial investigation does not point to the same exact problem.

It seems that we have some kind of generic race condition. Would be good to add 
more logging to catch this in the future (since we have two separate instances 
of this category of bug already).





[jira] [Created] (HBASE-26612) Adjust log level when looking for .filelist/{f1,f2}

2021-12-20 Thread Josh Elser (Jira)
Josh Elser created HBASE-26612:
--

 Summary: Adjust log level when looking for .filelist/{f1,f2}
 Key: HBASE-26612
 URL: https://issues.apache.org/jira/browse/HBASE-26612
 Project: HBase
  Issue Type: Sub-task
Reporter: Josh Elser
Assignee: Josh Elser


Currently, we get a really big exception in the RegionServer log under normal 
assignment conditions when .filelist/f2 is in use as the tracker file.

We should move this to debug/trace to avoid making operators think there is a 
problem when there isn't one.





[jira] [Created] (HBASE-26605) TestHStore#testRefreshStoreFiles broken due to unqualified and qualified paths

2021-12-18 Thread Josh Elser (Jira)
Josh Elser created HBASE-26605:
--

 Summary: TestHStore#testRefreshStoreFiles broken due to 
unqualified and qualified paths
 Key: HBASE-26605
 URL: https://issues.apache.org/jira/browse/HBASE-26605
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 2.4.9
Reporter: Josh Elser
Assignee: Josh Elser


Was looking at a failure of this method:
{noformat}
[ERROR] org.apache.hadoop.hbase.regionserver.TestHStore.testRefreshStoreFiles  
Time elapsed: 4.2 s  <<< ERROR!
java.util.NoSuchElementException
    at 
org.apache.hbase.thirdparty.com.google.common.collect.AbstractIndexedListIterator.next(AbstractIndexedListIterator.java:75)
    at 
org.apache.hadoop.hbase.regionserver.TestHStore.closeCompactedFile(TestHStore.java:962)
    at 
org.apache.hadoop.hbase.regionserver.TestHStore.testRefreshStoreFiles(TestHStore.java:1000)
 {noformat}
This was on a branch where I had some HBASE-26067 changes backported, so I 
thought the problem _was_ those changes. After a bit of digging, I believe the 
test case itself is "broken" (the test passes, but for the wrong reasons).

This test method adds some files to a Store (via memstore flush or direct 
addition of a file) and eventually tries to get the first file which is a 
candidate to be removed. The test {*}never compacted any files{*}. This was the 
first sign that the test itself was wrong.

After lots of comparison against the HBASE-26067 logging, I found that the 
Store was listing a file which was created by the memstore flush as a file to 
retain AND a file to remove. Second warning. Upon closer inspection, I finally 
noticed that one of the files was qualified with the filesystem URI and the 
other was not.
{noformat}
2021-12-18 16:57:10,903 INFO  [Time-limited test] regionserver.HStore(675): 
toBeAddedFiles=[file:/Users/jelser/projects/cldr/hbase-copy.git/hbase-server/target/test-data/f4ed7913-e62a-8d5f-a968-bb4c94d5494a/TestStoretestRefreshStoreFiles/data/default/table/297ad8361c3326bfb1520dbc54b1c3bd/family/dd8a430b391546d8b9bdc39bb77d447b,
 
file:/Users/jelser/projects/cldr/hbase-copy.git/hbase-server/target/test-data/f4ed7913-e62a-8d5f-a968-bb4c94d5494a/TestStoretestRefreshStoreFiles/data/default/table/297ad8361c3326bfb1520dbc54b1c3bd/family/d4c5442b772c43fd9ebdfed1a11c0e73],
 
toBeRemovedFiles=[/Users/jelser/projects/cldr/hbase-copy.git/hbase-server/target/test-data/f4ed7913-e62a-8d5f-a968-bb4c94d5494a/TestStoretestRefreshStoreFiles/data/default/table/297ad8361c3326bfb1520dbc54b1c3bd/family/d4c5442b772c43fd9ebdfed1a11c0e73]
 {noformat}
How are we both adding and removing 
{{d4c5442b772c43fd9ebdfed1a11c0e73}}?! Turns out, this is because one of them 
is "/..." and the other is "file:/...". Either the problem is in TestHStore, in 
how it creates/adds these files behind the scenes, or we should be qualifying 
the Path inside of StoreFileInfo with the filesystem that we're using.

I remember too vividly the problems when trying to separate the rootdir and 
waldir from each other, so I am cautious about adding an 
{{{}fs.qualifyPath(p){}}} call to StoreFileInfo. I need to look some more, but 
will get a patch up to fix this.
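The add/remove confusion above can be reproduced with a small, self-contained sketch (hypothetical code, not the actual TestHStore/StoreFileInfo logic; class and method names are invented for illustration): an unqualified path and a filesystem-qualified path compare unequal as strings even though they name the same file, until the unqualified one is qualified with the filesystem's scheme.

```java
import java.net.URI;

// Hypothetical sketch: an unqualified path ("/...") and a qualified path
// ("file:/...") name the same file but do not compare equal, which is how
// one store file could land in both toBeAddedFiles and toBeRemovedFiles.
public class PathQualifySketch {
    public static String qualify(String path, String scheme) {
        URI uri = URI.create(path);
        // Only prepend the scheme when the path is not already qualified
        return uri.getScheme() == null ? scheme + ":" + uri.getPath() : path;
    }

    public static void main(String[] args) {
        String unqualified = "/data/default/table/family/d4c5442b772c43fd9ebdfed1a11c0e73";
        String qualified = "file:" + unqualified;
        // Raw string comparison fails: same file, different representations
        System.out.println(unqualified.equals(qualified));
        // After qualifying with the filesystem's scheme, they match
        System.out.println(qualify(unqualified, "file").equals(qualified));
    }
}
```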





[jira] [Created] (HBASE-26599) Netty exclusion through ZooKeeper not effective as intended

2021-12-17 Thread Josh Elser (Jira)
Josh Elser created HBASE-26599:
--

 Summary: Netty exclusion through ZooKeeper not effective as 
intended
 Key: HBASE-26599
 URL: https://issues.apache.org/jira/browse/HBASE-26599
 Project: HBase
  Issue Type: Bug
  Components: dependencies
Affects Versions: 2.4.8
Reporter: Josh Elser
Assignee: Josh Elser


Picking up where [~psomogyi] has been digging this week. We've been seeing an 
issue where MiniDFS-based tests fail to start due to missing netty classes.

HBASE-25969 seems to have intended to remove transitive Netty but was 
ineffective (at least for hadoop.profile=3.0). The dependency exclusion was for 
{{io.netty:netty}} and {{io.netty:netty-all}} but ZooKeeper 3.5.7 transitively 
depends on {{netty-handler}} and {{netty-transport-native-epoll}} (per 
[https://search.maven.org/artifact/org.apache.zookeeper/zookeeper/3.5.7/jar])

The funny part is that we _should_ have seen failures in every HBase unit test 
using MiniDFS once we excluded netty and netty-all in HBASE-25969, but because 
we missed these other exclusions, everything kept running.

The intent of HBASE-25969 was good, but I think we need to revisit the 
execution. We need netty dependencies on the scope=test classpath. We just want 
to keep them off the scope=compile classpath (out of our client and server 
jars).

Disclaimer: I have not yet looked at 2.5.x or master to see if this also 
affects them.
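For reference, the kind of Maven exclusion described above would look roughly like the following sketch. The dependency structure is illustrative only (this is not the actual hbase pom.xml); the two excluded artifact IDs come from the message above.

```xml
<!-- Sketch: extending the HBASE-25969 exclusions to also cover the Netty
     artifacts that ZooKeeper 3.5.7 pulls in transitively. Surrounding
     coordinates are illustrative, not the real hbase pom. -->
<dependency>
  <groupId>org.apache.zookeeper</groupId>
  <artifactId>zookeeper</artifactId>
  <exclusions>
    <exclusion>
      <groupId>io.netty</groupId>
      <artifactId>netty-handler</artifactId>
    </exclusion>
    <exclusion>
      <groupId>io.netty</groupId>
      <artifactId>netty-transport-native-epoll</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

With an exclusion like this on the scope=compile path, the test classpath can still declare the Netty artifacts directly with scope=test, matching the intent described above.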





[jira] [Resolved] (HBASE-26265) Update ref guide to mention the new store file tracker implementations

2021-12-16 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-26265.

Resolution: Fixed

> Update ref guide to mention the new store file tracker implementations
> --
>
> Key: HBASE-26265
> URL: https://issues.apache.org/jira/browse/HBASE-26265
> Project: HBase
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Duo Zhang
>Assignee: Wellington Chevreuil
>Priority: Major
> Fix For: HBASE-26067
>
>
> For example, when to use these store file trackers.





[jira] [Resolved] (HBASE-26286) Add support for specifying store file tracker when restoring or cloning snapshot

2021-12-15 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-26286.

Hadoop Flags: Reviewed
  Resolution: Fixed

Thanks for everyone's careful reviews! Great work, Szabolcs!

> Add support for specifying store file tracker when restoring or cloning 
> snapshot
> 
>
> Key: HBASE-26286
> URL: https://issues.apache.org/jira/browse/HBASE-26286
> Project: HBase
>  Issue Type: Sub-task
>  Components: HFile, snapshots
>Reporter: Duo Zhang
>Assignee: Szabolcs Bukros
>Priority: Major
> Fix For: HBASE-26067
>
>
> As discussed in HBASE-26280.
> https://issues.apache.org/jira/browse/HBASE-26280?focusedCommentId=17414894=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17414894





[jira] [Resolved] (HBASE-26568) hbase master got stuck after running couple of days in Azure setup

2021-12-13 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-26568.

Resolution: Workaround

Resolving as "Workaround", the workaround being to upgrade.

> hbase master got stuck after running couple of days in Azure setup
> --
>
> Key: HBASE-26568
> URL: https://issues.apache.org/jira/browse/HBASE-26568
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.1
> Environment: Azure cloud
>Reporter: kaushik mandal
>Priority: Major
> Attachments: hbase-master-log-0.txt, hbase-master-log-1.txt
>
>
> hadoop hbase version 2.0.1
> hadoop hdfs version 2.7.7
>  
> In an Azure cluster setup, the HBase master hangs or becomes unresponsive 
> after running for a couple of days, and the only way to recover the master 
> is to delete /hbase and restart. Below is the error reported in the 
> hbase-master log:
>  
> Error message
> ==
> 2021-11-18 13:06:55,396 INFO 
> [RpcServer.priority.FPBQ.Fifo.handler=19,queue=1,port=16000] 
> assignment.AssignProcedure: Retry=10 of max=10; pid=320, ppid=319, 
> state=RUNNABLE:REGION_TRANSITION_DISPATCH; AssignProcedure table=hbase:meta, 
> region=1588230740; rit=OPENING, 
> location=nokiainfra-altiplano-hbase-regionserver-1.nokiainfra-altiplano-hbase-regionserver.default.svc.cluster.local,16020,1637238611975
>  2021-11-18 13:06:55,396 INFO [PEWorker-16] assignment.AssignProcedure: 
> Retry=11 of max=10; pid=320, ppid=319, 
> state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=hbase:meta, 
> region=1588230740; rit=OFFLINE, location=null 2021-11-18 13:06:55,944 ERROR 
> [PEWorker-16] procedure2.ProcedureExecutor: CODE-BUG: Uncaught runtime 
> exception for pid=319, state=FAILED:RECOVER_META_ASSIGN_REGIONS, 
> exception=org.apache.hadoop.hbase.client.RetriesExhaustedException via 
> AssignProcedure:org.apache.hadoop.hbase.client.RetriesExhaustedException: Max 
> attempts exceeded; RecoverMetaProcedure failedMetaServer=null, splitWal=true 
> java.lang.UnsupportedOperationException: unhandled 
> state=RECOVER_META_ASSIGN_REGIONS at 
> org.apache.hadoop.hbase.master.procedure.RecoverMetaProcedure.rollbackState(RecoverMetaProcedure.java:209)
>  at 
> org.apache.hadoop.hbase.master.procedure.RecoverMetaProcedure.rollbackState(RecoverMetaProcedure.java:52)
>  at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:203)
>  at 
> org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:864) 
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1372)
>  at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1328)
>  at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1197)
>  at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
>  at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1760)
>  2021-11-18 13:06:55,958 ERROR [PEWorker-16] procedure2.ProcedureExecutor: 
> CODE-BUG: Uncaught runtime exception for pid=319, 
> state=FAILED:RECOVER_META_ASSIGN_REGIONS, 
> exception=org.apache.hadoop.hbase.client.RetriesExhaustedException via 
> AssignProcedure:org.apache.hadoop.hbase.client.RetriesExhaustedException: Max 
> attempts exceeded; RecoverMetaProcedure failedMetaServer=null, splitWal=true 
> java.lang.UnsupportedOperationException: unhandled 
> state=RECOVER_META_ASSIGN_REGIONS at 
> org.apache.hadoop.hbase.master.procedure.RecoverMetaProcedure.rollbackState(RecoverMetaProcedure.java:209)
>  at 
> org.apache.hadoop.hbase.master.procedure.RecoverMetaProcedure.rollbackState(RecoverMetaProcedure.java:52)
>  at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:203)
>  at 
> org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:864) 
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1372)
>  at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1328)
>  at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1197)
>  at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
>  at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1760)
>  2021-11-18 13:06:55,969 ERROR [PEWorker-16] procedure2.ProcedureExecutor: 
> CODE-BUG: Uncaught runtime exception for pid=319, 
> state=FAILED:RECOVER_META_ASSIGN_REGIONS, 
> exception=org.apache.hadoop.hbase.client.RetriesExhaustedException via 
> AssignProcedure:org.apache.hadoop.hbase.client.RetriesExhaustedException: Max 
> 

[jira] [Reopened] (HBASE-26557) log4j2 has a critical RCE vulnerability

2021-12-13 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser reopened HBASE-26557:


> log4j2 has a critical RCE vulnerability
> ---
>
> Key: HBASE-26557
> URL: https://issues.apache.org/jira/browse/HBASE-26557
> Project: HBase
>  Issue Type: Bug
>  Components: logging, security
>Reporter: Yutong Xiao
>Assignee: Yutong Xiao
>Priority: Major
> Fix For: 3.0.0-alpha-2
>
>
> Impacted log4j version: Apache Log4j 2.x <= 2.14.1
> I found that our current log4j version at master is 2.14.1.
> Should upgrade the version to 2.15.0





[jira] [Created] (HBASE-26550) NPE if balance request comes in before master is initialized

2021-12-08 Thread Josh Elser (Jira)
Josh Elser created HBASE-26550:
--

 Summary: NPE if balance request comes in before master is 
initialized
 Key: HBASE-26550
 URL: https://issues.apache.org/jira/browse/HBASE-26550
 Project: HBase
  Issue Type: Bug
  Components: Balancer, master
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: 3.0.0-alpha-2


Noticed this in a unit test from [https://github.com/apache/hbase/pull/3851]

I believe this is a result of the new balance() implementation in the Master, 
and a client submitting a request to the master before it's completed its 
instantiation. Simple fix to avoid the NPE.
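
A minimal sketch of the fix idea (class, field, and method names here are 
hypothetical, not HBase's actual Master code): refuse the request while the 
balancer reference is still null instead of dereferencing it.

```java
public class BalanceGuard {
    // Stand-in for the balancer field that remains null until master
    // initialization completes (names are hypothetical, not HBase's).
    private Object loadBalancer = null;

    /** Refuse the request instead of dereferencing a null balancer. */
    public boolean balance() {
        if (loadBalancer == null) {
            return false; // master not initialized yet; caller should retry later
        }
        return true; // would run the balancer here
    }

    public static void main(String[] args) {
        System.out.println(new BalanceGuard().balance()); // false before init
    }
}
```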





[jira] [Resolved] (HBASE-26512) Make timestamp format configurable in HBase shell scan output

2021-12-01 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-26512.

Hadoop Flags: Reviewed
  Resolution: Fixed

Thanks for the patch, Istvan! I took the liberty of writing some release notes. 
Please let me know if you have any changes.

> Make timestamp format configurable in HBase shell scan output
> -
>
> Key: HBASE-26512
> URL: https://issues.apache.org/jira/browse/HBASE-26512
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Affects Versions: 3.0.0-alpha-1
>Reporter: Istvan Toth
>Assignee: Istvan Toth
>Priority: Major
> Fix For: 2.5.0, 3.0.0-alpha-2, 2.4.9
>
>
> HBASE-23930 and HBASE-24937 have changed the timestamp format shown in scan 
> results in the HBase shell.
> This may break existing use cases that use the hbase shell as a client (as 
> opposed to the Java, REST, or Thrift APIs).
> I propose adding a configuration option to make it configurable.





[jira] [Created] (HBASE-26461) [hboss] Delete self lock without orphaning znode

2021-11-17 Thread Josh Elser (Jira)
Josh Elser created HBASE-26461:
--

 Summary: [hboss] Delete self lock without orphaning znode
 Key: HBASE-26461
 URL: https://issues.apache.org/jira/browse/HBASE-26461
 Project: HBase
  Issue Type: Bug
Reporter: Josh Elser
Assignee: Josh Elser


Fallout from HBASE-26437
{quote}Could do the {{removeInMemoryLocks}} separately in HBASE-26453, but I 
think then znodes would get created again when unlocking, failing this PR 
tests. So, once we fix {{{}removeInMemoryLocks{}}}, we need to make sure 
{{rename}} and {{delete}} would not recreate the path again when calling 
{{{}unlock{}}}.
{quote}
The changes from HBASE-26453 inadvertently passed their unit tests because we 
didn't remove the Mutex object like we intended to do (after deleting a 
file/dir or renaming a file/dir, we intend to remove the mutex and znode for 
that file/dir and all beneath it).

Right now, we only actually delete the children (znode and mutex objects) for 
that deleted/renamed path. Meaning, we are still orphaning resources. I 
implemented the fix in lockRename based on what we did in lockDelete, so we're 
making incremental progress.

The lock cleanup process and Mutex logic need to be reworked because we cannot 
do the cleanup in two phases as we currently do. In order to release a mutex 
that we already hold, we currently re-create its znodes back in ZooKeeper.

The other solution, based on googling, appears to be to use a 
[Reaper|https://www.javadoc.io/doc/org.apache.curator/curator-recipes/2.4.1/org/apache/curator/framework/recipes/locks/Reaper.html].
 This might also be an easier way to do the rest of the cleanup.





[jira] [Resolved] (HBASE-26453) [hboss] removeInMemoryLocks can remove still in-use locks

2021-11-17 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-26453.

Fix Version/s: hbase-filesystem-1.0.0-alpha2
 Hadoop Flags: Reviewed
   Resolution: Fixed

> [hboss] removeInMemoryLocks can remove still in-use locks
> -
>
> Key: HBASE-26453
> URL: https://issues.apache.org/jira/browse/HBASE-26453
> Project: HBase
>  Issue Type: Bug
>Affects Versions: hbase-filesystem-1.0.0-alpha1
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Critical
> Fix For: hbase-filesystem-1.0.0-alpha2
>
>
> While implementing HBASE-26437, I was fighting with unit tests which just 
> wouldn't complete. After adding the code change to delete the locks held by 
> the {{src}} in a {{mv src dst}} operation, releasing the {{dst}} lock would 
> claim that the current thread doesn't hold the lock.
> After investigating, the specific contract test in question is doing a rename 
> of the form: {{{}mv /foo /foodest{}}}. This actually breaks the logic which 
> tries to determine if a lock's path is contained beneath the path we're 
> trying to clean up. Specifically: cleaning up locks beneath {{/foo}} 
> incorrectly removes locks for {{/foodest}}





[jira] [Resolved] (HBASE-26267) Master initialization fails if Master Region WAL dir is missing

2021-11-16 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-26267.

Hadoop Flags: Reviewed
  Resolution: Fixed

Thanks for the reminders to merge this, Duo. Merged this to branch-2.4, 
branch-2, and master.

> Master initialization fails if Master Region WAL dir is missing
> ---
>
> Key: HBASE-26267
> URL: https://issues.apache.org/jira/browse/HBASE-26267
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 2.4.6
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 2.5.0, 3.0.0-alpha-2, 2.4.9
>
>
> From a recent branch-2.4 build:
> {noformat}
> 2021-09-07 19:31:19,666 ERROR [master/localhost:16000:becomeActiveMaster] 
> master.HMaster(159): * ABORTING master localhost,16000,1631057476442: 
> Unhandled exception. Starting shutdown. *
> java.io.FileNotFoundException: File 
> hdfs://localhost:8020/hbase-2.4-wals/MasterData/WALs does not exist.
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:1059)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.access$1000(DistributedFileSystem.java:131)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1119)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1116)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:1126)
> at 
> org.apache.hadoop.hbase.master.region.MasterRegion.open(MasterRegion.java:226)
> at 
> org.apache.hadoop.hbase.master.region.MasterRegion.create(MasterRegion.java:303)
> at 
> org.apache.hadoop.hbase.master.region.MasterRegionFactory.create(MasterRegionFactory.java:104)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:839)
> at 
> org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2189)
> at 
> org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:512)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}
> If the WAL directory is missing but the Master Region already exists, we will 
> try to list the contents of the Master Region's WAL directory which may or 
> may not exist. If we simply check first to make sure the directory exists, 
> then the rest of the initialization code works as expected.





[jira] [Created] (HBASE-26453) [hboss] removeInMemoryLocks can remove still in-use locks

2021-11-12 Thread Josh Elser (Jira)
Josh Elser created HBASE-26453:
--

 Summary: [hboss] removeInMemoryLocks can remove still in-use locks
 Key: HBASE-26453
 URL: https://issues.apache.org/jira/browse/HBASE-26453
 Project: HBase
  Issue Type: Bug
Affects Versions: hbase-filesystem-1.0.0-alpha1
Reporter: Josh Elser
Assignee: Josh Elser


While implementing HBASE-26437, I was fighting with unit tests which just 
wouldn't complete. After adding the code change to delete the locks held by the 
{{src}} in a {{mv src dst}} operation, releasing the {{dst}} lock would claim 
that the current thread doesn't hold the lock.

After investigating, the specific contract test in question is doing a rename 
of the form: {{{}mv /foo /foodest{}}}. This actually breaks the logic which 
tries to determine if a lock's path is contained beneath the path we're trying 
to clean up. Specifically: cleaning up locks beneath {{/foo}} incorrectly 
removes locks for {{/foodest}}
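
The containment bug can be reproduced with a plain string check (a standalone 
sketch, not the actual HBOSS code): a bare {{startsWith}} treats {{/foodest}} 
as if it were under {{/foo}}, while requiring a path-separator boundary does 
not.

```java
public class LockPathCheck {
    // Naive containment check behind the bug: "/foodest" starts with "/foo"
    static boolean naiveUnder(String path, String prefix) {
        return path.startsWith(prefix);
    }

    // Fixed: require an exact match or a path-separator boundary
    static boolean under(String path, String prefix) {
        return path.equals(prefix) || path.startsWith(prefix + "/");
    }

    public static void main(String[] args) {
        System.out.println(naiveUnder("/foodest", "/foo")); // true: the bug
        System.out.println(under("/foodest", "/foo"));      // false: fixed
        System.out.println(under("/foo/bar", "/foo"));      // true: real child
    }
}
```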





[jira] [Created] (HBASE-26437) [hboss] Rename does not clean up znodes for src location

2021-11-09 Thread Josh Elser (Jira)
Josh Elser created HBASE-26437:
--

 Summary: [hboss] Rename does not clean up znodes for src location
 Key: HBASE-26437
 URL: https://issues.apache.org/jira/browse/HBASE-26437
 Project: HBase
  Issue Type: Bug
  Components: hboss
Affects Versions: hbase-filesystem-1.0.0-alpha1
Reporter: Josh Elser
Assignee: Josh Elser


We ran into a fun situation where the partition hosting ZK data was repeatedly 
filling up while heavy ExportSnapshot+clone_snapshot operations were running 
(10's of TB). The cluster was previously working just fine.

Upon investigation of the ZK tree, we found a large number of znodes beneath 
/hboss, specifically many in the corresponding ZK HBOSS path for 
$hbase.rootdir/.tmp.

Tracing back from the code, we saw that the CloneSnapshotProcedure (like 
CreateTableProcedure) will create the table filesystem layout in 
$hbase.rootdir/.tmp and then rename it into $hbase.rootdir/data/. 
However, it appears that, upon rename, HBOSS was not cleaning up the src path's 
znode. This is a bug as it allows ZK to grow unbounded (which explains why this 
problem slowly arose and not suddenly).

As a workaround, HBase can be stopped and the corresponding ZK path for 
$hbase.rootdir/.tmp can be cleaned up to reclaim 1/2 the space taken up by 
znodes for imported hbase tables (we would still have znodes for 
$hbase.rootdir/data/...)





[jira] [Resolved] (HBASE-22394) HdfsFileStatus incompatibility when used with Hadoop 3.1.x

2021-10-19 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-22394.

Resolution: Not A Bug

Like HBASE-24154, this is just "how it is" in HBase presently. The HBase PMC 
does not release multiple artifacts for both Hadoop2 and Hadoop3 support at the 
current time. Current HBase2 releases still compile against Hadoop2 by default, 
and using Hadoop 3 against HBase2 requires a recompilation of HBase because of 
incompatible changes between Hadoop2 and Hadoop3.

We may choose to publish multiple HBase artifacts (built against different 
Hadoop version) in the future, but that should start as a dev-list discussion 
as it will have lots of implications.

> HdfsFileStatus incompatibility when used with Hadoop 3.1.x
> --
>
> Key: HBASE-22394
> URL: https://issues.apache.org/jira/browse/HBASE-22394
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.1.4
>Reporter: Raymond Lau
>Priority: Major
>
> Hbase 2.1.4 works fine with Hadoop 3.0.3 but when I attempted to upgrade to 
> Hadoop 3.1.2, I get the following error in the region server:
> {noformat}
> 2019-05-10 12:49:10,303 ERROR HRegionServer - * ABORTING region server 
> [REDACTED],16020,1557506923574: Unhandled: Found interface 
> org.apache.hadoop.hdfs.protocol.HdfsFileStatus, but class was expected *
> java.lang.IncompatibleClassChangeError: Found interface 
> org.apache.hadoop.hdfs.protocol.HdfsFileStatus, but class was expected
> at 
> org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper.createOutput(FanOutOneBlockAsyncDFSOutputHelper.java:768)
> at 
> org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper.access$400(FanOutOneBlockAsyncDFSOutputHelper.java:118)
> at 
> org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper$16.doCall(FanOutOneBlockAsyncDFSOutputHelper.java:848)
> {noformat}
> Hadoop 3.1.1+ is listed as compatible with Hbase 2.1.x at 
> [https://hbase.apache.org/book.html#basic.prerequisites].





[jira] [Created] (HBASE-26350) Missing server side debugging on failed SASL handshake

2021-10-11 Thread Josh Elser (Jira)
Josh Elser created HBASE-26350:
--

 Summary: Missing server side debugging on failed SASL handshake
 Key: HBASE-26350
 URL: https://issues.apache.org/jira/browse/HBASE-26350
 Project: HBase
  Issue Type: Bug
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: 2.5.0, 3.0.0-alpha-2, 2.3.7, 2.4.8


In trying to debug some problems with the pluggable authentication, I noticed 
that we are eating the IOException without logging it (at any level) in 
ServerRpcConnection.

This makes it super hard to debug when that pluggable interface has a problem 
because the context gets lost (clients just get a pretty useless DNRIOE).
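
The shape of the fix can be sketched as follows (class and method names are 
hypothetical, not the actual ServerRpcConnection code): log the exception 
before surfacing the failure, so the server-side context is not lost.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

public class SaslHandshake {
    private static final Logger LOG = Logger.getLogger(SaslHandshake.class.getName());

    // Hypothetical stand-in for the SASL negotiation step
    static void negotiate(boolean fail) throws IOException {
        if (fail) throw new IOException("GSS initiate failed");
    }

    public static boolean handshake(boolean fail) {
        try {
            negotiate(fail);
            return true;
        } catch (IOException e) {
            // Previously the exception was swallowed here; logging it keeps
            // the server-side context for debugging pluggable auth failures.
            LOG.log(Level.FINE, "SASL handshake failed", e);
            return false;
        }
    }
}
```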





[jira] [Resolved] (HBASE-25900) HBoss tests compile/failure against Hadoop 3.3.1

2021-10-05 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-25900.

Fix Version/s: hbase-filesystem-1.0.0-alpha2
 Hadoop Flags: Reviewed
   Resolution: Fixed

Thanks Steve and Peter for the help along the way.

> HBoss tests compile/failure against Hadoop 3.3.1
> 
>
> Key: HBASE-25900
> URL: https://issues.apache.org/jira/browse/HBASE-25900
> Project: HBase
>  Issue Type: Bug
>  Components: Filesystem Integration
>Affects Versions: 1.0.2
>Reporter: Steve Loughran
>Assignee: Josh Elser
>Priority: Major
> Fix For: hbase-filesystem-1.0.0-alpha2
>
>
> Changes in Hadoop 3.3.x stop the tests compiling/working. 
> * changes in signature of nominally private classes (HADOOP-17497): fix, 
> update
> * HADOOP-16721  -s3a rename throwing more exceptions, but no longer failing 
> if the dest parent doesn't exist. Fix: change s3a.xml
> * HADOOP-17531/HADOOP-17620 distcp moving to listIterator; test failures. 
> * HADOOP-13327: tests on syncable which expect files being written to to be 
> visible. Fix: skip that test
> The fix for HADOOP-17497 stops this compiling against Hadoop < 3.3.1. This is 
> unfortunate but I can't see an easy fix. The new signature takes a parameters 
> class, so we can (and already are) adding new config options without breaking 
> this signature again. And I've tagged it as LimitedPrivate so that future 
> developers will know it's used here





[jira] [Resolved] (HBASE-26277) Revert 26240, Apply InterfaceAudience.Private to BalanceResponse$Builder

2021-09-10 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-26277.

Hadoop Flags: Reviewed
  Resolution: Fixed

Thanks for the quick fix-up, Bryan!

> Revert 26240, Apply InterfaceAudience.Private to BalanceResponse$Builder
> 
>
> Key: HBASE-26277
> URL: https://issues.apache.org/jira/browse/HBASE-26277
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Bryan Beaudreault
>Assignee: Bryan Beaudreault
>Priority: Minor
> Fix For: 2.5.0, 3.0.0-alpha-2
>
>






[jira] [Created] (HBASE-26267) Master initialization fails if Master Region WAL dir is missing

2021-09-08 Thread Josh Elser (Jira)
Josh Elser created HBASE-26267:
--

 Summary: Master initialization fails if Master Region WAL dir is 
missing
 Key: HBASE-26267
 URL: https://issues.apache.org/jira/browse/HBASE-26267
 Project: HBase
  Issue Type: Improvement
  Components: master
Affects Versions: 2.4.6
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: 2.5.0, 3.0.0-alpha-2, 2.4.7


From a recent branch-2.4 build:

{noformat}
2021-09-07 19:31:19,666 ERROR [master/localhost:16000:becomeActiveMaster] 
master.HMaster(159): * ABORTING master localhost,16000,1631057476442: 
Unhandled exception. Starting shutdown. *
java.io.FileNotFoundException: File 
hdfs://localhost:8020/hbase-2.4-wals/MasterData/WALs does not exist.
at 
org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:1059)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.access$1000(DistributedFileSystem.java:131)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1119)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1116)
at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:1126)
at 
org.apache.hadoop.hbase.master.region.MasterRegion.open(MasterRegion.java:226)
at 
org.apache.hadoop.hbase.master.region.MasterRegion.create(MasterRegion.java:303)
at 
org.apache.hadoop.hbase.master.region.MasterRegionFactory.create(MasterRegionFactory.java:104)
at 
org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:839)
at 
org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2189)
at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:512)
at java.lang.Thread.run(Thread.java:748)
{noformat}

If the WAL directory is missing but the Master Region already exists, we will 
try to list the contents of the Master Region's WAL directory which may or may 
not exist. If we simply check first to make sure the directory exists, then the 
rest of the initialization code works as expected.
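
The fix idea can be sketched with plain {{java.io}} standing in for the Hadoop 
FileSystem API (names here are hypothetical): ensure the directory exists 
before listing it, so a missing WAL dir no longer throws 
FileNotFoundException during startup.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class WalDirCheck {
    // Ensure the WAL directory exists before listing it, mirroring the fix
    // idea; plain java.io stands in for the Hadoop FileSystem API.
    static File[] listWalFiles(File walDir) throws IOException {
        if (!walDir.exists() && !walDir.mkdirs()) {
            throw new IOException("Could not create " + walDir);
        }
        File[] files = walDir.listFiles();
        return files == null ? new File[0] : files;
    }

    public static void main(String[] args) throws IOException {
        File base = Files.createTempDirectory("wal-check").toFile();
        File walDir = new File(base, "WALs"); // does not exist yet
        // Prints 0: the dir is created empty instead of failing the listing
        System.out.println(listWalFiles(walDir).length);
    }
}
```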





[jira] [Resolved] (HBASE-26147) Add dry run mode to hbase balancer

2021-09-01 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-26147.

Resolution: Fixed

Thanks for the excellent work, Bryan. This really came together to be a great 
change. I took the liberty of writing some release notes – please feel free to 
update them as you see fit.

Thanks Duo, Nick, and everyone else who helped out in reviews.

My apologies that I botched application of the branch-2 PR (putting my own 
email address instead of Bryan's). I reverted my original commit and re-applied 
it with correct metadata. Sorry for sullying the commit log. 

> Add dry run mode to hbase balancer
> --
>
> Key: HBASE-26147
> URL: https://issues.apache.org/jira/browse/HBASE-26147
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer, master
>Reporter: Bryan Beaudreault
>Assignee: Bryan Beaudreault
>Priority: Major
> Fix For: 2.5.0, 3.0.0-alpha-2
>
>
> It's often rather hard to know how the cost function changes you're making 
> will affect the balance of the cluster, and currently the only way to know is 
> to run it. If the cost decisions are not good, you may have just moved many 
> regions towards a non-ideal balance. Region moves themselves are not free for 
> clients, and the resulting balance may cause a regression.
> We should add a mode to the balancer so that it can be invoked without 
> actually executing any plans. This will allow an administrator to iterate on 
> their cost functions and use the balancer's logging to see how their changes 
> would affect the cluster. 





[jira] [Created] (HBASE-26236) Simple travis build for hbase-filesystem

2021-08-29 Thread Josh Elser (Jira)
Josh Elser created HBASE-26236:
--

 Summary: Simple travis build for hbase-filesystem
 Key: HBASE-26236
 URL: https://issues.apache.org/jira/browse/HBASE-26236
 Project: HBase
  Issue Type: Improvement
  Components: hboss
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: hbase-filesystem-1.0.0-alpha2


Noticed that we don't have any kind of precommit checks. Time to make a quick 
one.





[jira] [Created] (HBASE-26212) Allow AuthUtil automatic renewal to be disabled

2021-08-20 Thread Josh Elser (Jira)
Josh Elser created HBASE-26212:
--

 Summary: Allow AuthUtil automatic renewal to be disabled
 Key: HBASE-26212
 URL: https://issues.apache.org/jira/browse/HBASE-26212
 Project: HBase
  Issue Type: Improvement
  Components: Client, security
Reporter: Josh Elser
Assignee: Josh Elser


Talking with [~bbende] who was looking at some "spam" in the NiFi log where 
AuthUtil was complaining that it couldn't renew the UGI. This did not cause 
him problems (NiFi could always read/write to HBase), but it generated a lot of 
noise in the NiFi log.

NiFi is special in that it's managing renewals on its own (for all services it 
can communicate with), rather than letting each client do it on its own. 
Specifically, one way they do this is by doing a keytab-based login via JAAS, 
constructing a UGI object from that JAAS login, and then invoking HBase in a 
normal UGI.doAs().

The problem comes in that AuthUtil _thinks_ that it is capable of renewing this 
UGI instance on its own. AuthUtil can determine that the current UGI came from 
a keytab, and thus thinks that it can renew it. However, this actually fails 
because the LoginContext inside UGI *isn't* actually something that UGI can 
renew (remember: because NiFi did it directly via JAAS and not via UGI)
{noformat}
2021-08-19 17:32:19,438 ERROR [Relogin service.Chore.1] 
org.apache.hadoop.hbase.AuthUtil Got exception while trying to refresh 
credentials: loginUserFromKeyTab must be done first
java.io.IOException: loginUserFromKeyTab must be done first
at 
org.apache.hadoop.security.UserGroupInformation.reloginFromKeytab(UserGroupInformation.java:1194)
at 
org.apache.hadoop.security.UserGroupInformation.checkTGTAndReloginFromKeytab(UserGroupInformation.java:1125)
at org.apache.hadoop.hbase.AuthUtil$1.chore(AuthUtil.java:206) 
{noformat}
After talking with Bryan about this: we don't see a good way for HBase to 
detect this specific "A UGI instance, but not created by UGI" case because the 
LoginContext inside UGI is private. It is great that AuthUtil will 
automatically try to renew keytab logins, even if not using 
{{hbase.client.keytab.file}} and {{hbase.client.keytab.principal}}, so I don't 
want to break that functionality.

NiFi is unique in this case in that it is fully managing the renewals, so I 
think the best path forward is to add an option which lets NiFi disable 
AuthUtil, since it knows it can safely do this. This should not affect any 
other users (but also gives us an escape hatch if AuthUtil ever does cause 
problems).





[jira] [Created] (HBASE-26165) 2.3.5 listed on website downloads page but row intends to be for 2.3.6

2021-08-02 Thread Josh Elser (Jira)
Josh Elser created HBASE-26165:
--

 Summary: 2.3.5 listed on website downloads page but row intends to 
be for 2.3.6
 Key: HBASE-26165
 URL: https://issues.apache.org/jira/browse/HBASE-26165
 Project: HBase
  Issue Type: Task
  Components: website
Reporter: Josh Elser
Assignee: Josh Elser


Typo on downloads.html. Row is for 2.3.6 but still says 2.3.5.

Missed in HBASE-26162. PR coming.





[jira] [Created] (HBASE-26164) Update dependencies in hbase-filesystem

2021-08-02 Thread Josh Elser (Jira)
Josh Elser created HBASE-26164:
--

 Summary: Update dependencies in hbase-filesystem
 Key: HBASE-26164
 URL: https://issues.apache.org/jira/browse/HBASE-26164
 Project: HBase
  Issue Type: Task
  Components: hboss
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: hbase-filesystem-1.0.0-alpha2


hbase-filesystem still has some old dependencies. Notably aws-java-sdk is at 
1.11.525 whereas hadoop is all the way at 1.11.1026.

We're also still building HBase 2 against 2.1.4 instead of anything newer. Bump 
up the relevant dependencies to something more current and make sure the code 
still works.





[jira] [Resolved] (HBASE-22078) corrupted procs in proc WAL

2021-03-31 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-22078.

Resolution: Incomplete

Goin' looking at missing stacks for an instance I've just run into. Came across 
this -- expect it to not go anywhere after 2 years and no logs.

> corrupted procs in proc WAL
> ---
>
> Key: HBASE-22078
> URL: https://issues.apache.org/jira/browse/HBASE-22078
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Major
>
> Not sure what the root cause is... there are ~500 proc wal files (I actually 
> wonder if cleanup is also blocked by this, since I see these lines on master 
> restart, do WALs with abandoned procedures like that get deleted?).
> {noformat}
> 2019-03-20 07:37:53,212 ERROR [master/...:17000:becomeActiveMaster] 
> wal.WALProcedureTree: Missing stack id 7571, max stack id is 7754, root 
> procedure is Procedure(pid=66829, ppid=-1, 
> class=org.apache.hadoop.hbase.master.procedure.DisableTableProcedure)
> 2019-03-20 07:37:53,213 ERROR [master/...:17000:becomeActiveMaster] 
> wal.WALProcedureTree: Missing stack id 7600, max stack id is 7754, root 
> procedure is Procedure(pid=66829, ppid=-1, 
> class=org.apache.hadoop.hbase.master.procedure.DisableTableProcedure)
> 2019-03-20 07:37:53,213 ERROR [master/...:17000:becomeActiveMaster] 
> wal.WALProcedureTree: Missing stack id 7610, max stack id is 7754, root 
> procedure is Procedure(pid=66829, ppid=-1, 
> class=org.apache.hadoop.hbase.master.procedure.DisableTableProcedure)
> 2019-03-20 07:37:53,213 ERROR [master/...:17000:becomeActiveMaster] 
> wal.WALProcedureTree: Missing stack id 7631, max stack id is 7754, root 
> procedure is Procedure(pid=66829, ppid=-1, 
> class=org.apache.hadoop.hbase.master.procedure.DisableTableProcedure)
> 2019-03-20 07:37:53,213 ERROR [master/...:17000:becomeActiveMaster] 
> wal.WALProcedureTree: Missing stack id 7650, max stack id is 7754, root 
> procedure is Procedure(pid=66829, ppid=-1, 
> class=org.apache.hadoop.hbase.master.procedure.DisableTableProcedure)
> 2019-03-20 07:37:53,213 ERROR [master/...:17000:becomeActiveMaster] 
> wal.WALProcedureTree: Missing stack id 7651, max stack id is 7754, root 
> procedure is Procedure(pid=66829, ppid=-1, 
> class=org.apache.hadoop.hbase.master.procedure.DisableTableProcedure)
> 2019-03-20 07:37:53,213 ERROR [master/...:17000:becomeActiveMaster] 
> wal.WALProcedureTree: Missing stack id 7657, max stack id is 7754, root 
> procedure is Procedure(pid=66829, ppid=-1, 
> class=org.apache.hadoop.hbase.master.procedure.DisableTableProcedure)
> 2019-03-20 07:37:53,213 ERROR [master/...:17000:becomeActiveMaster] 
> wal.WALProcedureTree: Missing stack id 7683, max stack id is 7754, root 
> procedure is Procedure(pid=66829, ppid=-1, 
> class=org.apache.hadoop.hbase.master.procedure.DisableTableProcedure)
> {noformat}
> Followed by 
> {noformat}
> 2019-03-20 07:37:53,751 ERROR [master/...:17000:becomeActiveMaster] 
> procedure2.ProcedureExecutor: Corrupt pid=66829, 
> state=WAITING:DISABLE_TABLE_ADD_REPLICATION_BARRIER, hasLock=false; 
> DisableTableProcedure table=...
> {noformat}
> And 1000s of child procedures and grandchild procedures of this procedure.
> I think this area needs general review... we should have a record for the 
> procedure durably persisted before we create any child procedures, so I'm not 
> sure how this could happen. Actually, I also wonder why we even have separate 
> proc WAL when HBase already has a working WAL that's more or less time 
> tested... 





[jira] [Created] (HBASE-25712) Port failure to close InputStream to 1.x

2021-03-29 Thread Josh Elser (Jira)
Josh Elser created HBASE-25712:
--

 Summary: Port failure to close InputStream to 1.x
 Key: HBASE-25712
 URL: https://issues.apache.org/jira/browse/HBASE-25712
 Project: HBase
  Issue Type: Sub-task
Reporter: Josh Elser
Assignee: Josh Elser


Port the parent issue (replication not closing a socket) to branch-1





[jira] [Created] (HBASE-25692) Failure to instantiate WALCellCodec leaks socket

2021-03-24 Thread Josh Elser (Jira)
Josh Elser created HBASE-25692:
--

 Summary: Failure to instantiate WALCellCodec leaks socket
 Key: HBASE-25692
 URL: https://issues.apache.org/jira/browse/HBASE-25692
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 2.4.2, 2.4.1, 2.3.4, 2.3.2, 2.2.6, 2.2.5, 2.4.0, 2.2.4, 
2.1.9, 2.3.3, 2.2.3, 2.1.8, 2.2.2, 2.1.7, 2.1.6, 2.2.1, 2.1.5, 2.0.6, 2.1.4, 
2.3.1, 2.3.0, 2.1.3, 2.1.2, 2.1.1, 2.2.0, 2.1.0
Reporter: Josh Elser
Assignee: Josh Elser


I was looking at an HBase user's cluster with [~danilocop] where they saw two 
otherwise identical clusters, one of which regularly had sockets in CLOSE_WAIT 
going from RegionServers to a distributed storage appliance.

After a lot of analysis, we eventually figured out that these sockets in 
CLOSE_WAIT were directly related to an FSDataInputStream which we forgot to 
close inside of the RegionServer. The subtlety was that only one of these HBase 
clusters was set up to do replication (to the other cluster). The HBase cluster 
experiencing this problem was shipping edits to a peer, and had previously been 
using Phoenix. At some point, the cluster had Phoenix removed from it.

What we found was that replication still had WALs to ship which were for 
Phoenix tables. Phoenix, in this version, still used the custom WALCellCodec; 
however, this codec class was missing from the RS classpath after the owner of 
the cluster removed Phoenix.

When we try to instantiate the Codec implementation via ReflectionUtils, we end 
up throwing an UnsupportedOperationException which wraps a 
NoClassDefFoundException. However, in WALFactory, we _only_ close the 
FSDataInputStream when we catch an IOException. 

Thus, replication sits in a "fast" loop, trying to ship these edits, each time 
leaking a new socket because of the InputStream not being closed. There is an 
obvious workaround for this specific issue, but we should not leak this inside 
HBase.
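A minimal sketch of the fix pattern (hypothetical names, not the actual WALFactory code): close the stream on *any* instantiation failure, not just IOException, so a class-loading error surfacing as an unchecked throwable cannot leak the socket:

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.function.Supplier;

public class StreamGuard {
    // Run an initializer (e.g. codec instantiation via reflection); if it
    // fails with ANY unchecked throwable -- such as an
    // UnsupportedOperationException wrapping a class-loading failure --
    // close the underlying stream before propagating, so its socket is freed.
    public static <T> T initOrClose(Closeable stream, Supplier<T> init)
            throws IOException {
        try {
            return init.get();
        } catch (RuntimeException | Error e) {
            stream.close();  // release the socket before rethrowing
            throw e;
        }
    }
}
```

With this shape, a missing WALCellCodec class still fails the shipper, but the replication retry loop no longer accumulates half-open sockets.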





[jira] [Created] (HBASE-25601) Remove search hadoop references in book

2021-02-23 Thread Josh Elser (Jira)
Josh Elser created HBASE-25601:
--

 Summary: Remove search hadoop references in book
 Key: HBASE-25601
 URL: https://issues.apache.org/jira/browse/HBASE-25601
 Project: HBase
  Issue Type: Task
  Components: documentation
Reporter: Josh Elser
Assignee: Josh Elser


Remove references to this newly-owned domain.





[jira] [Resolved] (HBASE-25449) 'dfs.client.read.shortcircuit' should not be set in hbase-default.xml

2021-01-08 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-25449.

Hadoop Flags: Reviewed
Release Note: The presence of HDFS short-circuit read configuration 
properties in hbase-default.xml inadvertently causes short-circuit reads to not 
happen inside of RegionServers, despite short-circuit reads being enabled in 
hdfs-site.xml.
  Resolution: Fixed

Thanks for a great fix (and test), [~shenshengli]!

> 'dfs.client.read.shortcircuit' should not be set in hbase-default.xml
> -
>
> Key: HBASE-25449
> URL: https://issues.apache.org/jira/browse/HBASE-25449
> Project: HBase
>  Issue Type: Improvement
>  Components: conf
>Affects Versions: 2.0.1
>Reporter: shenshengli
>Assignee: shenshengli
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.2.7, 2.5.0, 2.4.1, 2.3.5
>
>
> I think this parameter is not suitable for hbase-default.xml: even when HDFS 
> explicitly sets "dfs.client.read.shortcircuit=true" and HBase is expected to 
> inherit the HDFS configuration, the parameter inside the HBase service 
> remains false. It only takes effect when 
> "dfs.client.read.shortcircuit=true" is explicitly set in hbase-site.xml.
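In other words, once the key is no longer hard-coded in hbase-default.xml, the HDFS value wins; operators who want short-circuit reads set it explicitly in hbase-site.xml. An illustrative snippet (assumed placement, paths/values vary per deployment):

```xml
<!-- hbase-site.xml: opt in to HDFS short-circuit reads explicitly -->
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
```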





[jira] [Resolved] (HBASE-24268) REST and Thrift server do not handle the "doAs" parameter case insensitively

2020-11-24 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-24268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-24268.

Hadoop Flags: Reviewed
Release Note: This change allows the REST and Thrift servers to handle the 
"doAs" parameter case-insensitively, which is deemed as correct per the 
"specification" provided by the Hadoop community.
  Resolution: Fixed

Thanks for the contribution, [~RichardAntal]!

> REST and Thrift server do not handle the "doAs" parameter case insensitively
> 
>
> Key: HBASE-24268
> URL: https://issues.apache.org/jira/browse/HBASE-24268
> Project: HBase
>  Issue Type: Bug
>  Components: REST, Thrift
>Reporter: Istvan Toth
>Assignee: Richard Antal
>Priority: Minor
> Fix For: 3.0.0-alpha-1, 2.4.0
>
>
> Hadoop does a case-insensitive comparison on the doAs parameter name when 
> handling principal impersonation.
> The HBase REST and Thrift servers do not do that; they only accept the "doAs" 
> form.
> According to HADOOP-11083, the correct Hadoop behaviour is to accept doAs in 
> any case.
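The case-insensitive lookup the fix needs can be sketched as follows (hypothetical helper name, not the actual REST/Thrift server code):

```java
import java.util.Map;

public class DoAsLookup {
    // Hypothetical sketch: find the "doAs" request parameter without regard
    // to case, matching HADOOP-11083's behaviour of accepting doAs, doas,
    // DOAS, etc. Returns null when the parameter is absent.
    public static String getDoAs(Map<String, String[]> params) {
        for (Map.Entry<String, String[]> e : params.entrySet()) {
            if ("doas".equalsIgnoreCase(e.getKey())
                    && e.getValue() != null && e.getValue().length > 0) {
                return e.getValue()[0];
            }
        }
        return null;
    }
}
```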





[jira] [Created] (HBASE-25279) Non-daemon thread in ZKWatcher

2020-11-12 Thread Josh Elser (Jira)
Josh Elser created HBASE-25279:
--

 Summary: Non-daemon thread in ZKWatcher
 Key: HBASE-25279
 URL: https://issues.apache.org/jira/browse/HBASE-25279
 Project: HBase
  Issue Type: Bug
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: 3.0.0-alpha-1


ZKWatcher spawns an ExecutorService which doesn't mark its threads as daemons, 
which will prevent clean shutdowns.
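The usual remedy can be sketched like this (illustrative helper, not the actual ZKWatcher code): give the executor a ThreadFactory that marks its threads as daemons so they cannot keep the JVM alive:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

public class DaemonExecutors {
    // Build a single-thread executor whose worker thread is a daemon,
    // so an otherwise-finished JVM can still exit cleanly.
    public static ExecutorService singleDaemonThread(String name) {
        ThreadFactory tf = r -> {
            Thread t = new Thread(r, name);
            t.setDaemon(true);
            return t;
        };
        return Executors.newSingleThreadExecutor(tf);
    }
}
```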





[jira] [Created] (HBASE-25278) Add option to toggle CACHE_BLOCKS in count.rb

2020-11-12 Thread Josh Elser (Jira)
Josh Elser created HBASE-25278:
--

 Summary: Add option to toggle CACHE_BLOCKS in count.rb
 Key: HBASE-25278
 URL: https://issues.apache.org/jira/browse/HBASE-25278
 Project: HBase
  Issue Type: New Feature
  Components: shell
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: 3.0.0-alpha-1, 2.4.0


A trick I've found myself doing a couple of times (hat-tip to [~psomogyi]) is 
to edit table.rb so that the `count` shell command no longer instructs 
RegionServers to skip caching data blocks. This is a quick-and-dirty way to 
force a table to be loaded into the block cache (i.e. for performance testing).

We can easily add another option to avoid having to edit the ruby files.





[jira] [Resolved] (HBASE-24860) Bump copyright year in NOTICE

2020-08-11 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-24860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-24860.

Resolution: Fixed

Pushed without review.

FYI [~zhangduo], in case you have to do a 3.4.0-rc2

> Bump copyright year in NOTICE
> -
>
> Key: HBASE-24860
> URL: https://issues.apache.org/jira/browse/HBASE-24860
> Project: HBase
>  Issue Type: Task
>  Components: thirdparty
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Trivial
> Fix For: thirdparty-3.5.0
>
>
> year in NOTICE is 2018, should be 2020





[jira] [Created] (HBASE-24860) Bump copyright year in NOTICE

2020-08-11 Thread Josh Elser (Jira)
Josh Elser created HBASE-24860:
--

 Summary: Bump copyright year in NOTICE
 Key: HBASE-24860
 URL: https://issues.apache.org/jira/browse/HBASE-24860
 Project: HBase
  Issue Type: Task
  Components: thirdparty
Reporter: Josh Elser
Assignee: Josh Elser








[jira] [Created] (HBASE-24858) Builds without a `clean` phase fail on hbase-shaded-jetty

2020-08-11 Thread Josh Elser (Jira)
Josh Elser created HBASE-24858:
--

 Summary: Builds without a `clean` phase fail on hbase-shaded-jetty
 Key: HBASE-24858
 URL: https://issues.apache.org/jira/browse/HBASE-24858
 Project: HBase
  Issue Type: Bug
Reporter: Josh Elser
Assignee: Josh Elser


In 3.4.0-rc1 that [~zhangduo] created, I noticed that builds of the project 
failed on hbase-shaded-jetty when I did not include {{clean}} in my build 
command.

{noformat}
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  3.604 s
[INFO] Finished at: 2020-08-11T14:38:45-04:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-shade-plugin:3.2.4:shade (default) on project 
hbase-shaded-jetty: Error creating shaded jar: duplicate entry: 
META-INF/services/org.apache.hbase.thirdparty.org.eclipse.jetty.http.HttpFieldPreEncoder
 -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
{noformat}

Poking around, it appears that this actually has to do with the creation of the 
shaded source jar.





[jira] [Resolved] (HBASE-24834) TestReplicationSource.testWALEntryFilter failing in branch-2+

2020-08-11 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-24834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-24834.

Resolution: Cannot Reproduce

Maybe fixed by HBASE-24830?

> TestReplicationSource.testWALEntryFilter failing in branch-2+
> -
>
> Key: HBASE-24834
> URL: https://issues.apache.org/jira/browse/HBASE-24834
> Project: HBase
>  Issue Type: Task
>  Components: Replication, test
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Minor
> Fix For: 3.0.0-alpha-1, 2.4.0
>
>
> {noformat}
> [ERROR] Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 46.497 s <<< FAILURE! - in 
> org.apache.hadoop.hbase.replication.regionserver.TestReplicationSource
> [ERROR] 
> org.apache.hadoop.hbase.replication.regionserver.TestReplicationSource.testWALEntryFilter
>   Time elapsed: 1.405 s  <<< FAILURE!
> java.lang.AssertionError
>   at 
> org.apache.hadoop.hbase.replication.regionserver.TestReplicationSource.testWALEntryFilter(TestReplicationSource.java:177)
>  {noformat}
> Noticed this during HBASE-24779, but didn't fix it. Believe it comes from 
> HBASE-24817.
> Looks to be that the edit we expect to not be filtered is actually getting 
> filtered because it has no cells (thus, no change and doesn't need to be 
> replicated). Should be a simple fix.





[jira] [Created] (HBASE-24834) TestReplicationSource.testWALEntryFilter failing in branch-2+

2020-08-07 Thread Josh Elser (Jira)
Josh Elser created HBASE-24834:
--

 Summary: TestReplicationSource.testWALEntryFilter failing in 
branch-2+
 Key: HBASE-24834
 URL: https://issues.apache.org/jira/browse/HBASE-24834
 Project: HBase
  Issue Type: Task
  Components: Replication, test
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: 3.0.0-alpha-1, 2.4.0


{noformat}
[ERROR] Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 46.497 
s <<< FAILURE! - in 
org.apache.hadoop.hbase.replication.regionserver.TestReplicationSource
[ERROR] 
org.apache.hadoop.hbase.replication.regionserver.TestReplicationSource.testWALEntryFilter
  Time elapsed: 1.405 s  <<< FAILURE!
java.lang.AssertionError
at 
org.apache.hadoop.hbase.replication.regionserver.TestReplicationSource.testWALEntryFilter(TestReplicationSource.java:177)
 {noformat}

Noticed this during HBASE-24779, but didn't fix it. Believe it comes from 
HBASE-24817.

Looks to be that the edit we expect to not be filtered is actually getting 
filtered because it has no cells (thus, no change and doesn't need to be 
replicated). Should be a simple fix.





[jira] [Created] (HBASE-24779) Improve insight into replication WAL readers hung on checkQuota

2020-07-27 Thread Josh Elser (Jira)
Josh Elser created HBASE-24779:
--

 Summary: Improve insight into replication WAL readers hung on 
checkQuota
 Key: HBASE-24779
 URL: https://issues.apache.org/jira/browse/HBASE-24779
 Project: HBase
  Issue Type: Task
Reporter: Josh Elser
Assignee: Josh Elser


Helped a customer this past weekend who, with a large number of RegionServers, 
has some RegionServers which replicated data to a peer without issues while 
other RegionServers did not.

The number of queued logs varied over the past 24hrs in the same manner: at 
times there were spikes into the 100's of queued logs, but at other times only 
1's-10's of logs were queued.

We were able to validate that there were "good" and "bad" RegionServers by 
creating a test table, assigning it to a regionserver, enabling replication on 
that table, and validating if the local puts were replicated to a peer. On a 
good RS, data was replicated immediately. On a bad RS, data was never 
replicated (at least, on the order of 10's of minutes which we waited).

On the "bad RS", we were able to observe that the {{wal-reader}} thread(s) on 
that RS were spending time in a Thread.sleep() in a different location than on 
the good ones. Specifically, they were sitting in 
{{ReplicationSourceWALReader#checkQuota()}}'s sleep call, _not_ the 
{{handleEmptyWALBatch()}} method on the same class.

My only assumption is that, somehow, these RegionServers got into a situation 
where they "allocated" memory from the quota but never freed it. Then, because 
the WAL reader thinks it has no free memory, it blocks indefinitely and there 
are no pending edits to ship and (thus) free that memory. A cursory glance at 
the code gives me a _lot_ of anxiety around places where we don't properly 
clean it up (e.g. batches that fail to ship, dropping a peer). As a first stab, 
let me add some more debugging so we can actually track this state properly for 
the operators and their sanity.
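The invariant at stake is simple: every acquire against the replication buffer quota must be paired with a release, even when a batch fails to ship or a peer is dropped. A toy sketch of that bookkeeping (hypothetical class, not the actual ReplicationSourceWALReader code):

```java
import java.util.concurrent.atomic.AtomicLong;

public class BufferQuota {
    private final long limit;
    private final AtomicLong used = new AtomicLong();

    public BufferQuota(long limit) {
        this.limit = limit;
    }

    // Reserve bytes for a batch. Returns true if the batch fits under the
    // quota; the caller MUST later call release() on every success path AND
    // every failure path, or used() stays inflated forever.
    public boolean tryAcquire(long bytes) {
        long cur = used.addAndGet(bytes);
        if (cur > limit) {
            used.addAndGet(-bytes);  // over quota: undo the reservation
            return false;
        }
        return true;
    }

    public void release(long bytes) {
        used.addAndGet(-bytes);
    }

    public long used() {
        return used.get();
    }
}
```

A reader blocked in checkQuota() is just tryAcquire() failing in a loop; a skipped release() on some failure path is exactly the kind of leak that would explain the hang described above.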





[jira] [Resolved] (HBASE-22146) SpaceQuotaViolationPolicy Disable is not working in Namespace level

2020-07-21 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-22146.

Resolution: Fixed

Thanks for your patch and thoughtful tests, Surbhi!

> SpaceQuotaViolationPolicy Disable is not working in Namespace level
> ---
>
> Key: HBASE-22146
> URL: https://issues.apache.org/jira/browse/HBASE-22146
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha-1, 2.0.0
>Reporter: Uma Maheswari
>Assignee: Surbhi Kochhar
>Priority: Major
>  Labels: Quota, space
> Fix For: 3.0.0-alpha-1, 2.3.1, 2.4.0, 2.2.7
>
>
> SpaceQuotaViolationPolicy Disable is not working in Namespace level
> PFB the steps:
>  * Create Namespace and set Quota violation policy as Disable
>  * Create tables under namespace and violate Quota
> Expected result: Tables to get disabled
> Actual Result: Tables are not getting disabled
> Note: mutation operation is not allowed on the table





[jira] [Resolved] (HBASE-19365) FSTableDescriptors#get() can return null reference, in some cases, it is not checked

2020-06-17 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-19365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-19365.

Hadoop Flags: Reviewed
  Resolution: Fixed

Thanks for the review, [~vjasani]!

> FSTableDescriptors#get() can return null reference, in some cases, it is not 
> checked
> 
>
> Key: HBASE-19365
> URL: https://issues.apache.org/jira/browse/HBASE-19365
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.2.6
>Reporter: Hua Xiang
>Assignee: Josh Elser
>Priority: Minor
> Fix For: 1.7.0
>
>
> In one of our cases, 1.2.0 based master could not start because the null 
> reference is not checked. Master crashed because of the following exception.
> {code}
> 2017-11-20 08:30:20,178 FATAL org.apache.hadoop.hbase.master.HMaster: Failed 
> to become active master
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.master.AssignmentManager.rebuildUserRegions(AssignmentManager.java:2993)
> at 
> org.apache.hadoop.hbase.master.AssignmentManager.joinCluster(AssignmentManager.java:494)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:821)
> at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:192)
> at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1827)
> at java.lang.Thread.run(Thread.java:745)
> {code}





[jira] [Resolved] (HBASE-24235) Java client with IBM JDK does not work if HBase is configured with Kerberos

2020-06-17 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-24235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-24235.

Resolution: Incomplete

> Java client with IBM JDK does not work if HBase is configured with Kerberos
> ---
>
> Key: HBASE-24235
> URL: https://issues.apache.org/jira/browse/HBASE-24235
> Project: HBase
>  Issue Type: Bug
>  Components: Client, java
>Affects Versions: 2.1.0
>Reporter: Mubashir Kazia
>Priority: Major
>
> When a Java HBase client is run with the IBM JDK connecting to an HBase 
> cluster configured with Kerberos authentication, the client fails to connect 
> to HBase. The client uses {{UGI.loginUserFromKeytab(principal, keytab)}} to 
> get a Kerberos ticket, then creates a connection, table, and scanner, and 
> iterates. The code works fine with the Oracle/Open JDK; it fails when run 
> with the IBM JDK.
> Following exception is found in logs with DEBUG level logging:
> {code:java}
> DEBUG client.RpcRetryingCallerImpl: Call exception, tries=6, retries=11, 
> started=4700 ms ago, cancelled=false, msg=Call to 
> nightly6x-1.nightly6x.root.hwx.site/172.27.21.201:22101 failed on local 
> exception: javax.security.sasl.SaslException: Failure to initialize security 
> context [Caused by org.ietf.jgss.GSSException, major code: 13, minor code: 0
> major string: Invalid credentials
> minor string: SubjectCredFinder: no JAAS Subject], details=row 
> 'users,,99' on table 'hbase:meta' at 
> region=hbase:meta,,1.1588230740, 
> hostname=nightly6x-1.nightly6x.root.hwx.site,22101,1587511170413, seqNum=-1, 
> see https://s.apache.org/timeout, 
> exception=javax.security.sasl.SaslException: Call to 
> nightly6x-1.nightly6x.root.hwx.site/172.27.21.201:22101 failed on local 
> exception: javax.security.sasl.SaslException: Failure to initialize security 
> context [Caused by org.ietf.jgss.GSSException, major code: 13, minor code: 0
> major string: Invalid credentials
> minor string: SubjectCredFinder: no JAAS Subject] [Caused by 
> javax.security.sasl.SaslException: Failure to initialize security context 
> [Caused by org.ietf.jgss.GSSException, major code: 13, minor code: 0
> major string: Invalid credentials
> minor string: SubjectCredFinder: no JAAS Subject]]
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:83)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:57)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:437)
> at org.apache.hadoop.hbase.ipc.IPCUtil.wrapException(IPCUtil.java:220)
> at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.onCallFinished(AbstractRpcClient.java:390)
> at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$100(AbstractRpcClient.java:95)
> at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:410)
> at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:406)
> at org.apache.hadoop.hbase.ipc.Call.callComplete(Call.java:103)
> at org.apache.hadoop.hbase.ipc.Call.setException(Call.java:118)
> at 
> org.apache.hadoop.hbase.ipc.BufferCallBeforeInitHandler.userEventTriggered(BufferCallBeforeInitHandler.java:92)
> at 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:326)
> at 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:312)
> at 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireUserEventTriggered(AbstractChannelHandlerContext.java:304)
> at 
> org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline$HeadContext.userEventTriggered(DefaultChannelPipeline.java:1426)
> at 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:326)
> at 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:312)
> at 
> org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline.fireUserEventTriggered(DefaultChannelPipeline.java:924)
> at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection.failInit(NettyRpcConnection.java:179)
> at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection.saslNegotiate(NettyRpcConnection.java:197)
> at 
> 

[jira] [Resolved] (HBASE-23195) FSDataInputStreamWrapper unbuffer can NOT invoke the classes that NOT implements CanUnbuffer but its parents class implements CanUnbuffer

2020-06-12 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-23195.

Hadoop Flags: Reviewed
  Resolution: Fixed

Thanks for the work, [~zhaoyim]! Sorry it took so long to get it committed. 
Thanks for sticking with it.

> FSDataInputStreamWrapper unbuffer can NOT invoke the classes that NOT 
> implements CanUnbuffer but its parents class implements CanUnbuffer 
> --
>
> Key: HBASE-23195
> URL: https://issues.apache.org/jira/browse/HBASE-23195
> Project: HBase
>  Issue Type: Bug
>  Components: io
>Affects Versions: 2.0.2
>Reporter: Zhao Yi Ming
>Assignee: Zhao Yi Ming
>Priority: Critical
> Fix For: 3.0.0-alpha-1, 2.3.0, 2.4.0, 2.2.6
>
>
> FSDataInputStreamWrapper#unbuffer() cannot invoke unbuffer() on classes that 
> do not directly implement CanUnbuffer, even when a parent class implements 
> CanUnbuffer.
> For example: given an interface I1, a class PC1 that implements I1, and a 
> class C1 that extends PC1, FSDataInputStreamWrapper cannot invoke C1's 
> unbuffer() method.
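The root cause of this class of bug is a reflection check that only inspects interfaces declared directly on the concrete class, whereas `instanceof` walks the whole type hierarchy. A self-contained illustration (a stand-in interface, not Hadoop's actual CanUnbuffer):

```java
public class UnbufferCheck {
    // Stand-in for org.apache.hadoop.fs.CanUnbuffer, so this compiles alone.
    interface CanUnbuffer {
        void unbuffer();
    }

    static class Parent implements CanUnbuffer {
        public void unbuffer() { }
    }

    // Child inherits unbuffer() but does NOT re-declare the interface.
    static class Child extends Parent { }

    // Buggy check: Class#getInterfaces() returns only the interfaces
    // declared directly on the class, so Child is missed.
    public static boolean declaresDirectly(Object o) {
        for (Class<?> i : o.getClass().getInterfaces()) {
            if (i == CanUnbuffer.class) {
                return true;
            }
        }
        return false;
    }

    // Correct check: instanceof considers superclasses too.
    public static boolean isUnbufferable(Object o) {
        return o instanceof CanUnbuffer;
    }
}
```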





[jira] [Resolved] (HBASE-19013) Wire up cmake build to maven build

2020-06-10 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-19013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-19013.

Resolution: Invalid

Nah, I don't think so :)

> Wire up cmake build to maven build
> --
>
> Key: HBASE-19013
> URL: https://issues.apache.org/jira/browse/HBASE-19013
> Project: HBase
>  Issue Type: Sub-task
> Environment: {quote}
> Hook up cmake with the general mvn build by providing a pom.xml in the 
> module. calling mvn compile or test with -Pnative should kick in the native 
> build with cmake. See Hadoop's integration for an example.
> {quote}
>Reporter: Josh Elser
>Priority: Major
> Fix For: HBASE-14850
>
>






[jira] [Resolved] (HBASE-24066) Expose shaded clients through WebUI as Maven repository

2020-05-28 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-24066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-24066.

Resolution: Incomplete

Resolving this as I don't see a path forward for the change at this time.

> Expose shaded clients through WebUI as Maven repository
> ---
>
> Key: HBASE-24066
> URL: https://issues.apache.org/jira/browse/HBASE-24066
> Project: HBase
>  Issue Type: Improvement
>  Components: Client
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Minor
>
> Props to [~busbey] for this idea.
> We have a number of shaded jars which are (largely) sufficient for launching 
> any Java application against HBase. However, if users have multiple versions 
> of HBase in their organization, it might be confusing to know "which client" 
> do I need to use? Can we expose our shaded clients from HBase in such a way 
> that build tools can just ask HBase for the client jars they should use?
> The idea here is that we can use embedded Jetty to "fake" out a Maven 
> repository that users can reference from their client applications' builds. 
> We have no extra burden in HBase because we already package these jars. I'll 
> link an example Maven application which uses this "feature".





[jira] [Created] (HBASE-24319) Clearly document how profiles for the sake of Hadoop compatibility work across all branches

2020-05-04 Thread Josh Elser (Jira)
Josh Elser created HBASE-24319:
--

 Summary: Clearly document how profiles for the sake of Hadoop 
compatibility work across all branches
 Key: HBASE-24319
 URL: https://issues.apache.org/jira/browse/HBASE-24319
 Project: HBase
  Issue Type: Task
  Components: build
Reporter: Josh Elser


In HBASE-24280, we investigated a test failure which was ultimately caused by 
the simultaneous activation of the (intended mutually exclusive) hadoop-2 and 
hadoop-3 profiles.

After master has moved to only supporting profile activation via the profile 
itself (rather than a system property) with the removal of the hadoop-2 
profile, the build was inadvertently broken, as all branches (or is it just 2.x 
branches and master?) use the same build/yetus scripts in dev-support.

To make sure that these scripts continue to work against all branches, we need 
to have a clear decision on how profile activation is expected to work in our 
HBase build. Otherwise, we'll come back to this problem where each branch does 
things ever-so-slightly different, requiring a bunch of {{if branch-2; else if 
branch-2.2; else if branch-2.3}} type changes to our yetus scripts.





[jira] [Created] (HBASE-24280) TestSecureRESTServer started failing in nightlies for Hadoop3

2020-04-28 Thread Josh Elser (Jira)
Josh Elser created HBASE-24280:
--

 Summary: TestSecureRESTServer started failing in nightlies for 
Hadoop3
 Key: HBASE-24280
 URL: https://issues.apache.org/jira/browse/HBASE-24280
 Project: HBase
  Issue Type: Sub-task
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: 2.3.0


[~ndimiduk] pointed out that, after this change went in, TestSecureRESTServer 
started failing with Hadoop3 on branch-2.3

https://builds.apache.org/job/HBase%20Nightly/job/branch-2.3/56/

Of course, I ran this with 1.8.0_241 and Maven 3.6.33 and it passed :) {{mvn 
clean package -Dtest=TestSecureRESTServer -Dhadoop.profile=3.0 
-DfailIfNoTests=false}}

FYI [~stoty] in case you can repro a failure and want to dig in. Feel free to 
re-assign.

It looks like we didn't have a nightly run of branch-2.2 due to docker 
container build issues. Will be interesting to see if it fails there. It did 
not fail the master nightly.





[jira] [Resolved] (HBASE-24272) Backport "Implement proxyuser/doAs mechanism for hbase-http" to 2.1.x

2020-04-28 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-24272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-24272.

Resolution: Won't Fix

Crap. Thanks, Peter. Sorry for the time sink, Istvan.

> Backport "Implement proxyuser/doAs mechanism for hbase-http" to 2.1.x
> -
>
> Key: HBASE-24272
> URL: https://issues.apache.org/jira/browse/HBASE-24272
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Josh Elser
>Assignee: Istvan Toth
>Priority: Major
> Fix For: 2.1.10
>
>
> Could you make the modification to your change for branch-2.1 please, 
> [~stoty]?
> If not, let me know and I'll dig in and fix it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-24275) shell命令扫描无法使用 (scan in the shell does not work)

2020-04-28 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-24275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-24275.

Resolution: Invalid

Sorry for not being competent enough to reply in Chinese.

It sounds like you need some help using the HBase shell and converting String 
and byte[]. We reserve Jira for concrete changes to HBase. Your question is 
something best served using the user mailing lists. If you didn't know, there 
is also a user...@hbase.apache.org which is designated for communication in 
Chinese. Please ask for assistance there. Thank you.

https://lists.apache.org/list.html?user...@hbase.apache.org

> shell命令扫描无法使用 (scan in the shell does not work)
> -
>
> Key: HBASE-24275
> URL: https://issues.apache.org/jira/browse/HBASE-24275
> Project: HBase
>  Issue Type: Bug
>  Components: shell
>Affects Versions: 1.2.1
> Environment: hbase(main):001:0> scan 
> 'bill_2_20200216',\{LIMIT=>1,COLUMN=>['info:content:(org.apache.hadoop.hbase.util.Bytes).toString']}
> ROW COLUMN+CELL 
>  18675205573_200A9992000603031_1 column=info:content, 
> timestamp=1584028631493, value=\xE4\xB8\xBB\xE4\xBB\xBB\xEF\xBC\x8C\xE6\x82
>  _021623344100823069 
> \xA8\xE5\xA5\xBD\xE3\x80\x82\xE6\x82\xA8\xE6\x98\x8E\xE5\xA4\xA9\xEF\xBC\x882\xE6\x9C\x8817\xE6\
>  
> x97\xA5\xE6\x98\x9F\xE6\x9C\x9F\xE4\xB8\x80\xEF\xBC\x89\xE4\xB8\x8A\xE5\x8D\x88\xE7\x9A\x84\xE8\
>  
> xA1\x8C\xE7\xA8\x8B\xE5\xAE\x89\xE6\x8E\x92\xE5\xA6\x82\xE4\xB8\x8B\xEF\xBC\x9A\x0A9:30\xE6\x97\
>  
> xB6\xEF\xBC\x8C\xE5\x9C\xA8\xE5\xB8\x82\xE5\x95\x86\xE5\x8A\xA1\xE5\xB1\x805\xE6\xA5\xBC\xE5\xA4
>  
> \xA7\xE4\xBC\x9A\xE8\xAE\xAE\xE5\xAE\xA4\xEF\xBC\x8C\xE6\x94\xB6\xE5\x90\xAC\xE6\x94\xB6\xE7\x9C
>  
> \x8B\xE7\x9C\x81\xE6\x8E\xA8\xE8\xBF\x9B\xE5\xA4\x96\xE8\xB4\xB8\xE4\xBC\x81\xE4\xB8\x9A\xE5\xA4
>  
> \x8D\xE5\xB7\xA5\xE5\xA4\x8D\xE4\xBA\xA7\xE4\xBC\x9A\xE8\xAE\xAE\xE6\x9A\xA8\xE5\xA4\x96\xE8\xB4
>  
> \xB8\xE5\xBD\xA2\xE5\x8A\xBF\xE7\xA0\x94\xE5\x88\xA4\xE4\xBC\x9A\xE3\x80\x82\x0A\xE8\xAF\xB7\xE6
>  
> \x82\xA8\xE7\x9F\xA5\xE6\x82\x89\xE5\xB9\xB6\xE5\x8F\x82\xE5\x8A\xA0\xEF\xBC\x8C\xE8\xB0\xA2\xE8
>  
> \xB0\xA2\xE3\x80\x82\xEF\xBC\x88\xE8\x81\x94\xE7\xB3\xBB\xE4\xBA\xBA\xEF\xBC\x9A\xE6\x9D\xA8\xE4
>  
> \xBC\x9F\xE5\xBC\xBA\xEF\xBC\x8C\xE8\x81\x94\xE7\xB3\xBB\xE7\x94\xB5\xE8\xAF\x9D\xEF\xBC\x9A1831
>  8286014\xEF\xBC\x89 
> 1 row(s) in 0.4480 seconds
> hbase(main):002:0> get 
> 'bill_2_20200216','910010_13436291012_1_021617402200563234','info:content:toString'
> COLUMN CELL 
> 0 row(s) in 0.0610 seconds
> hbase(main):003:0> get 
> 'bill_2_20200216','18675205573_200A9992000603031_1_021623344100823069','info:content:toString'
> COLUMN CELL 
> 0 row(s) in 0.0170 seconds
> hbase(main):004:0> get 'bill_2_20200216',' 
> 18675205573_200A9992000603031_1_021623344100823069','info:content:toString'
> COLUMN CELL 
>  info:content timestamp=1584028631493, value=主任,您好。您明天(2月17日星期一)上午的行
>  安排如下:
> 9:30时,在市商务局5楼大会议室,收听收看省推进外贸企业
>  工复产会议暨外贸形势研判会。
> 请您知悉并参加,谢谢。(联系人:杨
>  ¼强,联系电话:18318286014)
>Reporter: lacsar
>Priority: Minor
>
> When querying a field value from the shell command line, I want to convert the bytes to a String. With the scan command the converter has no effect and the result is still bytes, but the get command can convert to a String.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
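
For context on the escaped output above: the shell prints non-printable cell bytes as `\xNN` escapes, and a converter such as `(org.apache.hadoop.hbase.util.Bytes).toString` decodes those same bytes as UTF-8. A minimal, HBase-free sketch of that decoding step:

```java
import java.nio.charset.StandardCharsets;

public class DecodeCellDemo {
    public static void main(String[] args) {
        // The bytes the shell printed as \xE4\xB8\xBB are UTF-8; decoding
        // them (which is what Bytes.toString effectively does) recovers the
        // original character.
        byte[] cell = {(byte) 0xE4, (byte) 0xB8, (byte) 0xBB};
        System.out.println(new String(cell, StandardCharsets.UTF_8)); // prints 主
    }
}
```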


[jira] [Resolved] (HBASE-24252) Implement proxyuser/doAs mechanism for hbase-http

2020-04-27 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-24252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-24252.

Hadoop Flags: Reviewed
  Resolution: Fixed

Spun out a child-task for the 2.1.x backport.

Wrote some release notes for you, Istvan. Happy to modify if you have better.

Thanks for a very nice change!

> Implement proxyuser/doAs mechanism for hbase-http
> -
>
> Key: HBASE-24252
> URL: https://issues.apache.org/jira/browse/HBASE-24252
> Project: HBase
>  Issue Type: Improvement
>Reporter: Istvan Toth
>Assignee: Istvan Toth
>Priority: Major
> Fix For: 3.0.0, 2.3.0, 2.4.0, 2.2.5
>
>
> The REST and Thrift interfaces for HBase already implement the standard 
> hadoop ProxyUser mechanism for SPNEGO, but it is not implemented in 
> hbase-httpserver.
> Implement it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-24272) Backport "Implement proxyuser/doAs mechanism for hbase-http" to 2.1.x

2020-04-27 Thread Josh Elser (Jira)
Josh Elser created HBASE-24272:
--

 Summary: Backport "Implement proxyuser/doAs mechanism for 
hbase-http" to 2.1.x
 Key: HBASE-24272
 URL: https://issues.apache.org/jira/browse/HBASE-24272
 Project: HBase
  Issue Type: Sub-task
Reporter: Josh Elser
Assignee: Istvan Toth
 Fix For: 2.1.10


Could you make the modification to your change for branch-2.1 please, [~stoty]?

If not, let me know and I'll dig in and fix it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-24179) 2.x backports for HBASE-23881

2020-04-13 Thread Josh Elser (Jira)
Josh Elser created HBASE-24179:
--

 Summary: 2.x backports for HBASE-23881
 Key: HBASE-24179
 URL: https://issues.apache.org/jira/browse/HBASE-24179
 Project: HBase
  Issue Type: Sub-task
Reporter: Josh Elser
Assignee: Josh Elser






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-23881) TestShadeSaslAuthenticationProvider failures

2020-04-13 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-23881.

Hadoop Flags: Reviewed
  Resolution: Fixed

Forgot to close this out.

Thanks Duo and Bharath for your reviews.

> TestShadeSaslAuthenticationProvider failures
> 
>
> Key: HBASE-23881
> URL: https://issues.apache.org/jira/browse/HBASE-23881
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0, 2.3.0
>Reporter: Bharath Vissapragada
>Assignee: Josh Elser
>Priority: Major
> Fix For: 3.0.0
>
>
> TestShadeSaslAuthenticationProvider now fails deterministically with the 
> following exception..
> {noformat}
> java.lang.Exception: Unexpected exception, 
> expected but 
> was
>   at 
> org.apache.hadoop.hbase.security.provider.example.TestShadeSaslAuthenticationProvider.testNegativeAuthentication(TestShadeSaslAuthenticationProvider.java:233)
> {noformat}
> The test now fails at a different place than before merging HBASE-18095 because 
> the RPCs are also a part of connection setup. We might need to rewrite the 
> test.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-23562) [operator tools] Add a RegionsMerge tool that allows for merging multiple adjacent regions until a desired number of regions is reached.

2020-04-10 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-23562.

Fix Version/s: hbase-operator-tools-1.1.0
 Hadoop Flags: Reviewed
   Resolution: Fixed

Big thanks to [~bszabolcs] for completing this while Wellington is out.

Great work, both of you.

> [operator tools] Add a RegionsMerge tool that allows for merging multiple 
> adjacent regions until a desired number of regions is reached.
> 
>
> Key: HBASE-23562
> URL: https://issues.apache.org/jira/browse/HBASE-23562
> Project: HBase
>  Issue Type: New Feature
>  Components: hbase-operator-tools
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Minor
> Fix For: hbase-operator-tools-1.1.0
>
>
> There have been a few occasions where different customers faced the need to 
> reduce the number of regions for specific tables, either due to a mistaken 
> pre-split or after a purge of table data. This jira is for adding a simple 
> merge tool that takes the table name and the desired number of regions, then 
> performs region merges until the total number of regions for the given table 
> reaches the passed value. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
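
Each merge of two adjacent regions reduces the region count by one, so going from N regions to a target of T takes N - T merges. A minimal sketch of that loop (region names and the merge bookkeeping are illustrative; the real tool reads adjacent regions from hbase:meta and issues admin merge calls):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MergeCountDemo {
    public static void main(String[] args) {
        // Illustrative region list; a real tool would discover these from
        // hbase:meta and merge via the Admin API.
        List<String> regions =
            new ArrayList<>(Arrays.asList("r1", "r2", "r3", "r4", "r5"));
        int target = 2;
        int merges = 0;
        while (regions.size() > target) {
            // Merge the first adjacent pair; each merge drops the count by one.
            String merged = regions.get(0) + "+" + regions.remove(1);
            regions.set(0, merged);
            merges++;
        }
        System.out.println(merges + " merges -> " + regions);
        // prints: 3 merges -> [r1+r2+r3+r4, r5]
    }
}
```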


[jira] [Resolved] (HBASE-20919) meta region can't be re-onlined when restarting cluster if opening rsgroup

2020-04-01 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-20919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-20919.

Resolution: Incomplete

> meta region can't be re-onlined when restarting cluster if opening rsgroup
> --
>
> Key: HBASE-20919
> URL: https://issues.apache.org/jira/browse/HBASE-20919
> Project: HBase
>  Issue Type: Bug
>  Components: Balancer, master, rsgroup
>Affects Versions: 2.0.1
>Reporter: chenyang
>Assignee: ChenYang
>Priority: Major
>  Labels: balancer
> Attachments: HBASE-20919-branch-2.0-01.patch, 
> HBASE-20919-branch-2.0-02.patch, HBASE-20919-branch-2.0-02.patch, bug2.png, 
> hbase-hbase-master-bjpg-rs4729.yz02.no_02patch.log, 
> hbase-hbase-master-bjpg-rs4729.yz02.with_02patch.log, 
> hbase-hbase-master-bjpg-rs4730.yz02.log.test
>
>
> If you enable rsgroup, hbase-site.xml contains the configuration below.
> {code:java}
> <property>
>   <name>hbase.coprocessor.master.classes</name>
>   <value>org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint</value>
> </property>
> <property>
>   <name>hbase.master.loadbalancer.class</name>
>   <value>org.apache.hadoop.hbase.rsgroup.RSGroupBasedLoadBalancer</value>
> </property>
> {code}
> And you shut down the whole HBase cluster in the way:
>  # first shut down region server one by one
>  # shut down master
> Then you restart whole cluster in the way:
>  # start master
>  # start regionserver
> The hbase:meta region cannot come back online and the rsgroup cannot be 
> initialized successfully.
>  master logs:
> {code:java}
> 2018-07-12 18:27:08,775 INFO 
> [org.apache.hadoop.hbase.rsgroup.RSGroupInfoManagerImpl$RSGroupStartupWorker-bjpg-rs4730.yz02,16000,1531389637409]
>  rsgroup.RSGro
> upInfoManagerImpl$RSGroupStartupWorker: Waiting for catalog tables to come 
> online
> 2018-07-12 18:27:08,876 INFO 
> [org.apache.hadoop.hbase.rsgroup.RSGroupInfoManagerImpl$RSGroupStartupWorker-bjpg-rs4730.yz02,16000,1531389637409]
>  zookeeper.Met
> aTableLocator: Failed verification of hbase:meta,,1 at 
> address=bjpg-rs4732.yz02,60020,1531388712053, 
> exception=org.apache.hadoop.hbase.NotServingRegionExcepti
> on: hbase:meta,,1 is not online on bjpg-rs4732.yz02,60020,1531389727928
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3249)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3226)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1414)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegionInfo(RSRpcServices.java:1729)
> at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:28286)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> {code}
> The logs show that hbase:meta region is not online and rsgroup keeps retrying 
> to initialize.
>   
>  But why is the hbase:meta region not online?
>  The info-level logs and jstack did not have enough information, so I added 
> some debug logs to the test source code. I then checked the master's and 
> region server's logs and found that the meta region assign procedure, which 
> held the meta region lock, never completed and never released the lock, so 
> the recoverMetaProcedure could not be executed. 
>   
>  Why did the first procedure not complete and not release the meta region lock?
>  In the test logs, I found that when the AssignmentManager assigned the region, 
> it needed to call the rsgroup balancer, which had not been fully initialized, 
> so it threw an NPE. As a result, the procedure never completed and never 
> released the lock.
> {code:java}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.rsgroup.RSGroupBasedLoadBalancer.generateGroupMaps(RSGroupBasedLoadBalancer.java:262)
> at 
> org.apache.hadoop.hbase.rsgroup.RSGroupBasedLoadBalancer.roundRobinAssignment(RSGroupBasedLoadBalancer.java:162)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.processAssignmentPlans(AssignmentManager.java:1864)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.processAssignQueue(AssignmentManager.java:1809)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.access$400(AssignmentManager.java:113)
> at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager$2.run(AssignmentManager.java:1693)
> {code}
> !bug2.png!
> As shown in the figure bug2.png in the attachments, when we shut down the 
> last region server, the master submits a ServerCrashProcedure. In the 
> procedure, it will reassign the hbase:meta region, but at that moment, 

[jira] [Created] (HBASE-24066) Expose shaded clients through WebUI as Maven repository

2020-03-26 Thread Josh Elser (Jira)
Josh Elser created HBASE-24066:
--

 Summary: Expose shaded clients through WebUI as Maven repository
 Key: HBASE-24066
 URL: https://issues.apache.org/jira/browse/HBASE-24066
 Project: HBase
  Issue Type: Improvement
  Components: Client
Reporter: Josh Elser
Assignee: Josh Elser


Props to [~busbey] for this idea.

We have a number of shaded jars which are (largely) sufficient for launching 
any Java application against HBase. However, if users have multiple versions of 
HBase in their organization, it might be confusing to know which client to 
use. Can we expose our shaded clients from HBase in such a way that 
build tools can just ask HBase for the client jars they should use?

The idea here is that we can use embedded Jetty to "fake" out a Maven 
repository that users can put in their client applications. We have no extra 
burden from HBase because we already are packaging these jars. I'll link an 
example Maven application which uses this "feature".



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
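
A client build could then consume that endpoint like any other Maven repository. A hypothetical pom.xml fragment (the host, port path, and repository id are invented for illustration; the issue does not fix an actual endpoint, though 16010 is the default master info-server port):

```xml
<!-- Hypothetical: the /maven path is illustrative only. -->
<repositories>
  <repository>
    <id>hbase-cluster-client</id>
    <url>http://hbase-master.example.com:16010/maven</url>
  </repository>
</repositories>
```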


[jira] [Created] (HBASE-23774) Announce user-zh list

2020-01-30 Thread Josh Elser (Jira)
Josh Elser created HBASE-23774:
--

 Summary: Announce user-zh list
 Key: HBASE-23774
 URL: https://issues.apache.org/jira/browse/HBASE-23774
 Project: HBase
  Issue Type: Task
  Components: website
 Environment: A
Reporter: Josh Elser
Assignee: Josh Elser


Let folks know about the new user-zh list that is dedicated to user questions 
in Chinese (as opposed to the norm of English on user).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-23768) Backport to 1.x

2020-01-29 Thread Josh Elser (Jira)
Josh Elser created HBASE-23768:
--

 Summary: Backport to 1.x
 Key: HBASE-23768
 URL: https://issues.apache.org/jira/browse/HBASE-23768
 Project: HBase
  Issue Type: Sub-task
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: 1.6.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-17115) HMaster/HRegion Info Server does not honour admin.acl

2020-01-29 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-17115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-17115.

Hadoop Flags: Reviewed
Release Note: 
Implements authorization for the HBase Web UI by limiting access to certain 
endpoints which could be used to extract sensitive information from HBase.

Access to these restricted endpoints can be limited to a group of 
administrators, identified either by a list of users 
(hbase.security.authentication.spnego.admin.users) or by a list of groups
(hbase.security.authentication.spnego.admin.groups).  By default, neither of 
these values are set which will preserve backwards compatibility (allowing all 
authenticated users to access all endpoints).

Further, users who have sensitive information in the HBase service 
configuration can set hbase.security.authentication.ui.config.protected to true 
which will treat the configuration endpoint as a protected, admin-only 
resource. By default, all authenticated users may access the configuration 
endpoint.
  Resolution: Fixed

PreCommit on 1.x looks like it's busted. Resolving this for now and will 
revisit a 1.x backport when I can figure out what's going on with precommit.

> HMaster/HRegion Info Server does not honour admin.acl
> -
>
> Key: HBASE-17115
> URL: https://issues.apache.org/jira/browse/HBASE-17115
> Project: HBase
>  Issue Type: Bug
>Reporter: Mohammad Arshad
>Assignee: Josh Elser
>Priority: Major
> Fix For: 3.0.0, 2.3.0, 2.2.3, 2.1.9
>
>
> Currently there is no way to enable protected URLs like /jmx,  /conf  only 
> for admins. This is applicable for both Master and RegionServer.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
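
The three property names come from the release note above; the values shown below are placeholders, and the comma-separated list format is an assumption:

```xml
<property>
  <name>hbase.security.authentication.spnego.admin.users</name>
  <value>alice,bob</value> <!-- placeholder admin users -->
</property>
<property>
  <name>hbase.security.authentication.spnego.admin.groups</name>
  <value>hbase-admins</value> <!-- placeholder admin group -->
</property>
<property>
  <name>hbase.security.authentication.ui.config.protected</name>
  <value>true</value> <!-- treat the configuration endpoint as admin-only -->
</property>
```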


[jira] [Created] (HBASE-23765) Use 127.0.0.1 instead of localhost when setting up Kerberos-related tests

2020-01-29 Thread Josh Elser (Jira)
Josh Elser created HBASE-23765:
--

 Summary: Use 127.0.0.1 instead of localhost when setting up 
Kerberos-related tests
 Key: HBASE-23765
 URL: https://issues.apache.org/jira/browse/HBASE-23765
 Project: HBase
  Issue Type: Task
  Components: test
Reporter: Josh Elser


[~ndimiduk] gave an ask over on HBASE-23760 to change some of the Hadoop-level 
configuration properties around secure cluster setup from localhost to 
127.0.0.1. He and [~bharathv] have been chasing some issues with ZooKeeper 
when localhost does not resolve to 127.0.0.1.

Before I start making a change, how sure are we that this is an issue? Assuming 
that it's the nightlies that we see these on, how about we make a change to 
increase the krb5 and spnego debugging to see if we aren't resolving names 
properly?

There might also be a debug property for DNS lookups in Java.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
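
A quick, HBase-free way to see what a JVM actually resolves "localhost" to, which is relevant because Kerberos principal matching is sensitive to exactly this resolution (the krb5 debugging mentioned above is typically enabled with the JDK-specific `-Dsun.security.krb5.debug=true` flag):

```java
import java.net.InetAddress;

public class LocalhostResolveDemo {
    public static void main(String[] args) throws Exception {
        // On a misconfigured host this may print ::1 or something
        // unexpected, which is the class of problem described above.
        InetAddress addr = InetAddress.getByName("localhost");
        System.out.println("localhost -> " + addr.getHostAddress());
    }
}
```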


[jira] [Created] (HBASE-23760) (2.1) Helper method to configure secure DFS cluster for tests

2020-01-28 Thread Josh Elser (Jira)
Josh Elser created HBASE-23760:
--

 Summary: (2.1) Helper method to configure secure DFS cluster for 
tests
 Key: HBASE-23760
 URL: https://issues.apache.org/jira/browse/HBASE-23760
 Project: HBase
  Issue Type: Task
  Components: security, test
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: 2.1.9


Let's get [~weichiu]'s HBASE-20950 onto branch-2.1. It will make the backport 
for HBASE-17115 that much easier.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-23722) TestCustomSaslAuthenticationProvider failing in nightlies

2020-01-23 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-23722.

Fix Version/s: 2.3.0
   3.0.0
 Hadoop Flags: Reviewed
   Resolution: Fixed

> TestCustomSaslAuthenticationProvider failing in nightlies
> -
>
> Key: HBASE-23722
> URL: https://issues.apache.org/jira/browse/HBASE-23722
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 3.0.0, 2.3.0
>
>
> {noformat}
> 2020-01-22 21:15:57,250 DEBUG 
> [hconnection-0x5f0ea4c6-metaLookup-shared--pool15-t14] 
> client.RpcRetryingCallerImpl(132): Call exception, tries=10, retries=16, 
> started=38409 ms ago, cancelled=false, msg=Call to 
> a8b44f950ced/172.17.0.3:42595 failed on local exception: java.io.IOException: 
> java.lang.NullPointerException, details=row 
> 'testPositiveAuthentication,r1,99' on table 'hbase:meta' at 
> region=hbase:meta,,1.1588230740, hostname=a8b44f950ced,42595,1579726988645, 
> seqNum=-1, see https://s.apache.org/timeout, exception=java.io.IOException: 
> Call to a8b44f950ced/172.17.0.3:42595 failed on local exception: 
> java.io.IOException: java.lang.NullPointerException
>   at sun.reflect.GeneratedConstructorAccessor40.newInstance(Unknown 
> Source)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at org.apache.hadoop.hbase.ipc.IPCUtil.wrapException(IPCUtil.java:220)
>   at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.onCallFinished(AbstractRpcClient.java:378)
>   at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$100(AbstractRpcClient.java:91)
>   at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:409)
>   at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:405)
>   at org.apache.hadoop.hbase.ipc.Call.callComplete(Call.java:117)
>   at org.apache.hadoop.hbase.ipc.Call.setException(Call.java:132)
>   at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.callMethod(AbstractRpcClient.java:422)
>   at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:316)
>   at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$200(AbstractRpcClient.java:91)
>   at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:571)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:42810)
>   at 
> org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:332)
>   at 
> org.apache.hadoop.hbase.client.ScannerCallable.rpcCall(ScannerCallable.java:242)
>   at 
> org.apache.hadoop.hbase.client.ScannerCallable.rpcCall(ScannerCallable.java:58)
>   at 
> org.apache.hadoop.hbase.client.RegionServerCallable.call(RegionServerCallable.java:127)
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithoutRetries(RpcRetryingCallerImpl.java:192)
>   at 
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:396)
>   at 
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:370)
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:107)
>   at 
> org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:80)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: java.lang.NullPointerException
>   at org.apache.hadoop.hbase.ipc.IPCUtil.toIOE(IPCUtil.java:154)
>   ... 17 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hbase.security.provider.BuiltInProviderSelector.selectProvider(BuiltInProviderSelector.java:128)
>   at 
> org.apache.hadoop.hbase.security.provider.TestCustomSaslAuthenticationProvider$InMemoryProviderSelector.selectProvider(TestCustomSaslAuthenticationProvider.java:390)
>   at 
> org.apache.hadoop.hbase.security.provider.SaslClientAuthenticationProviders.selectProvider(SaslClientAuthenticationProviders.java:214)
>   at 
> org.apache.hadoop.hbase.ipc.RpcConnection.<init>(RpcConnection.java:106)
>   at 
> org.apache.hadoop.hbase.ipc.BlockingRpcConnection.<init>(BlockingRpcConnection.java:219)
> 

[jira] [Created] (HBASE-23722) TestCustomSaslAuthenticationProvider failing in nightlies

2020-01-22 Thread Josh Elser (Jira)
Josh Elser created HBASE-23722:
--

 Summary: TestCustomSaslAuthenticationProvider failing in nightlies
 Key: HBASE-23722
 URL: https://issues.apache.org/jira/browse/HBASE-23722
 Project: HBase
  Issue Type: Sub-task
Reporter: Josh Elser
Assignee: Josh Elser


{noformat}
2020-01-22 21:15:57,250 DEBUG 
[hconnection-0x5f0ea4c6-metaLookup-shared--pool15-t14] 
client.RpcRetryingCallerImpl(132): Call exception, tries=10, retries=16, 
started=38409 ms ago, cancelled=false, msg=Call to 
a8b44f950ced/172.17.0.3:42595 failed on local exception: java.io.IOException: 
java.lang.NullPointerException, details=row 
'testPositiveAuthentication,r1,99' on table 'hbase:meta' at 
region=hbase:meta,,1.1588230740, hostname=a8b44f950ced,42595,1579726988645, 
seqNum=-1, see https://s.apache.org/timeout, exception=java.io.IOException: 
Call to a8b44f950ced/172.17.0.3:42595 failed on local exception: 
java.io.IOException: java.lang.NullPointerException
at sun.reflect.GeneratedConstructorAccessor40.newInstance(Unknown 
Source)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.hbase.ipc.IPCUtil.wrapException(IPCUtil.java:220)
at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient.onCallFinished(AbstractRpcClient.java:378)
at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$100(AbstractRpcClient.java:91)
at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:409)
at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:405)
at org.apache.hadoop.hbase.ipc.Call.callComplete(Call.java:117)
at org.apache.hadoop.hbase.ipc.Call.setException(Call.java:132)
at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient.callMethod(AbstractRpcClient.java:422)
at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:316)
at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$200(AbstractRpcClient.java:91)
at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:571)
at 
org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:42810)
at 
org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:332)
at 
org.apache.hadoop.hbase.client.ScannerCallable.rpcCall(ScannerCallable.java:242)
at 
org.apache.hadoop.hbase.client.ScannerCallable.rpcCall(ScannerCallable.java:58)
at 
org.apache.hadoop.hbase.client.RegionServerCallable.call(RegionServerCallable.java:127)
at 
org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithoutRetries(RpcRetryingCallerImpl.java:192)
at 
org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:396)
at 
org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:370)
at 
org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:107)
at 
org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:80)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: java.lang.NullPointerException
at org.apache.hadoop.hbase.ipc.IPCUtil.toIOE(IPCUtil.java:154)
... 17 more
Caused by: java.lang.NullPointerException
at 
org.apache.hadoop.hbase.security.provider.BuiltInProviderSelector.selectProvider(BuiltInProviderSelector.java:128)
at 
org.apache.hadoop.hbase.security.provider.TestCustomSaslAuthenticationProvider$InMemoryProviderSelector.selectProvider(TestCustomSaslAuthenticationProvider.java:390)
at 
org.apache.hadoop.hbase.security.provider.SaslClientAuthenticationProviders.selectProvider(SaslClientAuthenticationProviders.java:214)
at 
org.apache.hadoop.hbase.ipc.RpcConnection.<init>(RpcConnection.java:106)
at 
org.apache.hadoop.hbase.ipc.BlockingRpcConnection.<init>(BlockingRpcConnection.java:219)
at 
org.apache.hadoop.hbase.ipc.BlockingRpcClient.createConnection(BlockingRpcClient.java:72)
at 
org.apache.hadoop.hbase.ipc.BlockingRpcClient.createConnection(BlockingRpcClient.java:38)
at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient.getConnection(AbstractRpcClient.java:350)
at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient.callMethod(AbstractRpcClient.java:419)
... 16 more
 

[jira] [Resolved] (HBASE-23701) Make sure HBaseClassTestRule doesn't suffer same issue as HBaseClassTestRuleChecker

2020-01-17 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-23701.

Hadoop Flags: Reviewed
  Resolution: Fixed

Thanks for the reviews, Bharath and Viraj!

> Make sure HBaseClassTestRule doesn't suffer same issue as 
> HBaseClassTestRuleChecker
> ---
>
> Key: HBASE-23701
> URL: https://issues.apache.org/jira/browse/HBASE-23701
> Project: HBase
>  Issue Type: Bug
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Minor
> Fix For: 3.0.0, 2.3.0, 2.2.3, 2.1.9
>
>
> [~bharathv] pointed out on HBASE-23695 
> ([https://github.com/apache/hbase/pull/1052]) that HBaseClassTestRule suffers 
> the same potential bug that I fixed in HBASE-23695 for 
> HBaseClassTestRuleChecker. Make sure the fix is in both places.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
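
The HBASE-23695 failure mode referenced here is an `ArrayIndexOutOfBoundsException: 0` when a test class's category array is read without a guard. A self-contained sketch of that class of bug and the graceful check (the `Category` annotation below is a local stand-in for JUnit's, so the example runs on its own):

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

public class CategoryCheckDemo {
    // Stand-in for org.junit.experimental.categories.Category.
    @Retention(RetentionPolicy.RUNTIME)
    @interface Category { Class<?>[] value(); }

    @Category({}) // a test class whose category array is empty
    static class MissingCategoryTest {}

    public static void main(String[] args) {
        Class<?>[] values =
            MissingCategoryTest.class.getAnnotation(Category.class).value();
        // Reading values[0] blindly would throw
        // ArrayIndexOutOfBoundsException: 0; failing gracefully means
        // checking the array length first.
        if (values.length == 0) {
            System.out.println("Test class is missing a category");
        } else {
            System.out.println(values[0].getSimpleName());
        }
    }
}
```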


[jira] [Created] (HBASE-23701) Make sure HBaseClassTestRule doesn't suffer same issue as HBaseClassTestRuleChecker

2020-01-16 Thread Josh Elser (Jira)
Josh Elser created HBASE-23701:
--

 Summary: Make sure HBaseClassTestRule doesn't suffer same issue as 
HBaseClassTestRuleChecker
 Key: HBASE-23701
 URL: https://issues.apache.org/jira/browse/HBASE-23701
 Project: HBase
  Issue Type: Bug
Reporter: Josh Elser
Assignee: Josh Elser


[~bharathv] pointed out on HBASE-23695 
([https://github.com/apache/hbase/pull/1052]) that HBaseClassTestRule suffers 
the same potential bug that I fixed in HBASE-23695 for 
HBaseClassTestRuleChecker. Make sure the fix is in both places.

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-23347) Pluggable RPC authentication

2020-01-16 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-23347.

Fix Version/s: 2.3.0
 Hadoop Flags: Reviewed
 Release Note: 
This change introduces an internal abstraction layer which allows for new 
SASL-based authentication mechanisms to be used inside HBase services. All 
existing SASL-based authentication mechanism were ported to the new 
abstraction, making no external change in runtime semantics, client API, or RPC 
serialization format.

Developers familiar with extending HBase can implement authentication mechanism 
beyond simple Kerberos and DelegationTokens which authenticate HBase users 
against some other user database. HBase service authentication (Master to/from 
RegionServer) continue to operate solely over Kerberos.
   Resolution: Fixed

Pushed to branch-2 and master. Thanks to everyone who played a part in 
reviewing this.

> Pluggable RPC authentication
> 
>
> Key: HBASE-23347
> URL: https://issues.apache.org/jira/browse/HBASE-23347
> Project: HBase
>  Issue Type: Improvement
>  Components: rpc, security
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 3.0.0, 2.3.0
>
>
> Today in HBase, we rely on SASL to implement Kerberos and delegation token 
> authentication. The RPC client and server logic is very tightly coupled to 
> our three authentication mechanism (the previously two mentioned plus simple 
> auth'n) for no good reason (other than "that's how it was built", best as I 
> can tell).
> SASL's function is to decouple the "application" from how a request is being 
> authenticated, which means that, to support a variety of other authentication 
> approaches, we just need to be a little more flexible in letting developers 
> create their own authentication mechanism for HBase.
> This is less for the "average joe" user to write their own authentication 
> plugin (eek), but more to allow us HBase developers to start iterating, see 
> what is possible.
> I'll attach a full write-up on what I have today as to how I think we can add 
> these abstractions, as well as an initial implementation of this idea, with a 
> unit test that shows an end-to-end authentication solution against HBase.
> cc/ [~wchevreuil] as he's been working with me behind the scenes, giving lots 
> of great feedback and support.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-23695) Fail more gracefully when test class is missing Category

2020-01-16 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-23695.

Fix Version/s: 2.1.9
   2.2.3
   2.3.0
   3.0.0
 Hadoop Flags: Reviewed
   Resolution: Fixed

> Fail more gracefully when test class is missing Category
> 
>
> Key: HBASE-23695
> URL: https://issues.apache.org/jira/browse/HBASE-23695
> Project: HBase
>  Issue Type: Bug
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Minor
> Fix For: 3.0.0, 2.3.0, 2.2.3, 2.1.9
>
>
> When a test class is missing a category, you might see an error such as:
> {noformat}
> [ERROR] Test mechanism  Time elapsed: 0.305 s  <<< ERROR!
> java.lang.ArrayIndexOutOfBoundsException: 0[ERROR] Test mechanism  Time 
> elapsed: 0.102 s  <<< ERROR!
> java.lang.ArrayIndexOutOfBoundsException: 0[ERROR] Test mechanism  Time 
> elapsed: 0.103 s  <<< ERROR!
> java.lang.ArrayIndexOutOfBoundsException: 0[ERROR] Test mechanism  Time 
> elapsed: 0.102 s  <<< ERROR!
> java.lang.ArrayIndexOutOfBoundsException: 0[ERROR] Test mechanism  Time 
> elapsed: 0.098 s  <<< ERROR!
> java.lang.ArrayIndexOutOfBoundsException: 0[ERROR] Test mechanism :: 0 
> {noformat}
> You have to dig into the dump file to find out the actual error was:
> {noformat}
> org.apache.maven.surefire.testset.TestSetFailedException: Test mechanism :: 0
>         at 
> org.apache.maven.surefire.common.junit4.JUnit4RunListener.rethrowAnyTestMechanismFailures(JUnit4RunListener.java:192)
>         at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:167)
>         at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:377)
>         at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:138)
>         at 
> org.apache.maven.surefire.booter.ForkedBooter.run(ForkedBooter.java:465)
>         at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:451)
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
>         at 
> org.apache.hadoop.hbase.HBaseClassTestRuleChecker.testStarted(HBaseClassTestRuleChecker.java:44)
>         at 
> org.junit.runner.notification.RunNotifier$5.notifyListener(RunNotifier.java:156)
>         at 
> org.junit.runner.notification.RunNotifier$SafeNotifier.run(RunNotifier.java:72)
>         at 
> org.junit.runner.notification.RunNotifier.fireTestStarted(RunNotifier.java:153)
>         at 
> org.apache.maven.surefire.common.junit4.Notifier.fireTestStarted(Notifier.java:100)
>         at 
> org.junit.internal.runners.model.EachTestNotifier.fireTestStarted(EachTestNotifier.java:42)
>         at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:364)
>         at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>         at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>         at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>         at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>         at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>         at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>         at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>         at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>         at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
>         at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>         at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>         at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>         at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>         ... 4 more {noformat}
> We can fix this up to get a proper exception thrown.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-23695) Fail more gracefully when test class is missing Category

2020-01-15 Thread Josh Elser (Jira)
Josh Elser created HBASE-23695:
--

 Summary: Fail more gracefully when test class is missing Category
 Key: HBASE-23695
 URL: https://issues.apache.org/jira/browse/HBASE-23695
 Project: HBase
  Issue Type: Bug
Reporter: Josh Elser


When a test class is missing a category, you might see an error such as:
{noformat}
[ERROR] Test mechanism  Time elapsed: 0.305 s  <<< ERROR!
java.lang.ArrayIndexOutOfBoundsException: 0[ERROR] Test mechanism  Time 
elapsed: 0.102 s  <<< ERROR!
java.lang.ArrayIndexOutOfBoundsException: 0[ERROR] Test mechanism  Time 
elapsed: 0.103 s  <<< ERROR!
java.lang.ArrayIndexOutOfBoundsException: 0[ERROR] Test mechanism  Time 
elapsed: 0.102 s  <<< ERROR!
java.lang.ArrayIndexOutOfBoundsException: 0[ERROR] Test mechanism  Time 
elapsed: 0.098 s  <<< ERROR!
java.lang.ArrayIndexOutOfBoundsException: 0[ERROR] Test mechanism :: 0 
{noformat}
You have to dig into the dump file to find out the actual error was:
{noformat}
org.apache.maven.surefire.testset.TestSetFailedException: Test mechanism :: 0
        at 
org.apache.maven.surefire.common.junit4.JUnit4RunListener.rethrowAnyTestMechanismFailures(JUnit4RunListener.java:192)
        at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:167)
        at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:377)
        at 
org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:138)
        at 
org.apache.maven.surefire.booter.ForkedBooter.run(ForkedBooter.java:465)
        at 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:451)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
        at 
org.apache.hadoop.hbase.HBaseClassTestRuleChecker.testStarted(HBaseClassTestRuleChecker.java:44)
        at 
org.junit.runner.notification.RunNotifier$5.notifyListener(RunNotifier.java:156)
        at 
org.junit.runner.notification.RunNotifier$SafeNotifier.run(RunNotifier.java:72)
        at 
org.junit.runner.notification.RunNotifier.fireTestStarted(RunNotifier.java:153)
        at 
org.apache.maven.surefire.common.junit4.Notifier.fireTestStarted(Notifier.java:100)
        at 
org.junit.internal.runners.model.EachTestNotifier.fireTestStarted(EachTestNotifier.java:42)
        at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:364)
        at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
        at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
        at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
        at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
        at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
        at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
        at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
        at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
        at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
        at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
        at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
        at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
        at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
        ... 4 more {noformat}
We can fix this up to get a proper exception thrown.
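The actual HBase patch lives in HBaseClassTestRuleChecker, but the shape of the fix can be sketched independently: index `[0]` of the Category value array only after verifying the annotation is present, so a missing `@Category` produces a descriptive exception instead of `ArrayIndexOutOfBoundsException: 0`. The annotation and class names below are made up for the demo, not copied from HBase or JUnit.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Minimal sketch (not the actual HBase patch): check the annotation before
// dereferencing value()[0].
public class CategoryCheckSketch {

  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.TYPE)
  @interface Category {
    Class<?>[] value();
  }

  static Class<?> firstCategory(Class<?> testClass) {
    Category c = testClass.getAnnotation(Category.class);
    if (c == null || c.value().length == 0) {
      // Descriptive failure instead of ArrayIndexOutOfBoundsException: 0
      throw new IllegalStateException(
          "Test class " + testClass.getName() + " is missing a @Category annotation");
    }
    return c.value()[0];  // safe: length checked above
  }

  interface SmallTests {}

  @Category(SmallTests.class)
  static class GoodTest {}

  static class BadTest {}  // no @Category: previously an AIOOBE, now a clear message

  public static void main(String[] args) {
    if (firstCategory(GoodTest.class) != SmallTests.class) throw new AssertionError();
    try {
      firstCategory(BadTest.class);
      throw new AssertionError("expected IllegalStateException");
    } catch (IllegalStateException expected) {
      // expected: the error now names the offending class
    }
  }
}
```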



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-23679) FileSystem instance leaks due to bulk loads with Kerberos enabled

2020-01-13 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-23679.

Hadoop Flags: Reviewed
Release Note: 
This fixes an issue with Bulk Loading on installations with Kerberos 
enabled and more than a single RegionServer. When multiple RegionServers are 
involved in hosting the regions of a table being bulk-loaded into, all but the 
RegionServer hosting the table's first Region will "leak" one 
DistributedFileSystem object onto the heap, never freeing that memory. 
Eventually, with enough bulk loads, this will create a situation for 
RegionServers where they have no free heap space and will either spend all time 
in JVM GC, lose their ZK session, or crash with an OutOfMemoryError.

The only mitigation for this issue is to periodically restart RegionServers. 
All earlier versions of HBase 2.x are subject to this issue (2.0.x, <=2.1.8, 
<=2.2.3)
  Resolution: Fixed

Thanks Wellington and Busbey for the reviews.

> FileSystem instance leaks due to bulk loads with Kerberos enabled
> -
>
> Key: HBASE-23679
> URL: https://issues.apache.org/jira/browse/HBASE-23679
> Project: HBase
>  Issue Type: Bug
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Critical
> Fix For: 3.0.0, 2.3.0, 2.1.9, 2.2.4
>
>
> Spent the better part of a week chasing an issue on HBase 2.x where the 
> number of DistributedFileSystem instances on the heap of a RegionServer would 
> grow unbounded. Looking at multiple heap-dumps, it was obvious to see that we 
> had an immense number of DFS instances cached (in FileSystem$Cache) for the 
> same user, with the unique number of Tokens contained in that DFS's UGI 
> member (one hbase delegation token, and two HDFS delegation tokens – we only 
> do this for bulk loads). For the user's clusters, they eventually experienced 
> 10x perf degradation as RegionServers spent all of their time in JVM GC (they 
> were unlucky to not have RegionServers crash outright, as this would've, 
> albeit temporarily, fixed the issue).
> The problem seems to be two-fold with changes by HBASE-15291 being largely 
> the cause. This issue tried to close FileSystem instances which were being 
> leaked – however, it did this by instrumenting the method 
> {{SecureBulkLoadManager.cleanupBulkLoad(..)}}. Two big issues with this 
> approach:
> # It relies on clients to call this method (clients hanging up will leak 
> resources in RegionServers)
>  # This method is only called on the RegionServer hosting the first Region of 
> the table which was bulk-loaded into. For multiple RegionServers, they are 
> left to leak resources.
> HBASE-21342 later tried to fix an issue where FS objects were now being 
> closed prematurely via reference-counting (which appears to work fine), but 
> does not address the other two issues above. Point #2 makes debugging this 
> issue harder than normal because it doesn't manifest on a single node 
> instance :)
> Through all of this, I (re)learned the dirty history of UGI and how its 
> caching doesn't work so great HADOOP-6670. I see trying to continue to 
> leverage the FileSystem$CACHE as a potentially dangerous thing (we've been 
> back here multiple times already). My opinion at this point is that we should 
> cleanly create a new FileSystem instance during the call to 
> {{SecureBulkLoadManager#secureBulkLoadHFiles(..)}} and close it in a finally 
> block in that same method. This both simplifies the lifecycle of a FileSystem 
> instance in the bulk-load codepath but also helps us avoid future problems 
> with UGI and FS caching. The one downside is that we pay the penalty to 
> create a new FileSystem instance, but I'm of the opinion that we cross that 
> bridge when we get there.
> Thanks to [~jdcryans] and [~busbey] for their help along the way.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-23679) FileSystem instance leaks due to bulk loads with Kerberos enabled

2020-01-10 Thread Josh Elser (Jira)
Josh Elser created HBASE-23679:
--

 Summary: FileSystem instance leaks due to bulk loads with Kerberos 
enabled
 Key: HBASE-23679
 URL: https://issues.apache.org/jira/browse/HBASE-23679
 Project: HBase
  Issue Type: Bug
Reporter: Josh Elser
Assignee: Josh Elser


Spent the better part of a week chasing an issue on HBase 2.x where the number 
of DistributedFileSystem instances on the heap of a RegionServer would grow 
unbounded. Looking at multiple heap-dumps, it was obvious to see that we had an 
immense number of DFS instances cached (in FileSystem$Cache) for the same user, 
with the unique number of Tokens contained in that DFS's UGI member (one hbase 
delegation token, and two HDFS delegation tokens – we only do this for bulk 
loads). For the user's clusters, they eventually experienced 10x perf 
degradation as RegionServers spent all of their time in JVM GC (they were 
unlucky to not have RegionServers crash outright, as this would've, albeit 
temporarily, fixed the issue).

The problem seems to be two-fold with changes by HBASE-15291 being largely the 
cause. This issue tried to close
 FileSystem instances which were being leaked – however, it did this by 
instrumenting the method
 {{SecureBulkLoadManager.cleanupBulkLoad(..)}}. Two big issues with this 
approach:
 1. It relies on clients to call this method (clients hanging up will leak 
resources in RegionServers)
 2. This method is only called on the RegionServer hosting the first Region of 
the table which was bulk-loaded into. For
 multiple RegionServers, they are left to leak resources.

HBASE-21342 later tried to fix an issue where FS objects were now being closed 
prematurely via reference-counting (which appears to work fine), but does not 
address the other two issues above.

Through all of this, I (re)learned the dirty history of UGI and how its caching 
doesn't work so great HADOOP-6670. I see trying to continue to leverage the 
FileSystem$CACHE as a potentially dangerous thing (we've been back here 
multiple times already). My opinion at this point is that we should cleanly 
create a new FileSystem instance during the call to 
{{SecureBulkLoadManager#secureBulkLoadHFiles(..)}} and close it in a finally 
block in that same method. This both simplifies the lifecycle of a FileSystem 
instance in the bulk-load codepath but also helps us avoid future problems with 
UGI and FS caching. The one downside is that we pay the penalty to create a new 
FileSystem instance, but I'm of the opinion that we cross that bridge when we 
get there.

Thanks to [~jdcryans] and [~busbey] for their help along the way.
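The leak mechanism and the proposed remedy can be modeled without Hadoop at all. The sketch below is hypothetical, not Hadoop code: `FakeFs` stands in for DistributedFileSystem, and the string key stands in for the UGI-with-tokens component of the FileSystem$Cache key. Because every bulk load carries freshly minted delegation tokens, each load produces a new cache key and the cached instances accumulate forever, whereas the create-and-close-in-finally pattern proposed above leaks nothing.

```java
import java.io.Closeable;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

public class FsCacheLeakSketch {
  static final AtomicInteger OPEN = new AtomicInteger();

  // Stand-in for DistributedFileSystem: counts live instances.
  static class FakeFs implements Closeable {
    FakeFs() { OPEN.incrementAndGet(); }
    public void close() { OPEN.decrementAndGet(); }
  }

  // Leaky pattern: a process-wide cache keyed (in part) by per-request tokens.
  static final Map<String, FakeFs> CACHE = new HashMap<>();
  static FakeFs cachedGet(String tokenKey) {
    return CACHE.computeIfAbsent(tokenKey, k -> new FakeFs());
  }

  // Proposed pattern: create one instance per bulk load, close it in finally.
  static void bulkLoadScoped() {
    FakeFs fs = new FakeFs();
    try {
      // ... use fs for the duration of the bulk load ...
    } finally {
      fs.close();
    }
  }

  public static void main(String[] args) {
    // Each "bulk load" carries a fresh token, so each one adds a cache entry.
    for (int i = 0; i < 100; i++) cachedGet("token-" + i);
    // The scoped pattern leaves nothing behind.
    for (int i = 0; i < 100; i++) bulkLoadScoped();
    if (OPEN.get() != 100) throw new AssertionError(OPEN.get());
  }
}
```

The cost of the scoped pattern is one extra FileSystem construction per bulk load, which is the trade-off the issue explicitly accepts.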



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-23550) FSHLog crash with java.lang.IllegalArgumentException: offset (40) + length (8) exceed the capacity of the array: 41

2019-12-09 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-23550.

Resolution: Invalid

You're running into PHOENIX-5455. You need a newer version of Phoenix that can 
handle the KeyValue changes referenced by HBASE-22034

> FSHLog crash with java.lang.IllegalArgumentException: offset (40) + length 
> (8) exceed the capacity of the array: 41
> ---
>
> Key: HBASE-23550
> URL: https://issues.apache.org/jira/browse/HBASE-23550
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.4.10
>Reporter: Alex Batyrshin
>Priority: Major
>
> We are using HBase-1.4.10 with Phoenix-4.14.2, so config file looks like:
> {code:java}
> 
> hbase.regionserver.wal.codec
> 
> org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec
>  {code}
> Trying to dump wal
> {code:java}
> $ ./bin/hbase org.apache.hadoop.hbase.regionserver.wal.FSHLog --dump 
> file:///tmp/hbase06-wal
> ...
> Sequence=146194465 , region=51e4797be6c6602054d4aaeb94e01526 at write 
> timestamp=Mon Dec 09 18:46:30 MSK 2019
> row=\x157809008119\x00\xC7\x02:;d\x02\x15\x46218513?jTGJ=u, 
> column=d:_0
> cell total size sum: 120
> row=\x157809008119\x00\xC7\x02:;d\x02\x15\x46218513?jTGJ=u, 
> column=d:d:apd
> cell total size sum: 128
> row=\x157809008119\x00\xC7\x02:;d\x02\x15\x46218513?jTGJ=u, 
> column=d:d:st
> cell total size sum: 120
> row=\x157809008119\x00\xC7\x02:;d\x02\x15\x46218513?jTGJ=u, 
> column=d:d:pt
> cell total size sum: 120
> row=\x157809008119\x00\xC7\x02:;d\x02\x15\x46218513?jTGJ=u, 
> column=d:d:gt
> cell total size sum: 128
> row=\x157809008119\x00\xC7\x02:;d\x02\x15\x46218513?jTGJ=u, 
> column=d:d:o
> cell total size sum: 128
> edit heap size: 784
> position: 49623
> Exception in thread "main" java.lang.IllegalArgumentException: offset (40) + 
> length (8) exceed the capacity of the array: 41
> at 
> org.apache.hadoop.hbase.util.Bytes.explainWrongLengthOrOffset(Bytes.java:779)
> at org.apache.hadoop.hbase.util.Bytes.toLong(Bytes.java:753)
> at org.apache.hadoop.hbase.util.Bytes.toLong(Bytes.java:738)
> at org.apache.hadoop.hbase.KeyValue.getTimestamp(KeyValue.java:1569)
> at org.apache.hadoop.hbase.KeyValue.getTimestamp(KeyValue.java:1560)
> at 
> org.apache.hadoop.hbase.wal.WALPrettyPrinter.toStringMap(WALPrettyPrinter.java:352)
> at 
> org.apache.hadoop.hbase.wal.WALPrettyPrinter.processFile(WALPrettyPrinter.java:297)
> at 
> org.apache.hadoop.hbase.wal.WALPrettyPrinter.run(WALPrettyPrinter.java:438)
> at 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog.main(FSHLog.java:2028){code}
>  
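The arithmetic behind the error message is a simple bounds check: decoding a big-endian long at `offset` requires `offset + 8 <= b.length`, and here the Phoenix-encoded cell leaves only 41 bytes where the printer expects at least 48 (offset 40 plus length 8). A self-contained sketch of that check, not HBase's actual `Bytes` class:

```java
public class ToLongBoundsSketch {
  // Decode a big-endian long, failing loudly when the read would run past
  // the end of the array (the exact failure mode in the stack trace above).
  static long toLong(byte[] b, int offset) {
    if (offset < 0 || offset + Long.BYTES > b.length) {
      throw new IllegalArgumentException("offset (" + offset + ") + length (" + Long.BYTES
          + ") exceed the capacity of the array: " + b.length);
    }
    long v = 0;
    for (int i = offset; i < offset + Long.BYTES; i++) {
      v = (v << 8) | (b[i] & 0xFF);
    }
    return v;
  }

  public static void main(String[] args) {
    byte[] ok = new byte[48];
    ok[47] = 7;
    if (toLong(ok, 40) != 7L) throw new AssertionError();
    try {
      toLong(new byte[41], 40);  // 40 + 8 > 41: the failing shape from the report
      throw new AssertionError("expected IllegalArgumentException");
    } catch (IllegalArgumentException expected) { }
  }
}
```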



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-23347) Pluggable RPC authentication

2019-11-27 Thread Josh Elser (Jira)
Josh Elser created HBASE-23347:
--

 Summary: Pluggable RPC authentication
 Key: HBASE-23347
 URL: https://issues.apache.org/jira/browse/HBASE-23347
 Project: HBase
  Issue Type: Improvement
  Components: rpc, security
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: 3.0.0


Today in HBase, we rely on SASL to implement Kerberos and delegation token 
authentication. The RPC client and server logic is very tightly coupled to our 
three authentication mechanisms (the two previously mentioned plus simple 
auth'n) for no good reason (other than "that's how it was built", best as I can 
tell).

SASL's function is to decouple the "application" from how a request is being 
authenticated, which means that, to support a variety of other authentication 
approaches, we just need to be a little more flexible in letting developers 
create their own authentication mechanism for HBase.

This is less for the "average joe" user to write their own authentication 
plugin (eek), but more to allow us HBase developers to start iterating, see 
what is possible.

I'll attach a full write-up on what I have today as to how I think we can add 
these abstractions, as well as an initial implementation of this idea, with a 
unit test that shows an end-to-end authentication solution against HBase.

cc/ [~wchevreuil] as he's been working with me behind the scenes, giving lots 
of great feedback and support.
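The decoupling SASL offers can be seen with nothing but the JDK. This is illustrative only (plain `javax.security.sasl`, not HBase's RPC classes), and the mechanism, user, password, and server name are made up for the demo: the transport talks to one `SaslClient` interface, and the mechanism name is just data handed to the factory.

```java
import java.nio.charset.StandardCharsets;
import javax.security.auth.callback.Callback;
import javax.security.auth.callback.CallbackHandler;
import javax.security.auth.callback.NameCallback;
import javax.security.auth.callback.PasswordCallback;
import javax.security.sasl.Sasl;
import javax.security.sasl.SaslClient;

public class SaslPluggabilityDemo {
  public static void main(String[] args) throws Exception {
    // The application supplies credentials through callbacks; it never needs
    // to know how the chosen mechanism serializes them on the wire.
    CallbackHandler handler = callbacks -> {
      for (Callback cb : callbacks) {
        if (cb instanceof NameCallback) {
          ((NameCallback) cb).setName("alice");
        } else if (cb instanceof PasswordCallback) {
          ((PasswordCallback) cb).setPassword("s3cret".toCharArray());
        }
      }
    };
    // Swapping "PLAIN" for any other registered mechanism requires no change
    // to the surrounding transport code -- that is the pluggability point.
    SaslClient client = Sasl.createSaslClient(
        new String[] {"PLAIN"}, null, "hbase", "regionserver.example.com", null, handler);
    byte[] initial = client.hasInitialResponse()
        ? client.evaluateChallenge(new byte[0]) : new byte[0];
    String decoded = new String(initial, StandardCharsets.UTF_8);
    if (!"PLAIN".equals(client.getMechanismName())) throw new AssertionError();
    if (!decoded.contains("alice") || !decoded.contains("s3cret")) throw new AssertionError();
    client.dispose();
  }
}
```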



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-23265) Coprocessor restart after region split rollback

2019-11-06 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-23265.

Resolution: Won't Fix

HBase 1.1 is marked as end-of-maintenance and is no longer receiving fixes. The 
same goes for HBase 1.2.

If you can validate that this is also a problem on a current HBase 1.x release, 
I would assume that we'd be interested in fixing it.

> Coprocessor restart after region split rollback
> ---
>
> Key: HBASE-23265
> URL: https://issues.apache.org/jira/browse/HBASE-23265
> Project: HBase
>  Issue Type: Improvement
>  Components: Coprocessors
>Affects Versions: 1.1.2
>Reporter: Ondrej Kvasnicka
>Priority: Minor
>
> According to our observation, a running coprocessor that is automatically 
> stopped prior to region split will not be automatically restarted for the 
> region that was about to be split in case the split attempt is rolled back.
> The expected behavior would be for the coprocessor to be automatically 
> restarted in such situation.
> According to [this 
> conversation|https://lists.apache.org/thread.html/9f83d32c50e0f9b05fff183897f5a3ccdb3c2e94b0e9d1a0f9064646@%3Cuser.hbase.apache.org%3E]
>  the lack of the automatic restart looks more like an omission rather than an 
> intended behavior.
> A possible work-around seems to be handling the restart explicitly in 
> RegionObserver#postRollBackSplit() implementation, such as in [this 
> example|https://github.com/splicemachine/spliceengine/blob/aa9d66e3d33ea04d67203eb86b3e6e3ba768e267/hbase_pipeline/src/main/java/com/splicemachine/derby/hbase/SpliceIndexObserver.java#L206].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-15519) Add per-user metrics

2019-10-23 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-15519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-15519.

Hadoop Flags: Reviewed
Release Note: Adds per-user metrics for reads/writes to each RegionServer. 
These metrics are exported by default. hbase.regionserver.user.metrics.enabled 
can be used to disable the feature if desired for any reason.
  Resolution: Fixed

Thanks for reviving this, Ankit. Pushed it to branch-2 and master.

> Add per-user metrics 
> -
>
> Key: HBASE-15519
> URL: https://issues.apache.org/jira/browse/HBASE-15519
> Project: HBase
>  Issue Type: Sub-task
>  Components: metrics
>Affects Versions: 1.2.0
>Reporter: Enis Soztutar
>Assignee: Ankit Singhal
>Priority: Major
> Fix For: 3.0.0, 2.3.0
>
> Attachments: HBASE-15519.master.003.patch, hbase-15519_v0.patch, 
> hbase-15519_v1.patch, hbase-15519_v1.patch, hbase-15519_v2.patch
>
>
> Per-user metrics will be useful in multi-tenant cases where we can emit 
> number of requests, operations, num RPCs etc at the per-user aggregate level 
> per regionserver. We currently have throttles per user, but no way to monitor 
> resource usage per-user. 
> Looking at these metrics, operators can adjust throttles, do capacity 
> planning, etc per-user. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-23150) TestBulkLoadReplication is broken

2019-10-11 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-23150.

Resolution: Fixed

Came along to figure out why Wellington's initial patch was reverted :)

I see things are passing again in 
https://builds.apache.org/job/HBase-Flaky-Tests/job/master/4509/. Resolving 
this.

> TestBulkLoadReplication is broken
> -
>
> Key: HBASE-23150
> URL: https://issues.apache.org/jira/browse/HBASE-23150
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Peter Somogyi
>Assignee: Wellington Chevreuil
>Priority: Major
> Fix For: 3.0.0
>
>
> Test is failing. See 
> [https://builds.apache.org/job/HBase-Flaky-Tests/job/master/4506/testReport/org.apache.hadoop.hbase.regionserver/TestBulkLoadReplication/testBulkLoadReplicationActiveActive/]
> h3. Stacktrace
> java.lang.AssertionError at 
> org.apache.hadoop.hbase.regionserver.TestBulkLoadReplication.assertTableHasValue(TestBulkLoadReplication.java:295)
>  at 
> org.apache.hadoop.hbase.regionserver.TestBulkLoadReplication.assertBulkLoadConditions(TestBulkLoadReplication.java:275)
>  at 
> org.apache.hadoop.hbase.regionserver.TestBulkLoadReplication.testBulkLoadReplicationActiveActive(TestBulkLoadReplication.java:236)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-23142) ZooKeeper-Jute missing from HBOSS shaded dependencies with ZK 3.5

2019-10-09 Thread Josh Elser (Jira)
Josh Elser created HBASE-23142:
--

 Summary: ZooKeeper-Jute missing from HBOSS shaded dependencies 
with ZK 3.5
 Key: HBASE-23142
 URL: https://issues.apache.org/jira/browse/HBASE-23142
 Project: HBase
  Issue Type: Bug
  Components: hboss
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: hbase-filesystem-1.0.0-alpha2


ZooKeeper 3.5 has a transitive dependency on a {{zookeeper-jute}} artifact. If 
this isn't on the classpath by some other means, you'll get an error similar to:

{noformat}
Failed construction RegionServer
java.lang.NoClassDefFoundError: 
org/apache/hadoop/hbase/oss/thirdparty/org/apache/jute/Record
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
at 
org.apache.hadoop.hbase.oss.thirdparty.org.apache.curator.utils.Compatibility.(Compatibility.java:35)
at 
org.apache.hadoop.hbase.oss.thirdparty.org.apache.curator.framework.CuratorFrameworkFactory$Builder.(CuratorFrameworkFactory.java:149)
at 
org.apache.hadoop.hbase.oss.thirdparty.org.apache.curator.framework.CuratorFrameworkFactory$Builder.(CuratorFrameworkFactory.java:130)
at 
org.apache.hadoop.hbase.oss.thirdparty.org.apache.curator.framework.CuratorFrameworkFactory.builder(CuratorFrameworkFactory.java:78)
at 
org.apache.hadoop.hbase.oss.thirdparty.org.apache.curator.framework.CuratorFrameworkFactory.newClient(CuratorFrameworkFactory.java:104)
at 
org.apache.hadoop.hbase.oss.thirdparty.org.apache.curator.framework.CuratorFrameworkFactory.newClient(CuratorFrameworkFactory.java:90)
at 
org.apache.hadoop.hbase.oss.sync.ZKTreeLockManager.initialize(ZKTreeLockManager.java:93)
at 
org.apache.hadoop.hbase.oss.sync.TreeLockManager.get(TreeLockManager.java:72)
at 
org.apache.hadoop.hbase.oss.HBaseObjectStoreSemantics.initialize(HBaseObjectStoreSemantics.java:122)
at 
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3315)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:136)
at 
org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3364)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3332)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:491)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361)
at 
org.apache.hadoop.hbase.util.CommonFSUtils.getRootDir(CommonFSUtils.java:361)
at 
org.apache.hadoop.hbase.util.CommonFSUtils.isValidWALRootDir(CommonFSUtils.java:411)
at 
org.apache.hadoop.hbase.util.CommonFSUtils.getWALRootDir(CommonFSUtils.java:387)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.initializeFileSystem(HRegionServer.java:731)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.(HRegionServer.java:637)
at org.apache.hadoop.hbase.master.HMaster.(HMaster.java:493)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at 
org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2905)
at 
org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:236)
at 
org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:140)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at 
org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:149)
at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2923)
Caused by: java.lang.ClassNotFoundException: 
org.apache.hadoop.hbase.oss.thirdparty.org.apache.jute.Record
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 33 more
{noformat}

when trying to start up any HBase service.

Just need the shade-plugin execution to include {{zookeeper-jute}}, too. I 
think Maven will complain (but not error) with the default ZooKeeper 3.4 
dependency.
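A hypothetical fragment of what that shade-plugin change looks like (the exact coordinates and surrounding configuration in the real hbase-filesystem pom may differ); the point is simply that the shaded artifact must pull in zookeeper-jute alongside zookeeper for ZK 3.5:

```xml
<!-- Sketch only: shade zookeeper-jute into the HBOSS artifact as well -->
<artifactSet>
  <includes>
    <include>org.apache.zookeeper:zookeeper</include>
    <include>org.apache.zookeeper:zookeeper-jute</include>
  </includes>
</artifactSet>
```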



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-22012) SpaceQuota DisableTableViolationPolicy will cause cycles of enable/disable table

2019-09-26 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-22012.

Hadoop Flags: Reviewed
  Resolution: Fixed

Another good bugfix here, Shardul. Well done.

> SpaceQuota DisableTableViolationPolicy will cause cycles of enable/disable 
> table
> 
>
> Key: HBASE-22012
> URL: https://issues.apache.org/jira/browse/HBASE-22012
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.2.1
>Reporter: Ajeet Rai
>Assignee: Shardul Singh
>Priority: Major
>  Labels: Quota, Space
> Fix For: 3.0.0, 2.3.0, 2.1.7, 2.2.2
>
>
> Space Quota: the policy state automatically changes from Disable to Observance 
> after some time.
> Steps:
> 1: Create a table with space quota policy as Disable
> 2: Put some data so that table state is in space quota violation
> 3: So observe that table state is in violation
> 4: Now wait for some time
> 5: Observe that after some time the table state changes to Observance, 
> however the table is still disabled
> edit (elserj): The table is automatically moved back from the violation state 
> because of the code added that tried to ride over RITs. When a Region is not 
> online (whether normally or abnormally), the RegionSizeReports are not sent 
> from RS to Master. Eventually, enough Regions go unreported that the count dips 
> below the acceptable threshold, and we automatically move the table back to 
> the "acceptable" space quota state (not in violation). We could skip this 
> failsafe when we're checking for a quota that has the DisableTable violation 
> policy.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-23082) Backport low-latency snapshot tracking for space quotas to 2.x

2019-09-26 Thread Josh Elser (Jira)
Josh Elser created HBASE-23082:
--

 Summary: Backport low-latency snapshot tracking for space quotas 
to 2.x
 Key: HBASE-23082
 URL: https://issues.apache.org/jira/browse/HBASE-23082
 Project: HBase
  Issue Type: Improvement
  Components: Quotas
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: 2.3.0, 2.1.7, 2.2.2


Some rebase'ing at $dayjob found that HBASE-18133 and HBASE-18135 never made it 
to any branch-2's.

There's no good reason that I can think of as to why we shouldn't backport 
these (they will make quotas and snapshots much more accurate). As memory 
serves, they were just landing when we were branching for early 2.0.0 releases.

Let's get backports for HBASE-18133, HBASE-18135, and HBASE-20531.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-22944) TableNotFoundException: hbase:quota is thrown when region server is restarted.

2019-09-20 Thread Josh Elser (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-22944.

Fix Version/s: 2.2.2
   2.1.7
   2.3.0
   3.0.0
 Hadoop Flags: Reviewed
   Resolution: Fixed

Thanks for the fix, Shardul!

> TableNotFoundException: hbase:quota  is thrown when region server is 
> restarted.
> ---
>
> Key: HBASE-22944
> URL: https://issues.apache.org/jira/browse/HBASE-22944
> Project: HBase
>  Issue Type: Bug
>  Components: Quotas
>Reporter: Shardul Singh
>Assignee: Shardul Singh
>Priority: Minor
> Fix For: 3.0.0, 2.3.0, 2.1.7, 2.2.2
>
>
> During Master startup, if the quota feature is enabled and a region server 
> is running, a TableNotFoundException occurs in the region server logs.
> SpaceQuotaRefresherChore does not check whether the hbase:quota table is 
> present before starting its operation, which can sometimes result in 
> TableNotFoundExceptions like the one below.
> This is because the master has not yet created the hbase:quota table while 
> SpaceQuotaRefresherChore is already running.
> {code:java}
> java.io.UncheckedIOException: org.apache.hadoop.hbase.
> TableNotFoundException: hbase:quota
> at 
> org.apache.hadoop.hbase.client.ResultScanner$1.hasNext(ResultScanner.java:55)
>  at 
> org.apache.hadoop.hbase.quotas.SpaceQuotaRefresherChore.fetchSnapshotsFromQuotaTable(SpaceQuotaRefresherChore.java:170)
>  at 
> org.apache.hadoop.hbase.quotas.SpaceQuotaRefresherChore.chore(SpaceQuotaRefresherChore.java:84)
>  at org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:186)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>  at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>  at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>  at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>  at 
> org.apache.hadoop.hbase.JitterScheduledThreadPoolExecutorImpl$JitteredRunnableScheduledFuture.run(JitterScheduledThreadPoolExecutorImpl.java:111)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:748)
> {code}
>  
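The missing guard described above amounts to an existence check before the chore scans the quota table. The sketch below uses a minimal stand-in interface rather than the real HBase Admin API, so the types and method placement are assumptions; the actual fix belongs in SpaceQuotaRefresherChore.

```java
// Sketch: have the chore verify that hbase:quota exists before scanning it,
// so a chore run during Master startup skips quietly instead of throwing
// TableNotFoundException. TableChecker is an illustrative stand-in for the
// Admin#tableExists call.
import java.io.IOException;
import java.util.Set;

public class QuotaChoreGuardSketch {
    /** Minimal stand-in for the Admin#tableExists check. */
    interface TableChecker {
        boolean tableExists(String tableName) throws IOException;
    }

    static boolean shouldRunChore(TableChecker admin) {
        try {
            if (!admin.tableExists("hbase:quota")) {
                // Master has not created the quota table yet; skip this
                // invocation rather than let the scan fail.
                return false;
            }
            return true;
        } catch (IOException e) {
            // Could not check; be conservative and skip this run.
            return false;
        }
    }

    public static void main(String[] args) {
        Set<String> tables = Set.of("hbase:meta");
        TableChecker admin = tables::contains;
        // Quota table missing -> the chore body is skipped.
        System.out.println(shouldRunChore(admin));
    }
}
```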





[jira] [Created] (HBASE-23011) AP stuck in retry loop if underlying table no longer exists

2019-09-10 Thread Josh Elser (Jira)
Josh Elser created HBASE-23011:
--

 Summary: AP stuck in retry loop if underlying table no longer 
exists
 Key: HBASE-23011
 URL: https://issues.apache.org/jira/browse/HBASE-23011
 Project: HBase
  Issue Type: Bug
Affects Versions: 2.1.6, 2.0.6
Reporter: Josh Elser
Assignee: Josh Elser


Looking at a user's issue with [~wchevreuil]... While the details of how 
exactly we got into this situation are murky, I'm noticing that we have a 
situation where an AP can get stuck resubmitting itself over and over if, 
somehow, the table containing the region the AP is assigning gets deleted.
{noformat}
2019-08-25 23:33:54,588 WARN  [PEWorker-11] 
assignment.RegionTransitionProcedure: Failed transition, suspend 1secs 
pid=1100250, ppid=1100195, state=RUNNABLE:REGION_TRANSITION_QUEUE, locked=true; 
AssignProcedure table=, region=; rit=OFFLINE, 
location=null; waiting on rectified condition fixed by other Procedure or 
operator intervention
org.apache.hadoop.hbase.master.TableStateManager$TableStateNotFoundException: 
monitoring:test1
at 
org.apache.hadoop.hbase.master.TableStateManager.getTableState(TableStateManager.java:215)
at 
org.apache.hadoop.hbase.master.assignment.AssignProcedure.assign(AssignProcedure.java:195)
at 
org.apache.hadoop.hbase.master.assignment.AssignProcedure.startTransition(AssignProcedure.java:206)
at 
org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:364)
at 
org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:98)
at 
org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:958)
at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1836)
at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1596)
at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1200(ProcedureExecutor.java:80)
at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:2141)
 {noformat}
Stack trace looks like similar to the above.

The problem appears to be that we don't catch the 
{{TableStateNotFoundException}} coming out of 
{{TableStateManager#getTableState(TableName)}}. This keeps the AP in a 
fail/resubmit loop (until, presumably, someone comes along with an `HBCK2 
bypass`). This is only a problem in branch-2.0 and branch-2.1. 
{{TransitRegionStateProcedure}} in branch-2.2+ doesn't have the same issue (at 
least on the surface).

As mentioned earlier, it's not clear how we got this SCP(1100195)->AP(1100250) 
scheduled while the table itself was actually deleted. Some quick attempts to 
reproduce this locally weren't successful. I'm not sure if I can write a 
meaningful test. Need to try to look more closely at that, but will attach a 
patch which I think will work around the issue.
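
The workaround amounts to catching the exception instead of letting it escape and suspend the procedure. The types below are simplified stand-ins for the HBase internals (the real change would sit in AssignProcedure on branch-2.0/2.1), so the shape is an assumption, not the actual patch.

```java
// Sketch of the workaround: instead of letting TableStateNotFoundException
// escape from startTransition (which keeps the AssignProcedure in a
// fail/resubmit loop), catch it and finish the procedure, since a deleted
// table has nothing left to assign.
public class AssignRetrySketch {
    static class TableStateNotFoundException extends Exception {
        TableStateNotFoundException(String table) { super(table); }
    }

    interface TableStateManager {
        String getTableState(String table) throws TableStateNotFoundException;
    }

    /** Returns true if the assignment should proceed, false if the table is gone. */
    static boolean startTransition(TableStateManager tsm, String table) {
        try {
            tsm.getTableState(table);
            return true;
        } catch (TableStateNotFoundException e) {
            // The table no longer exists; finish the procedure rather than
            // suspend-and-retry forever until someone runs `HBCK2 bypass`.
            return false;
        }
    }

    public static void main(String[] args) {
        TableStateManager tsm = table -> {
            throw new TableStateNotFoundException(table);
        };
        // The table's state is gone, so the AP should not resubmit itself.
        System.out.println(startTransition(tsm, "monitoring:test1"));
    }
}
```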





[jira] [Created] (HBASE-22985) Gracefully handle invalid ServiceLoader entries

2019-09-06 Thread Josh Elser (Jira)
Josh Elser created HBASE-22985:
--

 Summary: Gracefully handle invalid ServiceLoader entries
 Key: HBASE-22985
 URL: https://issues.apache.org/jira/browse/HBASE-22985
 Project: HBase
  Issue Type: Bug
  Components: metrics
Reporter: Josh Elser
Assignee: Josh Elser


Just saw this happen: a RegionServer failed to start because a JAR on the 
classpath had a {{META-INF/services}} entry advertising an implementation of 
{{org.apache.hadoop.hbase.metrics.MetricRegistries}} that was actually not a 
subtype of that class:
{noformat}
Caused by: java.util.ServiceConfigurationError: 
org.apache.hadoop.hbase.metrics.MetricRegistries: Provider 
org.apache.ratis.metrics.impl.MetricRegistriesImpl not a subtype
at java.util.ServiceLoader.fail(ServiceLoader.java:239)
at java.util.ServiceLoader.access$300(ServiceLoader.java:185)
at 
java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:376)
at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
at 
org.apache.hadoop.hbase.metrics.MetricRegistriesLoader.getDefinedImplemantations(MetricRegistriesLoader.java:92)
at 
org.apache.hadoop.hbase.metrics.MetricRegistriesLoader.load(MetricRegistriesLoader.java:50)
at 
org.apache.hadoop.hbase.metrics.MetricRegistries$LazyHolder.<clinit>(MetricRegistries.java:39)
at 
org.apache.hadoop.hbase.metrics.MetricRegistries.global(MetricRegistries.java:47)
at 
org.apache.hadoop.hbase.metrics.BaseSourceImpl.<init>(BaseSourceImpl.java:122)
at 
org.apache.hadoop.hbase.io.MetricsIOSourceImpl.<init>(MetricsIOSourceImpl.java:46)
at 
org.apache.hadoop.hbase.io.MetricsIOSourceImpl.<init>(MetricsIOSourceImpl.java:38)
at 
org.apache.hadoop.hbase.regionserver.MetricsRegionServerSourceFactoryImpl.createIO(MetricsRegionServerSourceFactoryImpl.java:84)
at org.apache.hadoop.hbase.io.MetricsIO.<init>(MetricsIO.java:35)
at org.apache.hadoop.hbase.io.hfile.HFile.<clinit>(HFile.java:195)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:570)
... 10 more{noformat}
Now, we could catch this and gracefully ignore it; however, that would mean 
catching an Error, which is typically considered a smell.

It's a pretty straightforward change, so I'm apt to think that it's OK. What do 
other folks think?
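
The graceful handling can be done by advancing the ServiceLoader's lazy iterator manually and catching ServiceConfigurationError per entry, so one bad META-INF/services line does not abort discovery of the valid implementations. This is a generic sketch of the technique, not the exact patch to MetricRegistriesLoader.

```java
// Tolerant ServiceLoader iteration: ServiceConfigurationError ("Provider X
// not a subtype", class not found, etc.) is thrown from next() for each bad
// provider entry, so catching it there skips that entry and continues.
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.ServiceConfigurationError;
import java.util.ServiceLoader;

public class SafeServiceLoading {
    static <T> List<T> loadAvailable(Class<T> service) {
        List<T> found = new ArrayList<>();
        Iterator<T> it = ServiceLoader.load(service).iterator();
        while (it.hasNext()) {
            try {
                // next() is where a broken entry throws; the iterator has
                // already consumed the provider name, so we can keep going.
                found.add(it.next());
            } catch (ServiceConfigurationError e) {
                // Skip the invalid provider instead of failing startup.
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // With whatever JDBC drivers happen to be on the classpath, this
        // simply returns the loadable ones and never throws.
        System.out.println(loadAvailable(java.sql.Driver.class) != null);
    }
}
```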





[jira] [Created] (HBASE-22701) Better handle invalid local directory for DynamicClassLoader

2019-07-16 Thread Josh Elser (JIRA)
Josh Elser created HBASE-22701:
--

 Summary: Better handle invalid local directory for 
DynamicClassLoader
 Key: HBASE-22701
 URL: https://issues.apache.org/jira/browse/HBASE-22701
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: 2.3.0, 2.2.1, 2.1.6


If you give HBase an {{hbase.local.dir}} (usually, "{{hbase.tmp.dir}}/local") 
that it cannot write to, you will get some weird errors on the scan path. I 
just saw this (again?) with Phoenix.

Specifically, the first attempt to reference DynamicClassLoader (via 
ProtobufUtil) will result in an ExceptionInInitializerError, because the 
unchecked exception coming out of DynamicClassLoader's constructor interrupts 
the loading of {{DynamicClassLoader.class}}.
{noformat}
2019-07-14 06:25:34,284 ERROR 
[RpcServer.Metadata.Fifo.handler=12,queue=0,port=16020] 
coprocessor.MetaDataEndpointImpl: dropTable failed
org.apache.hadoop.hbase.DoNotRetryIOException: 
java.lang.ExceptionInInitializerError
at 
org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.translateException(RpcRetryingCallerImpl.java:221)
at 
org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithoutRetries(RpcRetryingCallerImpl.java:194)
at 
org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:387)
at 
org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:361)
at 
org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:107)
at 
org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:80)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ExceptionInInitializerError
at 
org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.toFilter(ProtobufUtil.java:1598)
at 
org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.toScan(ProtobufUtil.java:1152)
at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.newRegionScanner(RSRpcServices.java:2967)
at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3301)
at 
org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:332)
at 
org.apache.hadoop.hbase.client.ScannerCallable.rpcCall(ScannerCallable.java:242)
at 
org.apache.hadoop.hbase.client.ScannerCallable.rpcCall(ScannerCallable.java:58)
at 
org.apache.hadoop.hbase.client.RegionServerCallable.call(RegionServerCallable.java:127)
at 
org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithoutRetries(RpcRetryingCallerImpl.java:192)
... 7 more
Caused by: java.lang.RuntimeException: Failed to create local dir 
/hadoopfs/fs1/hbase/local/jars, DynamicClassLoader failed to init
at 
org.apache.hadoop.hbase.util.DynamicClassLoader.initTempDir(DynamicClassLoader.java:110)
at 
org.apache.hadoop.hbase.util.DynamicClassLoader.<init>(DynamicClassLoader.java:98)
at 
org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil$ClassLoaderHolder.lambda$static$0(ProtobufUtil.java:261)
at java.security.AccessController.doPrivileged(Native Method)
at 
org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil$ClassLoaderHolder.<clinit>(ProtobufUtil.java:260)
... 16 more
{noformat}
Every subsequent call will result in a NoClassDefFoundError, because we already 
tried to load DynamicClassLoader.class once and failed.
{noformat}
2019-07-14 06:25:34,380 ERROR 
[RpcServer.Metadata.Fifo.handler=2,queue=2,port=16020] 
coprocessor.MetaDataEndpointImpl: dropTable failed
org.apache.hadoop.hbase.DoNotRetryIOException: java.lang.NoClassDefFoundError: 
Could not initialize class 
org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil$ClassLoaderHolder
at 
org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.translateException(RpcRetryingCallerImpl.java:221)
at 
org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithoutRetries(RpcRetryingCallerImpl.java:194)
at 
org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:387)
at 
org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:361)
at 
org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:107)
at 
org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:80)
at 
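
The two failure modes above follow directly from JVM class-initialization rules, which this self-contained demo reproduces: the first touch of a class whose static initializer throws yields an ExceptionInInitializerError, and every later touch yields a NoClassDefFoundError because the class has been marked erroneous. The class names here are placeholders, not the HBase ones.

```java
// Demo of why the first scan fails with ExceptionInInitializerError and all
// subsequent scans fail with NoClassDefFoundError: once static
// initialization of a class fails, the JVM never retries it.
public class InitFailureDemo {
    static class Holder {
        static {
            // Stands in for DynamicClassLoader's constructor failing to
            // create the local jars directory during class initialization.
            if (true) throw new RuntimeException("Failed to create local dir");
        }
        static void use() {}
    }

    public static void main(String[] args) {
        try {
            Holder.use();
        } catch (ExceptionInInitializerError e) {
            System.out.println("first:  " + e.getClass().getSimpleName());
        }
        try {
            Holder.use();   // the class is already marked as failed
        } catch (NoClassDefFoundError e) {
            System.out.println("second: " + e.getClass().getSimpleName());
        }
    }
}
```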

[jira] [Resolved] (HBASE-22694) Use hbase.zookeeper.quorum if fs.hboss.sync.zk.connectionString is not defined

2019-07-15 Thread Josh Elser (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-22694.

  Resolution: Fixed
Hadoop Flags: Reviewed

Tested by hand using the following:
{noformat}
for type in local zk; do for version in 3 2; do mvn clean verify 
-Dhadoop.profile=$version -P$type; done; done
{noformat}

Pushed to master.

> Use hbase.zookeeper.quorum if fs.hboss.sync.zk.connectionString is not defined
> --
>
> Key: HBASE-22694
> URL: https://issues.apache.org/jira/browse/HBASE-22694
> Project: HBase
>  Issue Type: Improvement
>  Components: hboss
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: hbase-filesystem-1.0.0-alpha2
>
>
> In most cases, the ZooKeeper quorum that is used by HBase should be 
> sufficient for use by HBoss. Automatically use hbase.zookeeper.quorum if 
> fs.hboss.sync.zk.connectionString is not defined.





[jira] [Created] (HBASE-22694) Use hbase.zookeeper.quorum if fs.hboss.sync.zk.connectionString is not defined

2019-07-15 Thread Josh Elser (JIRA)
Josh Elser created HBASE-22694:
--

 Summary: Use hbase.zookeeper.quorum if 
fs.hboss.sync.zk.connectionString is not defined
 Key: HBASE-22694
 URL: https://issues.apache.org/jira/browse/HBASE-22694
 Project: HBase
  Issue Type: Improvement
  Components: hboss
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: hbase-filesystem-1.0.0-alpha2


In most cases, the ZooKeeper quorum that is used by HBase should be sufficient 
for use by HBoss. Automatically use hbase.zookeeper.quorum if 
fs.hboss.sync.zk.connectionString is not defined.
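
The fallback described above amounts to a two-key lookup. The sketch below uses java.util.Properties standing in for the Hadoop Configuration object HBoss actually reads; the key names are the real ones from this issue, but the surrounding structure is illustrative.

```java
// Sketch: resolve the HBoss ZooKeeper connection string, falling back to
// the quorum HBase itself is already configured with when the HBoss key
// is absent.
import java.util.Properties;

public class HbossZkQuorumFallback {
    static String zkConnectionString(Properties conf) {
        String explicit = conf.getProperty("fs.hboss.sync.zk.connectionString");
        if (explicit != null && !explicit.isEmpty()) {
            return explicit;
        }
        // In most cases HBase's own quorum is sufficient for HBoss.
        return conf.getProperty("hbase.zookeeper.quorum");
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("hbase.zookeeper.quorum", "zk1:2181,zk2:2181");
        // No HBoss-specific key set, so the HBase quorum is used.
        System.out.println(zkConnectionString(conf));  // prints zk1:2181,zk2:2181
    }
}
```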





[jira] [Resolved] (HBASE-22675) Use commons-cli from hbase-thirdparty

2019-07-15 Thread Josh Elser (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved HBASE-22675.

  Resolution: Fixed
Hadoop Flags: Reviewed

> Use commons-cli from hbase-thirdparty
> -
>
> Key: HBASE-22675
> URL: https://issues.apache.org/jira/browse/HBASE-22675
> Project: HBase
>  Issue Type: Task
>  Components: hbase-operator-tools
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: hbck2-1.0.0
>
>
> Noticed that hbck2 pulls in its own version of commons-cli, but is expecting 
> a specific version to be available on the server-side classpath.
> It would be an incremental improvement to use the commons-cli from 
> hbase-thirdparty.




