[jira] [Commented] (HBASE-26158) MiniZooKeeperCluster should not be IA.Public

2021-08-23 Thread Geoffrey Jacoby (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17403225#comment-17403225
 ] 

Geoffrey Jacoby commented on HBASE-26158:
-

Sounds good. Thank you, I appreciate it.

> MiniZooKeeperCluster should not be IA.Public
> 
>
> Key: HBASE-26158
> URL: https://issues.apache.org/jira/browse/HBASE-26158
> Project: HBase
>  Issue Type: Sub-task
>  Components: API, test
>Reporter: Duo Zhang
>Priority: Major
>
> End users do not need to test HBase when zookeeper is broken. And if users 
> want to start only a zookeeper cluster, they can just use curator-test, so I 
> do not think we should expose this class as IA.Public.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-26158) MiniZooKeeperCluster should not be IA.Public

2021-08-22 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17402786#comment-17402786
 ] 

Duo Zhang commented on HBASE-26158:
---

On the home page of Apache HBase, we say that

{quote}
Apache HBase™ is the Hadoop database, a distributed, scalable, big data store.

Use Apache HBase™ when you need random, realtime read/write access to your Big 
Data. This project's goal is the hosting of very large tables -- billions of 
rows X millions of columns -- atop clusters of commodity hardware. Apache HBase 
is an open-source, distributed, versioned, non-relational database modeled 
after Google's Bigtable: A Distributed Storage System for Structured Data by 
Chang et al. Just as Bigtable leverages the distributed data storage provided 
by the Google File System, Apache HBase provides Bigtable-like capabilities on 
top of Hadoop and HDFS.
{quote}

So I do not think this is just a matter of my personal dislike; it is the 
common understanding of the HBase community.

As for real-world harm, there was an example in the past: someone wanted to 
introduce a StartMiniClusterOption so that we would not need to keep so many 
methods in HBTU. But since HBTU was IA.Public, we could not remove those 
methods. It was really painful: we wanted to clean up HBTU, but we ended up 
with even more methods in HBTU...

But looking at MiniZooKeeperCluster, its interface has always been stable. The 
most recent interface change was the addition of a new getAddress method, in 
this commit:

https://github.com/apache/hbase/commit/3e1cf00c71ed3071fb8dc1a76c2b0e4c040ce16d#diff-94e247ecfdb233ed01ad9550f54bd169952a3583afe5656ace43169f7a952783

That was about one and a half years ago. So I agree with you that the 
real-world harm of keeping MiniZooKeeperCluster IA.Public is small, at least 
for now.

Given this, I think that for now we could leave it as is, since it does not 
block the other work on changing HBTU to IA.Private. Later, if we have 
requirements to change the interface or the behavior of this class, we can 
come back here and discuss it again.

WDYT? Thanks.



[jira] [Commented] (HBASE-26158) MiniZooKeeperCluster should not be IA.Public

2021-08-21 Thread Geoffrey Jacoby (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17402711#comment-17402711
 ] 

Geoffrey Jacoby commented on HBASE-26158:
-

I stand by my point that so long as HBase has a mini ZooKeeper cluster, it 
should remain IA.Public so that downstream users don't have to wrestle with 
reconciling dependencies or maintaining internal forks. 

I don't think I've heard yet a practical example of something bad that occurs 
if it remains IA.Public -- you seem to dislike it because of a theory of what 
HBase "is" and "is not". Given that you don't want to get rid of or change the 
class, what's the real-world harm in keeping the IA as is? I've tried to give 
practical examples of bad things that will happen if it's no longer IA.Public. 

Here's another example, from just last week. Tephra is a subproject of Phoenix 
that does distributed transactions for both HBase and Phoenix. It was 
originally an incubating Apache project that Phoenix later adopted. Tephra 
chose to use Apache Twill's ZK minicluster for its tests. But Twill is in the 
Attic now and its dependencies are old. Trying to use newer versions of Tephra 
as dependencies with multiple legacy (4.x) Phoenix versions (which don't shade 
their dependencies very well) is _really_ hard. If Tephra had chosen to use 
HBase's ZK minicluster instead, this would be much easier. 

As you say, HBase is trying to get rid of ZK dependencies, so it's totally 
reasonable for HBase to deprecate its own MiniZooKeeperCluster and adopt 
another project's. Then downstream projects can use the same well-tested 
solution that HBase uses; it wouldn't be a lot of extra work for them. And if 
HBase ever does deprecate ZK support entirely, as I think Kafka is trying to 
do, of course any testing infrastructure around ZK would get deprecated too. No 
objection to either!

But in the meantime, downstream developers should be able to use the same APIs 
as HBase does to do the same sort of testing. Most tests are simple, but not 
all, and it's the hard ones I worry about.

Happy to continue the conversation here or on the dev list to get other 
opinions and perspectives if you'd like. I'd like to find consensus and avoid 
using my first committer veto if I can. But so far I haven't heard any benefit 
to HBase that outweighs the harm to downstream devs.



[jira] [Commented] (HBASE-26158) MiniZooKeeperCluster should not be IA.Public

2021-08-21 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17402617#comment-17402617
 ] 

Duo Zhang commented on HBASE-26158:
---

Any other concerns here? [~gjacoby] If you still think HBase should keep 
MiniZooKeeperCluster as IA.Public, it would be best to send an email to the dev 
and user lists to collect more feedback.

But for me, I stand by my point that 'HBase' is HBase and 'ZooKeeper' is 
ZooKeeper. It is not HBase's duty to provide a testing utility for ZooKeeper, 
especially since we are not good at it and the community effort is to depend 
less and less on ZooKeeper.

For a class marked as IA.Public, we will honor the compatibility guide: keep 
it for a whole major version, and then remove it.

Thanks.



[jira] [Commented] (HBASE-26158) MiniZooKeeperCluster should not be IA.Public

2021-08-03 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17392618#comment-17392618
 ] 

Duo Zhang commented on HBASE-26158:
---

And on forking the code, if it is not designed to be used by downstream 
projects, forking it is the correct decision IMO.



[jira] [Commented] (HBASE-26158) MiniZooKeeperCluster should not be IA.Public

2021-08-03 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17392617#comment-17392617
 ] 

Duo Zhang commented on HBASE-26158:
---

No, you do not need to use the mini zookeeper directly in most cases. HBTU will 
manage it for you. What you describe above is writing code that uses both 
Kafka and HBase, and I do not think that is the same as writing code inside the 
HBase code base. So it is still the same question: why not let Kafka provide 
the ability to start a mini zookeeper cluster? Why does only the HBase 
community need to eat its own dog food, and not the Kafka community?



[jira] [Commented] (HBASE-26158) MiniZooKeeperCluster should not be IA.Public

2021-08-03 Thread Geoffrey Jacoby (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17392613#comment-17392613
 ] 

Geoffrey Jacoby commented on HBASE-26158:
-

I don't know the details of the Hadoop and HBase communities' decisions that 
led to the forking of metrics and http, so I can't offer an informed opinion on 
it. As a general rule, I consider permanent forks of code, such as those cases, 
a last resort. 

What you describe isn't dogfooding, at least as I understand the term. 

Say I need to coordinate between two in-process services via ZK, or to kill a 
ZK quorum to test some error path code. 

If I'm writing code for the HBase project itself, under this JIRA's proposal I 
would use the MiniZookeeperCluster.
If I'm writing code for anything else, under this JIRA's proposal I would need 
to use curator-test, and manage the dependency myself, or fork the 
MiniZookeeperCluster, and have to keep it up to date with any changes from 
upstream forever. 

Different people would be using different APIs to do the same thing. The fact 
that in _other_, simpler use cases they'd go through the same code path doesn't 
change this. 





[jira] [Commented] (HBASE-26158) MiniZooKeeperCluster should not be IA.Public

2021-08-03 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17392604#comment-17392604
 ] 

Duo Zhang commented on HBASE-26158:
---

I do not think we should shade curator. I still do not see why HBase should 
take care of zookeeper. If you say HBase is a platform, what about Hadoop? It 
also uses zookeeper, and HBase is even built on it, which makes Hadoop more of 
a platform than HBase. So why not rely on Hadoop to provide a zookeeper mini 
cluster? Just because it did not do so in the past, it is not its duty?

Technically, on the curator version: curator has different versions for 
zookeeper 3.4.x and 3.5.x. It has already considered the compatibility issue, 
and it also shades a bunch of dependencies. That is its duty: to make zookeeper 
easy for end users to use. I do not think HBase can do better than curator on 
this topic.

As for your current code base, I do not think it is a strong argument. I also 
have lots of code that depends on the old HBTU, and I also need to change it. 
If you think it is really a pain, just copy the mini zk cluster code into your 
own code base and use it. This is what we have done in HBase many times: copy 
code from Hadoop and use it on our own. The HBase metrics and HBase http 
modules both started by copying code from Hadoop.

And on the last point, ‘dogfooding’: technically, we just wrap HBTU and provide 
a simpler interface to end users, so I do not think we break the rule. We just 
want to hide the internal things that should not be exposed to end users.

Thanks.



[jira] [Commented] (HBASE-26158) MiniZooKeeperCluster should not be IA.Public

2021-08-03 Thread Geoffrey Jacoby (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17392574#comment-17392574
 ] 

Geoffrey Jacoby commented on HBASE-26158:
-

As for why I think it's not good to have one ZK API for internal use cases and 
another for end-users:

HBase isn't just a fast, powerful key-value storage engine. It's also a 
platform (and platform component) on which other developers build applications. 

One of the principles of platform development is that when an internal use case 
and an external use case are both trying to do the same thing, they should 
always use the same API [1]. This allows platform developers (such as the HBase 
community) to experience the ease or difficulties that end-users experience 
when using their platform.

[1] https://deviq.com/practices/dogfooding 



[jira] [Commented] (HBASE-26158) MiniZooKeeperCluster should not be IA.Public

2021-08-03 Thread Geoffrey Jacoby (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17392569#comment-17392569
 ] 

Geoffrey Jacoby commented on HBASE-26158:
-

I agree the curator version compatibility is a big issue here too. There's a 
curator.version in the HBase pom.xml, but it only seems to be used for a coproc 
example. There's little to no guidance to end users about which versions of 
curator are compatible with which versions of HBase. 

Again, I think [~zhangduo] makes a good point that with a publicly available ZK 
minicluster, the MiniZookeeperCluster may no longer be necessary. It's working 
fine both internally and externally, so I don't feel strongly that it needs to 
go, but I wouldn't mind if it did -- so long as it was deprecated both 
externally AND internally, via the usual deprecation method. 

I believe that would be, in HBase 3.0:
1. Take on curator-test as a formal dependency of HBase. This requires figuring 
out which version to use, excluding conflicting transitive dependencies, etc.
2. Probably shade it into hbase-thirdparty?
3. Do HBASE-26167 to allow for overriding ZK with an external embedded ZK 
quorum.
4. Switch TestingHBaseCluster and HBTU to use curator-test by default.
5. Fix any HBase tests that break because of this effort.
6. Fix any bugs found in curator-test because of this effort.
7. Deprecate the MiniZookeeperCluster and the methods in HBTU that reference it 
(but still support them during the 3.x cycle). 

In HBase 4.0:
1. Delete the MiniZookeeperCluster and the methods in HBTU that reference it. 

At this point, HBase will have offloaded its embedded ZK onto a ZK-based 
project, similarly to how MiniDFSCluster is I believe part of HDFS. Which would 
be good. 
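
The deprecate-then-delete part of the plan above could look roughly like the 
sketch below. Everything in it is hypothetical and uses only the JDK: 
EmbeddedZk, LegacyMiniZkSketch and DeprecationSketch are made-up stand-ins, not 
HBase or Curator classes, and the connect string is an arbitrary example. It 
only illustrates keeping the old entry point alive for one major version by 
forwarding to the replacement.

```java
/** Hypothetical replacement abstraction; not an HBase or Curator type. */
interface EmbeddedZk {
  String connectString();
}

/**
 * Stand-in for the old utility: marked deprecated in one major version,
 * still functional, and deleted in the next major version.
 */
@Deprecated
final class LegacyMiniZkSketch implements EmbeddedZk {
  private final EmbeddedZk delegate;

  LegacyMiniZkSketch(EmbeddedZk delegate) {
    this.delegate = delegate;
  }

  /** Old callers keep working because the call is forwarded to the replacement. */
  @Override
  public String connectString() {
    return delegate.connectString();
  }
}

final class DeprecationSketch {
  @SuppressWarnings("deprecation")
  public static void main(String[] args) {
    // The lambda plays the role of the new, externally provided test ZooKeeper.
    EmbeddedZk replacement = () -> "127.0.0.1:2181";
    EmbeddedZk legacy = new LegacyMiniZkSketch(replacement);
    System.out.println(legacy.connectString());
    // java.lang.Deprecated is retained at runtime, so tools can detect it.
    System.out.println(LegacyMiniZkSketch.class.isAnnotationPresent(Deprecated.class));
  }
}
```

Downstream tests written against the old type would keep compiling (with a 
deprecation warning) for the whole deprecation cycle, which is the point of 
doing it "via the usual deprecation method".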

On the other hand, if that sounds like a lot of work compared to the 
benefit...please remember that, aside from HBASE-26167 and the actual 
deprecation JIRA, that's almost the same work you're asking end-users to do. 



[jira] [Commented] (HBASE-26158) MiniZooKeeperCluster should not be IA.Public

2021-08-03 Thread Bharath Vissapragada (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17392420#comment-17392420
 ] 

Bharath Vissapragada commented on HBASE-26158:
--

As Geoffrey mentioned, we use this class quite extensively as a shared ZK test 
instance across multiple mini clusters and other services. It is a very handy 
wrapper that is tightly integrated with the mini cluster and handles all the 
boilerplate code for us. We also don't have to worry about curator version 
compatibility with hbase. 



[jira] [Commented] (HBASE-26158) MiniZooKeeperCluster should not be IA.Public

2021-08-03 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17392297#comment-17392297
 ] 

Duo Zhang commented on HBASE-26158:
---

The project name says it all: we are ‘HBase’, not ‘Zookeeper’. As for your 
problem, if it is about how to configure HBase to make use of an external 
zookeeper, then it certainly should be asked in the HBase community. If you are 
asking about the TestingServer itself, then you should ask the curator 
community. IMO, if you ask about MiniZookeeperCluster in the HBase community, 
you will likely be told to just use TestingServer from curator.

As for why MiniZookeeperCluster is IA.Public, the answer is simple: HBTU was 
public in the past, so in general everything you could get from HBTU had to be 
IA.Public as well. Eventually we found that this was not the correct way, which 
is why we have the parent issue here.

As for changing the existing testing code, we can copy the code to the 
HBase-testing-util module and keep it for the whole of 3.x, so you have plenty 
of time to change your code. And finally, you can still use an IA.Private class 
in HBase; no one can stop you. That is just what we did in HBase to implement 
AsyncWAL, where we used a lot of internal classes from Hadoop. You just do so 
at your own risk.

And I really do not see any problem with having our own internal way of testing 
while providing another way for end users. We need to test the internals of 
HBase, so we need to expose lots of internal stuff in our own testing 
framework, but I do not think we need to expose this to end users, and end 
users should not rely on it. In the future we may change some internal 
implementations so that lots of UTs are no longer useful; but because we have 
exposed the internal stuff in HBTU, we cannot change our code? That’s really 
painful. And if we did change HBTU in this way, or kept the interface the same 
but completely changed the behavior, then what would be the difference between 
marking it IA.Public and IA.Private?



[jira] [Commented] (HBASE-26158) MiniZooKeeperCluster should not be IA.Public

2021-08-03 Thread Geoffrey Jacoby (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17392282#comment-17392282
 ] 

Geoffrey Jacoby commented on HBASE-26158:
-

You keep saying "HBase is not designed to do X" but not providing evidence. 
HBase is the result of many people's decisions and designs -- how do you know 
it's not designed to do this? The code and annotation certainly indicate it 
was, and the code is the concrete form of a design.

The problem with having HBase continue to use MiniZookeeperCluster internally, 
but allowing TestingServer to be passed in via config, is that it makes tests 
using it that way unsupported. My worry is that if I have a problem, the HBase 
community might say, "TestingServer is owned by curator, we don't use it at 
all, so go ask the Curator community", while the Curator community would know 
nothing about the HBase minicluster, and probably point me back to HBase.

Now, if we want the HBase minicluster to formally adopt curator-test internally 
and deprecate MiniZookeeperCluster because MiniZookeeperCluster is now 
redundant given curator-test's existence, that sounds fine -- but having a 
supported option for internal tests and an unsupported option for end-user 
tests just seems like it would lead to lots of problems for end-users.



[jira] [Commented] (HBASE-26158) MiniZooKeeperCluster should not be IA.Public

2021-08-03 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17392278#comment-17392278
 ] 

Duo Zhang commented on HBASE-26158:
---

Filed HBASE-26167.



[jira] [Commented] (HBASE-26158) MiniZooKeeperCluster should not be IA.Public

2021-08-03 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17392276#comment-17392276
 ] 

Duo Zhang commented on HBASE-26158:
---

{quote}
The reason why we can't use curator-test is that there's no way I can see to 
inject HBaseTestingUtility with a curator-test TestingServer instead of an 
MiniZookeeperCluster. So we have to inject a MiniZookeeperCluster into the 
custom process's in-process test harness. The test just wouldn't work with 
curator-test.
{quote}

So the solution is very clear then. We could provide configs for 
TestingHBaseCluster so that it does not start the mini zookeeper cluster and 
the mini dfs cluster. I also need to correct you: you do not need to inject a 
curator-test TestingServer into HBTU. You just need to set the configuration 
that lets HBase use the zookeeper you start outside HBTU, which is 
'hbase.zookeeper.quorum'. You need to set this when actually starting an hbase 
cluster in production, right? Just set it in UT too.
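
As a rough illustration of that configuration step, the sketch below derives 
the two ZooKeeper settings from the connect string of an externally started 
test server. It is an assumption-laden sketch, not tested against HBase: 
java.util.Properties stands in for Hadoop's Configuration so that it runs with 
the JDK alone, ExternalZkConfigSketch and forConnectString are hypothetical 
names, and the host and port are made up. Only the two keys, 
'hbase.zookeeper.quorum' and 'hbase.zookeeper.property.clientPort', are real 
HBase settings.

```java
import java.util.Properties;

/** Sketch: derive HBase ZooKeeper settings from an externally started test ZooKeeper. */
final class ExternalZkConfigSketch {

  // Real HBase configuration keys.
  static final String ZK_QUORUM = "hbase.zookeeper.quorum";
  static final String ZK_CLIENT_PORT = "hbase.zookeeper.property.clientPort";

  /**
   * Splits a single-server connect string such as "127.0.0.1:21818" into the
   * two settings. Properties is a stand-in for Hadoop's Configuration.
   */
  static Properties forConnectString(String connectString) {
    int colon = connectString.lastIndexOf(':');
    if (colon <= 0 || colon == connectString.length() - 1) {
      throw new IllegalArgumentException("expected host:port, got " + connectString);
    }
    String host = connectString.substring(0, colon);
    // parseInt rejects a non-numeric port with a NumberFormatException.
    int port = Integer.parseInt(connectString.substring(colon + 1));
    Properties props = new Properties();
    props.setProperty(ZK_QUORUM, host);
    props.setProperty(ZK_CLIENT_PORT, Integer.toString(port));
    return props;
  }

  public static void main(String[] args) {
    // In a real test the connect string would come from the external server,
    // for example curator-test's TestingServer#getConnectString().
    Properties props = forConnectString("127.0.0.1:21818");
    props.forEach((k, v) -> System.out.println(k + "=" + v));
  }
}
```

In an actual UT the same two values would be set on the Configuration handed 
to the HBase test cluster, once that cluster can be told not to start its own 
mini zookeeper.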

Again, HBase is not designed to provide a utility class to start zookeeper; 
that is the duty of the zookeeper-related projects.

Thanks.



[jira] [Commented] (HBASE-26158) MiniZooKeeperCluster should not be IA.Public

2021-08-03 Thread Geoffrey Jacoby (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17392269#comment-17392269
 ] 

Geoffrey Jacoby commented on HBASE-26158:
-

A correction: the Kafka minicluster in my example above is using the Kafka 
EmbeddedZookeeper class and not sharing a ZK quorum with the HBase minicluster 
or custom process. (Those two do share one with each other.) So let's ignore it 
for this discussion. 

The reason why we can't use curator-test is that there's no way I can see to 
inject HBaseTestingUtility with a curator-test TestingServer instead of an 
MiniZookeeperCluster. So we have to inject a MiniZookeeperCluster into the 
custom process's in-process test harness. The test just wouldn't work with 
curator-test. (There _definitely_ isn't a way to do it with the 
TestingHBaseCluster, which is one of many reasons I proposed HBASE-26115.)

HBase didn't have a "duty" to make MiniZookeeperCluster -- it could use 
curator-test internally and allow tests to inject a TestingServer in the HBTU 
-- but it did, and deliberately exposed it as IA.Public, and downstream users 
are allowed to rely on that. 

It _is_ literally designed to allow end users the ability to start a ZK to 
write UT. That's what IA.Public means. :-)







[jira] [Commented] (HBASE-26158) MiniZooKeeperCluster should not be IA.Public

2021-08-03 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17392028#comment-17392028
 ] 

Duo Zhang commented on HBASE-26158:
---

OK, here comes the question. Since you share a zk across Kafka, HBase, and a 
custom process, why is it the duty of HBase to provide the mini zk cluster, 
and not Kafka?

As I mentioned in the description, HBase is not designed to give end users the 
ability to start a zk to write UT, right? Curator-test is for this purpose, so 
why not just use curator-test?

Thanks.



[jira] [Commented] (HBASE-26158) MiniZooKeeperCluster should not be IA.Public

2021-08-02 Thread Geoffrey Jacoby (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17391939#comment-17391939
 ] 

Geoffrey Jacoby commented on HBASE-26158:
-

[~zhangduo] - I'm not sure that this is a correct assumption. 

For example, my colleague [~bharathv] recently wrote an IT test that used the 
MiniZooKeeperCluster to share a ZK quorum between an in-memory Kafka instance, 
an HBase minicluster, and a custom process implementing the HBase RegionServer 
API. (It's part of a project that will be contributed to Phoenix when it's 
closer to final form.)  

And I can see other valid test cases for using the MiniZooKeeperCluster, such 
as verifying that a custom coproc or replication endpoint (particularly the 
endpoint) doesn't leak resources or do other bad things if ZK connectivity is 
lost. 
