We have implemented deployments secured with Kerberos. It is no less painful on Kubernetes.
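To make the pain concrete, here is a rough sketch of the shape a kerberized region server takes on Kubernetes. This is not a real manifest; every name in it is hypothetical. The principal's host part has to match the pod's resolvable hostname, which pushes you into a StatefulSet behind a headless Service for stable pod DNS, plus a keytab minted and distributed out of band for every one of those stable names:

    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: regionserver
    spec:
      serviceName: regionserver      # headless Service gives each pod a stable FQDN
      replicas: 3
      selector:
        matchLabels:
          app: regionserver
      template:
        metadata:
          labels:
            app: regionserver
        spec:
          containers:
          - name: hbase
            image: example/hbase:2.5   # illustrative image
            volumeMounts:
            - name: keytabs
              mountPath: /etc/security/keytabs
              readOnly: true
            # hbase-site.xml then carries the usual pair, e.g.:
            #   hbase.regionserver.kerberos.principal = hbase/_HOST@EXAMPLE.COM
            #   hbase.regionserver.keytab.file = /etc/security/keytabs/hbase.keytab
            # where _HOST must expand to the pod's stable DNS name.
          volumes:
          - name: keytabs
            secret:
              secretName: hbase-keytabs  # provisioned out of band, one entry per stable hostname

Scaling out means minting new principals and keytab entries before a new pod can even start, which is a large part of why "don't do it" is the honest advice.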
Our team at $dayjob does not operate or manage the Kerberos service itself; I suppose that shifts the pain. I could not give advice on how best to operate krb on k8s beyond "don't do it". Consider an encrypted overlay network for transport security instead, perhaps based on Istio or Envoy; a minimal sketch follows further down. HBase does have a nifty new mTLS-based RPC in branch-2. It still needs to make its way into a released version, but it is what I'd point to as a starting point for native transport security if that is desired. It only addresses the HBase layer, though; you would still need to secure the dependencies (like HDFS), and tackling this at the network layer instead would be simpler. None of this will help you if you have an adversarial relationship with your service clients and require more than simple auth. Security is going to be something each user must architect for themselves, but I think we could design and offer some configuration primitives to aid them. I know how to do that with Helm but not Kustomize, and could look into it.

> On Mar 22, 2023, at 4:46 AM, Nick Dimiduk <[email protected]> wrote:
>
> Indeed, as with Andrew's comments, we have encountered several "usual suspects" in terms of network issues. Many of our fixes have either landed upstream or are waiting in PRs that need to be merged. I'm sure there are lingering issues, and I'm sure that new issues will arise as we run on more Kubernetes implementations. For now, though, things seem solid enough. We are on a stack of ZooKeeper 3.8, Hadoop 3.3, and HBase 2.5. We've also implemented a basic Chaos Monkey Cluster Manager for interacting with pods, and re-cast ITBLL with the Map tasks as a Kubernetes Job and the Verify task as a Spark application. In short, I'd like to provide a set of primitives that can give us a lot of confidence in the functionality of the systems in question.
>
> We have not attempted to tackle Kerberos. It seems to me like a bad technology choice for a containerized environment, but I'll be the first to admit my ignorance in this area and I'm quite prepared to be wrong. There have been some promising experiments using Cert Manager, but they should be vetted by a wider audience. I hope that such an effort could lead to a review and abstraction of how authentication and authorization are handled in Hadoop generally.
>
>> On Fri, Mar 17, 2023 at 6:51 PM Andrew Purtell <[email protected]> wrote:
>>
>> We have also completed a migration to Kubernetes (more specifically, EKS) at $dayjob. Our basic deployment stack is Helm, Terraform, and Spinnaker, so we took a different approach in implementation. I think we would be happy to review and offer advice/opinion/suggestion on points of commonality, though.
>>
>> As Stephen asked about, when we moved our stack onto k8s we did encounter a few issues, like inappropriate caching of DNS resolutions in Hadoop and HBase. The issues I know of in that regard have been fixed in open source as long as you are running the latest --- Hadoop 3.3, HBase 2.5, ZooKeeper 3.7 or 3.8, etc. But there will, I think, be common cause and interest in addressing any remaining challenges that reveal themselves once we have KIND or MiniKube running on ASF infrastructure and the HBase stack deployed on top of it.
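To make the overlay-network option concrete: assuming an Istio mesh is installed and sidecar injection is enabled for the namespace (the resource and namespace names below are illustrative), mesh-wide mTLS between pods is a single resource, and the sidecars wrap even raw TCP such as HBase RPC and HDFS data transfer without, in principle, any change to the Hadoop configs:

    apiVersion: security.istio.io/v1beta1
    kind: PeerAuthentication
    metadata:
      name: default
      namespace: hbase        # illustrative namespace
    spec:
      mtls:
        mode: STRICT          # sidecars refuse plaintext peer connections

Note this buys wire encryption and peer identity only; it says nothing about which client may do what, which is the adversarial-client case mentioned above.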
>>
>> On Fri, Mar 17, 2023 at 9:52 AM Tak Lon (Stephen) Wu <[email protected]> wrote:
>>
>>> Hi Nick,
>>>
>>> Beyond your concerns about the Apache infrastructure, there is interest on our end, and we have discussed internally how HBase could be deployed in K8s. Would it be possible to share any architecture diagram or design documentation?
>>>
>>> Although moving ZK, HDFS, and HBase sounds as simple as using Kustomize, we have a few questions on the details.
>>>
>>> 1. After deploying in K8s, did you introduce any change in HBase to support multi-tenancy other than using AD / Kerberos?
>>> 2. Did you face any challenges after deploying HDFS in K8s? e.g. network topology (hostname/IP problems) or performance issues?
>>>
>>> That's a bit detailed, and we can take the above questions offline if we have a JIRA or other shared documentation as well.
>>>
>>> Thanks,
>>> Stephen
>>>
>>> On Fri, Mar 17, 2023 at 5:10 AM Nick Dimiduk <[email protected]> wrote:
>>>
>>>> Thank you both for your comments. I'm happy to hear that there is some interest in this pursuit.
>>>>
>>>> Before we get too deep into details of the implementation, I'm concerned that we have support from ASF Infra. Can we install KIND or MiniKube on Jenkins worker hosts? Can we get a full cluster provisioned in some environment, 3PC or otherwise? How can we manage permissions and resource access? Do we want this to be confined to CI, or will we grant permissions to committers as well?
>>>>
>>>> On Tue, Mar 14, 2023 at 6:18 PM Lars Francke <[email protected]> wrote:
>>>>
>>>>> Hi Nick,
>>>>>
>>>>> I do not mean to derail your mail so I'll keep mine short: Yes, I think testing & infrastructure on Kubernetes would be worthwhile, and I thank you for the offer. We're happy to take a look and would try to review any incoming contributions depending on how large/digestible they are :)
>>>>>
>>>>> We[1] are developing our own operator[2] and during that mission have learned how prone Kubernetes is to bit rot, so it'd be great if your team would continue helping out. I'm happy to go into details on why we did what we did, but that'd be a separate thread.
>>>>>
>>>>> Cheers,
>>>>> Lars
>>>>>
>>>>> [1] <https://stackable.tech/en/>
>>>>> [2] <https://github.com/stackabletech/hbase-operator>
>>>>>
>>>>> On Tue, Mar 14, 2023 at 3:23 PM Mallikarjun <[email protected]> wrote:
>>>>>>
>>>>>> Hi Nick,
>>>>>>
>>>>>> I agree with your thought that there is an increasing reliance on Kubernetes, more so for complex workloads like HBase deployments, because of the unavailability of reliable automation frameworks outside of k8s.
>>>>>>
>>>>>> But I have a slightly different view in terms of how to achieve it. When I was exploring the possibilities, such as Kustomize or Helm or an operator, I found it can get pretty complex to write extensible deployment manifests (for different kinds of deployments) with tools like Kustomize or Helm.
>>>>>> Here is our attempt to containerize HBase with an operator --> https://github.com/flipkart-incubator/hbase-k8s-operator
>>>>>>
>>>>>> ---
>>>>>> Mallikarjun
>>>>>>
>>>>>> On Mon, Mar 13, 2023 at 3:58 PM Nick Dimiduk <[email protected]> wrote:
>>>>>>
>>>>>>> Heya team,
>>>>>>>
>>>>>>> Over here at $dayjob, we have an increasing reliance on Kubernetes for both development and production workloads. Our tools are maturing and we're hoping that they might be of interest to the wider community. I'd like to see if there's community interest in receiving some/any of them as a contribution. I think we'll also need a plan from ASF Infra that makes Kubernetes available to us as a project.
>>>>>>>
>>>>>>> We have implemented a basic stack of tools for orchestrating ZK + HDFS + HBase on Kubernetes. We use this for running a small local dev cluster via MiniKube/KIND; for ITBLL on smallish distributed clusters in a public cloud; and in production for running clusters of ~100 Data Nodes/Region Servers in a public cloud. There was an earlier discussion about using our donation of test hardware for running more thorough tests in our CI, but one of the limiting factors is full cluster deployment. I hope that the community might be interested in receiving this tooling as a foundation for more rigorous correctness and maybe even performance tests in the open. Furthermore, perhaps the wider community has interest in an Apache-licensed cluster orchestration tool for other uses.
>>>>>>>
>>>>>>> Now for some details: The implementation is built on Kustomize, so it's fundamentally transparent resource specification with yaml patches for composability; this is in contrast to a solution using templates with defined capabilities and interfaces (see the sketch at the end of this thread). There is no operator; it's all coordinated via init/bootstrap containers, shell scripts, shared volumes for state, &c. For now.
>>>>>>>
>>>>>>> Such a donation will amount to a code drop, which will have its challenges. I'm motivated via internal processes to carve it into smaller pieces, and I think that will benefit community review as well. Perhaps this approach could be used to make the contribution via a feature branch.
>>>>>>>
>>>>>>> Is there community interest in adding such a capability to our maintained responsibilities? I'd hope that we have several volunteers to work with me through the contribution process, and who are reasonably confident that they'll be able to help maintain such a capability going forward. We'll also need someone who can work with Infra to get us access to Kubernetes cluster(s), via whatever means.
>>>>>>>
>>>>>>> What do you think?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Nick & the HBase team at Apple
>>
>> --
>> Best regards,
>> Andrew
>>
>> Unrest, ignorance distilled, nihilistic imbeciles -
>> It's what we’ve earned
>> Welcome, apocalypse, what’s taken you so long?
>> Bring us the fitting end that we’ve been counting on
>> - A23, Welcome, Apocalypse
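A minimal sketch of the Kustomize shape Nick describes above: transparent resources plus plain YAML patches rather than templates. All paths and names here are invented for illustration:

    # overlays/dev/kustomization.yaml
    apiVersion: kustomize.config.k8s.io/v1beta1
    kind: Kustomization
    resources:
    - ../../base/zookeeper
    - ../../base/hdfs
    - ../../base/hbase
    patches:
    - path: regionserver-replicas.yaml   # an ordinary YAML fragment, no templating
      target:
        kind: StatefulSet
        name: regionserver

    # overlays/dev/regionserver-replicas.yaml
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: regionserver
    spec:
      replicas: 1    # the dev overlay shrinks the cluster; other overlays patch differently

Composability falls out of the directory layout rather than from template parameters, which is the contrast with Helm that both Nick and Mallikarjun touch on.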
