Hey Luca,

Good point!
Helm itself is just a package manager for containerized applications on
k8s. I agree that Helm with the built-in k8s objects and controllers is
sufficient in most cases. But if we *really* need complex operations or
reconciliation for our stack, I believe a k8s operator would be a good
candidate. Also, customizing Custom Resources for multiple environments
does not look trivial, but tools like Kustomize would be useful for
customizing the manifests.
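
To make the reconciliation point a bit more concrete, here is a rough
sketch of the compare-and-converge step an operator keeps running, which
a one-shot chart install does not do by itself. Everything in it is
an assumption for illustration: plain dicts stand in for cluster state,
the resource names are made up, and no real Kubernetes API is called.

```python
# Hypothetical sketch: plain dicts stand in for cluster state,
# and no real Kubernetes API or client library is used.
from typing import Dict, List, Tuple

State = Dict[str, dict]      # resource name -> spec
Action = Tuple[str, str]     # (verb, resource name)


def reconcile(desired: State, observed: State) -> List[Action]:
    """Compute the actions that move `observed` toward `desired`."""
    actions: List[Action] = []
    for name in sorted(desired):
        if name not in observed:
            actions.append(("create", name))
        elif observed[name] != desired[name]:
            actions.append(("update", name))
    for name in sorted(observed):
        if name not in desired:
            actions.append(("delete", name))
    return actions


def apply_actions(actions: List[Action], desired: State, observed: State) -> State:
    """Return a new observed state with the actions applied."""
    result = dict(observed)
    for verb, name in actions:
        if verb == "delete":
            result.pop(name, None)
        else:  # create or update
            result[name] = desired[name]
    return result


if __name__ == "__main__":
    # Made-up resource names, for illustration only.
    desired = {"namenode": {"replicas": 2}, "datanode": {"replicas": 3}}
    observed = {"datanode": {"replicas": 1}, "old-job": {"replicas": 1}}
    plan = reconcile(desired, observed)
    print(plan)
    observed = apply_actions(plan, desired, observed)
    print(reconcile(desired, observed))  # empty once converged
```

A real operator would run this kind of loop against the API server on
every change event, so the sketch only shows the shape of the logic.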

Thanks,
Youngwoo

On Sat, Nov 6, 2021 at 6:37 PM Luca Toscano <[email protected]> wrote:

> Hi Evans!
>
> On Tue, Nov 2, 2021 at 5:35 PM Evans Ye <[email protected]> wrote:
> >
> > Hi folks,
> >
> > With Bigtop 3.0 released, I think it's time to discuss what's new for
> > our next steps. Of course, the open source version of a unified,
> > compatible Hadoop distro is still our core product going forward. But
> > the surrounding value-added features might be something that can take
> > us beyond where we were. Now, let me post some ideas to start the
> > brainstorming.
> >
> > 1. Deployment on K8S: Ambari or Bigtop Puppet as K8S operators.
>
> I am wondering how complex it is to write a Kubernetes Operator (that
> I assume would be a go-based application that talks with the
> Kubernetes API) vs writing Helm charts (or similar). We use the latter
> extensively at Wikimedia (but not for any Hadoop-related configs) and
> it works really well.
> Tools like Helmfile (https://github.com/roboll/helmfile) are also very
> nice to bootstrap and manage different
> environments/clusters/configurations. The Helm+Helmfile pair seems
> to be closer to what Bigtop currently does with Puppet, so it may
> be an alternative (before writing an Operator) to figure out how to
> handle configs.
> For example, how is the Operator going to apply/create/etc..
> configurations? I worked with Istio recently (https://istio.io/), and
> they offer tools that basically wrap Helm configurations (via binary
> client-side tool or K8s Operator) under the hood. I've never written a
> K8s operator so my understanding could be completely wrong!
>
> > 2. MLOps integrations: MLFlow, Submarine.
>
> At Wikimedia we are using KServe/Kubeflow; it may be a good addition
> to the list. We are using OpenStack's Swift as object storage for
> models since it offers an S3 API; Apache Ozone could be a very
> nice alternative (I saw some traction in the Jira, I'll try to
> help/review if needed!).
>
> > 3. Data Lake integrations: Hudi, Iceberg, Delta.
> +1, our plan is to experiment with Apache Iceberg very soon :)
>
> > And on the software engineering side, I think we can clean up
> > outdated features such as:
> > 1. vagrant provisioner
> > 2. docker sandbox
> > 3. bigtop-ci
> > 4. bigtop-data-generators
> > 5. bigtop-bigpetstore
>
> Something else that would be nice:
> 1) Upgrade the Puppet version where needed (I know that Bigtop needs
> to keep compatibility with OS Distros that offer older versions of
> puppet etc..)
> 2) Migrate init.d scripts to systemd units where possible (for
> example, in Distros like Debian where it is fully supported).
>
> I understand that the above tasks are very complex and require a
> lot of work :) They may not be super important given the above
> Kubernetes work to focus on, but I thought it was good to mention
> them!
>
> Thanks a lot for all the work!
>
> Luca
>
