Hey Evans,

My comments are inline.

> Could you elaborate what component is needed for the real-time event
> streaming? Kafka + Flink in current stack are the solution for it. Pulsar
> can be an addition.

I believe Kafka is the most common choice for an OSS message broker / event hub. As you know, Flink (or Spark, or Kafka Streams itself) + Kafka is everywhere. Apache Pulsar would be an alternative to Kafka.

> Regarding query processing, Could you share more insight on the difference
> of Pinot V.S. Presto? Does Pinot suitable for having Looker plugged in
> front of it for analytical purposes?

Apache Pinot and Druid are designed for real-time and time-series data analytics, whereas PrestoDB (or Trino / MPP databases) is used to crunch big data from a general-purpose data lake or 'lake house'. So I think they are not replacements for each other, but rather complements within a data platform.

Thanks,
Youngwoo

On Fri, Nov 5, 2021 at 10:54 PM Evans Ye <[email protected]> wrote:

> Hi Matt Andruff,
> Agree.
>
> Hi Youngwoo,
> Could you elaborate what component is needed for the real-time event
> streaming? Kafka + Flink in current stack are the solution for it. Pulsar
> can be an addition.
> Regarding query processing, Could you share more insight on the difference
> of Pinot V.S. Presto? Does Pinot suitable for having Looker plugged in
> front of it for analytical purposes?
>
> Hi Kengo/Masatake,
> Do you have any feature that is needed for your company to move business
> forward?
>
> BTW, we have some discussion in the past for this topic you can take as a
> reference[1].
>
> [1] https://docs.google.com/document/d/1F2Gxu8GARQDZXgqHn12LKkQ5wCV_AF4b_tVmjYB6YfA/edit#
>
> Youngwoo Kim (김영우) <[email protected]> wrote on Wed, Nov 3, 2021 at 4:02 PM:
>
> > Evans,
> > Thanks for starting this discussion.
> >
> > Hopefully, It would be valuable to integrate the real-time event streaming
> > and query processing stack e.g., Apache Druid, Pinot, Pulsar and etc.
> >
> > And 'k8s operator for Bigtop' looks promising for me!
> >
> > Thanks,
> > Youngwoo
> >
> > On Wed, Nov 3, 2021 at 1:34 AM Evans Ye <[email protected]> wrote:
> >
> > > Hi folks,
> > >
> > > With Bigtop 3.0 been released, I think it's time to discuss what's new as
> > > our next steps. Of course the open source ver. of unified compatible Hadoop
> > > Distro. is still our core product going forward. But the surrounding value
> > > added features might be something that can take us further beyond where we
> > > were at. Now, let me post some ideas to start the brainstorming.
> > >
> > > 1. Deployment on K8S: Ambari or Bigtop Puppet as K8S operators.
> > > 2. MLOps integrations: MLFlow, Submarine.
> > > 3. Data Lake integrations: Hudi, Iceberg, Delta.
> > >
> > > And for some software engineering stuffs, I think we can do a clean up on
> > > out-dated features such as:
> > > 1. vagrant provisioner
> > > 2. docker sandbox
> > > 3. bigtop-ci
> > > 4. bigtop-data-generators
> > > 5. bigtop-bigpetstore
> > >
> > > Any thoughts? Would love to hear all of you.
