Hi,

I agree with Jinsong's point of view. Ingesting data with Flink CDC produces a large number of delete files, and that is the main reason most users choose Amoro. It would therefore be better to provide a quick start image and give users who are new to Amoro a demo of the most realistic scenario.
Regarding the images, I agree with Baiyang's point of view. Only keep production-ready images under apache, and put the quick start images (including datanode and namenode) under amoro:

- apache/amoro-ams (just `amoro` may not be clear enough, because Amoro also includes many other things, such as the mixed format and the optimizer)
- apache/amoro-optimizer-flink (may need to be renamed)
- apache/amoro-optimizer-spark (may need to be renamed)
- amoro/amoro-quick-demo
- amoro/amoro-datanode
- amoro/amoro-namenode

Best regards,
Qishang Zhong

Jinsong Zhou <jinsongz...@apache.org> wrote on Tue, Apr 30, 2024 at 10:01:

> Hi,
>
> Executing Insert/Delete/Update SQL through the Terminal can also trigger
> self-optimizing and greatly reduce the difficulty of the quick demo
> process.
> However, CDC streaming ingestion remains a scenario that most users are
> very interested in, and I'm not sure whether it should be removed from the
> quick demo.
> I'd like to hear more input from other developers and users.
>
> Best,
> Jinsong
>
> On Mon, Apr 29, 2024 at 3:07 PM Gang Huang <tcodehu...@gmail.com> wrote:
>
> > Thanks for your reply.
> >
> > I agree with the vast majority of opinions. But I think we should
> > simplify the quick start (both installation and usage) for a newbie.
> > From this point of view, we can remove the amoro-quick-demo image and
> > use the amoro-ams image instead.
> > Users can insert/delete/update rows on the AMS Terminal page, and we can
> > then trigger minor/major processes.
> >
> > Jinsong Zhou <jinsongz...@apache.org> wrote on Mon, Apr 29, 2024 at 14:44:
> >
> > > Hi,
> > >
> > > Thanks a lot for driving this.
> > >
> > > I agree that we should keep the amoro-ams (renamed from amoro),
> > > amoro-flink-optimizer (renamed from flink-optimizer), and
> > > amoro-spark-optimizer (renamed from spark-optimizer) images.
> > > Besides, considering that the quick demo now requires additional usage
> > > of the Flink engine to complete CDC data ingestion, we still need the
> > > amoro-quick-demo image until we plan to adjust the quick demo process.
> > >
> > > Best,
> > > Jinsong
> > >
> > > On Sat, Apr 27, 2024 at 1:30 PM BaiyangTX <xiangneb...@163.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > The docker images currently maintained in the project do have room
> > > > for further optimization.
> > > >
> > > > amoro: This is the core image of the project. In early versions it
> > > > provides the AMS deployment. After this PR is merged
> > > > (https://github.com/apache/amoro/pull/2695), it can also directly
> > > > provide the deployment of the Optimizer on K8S, which should also be
> > > > the recommended usage in the future. I recommend maintaining this
> > > > image in the apache repo in the future, using apache/amoro as the
> > > > image name. Here you can refer to kyuubi
> > > > (https://hub.docker.com/r/apache/kyuubi).
> > > >
> > > > optimizer-flink/optimizer-spark: Provide images for deploying the
> > > > optimizer on different computing engines. I also recommend that
> > > > these two images be maintained under the apache repo, using
> > > > apache/amoro-flink-optimizer and apache/amoro-spark-optimizer as the
> > > > image names.
> > > >
> > > > quickstart: Used to demonstrate the QuickStart part of the official
> > > > website. This image is based on the amoro image and includes
> > > > computing engines such as flink/spark and connectors such as
> > > > iceberg. I recommend maintaining this image under the amoro repo and
> > > > using amoro/quickstart as the image name.
> > > >
> > > > namenode/datanode: These two images are used to provide an HDFS
> > > > environment in the quickstart demonstration. I suggest modifying the
> > > > current quickstart process to use minio as the quickstart
> > > > environment, and no longer maintaining these two images.
> > > >
> > > > The above are my personal suggestions.
> > > >
> > > > Kind Regards,
> > > > baiyangtx
> > > >
> > > > ---- Replied Message ----
> > > > | From | Gang Huang<tcodehu...@gmail.com> |
> > > > | Date | 4/26/2024 17:03 |
> > > > | To | <dev@amoro.apache.org> |
> > > > | Subject | Adjust docker images of apache amoro project |
> > > > Hi,
> > > >
> > > > Currently, there are up to 6 docker images in the apache amoro
> > > > project. But in my opinion, only amoro, optimizer-flink and
> > > > optimizer-spark may be needed. Furthermore, we have to change the
> > > > images' final names (amoro-ams, amoro-optimizer-flink,
> > > > amoro-optimizer-spark) to identify them when uploading them to
> > > > docker hub.
> > > >
> > > > Thus, we can simplify the quick start process for a better user
> > > > experience, just like the iceberg/risingwave quickstarts:
> > > > https://iceberg.apache.org/spark-quickstart/
> > > > https://docs.risingwave.com/docs/current/get-started/
> > > >
> > > > Please feel free to contribute your suggestions.
> > > >
> > > > Kind Regards,
> > > > Gang Huang

--
Best Regards,
Qishang Zhong