Re: [DISCUSS] Apache Amoro proposal

2024-03-02 Thread Justin Mclean
Hi, As the discussion seems to have died down, I’ll put this up for a vote. Kind Regards, Justin

Re: Re: [DISCUSS] Apache Amoro proposal

2024-03-01 Thread Jean-Baptiste Onofré
Hi Nathan, Thanks for the detailed information. Much appreciated. I now have a better understanding of the goals. It looks interesting. Happy to help as a mentor if you need. Thanks ! Regards JB On Sat, Feb 24, 2024 at 6:24 AM nathan ma wrote: > > hi, JB > > As co-creator of this project,

Re: [DISCUSS] Apache Amoro proposal

2024-02-26 Thread Kent Yao
+1, I'm glad to be one of the mentors. I had a discussion with Nathan and Jinsong two years ago. They expressed interest in open-sourcing Arctic (formerly) and donating it to the ASF Incubator in the future. I am happy to witness the community growth and the proposal that has finally been put

Re: [DISCUSS] Apache Amoro proposal

2024-02-25 Thread Xinyu Zhou
+1, as one of the mentors, over the past few months, I have seen significant progress within this community. Regards, Xinyu Zhou On Mon, Feb 26, 2024 at 10:53 AM Xavier Bai wrote: > +1, I was also one of the early developers on the project, focusing on > solving optimization and compaction

Re: [DISCUSS] Apache Amoro proposal

2024-02-25 Thread Xavier Bai
+1, I was also one of the early developers on the project, focusing on solving optimization and compaction issues with the company's Iceberg tables. I believe that many teams using data lakes need a system like Amoro for effective data lake management and to reduce the complexity of data lake

Re: [DISCUSS] Apache Amoro proposal

2024-02-25 Thread ConradJam
+1, I'm one of the developers. At present, I think the community is developing well, and this project can help everyone better control the data lake. I suggest joining the ASF incubator to let more people know about this project and participate in it. Justin Mclean wrote on Fri, Feb 23, 2024 at 16:44: > Hi, >

Re: [DISCUSS] Apache Amoro proposal

2024-02-24 Thread Yu Li
+1. I'm happy to be one of the mentors. I have discussed with Jinsong, Nathan and the team, and am impressed by their openness and passion on improving the Amoro community through incubation. From my observation, it's a well developed community with a similar governance philosophy as the Apache

RE: Re: [DISCUSS] Apache Amoro proposal

2024-02-24 Thread nathan ma
hi, JB As co-creator of this project, I’d love to explain more about the positioning of lakehouse management system. When discussing databases or traditional data warehouses, we often used the term DBMS (Database Management System) to describe them. Traditional databases, including MPP

RE: Re: [DISCUSS] Apache Amoro proposal

2024-02-24 Thread PJ Fanning
+1. Looks like a good candidate with a good number of contributors already. On 2024/02/24 05:24:33 nathan ma wrote: > hi, JB > > As co-creator of this project, I’d love to explain more about the > positioning of lakehouse management system. > > When discussing databases or traditional data

RE: Re: [DISCUSS] Apache Amoro proposal

2024-02-23 Thread nathan ma
hi, JB As co-creator of this project, I’d love to explain more about the positioning of lakehouse management system. When discussing databases or traditional data warehouses, we often used the term DBMS (Database Management System) to describe them. Traditional databases, including MPP

Re: [DISCUSS] Apache Amoro proposal

2024-02-23 Thread 周劲松
Hi JB, Yes, you can say it is an abstraction layer on top of data lake table formats and query engines and we often call it the service layer in Lakehouse architecture. The service layer primarily provides unified metadata and access control, as well as common audit services, and so on. Of

Re: [DISCUSS] Apache Amoro proposal

2024-02-23 Thread 周劲松
Hi Ayush, I am Jinsong from Amoro community. Thank you very much for your attention and feedback on Amoro. Amoro aims to support multiple versions of Hadoop and Hive clusters as much as possible, allowing users to specify versions during build time, but just as you said, our default version

Re: [DISCUSS] Apache Amoro proposal

2024-02-23 Thread Jean-Baptiste Onofré
Hi Justin, Even if it looks interesting, I'm not sure I understand exactly the purpose of the proposal. What does lakehouse management system mean exactly? Is it an abstraction layer on top of Iceberg, Paimon + query engine powered by Flink, Spark, Trino? Please let me know if you want an

Re: [DISCUSS] Apache Amoro proposal

2024-02-23 Thread Ayush Saxena
+1, I remember exploring this while looking for a way to compact Iceberg tables for a Hive use case, and got some good pointers for cleaning up orphan files. I think it was using a pretty old version of Hive (3.1.1, I believe), so I couldn't pull it in as a dependency in the Hive master branch itself,