Thanks Jan! I am not sure whether you would like to suggest revisions to the options themselves or to their current pros and cons. In either case, as mentioned earlier, we can do that on the doc, and once we agree on the options and their pros and cons, we can move forward. How does that sound?
Thanks,
Walaa.

On Mon, Mar 25, 2024 at 7:45 AM Jan Kaul <jank...@mailbox.org.invalid> wrote:

> I have the feeling that the current pros and cons from the summary target
> a version of the MV spec that wasn't really part of the discussion. The
> current arguments target a completely new specification for materialized
> views, which we agreed is out of scope. Instead of a completely new
> specification, the argument was made for an MV metadata object that embeds
> the View and the Table metadata, which was Option 6
> <https://docs.google.com/spreadsheets/d/1a0tlyh8f2ft2SepE7H3bgoY2A0q5IILgzAsJMnwjTBs/edit#gid=0&range=G3>
> in Jack's summary document. With that approach, the "commitView" and
> "commitTable" operations don't have to be changed and only the "loadView"
> operation has to be adapted. Additionally, compaction and snapshot
> expiration can be reused for the embedded solution. With that in mind,
> cons 2, 4, 5, and 6 from the summary don't really apply.
>
> Furthermore, I think we should distinguish between pros and cons for the
> implementers and for the users, because most of the pros (no new
> operations) for separate objects (option 1) are for the implementers and
> most of the pros (single logical object, doesn't require 2 loads) for
> combined objects (option 3) are for the users. In my opinion, in the long
> run the design decisions should be focused more on the user preferences
> than on the implementers.
>
> On 3/25/24 14:49, Benny Chow wrote:
>
> Hi Manu
>
> This is Walaa's Spark implementation for option 1:
> https://github.com/apache/iceberg/pull/9830/files/a9e1bee3b5bf5914e5330d3b195042aea33868c9
> There's no code for option 2 yet.
>
> Best
> Benny
>
> On Mon, Mar 25, 2024 at 12:37 AM Manu Zhang <owenzhang1...@gmail.com>
> wrote:
>
>> Thanks Walaa for the summary. It's unclear to me from the context which
>> is the reference implementation for option 1 and which is the reference
>> MV spec for option 2. I can find some links in the References section,
>> but I am not sure which one each option refers to.
>>
>> On Mon, Mar 25, 2024 at 3:38 AM Walaa Eldin Moustafa <
>> wa.moust...@gmail.com> wrote:
>>
>>> Thanks Himadri for the questions. At this point, our objective is to
>>> have a common understanding of both options and their pros and cons.
>>> The best way to achieve this is to iterate on the doc to discuss the
>>> details of each option or their pros and cons. We can always add more
>>> details or update the pros and cons. The main thing is to keep the
>>> options to two so that we keep the scope manageable.
>>>
>>> Once we have a common understanding, it will be easy to make a choice
>>> and move forward. Therefore, I would suggest reframing your questions
>>> on the doc as suggestions to add more details to the options,
>>> questions on how either option works, or discussions of their pros and
>>> cons.
>>>
>>> Thanks,
>>> Walaa.