Hi JB, Thanks for the review. As we discussed, engines can still use table properties if that's preferred.
When a table is accessed by multiple engines, these optional properties become critical, allowing admins or SREs to specify specialized “recipes” for table maintenance engines.

Yufei

On Mon, Jan 13, 2025 at 6:10 AM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:

> Hi Folks,
>
> I did a new pass on the "Policy Management" proposal and I struggled
> to understand why the policy "overrides" the Iceberg table properties
> (even optional).
> In the proposal, I see this:
>
> {
>   "type": "object",
>   "properties": {
>     "enable": { "type": "boolean" },
>     "target_file_size_bytes": { "type": "integer"},
>     "compaction_threshold": { "type": "number"}
>   },
>   "required": ["enable"]
> }
>
> Table properties already contain write.target-file-size-bytes
> (https://iceberg.apache.org/docs/latest/configuration/#table-properties).
> Why not use this property ?
>
> Same comment for snapshots:
>
> {
>   "type": "object",
>   "properties": {
>     "enable": { "type": "boolean" },
>     "min_snapshot_to_keep": { "type": "integer"},
>     "max_snapshot_age_days": { "type": "integer"}
>   },
>   "required": ["enable"]
> }
>
> We have history.expire.max-snapshot-age-ms and
> history.expire.min-snapshots-to-keep table properties.
>
> I think it's very confusing, and in order to keep query engine
> interoperability, I would rather use the "standard" table properties.
>
> Can we clarify this ?
>
> Thanks !
> Regards
> JB
>
> On Thu, Jan 2, 2025 at 8:30 PM Yufei Gu <flyrain...@gmail.com> wrote:
> >
> > Hi Folks,
> >
> > Happy New Year! I hope you all had a wonderful and refreshing break.
> >
> > Following our previous discussions, we have decided to use separate policy
> > entities (option 2) for table maintenance. I've outlined the detailed design
> > here:
> > https://docs.google.com/document/d/1kIiVkFFg9tPa5SH70b9WwzbmclrzH3qWHKfCKXw5lbs/edit?usp=sharing
> > It is based on the previous design with a wider scope for policy management.
> >
> > I’d love to hear your thoughts, feedback, or suggestions, so feel free to
> > review and share your input.
> >
> > Yufei
> >
> >
> > On Tue, Dec 10, 2024 at 5:43 PM Yufei Gu <flyrain...@gmail.com> wrote:
> >
> > > Hi everyone,
> > >
> > >
> > > Thank you all for taking the time to meet! Here’s a summary of our
> > > discussion:
> > >
> > >    1. *Challenges with Storing Policies as Properties (Option 1):*
> > >       - We identified scalability limitations for access control in this
> > >       approach.
> > >    2. *Benefits of Using Separate Policy Entities (Option 2):*
> > >       - This approach offers a more generic solution with improved access
> > >       control and better performance.
> > >       - This approach could apply to a variety of use cases, like column
> > >       masking.
> > >       - There are certain agreements on this approach.
> > >    3. *Other Options Considered:*
> > >       - Storing policies as Polaris entity properties or using a 1:1
> > >       mapping of policy entities with catalog/namespace/table entities.
> > >       - While slightly different from Option 1, these approaches still
> > >       present notable drawbacks similar to option 1.
> > >    4. *Option to Delegate Policy Storage to TMS:*
> > >       - We discussed the possibility of not storing any policies in
> > >       Polaris, allowing TMS to manage all policies.
> > >       - However, the proposed approach aims to promote interoperability
> > >       across engines and systems like TMS, without preventing them from having
> > >       their own rules or policies.
> > >
> > >
> > > Please let me know if I missed anything or if further clarifications are
> > > needed.
> > >
> > >
> > >
> > > Yufei
> > >
> > >
> > > On Wed, Dec 4, 2024 at 2:37 PM Omar Al-Safi <o...@oalsafi.com> wrote:
> > >
> > >> Thank you Yufei for the flexibility!
> > >>
> > >> Regards,
> > >> Omar
> > >>
> > >> On Wed, 4 Dec 2024, 23:12 Yufei Gu, <flyrain...@gmail.com> wrote:
> > >>
> > >> > I've rescheduled it to next Monday due to the availability. Sorry for
> > >> > any inconvenience.
> > >> > FYI, I will not record it as I don't have a
> > >> > premium Google account yet.
> > >> >
> > >> > Table maintenance in Polaris
> > >> > Monday, December 9 · 9:00 – 10:00am
> > >> > Time zone: America/Los_Angeles
> > >> > Google Meet joining info
> > >> > Video call link: https://meet.google.com/dix-cdfm-pve
> > >> >
> > >> > Yufei
> > >> >
> > >> >
> > >> > On Wed, Dec 4, 2024 at 1:15 AM Omar Al-Safi <o...@oalsafi.com> wrote:
> > >> >
> > >> > > Thank you Yufei for getting this moving.
> > >> > >
> > >> > > Unfortunately I won't be able to make it tomorrow, and I think a couple of
> > >> > > guys are at re:Invent (JB for example). Would it make sense to reschedule to
> > >> > > early next week? Or maybe have it recorded.
> > >> > > As I highlighted in the document, I feel that embedding the policies into
> > >> > > Polaris is more of a TMS concern than a Polaris concern, unless we
> > >> > > provide a way to have pluggable policies where you can either rely on
> > >> > > Polaris to store the policies or let the pluggable implementation handle
> > >> > > how they are stored, which I think fits well in both worlds.
> > >> > >
> > >> > > Regards,
> > >> > > Omar
> > >> > >
> > >> > > On Tue, Dec 3, 2024 at 10:26 PM Yufei Gu <flyrain...@gmail.com> wrote:
> > >> > >
> > >> > > > Sorry the meeting title is misleading, the meeting itself is scheduled on
> > >> > > > Dec. 5th. Thanks Anurag for pointing that out.
> > >> > > >
> > >> > > > Table maintenance in Polaris
> > >> > > > Thursday, December 5 · 9:00 – 10:00am
> > >> > > > Time zone: America/Los_Angeles
> > >> > > > Google Meet joining info
> > >> > > > Video call link: https://meet.google.com/dix-cdfm-pve
> > >> > > >
> > >> > > > Yufei
> > >> > > >
> > >> > > >
> > >> > > > On Tue, Dec 3, 2024 at 12:32 PM Anurag Mantripragada
> > >> > > > <amantriprag...@apple.com.invalid> wrote:
> > >> > > >
> > >> > > > > Thanks Yufei, I think you meant Thursday, December 5th 9:00am – 10:00am
> > >> > > > > (GMT-08).
> > >> > > > >
> > >> > > > >
> > >> > > > > Anurag Mantripragada
> > >> > > > >
> > >> > > > >
> > >> > > > > > On Dec 3, 2024, at 11:33 AM, Yufei Gu <flyrain...@gmail.com> wrote:
> > >> > > > > >
> > >> > > > > > Hi Folks,
> > >> > > > > >
> > >> > > > > > We’ve made some adjustments to the design, moving from *Option 1* to
> > >> > > > > > *Option 2*:
> > >> > > > > >
> > >> > > > > >    1. *Option 1:* Store maintenance policies in catalog/namespace/table
> > >> > > > > >    properties.
> > >> > > > > >    2. *Option 2:* Store maintenance policies as separate entities.
> > >> > > > > >
> > >> > > > > > The key concern with Option 1 is that the access control model isn't
> > >> > > > > > scalable. On the other hand, Option 2 provides greater flexibility,
> > >> > > > > > improved privilege enforcement, and better overall performance.
> > >> > > > > >
> > >> > > > > > I’ve updated the design document with the latest changes, which you can
> > >> > > > > > find here: Updated Design Document
> > >> > > > > > <https://docs.google.com/document/d/1Pd_mzZcfvnUvcH98IbwsIYf4eryet1lQDfclKYx-t-M/edit?usp=sharing>.
> > >> > > > > >
> > >> > > > > > To discuss this design change in detail, I’ll be hosting a session on
> > >> > > > > > Thursday. Please find the meeting details below:
> > >> > > > > > Table maintenance in Polaris @ Thu, Nov 7, 2024 9:00am – 10:00am (GMT-08)
> > >> > > > > > Thursday, December 5 · 9:00 – 10:00am
> > >> > > > > > Time zone: America/Los_Angeles
> > >> > > > > > Google Meet joining info
> > >> > > > > > Video call link: https://meet.google.com/dix-cdfm-pve
> > >> > > > > >
> > >> > > > > > Feel free to review the updated document ahead of the session. Looking
> > >> > > > > > forward to your thoughts and feedback during the meeting!
> > >> > > > > >
> > >> > > > > > Yufei
> > >> > > > > >
> > >> > > > > >
> > >> > > > > > On Mon, Nov 18, 2024 at 9:43 PM Jean-Baptiste Onofré <j...@nanthrax.net>
> > >> > > > > > wrote:
> > >> > > > > >
> > >> > > > > >> Hi Yufei
> > >> > > > > >>
> > >> > > > > >> Not sure we got consensus in all details but the overall picture is ok for
> > >> > > > > >> me.
> > >> > > > > >>
> > >> > > > > >> Let’s continue the details definition in the PR.
> > >> > > > > >>
> > >> > > > > >> Thanks !
> > >> > > > > >> Regards
> > >> > > > > >> JB
> > >> > > > > >>
> > >> > > > > >> On Thu, Nov 14, 2024 at 02:39, Yufei Gu <flyrain...@gmail.com> wrote:
> > >> > > > > >>
> > >> > > > > >>> Hi everyone,
> > >> > > > > >>>
> > >> > > > > >>>
> > >> > > > > >>> Thank you for joining the table maintenance discussion today! We made
> > >> > > > > >>> significant progress, and here are the key takeaways:
> > >> > > > > >>>
> > >> > > > > >>>    1. Clarified further and reached consensus on introducing table
> > >> > > > > >>>    maintenance properties in Polaris to support different TMS and promote
> > >> > > > > >>>    interoperability.
> > >> > > > > >>>    2. Agreed to proceed with Option 1, which stores metadata as
> > >> > > > > >>>    catalog/namespace/table properties.
> > >> > > > > >>>    3. Confirmed the new privileges to ensure that maintenance properties
> > >> > > > > >>>    are safeguarded from being altered by clients with existing write
> > >> > > > > >>>    access.
> > >> > > > > >>>    4. Briefly discussed the support for customized maintenance policies.
> > >> > > > > >>>
> > >> > > > > >>> Next step:
> > >> > > > > >>>
> > >> > > > > >>>    1. Will file maintenance properties related PRs per design
> > >> > > > > >>>    2. Will add more details for customized policy support.
> > >> > > > > >>>
> > >> > > > > >>> *Note*: Unfortunately, I wasn’t able to record the meeting due to the need
> > >> > > > > >>> for a Google premium account.
> > >> > > > > >>>
> > >> > > > > >>>
> > >> > > > > >>> Yufei
> > >> > > > > >>>
> > >> > > > > >>>
> > >> > > > > >>> On Tue, Nov 12, 2024 at 10:10 AM Omar Al-Safi <o...@oalsafi.com> wrote:
> > >> > > > > >>>
> > >> > > > > >>>> Thank you! Will try to be there
> > >> > > > > >>>>
> > >> > > > > >>>> On Tue, 12 Nov 2024, 18:55 Yufei Gu, <flyrain...@gmail.com> wrote:
> > >> > > > > >>>>
> > >> > > > > >>>>> Hi Omar, I sent the invitation to dev@polaris.apache.org, as well as your
> > >> > > > > >>>>> email address.
> > >> > > > > >>>>>
> > >> > > > > >>>>> Yufei
> > >> > > > > >>>>>
> > >> > > > > >>>>>
> > >> > > > > >>>>> On Tue, Nov 12, 2024 at 9:51 AM Omar Al-Safi <o...@oalsafi.com> wrote:
> > >> > > > > >>>>>
> > >> > > > > >>>>>> Thanks Yufei, is it possible to send the invitation to the
> > >> > > > > >>>>>> Polaris google group?
> > >> > > > > >>>>>>
> > >> > > > > >>>>>> Regards,
> > >> > > > > >>>>>> Omar
> > >> > > > > >>>>>>
> > >> > > > > >>>>>> On Tue, Nov 12, 2024 at 6:48 PM Yufei Gu <flyrain...@gmail.com> wrote:
> > >> > > > > >>>>>>
> > >> > > > > >>>>>>> Hi folks,
> > >> > > > > >>>>>>>
> > >> > > > > >>>>>>> We are going to have another sync for table maintenance in Polaris per
> > >> > > > > >>>>>>> discussion with JB. Here are meeting details:
> > >> > > > > >>>>>>>
> > >> > > > > >>>>>>> Polaris Table maintenance sync
> > >> > > > > >>>>>>> Wednesday, November 13 · 10:00 – 11:00am
> > >> > > > > >>>>>>> Time zone: America/Los_Angeles
> > >> > > > > >>>>>>> Google Meet joining info
> > >> > > > > >>>>>>> Video call link: https://meet.google.com/nyy-ahmn-jqd
> > >> > > > > >>>>>>>
> > >> > > > > >>>>>>>
> > >> > > > > >>>>>>> Yufei
> > >> > > > > >>>>>>>
> > >> > > > > >>>>>>>
> > >> > > > > >>>>>>> On Fri, Nov 8, 2024 at 5:23 PM Yufei Gu <flyrain...@gmail.com> wrote:
> > >> > > > > >>>>>>>
> > >> > > > > >>>>>>>> Thanks everyone for joining the discussion. Sorry I couldn't record the
> > >> > > > > >>>>>>>> session due to a tech issue. Here are meeting notes:
> > >> > > > > >>>>>>>>
> > >> > > > > >>>>>>>>    1. We discussed the boundary between Polaris and the Table Maintenance
> > >> > > > > >>>>>>>>    System (TMS). We agreed that they should be separate systems.
> > >> > > > > >>>>>>>>    2. A general agreement on the minimal metadata added to Polaris to
> > >> > > > > >>>>>>>>    support TMS, focusing on essential data needed for interoperability.
> > >> > > > > >>>>>>>>    3. A general consensus on option 1 to store metadata as
> > >> > > > > >>>>>>>>    catalog/namespace/table properties.
> > >> > > > > >>>>>>>>    We could introduce policy entities in
> > >> > > > > >>>>>>>>    the future for other use cases, like column masking. Will address two
> > >> > > > > >>>>>>>>    pieces of feedback:
> > >> > > > > >>>>>>>>       1. Caching the table properties in the catalog to reduce IO cost.
> > >> > > > > >>>>>>>>       2. Introducing new permissions for table maintenance related
> > >> > > > > >>>>>>>>       metadata to prevent any clients with the write permission from
> > >> > > > > >>>>>>>>       messing them up.
> > >> > > > > >>>>>>>>    4. Briefly touched on the communication module between TMS and
> > >> > > > > >>>>>>>>    Polaris. As a long-term plan, an event system from Polaris is necessary;
> > >> > > > > >>>>>>>>    it benefits not only TMS but also other systems which consume changes
> > >> > > > > >>>>>>>>    from Polaris.
> > >> > > > > >>>>>>>>
> > >> > > > > >>>>>>>> Next Steps:
> > >> > > > > >>>>>>>>
> > >> > > > > >>>>>>>>    1. Implement metadata storage as properties
> > >> > > > > >>>>>>>>       1. Design detailed schema for properties
> > >> > > > > >>>>>>>>       2. Figure out a way to be extensible for future maintenance policy
> > >> > > > > >>>>>>>>       or customized policies.
> > >> > > > > >>>>>>>>       3. Add new permissions for new properties
> > >> > > > > >>>>>>>>    2. Begin planning for event system
> > >> > > > > >>>>>>>>
> > >> > > > > >>>>>>>> Yufei
> > >> > > > > >>>>>>>>
> > >> > > > > >>>>>>>>
> > >> > > > > >>>>>>>> On Tue, Nov 5, 2024 at 12:25 AM Jean-Baptiste Onofré <j...@nanthrax.net>
> > >> > > > > >>>>>>>> wrote:
> > >> > > > > >>>>>>>>
> > >> > > > > >>>>>>>>> Hi Yufei
> > >> > > > > >>>>>>>>>
> > >> > > > > >>>>>>>>> Thanks for scheduling this !
> > >> > > > > >>>>>>>>>
> > >> > > > > >>>>>>>>> I should be able to join.
> > >> > > > > >>>>>>>>>
> > >> > > > > >>>>>>>>> For the community, will you be able to record ?
> > >> > > > > >>>>>>>>>
> > >> > > > > >>>>>>>>> Regards
> > >> > > > > >>>>>>>>> JB
> > >> > > > > >>>>>>>>>
> > >> > > > > >>>>>>>>> On Mon, Nov 4, 2024 at 10:40 PM Yufei Gu <flyrain...@gmail.com> wrote:
> > >> > > > > >>>>>>>>>>
> > >> > > > > >>>>>>>>>> Hi Folks,
> > >> > > > > >>>>>>>>>>
> > >> > > > > >>>>>>>>>> I've scheduled a community sync to discuss table maintenance in Polaris
> > >> > > > > >>>>>>>>>> this Thursday at 9 AM PST. Since we didn’t have a chance to dive into this
> > >> > > > > >>>>>>>>>> topic during our last sync, this will be a dedicated session to cover it in
> > >> > > > > >>>>>>>>>> detail.
> > >> > > > > >>>>>>>>>>
> > >> > > > > >>>>>>>>>> *Updates to Note:* I've made some updates to the design document, with a
> > >> > > > > >>>>>>>>>> particular focus on the approach for maintenance metadata.
> > >> > > > > >>>>>>>>>> The document now
> > >> > > > > >>>>>>>>>> favors *Option 1*, which involves leveraging table, namespace, and catalog
> > >> > > > > >>>>>>>>>> properties for maintenance metadata.
> > >> > > > > >>>>>>>>>>
> > >> > > > > >>>>>>>>>> Please review the latest version of the design doc before the meeting, as
> > >> > > > > >>>>>>>>>> it will help us streamline the discussion.
> > >> > > > > >>>>>>>>>>
> > >> > > > > >>>>>>>>>> Looking forward to everyone’s insights!
> > >> > > > > >>>>>>>>>> Video call link: https://meet.google.com/opc-vath-mgb
> > >> > > > > >>>>>>>>>> Design doc:
> > >> > > > > >>>>>>>>>> https://docs.google.com/document/d/1Pd_mzZcfvnUvcH98IbwsIYf4eryet1lQDfclKYx-t-M/edit?usp=sharing
> > >> > > > > >>>>>>>>>> <https://www.google.com/url?q=https://docs.google.com/document/d/1Pd_mzZcfvnUvcH98IbwsIYf4eryet1lQDfclKYx-t-M/edit?usp%3Dsharing&sa=D&source=calendar&usd=2&usg=AOvVaw2V3IjIcadea8miDcKKSG9I>
> > >> > > > > >>>>>>>>>>
> > >> > > > > >>>>>>>>>>
> > >> > > > > >>>>>>>>>> Yufei