Hi folks,

We got a couple of requests from the community on Slack for an automation tool to bootstrap a list of entities in Polaris (e.g. principals).
Currently the Python CLI only supports modifying one entity at a time, so a custom wrapper script is needed to perform operations in bulk. Many people have ended up implementing such a wrapper in different ways. To better support the community on this feature request, I wrote a POC implementation on top of the existing Python CLI (PR: https://github.com/apache/polaris/pull/1474).

The POC is pretty basic; it is really just meant to show one way we could support quick environment bootstrap. It can also be easily integrated with CI for rolling out new entities (open a PR that changes the input config, then have CI call the CLI with the updated config to roll out the changes).

That is what the current PR supports so far. To make it more useful, we may want to support the following:

1. Declarative approach: define what you need in the config file, and the tool figures out the gap between that and the current state and rolls out the changes needed to close it. This also means updates would be supported. We may want to think about whether to support deletes, as they can be destructive and we don't control all entities, such as tables/views.

2. Setup export: export what is currently in the catalog. Should that be all Polaris entities as well as tables/views, or should we add a flag to choose what gets exported? Think of this like mysqldump, where users can decide what to dump and then use the dump to restore or create new environments, fully or partially.

While chatting with Eric, I learned that the sync tool (https://github.com/apache/polaris-tools/tree/main/polaris-synchronizer) has a similar roadmap for supporting this feature (ML thread: https://lists.apache.org/thread/5p96vdvj5x68kfhk8f8vxo51v0y5x769). I would like to get some input from the community before adding more code to the existing PR or working on the potential features mentioned above.

Based on my understanding, the existing sync tool syncs Polaris entities between two catalog servers, as well as the tables/views associated with the catalog.
However, it doesn't currently support environment bootstrap or setup export. I don't have a strong preference for where the functionality should be implemented, but I do think the ability to quickly import/export environments is handy and practical, as we will have a set of environments and running commands one at a time is not really feasible.

Thanks,
Yong Zheng