Re: [VOTE] SPIP: Catalog API for view metadata

2022-02-23 Thread John Zhuge
Holden has graciously agreed to shepherd the SPIP. Thanks!

On Thu, Feb 10, 2022 at 9:19 AM John Zhuge  wrote:

> The vote is now closed and the vote passes. Thank you to everyone who took
> the time to review and vote on this SPIP. I’m looking forward to adding
> this feature to the next Spark release. The tracking JIRA is
> https://issues.apache.org/jira/browse/SPARK-31357.
>
> The tally is:
>
> +1s:
>
> Walaa Eldin Moustafa
> Erik Krogen
> Holden Karau (binding)
> Ryan Blue
> Chao Sun
> L C Hsieh (binding)
> Huaxin Gao
> Yufei Gu
> Terry Kim
> Jacky Lee
> Wenchen Fan (binding)
>
> 0s:
>
> -1s:
>
> On Mon, Feb 7, 2022 at 10:04 PM Wenchen Fan  wrote:
>
>> +1 (binding)
>>
>> On Sun, Feb 6, 2022 at 10:27 AM Jacky Lee  wrote:
>>
>>> +1 (non-binding). Thanks John!
>>> It's great to see ViewCatalog moving on, it's a nice feature.
>>>
>>> On Sat, Feb 5, 2022 at 11:57, Terry Kim wrote:
>>>
>>>> +1 (non-binding). Thanks John!
>>>>
>>>> Terry
>>>>
>>>> On Fri, Feb 4, 2022 at 4:13 PM Yufei Gu wrote:
>>>>
>>>>> +1 (non-binding)
>>>>> Best,
>>>>>
>>>>> Yufei
>>>>>
>>>>> `This is not a contribution`
>>>>>
>>>>>
>>>>> On Fri, Feb 4, 2022 at 11:54 AM huaxin gao wrote:
>>>>>
>>>>>> +1 (non-binding)
>>>>>>
>>>>>> On Fri, Feb 4, 2022 at 11:40 AM L. C. Hsieh wrote:
>>>>>>
>>>>>>> +1
>>>>>>>
>>>>>>> On Thu, Feb 3, 2022 at 7:25 PM Chao Sun wrote:
>>>>>>> >
>>>>>>> > +1 (non-binding). Looking forward to this feature!
>>>>>>> >
>>>>>>> > On Thu, Feb 3, 2022 at 2:32 PM Ryan Blue wrote:
>>>>>>> >>
>>>>>>> >> +1 for the SPIP. I think it's well designed and it has worked
>>>>>>> >> quite well at Netflix for a long time.
>>>>>>> >>
>>>>>>> >> On Thu, Feb 3, 2022 at 2:04 PM John Zhuge wrote:
>>>>>>> >>>
>>>>>>> >>> Hi Spark community,
>>>>>>> >>>
>>>>>>> >>> I’d like to restart the vote for the ViewCatalog design proposal
>>>>>>> >>> (SPIP).
>>>>>>> >>>
>>>>>>> >>> The proposal is to add a ViewCatalog interface that can be used
>>>>>>> >>> to load, create, alter, and drop views in DataSourceV2.
>>>>>>> >>>
>>>>>>> >>> Please vote on the SPIP until Feb. 9th (Wednesday).
>>>>>>> >>>
>>>>>>> >>> [ ] +1: Accept the proposal as an official SPIP
>>>>>>> >>> [ ] +0
>>>>>>> >>> [ ] -1: I don’t think this is a good idea because …
>>>>>>> >>>
>>>>>>> >>> Thanks!
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> --
>>>>>>> >> Ryan Blue
>>>>>>> >> Tabular
>>>>>>>
>>>>>>> -
>>>>>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>>>>>>
>>>>>>>
>
> --
> John Zhuge
>


-- 
John Zhuge


Re: [VOTE] SPIP: Catalog API for view metadata

2022-02-10 Thread John Zhuge
The vote is now closed and the vote passes. Thank you to everyone who took
the time to review and vote on this SPIP. I’m looking forward to adding
this feature to the next Spark release. The tracking JIRA is
https://issues.apache.org/jira/browse/SPARK-31357.

The tally is:

+1s:

Walaa Eldin Moustafa
Erik Krogen
Holden Karau (binding)
Ryan Blue
Chao Sun
L C Hsieh (binding)
Huaxin Gao
Yufei Gu
Terry Kim
Jacky Lee
Wenchen Fan (binding)

0s:

-1s:

On Mon, Feb 7, 2022 at 10:04 PM Wenchen Fan  wrote:

> +1 (binding)
>
> On Sun, Feb 6, 2022 at 10:27 AM Jacky Lee  wrote:
>
>> +1 (non-binding). Thanks John!
>> It's great to see ViewCatalog moving on, it's a nice feature.
>>
>> On Sat, Feb 5, 2022 at 11:57, Terry Kim wrote:
>>
>>> +1 (non-binding). Thanks John!
>>>
>>> Terry
>>>
>>> On Fri, Feb 4, 2022 at 4:13 PM Yufei Gu  wrote:
>>>
>>>> +1 (non-binding)
>>>> Best,
>>>>
>>>> Yufei
>>>>
>>>> `This is not a contribution`
>>>>
>>>>
>>>> On Fri, Feb 4, 2022 at 11:54 AM huaxin gao wrote:
>>>>
>>>>> +1 (non-binding)
>>>>>
>>>>> On Fri, Feb 4, 2022 at 11:40 AM L. C. Hsieh wrote:
>>>>>
>>>>>> +1
>>>>>>
>>>>>> On Thu, Feb 3, 2022 at 7:25 PM Chao Sun wrote:
>>>>>> >
>>>>>> > +1 (non-binding). Looking forward to this feature!
>>>>>> >
>>>>>> > On Thu, Feb 3, 2022 at 2:32 PM Ryan Blue wrote:
>>>>>> >>
>>>>>> >> +1 for the SPIP. I think it's well designed and it has worked
>>>>>> >> quite well at Netflix for a long time.
>>>>>> >>
>>>>>> >> On Thu, Feb 3, 2022 at 2:04 PM John Zhuge wrote:
>>>>>> >>>
>>>>>> >>> Hi Spark community,
>>>>>> >>>
>>>>>> >>> I’d like to restart the vote for the ViewCatalog design proposal
>>>>>> >>> (SPIP).
>>>>>> >>>
>>>>>> >>> The proposal is to add a ViewCatalog interface that can be used
>>>>>> >>> to load, create, alter, and drop views in DataSourceV2.
>>>>>> >>>
>>>>>> >>> Please vote on the SPIP until Feb. 9th (Wednesday).
>>>>>> >>>
>>>>>> >>> [ ] +1: Accept the proposal as an official SPIP
>>>>>> >>> [ ] +0
>>>>>> >>> [ ] -1: I don’t think this is a good idea because …
>>>>>> >>>
>>>>>> >>> Thanks!
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> --
>>>>>> >> Ryan Blue
>>>>>> >> Tabular
>>>>>>

-- 
John Zhuge


Re: [VOTE] SPIP: Catalog API for view metadata

2022-02-07 Thread Wenchen Fan
+1 (binding)

On Sun, Feb 6, 2022 at 10:27 AM Jacky Lee  wrote:

> +1 (non-binding). Thanks John!
> It's great to see ViewCatalog moving on, it's a nice feature.
>
> On Sat, Feb 5, 2022 at 11:57, Terry Kim wrote:
>
>> +1 (non-binding). Thanks John!
>>
>> Terry
>>
>> On Fri, Feb 4, 2022 at 4:13 PM Yufei Gu  wrote:
>>
>>> +1 (non-binding)
>>> Best,
>>>
>>> Yufei
>>>
>>> `This is not a contribution`
>>>
>>>
>>> On Fri, Feb 4, 2022 at 11:54 AM huaxin gao 
>>> wrote:
>>>
>>>> +1 (non-binding)
>>>>
>>>> On Fri, Feb 4, 2022 at 11:40 AM L. C. Hsieh wrote:
>>>>
>>>>> +1
>>>>>
>>>>> On Thu, Feb 3, 2022 at 7:25 PM Chao Sun wrote:
>>>>> >
>>>>> > +1 (non-binding). Looking forward to this feature!
>>>>> >
>>>>> > On Thu, Feb 3, 2022 at 2:32 PM Ryan Blue wrote:
>>>>> >>
>>>>> >> +1 for the SPIP. I think it's well designed and it has worked quite
>>>>> >> well at Netflix for a long time.
>>>>> >>
>>>>> >> On Thu, Feb 3, 2022 at 2:04 PM John Zhuge wrote:
>>>>> >>>
>>>>> >>> Hi Spark community,
>>>>> >>>
>>>>> >>> I’d like to restart the vote for the ViewCatalog design proposal
>>>>> >>> (SPIP).
>>>>> >>>
>>>>> >>> The proposal is to add a ViewCatalog interface that can be used to
>>>>> >>> load, create, alter, and drop views in DataSourceV2.
>>>>> >>>
>>>>> >>> Please vote on the SPIP until Feb. 9th (Wednesday).
>>>>> >>>
>>>>> >>> [ ] +1: Accept the proposal as an official SPIP
>>>>> >>> [ ] +0
>>>>> >>> [ ] -1: I don’t think this is a good idea because …
>>>>> >>>
>>>>> >>> Thanks!
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> --
>>>>> >> Ryan Blue
>>>>> >> Tabular
>>>>>


Re: [VOTE] SPIP: Catalog API for view metadata

2022-02-05 Thread Jacky Lee
+1 (non-binding). Thanks John!
It's great to see ViewCatalog moving on, it's a nice feature.

On Sat, Feb 5, 2022 at 11:57, Terry Kim wrote:

> +1 (non-binding). Thanks John!
>
> Terry
>
> On Fri, Feb 4, 2022 at 4:13 PM Yufei Gu  wrote:
>
>> +1 (non-binding)
>> Best,
>>
>> Yufei
>>
>> `This is not a contribution`
>>
>>
>> On Fri, Feb 4, 2022 at 11:54 AM huaxin gao 
>> wrote:
>>
>>> +1 (non-binding)
>>>
>>> On Fri, Feb 4, 2022 at 11:40 AM L. C. Hsieh  wrote:
>>>
>>>> +1
>>>>
>>>> On Thu, Feb 3, 2022 at 7:25 PM Chao Sun wrote:
>>>> >
>>>> > +1 (non-binding). Looking forward to this feature!
>>>> >
>>>> > On Thu, Feb 3, 2022 at 2:32 PM Ryan Blue wrote:
>>>> >>
>>>> >> +1 for the SPIP. I think it's well designed and it has worked quite
>>>> >> well at Netflix for a long time.
>>>> >>
>>>> >> On Thu, Feb 3, 2022 at 2:04 PM John Zhuge wrote:
>>>> >>>
>>>> >>> Hi Spark community,
>>>> >>>
>>>> >>> I’d like to restart the vote for the ViewCatalog design proposal
>>>> >>> (SPIP).
>>>> >>>
>>>> >>> The proposal is to add a ViewCatalog interface that can be used to
>>>> >>> load, create, alter, and drop views in DataSourceV2.
>>>> >>>
>>>> >>> Please vote on the SPIP until Feb. 9th (Wednesday).
>>>> >>>
>>>> >>> [ ] +1: Accept the proposal as an official SPIP
>>>> >>> [ ] +0
>>>> >>> [ ] -1: I don’t think this is a good idea because …
>>>> >>>
>>>> >>> Thanks!
>>>> >>
>>>> >>
>>>> >>
>>>> >> --
>>>> >> Ryan Blue
>>>> >> Tabular





Re: [VOTE] SPIP: Catalog API for view metadata

2022-02-04 Thread Terry Kim
+1 (non-binding). Thanks John!

Terry

On Fri, Feb 4, 2022 at 4:13 PM Yufei Gu  wrote:

> +1 (non-binding)
> Best,
>
> Yufei
>
> `This is not a contribution`
>
>
> On Fri, Feb 4, 2022 at 11:54 AM huaxin gao  wrote:
>
>> +1 (non-binding)
>>
>> On Fri, Feb 4, 2022 at 11:40 AM L. C. Hsieh  wrote:
>>
>>> +1
>>>
>>> On Thu, Feb 3, 2022 at 7:25 PM Chao Sun  wrote:
>>> >
>>> > +1 (non-binding). Looking forward to this feature!
>>> >
>>> > On Thu, Feb 3, 2022 at 2:32 PM Ryan Blue  wrote:
>>> >>
>>> >> +1 for the SPIP. I think it's well designed and it has worked quite
>>> well at Netflix for a long time.
>>> >>
>>> >> On Thu, Feb 3, 2022 at 2:04 PM John Zhuge  wrote:
>>> >>>
>>> >>> Hi Spark community,
>>> >>>
>>> >>> I’d like to restart the vote for the ViewCatalog design proposal
>>> (SPIP).
>>> >>>
>>> >>> The proposal is to add a ViewCatalog interface that can be used to
>>> load, create, alter, and drop views in DataSourceV2.
>>> >>>
>>> >>> Please vote on the SPIP until Feb. 9th (Wednesday).
>>> >>>
>>> >>> [ ] +1: Accept the proposal as an official SPIP
>>> >>> [ ] +0
>>> >>> [ ] -1: I don’t think this is a good idea because …
>>> >>>
>>> >>> Thanks!
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Ryan Blue
>>> >> Tabular
>>>


Re: [VOTE] SPIP: Catalog API for view metadata

2022-02-04 Thread Yufei Gu
+1 (non-binding)
Best,

Yufei

`This is not a contribution`


On Fri, Feb 4, 2022 at 11:54 AM huaxin gao  wrote:

> +1 (non-binding)
>
> On Fri, Feb 4, 2022 at 11:40 AM L. C. Hsieh  wrote:
>
>> +1
>>
>> On Thu, Feb 3, 2022 at 7:25 PM Chao Sun  wrote:
>> >
>> > +1 (non-binding). Looking forward to this feature!
>> >
>> > On Thu, Feb 3, 2022 at 2:32 PM Ryan Blue  wrote:
>> >>
>> >> +1 for the SPIP. I think it's well designed and it has worked quite
>> well at Netflix for a long time.
>> >>
>> >> On Thu, Feb 3, 2022 at 2:04 PM John Zhuge  wrote:
>> >>>
>> >>> Hi Spark community,
>> >>>
>> >>> I’d like to restart the vote for the ViewCatalog design proposal
>> (SPIP).
>> >>>
>> >>> The proposal is to add a ViewCatalog interface that can be used to
>> load, create, alter, and drop views in DataSourceV2.
>> >>>
>> >>> Please vote on the SPIP until Feb. 9th (Wednesday).
>> >>>
>> >>> [ ] +1: Accept the proposal as an official SPIP
>> >>> [ ] +0
>> >>> [ ] -1: I don’t think this is a good idea because …
>> >>>
>> >>> Thanks!
>> >>
>> >>
>> >>
>> >> --
>> >> Ryan Blue
>> >> Tabular
>>


Re: [VOTE] SPIP: Catalog API for view metadata

2022-02-04 Thread huaxin gao
+1 (non-binding)

On Fri, Feb 4, 2022 at 11:40 AM L. C. Hsieh  wrote:

> +1
>
> On Thu, Feb 3, 2022 at 7:25 PM Chao Sun  wrote:
> >
> > +1 (non-binding). Looking forward to this feature!
> >
> > On Thu, Feb 3, 2022 at 2:32 PM Ryan Blue  wrote:
> >>
> >> +1 for the SPIP. I think it's well designed and it has worked quite
> well at Netflix for a long time.
> >>
> >> On Thu, Feb 3, 2022 at 2:04 PM John Zhuge  wrote:
> >>>
> >>> Hi Spark community,
> >>>
> >>> I’d like to restart the vote for the ViewCatalog design proposal
> (SPIP).
> >>>
> >>> The proposal is to add a ViewCatalog interface that can be used to
> load, create, alter, and drop views in DataSourceV2.
> >>>
> >>> Please vote on the SPIP until Feb. 9th (Wednesday).
> >>>
> >>> [ ] +1: Accept the proposal as an official SPIP
> >>> [ ] +0
> >>> [ ] -1: I don’t think this is a good idea because …
> >>>
> >>> Thanks!
> >>
> >>
> >>
> >> --
> >> Ryan Blue
> >> Tabular
>


Re: [VOTE] SPIP: Catalog API for view metadata

2022-02-04 Thread L. C. Hsieh
+1

On Thu, Feb 3, 2022 at 7:25 PM Chao Sun  wrote:
>
> +1 (non-binding). Looking forward to this feature!
>
> On Thu, Feb 3, 2022 at 2:32 PM Ryan Blue  wrote:
>>
>> +1 for the SPIP. I think it's well designed and it has worked quite well at 
>> Netflix for a long time.
>>
>> On Thu, Feb 3, 2022 at 2:04 PM John Zhuge  wrote:
>>>
>>> Hi Spark community,
>>>
>>> I’d like to restart the vote for the ViewCatalog design proposal (SPIP).
>>>
>>> The proposal is to add a ViewCatalog interface that can be used to load, 
>>> create, alter, and drop views in DataSourceV2.
>>>
>>> Please vote on the SPIP until Feb. 9th (Wednesday).
>>>
>>> [ ] +1: Accept the proposal as an official SPIP
>>> [ ] +0
>>> [ ] -1: I don’t think this is a good idea because …
>>>
>>> Thanks!
>>
>>
>>
>> --
>> Ryan Blue
>> Tabular




Re: [VOTE] SPIP: Catalog API for view metadata

2022-02-03 Thread Chao Sun
+1 (non-binding). Looking forward to this feature!

On Thu, Feb 3, 2022 at 2:32 PM Ryan Blue  wrote:

> +1 for the SPIP. I think it's well designed and it has worked quite well
> at Netflix for a long time.
>
> On Thu, Feb 3, 2022 at 2:04 PM John Zhuge  wrote:
>
>> Hi Spark community,
>>
>> I’d like to restart the vote for the ViewCatalog design proposal (SPIP).
>>
>> The proposal is to add a ViewCatalog interface that can be used to load,
>> create, alter, and drop views in DataSourceV2.
>>
>> Please vote on the SPIP until Feb. 9th (Wednesday).
>>
>> [ ] +1: Accept the proposal as an official SPIP
>> [ ] +0
>> [ ] -1: I don’t think this is a good idea because …
>>
>> Thanks!
>>
>
>
> --
> Ryan Blue
> Tabular
>


Re: [VOTE] SPIP: Catalog API for view metadata

2022-02-03 Thread Ryan Blue
+1 for the SPIP. I think it's well designed and it has worked quite well at
Netflix for a long time.

On Thu, Feb 3, 2022 at 2:04 PM John Zhuge  wrote:

> Hi Spark community,
>
> I’d like to restart the vote for the ViewCatalog design proposal (SPIP).
>
> The proposal is to add a ViewCatalog interface that can be used to load,
> create, alter, and drop views in DataSourceV2.
>
> Please vote on the SPIP until Feb. 9th (Wednesday).
>
> [ ] +1: Accept the proposal as an official SPIP
> [ ] +0
> [ ] -1: I don’t think this is a good idea because …
>
> Thanks!
>


-- 
Ryan Blue
Tabular


Re: [VOTE] SPIP: Catalog API for view metadata

2022-02-03 Thread John Zhuge
Sure Xiao.

Happy Lunar New Year!

On Thu, Feb 3, 2022 at 1:57 PM Xiao Li  wrote:

> Can we extend the voting window to next Wednesday? This week is a holiday
> week for the lunar new year. AFAIK, many members in Asia are taking the
> whole week off. They might not regularly check the emails.
>
> Also how about starting a separate email thread starting with [VOTE] ?
>
> Happy Lunar New Year!!!
>
> Xiao
>
> On Thu, Feb 3, 2022 at 12:28, Holden Karau wrote:
>
>> +1 (binding)
>>
>> On Thu, Feb 3, 2022 at 2:26 PM Erik Krogen  wrote:
>>
>>> +1 (non-binding)
>>>
>>> Really looking forward to having this natively supported by Spark, so
>>> that we can get rid of our own hacks to tie in a custom view catalog
>>> implementation. I appreciate the care John has put into various parts of
>>> the design and believe this will provide a robust and flexible solution to
>>> this problem faced by various large-scale Spark users.
>>>
>>> Thanks John!
>>>
>>> On Thu, Feb 3, 2022 at 11:22 AM Walaa Eldin Moustafa <
>>> wa.moust...@gmail.com> wrote:
>>>
 +1

 On Thu, Feb 3, 2022 at 11:19 AM John Zhuge  wrote:

> Hi Spark community,
>
> I’d like to restart the vote for the ViewCatalog design proposal (SPIP).
>
> The proposal is to add a ViewCatalog interface that can be used to
> load, create, alter, and drop views in DataSourceV2.
>
> Please vote on the SPIP in the next 72 hours. Once it is approved,
> I’ll update the PR for review.
>
> [ ] +1: Accept the proposal as an official SPIP
> [ ] +0
> [ ] -1: I don’t think this is a good idea because …
>
> Thanks!
>
> On Fri, Jun 4, 2021 at 1:46 PM Walaa Eldin Moustafa <
> wa.moust...@gmail.com> wrote:
>
>> Considering the API aspect, the ViewCatalog API sounds like a good
>> idea. A view catalog will enable us to integrate Coral
>>  (our view SQL
>> translation and management layer) very cleanly to Spark. Currently we can
>> only do it by maintaining our special version of the
>> HiveExternalCatalog. Considering that views can be expanded
>> syntactically without necessarily invoking the analyzer, using a 
>> dedicated
>> view API can make performance better if performance is the concern.
>> Further, a catalog can still be both a table and view provider if it
>> chooses to based on this design, so I do not think we necessarily lose 
>> the
>> ability of providing both. Looking forward to more discussions on this 
>> and
>> making views a powerful tool in Spark.
>>
>> Thanks,
>> Walaa.
>>
>>
>> On Wed, May 26, 2021 at 9:54 AM John Zhuge  wrote:
>>
>>> Looks like we are running in circles. Should we have an online
>>> meeting to get this sorted out?
>>>
>>> Thanks,
>>> John
>>>
>>> On Wed, May 26, 2021 at 12:01 AM Wenchen Fan 
>>> wrote:
>>>
 OK, then I'd vote for TableViewCatalog, because
 1. This is how Hive catalog works, and we need to migrate Hive
 catalog to the v2 API sooner or later.
 2. Because of 1, TableViewCatalog is easy to support in the current
 table/view resolution framework.
 3. It's better to avoid name conflicts between table and views at
 the API level, instead of relying on the catalog implementation.
 4. Caching invalidation is always a tricky problem.

 On Tue, May 25, 2021 at 3:09 AM Ryan Blue 
 wrote:

> I don't think that it makes sense to discuss a different approach
> in the PR rather than in the vote. Let's discuss this now since 
> that's the
> purpose of an SPIP.
>
> On Mon, May 24, 2021 at 11:22 AM John Zhuge 
> wrote:
>
>> Hi everyone, I’d like to start a vote for the ViewCatalog design
>> proposal (SPIP).
>>
>> The proposal is to add a ViewCatalog interface that can be used
>> to load, create, alter, and drop views in DataSourceV2.
>>
>> The full SPIP doc is here:
>> https://docs.google.com/document/d/1XOxFtloiMuW24iqJ-zJnDzHl2KMxipTjJoxleJFz66A/edit?usp=sharing
>>
>> Please vote on the SPIP in the next 72 hours. Once it is
>> approved, I’ll update the PR for review.
>>
>> [ ] +1: Accept the proposal as an official SPIP
>> [ ] +0
>> [ ] -1: I don’t think this is a good idea because …
>>
>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>

>>>
>>> --
>>> John Zhuge
>>>
>>
>
> --
> John Zhuge
>
 --
>> Twitter: https://twitter.com/holdenkarau
>> Books (Learning 

Re: [VOTE] SPIP: Catalog API for view metadata

2022-02-03 Thread Xiao Li
Can we extend the voting window to next Wednesday? This week is a holiday
week for the lunar new year. AFAIK, many members in Asia are taking the
whole week off. They might not regularly check the emails.

Also how about starting a separate email thread starting with [VOTE] ?

Happy Lunar New Year!!!

Xiao

On Thu, Feb 3, 2022 at 12:28, Holden Karau wrote:

> +1 (binding)
>
> On Thu, Feb 3, 2022 at 2:26 PM Erik Krogen  wrote:
>
>> +1 (non-binding)
>>
>> Really looking forward to having this natively supported by Spark, so
>> that we can get rid of our own hacks to tie in a custom view catalog
>> implementation. I appreciate the care John has put into various parts of
>> the design and believe this will provide a robust and flexible solution to
>> this problem faced by various large-scale Spark users.
>>
>> Thanks John!
>>
>> On Thu, Feb 3, 2022 at 11:22 AM Walaa Eldin Moustafa <
>> wa.moust...@gmail.com> wrote:
>>
>>> +1
>>>
>>> On Thu, Feb 3, 2022 at 11:19 AM John Zhuge  wrote:
>>>
 Hi Spark community,

 I’d like to restart the vote for the ViewCatalog design proposal (SPIP).

 The proposal is to add a ViewCatalog interface that can be used to
 load, create, alter, and drop views in DataSourceV2.

 Please vote on the SPIP in the next 72 hours. Once it is approved, I’ll
 update the PR for review.

 [ ] +1: Accept the proposal as an official SPIP
 [ ] +0
 [ ] -1: I don’t think this is a good idea because …

 Thanks!

 On Fri, Jun 4, 2021 at 1:46 PM Walaa Eldin Moustafa <
 wa.moust...@gmail.com> wrote:

> Considering the API aspect, the ViewCatalog API sounds like a good
> idea. A view catalog will enable us to integrate Coral
>  (our view SQL
> translation and management layer) very cleanly to Spark. Currently we can
> only do it by maintaining our special version of the
> HiveExternalCatalog. Considering that views can be expanded
> syntactically without necessarily invoking the analyzer, using a dedicated
> view API can make performance better if performance is the concern.
> Further, a catalog can still be both a table and view provider if it
> chooses to based on this design, so I do not think we necessarily lose the
> ability of providing both. Looking forward to more discussions on this and
> making views a powerful tool in Spark.
>
> Thanks,
> Walaa.
>
>
> On Wed, May 26, 2021 at 9:54 AM John Zhuge  wrote:
>
>> Looks like we are running in circles. Should we have an online
>> meeting to get this sorted out?
>>
>> Thanks,
>> John
>>
>> On Wed, May 26, 2021 at 12:01 AM Wenchen Fan 
>> wrote:
>>
>>> OK, then I'd vote for TableViewCatalog, because
>>> 1. This is how Hive catalog works, and we need to migrate Hive
>>> catalog to the v2 API sooner or later.
>>> 2. Because of 1, TableViewCatalog is easy to support in the current
>>> table/view resolution framework.
>>> 3. It's better to avoid name conflicts between table and views at
>>> the API level, instead of relying on the catalog implementation.
>>> 4. Caching invalidation is always a tricky problem.
>>>
>>> On Tue, May 25, 2021 at 3:09 AM Ryan Blue 
>>> wrote:
>>>
 I don't think that it makes sense to discuss a different approach
 in the PR rather than in the vote. Let's discuss this now since that's 
 the
 purpose of an SPIP.

 On Mon, May 24, 2021 at 11:22 AM John Zhuge 
 wrote:

> Hi everyone, I’d like to start a vote for the ViewCatalog design
> proposal (SPIP).
>
> The proposal is to add a ViewCatalog interface that can be used to
> load, create, alter, and drop views in DataSourceV2.
>
> The full SPIP doc is here:
> https://docs.google.com/document/d/1XOxFtloiMuW24iqJ-zJnDzHl2KMxipTjJoxleJFz66A/edit?usp=sharing
>
> Please vote on the SPIP in the next 72 hours. Once it is approved,
> I’ll update the PR for review.
>
> [ ] +1: Accept the proposal as an official SPIP
> [ ] +0
> [ ] -1: I don’t think this is a good idea because …
>


 --
 Ryan Blue
 Software Engineer
 Netflix

>>>
>>
>> --
>> John Zhuge
>>
>

 --
 John Zhuge

>>> --
> Twitter: https://twitter.com/holdenkarau
> Books (Learning Spark, High Performance Spark, etc.):
> https://amzn.to/2MaRAG9  
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>


Re: [VOTE] SPIP: Catalog API for view metadata

2022-02-03 Thread Holden Karau
+1 (binding)

On Thu, Feb 3, 2022 at 2:26 PM Erik Krogen  wrote:

> +1 (non-binding)
>
> Really looking forward to having this natively supported by Spark, so that
> we can get rid of our own hacks to tie in a custom view catalog
> implementation. I appreciate the care John has put into various parts of
> the design and believe this will provide a robust and flexible solution to
> this problem faced by various large-scale Spark users.
>
> Thanks John!
>
> On Thu, Feb 3, 2022 at 11:22 AM Walaa Eldin Moustafa <
> wa.moust...@gmail.com> wrote:
>
>> +1
>>
>> On Thu, Feb 3, 2022 at 11:19 AM John Zhuge  wrote:
>>
>>> Hi Spark community,
>>>
>>> I’d like to restart the vote for the ViewCatalog design proposal (SPIP).
>>>
>>> The proposal is to add a ViewCatalog interface that can be used to load,
>>> create, alter, and drop views in DataSourceV2.
>>>
>>> Please vote on the SPIP in the next 72 hours. Once it is approved, I’ll
>>> update the PR for review.
>>>
>>> [ ] +1: Accept the proposal as an official SPIP
>>> [ ] +0
>>> [ ] -1: I don’t think this is a good idea because …
>>>
>>> Thanks!
>>>
>>> On Fri, Jun 4, 2021 at 1:46 PM Walaa Eldin Moustafa <
>>> wa.moust...@gmail.com> wrote:
>>>
 Considering the API aspect, the ViewCatalog API sounds like a good
 idea. A view catalog will enable us to integrate Coral
  (our view SQL
 translation and management layer) very cleanly to Spark. Currently we can
 only do it by maintaining our special version of the
 HiveExternalCatalog. Considering that views can be expanded
 syntactically without necessarily invoking the analyzer, using a dedicated
 view API can make performance better if performance is the concern.
 Further, a catalog can still be both a table and view provider if it
 chooses to based on this design, so I do not think we necessarily lose the
 ability of providing both. Looking forward to more discussions on this and
 making views a powerful tool in Spark.

 Thanks,
 Walaa.


 On Wed, May 26, 2021 at 9:54 AM John Zhuge  wrote:

> Looks like we are running in circles. Should we have an online meeting
> to get this sorted out?
>
> Thanks,
> John
>
> On Wed, May 26, 2021 at 12:01 AM Wenchen Fan 
> wrote:
>
>> OK, then I'd vote for TableViewCatalog, because
>> 1. This is how Hive catalog works, and we need to migrate Hive
>> catalog to the v2 API sooner or later.
>> 2. Because of 1, TableViewCatalog is easy to support in the current
>> table/view resolution framework.
>> 3. It's better to avoid name conflicts between table and views at the
>> API level, instead of relying on the catalog implementation.
>> 4. Caching invalidation is always a tricky problem.
>>
>> On Tue, May 25, 2021 at 3:09 AM Ryan Blue 
>> wrote:
>>
>>> I don't think that it makes sense to discuss a different approach in
>>> the PR rather than in the vote. Let's discuss this now since that's the
>>> purpose of an SPIP.
>>>
>>> On Mon, May 24, 2021 at 11:22 AM John Zhuge 
>>> wrote:
>>>
 Hi everyone, I’d like to start a vote for the ViewCatalog design
 proposal (SPIP).

 The proposal is to add a ViewCatalog interface that can be used to
 load, create, alter, and drop views in DataSourceV2.

 The full SPIP doc is here:
 https://docs.google.com/document/d/1XOxFtloiMuW24iqJ-zJnDzHl2KMxipTjJoxleJFz66A/edit?usp=sharing

 Please vote on the SPIP in the next 72 hours. Once it is approved,
 I’ll update the PR for review.

 [ ] +1: Accept the proposal as an official SPIP
 [ ] +0
 [ ] -1: I don’t think this is a good idea because …

>>>
>>>
>>> --
>>> Ryan Blue
>>> Software Engineer
>>> Netflix
>>>
>>
>
> --
> John Zhuge
>

>>>
>>> --
>>> John Zhuge
>>>
>> --
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  
YouTube Live Streams: https://www.youtube.com/user/holdenkarau


Re: [VOTE] SPIP: Catalog API for view metadata

2022-02-03 Thread Erik Krogen
+1 (non-binding)

Really looking forward to having this natively supported by Spark, so that
we can get rid of our own hacks to tie in a custom view catalog
implementation. I appreciate the care John has put into various parts of
the design and believe this will provide a robust and flexible solution to
this problem faced by various large-scale Spark users.

Thanks John!

On Thu, Feb 3, 2022 at 11:22 AM Walaa Eldin Moustafa 
wrote:

> +1
>
> On Thu, Feb 3, 2022 at 11:19 AM John Zhuge  wrote:
>
>> Hi Spark community,
>>
>> I’d like to restart the vote for the ViewCatalog design proposal (SPIP).
>>
>> The proposal is to add a ViewCatalog interface that can be used to load,
>> create, alter, and drop views in DataSourceV2.
>>
>> Please vote on the SPIP in the next 72 hours. Once it is approved, I’ll
>> update the PR for review.
>>
>> [ ] +1: Accept the proposal as an official SPIP
>> [ ] +0
>> [ ] -1: I don’t think this is a good idea because …
>>
>> Thanks!
>>
>> On Fri, Jun 4, 2021 at 1:46 PM Walaa Eldin Moustafa <
>> wa.moust...@gmail.com> wrote:
>>
>>> Considering the API aspect, the ViewCatalog API sounds like a good idea.
>>> A view catalog will enable us to integrate Coral
>>>  (our view SQL
>>> translation and management layer) very cleanly to Spark. Currently we can
>>> only do it by maintaining our special version of the HiveExternalCatalog.
>>> Considering that views can be expanded syntactically without necessarily
>>> invoking the analyzer, using a dedicated view API can make performance
>>> better if performance is the concern. Further, a catalog can still be both
>>> a table and view provider if it chooses to based on this design, so I do
>>> not think we necessarily lose the ability of providing both. Looking
>>> forward to more discussions on this and making views a powerful tool in
>>> Spark.
>>>
>>> Thanks,
>>> Walaa.
>>>
>>>
>>> On Wed, May 26, 2021 at 9:54 AM John Zhuge  wrote:
>>>
 Looks like we are running in circles. Should we have an online meeting
 to get this sorted out?

 Thanks,
 John

 On Wed, May 26, 2021 at 12:01 AM Wenchen Fan 
 wrote:

> OK, then I'd vote for TableViewCatalog, because
> 1. This is how Hive catalog works, and we need to migrate Hive catalog
> to the v2 API sooner or later.
> 2. Because of 1, TableViewCatalog is easy to support in the current
> table/view resolution framework.
> 3. It's better to avoid name conflicts between table and views at the
> API level, instead of relying on the catalog implementation.
> 4. Caching invalidation is always a tricky problem.
>
> On Tue, May 25, 2021 at 3:09 AM Ryan Blue 
> wrote:
>
>> I don't think that it makes sense to discuss a different approach in
>> the PR rather than in the vote. Let's discuss this now since that's the
>> purpose of an SPIP.
>>
>> On Mon, May 24, 2021 at 11:22 AM John Zhuge 
>> wrote:
>>
>>> Hi everyone, I’d like to start a vote for the ViewCatalog design
>>> proposal (SPIP).
>>>
>>> The proposal is to add a ViewCatalog interface that can be used to
>>> load, create, alter, and drop views in DataSourceV2.
>>>
>>> The full SPIP doc is here:
>>> https://docs.google.com/document/d/1XOxFtloiMuW24iqJ-zJnDzHl2KMxipTjJoxleJFz66A/edit?usp=sharing
>>>
>>> Please vote on the SPIP in the next 72 hours. Once it is approved,
>>> I’ll update the PR for review.
>>>
>>> [ ] +1: Accept the proposal as an official SPIP
>>> [ ] +0
>>> [ ] -1: I don’t think this is a good idea because …
>>>
>>
>>
>> --
>> Ryan Blue
>> Software Engineer
>> Netflix
>>
>

 --
 John Zhuge

>>>
>>
>> --
>> John Zhuge
>>
>


Re: [VOTE] SPIP: Catalog API for view metadata

2022-02-03 Thread Walaa Eldin Moustafa
+1

On Thu, Feb 3, 2022 at 11:19 AM John Zhuge  wrote:

> Hi Spark community,
>
> I’d like to restart the vote for the ViewCatalog design proposal (SPIP).
>
> The proposal is to add a ViewCatalog interface that can be used to load,
> create, alter, and drop views in DataSourceV2.
>
> Please vote on the SPIP in the next 72 hours. Once it is approved, I’ll
> update the PR for review.
>
> [ ] +1: Accept the proposal as an official SPIP
> [ ] +0
> [ ] -1: I don’t think this is a good idea because …
>
> Thanks!
>
> On Fri, Jun 4, 2021 at 1:46 PM Walaa Eldin Moustafa 
> wrote:
>
>> Considering the API aspect, the ViewCatalog API sounds like a good idea.
>> A view catalog will enable us to integrate Coral
>>  (our view SQL
>> translation and management layer) very cleanly to Spark. Currently we can
>> only do it by maintaining our special version of the HiveExternalCatalog.
>> Considering that views can be expanded syntactically without necessarily
>> invoking the analyzer, using a dedicated view API can make performance
>> better if performance is the concern. Further, a catalog can still be both
>> a table and view provider if it chooses to based on this design, so I do
>> not think we necessarily lose the ability of providing both. Looking
>> forward to more discussions on this and making views a powerful tool in
>> Spark.
>>
>> Thanks,
>> Walaa.
>>
>>
>> On Wed, May 26, 2021 at 9:54 AM John Zhuge  wrote:
>>
>>> Looks like we are running in circles. Should we have an online meeting
>>> to get this sorted out?
>>>
>>> Thanks,
>>> John
>>>
>>> On Wed, May 26, 2021 at 12:01 AM Wenchen Fan 
>>> wrote:
>>>
 OK, then I'd vote for TableViewCatalog, because
 1. This is how Hive catalog works, and we need to migrate Hive catalog
 to the v2 API sooner or later.
 2. Because of 1, TableViewCatalog is easy to support in the current
 table/view resolution framework.
 3. It's better to avoid name conflicts between table and views at the
 API level, instead of relying on the catalog implementation.
 4. Caching invalidation is always a tricky problem.

 On Tue, May 25, 2021 at 3:09 AM Ryan Blue 
 wrote:

>>>>> I don't think that it makes sense to discuss a different approach in
>>>>> the PR rather than in the vote. Let's discuss this now since that's the
>>>>> purpose of an SPIP.
>>>>>
>>>>> On Mon, May 24, 2021 at 11:22 AM John Zhuge wrote:
>>>>>
>>>>>> Hi everyone, I’d like to start a vote for the ViewCatalog design
>>>>>> proposal (SPIP).
>>>>>>
>>>>>> The proposal is to add a ViewCatalog interface that can be used to
>>>>>> load, create, alter, and drop views in DataSourceV2.
>>>>>>
>>>>>> The full SPIP doc is here:
>>>>>> https://docs.google.com/document/d/1XOxFtloiMuW24iqJ-zJnDzHl2KMxipTjJoxleJFz66A/edit?usp=sharing
>>>>>>
>>>>>> Please vote on the SPIP in the next 72 hours. Once it is approved,
>>>>>> I’ll update the PR for review.
>>>>>>
>>>>>> [ ] +1: Accept the proposal as an official SPIP
>>>>>> [ ] +0
>>>>>> [ ] -1: I don’t think this is a good idea because …
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Ryan Blue
>>>>> Software Engineer
>>>>> Netflix
>>>>>
>>>>
>>>
>>> --
>>> John Zhuge
>>>
>>
>
> --
> John Zhuge
>


Re: [VOTE] SPIP: Catalog API for view metadata

2022-02-03 Thread John Zhuge
Hi Spark community,

I’d like to restart the vote for the ViewCatalog design proposal (SPIP).

The proposal is to add a ViewCatalog interface that can be used to load,
create, alter, and drop views in DataSourceV2.

Please vote on the SPIP in the next 72 hours. Once it is approved, I’ll
update the PR for review.

[ ] +1: Accept the proposal as an official SPIP
[ ] +0
[ ] -1: I don’t think this is a good idea because …

Thanks!

On Fri, Jun 4, 2021 at 1:46 PM Walaa Eldin Moustafa 
wrote:

> Considering the API aspect, the ViewCatalog API sounds like a good idea. A
> view catalog will enable us to integrate Coral (our view SQL
> translation and management layer) very cleanly with Spark. Currently we can
> only do it by maintaining our special version of the HiveExternalCatalog.
> Considering that views can be expanded syntactically without necessarily
> invoking the analyzer, using a dedicated view API can make performance
> better if performance is the concern. Further, a catalog can still be both
> a table and view provider if it chooses to based on this design, so I do
> not think we necessarily lose the ability of providing both. Looking
> forward to more discussions on this and making views a powerful tool in
> Spark.
>
> Thanks,
> Walaa.
>
>
> On Wed, May 26, 2021 at 9:54 AM John Zhuge  wrote:
>
>> Looks like we are running in circles. Should we have an online meeting to
>> get this sorted out?
>>
>> Thanks,
>> John
>>
>> On Wed, May 26, 2021 at 12:01 AM Wenchen Fan  wrote:
>>
>>> OK, then I'd vote for TableViewCatalog, because
>>> 1. This is how Hive catalog works, and we need to migrate Hive catalog
>>> to the v2 API sooner or later.
>>> 2. Because of 1, TableViewCatalog is easy to support in the current
>>> table/view resolution framework.
>>> 3. It's better to avoid name conflicts between tables and views at the
>>> API level, instead of relying on the catalog implementation.
>>> 4. Cache invalidation is always a tricky problem.
>>>
>>> On Tue, May 25, 2021 at 3:09 AM Ryan Blue 
>>> wrote:
>>>
>>>> I don't think that it makes sense to discuss a different approach in
>>>> the PR rather than in the vote. Let's discuss this now since that's the
>>>> purpose of an SPIP.
>>>>
>>>> On Mon, May 24, 2021 at 11:22 AM John Zhuge wrote:
>>>>
>>>>> Hi everyone, I’d like to start a vote for the ViewCatalog design
>>>>> proposal (SPIP).
>>>>>
>>>>> The proposal is to add a ViewCatalog interface that can be used to
>>>>> load, create, alter, and drop views in DataSourceV2.
>>>>>
>>>>> The full SPIP doc is here:
>>>>> https://docs.google.com/document/d/1XOxFtloiMuW24iqJ-zJnDzHl2KMxipTjJoxleJFz66A/edit?usp=sharing
>>>>>
>>>>> Please vote on the SPIP in the next 72 hours. Once it is approved,
>>>>> I’ll update the PR for review.
>>>>>
>>>>> [ ] +1: Accept the proposal as an official SPIP
>>>>> [ ] +0
>>>>> [ ] -1: I don’t think this is a good idea because …
>>>>>
>>>>
>>>>
>>>> --
>>>> Ryan Blue
>>>> Software Engineer
>>>> Netflix
>>>>
>>>
>>
>> --
>> John Zhuge
>>
>

-- 
John Zhuge


Re: [VOTE] SPIP: Catalog API for view metadata

2021-06-04 Thread Walaa Eldin Moustafa
Considering the API aspect, the ViewCatalog API sounds like a good idea. A
view catalog will enable us to integrate Coral (our view SQL
translation and management layer) very cleanly with Spark. Currently we can
only do it by maintaining our special version of the HiveExternalCatalog.
Considering that views can be expanded syntactically without necessarily
invoking the analyzer, using a dedicated view API can make performance
better if performance is the concern. Further, a catalog can still be both
a table and view provider if it chooses to based on this design, so I do
not think we necessarily lose the ability of providing both. Looking
forward to more discussions on this and making views a powerful tool in
Spark.

Thanks,
Walaa.


On Wed, May 26, 2021 at 9:54 AM John Zhuge  wrote:

> Looks like we are running in circles. Should we have an online meeting to
> get this sorted out?
>
> Thanks,
> John
>
> On Wed, May 26, 2021 at 12:01 AM Wenchen Fan  wrote:
>
>> OK, then I'd vote for TableViewCatalog, because
>> 1. This is how Hive catalog works, and we need to migrate Hive catalog to
>> the v2 API sooner or later.
>> 2. Because of 1, TableViewCatalog is easy to support in the current
>> table/view resolution framework.
>> 3. It's better to avoid name conflicts between tables and views at the API
>> level, instead of relying on the catalog implementation.
>> 4. Cache invalidation is always a tricky problem.
>>
>> On Tue, May 25, 2021 at 3:09 AM Ryan Blue 
>> wrote:
>>
>>> I don't think that it makes sense to discuss a different approach in the
>>> PR rather than in the vote. Let's discuss this now since that's the purpose
>>> of an SPIP.
>>>
>>> On Mon, May 24, 2021 at 11:22 AM John Zhuge  wrote:
>>>
>>>> Hi everyone, I’d like to start a vote for the ViewCatalog design
>>>> proposal (SPIP).
>>>>
>>>> The proposal is to add a ViewCatalog interface that can be used to
>>>> load, create, alter, and drop views in DataSourceV2.
>>>>
>>>> The full SPIP doc is here:
>>>> https://docs.google.com/document/d/1XOxFtloiMuW24iqJ-zJnDzHl2KMxipTjJoxleJFz66A/edit?usp=sharing
>>>>
>>>> Please vote on the SPIP in the next 72 hours. Once it is approved, I’ll
>>>> update the PR for review.
>>>>
>>>> [ ] +1: Accept the proposal as an official SPIP
>>>> [ ] +0
>>>> [ ] -1: I don’t think this is a good idea because …
>>>>
>>>
>>>
>>> --
>>> Ryan Blue
>>> Software Engineer
>>> Netflix
>>>
>>
>
> --
> John Zhuge
>


Re: [VOTE] SPIP: Catalog API for view metadata

2021-05-26 Thread John Zhuge
Looks like we are running in circles. Should we have an online meeting to
get this sorted out?

Thanks,
John

On Wed, May 26, 2021 at 12:01 AM Wenchen Fan  wrote:

> OK, then I'd vote for TableViewCatalog, because
> 1. This is how Hive catalog works, and we need to migrate Hive catalog to
> the v2 API sooner or later.
> 2. Because of 1, TableViewCatalog is easy to support in the current
> table/view resolution framework.
> 3. It's better to avoid name conflicts between tables and views at the API
> level, instead of relying on the catalog implementation.
> 4. Cache invalidation is always a tricky problem.
>
> On Tue, May 25, 2021 at 3:09 AM Ryan Blue 
> wrote:
>
>> I don't think that it makes sense to discuss a different approach in the
>> PR rather than in the vote. Let's discuss this now since that's the purpose
>> of an SPIP.
>>
>> On Mon, May 24, 2021 at 11:22 AM John Zhuge  wrote:
>>
>>> Hi everyone, I’d like to start a vote for the ViewCatalog design
>>> proposal (SPIP).
>>>
>>> The proposal is to add a ViewCatalog interface that can be used to load,
>>> create, alter, and drop views in DataSourceV2.
>>>
>>> The full SPIP doc is here:
>>> https://docs.google.com/document/d/1XOxFtloiMuW24iqJ-zJnDzHl2KMxipTjJoxleJFz66A/edit?usp=sharing
>>>
>>> Please vote on the SPIP in the next 72 hours. Once it is approved, I’ll
>>> update the PR for review.
>>>
>>> [ ] +1: Accept the proposal as an official SPIP
>>> [ ] +0
>>> [ ] -1: I don’t think this is a good idea because …
>>>
>>
>>
>> --
>> Ryan Blue
>> Software Engineer
>> Netflix
>>
>

-- 
John Zhuge


Re: [VOTE] SPIP: Catalog API for view metadata

2021-05-26 Thread Wenchen Fan
OK, then I'd vote for TableViewCatalog, because
1. This is how Hive catalog works, and we need to migrate Hive catalog to
the v2 API sooner or later.
2. Because of 1, TableViewCatalog is easy to support in the current
table/view resolution framework.
3. It's better to avoid name conflicts between tables and views at the API
level, instead of relying on the catalog implementation.
4. Cache invalidation is always a tricky problem.
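
To illustrate point 3, here is a minimal Python sketch of what "avoiding
name conflicts at the API level" could look like. It is not Spark's API and
not the SPIP's design: the class TableViewCatalogSketch, its methods, and
NameConflictError are hypothetical names made up for this example. The
assumption is that a combined catalog keeps tables and views in one
namespace, so a conflicting create is rejected by the catalog interface
itself rather than by each implementation.

```python
class NameConflictError(Exception):
    """Raised when a table and a view would share one identifier."""


class TableViewCatalogSketch:
    """Hypothetical combined catalog: tables and views share one namespace."""

    def __init__(self):
        # identifier -> (kind, metadata), where kind is "table" or "view"
        self._entries = {}

    def _create(self, kind, ident, metadata):
        # The single shared map is what enforces uniqueness across kinds.
        if ident in self._entries:
            existing_kind, _ = self._entries[ident]
            raise NameConflictError(
                f"{ident!r} already exists as a {existing_kind}"
            )
        self._entries[ident] = (kind, metadata)

    def create_table(self, ident, schema):
        self._create("table", ident, schema)

    def create_view(self, ident, sql):
        self._create("view", ident, sql)

    def load(self, ident):
        """One lookup resolves the identifier to either a table or a view."""
        return self._entries[ident]


catalog = TableViewCatalogSketch()
catalog.create_table("db.events", {"id": "bigint"})

try:
    # Same identifier as the table above, so the catalog rejects it.
    catalog.create_view("db.events", "SELECT 1")
    conflict_detected = False
except NameConflictError:
    conflict_detected = True
```

With two independent catalogs (one for tables, one for views), nothing in
the interfaces themselves would stop both from accepting "db.events"; the
check would have to live in each catalog implementation.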

On Tue, May 25, 2021 at 3:09 AM Ryan Blue  wrote:

> I don't think that it makes sense to discuss a different approach in the
> PR rather than in the vote. Let's discuss this now since that's the purpose
> of an SPIP.
>
> On Mon, May 24, 2021 at 11:22 AM John Zhuge  wrote:
>
>> Hi everyone, I’d like to start a vote for the ViewCatalog design proposal
>> (SPIP).
>>
>> The proposal is to add a ViewCatalog interface that can be used to load,
>> create, alter, and drop views in DataSourceV2.
>>
>> The full SPIP doc is here:
>> https://docs.google.com/document/d/1XOxFtloiMuW24iqJ-zJnDzHl2KMxipTjJoxleJFz66A/edit?usp=sharing
>>
>> Please vote on the SPIP in the next 72 hours. Once it is approved, I’ll
>> update the PR for review.
>>
>> [ ] +1: Accept the proposal as an official SPIP
>> [ ] +0
>> [ ] -1: I don’t think this is a good idea because …
>>
>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>


Re: [VOTE] SPIP: Catalog API for view metadata

2021-05-24 Thread Ryan Blue
I don't think that it makes sense to discuss a different approach in the PR
rather than in the vote. Let's discuss this now since that's the purpose of
an SPIP.

On Mon, May 24, 2021 at 11:22 AM John Zhuge  wrote:

> Hi everyone, I’d like to start a vote for the ViewCatalog design proposal
> (SPIP).
>
> The proposal is to add a ViewCatalog interface that can be used to load,
> create, alter, and drop views in DataSourceV2.
>
> The full SPIP doc is here:
> https://docs.google.com/document/d/1XOxFtloiMuW24iqJ-zJnDzHl2KMxipTjJoxleJFz66A/edit?usp=sharing
>
> Please vote on the SPIP in the next 72 hours. Once it is approved, I’ll
> update the PR for review.
>
> [ ] +1: Accept the proposal as an official SPIP
> [ ] +0
> [ ] -1: I don’t think this is a good idea because …
>


-- 
Ryan Blue
Software Engineer
Netflix