Before I forget, we'd better not forget to change the doc, as create table doc looks to represent current syntax which will be incorrect later.
On Tue, May 12, 2020 at 5:32 PM Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote: > It's not only for end users, but also for us. Spark itself uses the config > "true" and "false" in tests and it still brings confusion. We still have to > deal with both situations. > > I'm wondering how long days it would be needed to revert it cleanly, but > if we worry about the amount of code change just around the new RC, what > about make the code dirty (should be fixed soon) but less headache via > applying traditional (and bad) way? > > Let's just remove the config so that the config cannot be used in any way > (even in Spark codebase), and set corresponding field in parser to the > constant value so that no one can modify in any way. This would make the > dead code by intention which should be cleaned it up later, so let's add > FIXME comment there so that anyone can take it up for cleaning up the code > later. (If no one volunteers then I'll probably pick up.) > > That is a bad pattern, but still better as we prevent end users (even > early adopters) go through the undocumented path in any way, and that will > be explicitly marked as "should be fixed". This is different from retaining > config - I don't expect unified create table syntax will be landed in > bugfix version, so even unified create table syntax can be landed in 3.1.0 > (this is also not guaranteed) the config will live in 3.0.x in any way. If > we temporarily go dirty way then we can clean up the code in any version, > even from bugfix version, maybe within a couple of weeks just after 3.0.0 > is released. > > Does it sound valid? > > On Tue, May 12, 2020 at 2:35 PM Wenchen Fan <cloud0...@gmail.com> wrote: > >> SPARK-30098 was merged about 6 months ago. It's not a clean revert and we >> may need to spend quite a bit of time to resolve conflicts and fix tests. >> >> I don't see why it's still a problem if a feature is disabled and hidden >> from end-users (it's undocumented, the config is internal). The related >> code will be replaced in the master branch sooner or later, when we unify >> the syntaxes. >> >> >> >> On Tue, May 12, 2020 at 6:16 AM Ryan Blue <rb...@netflix.com.invalid> >> wrote: >> >>> I'm all for getting the unified syntax into master. The only issue >>> appears to be whether or not to pass the presence of the EXTERNAL keyword >>> through to a catalog in v2. Maybe it's time to start a discuss thread for >>> that issue so we're not stuck for another 6 weeks on it. >>> >>> On Mon, May 11, 2020 at 3:13 PM Jungtaek Lim < >>> kabhwan.opensou...@gmail.com> wrote: >>> >>>> Btw another wondering here is, is it good to retain the flag on master >>>> as an intermediate step? Wouldn't it be better for us to start "unified >>>> create table syntax" from scratch? >>>> >>>> >>>> On Tue, May 12, 2020 at 6:50 AM Jungtaek Lim < >>>> kabhwan.opensou...@gmail.com> wrote: >>>> >>>>> I'm sorry, but I have to agree with Ryan and Russell. I chose the >>>>> option 1 because it's less worse than option 2, but it doesn't mean I >>>>> fully >>>>> agree with option 1. >>>>> >>>>> Let's make below things clear if we really go with option 1, otherwise >>>>> please consider reverting it. >>>>> >>>>> * Do you fully indicate about "all" the paths where the second create >>>>> table syntax is taken? >>>>> * Could you explain "why" to end users without any confusion? Do you >>>>> think end users will understand it easily? >>>>> * Do you have an actual end users to guide to turn this on? Or do you >>>>> have a plan to turn this on for your team/customers and deal with >>>>> the ambiguity? >>>>> * Could you please document about how things will change if the flag >>>>> is turned on? >>>>> >>>>> I guess the option 1 is to leave a flag as "undocumented" one and >>>>> forget about the path to turn on, but I think that would lead to make the >>>>> feature be "broken window" even we are not able to touch. >>>>> >>>>> On Tue, May 12, 2020 at 6:45 AM Russell Spitzer < >>>>> russell.spit...@gmail.com> wrote: >>>>> >>>>>> I think reverting 30098 is the right decision here if we want to >>>>>> unblock 3.0. We shouldn't ship with features which we know do not >>>>>> function >>>>>> in the way we intend, regardless of how little exposure most users have >>>>>> to >>>>>> them. Even if it's off my default, we should probably work to avoid >>>>>> switches that cause things to behave unpredictably or require a flow >>>>>> chart >>>>>> to actually determine what will happen. >>>>>> >>>>>> On Mon, May 11, 2020 at 3:07 PM Ryan Blue <rb...@netflix.com.invalid> >>>>>> wrote: >>>>>> >>>>>>> I'm all for fixing behavior in master by turning this off as an >>>>>>> intermediate step, but I don't think that Spark 3.0 can safely include >>>>>>> SPARK-30098. >>>>>>> >>>>>>> The problem is that SPARK-30098 introduces strange behavior, as >>>>>>> Jungtaek pointed out. And that behavior is not fully understood. While >>>>>>> working on a unified CREATE TABLE syntax, I hit additional test >>>>>>> failures >>>>>>> <https://github.com/apache/spark/pull/28026#issuecomment-606967363> >>>>>>> where the wrong create path was being used. >>>>>>> >>>>>>> Unless we plan to NOT support the behavior >>>>>>> when spark.sql.legacy.createHiveTableByDefault.enabled is disabled, we >>>>>>> should not ship Spark 3.0 with SPARK-30098. Otherwise, we will have to >>>>>>> deal >>>>>>> with this problem for years to come. >>>>>>> >>>>>>> On Mon, May 11, 2020 at 1:06 AM JackyLee <qcsd2...@163.com> wrote: >>>>>>> >>>>>>>> +1. Agree with Xiao Li and Jungtaek Lim. >>>>>>>> >>>>>>>> This seems to be controversial, and can not be done in a short >>>>>>>> time. It is >>>>>>>> necessary to choose option 1 to unblock Spark 3.0 and support it in >>>>>>>> 3.1. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> Sent from: >>>>>>>> http://apache-spark-developers-list.1001551.n3.nabble.com/ >>>>>>>> >>>>>>>> >>>>>>>> --------------------------------------------------------------------- >>>>>>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Ryan Blue >>>>>>> Software Engineer >>>>>>> Netflix >>>>>>> >>>>>> >>> >>> -- >>> Ryan Blue >>> Software Engineer >>> Netflix >>> >>