Re: Request to document the direct relationship between other configurations

2020-02-12 Thread Hyukjin Kwon
Also, I would like to hear other people's thoughts here. Could I ask what you guys think about this in general? On Wed, Feb 12, 2020 at 12:02 PM, Hyukjin Kwon wrote: > To do that, we should explicitly document such structured configuration > and implicit effect, which is currently missing. > I would

Re: Request to document the direct relationship between other configurations

2020-02-12 Thread Hyukjin Kwon
Yeah, that's one of the reasons why I don't want to document this in the guide yet. I would like to make sure dev people are on the same page that documenting is a better practice. I don't intend to force it as a hard requirement; however, if it's pointed out, it would be better to address it. On Wed, 12

Re: Request to document the direct relationship between other configurations

2020-02-12 Thread Jules Damji
All are valid and valuable observations to put into practice: * structured and meaningful config names * explainable text or succinct description * easily accessible or searchable While these are aspirational, they are gradually doable if we make them part of the dev and review cycle. Often

Re: Request to document the direct relationship between other configurations

2020-02-12 Thread Wenchen Fan
In general I think it's better to have more detailed documents, but we don't have to force everyone to do it if the config name is structured. I would +1 to document the relationship if we can't tell it from the config names, e.g. spark.shuffle.service.enabled and spark.dynamicAllocation.enabled.
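A relationship like the one in this example could be checked in code as well as documented. The following is a minimal Python sketch, not Spark's actual validation logic: the `RELATED` table and the `check_related_configs` helper are hypothetical names, and the rule it encodes (dynamic allocation expecting the shuffle service to be enabled) is an assumption made for illustration.

```python
# Hypothetical table: config key -> keys it is assumed to depend on.
# The single rule below is an illustrative assumption, not Spark's real check.
RELATED = {
    "spark.dynamicAllocation.enabled": ["spark.shuffle.service.enabled"],
}


def check_related_configs(conf):
    """Return warnings for enabled configs whose related configs are not enabled."""
    warnings = []
    for key, deps in RELATED.items():
        if conf.get(key, "false").lower() != "true":
            continue  # the config is off, so its dependencies do not matter
        for dep in deps:
            if conf.get(dep, "false").lower() != "true":
                warnings.append(f"{key} is enabled but {dep} is not")
    return warnings


conf = {"spark.dynamicAllocation.enabled": "true"}
for message in check_related_configs(conf):
    print(message)
```

Keeping the relationship in one table means the same data could drive both a runtime warning and a generated note in the docs, which is the kind of link the config names alone do not reveal.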

[DISCUSS] naming policy of Spark configs

2020-02-12 Thread Wenchen Fan
Hi all, I'd like to discuss the naming policy of Spark configs, as for now it depends on personal preference, which leads to inconsistent naming. In general, the config name should be a noun that describes its meaning clearly. Good examples: spark.sql.session.timeZone

Re: Request to document the direct relationship between other configurations

2020-02-12 Thread Dongjoon Hyun
Thank you for raising the issue, Hyukjin. According to the current status of the discussion, it seems that we are able to agree on updating the non-structured configurations and keeping the structured configurations AS-IS. I'm +1 for revisiting the configurations if that is our direction. If

Re: Request to document the direct relationship between other configurations

2020-02-12 Thread Hyukjin Kwon
I think it’s just fine as long as we’re consistent with the instances having the description, for instance: When true and ‘spark.xx.xx’ is enabled, … I think this is 2-2 in most cases so far. I think we can reference other configuration keys in another configuration documentation by using
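A description written in the "When true and 'spark.xx.xx' is enabled" style also makes the relationship easy to extract mechanically. This is a minimal Python sketch under stated assumptions: descriptions quote related keys in straight single quotes, and the keys `spark.example.feature.enabled` and `spark.example.base.enabled`, the `DOCS` table, and the `referenced_keys` helper are all hypothetical.

```python
import re

# Hypothetical config descriptions written in the "When true and '<key>' is enabled" style.
DOCS = {
    "spark.example.feature.enabled": (
        "When true and 'spark.example.base.enabled' is enabled, "
        "the example feature is used."
    ),
    "spark.example.base.enabled": "When true, the example base behavior is used.",
}

# Matches a quoted, dot-separated key that starts with "spark.".
KEY_PATTERN = re.compile(r"'(spark(?:\.\w+)+)'")


def referenced_keys(doc):
    """Return the config keys quoted inside a description string."""
    return KEY_PATTERN.findall(doc)


for key, doc in DOCS.items():
    print(key, "->", referenced_keys(doc))
```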

Re: [DISCUSS] naming policy of Spark configs

2020-02-12 Thread Wenchen Fan
Hi Dongjoon, It's too much work to revisit all the configs that were added in 3.0, but I'll revisit the recent commits that update config names and see if they follow the new policy. Hi Reynold, There are a few interval configs: spark.sql.streaming.fileSink.log.compactInterval

Re: [DISCUSS] naming policy of Spark configs

2020-02-12 Thread Dongjoon Hyun
Thank you, Wenchen. The new policy looks clear to me. +1 for the explicit policy. So, are we going to revise the existing conf names before the 3.0.0 release? Or will it apply to new upcoming configurations from now on? Bests, Dongjoon. On Wed, Feb 12, 2020 at 7:43 AM Wenchen Fan wrote: > Hi

Re: [DISCUSS] naming policy of Spark configs

2020-02-12 Thread Hyukjin Kwon
+1. On Thu, Feb 13, 2020 at 9:30 AM, Gengliang Wang wrote: > +1, this is really helpful. We should make the SQL configurations > consistent and more readable. > > On Wed, Feb 12, 2020 at 3:33 PM Rubén Berenguel > wrote: > >> I love it, it will make configs easier to read and write. Thanks Wenchen. >>

Re: Request to document the direct relationship between other configurations

2020-02-12 Thread Hyukjin Kwon
Adding that information is already the more prevailing style at this moment, and it is usual to follow the prevailing side if there isn't a specific reason to do otherwise. If there is confusion about this, I will explicitly add it into the guide ( https://spark.apache.org/contributing.html). Let me know if this

Re: [DISCUSS] naming policy of Spark configs

2020-02-12 Thread Jungtaek Lim
+1 Thanks for the proposal. Looks very reasonable to me. On Thu, Feb 13, 2020 at 10:53 AM Hyukjin Kwon wrote: > +1. > > On Thu, Feb 13, 2020 at 9:30 AM, Gengliang Wang > wrote: > >> +1, this is really helpful. We should make the SQL configurations >> consistent and more readable. >> >> On Wed, Feb 12,

Re: [DISCUSS] naming policy of Spark configs

2020-02-12 Thread Reynold Xin
This is really cool. We should also be more opinionated about how we specify time and intervals. On Wed, Feb 12, 2020 at 3:15 PM, Dongjoon Hyun < dongjoon.h...@gmail.com > wrote: > > Thank you, Wenchen. > > > The new policy looks clear to me. +1 for the explicit policy. > > > So, are we

Re: [DISCUSS] naming policy of Spark configs

2020-02-12 Thread Rubén Berenguel
I love it, it will make configs easier to read and write. Thanks Wenchen. R > On 13 Feb 2020, at 00:15, Dongjoon Hyun wrote: > > Thank you, Wenchen. > > The new policy looks clear to me. +1 for the explicit policy. > > So, are we going to revise the existing conf names before 3.0.0

Re: [DISCUSS] naming policy of Spark configs

2020-02-12 Thread Gengliang Wang
+1, this is really helpful. We should make the SQL configurations consistent and more readable. On Wed, Feb 12, 2020 at 3:33 PM Rubén Berenguel wrote: > I love it, it will make configs easier to read and write. Thanks Wenchen. > > R > > On 13 Feb 2020, at 00:15, Dongjoon Hyun wrote: > >

Re: Request to document the direct relationship between other configurations

2020-02-12 Thread Hyukjin Kwon
Yes, that's probably our final goal: to revisit the configurations to make them structured and to deduplicate the documentation cleanly. +1. One point I would like to add, though, is that we should add such information to the documentation until we actually reach our final goal, since it's probably going to take a

Re: Request to document the direct relationship between other configurations

2020-02-12 Thread Jungtaek Lim
I tend to agree that there should be a time to make things consistent (and I'm very happy to see the new thread on discussion) and we may want to adopt some practice in the interim. But it is not clear to me what the practice in the interim is. I pointed out the problems of existing style and

Re: [DISCUSS] naming policy of Spark configs

2020-02-12 Thread Takeshi Yamamuro
+1; the idea sounds reasonable. Bests, Takeshi On Thu, Feb 13, 2020 at 12:39 PM Wenchen Fan wrote: > Hi Dongjoon, > > It's too much work to revisit all the configs that added in 3.0, but I'll > revisit the recent commits that update config names and see if they follow > the new policy. > > >

RE: [DISCUSS] naming policy of Spark configs

2020-02-12 Thread Kazuaki Ishizaki
+1 if we add them to Alternative config. Kazuaki Ishizaki From: Takeshi Yamamuro To: Wenchen Fan Cc: Spark dev list Date: 2020/02/13 16:02 Subject: [EXTERNAL] Re: [DISCUSS] naming policy of Spark configs +1; the idea sounds reasonable. Bests, Takeshi On Thu, Feb 13,