Sorry, backing off on this due to time constraints. On Wed, Jan 22, 2020 at 3:39 PM Udi Meiri <eh...@google.com> wrote:
> It sounds like there's a consensus for yapf. I volunteer to take this on > > On Wed, Jan 22, 2020, 10:31 Udi Meiri <eh...@google.com> wrote: > >> +1 to autoformatting >> >> On Wed, Jan 22, 2020 at 9:57 AM Luke Cwik <lc...@google.com> wrote: >> >>> +1 to autoformatters. Also the Beam Java SDK went through a one time >>> pass to apply the spotless formatting. >>> >>> On Tue, Jan 21, 2020 at 9:52 PM Ahmet Altay <al...@google.com> wrote: >>> >>>> +1 to autoformatters and yapf. It appears to be a well maintained >>>> project. I do support making a one time pass to apply formatting the whole >>>> code base. >>>> >>>> On Tue, Jan 21, 2020 at 5:38 PM Chad Dombrova <chad...@gmail.com> >>>> wrote: >>>> >>>>> >>>>>> It'd be good if there was a way to only apply to violating (or at >>>>>> least changed) lines. >>>>> >>>>> >>>>> I assumed the first thing we’d do is convert all of the code in one >>>>> go, since it’s a very safe operation. Did you have something else in mind? >>>>> >>>>> -chad >>>>> >>>>> >>>>> >>>>> >>>>> >>>>>> >>>>>> On Tue, Jan 21, 2020 at 1:56 PM Chad Dombrova <chad...@gmail.com> >>>>>> wrote: >>>>>> > >>>>>> > +1 to autoformatting >>>>>> > >>>>>> > Let me add some nuance to that. >>>>>> > >>>>>> > The way I see it there are 2 varieties of formatters: those which >>>>>> take the original formatting into consideration (autopep8) and those >>>>>> which >>>>>> disregard it (yapf, black). >>>>>> > >>>>>> > I much prefer yapf to black, because you have plenty of options to >>>>>> tweak with yapf (enough to make the output a pretty close match to the >>>>>> current Beam style), and you can mark areas to preserve the original >>>>>> formatting, which could be very useful with Pipeline building with pipe >>>>>> operators. Please don't pick black. >>>>>> > >>>>>> > autopep8 is more along the lines of spotless in Java -- it only >>>>>> corrects code that breaks the project's style rules. The big problem >>>>>> with >>>>>> Beam's current style is that it is so esoteric that autopep8 can't >>>>>> enforce >>>>>> it -- and I'm not just talking about 2-spaces, which I don't really have >>>>>> a >>>>>> problem with -- the problem is the use of either 2 or 4 spaces depending >>>>>> on >>>>>> context (expression start vs hanging indent, etc). This is my *biggest* >>>>>> gripe about the current style. PyCharm doesn't have enough control >>>>>> either. So, if we can choose a style that can be expressed by flake8 or >>>>>> pycodestyle then we can use autopep8 to enforce it. >>>>>> > >>>>>> > I'd prefer autopep8 to yapf because I like having a little wiggle >>>>>> room to influence the style, but on a big project like Beam all that >>>>>> wiggle >>>>>> room ends up to minor but noticeable inconsistencies in style throughout >>>>>> the project. yapf ensures completely consistent style, but the tradeoff >>>>>> is >>>>>> that it's sometimes ugly, especially in scenarios with similar repeated >>>>>> entries like argparse, where yapf might insert line breaks in visually >>>>>> inconsistent and unappealing ways depending on the lengths of the >>>>>> keywords >>>>>> and expressions involved. >>>>>> > >>>>>> > Either way (but especially if we choose yapf) I think it'd be a >>>>>> nice addition to setup a pre-commit [1] config so that people can opt in >>>>>> to >>>>>> running *lightweight* autofixers prior to commit. This will not only >>>>>> reduce dev frustration but will also reduce the amount of cpu cycles that >>>>>> Jenkins spends pointing out lint errors. >>>>>> > >>>>>> > [1] https://pre-commit.com/ >>>>>> > >>>>>> > -chad >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > On Tue, Jan 21, 2020 at 12:52 PM Ismaël Mejía <ieme...@gmail.com> >>>>>> wrote: >>>>>> >> >>>>>> >> Last time we discussed this there seems not to be much progress >>>>>> into autoformatting. >>>>>> >> This tool looks more tweakable, so maybe it could be more >>>>>> appropriate for Beam's use case. >>>>>> >> https://github.com/google/yapf/ >>>>>> >> WDYT? >>>>>> >> >>>>>> >> >>>>>> >> On Thu, May 30, 2019 at 10:50 AM Łukasz Gajowy <lgaj...@apache.org> >>>>>> wrote: >>>>>> >>> >>>>>> >>> +1 for any autoformatter for Python SDK that does the job. My >>>>>> experience is that since spotless in Java SDK I would never start a new >>>>>> Java project without it. So many great benefits not only for one person >>>>>> coding but for all community. >>>>>> >>> >>>>>> >>> It is a GitHub UI issue that you cannot easily browse past the >>>>>> reformat. It is not actually that hard, but does take a couple extra >>>>>> clicks >>>>>> to get GitHub to display blame before a reformat. It is easier with the >>>>>> command line. I do a lot of code history digging and the global Java >>>>>> reformat is not really a problem. >>>>>> >>> >>>>>> >>> It's actually one more click on Github but I agree it's not the >>>>>> best way to search the history. The most convenient and clear one I've >>>>>> found so far is in Jetbrains IDEs (Intelij) where you can: >>>>>> >>> >>>>>> >>> right click on line number -> "annotate" -> click again -> >>>>>> "annotate previous revision" -> ... >>>>>> >>> >>>>>> >>> You can also use "compare with" to see the diff between two >>>>>> revisions. >>>>>> >>> >>>>>> >>> Łukasz >>>>>> >>> >>>>>> >>> >>>>>> >>> >>>>>> >>> >>>>>> >>> >>>>>> >>> czw., 30 maj 2019 o 06:15 Kenneth Knowles <k...@apache.org> >>>>>> napisał(a): >>>>>> >>>> >>>>>> >>>> +1 pending good enough tooling (I can't quite tell - seems there >>>>>> are some issues?) >>>>>> >>>> >>>>>> >>>> On Wed, May 29, 2019 at 2:40 PM Katarzyna Kucharczyk < >>>>>> ka.kucharc...@gmail.com> wrote: >>>>>> >>>>> >>>>>> >>>>> What else actually we gain? My guess is faster PR review >>>>>> iteration. We will skip some of conversations about code style. >>>>>> >>>> >>>>>> >>>> ... >>>>>> >>>>> >>>>>> >>>>> Last but not least, new contributor may be less discouraged. >>>>>> When I started contribute I didn’t know how to format my code and I lost >>>>>> a >>>>>> lot of time to add pylint and adjust IntelliJ. I eventually failed. >>>>>> Currently I write code intuitively and when I don’t forget I rerun tox. >>>>>> >>>> >>>>>> >>>> >>>>>> >>>> This is a huge benefit. This is why I supported it so much for >>>>>> Java. It is a community benefit. You do not have to be a contributor to >>>>>> the >>>>>> Python SDK to support this. That is why I am writing here. Just eliminate >>>>>> all discussion of formatting. It doesn't really matter what the resulting >>>>>> format is, if it is not crazy to read. I strongly oppose maintaining a >>>>>> non-default format. >>>>>> >>>> >>>>>> >>>> Reformating 20k lines or 200k is not hard. The Java global >>>>>> reformat touched 50k lines. It does not really matter how big it is. >>>>>> Definitely do it all at once if you think the tool is good enough. And >>>>>> you >>>>>> should pin a version, so churn is not a problem. You can upgrade the >>>>>> version and reformat in a PR later and that is also easy. >>>>>> >>>> >>>>>> >>>> It is a GitHub UI issue that you cannot easily browse past the >>>>>> reformat. It is not actually that hard, but does take a couple extra >>>>>> clicks >>>>>> to get GitHub to display blame before a reformat. It is easier with the >>>>>> command line. I do a lot of code history digging and the global Java >>>>>> reformat is not really a problem. >>>>>> >>>> >>>>>> >>>> Kenn >>>>>> >>>> >>>>>> >>>> >>>>>> >>>>> >>>>>> >>>>> Also everything will be formatted in a same way, so eventually >>>>>> it would be easier to read. >>>>>> >>>>> >>>>>> >>>>> Moreover, as it was mentioned in previous emails - a lot of >>>>>> Jenkins failures won’t take place, so we save time and resources. >>>>>> >>>>> >>>>>> >>>>> >>>>>> >>>>> One of disadvantages is that our pipelines has custom syntax >>>>>> and after formatting they looks a little bit weird, but maybe extending >>>>>> the >>>>>> only configurable option in Black - lines, from 88 to 110 would be >>>>>> solution. >>>>>> >>>>> >>>>>> >>>>> Second one is that Black requires Python 3 to be run. I don’t >>>>>> know how big obstacle it would be. >>>>>> >>>>> >>>>>> >>>>> >>>>>> >>>>> I believe there are two options how it would be possible to >>>>>> introduce Black. First: just do it, it will hurt but then it would be ok >>>>>> (same as a dentist appointment). Of course it may require some work to >>>>>> adjust linters. On the other hand we can do it gradually and start >>>>>> including sdk parts one by one - maybe it will be less painful? >>>>>> >>>>> >>>>>> >>>>> >>>>>> >>>>> As an example I can share one of projects [2] I know that uses >>>>>> Black (they use also other cool checkers and pre-commit [3]). This is how >>>>>> looks their build with all checks [4]. >>>>>> >>>>> >>>>>> >>>>> >>>>>> >>>>> To sum up I believe that if we want improve our coding >>>>>> experience, we should improve our toolset. Black seems be recent and >>>>>> quite >>>>>> popular tool what makes think they won’t stop developing it. >>>>>> >>>>> >>>>>> >>>>> >>>>>> >>>>> [1] >>>>>> https://stackoverflow.com/questions/4112410/git-change-styling-whitespace-without-changing-ownership-blame >>>>>> >>>>> >>>>>> >>>>> >>>>>> >>>>> [2] https://github.com/GoogleCloudPlatform/oozie-to-airflow >>>>>> >>>>> >>>>>> >>>>> >>>>>> >>>>> [3] https://pre-commit.com >>>>>> >>>>> >>>>>> >>>>> >>>>>> >>>>> [4] >>>>>> https://travis-ci.org/GoogleCloudPlatform/oozie-to-airflow/builds/538725689 >>>>>> >>>>> >>>>>> >>>>> >>>>>> >>>>> >>>>>> >>>>> On Wed, May 29, 2019 at 2:01 PM Robert Bradshaw < >>>>>> rober...@google.com> wrote: >>>>>> >>>>>> >>>>>> >>>>>> Reformatting to 4 spaces seems a non-starter to me, as it >>>>>> would change nearly every single line in the codebase (and the loss of >>>>>> all >>>>>> context as well as that particular line). >>>>>> >>>>>> >>>>>> >>>>>> This is probably why the 2-space fork exists. However, we >>>>>> don't conform to that either--we use 2 spaces for indentation, but 4 for >>>>>> continuation indentation. (As for the history of this, this goes back to >>>>>> Google's internal style guide, probably motivated by consistency with >>>>>> C++, >>>>>> Java, ... and the fact that with an indent level of 4 one ends up >>>>>> wrapping >>>>>> lines quite frequently (it's telling that black's default line length is >>>>>> 88)). This turns out to be an easy change to the codebase. >>>>>> >>>>>> >>>>>> >>>>>> Once we move beyond the 2 vs. 4 whitespace thing, I found that >>>>>> this tool introduces a huge amount of vertical whitespace (e.g. closing >>>>>> parentheses on their own line), e.g. >>>>>> >>>>>> >>>>>> >>>>>> def foo( >>>>>> >>>>>> args >>>>>> >>>>>> ): >>>>>> >>>>>> if ( >>>>>> >>>>>> long expression) >>>>>> >>>>>> ): >>>>>> >>>>>> func( >>>>>> >>>>>> args >>>>>> >>>>>> ) >>>>>> >>>>>> >>>>>> >>>>>> I wrote a simple post-processor to put closing parentheses on >>>>>> the same lines, as well as omit the newline after "if (", and disabling >>>>>> formatting of strings, which reduce the churn in our codebase to 15k >>>>>> lines >>>>>> (adding about 4k) out of 200k total. >>>>>> >>>>>> >>>>>> >>>>>> https://github.com/apache/beam/pull/8712/files >>>>>> >>>>>> >>>>>> >>>>>> It's still very opinionated, often in different ways then me, >>>>>> and doesn't understand the semantics of the code, but possibly something >>>>>> we >>>>>> could live with given the huge advantages of an autoformatter. >>>>>> >>>>>> >>>>>> >>>>>> An intermediate point would be to allow, but not require, >>>>>> autoformatting of changed lines. >>>>>> >>>>>> >>>>>> >>>>>> As for being beta quality, it looks like it's got a decent >>>>>> number of contributors and in my book being in the python github project >>>>>> is >>>>>> a strong positive signal. But, due to the above issues, I think we'd have >>>>>> to maintain a fork. (The code is pretty lightweight, the 2 vs. 4 space >>>>>> issue is a 2-line change, and the rest implemented as a post-processing >>>>>> step (for now, incomplete), so it'd be easy to stay in sync with >>>>>> upstream.) >>>>>> >>>>>> >>>>>> >>>>>> On Wed, May 29, 2019 at 11:03 AM Ismaël Mejía < >>>>>> ieme...@gmail.com> wrote: >>>>>> >>>>>> > >>>>>> >>>>>> > > I think the question is if it can be configured in a way >>>>>> to fit our >>>>>> >>>>>> > > current linter's style. I don't think it is feasible to >>>>>> reformat the >>>>>> >>>>>> > > entire Python SDK. >>>>>> >>>>>> > >>>>>> >>>>>> > It cannot be configured to do what we actually do because >>>>>> Black is >>>>>> >>>>>> > configurable only to support the standard python codestyle >>>>>> guidelines >>>>>> >>>>>> > (PEP-8) which recommends 4 spaces and is what most projects >>>>>> in the >>>>>> >>>>>> > python world use. >>>>>> >>>>>> > >>>>>> >>>>>> > > Reformatted lines don't allow quick access to the Git >>>>>> history. This >>>>>> >>>>>> > > effect is still visible in the Java SDK. However, I have >>>>>> the feeling >>>>>> >>>>>> > > that this might be less of a problem with Python because >>>>>> the linter has >>>>>> >>>>>> > > more rules than Checkstyle had. >>>>>> >>>>>> > >>>>>> >>>>>> > Yes that’s the bad side effect but there are always >>>>>> tradeoffs we have >>>>>> >>>>>> > to deal with. >>>>>> >>>>>> > >>>>>> >>>>>> > >>>>>> >>>>>> > >>>>>> >>>>>> > >>>>>> >>>>>> > On Wed, May 29, 2019 at 10:52 AM Maximilian Michels < >>>>>> m...@apache.org> wrote: >>>>>> >>>>>> > > >>>>>> >>>>>> > > I think the question is if it can be configured in a way >>>>>> to fit our >>>>>> >>>>>> > > current linter's style. I don't think it is feasible to >>>>>> reformat the >>>>>> >>>>>> > > entire Python SDK. >>>>>> >>>>>> > > >>>>>> >>>>>> > > Reformatted lines don't allow quick access to the Git >>>>>> history. This >>>>>> >>>>>> > > effect is still visible in the Java SDK. However, I have >>>>>> the feeling >>>>>> >>>>>> > > that this might be less of a problem with Python because >>>>>> the linter has >>>>>> >>>>>> > > more rules than Checkstyle had. >>>>>> >>>>>> > > >>>>>> >>>>>> > > -Max >>>>>> >>>>>> > > >>>>>> >>>>>> > > On 29.05.19 10:16, Ismaël Mejía wrote: >>>>>> >>>>>> > > >> My concerns are: >>>>>> >>>>>> > > >> - The product is clearly marked as beta with a big >>>>>> warning. >>>>>> >>>>>> > > >> - It looks like mostly a single person project. For the >>>>>> same reason I also strongly prefer not using a fork for a specific >>>>>> setting. >>>>>> Fork will only have less people looking at it. >>>>>> >>>>>> > > > >>>>>> >>>>>> > > > I suppose the project is marked as beta because it is >>>>>> recent, it was >>>>>> >>>>>> > > > presented in 2018’s pycon, and because some things can >>>>>> change since >>>>>> >>>>>> > > > auto-formatters are pretty tricky beasts, I think beta >>>>>> in that case is >>>>>> >>>>>> > > > like our own ‘@Experimental’. If you look at the >>>>>> contribution page [1] >>>>>> >>>>>> > > > you can notice that it is less and less a single person >>>>>> project, there >>>>>> >>>>>> > > > have been 93 independent contributions since the project >>>>>> became >>>>>> >>>>>> > > > public, and the fact that it is hosted in the python >>>>>> organization >>>>>> >>>>>> > > > github [2] gives some confidence on the project >>>>>> continuity. >>>>>> >>>>>> > > > >>>>>> >>>>>> > > > You are right however about the fact that the main >>>>>> author seems to be >>>>>> >>>>>> > > > the ‘benevolent’ dictator, and in the 2-spaces issue he >>>>>> can seem >>>>>> >>>>>> > > > arbitrary, but he is just following pep8 style guide >>>>>> recommendations >>>>>> >>>>>> > > > [3]. I am curious of why we (Beam) do not follow the 4 >>>>>> spaces >>>>>> >>>>>> > > > recommendation of PEP-8 or even Google's own Python >>>>>> style guide [4], >>>>>> >>>>>> > > > So, probably it should be to us to reconsider the >>>>>> current policy to >>>>>> >>>>>> > > > adapt to the standards (and the tool). >>>>>> >>>>>> > > > >>>>>> >>>>>> > > > I did a quick run of black with python 2.7 compatibility >>>>>> on >>>>>> >>>>>> > > > sdks/python and got only 4 parsing errors which is >>>>>> positive given the >>>>>> >>>>>> > > > size of our code base. >>>>>> >>>>>> > > > >>>>>> >>>>>> > > > 415 files reformatted, 45 files left unchanged, 4 files >>>>>> failed to reformat. >>>>>> >>>>>> > > > >>>>>> >>>>>> > > > error: cannot format >>>>>> >>>>>> > > > >>>>>> /home/ismael/upstream/beam/sdks/python/apache_beam/runners/interactive/display/display_manager.py: >>>>>> >>>>>> > > > Cannot parse: 47:22: _display_progress = print >>>>>> >>>>>> > > > error: cannot format >>>>>> >>>>>> > > > >>>>>> /home/ismael/upstream/beam/sdks/python/apache_beam/runners/worker/log_handler.py: >>>>>> >>>>>> > > > Cannot parse: 151:18: file=sys.stderr) >>>>>> >>>>>> > > > error: cannot format >>>>>> >>>>>> > > > >>>>>> /home/ismael/upstream/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py: >>>>>> >>>>>> > > > Cannot parse: 160:34: print(traceback_string, >>>>>> file=sys.stderr) >>>>>> >>>>>> > > > error: cannot format >>>>>> >>>>>> > > > >>>>>> /home/ismael/upstream/beam/sdks/python/apache_beam/typehints/trivial_inference.py: >>>>>> >>>>>> > > > Cannot parse: 335:51: print('-->' if pc == last_pc >>>>>> else ' ', >>>>>> >>>>>> > > > end=' ') >>>>>> >>>>>> > > > >>>>>> >>>>>> > > > I still think this can be positive for the project but >>>>>> well I am >>>>>> >>>>>> > > > barely a contributor to the python code base so I let >>>>>> you the python >>>>>> >>>>>> > > > maintainers to reconsider this, in any case it seems >>>>>> like a good >>>>>> >>>>>> > > > improvement for the project. >>>>>> >>>>>> > > > >>>>>> >>>>>> > > > [1] https://github.com/python/black/graphs/contributors >>>>>> >>>>>> > > > [2] https://github.com/python >>>>>> >>>>>> > > > [3] >>>>>> https://www.python.org/dev/peps/pep-0008/#indentation >>>>>> >>>>>> > > > [4] >>>>>> https://github.com/google/styleguide/blob/gh-pages/pyguide.md#34-indentation >>>>>> >>>>>> > > > >>>>>> >>>>>> > > > On Tue, May 28, 2019 at 11:15 PM Ahmet Altay < >>>>>> al...@google.com> wrote: >>>>>> >>>>>> > > >> >>>>>> >>>>>> > > >> I am in the same boat with Robert, I am in favor of >>>>>> autoformatters but I am not familiar with this one. My concerns are: >>>>>> >>>>>> > > >> - The product is clearly marked as beta with a big >>>>>> warning. >>>>>> >>>>>> > > >> - It looks like mostly a single person project. For the >>>>>> same reason I also strongly prefer not using a fork for a specific >>>>>> setting. >>>>>> Fork will only have less people looking at it. >>>>>> >>>>>> > > >> >>>>>> >>>>>> > > >> IMO, this is in an early stage for us. That said lint >>>>>> issues are real as pointed in the thread. If someone would like to give >>>>>> it >>>>>> a try and see how it would look like for us that would be interesting. >>>>>> >>>>>> > > >> >>>>>> >>>>>> > > >> On Tue, May 28, 2019 at 4:44 AM Katarzyna Kucharczyk < >>>>>> ka.kucharc...@gmail.com> wrote: >>>>>> >>>>>> > > >>> >>>>>> >>>>>> > > >>> This sounds really good. A lot of Jenkins jobs >>>>>> failures are caused by lint problems. >>>>>> >>>>>> > > >>> I think it would be great to have something similar to >>>>>> Spotless in Java SDK (I heard there is problem with configuring Black >>>>>> with >>>>>> IntelliJ). >>>>>> >>>>>> > > >>> >>>>>> >>>>>> > > >>> On Mon, May 27, 2019 at 10:52 PM Robert Bradshaw < >>>>>> rober...@google.com> wrote: >>>>>> >>>>>> > > >>>> >>>>>> >>>>>> > > >>>> I'm generally in favor of autoformatters, though I >>>>>> haven't looked at >>>>>> >>>>>> > > >>>> how well this particular one works. We might have to >>>>>> go with >>>>>> >>>>>> > > >>>> https://github.com/desbma/black-2spaces given >>>>>> >>>>>> > > >>>> https://github.com/python/black/issues/378 . >>>>>> >>>>>> > > >>>> >>>>>> >>>>>> > > >>>> On Mon, May 27, 2019 at 10:43 PM Pablo Estrada < >>>>>> pabl...@google.com> wrote: >>>>>> >>>>>> > > >>>>> >>>>>> >>>>>> > > >>>>> This looks pretty good:) I know at least a couple >>>>>> people (myself included) who've been annoyed by having to take care of >>>>>> lint >>>>>> issues that maybe a code formatter could save us. >>>>>> >>>>>> > > >>>>> Thanks for sharing Ismael. >>>>>> >>>>>> > > >>>>> -P. >>>>>> >>>>>> > > >>>>> >>>>>> >>>>>> > > >>>>> >>>>>> >>>>>> > > >>>>> On Mon, May 27, 2019, 12:24 PM Ismaël Mejía < >>>>>> ieme...@gmail.com> wrote: >>>>>> >>>>>> > > >>>>>> >>>>>> >>>>>> > > >>>>>> I stumbled by chance into Black [1] a python code >>>>>> auto formatter that >>>>>> >>>>>> > > >>>>>> is becoming the 'de-facto' auto-formatter for >>>>>> python, and wanted to >>>>>> >>>>>> > > >>>>>> bring to the ML Is there interest from the python >>>>>> people to get this >>>>>> >>>>>> > > >>>>>> into the build? >>>>>> >>>>>> > > >>>>>> >>>>>> >>>>>> > > >>>>>> The introduction of spotless for Java has been a >>>>>> good improvement and >>>>>> >>>>>> > > >>>>>> maybe the python code base may benefit of this too. >>>>>> >>>>>> > > >>>>>> >>>>>> >>>>>> > > >>>>>> WDYT? >>>>>> >>>>>> > > >>>>>> >>>>>> >>>>>> > > >>>>>> [1] https://github.com/python/black >>>>>> >>>>>
smime.p7s
Description: S/MIME Cryptographic Signature