Hi, I just tried out the yapf formatter and I noticed that sometimes it's making the original code a lot less readable. In the below example, - is the original, + is after running the yapf formatter. Looks like the problem is with the method chaining pattern. How feasible is it to have yapf identify such a pattern and format it better? Before this can be fixed, Is it possible to have a directive in the code comment to bypass yapf?
Thanks! - test_stream = (TestStream() - .advance_watermark_to(0) - .add_elements(['a', 'b', 'c']) - .advance_processing_time(1) - .advance_processing_time(1) - .advance_processing_time(1) - .advance_processing_time(1) - .advance_processing_time(1) - .advance_watermark_to(5) - .add_elements(['1', '2', '3']) - .advance_processing_time(1) - .advance_watermark_to(6) - .advance_processing_time(1) - .advance_watermark_to(7) - .advance_processing_time(1) - .advance_watermark_to(8) - .advance_processing_time(1) - .advance_watermark_to(9) - .advance_processing_time(1) - .advance_watermark_to(10) - .advance_processing_time(1) - .advance_watermark_to(11) - .advance_processing_time(1) - .advance_watermark_to(12) - .advance_processing_time(1) - .advance_watermark_to(13) - .advance_processing_time(1) - .advance_watermark_to(14) - .advance_processing_time(1) - .advance_watermark_to(15) - .advance_processing_time(1) - ) + test_stream = ( + TestStream().advance_watermark_to(0).add_elements( + ['a', 'b', 'c']).advance_processing_time(1).advance_processing_time( + 1).advance_processing_time(1).advance_processing_time(1). + advance_processing_time(1).advance_watermark_to(5).add_elements( + ['1', '2', '3']).advance_processing_time(1).advance_watermark_to( + 6).advance_processing_time(1).advance_watermark_to( + 7).advance_processing_time(1).advance_watermark_to( + 8).advance_processing_time(1).advance_watermark_to(9). + advance_processing_time(1).advance_watermark_to( + 10).advance_processing_time(1).advance_watermark_to( + 11).advance_processing_time(1).advance_watermark_to( + 12).advance_processing_time(1).advance_watermark_to( + 13).advance_processing_time(1).advance_watermark_to( + 14).advance_processing_time(1).advance_watermark_to( + 15).advance_processing_time(1)) On Thu, Feb 6, 2020 at 1:50 PM Robert Bradshaw <[email protected]> wrote: > Thanks! > > On Thu, Feb 6, 2020 at 1:29 PM Ismaël Mejía <[email protected]> wrote: > >> Thanks Kamil and Michał for taking care of this. >> Excellent job! >> >> On Thu, Feb 6, 2020 at 1:45 PM Kamil Wasilewski < >> [email protected]> wrote: >> >>> Thanks to everyone involved in the discussion. >>> >>> I've taken a look at the first 50 recently updated Pull Requests. Only >>> few of them were affected. I hope it wouldn't be too hard to fix them. >>> >>> In any case, here you can find instructions on how to run formatter: >>> https://cwiki.apache.org/confluence/display/BEAM/Python+Tips (section >>> "Formatting"). >>> >>> On Thu, Feb 6, 2020 at 12:42 PM Michał Walenia < >>> [email protected]> wrote: >>> >>>> Hi, >>>> the PR is merged, all checks were green :) >>>> Enjoy prettier Python! >>>> >>>> On Thu, Feb 6, 2020 at 11:11 AM Ismaël Mejía <[email protected]> wrote: >>>> >>>>> Agree no need for vote for this because the consensus is clear and the >>>>> sole >>>>> impact I can think of are pending PRs that will be broken. In the Java >>>>> case >>>>> what we did was to just notice every PR that was affected by the >>>>> change. >>>>> And clearly document how to validate and autoformat the code. >>>>> >>>>> So the earlier the better, go go autoformat! >>>>> >>>>> On Thu, Feb 6, 2020 at 1:38 AM Robert Bradshaw <[email protected]> >>>>> wrote: >>>>> >>>>>> No, perhaps not. I agree there's consensus, just wondering what the >>>>>> next steps should be to get this in. (The presubmits look like they're >>>>>> all passing, with the exception of some breakage in java that should >>>>>> be completely unrelated. Of course there's already merge conflicts...) >>>>>> >>>>>> On Wed, Feb 5, 2020 at 3:55 PM Ahmet Altay <[email protected]> wrote: >>>>>> > >>>>>> > Do we need a formal vote? There is consensus on this thread and on >>>>>> the PR. >>>>>> > >>>>>> > On Wed, Feb 5, 2020 at 3:37 PM Robert Bradshaw <[email protected]> >>>>>> wrote: >>>>>> >> >>>>>> >> The PR is looking good. Should we call a vote? >>>>>> >> >>>>>> >> On Mon, Jan 27, 2020 at 11:03 AM Robert Bradshaw < >>>>>> [email protected]> wrote: >>>>>> >> > >>>>>> >> > Thanks. I commented on the PR. I think if we're going this route >>>>>> we >>>>>> >> > should add a pre-commit, plus instructions on how to run the tool >>>>>> >> > (similar to spotless). >>>>>> >> > >>>>>> >> > On Mon, Jan 27, 2020 at 10:00 AM Udi Meiri <[email protected]> >>>>>> wrote: >>>>>> >> > > >>>>>> >> > > I've done a pass on the PR on code I'm familiar with. >>>>>> >> > > Please make a pass and add your suggestions on the PR. >>>>>> >> > > >>>>>> >> > > On Fri, Jan 24, 2020 at 7:15 AM Ismaël Mejía < >>>>>> [email protected]> wrote: >>>>>> >> > >> >>>>>> >> > >> Java build fails on any unformatted code so python probably >>>>>> should be like that. >>>>>> >> > >> We have to ensure however that it fails early on that. >>>>>> >> > >> As Robert said time to debate the knobs :) >>>>>> >> > >> >>>>>> >> > >> On Fri, Jan 24, 2020 at 3:19 PM Kamil Wasilewski < >>>>>> [email protected]> wrote: >>>>>> >> > >>> >>>>>> >> > >>> PR is ready: https://github.com/apache/beam/pull/10684. >>>>>> Please share your comments ;-) I've managed to reduce the impact a bit: >>>>>> >> > >>> 501 files changed, 18245 insertions(+), 19495 deletions(-) >>>>>> >> > >>> >>>>>> >> > >>> We still need to consider how to enforce the usage of >>>>>> autoformatter. Pre-commit sounds like a nice addition, but it still needs >>>>>> to be installed manually by a developer. On the other hand, Jenkins >>>>>> precommit job that fails if any unformatted code is detected looks like >>>>>> too >>>>>> strict. What do you think? >>>>>> >> > >>> >>>>>> >> > >>> On Thu, Jan 23, 2020 at 8:37 PM Robert Bradshaw < >>>>>> [email protected]> wrote: >>>>>> >> > >>>> >>>>>> >> > >>>> Thanks! Now we get to debate what knobs to twiddle :-P >>>>>> >> > >>>> >>>>>> >> > >>>> FYI, I did a simple run (just pushed to >>>>>> >> > >>>> >>>>>> https://github.com/apache/beam/compare/master...robertwb:yapf) to see >>>>>> >> > >>>> the impact. The diff is >>>>>> >> > >>>> >>>>>> >> > >>>> $ git diff --stat master >>>>>> >> > >>>> ... >>>>>> >> > >>>> 547 files changed, 22118 insertions(+), 21129 >>>>>> deletions(-) >>>>>> >> > >>>> >>>>>> >> > >>>> For reference >>>>>> >> > >>>> >>>>>> >> > >>>> $ find sdks/python/apache_beam -name '*.py' | xargs wc >>>>>> >> > >>>> ... >>>>>> >> > >>>> 200424 612002 7431637 total >>>>>> >> > >>>> >>>>>> >> > >>>> which means a little over 10% of lines get touched. I think >>>>>> there are >>>>>> >> > >>>> some options, such as >>>>>> SPLIT_ALL_TOP_LEVEL_COMMA_SEPARATED_VALUES and >>>>>> >> > >>>> COALESCE_BRACKETS, that will conform more to the style we >>>>>> are already >>>>>> >> > >>>> (mostly) following. >>>>>> >> > >>>> >>>>>> >> > >>>> >>>>>> >> > >>>> On Thu, Jan 23, 2020 at 1:59 AM Kamil Wasilewski >>>>>> >> > >>>> <[email protected]> wrote: >>>>>> >> > >>>> > >>>>>> >> > >>>> > Thank you Michał for creating the ticket. I have some >>>>>> free time and I'd like to volunteer myself for this task. >>>>>> >> > >>>> > Indeed, it looks like there's consensus for `yapf`, so >>>>>> I'll try `yapf` first. >>>>>> >> > >>>> > >>>>>> >> > >>>> > Best, >>>>>> >> > >>>> > Kamil >>>>>> >> > >>>> > >>>>>> >> > >>>> > >>>>>> >> > >>>> > On Thu, Jan 23, 2020 at 10:37 AM Michał Walenia < >>>>>> [email protected]> wrote: >>>>>> >> > >>>> >> >>>>>> >> > >>>> >> Hi all, >>>>>> >> > >>>> >> I created a JIRA issue for this and summarized the >>>>>> available tools >>>>>> >> > >>>> >> >>>>>> >> > >>>> >> https://issues.apache.org/jira/browse/BEAM-9175 >>>>>> >> > >>>> >> >>>>>> >> > >>>> >> Cheers, >>>>>> >> > >>>> >> Michal >>>>>> >> > >>>> >> >>>>>> >> > >>>> >> On Thu, Jan 23, 2020 at 1:49 AM Udi Meiri < >>>>>> [email protected]> wrote: >>>>>> >> > >>>> >>> >>>>>> >> > >>>> >>> Sorry, backing off on this due to time constraints. >>>>>> >> > >>>> >>> >>>>>> >> > >>>> >>> On Wed, Jan 22, 2020 at 3:39 PM Udi Meiri < >>>>>> [email protected]> wrote: >>>>>> >> > >>>> >>>> >>>>>> >> > >>>> >>>> It sounds like there's a consensus for yapf. I >>>>>> volunteer to take this on >>>>>> >> > >>>> >>>> >>>>>> >> > >>>> >>>> On Wed, Jan 22, 2020, 10:31 Udi Meiri < >>>>>> [email protected]> wrote: >>>>>> >> > >>>> >>>>> >>>>>> >> > >>>> >>>>> +1 to autoformatting >>>>>> >> > >>>> >>>>> >>>>>> >> > >>>> >>>>> On Wed, Jan 22, 2020 at 9:57 AM Luke Cwik < >>>>>> [email protected]> wrote: >>>>>> >> > >>>> >>>>>> >>>>>> >> > >>>> >>>>>> +1 to autoformatters. Also the Beam Java SDK went >>>>>> through a one time pass to apply the spotless formatting. >>>>>> >> > >>>> >>>>>> >>>>>> >> > >>>> >>>>>> On Tue, Jan 21, 2020 at 9:52 PM Ahmet Altay < >>>>>> [email protected]> wrote: >>>>>> >> > >>>> >>>>>>> >>>>>> >> > >>>> >>>>>>> +1 to autoformatters and yapf. It appears to be a >>>>>> well maintained project. I do support making a one time pass to apply >>>>>> formatting the whole code base. >>>>>> >> > >>>> >>>>>>> >>>>>> >> > >>>> >>>>>>> On Tue, Jan 21, 2020 at 5:38 PM Chad Dombrova < >>>>>> [email protected]> wrote: >>>>>> >> > >>>> >>>>>>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>>> >> > >>>> >>>>>>>>> It'd be good if there was a way to only apply to >>>>>> violating (or at >>>>>> >> > >>>> >>>>>>>>> least changed) lines. >>>>>> >> > >>>> >>>>>>>> >>>>>> >> > >>>> >>>>>>>> >>>>>> >> > >>>> >>>>>>>> I assumed the first thing we’d do is convert all >>>>>> of the code in one go, since it’s a very safe operation. Did you have >>>>>> something else in mind? >>>>>> >> > >>>> >>>>>>>> >>>>>> >> > >>>> >>>>>>>> -chad >>>>>> >> > >>>> >>>>>>>> >>>>>> >> > >>>> >>>>>>>> >>>>>> >> > >>>> >>>>>>>> >>>>>> >> > >>>> >>>>>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>>> >> > >>>> >>>>>>>>> On Tue, Jan 21, 2020 at 1:56 PM Chad Dombrova < >>>>>> [email protected]> wrote: >>>>>> >> > >>>> >>>>>>>>> > >>>>>> >> > >>>> >>>>>>>>> > +1 to autoformatting >>>>>> >> > >>>> >>>>>>>>> > >>>>>> >> > >>>> >>>>>>>>> > Let me add some nuance to that. >>>>>> >> > >>>> >>>>>>>>> > >>>>>> >> > >>>> >>>>>>>>> > The way I see it there are 2 varieties of >>>>>> formatters: those which take the original formatting into consideration >>>>>> (autopep8) and those which disregard it (yapf, black). >>>>>> >> > >>>> >>>>>>>>> > >>>>>> >> > >>>> >>>>>>>>> > I much prefer yapf to black, because you have >>>>>> plenty of options to tweak with yapf (enough to make the output a pretty >>>>>> close match to the current Beam style), and you can mark areas to >>>>>> preserve >>>>>> the original formatting, which could be very useful with Pipeline >>>>>> building >>>>>> with pipe operators. Please don't pick black. >>>>>> >> > >>>> >>>>>>>>> > >>>>>> >> > >>>> >>>>>>>>> > autopep8 is more along the lines of spotless in >>>>>> Java -- it only corrects code that breaks the project's style rules. The >>>>>> big problem with Beam's current style is that it is so esoteric that >>>>>> autopep8 can't enforce it -- and I'm not just talking about 2-spaces, >>>>>> which >>>>>> I don't really have a problem with -- the problem is the use of either 2 >>>>>> or >>>>>> 4 spaces depending on context (expression start vs hanging indent, etc). >>>>>> This is my *biggest* gripe about the current style. PyCharm doesn't have >>>>>> enough control either. So, if we can choose a style that can be >>>>>> expressed >>>>>> by flake8 or pycodestyle then we can use autopep8 to enforce it. >>>>>> >> > >>>> >>>>>>>>> > >>>>>> >> > >>>> >>>>>>>>> > I'd prefer autopep8 to yapf because I like >>>>>> having a little wiggle room to influence the style, but on a big project >>>>>> like Beam all that wiggle room ends up to minor but noticeable >>>>>> inconsistencies in style throughout the project. yapf ensures completely >>>>>> consistent style, but the tradeoff is that it's sometimes ugly, >>>>>> especially >>>>>> in scenarios with similar repeated entries like argparse, where yapf >>>>>> might >>>>>> insert line breaks in visually inconsistent and unappealing ways >>>>>> depending >>>>>> on the lengths of the keywords and expressions involved. >>>>>> >> > >>>> >>>>>>>>> > >>>>>> >> > >>>> >>>>>>>>> > Either way (but especially if we choose yapf) I >>>>>> think it'd be a nice addition to setup a pre-commit [1] config so that >>>>>> people can opt in to running *lightweight* autofixers prior to commit. >>>>>> This will not only reduce dev frustration but will also reduce the amount >>>>>> of cpu cycles that Jenkins spends pointing out lint errors. >>>>>> >> > >>>> >>>>>>>>> > >>>>>> >> > >>>> >>>>>>>>> > [1] https://pre-commit.com/ >>>>>> >> > >>>> >>>>>>>>> > >>>>>> >> > >>>> >>>>>>>>> > -chad >>>>>> >> > >>>> >>>>>>>>> > >>>>>> >> > >>>> >>>>>>>>> > >>>>>> >> > >>>> >>>>>>>>> > >>>>>> >> > >>>> >>>>>>>>> > >>>>>> >> > >>>> >>>>>>>>> > On Tue, Jan 21, 2020 at 12:52 PM Ismaël Mejía < >>>>>> [email protected]> wrote: >>>>>> >> > >>>> >>>>>>>>> >> >>>>>> >> > >>>> >>>>>>>>> >> Last time we discussed this there seems not to >>>>>> be much progress into autoformatting. >>>>>> >> > >>>> >>>>>>>>> >> This tool looks more tweakable, so maybe it >>>>>> could be more appropriate for Beam's use case. >>>>>> >> > >>>> >>>>>>>>> >> https://github.com/google/yapf/ >>>>>> >> > >>>> >>>>>>>>> >> WDYT? >>>>>> >> > >>>> >>>>>>>>> >> >>>>>> >> > >>>> >>>>>>>>> >> >>>>>> >> > >>>> >>>>>>>>> >> On Thu, May 30, 2019 at 10:50 AM Łukasz Gajowy >>>>>> <[email protected]> wrote: >>>>>> >> > >>>> >>>>>>>>> >>> >>>>>> >> > >>>> >>>>>>>>> >>> +1 for any autoformatter for Python SDK that >>>>>> does the job. My experience is that since spotless in Java SDK I would >>>>>> never start a new Java project without it. So many great benefits not >>>>>> only >>>>>> for one person coding but for all community. >>>>>> >> > >>>> >>>>>>>>> >>> >>>>>> >> > >>>> >>>>>>>>> >>> It is a GitHub UI issue that you cannot >>>>>> easily browse past the reformat. It is not actually that hard, but does >>>>>> take a couple extra clicks to get GitHub to display blame before a >>>>>> reformat. It is easier with the command line. I do a lot of code history >>>>>> digging and the global Java reformat is not really a problem. >>>>>> >> > >>>> >>>>>>>>> >>> >>>>>> >> > >>>> >>>>>>>>> >>> It's actually one more click on Github but I >>>>>> agree it's not the best way to search the history. The most convenient >>>>>> and >>>>>> clear one I've found so far is in Jetbrains IDEs (Intelij) where you can: >>>>>> >> > >>>> >>>>>>>>> >>> >>>>>> >> > >>>> >>>>>>>>> >>> right click on line number -> "annotate" -> >>>>>> click again -> "annotate previous revision" -> ... >>>>>> >> > >>>> >>>>>>>>> >>> >>>>>> >> > >>>> >>>>>>>>> >>> You can also use "compare with" to see the >>>>>> diff between two revisions. >>>>>> >> > >>>> >>>>>>>>> >>> >>>>>> >> > >>>> >>>>>>>>> >>> Łukasz >>>>>> >> > >>>> >>>>>>>>> >>> >>>>>> >> > >>>> >>>>>>>>> >>> >>>>>> >> > >>>> >>>>>>>>> >>> >>>>>> >> > >>>> >>>>>>>>> >>> >>>>>> >> > >>>> >>>>>>>>> >>> >>>>>> >> > >>>> >>>>>>>>> >>> czw., 30 maj 2019 o 06:15 Kenneth Knowles < >>>>>> [email protected]> napisał(a): >>>>>> >> > >>>> >>>>>>>>> >>>> >>>>>> >> > >>>> >>>>>>>>> >>>> +1 pending good enough tooling (I can't >>>>>> quite tell - seems there are some issues?) >>>>>> >> > >>>> >>>>>>>>> >>>> >>>>>> >> > >>>> >>>>>>>>> >>>> On Wed, May 29, 2019 at 2:40 PM Katarzyna >>>>>> Kucharczyk <[email protected]> wrote: >>>>>> >> > >>>> >>>>>>>>> >>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>> What else actually we gain? My guess is >>>>>> faster PR review iteration. We will skip some of conversations about code >>>>>> style. >>>>>> >> > >>>> >>>>>>>>> >>>> >>>>>> >> > >>>> >>>>>>>>> >>>> ... >>>>>> >> > >>>> >>>>>>>>> >>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>> Last but not least, new contributor may be >>>>>> less discouraged. When I started contribute I didn’t know how to format >>>>>> my >>>>>> code and I lost a lot of time to add pylint and adjust IntelliJ. I >>>>>> eventually failed. Currently I write code intuitively and when I don’t >>>>>> forget I rerun tox. >>>>>> >> > >>>> >>>>>>>>> >>>> >>>>>> >> > >>>> >>>>>>>>> >>>> >>>>>> >> > >>>> >>>>>>>>> >>>> This is a huge benefit. This is why I >>>>>> supported it so much for Java. It is a community benefit. You do not have >>>>>> to be a contributor to the Python SDK to support this. That is why I am >>>>>> writing here. Just eliminate all discussion of formatting. It doesn't >>>>>> really matter what the resulting format is, if it is not crazy to read. I >>>>>> strongly oppose maintaining a non-default format. >>>>>> >> > >>>> >>>>>>>>> >>>> >>>>>> >> > >>>> >>>>>>>>> >>>> Reformating 20k lines or 200k is not hard. >>>>>> The Java global reformat touched 50k lines. It does not really matter how >>>>>> big it is. Definitely do it all at once if you think the tool is good >>>>>> enough. And you should pin a version, so churn is not a problem. You can >>>>>> upgrade the version and reformat in a PR later and that is also easy. >>>>>> >> > >>>> >>>>>>>>> >>>> >>>>>> >> > >>>> >>>>>>>>> >>>> It is a GitHub UI issue that you cannot >>>>>> easily browse past the reformat. It is not actually that hard, but does >>>>>> take a couple extra clicks to get GitHub to display blame before a >>>>>> reformat. It is easier with the command line. I do a lot of code history >>>>>> digging and the global Java reformat is not really a problem. >>>>>> >> > >>>> >>>>>>>>> >>>> >>>>>> >> > >>>> >>>>>>>>> >>>> Kenn >>>>>> >> > >>>> >>>>>>>>> >>>> >>>>>> >> > >>>> >>>>>>>>> >>>> >>>>>> >> > >>>> >>>>>>>>> >>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>> Also everything will be formatted in a same >>>>>> way, so eventually it would be easier to read. >>>>>> >> > >>>> >>>>>>>>> >>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>> Moreover, as it was mentioned in previous >>>>>> emails - a lot of Jenkins failures won’t take place, so we save time and >>>>>> resources. >>>>>> >> > >>>> >>>>>>>>> >>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>> One of disadvantages is that our pipelines >>>>>> has custom syntax and after formatting they looks a little bit weird, but >>>>>> maybe extending the only configurable option in Black - lines, from 88 to >>>>>> 110 would be solution. >>>>>> >> > >>>> >>>>>>>>> >>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>> Second one is that Black requires Python 3 >>>>>> to be run. I don’t know how big obstacle it would be. >>>>>> >> > >>>> >>>>>>>>> >>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>> I believe there are two options how it >>>>>> would be possible to introduce Black. First: just do it, it will hurt but >>>>>> then it would be ok (same as a dentist appointment). Of course it may >>>>>> require some work to adjust linters. On the other hand we can do it >>>>>> gradually and start including sdk parts one by one - maybe it will be >>>>>> less >>>>>> painful? >>>>>> >> > >>>> >>>>>>>>> >>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>> As an example I can share one of projects >>>>>> [2] I know that uses Black (they use also other cool checkers and >>>>>> pre-commit [3]). This is how looks their build with all checks [4]. >>>>>> >> > >>>> >>>>>>>>> >>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>> To sum up I believe that if we want improve >>>>>> our coding experience, we should improve our toolset. Black seems be >>>>>> recent >>>>>> and quite popular tool what makes think they won’t stop developing it. >>>>>> >> > >>>> >>>>>>>>> >>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>> [1] >>>>>> https://stackoverflow.com/questions/4112410/git-change-styling-whitespace-without-changing-ownership-blame >>>>>> >> > >>>> >>>>>>>>> >>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>> [2] >>>>>> https://github.com/GoogleCloudPlatform/oozie-to-airflow >>>>>> >> > >>>> >>>>>>>>> >>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>> [3] https://pre-commit.com >>>>>> >> > >>>> >>>>>>>>> >>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>> [4] >>>>>> https://travis-ci.org/GoogleCloudPlatform/oozie-to-airflow/builds/538725689 >>>>>> >> > >>>> >>>>>>>>> >>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>> On Wed, May 29, 2019 at 2:01 PM Robert >>>>>> Bradshaw <[email protected]> wrote: >>>>>> >> > >>>> >>>>>>>>> >>>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>>> Reformatting to 4 spaces seems a >>>>>> non-starter to me, as it would change nearly every single line in the >>>>>> codebase (and the loss of all context as well as that particular line). >>>>>> >> > >>>> >>>>>>>>> >>>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>>> This is probably why the 2-space fork >>>>>> exists. However, we don't conform to that either--we use 2 spaces for >>>>>> indentation, but 4 for continuation indentation. (As for the history of >>>>>> this, this goes back to Google's internal style guide, probably motivated >>>>>> by consistency with C++, Java, ... and the fact that with an indent level >>>>>> of 4 one ends up wrapping lines quite frequently (it's telling that >>>>>> black's >>>>>> default line length is 88)). This turns out to be an easy change to the >>>>>> codebase. >>>>>> >> > >>>> >>>>>>>>> >>>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>>> Once we move beyond the 2 vs. 4 whitespace >>>>>> thing, I found that this tool introduces a huge amount of vertical >>>>>> whitespace (e.g. closing parentheses on their own line), e.g. >>>>>> >> > >>>> >>>>>>>>> >>>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>>> def foo( >>>>>> >> > >>>> >>>>>>>>> >>>>>> args >>>>>> >> > >>>> >>>>>>>>> >>>>>> ): >>>>>> >> > >>>> >>>>>>>>> >>>>>> if ( >>>>>> >> > >>>> >>>>>>>>> >>>>>> long expression) >>>>>> >> > >>>> >>>>>>>>> >>>>>> ): >>>>>> >> > >>>> >>>>>>>>> >>>>>> func( >>>>>> >> > >>>> >>>>>>>>> >>>>>> args >>>>>> >> > >>>> >>>>>>>>> >>>>>> ) >>>>>> >> > >>>> >>>>>>>>> >>>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>>> I wrote a simple post-processor to put >>>>>> closing parentheses on the same lines, as well as omit the newline after >>>>>> "if (", and disabling formatting of strings, which reduce the churn in >>>>>> our >>>>>> codebase to 15k lines (adding about 4k) out of 200k total. >>>>>> >> > >>>> >>>>>>>>> >>>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>>> >>>>>> https://github.com/apache/beam/pull/8712/files >>>>>> >> > >>>> >>>>>>>>> >>>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>>> It's still very opinionated, often in >>>>>> different ways then me, and doesn't understand the semantics of the code, >>>>>> but possibly something we could live with given the huge advantages of an >>>>>> autoformatter. >>>>>> >> > >>>> >>>>>>>>> >>>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>>> An intermediate point would be to allow, >>>>>> but not require, autoformatting of changed lines. >>>>>> >> > >>>> >>>>>>>>> >>>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>>> As for being beta quality, it looks like >>>>>> it's got a decent number of contributors and in my book being in the >>>>>> python >>>>>> github project is a strong positive signal. But, due to the above >>>>>> issues, I >>>>>> think we'd have to maintain a fork. (The code is pretty lightweight, the >>>>>> 2 >>>>>> vs. 4 space issue is a 2-line change, and the rest implemented as a >>>>>> post-processing step (for now, incomplete), so it'd be easy to stay in >>>>>> sync >>>>>> with upstream.) >>>>>> >> > >>>> >>>>>>>>> >>>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>>> On Wed, May 29, 2019 at 11:03 AM Ismaël >>>>>> Mejía <[email protected]> wrote: >>>>>> >> > >>>> >>>>>>>>> >>>>>> > >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > I think the question is if it can be >>>>>> configured in a way to fit our >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > current linter's style. I don't think >>>>>> it is feasible to reformat the >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > entire Python SDK. >>>>>> >> > >>>> >>>>>>>>> >>>>>> > >>>>>> >> > >>>> >>>>>>>>> >>>>>> > It cannot be configured to do what we >>>>>> actually do because Black is >>>>>> >> > >>>> >>>>>>>>> >>>>>> > configurable only to support the >>>>>> standard python codestyle guidelines >>>>>> >> > >>>> >>>>>>>>> >>>>>> > (PEP-8) which recommends 4 spaces and is >>>>>> what most projects in the >>>>>> >> > >>>> >>>>>>>>> >>>>>> > python world use. >>>>>> >> > >>>> >>>>>>>>> >>>>>> > >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > Reformatted lines don't allow quick >>>>>> access to the Git history. This >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > effect is still visible in the Java >>>>>> SDK. However, I have the feeling >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > that this might be less of a problem >>>>>> with Python because the linter has >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > more rules than Checkstyle had. >>>>>> >> > >>>> >>>>>>>>> >>>>>> > >>>>>> >> > >>>> >>>>>>>>> >>>>>> > Yes that’s the bad side effect but there >>>>>> are always tradeoffs we have >>>>>> >> > >>>> >>>>>>>>> >>>>>> > to deal with. >>>>>> >> > >>>> >>>>>>>>> >>>>>> > >>>>>> >> > >>>> >>>>>>>>> >>>>>> > >>>>>> >> > >>>> >>>>>>>>> >>>>>> > >>>>>> >> > >>>> >>>>>>>>> >>>>>> > >>>>>> >> > >>>> >>>>>>>>> >>>>>> > On Wed, May 29, 2019 at 10:52 AM >>>>>> Maximilian Michels <[email protected]> wrote: >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > I think the question is if it can be >>>>>> configured in a way to fit our >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > current linter's style. I don't think >>>>>> it is feasible to reformat the >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > entire Python SDK. >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > Reformatted lines don't allow quick >>>>>> access to the Git history. This >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > effect is still visible in the Java >>>>>> SDK. However, I have the feeling >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > that this might be less of a problem >>>>>> with Python because the linter has >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > more rules than Checkstyle had. >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > -Max >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > On 29.05.19 10:16, Ismaël Mejía wrote: >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >> My concerns are: >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >> - The product is clearly marked as >>>>>> beta with a big warning. >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >> - It looks like mostly a single >>>>>> person project. For the same reason I also strongly prefer not using a >>>>>> fork >>>>>> for a specific setting. Fork will only have less people looking at it. >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > I suppose the project is marked as >>>>>> beta because it is recent, it was >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > presented in 2018’s pycon, and >>>>>> because some things can change since >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > auto-formatters are pretty tricky >>>>>> beasts, I think beta in that case is >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > like our own ‘@Experimental’. If you >>>>>> look at the contribution page [1] >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > you can notice that it is less and >>>>>> less a single person project, there >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > have been 93 independent >>>>>> contributions since the project became >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > public, and the fact that it is >>>>>> hosted in the python organization >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > github [2] gives some confidence on >>>>>> the project continuity. >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > You are right however about the fact >>>>>> that the main author seems to be >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > the ‘benevolent’ dictator, and in >>>>>> the 2-spaces issue he can seem >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > arbitrary, but he is just following >>>>>> pep8 style guide recommendations >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > [3]. I am curious of why we (Beam) >>>>>> do not follow the 4 spaces >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > recommendation of PEP-8 or even >>>>>> Google's own Python style guide [4], >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > So, probably it should be to us to >>>>>> reconsider the current policy to >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > adapt to the standards (and the >>>>>> tool). >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > I did a quick run of black with >>>>>> python 2.7 compatibility on >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > sdks/python and got only 4 parsing >>>>>> errors which is positive given the >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > size of our code base. >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > 415 files reformatted, 45 files left >>>>>> unchanged, 4 files failed to reformat. >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > error: cannot format >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > >>>>>> /home/ismael/upstream/beam/sdks/python/apache_beam/runners/interactive/display/display_manager.py: >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > Cannot parse: 47:22: >>>>>> _display_progress = print >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > error: cannot format >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > >>>>>> /home/ismael/upstream/beam/sdks/python/apache_beam/runners/worker/log_handler.py: >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > Cannot parse: 151:18: >>>>>> file=sys.stderr) >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > error: cannot format >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > >>>>>> /home/ismael/upstream/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py: >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > Cannot parse: 160:34: >>>>>> print(traceback_string, file=sys.stderr) >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > error: cannot format >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > >>>>>> /home/ismael/upstream/beam/sdks/python/apache_beam/typehints/trivial_inference.py: >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > Cannot parse: 335:51: >>>>>> print('-->' if pc == last_pc else ' ', >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > end=' ') >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > I still think this can be positive >>>>>> for the project but well I am >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > barely a contributor to the python >>>>>> code base so I let you the python >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > maintainers to reconsider this, in >>>>>> any case it seems like a good >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > improvement for the project. >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > [1] >>>>>> https://github.com/python/black/graphs/contributors >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > [2] https://github.com/python >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > [3] >>>>>> https://www.python.org/dev/peps/pep-0008/#indentation >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > [4] >>>>>> https://github.com/google/styleguide/blob/gh-pages/pyguide.md#34-indentation >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > > On Tue, May 28, 2019 at 11:15 PM >>>>>> Ahmet Altay <[email protected]> wrote: >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >> >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >> I am in the same boat with Robert, >>>>>> I am in favor of autoformatters but I am not familiar with this one. My >>>>>> concerns are: >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >> - The product is clearly marked as >>>>>> beta with a big warning. >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >> - It looks like mostly a single >>>>>> person project. For the same reason I also strongly prefer not using a >>>>>> fork >>>>>> for a specific setting. Fork will only have less people looking at it. >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >> >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >> IMO, this is in an early stage for >>>>>> us. That said lint issues are real as pointed in the thread. If someone >>>>>> would like to give it a try and see how it would look like for us that >>>>>> would be interesting. >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >> >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >> On Tue, May 28, 2019 at 4:44 AM >>>>>> Katarzyna Kucharczyk <[email protected]> wrote: >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >>> >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >>> This sounds really good. A lot of >>>>>> Jenkins jobs failures are caused by lint problems. >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >>> I think it would be great to have >>>>>> something similar to Spotless in Java SDK (I heard there is problem with >>>>>> configuring Black with IntelliJ). >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >>> >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >>> On Mon, May 27, 2019 at 10:52 PM >>>>>> Robert Bradshaw <[email protected]> wrote: >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >>>> >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >>>> I'm generally in favor of >>>>>> autoformatters, though I haven't looked at >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >>>> how well this particular one >>>>>> works. We might have to go with >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >>>> >>>>>> https://github.com/desbma/black-2spaces given >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >>>> >>>>>> https://github.com/python/black/issues/378 . >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >>>> >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >>>> On Mon, May 27, 2019 at 10:43 PM >>>>>> Pablo Estrada <[email protected]> wrote: >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >>>>> This looks pretty good:) I know >>>>>> at least a couple people (myself included) who've been annoyed by having >>>>>> to >>>>>> take care of lint issues that maybe a code formatter could save us. >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >>>>> Thanks for sharing Ismael. >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >>>>> -P. >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >>>>> On Mon, May 27, 2019, 12:24 PM >>>>>> Ismaël Mejía <[email protected]> wrote: >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >>>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >>>>>> I stumbled by chance into Black >>>>>> [1] a python code auto formatter that >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >>>>>> is becoming the 'de-facto' >>>>>> auto-formatter for python, and wanted to >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >>>>>> bring to the ML Is there >>>>>> interest from the python people to get this >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >>>>>> into the build? >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >>>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >>>>>> The introduction of spotless >>>>>> for Java has been a good improvement and >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >>>>>> maybe the python code base may >>>>>> benefit of this too. >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >>>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >>>>>> WDYT? >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >>>>>> >>>>>> >> > >>>> >>>>>>>>> >>>>>> > > >>>>>> [1] >>>>>> https://github.com/python/black >>>>>> >> > >>>> >> >>>>>> >> > >>>> >> >>>>>> >> > >>>> >> >>>>>> >> > >>>> >> -- >>>>>> >> > >>>> >> >>>>>> >> > >>>> >> Michał Walenia >>>>>> >> > >>>> >> Polidea | Software Engineer >>>>>> >> > >>>> >> >>>>>> >> > >>>> >> M: +48 791 432 002 <+48%20791%20432%20002> >>>>>> >> > >>>> >> E: [email protected] >>>>>> >> > >>>> >> >>>>>> >> > >>>> >> Unique Tech >>>>>> >> > >>>> >> Check out our projects! >>>>>> >>>>> >>>> >>>> -- >>>> >>>> Michał Walenia >>>> Polidea <https://www.polidea.com/> | Software Engineer >>>> >>>> M: +48 791 432 002 <+48791432002> >>>> E: [email protected] >>>> >>>> Unique Tech >>>> Check out our projects! <https://www.polidea.com/our-work> >>>> >>>
