Brandon, I don’t know the exact policy. It’s certainly preferable that contributors include headers in their contributions. But, as Matt points out, that raises the bar for contributions, so I think it’s OK to defer adding them until release time.
(IANAL, but here’s my rationale: A contribution has the same legal status with and without headers. Code sitting in GitHub, even in the main branch, is not ‘published’ in a legal sense until the release happens. But when the code is released, the headers need to be present to remind people of their obligations under the license.)

Julian

> On Jun 13, 2021, at 5:19 AM, Brandon Jackson <[email protected]> wrote:
>
> When is it important for these legal headers to be present?
> Example:
> Does every commit have to have them?
> Is this a clean-up before release exercise where right before a release a script could modify every contribution and add the required headers to the textual files?
>
> Understanding that small point could yield a small utility on the client side, like "asf-bless.sh" which could parse the tree of files and add headers where any are missing before contributors commit and generate a pull request.
>
> Brandon
>
> On Sun, Jun 13, 2021 at 4:12 AM Matt Casters <[email protected]> wrote:
>
>> You are both right of-course. I'm not sure why I didn't see it before but we can simply add an option to add a header of choice defined in a file somewhere to whichever file format we generate now (XML) or in the future. Sure, it would not be visible to users of the GUI but it would solve the conundrum.
>>
>> On Sun, Jun 13, 2021 at 07:28, Julian Hyde <[email protected]> wrote:
>>
>>> A couple more:
>>>
>>> 5. Make the release manager responsible for adding headers to files that are missing them.
>>> 6. Use a tool such as autostyle [1] that detects problems and fixes them.
>>>
>>> Julian
>>>
>>> [1] https://github.com/autostyle/autostyle
>>>
>>>> On Jun 12, 2021, at 7:34 PM, Hans Van Akelyen <[email protected]> wrote:
>>>>
>>>> The annoying part is that it needs to have the header when added to the repository but outside of the repository it doesn't.
>>>> We currently have around 300 pipelines and 100 workflows in the repository and we are advocating how easy it is to contribute these things to hop. Now we would have to say, well it is easy but you need to add a header... and guess what... every time you change something you will need to add it again because using the save button will overwrite your content.
>>>>
>>>> There are a couple of ways to solve this:
>>>> 1) automate it with a github actions/Jenkins
>>>> 2) manually add the header
>>>> 3) add a toggle to the gui/code that needs to be activated when you are creating pipelines for the repository
>>>> 4) move to a binary format
>>>>
>>>> 1. Is not allowed/possible afaik, Jenkins definitely does not have write access to the code base, Github might have permission to write to a pr. But then the question arises if it is even allowed to add a header to a file without the user confirming this.
>>>>
>>>> 2. This adds another boundary for non-developers/regular users to contribute samples and integration tests, they don't care about the content of a hpl/hwf in their eyes this is a "binary-file" that needs no editing, and surely not every time you change a minor thing.
>>>> We are really trying hard to convince people to contribute small things like a single sample or a single test, but noticed that even the usage of github and how to create a PR can be a "hard" process that requires hand-holding for our user base that consists mainly of non-developers. This would raise the bar a bit higher making it harder for those willing to jump.
>>>>
>>>> 3. This might work for the core developers/contributors but will probably be forgotten by the friendly user that wants to contribute once, meaning we would have to point to them to add the header or do it ourselves.
>>>>
>>>> 4. No need for headers here we could even keep the current xml structure but zip the content of the hpl/hwf
>>>>
>>>> So to summarize, in the short term getting a release out shouldn't be hard. One of us can add the header to all the files and be done with it. But in the long run this process is not sustainable.
>>>>
>>>> Cheers,
>>>> Hans
>>>>
>>>> On Sun, 13 Jun 2021 at 00:52, Julian Hyde <[email protected]> wrote:
>>>>
>>>>> I still don’t see why the discussion about Apache release policy needs to be connected with discussion about file formats. It’s simpler to resolve the issue about release policy first, make the release, and come back and discuss file format later.
>>>>>
>>>>> Regarding release policy. When a user contributes a test case to Hop, that is a creative work according to copyright law. Like any contribution, we don’t “claim copyright”; they retain copyright, but contribute under Apache license. And we require that text files have a header.
>>>>>
>>>>> No one is proposing adding headers to pipeline and workflow files that are not contributed to Hop.
>>>>>
>>>>> I find it hard to believe that adding a header to a test case will make it behave differently, in the vast majority of cases. Exceptions can be made for the few case where it matters.
>>>>>
>>>>> Julian
>>>>>
>>>>>> On Jun 12, 2021, at 3:25 PM, Matt Casters <[email protected]> wrote:
>>>>>>
>>>>>> That's really my point: it's really not as straightforward at all like you claimed Julian. The files are produced by the Hop GUI and that's what we want. We want to test what is actually used by our end-users, not some theoretical use-case which is typically handled by JUnit/Mockito/Powermock and their ilk.
>>>>>> It's this old-school vision that an XML file has to be written by hand or something like that which messes up this debate.
>>>>>> The .hpl/.hwf file format does not and should not include the ASF header either. For our users it would be inappropriate as we can't claim copyright on works produced by others. In other words, when some person or company uses our software and creates a pipeline, we can't just claim copyright for that file. At least that's how I see things.
>>>>>>
>>>>>> As for YAML: my dislike for it is enormous but since it wouldn't solve the header issue I wouldn't pick it for that reason alone since it allows comments. Perhaps we should serialize in some binary format to get past this issue. Since we'll need to continue XML serialization anyway it's just a question of storing the integration tests and samples in a way that can be approved by the ASF.
>>>>>>
>>>>>> On Sat, Jun 12, 2021 at 10:54 PM Julian Hyde <[email protected]> wrote:
>>>>>>
>>>>>>> I don’t think the discussion about headers really forces this issue. It’s a technical decision and shouldn’t be rushed.
>>>>>>>
>>>>>>> Regarding the headers. It is straightforward to add headers to existing files. It is also straightforward to use a tool such as checkstyle to enforce them (so, any PR that adds a .hpl file without a header will get a build error, which the contributor will duly fix).
>>>>>>>
>>>>>>> In my opinion, Hop should allow multiple formats. XML is rather old, and people find it difficult to read without practice. JSON is a bit more modern, but has terrible support for multi-line strings and (in its official form) doesn’t allow comments and is strict about quoting of identifiers.
>>>>>>> YAML (or similar) is worth considering; its model is compatible with JSON, it allows comments, it has much better support for multi-line strings, and it tends to diff/merge easier than XML and JSON.
>>>>>>>
>>>>>>> Julian
>>>>>>>
>>>>>>>> On Jun 12, 2021, at 1:38 PM, Matt Casters <[email protected]> wrote:
>>>>>>>>
>>>>>>>> Folks,
>>>>>>>>
>>>>>>>> It's been up in the air for quite some time now but it looks like we're being forced by certain discussions in the release voting of 0.99-rc1. How would you feel about moving to JSON for the standard file format of pipelines and workflows?
>>>>>>>> I propose .hpj and .hwj as extensions.
>>>>>>>> This would push back our releases for a month or so while we convert the remaining serialization code to the new @HopMetadataProperty API
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Matt
>>>>>>
>>>>>> --
>>>>>> Neo4j Chief Solutions Architect
>>>>>> *✉ *[email protected]
>>>>>> ☎ +32486972937
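
For illustration only, here is a rough sketch of the kind of utility Brandon describes in the thread (a script that walks the tree and adds headers where they are missing). It assumes the .hpl/.hwf files are XML, as stated in the thread, and puts the header in an XML comment after any XML declaration. The names `add_header` and `bless_tree`, the marker string, and the abbreviated header text are all hypothetical placeholders; the real ASF header wording and the project's actual tooling are not shown here.

```python
from pathlib import Path
from typing import List

# Assumption: abbreviated placeholder text, NOT the full ASF license header.
HEADER_TEXT = (
    "Licensed to the Apache Software Foundation (ASF) under one or more "
    "contributor license agreements."
)
# Assumption: a file containing this phrase is treated as already having a header.
MARKER = "Licensed to the Apache Software Foundation"
# File extensions mentioned in the thread for pipelines and workflows.
EXTENSIONS = {".hpl", ".hwf"}


def add_header(xml_text: str) -> str:
    """Return xml_text with a license comment added, unless one is already present."""
    if MARKER in xml_text:
        return xml_text
    comment = f"<!--\n{HEADER_TEXT}\n-->\n"
    if xml_text.startswith("<?xml"):
        end = xml_text.find("?>")
        if end != -1:
            end += len("?>")
            # The XML declaration must stay first, so the comment goes right after it.
            rest = xml_text[end:].lstrip("\r\n")
            return xml_text[:end] + "\n" + comment + rest
    return comment + xml_text


def bless_tree(root: Path) -> List[Path]:
    """Add the header to every matching file under root; return the files changed."""
    changed = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in EXTENSIONS:
            original = path.read_text(encoding="utf-8")
            updated = add_header(original)
            if updated != original:
                path.write_text(updated, encoding="utf-8")
                changed.append(path)
    return changed
```

Running `bless_tree(Path("."))` a second time should change nothing, so the same function could be run by a contributor before committing or by a release manager before a release. Note that it does not address Hans's concern that saving from the GUI overwrites the header again.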
