Re: [gentoo-dev] Standard parsable format for profiles/package.mask file
On 22/09/2023 00.22, Ulrich Mueller wrote:
>> On Thu, 21 Sep 2023, Arthur Zamarin wrote:
>
>> = "Formal" format =
>
>> Each entry is composed of 2 parts: "#"-prefixed explanation block and
>> list of "${CATEGORY}/${PN}" packages. Entries are separated when a new
>> explanation block starts (meaning first "#"-prefixed line after packages
>> list). You may add newlines between packages in packages list.
>
> "Must" rather than "may" here? You certainly cannot list several
> packages in the same line.

Agreed, poor choice of words.

>> The first line of the "#"-prefixed explanation block must be of the
>> format "${AUTHOR_NAME} <${EMAIL}> (${SINGLE_DATE})" when the date is of
>> format YYYY-MM-DD, in UTC timezone.
>
>> If this is a last-rite message, the last line must list the last-rite
>> last date (removal date) and the last-rite bug number. You can also list
>> other bugs relevant to the last-rite. So I think a format of: "Removal
>> on ${REMOVAL_DATE}. Bug #NN, #NN." Where the bug list is comma
>> and space separated, we have at least one space (" +" regex) between the
>> removal date and bug list, and the date is of YYYY-MM-DD format.
>> I prefer this line is separate (and not continuous of prefix message text).
>
>> The explanation block itself can reference bugs, by matching the regex
>> "[Bb]ugs? #\d+(, +#\d+)*" (For example: "bug #713106, #753134"). I think
>> this is quite a simple one, but powerful enough for most.
>
>> Lines with single newline between them (so no blank line between them)
>> are considered as single paragraph continuum. If you want to start new
>> paragraph, leave a blank line (still prefixed with #) - think similar to
>> markdown. A line matching the last-rite line is always its own paragraph.
>
> Is this rule about paragraphs needed? It is at odds with the rule that
> the removal date and bug must be on their own line (i.e. that line is
> _not_ part of a "paragraph continuum").

Hmm, yeah, rereading my text shows I've over-complicated it. What I
wanted is that the last paragraph (which, with many bugs, might wrap
onto a new line) does not have to be separated by a blank line from the
"main explanation block".

> What about the introductory comment block in the file? Should there be a
> defined syntax for a separator between it and the rest of the file? For
> example, everything above the first line matching "^#[ \t]*---" could be
> ignored by automatic tools, and they would insert new entries below that
> separator.

Good point, and I will address it as you recommended: I will mention the
ignore-until-this-line separator, and that new entries should be added
as the first entry after it.

>> Should it be a GLEP, I don't think so? But I'm unsure about it. We do
>> need to document it (for example header of that exact file).
>
> It shouldn't be too difficult to wrap this up as a GLEP. OTOH, we don't
> have a GLEP for eclassdoc either.

Yeah, after all the input, I will work on a formal GLEP. It will take
time, but I hope to prepare a first draft in the coming 2 weeks.

> Ulrich

--
Arthur Zamarin
arthur...@gentoo.org
Gentoo Linux developer (Python, pkgcore stack, Arch Teams, GURU)
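
To make the "add new entries as first entry after the separator" idea concrete, here is a minimal Python sketch. It only assumes the "^#[ \t]*---" pattern quoted above; the function name insert_entry and the choice to raise an error when no separator exists are hypothetical and not taken from pkgdev or any other existing tool.

```python
import re

# Separator pattern as quoted in the message above.
SEPARATOR = re.compile(r"^#[ \t]*---")


def insert_entry(lines, entry):
    """Return a copy of lines with entry added as the first entry below the separator.

    lines and entry are lists of strings without trailing newlines.
    Raising on a missing separator is an assumption of this sketch.
    """
    for i, line in enumerate(lines):
        if SEPARATOR.match(line):
            rest = lines[i + 1:]
            # Drop blank lines directly below the separator so that exactly
            # one blank line ends up on each side of the new entry.
            while rest and not rest[0].strip():
                rest = rest[1:]
            result = lines[: i + 1] + [""] + list(entry)
            if rest:
                result += [""] + rest
            return result
    raise ValueError("no separator line found")
```

Everything above the separator, including the separator line itself, is left untouched, so the introductory comment block survives automated edits.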
Re: [gentoo-dev] Standard parsable format for profiles/package.mask file
On 22/09/2023 12.21, Ulrich Mueller wrote:
>> On Fri, 22 Sep 2023, Oskari Pirhonen wrote:
>
>>> Each entry is composed of 2 parts: "#"-prefixed explanation block and
>>> list of "${CATEGORY}/${PN}" packages. Entries are separated when a new
>>> explanation block starts (meaning first "#"-prefixed line after packages
>>> list). You may add newlines between packages in packages list.
>
>> What about mandatory blank line(s) between entries? That way it ensures
>> they are visually separated when skimming through the file. Plus, you
>> can easily jump from entry to entry in editors that support
>> paragraph-wise movement.
>
> Yes, please. Mandatory blank lines between entries, and no blank lines
> (or lines containing only whitespace) within entries. Especially, no
> blank lines in the list of packages.

Yeah, I agree. Originally I wanted to allow blank lines between packages
in the same entry (so you could group them), but after further
consideration and your input, I think this is a bad idea (if you want to
divide the group, create separate entries).

--
Arthur Zamarin
arthur...@gentoo.org
Gentoo Linux developer (Python, pkgcore stack, Arch Teams, GURU)
Re: [gentoo-dev] Standard parsable format for profiles/package.mask file
> On Fri, 22 Sep 2023, Oskari Pirhonen wrote:

>> Each entry is composed of 2 parts: "#"-prefixed explanation block and
>> list of "${CATEGORY}/${PN}" packages. Entries are separated when a new
>> explanation block starts (meaning first "#"-prefixed line after packages
>> list). You may add newlines between packages in packages list.

> What about mandatory blank line(s) between entries? That way it ensures
> they are visually separated when skimming through the file. Plus, you
> can easily jump from entry to entry in editors that support
> paragraph-wise movement.

Yes, please. Mandatory blank lines between entries, and no blank lines
(or lines containing only whitespace) within entries. Especially, no
blank lines in the list of packages.
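
With blank lines mandatory between entries and forbidden inside them, splitting the file into entries becomes almost trivial. The Python sketch below illustrates that, under the assumption that the introductory header has already been stripped; the name split_entries and the decision to reject whitespace-only lines with an exception are hypothetical, not part of pkgcheck.

```python
def split_entries(lines):
    """Split lines (no trailing newlines) into entries separated by blank lines.

    A whitespace-only line is rejected, following the rule that such lines
    must not appear; treating it as an error rather than as a separator is
    an assumption of this sketch.
    """
    entries, current = [], []
    for number, line in enumerate(lines, start=1):
        if line == "":
            # A blank line closes the entry collected so far, if any.
            if current:
                entries.append(current)
                current = []
        elif not line.strip():
            raise ValueError(f"line {number}: whitespace-only line")
        else:
            current.append(line)
    if current:
        entries.append(current)
    return entries
```

Each returned entry is a run of consecutive non-blank lines, which is the same unit an editor's paragraph-wise movement would jump over.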
Re: [gentoo-dev] Standard parsable format for profiles/package.mask file
On Thu, Sep 21, 2023 at 22:40:05 +0300, Arthur Zamarin wrote:
> = "Formal" format =
>
> Each entry is composed of 2 parts: "#"-prefixed explanation block and
> list of "${CATEGORY}/${PN}" packages. Entries are separated when a new
> explanation block starts (meaning first "#"-prefixed line after packages
> list). You may add newlines between packages in packages list.
>

What about mandatory blank line(s) between entries? That way it ensures
they are visually separated when skimming through the file. Plus, you
can easily jump from entry to entry in editors that support
paragraph-wise movement.

- Oskari
Re: [gentoo-dev] Standard parsable format for profiles/package.mask file
On Thu, Sep 21, 2023 at 23:22:27 +0200, Ulrich Mueller wrote:
> > On Thu, 21 Sep 2023, Arthur Zamarin wrote:
>
> > = "Formal" format =
>
> > Each entry is composed of 2 parts: "#"-prefixed explanation block and
> > list of "${CATEGORY}/${PN}" packages. Entries are separated when a new
> > explanation block starts (meaning first "#"-prefixed line after packages
> > list). You may add newlines between packages in packages list.
>
> "Must" rather than "may" here? You certainly cannot list several
> packages in the same line.
>

I read it to mean something like this would be allowed:

    # Blurb about package1, package2, and package3
    category/package1

    category/package2

    category/package3

Whether it makes sense to allow that is a different question.

- Oskari
Re: [gentoo-dev] Standard parsable format for profiles/package.mask file
Tim Harder writes:

> On 2023-09-21 Thu 15:22, Ulrich Mueller wrote:
>>> On Thu, 21 Sep 2023, Arthur Zamarin wrote:
>>> Should it be a GLEP, I don't think so? But I'm unsure about it. We do
>>> need to document it (for example header of that exact file).
>>
>>It shouldn't be too difficult to wrap this up as a GLEP.
>
> To me standardizing a format in Gentoo (outside of PMS-related
> functionality) requires a GLEP or at the very least some semi-formal
> documentation outside the file in question in a place like the
> devmanual. Consider it due diligence of the process that allows people
> writing code to target the format without having to chase details down
> into code bases or mailing list threads.

+1
Re: [gentoo-dev] Standard parsable format for profiles/package.mask file
On 2023-09-21 Thu 15:22, Ulrich Mueller wrote:
>> On Thu, 21 Sep 2023, Arthur Zamarin wrote:
>> Should it be a GLEP, I don't think so? But I'm unsure about it. We do
>> need to document it (for example header of that exact file).
>
>It shouldn't be too difficult to wrap this up as a GLEP.

To me standardizing a format in Gentoo (outside of PMS-related
functionality) requires a GLEP or at the very least some semi-formal
documentation outside the file in question in a place like the
devmanual. Consider it due diligence of the process that allows people
writing code to target the format without having to chase details down
into code bases or mailing list threads.

>OTOH, we don't have a GLEP for eclassdoc either.

This is a poor example since it's partly the reason why an awk script
with issues relating to extensibility and maintainability is still used
to generate eclass manpages. I mainly let it slide when writing
pkgcore/pkgcheck parsing functionality because the devmanual [0] was a
passable resource at the time.

Tim

[0]: https://devmanual.gentoo.org/eclass-writing/#documenting-eclasses
Re: [gentoo-dev] Standard parsable format for profiles/package.mask file
> On Thu, 21 Sep 2023, Arthur Zamarin wrote:

> = "Formal" format =

> Each entry is composed of 2 parts: "#"-prefixed explanation block and
> list of "${CATEGORY}/${PN}" packages. Entries are separated when a new
> explanation block starts (meaning first "#"-prefixed line after packages
> list). You may add newlines between packages in packages list.

"Must" rather than "may" here? You certainly cannot list several
packages in the same line.

> The first line of the "#"-prefixed explanation block must be of the
> format "${AUTHOR_NAME} <${EMAIL}> (${SINGLE_DATE})" when the date is of
> format YYYY-MM-DD, in UTC timezone.

> If this is a last-rite message, the last line must list the last-rite
> last date (removal date) and the last-rite bug number. You can also list
> other bugs relevant to the last-rite. So I think a format of: "Removal
> on ${REMOVAL_DATE}. Bug #NN, #NN." Where the bug list is comma
> and space separated, we have at least one space (" +" regex) between the
> removal date and bug list, and the date is of YYYY-MM-DD format.
> I prefer this line is separate (and not continuous of prefix message text).

> The explanation block itself can reference bugs, by matching the regex
> "[Bb]ugs? #\d+(, +#\d+)*" (For example: "bug #713106, #753134"). I think
> this is quite a simple one, but powerful enough for most.

> Lines with single newline between them (so no blank line between them)
> are considered as single paragraph continuum. If you want to start new
> paragraph, leave a blank line (still prefixed with #) - think similar to
> markdown. A line matching the last-rite line is always its own paragraph.

Is this rule about paragraphs needed? It is at odds with the rule that
the removal date and bug must be on their own line (i.e. that line is
_not_ part of a "paragraph continuum").

What about the introductory comment block in the file? Should there be a
defined syntax for a separator between it and the rest of the file? For
example, everything above the first line matching "^#[ \t]*---" could be
ignored by automatic tools, and they would insert new entries below that
separator.

> Should it be a GLEP, I don't think so? But I'm unsure about it. We do
> need to document it (for example header of that exact file).

It shouldn't be too difficult to wrap this up as a GLEP. OTOH, we don't
have a GLEP for eclassdoc either.

Ulrich
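
As an illustration of the separator idea, a tool could skip the introductory block roughly as in the Python sketch below. The regex is the one suggested above; the function name split_header and the fallback of treating the whole file as entries when no separator is present are assumptions, since the thread does not settle that case.

```python
import re

# Separator pattern suggested in the message above.
SEPARATOR = re.compile(r"^#[ \t]*---")


def split_header(lines):
    """Return (header, body); header ends with the first separator line.

    If no line matches, the header is empty and every line is body
    (an assumption of this sketch).
    """
    for i, line in enumerate(lines):
        if SEPARATOR.match(line):
            return lines[: i + 1], lines[i + 1:]
    return [], list(lines)
```

Only the first matching line counts, so a later comment that happens to start with "# ---" stays part of the body.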
[gentoo-dev] Standard parsable format for profiles/package.mask file
Hi all

I want to suggest a standard format for profiles/package.mask, for
multiple reasons:

1. Easier to write simple-to-understand mask or last-rites entries. When
   all entries are in a similar format, the reader knows where to expect
   important information. It is also easier for the writer to convey all
   needed information.

2. We can teach tools to parse it and render it nicely, or to help you
   fill in the file. For example, I've tried to implement a parser for
   packages.gentoo.org so it shows the message as nicely as possible; see
   [1] for an example. On the other hand, `pkgdev mask` [2] can help you
   fill in the message (including bug number, last-rite removal date,
   author & email line).

Both of them mostly work, but when someone "breaks" the unofficial
syntax, the tools fail sadly. This is why I want to recommend we create
a mostly standard syntax, so we can all expect the same thing and have
nice things.

Also please note that for now I want to formalize the format only for
the profiles/package.mask file, and not the ones inside all the
different profiles. If you think we should apply it to all of them,
let's discuss that separately, please :)

The current format is mostly acceptable, but let's tighten it. I will
implement a pkgcheck check that will validate the format and error out
if invalid.

[1] https://packages.gentoo.org/packages/sys-fs/eudev
[2] https://pkgcore.github.io/pkgdev/man/pkgdev/mask.html

= "Formal" format =

Each entry is composed of 2 parts: a "#"-prefixed explanation block and
a list of "${CATEGORY}/${PN}" packages. Entries are separated when a new
explanation block starts (meaning the first "#"-prefixed line after the
packages list). You may add newlines between packages in the packages
list.

The first line of the "#"-prefixed explanation block must be of the
format "${AUTHOR_NAME} <${EMAIL}> (${SINGLE_DATE})", where the date is
in YYYY-MM-DD format, in the UTC timezone.

If this is a last-rite message, the last line must list the last-rite
last date (removal date) and the last-rite bug number. You can also list
other bugs relevant to the last-rite. So I think a format of: "Removal
on ${REMOVAL_DATE}. Bug #NN, #NN." where the bug list is comma and space
separated, we have at least one space (" +" regex) between the removal
date and the bug list, and the date is in YYYY-MM-DD format. I prefer
that this line is separate (and not a continuation of the preceding
message text).

The explanation block itself can reference bugs, by matching the regex
"[Bb]ugs? #\d+(, +#\d+)*" (for example: "bug #713106, #753134"). I think
this is quite a simple one, but powerful enough for most.

Lines with a single newline between them (so no blank line between them)
are considered a single paragraph continuum. If you want to start a new
paragraph, leave a blank line (still prefixed with #) - think similar to
markdown. A line matching the last-rite line is always its own
paragraph.

= Example =

After all that rambling, here is an example (it will result in 3
paragraphs: 2 of explanation and 1 last-rite finish):

# Arthur Zamarin (2023-09-21)
# Very broken, no idea why packaged, need to drop ASAP. The project
# is done with supporting this package. See for history bug #667889.
#
# As a better plan, you should migrate to dev-lang/perl, which has
# better compatibility with dev-lang/ruby when used with dev-lang/lua
# bindings.
# Removal on 2023-10-21. Bug #667687, #667689.
dev-lang/python

= Call for comments =

So how does it sound? I know it is tempting for me to limit the syntax
(since I'll need to implement the parsing of it), but I think the format
above matches most of the currently used ones, and the one created by
`pkgdev mask`. But if needed, I'm open to improving it based on
comments.

Should it be a GLEP? I don't think so, but I'm unsure about it. We do
need to document it (for example in the header of that exact file).

--
Arthur Zamarin
arthur...@gentoo.org
Gentoo Linux developer (Python, pkgcore stack, Arch Teams, GURU)
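
For readers who want to see the proposed rules in executable form, the following Python sketch parses a single entry. It is an illustration under stated assumptions, not the pkgcheck or packages.gentoo.org implementation: the names AUTHOR_LINE, LAST_RITE_LINE and parse_entry are hypothetical, the last-rite pattern is one reading of the "Removal on ${REMOVAL_DATE}. Bug #NN, #NN." wording, and only the bug-reference regex is taken verbatim from the proposal.

```python
import re
from datetime import date

# "# ${AUTHOR_NAME} <${EMAIL}> (${SINGLE_DATE})" with a YYYY-MM-DD date.
AUTHOR_LINE = re.compile(
    r"^# (?P<name>.+?) <(?P<email>[^<>\s]+)> \((?P<date>\d{4}-\d{2}-\d{2})\)$"
)
# "# Removal on YYYY-MM-DD. Bug #NN, #NN." with " +" before the bug list.
LAST_RITE_LINE = re.compile(
    r"^# Removal on (?P<date>\d{4}-\d{2}-\d{2})\. +Bug #\d+(?:, +#\d+)*\.$"
)
# Bug reference regex from the proposal (group made non-capturing).
BUG_REF = re.compile(r"[Bb]ugs? #\d+(?:, +#\d+)*")


def parse_entry(lines):
    """Parse one entry given as a list of lines without trailing newlines."""
    comments = [line for line in lines if line.startswith("#")]
    packages = [line for line in lines if not line.startswith("#")]
    head = AUTHOR_LINE.match(comments[0]) if comments else None
    if head is None:
        raise ValueError("first line must be '# NAME <EMAIL> (YYYY-MM-DD)'")
    last = LAST_RITE_LINE.match(comments[-1])
    # Join the explanation so a bug list wrapped across lines is still found.
    text = " ".join(line[1:].strip() for line in comments[1:])
    bugs = [
        int(number)
        for ref in BUG_REF.finditer(text)
        for number in re.findall(r"\d+", ref.group())
    ]
    return {
        "author": head["name"],
        "email": head["email"],
        "date": date.fromisoformat(head["date"]),
        "removal": date.fromisoformat(last["date"]) if last else None,
        "bugs": bugs,
        "packages": packages,
    }
```

Calling parse_entry on an entry shaped like the example, with a placeholder author line such as "# Larry The Cow <larry@example.org> (2023-09-21)", yields the author, both dates, the package list, and every bug number referenced in the explanation, including those on the last-rite line.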