Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On Thu, 27 Mar 2014 03:53:47 +0100 yac y...@gentoo.org wrote: What I was describing is the difference between fundamental properties of categories and tags. You are trying to redefine categories in terms of a concept that they didn't originally represent. From a package mangler perspective, categories aren't just a label for a package. They're fundamentally part of a package's name. -- Ciaran McCreesh
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On Fri, Mar 28, 2014 at 1:14 PM, Ciaran McCreesh ciaran.mccre...@googlemail.com wrote: On Thu, 27 Mar 2014 03:53:47 +0100 yac y...@gentoo.org wrote: What I was describing is the difference between fundamental properties of categories and tags. You are trying to redefine categories in terms of a concept that they didn't originally represent. No one's redefining anything. You seem awfully fixated on the history that forced categories to exist, which doesn't really matter in this context. Regardless of any of that, people can and _do_ attempt to use categories as a rudimentary method of attempting to search for packages. As you and several others have so eloquently pointed out, that's not their purpose. Concurrently, from the other direction, myself and several others have noted that they're thoroughly inadequate for that anyway. That's why this topic keeps coming up and why this (work-in-progress) GLEP exists in the first place. From a package mangler perspective, categories aren't just a label for a package. They're fundamentally part of a package's name. From that standpoint, they're even less adequate for lookup; encoding metadata in names has never turned out well for anyone. Cheers, Wyatt
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On Fri, 28 Mar 2014 15:46:49 -0400 Wyatt Epp wyatt@gmail.com wrote: On Fri, Mar 28, 2014 at 1:14 PM, Ciaran McCreesh ciaran.mccre...@googlemail.com wrote: On Thu, 27 Mar 2014 03:53:47 +0100 yac y...@gentoo.org wrote: What I was describing is the difference between fundamental properties of categories and tags. You are trying to redefine categories in terms of a concept that they didn't originally represent. No one's redefining anything. You seem awfully fixated on the history that forced categories to exist, which doesn't really matter in this context. Regardless of any of that, people can and _do_ attempt to use categories as a rudimentary method of attempting to search for packages. Giving something a unique unambiguous name is not a historical issue. It's something we still need, and a core part of how package manglers work. You can't just pretend that categories aren't there for exactly this. From a package mangler perspective, categories aren't just a label for a package. They're fundamentally part of a package's name. From that standpoint, they're even less adequate for lookup; encoding metadata in names has never turned out well for anyone. Things still need a unique unambiguous name. It's that or GUIDs... -- Ciaran McCreesh
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
Ciaran McCreesh: On Fri, 28 Mar 2014 15:46:49 -0400 Wyatt Epp wyatt@gmail.com wrote: On Fri, Mar 28, 2014 at 1:14 PM, Ciaran McCreesh ciaran.mccre...@googlemail.com wrote: On Thu, 27 Mar 2014 03:53:47 +0100 yac y...@gentoo.org wrote: What I was describing is the difference between fundamental properties of categories and tags. You are trying to redefine categories in terms of a concept that they didn't originally represent. No one's redefining anything. You seem awfully fixated on the history that forced categories to exist, which doesn't really matter in this context. Regardless of any of that, people can and _do_ attempt to use categories as a rudimentary method of attempting to search for packages. Giving something a unique unambiguous name is not a historical issue. It's something we still need, and a core part of how package manglers work. You can't just pretend that categories aren't there for exactly this. From a package mangler perspective, categories aren't just a label for a package. They're fundamentally part of a package's name. From that standpoint, they're even less adequate for lookup; encoding metadata in names has never turned out well for anyone. Things still need a unique unambiguous name. It's that or GUIDs... derailed.
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On 25 March 2014 03:55, Damien Levac damien.le...@gmail.com wrote: A lot of people already replied to this question: package search. A trivial example, a user want to know all terminals available in portage. Of course he could try a `emerge --searchdesc terminal`, but then he would get anything mentioning terminal in the description: which would probably include a lot of terminal applications which are not terminals themselves... `emerge --search terminal` just doesn't cut it as konsole wouldn't be a result but is a terminal emulator... On the other hand, terminals are spread through many categories (gnome-terminal in gnome-base konsole in kde-base to name the most obvious example). Thus tags are a nice way for user to find the applications they want. This example for me suggests we'll need to have some kind of process of defining what tags should be used for what things, similar to how we have a process for global USE, mostly, because inconsistency is a bad thing here. Because looking at this example and the results of `eix -cS terminal`, I see lots of things that may also be ambiguously tagged terminal due to being a terminal based application. Thus, either terminal-emulator or terminal-app or similar tags seem necessary. emerge --search tag:terminal-app tag:jabber-client ( or similar ) should thus result in net-im/mcabber And now that we're starting to flesh out mock tags that may make sense, it quickly seems we'll eventually want some kind of tag hierarchy. But as long as the tag is restricted to [A-Za-z-]+ or similar, we should have enough syntactical space to add a hierarchy in later if we find out we need it. For the sake of avoiding bikeshed, we should avoid hierarchy until we've proven tags are useful and have discovered we really need hierarchy. YAGNI -- Kent
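The tag syntax and the `tag:` search prefix proposed above can be sketched in a few lines. This is only an illustration, assuming the `[A-Za-z-]+` pattern from the message is taken literally; the function names `is_valid_tag` and `parse_tag_query` are hypothetical and not part of emerge or any existing tool.

```python
import re

# Pattern taken literally from the proposal: letters and hyphens only.
TAG_PATTERN = re.compile(r"[A-Za-z-]+")


def is_valid_tag(tag):
    """Return True if the whole tag matches the proposed pattern."""
    return TAG_PATTERN.fullmatch(tag) is not None


def parse_tag_query(query):
    """Extract tags from a hypothetical 'tag:foo tag:bar' search string.

    Words without the 'tag:' prefix are ignored; invalid tags raise ValueError.
    """
    tags = []
    for word in query.split():
        if not word.startswith("tag:"):
            continue
        tag = word[len("tag:"):]
        if not is_valid_tag(tag):
            raise ValueError(f"invalid tag: {tag!r}")
        tags.append(tag)
    return tags


print(parse_tag_query("tag:terminal-app tag:jabber-client"))
```

Because a character such as `/` or `:` is outside the pattern, it stays free as a separator if a hierarchy is ever added later, which is the point made in the message.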
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On Sat, 29 Mar 2014 09:39:06 +1300 Kent Fredric kentfred...@gmail.com wrote: On 25 March 2014 03:55, Damien Levac damien.le...@gmail.com wrote: A lot of people already replied to this question: package search. A trivial example, a user wants to know all terminals available in portage. Of course he could try an `emerge --searchdesc terminal`, but then he would get anything mentioning terminal in the description: which would probably include a lot of terminal applications which are not terminals themselves... `emerge --search terminal` just doesn't cut it as konsole wouldn't be a result but is a terminal emulator... On the other hand, terminals are spread through many categories (gnome-terminal in gnome-base, konsole in kde-base to name the most obvious example). Thus tags are a nice way for users to find the applications they want. Because looking at this example and the results of `eix -cS terminal`, I see lots of things that may also be ambiguously tagged terminal due to being a terminal based application. Thus, either terminal-emulator or terminal-app or similar tags seem necessary. emerge --search tag:terminal-app tag:jabber-client ( or similar ) should thus result in net-im/mcabber You do this by searching for the intersection of tags: terminal ∩ jabber ∩ client --- Jan Matějka| Developer https://gentoo.org | Gentoo Linux GPG: A33E F5BC A9F6 DAFD 2021 6FB6 3EBF D45B EEB6 CA8B
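The intersection search described above maps directly onto set operations. The following is a minimal sketch, not a proposal for how portage would store tags: the `PACKAGE_TAGS` table and its tag assignments are invented for illustration, and `search_all` is a hypothetical name.

```python
# Hypothetical tag assignments, invented purely for illustration.
PACKAGE_TAGS = {
    "net-im/mcabber": {"terminal", "jabber", "client"},
    "kde-base/konsole": {"terminal", "emulator"},
    "gnome-base/gnome-terminal": {"terminal", "emulator"},
}


def search_all(required_tags):
    """Return packages carrying every tag in required_tags (set intersection)."""
    required = set(required_tags)
    # `required <= tags` is the subset test: all required tags are present.
    return sorted(pkg for pkg, tags in PACKAGE_TAGS.items() if required <= tags)


print(search_all({"terminal", "jabber", "client"}))
```

With these made-up assignments, searching for `terminal` alone returns all three packages, which is exactly the ambiguity the rest of the thread goes on to discuss.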
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On 29 March 2014 09:56, yac y...@gentoo.org wrote: terminal ∩ jabber ∩ client And now you want *only* terminal terminals, do you have to search for terminal ∩ !( jabber ∪ client ∪ everything ∪ else ) ? Or terminal ∩ emulator ( Which may include terminals for emulators instead of terminal emulators ) -- Kent http://kent-fredric.fox.geek.nz
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
And now you file a bug to get that incorrectly applied terminal tag changed to cli, because they don't mean the same thing. On Fri, Mar 28, 2014 at 4:06 PM, Kent Fredric kentfred...@gmail.com wrote: On 29 March 2014 09:56, yac y...@gentoo.org wrote: terminal ∩ jabber ∩ client And now you want *only* terminal terminals, do you have to search for terminal ∩ !( jabber ∪ client ∪ everything ∪ else ) ? Or terminal ∩ emulator ( Which may include terminals for emulators instead of terminal emulators ) -- Kent http://kent-fredric.fox.geek.nz
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On Fri, Mar 28, 2014 at 4:39 PM, Kent Fredric kentfred...@gmail.com wrote: This example for me suggests we'll need to have some kind of process of defining what tags should be used for what things, similar to how we have a process for global USE, mostly, because inconsistency is a bad thing here. Yes, you want a controlled, well-defined vocabulary. That's important. On the other hand, don't get too bent out of shape about it. These things fall over when you start adding dumb arbitrary restrictions like there needs to be consensus or there need to be at least n packages beforehand. Because looking at this example and the results of `eix -cS terminal`, I see lots of things that may also be ambiguously tagged terminal due to being a terminal based application. Thus, either terminal-emulator or terminal-app or similar tags seem necessary. terminal: terminal emulators. Make it an alias to terminal_emulator. cli: things that have a normal, line-based terminal interface. See also: curses. It's not hard to choose good, unambiguous tags when you can use aliasing to shorthand and unify. That's why it's more important than implication, because controlling your vocabulary is seriously important. And now that we're starting to flesh out mock tags that may make sense, it quickly seems we'll eventually want some kind of tag hierarchy. No. You really, really, reaally don't. At least not in the sense that you seem to be thinking. It makes tags annoying to add and annoying to use, so no one does either and the whole thing falls over. But as long as the tag is restricted to [A-Za-z-]+ or similar, we should have enough syntactical space to add a hierarchy in later if we find out we need it. Don't worry, we won't. With only the facilities I've outlined in my first post, the system will scale well beyond a million packages and tens of thousands of unique tags, so don't worry too much about exhausting our semantic description space. Cheers, Wyatt
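Aliasing as described above amounts to mapping shorthand tags onto one canonical tag before storing or searching. Here is a rough sketch under that assumption; only the `terminal` to `terminal_emulator` alias comes from the message, while the `ALIASES` table layout and the names `canonical` and `normalize` are hypothetical.

```python
# Alias table: shorthand -> canonical tag.
# Only the "terminal" entry comes from the message above; the table layout is an assumption.
ALIASES = {
    "terminal": "terminal_emulator",
}


def canonical(tag):
    """Follow aliases until a canonical tag is reached; reject alias cycles."""
    seen = set()
    while tag in ALIASES:
        if tag in seen:
            raise ValueError(f"alias cycle involving {tag!r}")
        seen.add(tag)
        tag = ALIASES[tag]
    return tag


def normalize(tags):
    """Map a collection of tags onto their canonical forms."""
    return {canonical(tag) for tag in tags}


print(sorted(normalize({"terminal", "cli"})))
```

Applying `normalize` to both the stored tags and the search terms means a user typing `terminal` and a maintainer writing `terminal_emulator` end up talking about the same tag.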
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On Monday 24 of March 2014 16:28:44 Ciaran McCreesh wrote:
| On Mon, 24 Mar 2014 10:55:38 -0400
| Damien Levac damien.le...@gmail.com wrote:
| A lot of people already replied to this question: package search.
|
| Sure, but can you point to prior examples of this kind of stuff
| actually working?

https://wiki.debian.org/Debtags
http://debtags.debian.net/search/

True, may not be as popular as full-text description search.

regards
MM
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On Fri, 28 Mar 2014 20:02:30 + Ciaran McCreesh ciaran.mccre...@googlemail.com wrote: On Fri, 28 Mar 2014 15:46:49 -0400 Wyatt Epp wyatt@gmail.com wrote: On Fri, Mar 28, 2014 at 1:14 PM, Ciaran McCreesh ciaran.mccre...@googlemail.com wrote: On Thu, 27 Mar 2014 03:53:47 +0100 yac y...@gentoo.org wrote: What I was describing is the difference between fundamental properties of categories and tags. You are trying to redefine categories in terms of a concept that they didn't originally represent. No one's redefining anything. You seem awfully fixated on the history that forced categories to exist, which doesn't really matter in this context. Regardless of any of that, people can and _do_ attempt to use categories as a rudimentary method of attempting to search for packages. Giving something a unique unambiguous name is not a historical issue. It's something we still need, and a core part of how package manglers work. You can't just pretend that categories aren't there for exactly this. I see your point. Resolving ambiguity is certainly needed, and categories are prettier than most distributions' approach of prefixing the package name with python-. However, it still seems that, besides resolving ambiguity, categories are in part also used to provide information better expressed with tags, like the genre of a game. jcallen was kind enough to provide a script that finds ambiguous package names and prints them with the categories they are in [1]_ and the output for the portage tree [2]_, which supports my suspicion that there are indeed no ambiguities in game names. Maybe more cases like this can be found. .. [1] http://bpaste.net/show/VuEHVqLlLgsfsdL71tuz/ .. [2] http://bpaste.net/show/195029/ --- Jan Matějka| Developer https://gentoo.org | Gentoo Linux GPG: A33E F5BC A9F6 DAFD 2021 6FB6 3EBF D45B EEB6 CA8B
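The kind of check mentioned above, finding package names that occur in more than one category, can be sketched as follows. This is not the script linked in [1]; it is an independent illustration that assumes the input is a plain list of `category/package` strings, and the sample atoms are made up.

```python
from collections import defaultdict


def ambiguous_names(atoms):
    """Map each package name that appears in several categories to those categories."""
    categories_by_name = defaultdict(set)
    for atom in atoms:
        category, name = atom.split("/", 1)
        categories_by_name[name].add(category)
    return {
        name: sorted(categories)
        for name, categories in categories_by_name.items()
        if len(categories) > 1
    }


# Made-up atoms for illustration only.
sample = [
    "dev-python/example",
    "dev-ruby/example",
    "games-puzzle/sample",
]
print(ambiguous_names(sample))
```

A name that shows up under only one category never needed the category to disambiguate it, which is the property claimed above for game names.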
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On Tue, 25 Mar 2014 18:31:45 +0100 Jeroen Roovers j...@gentoo.org wrote: On Tue, 25 Mar 2014 08:03:08 +0100 Jan Matejka y...@gentoo.org wrote: No, categories are essentially directories. fixed: categories are essentially also directories. Also? No, categories are *essentially* directories: they keep files apart that should not go together. In precisely that way, their names happen to aid in building unique atoms, which you need to be able to tell a package manager (or development tool) which precise bunch of files you want to read/address/target/modify/etc. They are *also* other things, like identifiers for actual categories of packages (hence the name) These are all accidental properties of our categories application. I don't see how they are relevant. What I was describing is the difference between fundamental properties of categories and tags. which may or may not suit someone's needs in finding packages based on keywords. That's where tags comes in. Stating in a GLEP that they're a giant mistake means you'll have to polish the document till you have rephrased that into something true agreed. and acceptable, or until you have purged every mention and reference of the giant mistake because it does not serve the purpose of the GLEP at all. Categories are *essential* to the way the repositories now work, and they're not going away, especially not by way of this GLEP. See below. I was asking about tags, not about categories. The original mails are: On Sun, 23 Mar 2014 15:46:09 +0100 Jeroen Roovers j...@gentoo.org wrote: On Sat, 22 Mar 2014 15:33:27 -0700 Alec Warner anta...@gentoo.org wrote: https://wiki.gentoo.org/wiki/Package_Tags This GLEP author would love to blight categories out of gentoo history as a giant mistake. Why? Categories are essentially tags, only less powerful as they can express relationship of 1:N while tags are can express M:N How is this a question about tags and not categories? The GLEP's statements about categories appear to be a straw man. 
It basically states that:
* we introduced categories to aid in finding packages
* but it turned out that categories suck at helping us find packages
* so now we need to add tags
* but we can keep categories because they have proven useful for other stuff

Please explain how the straw man is different from the real issue. --- Jan Matějka| Developer https://gentoo.org | Gentoo Linux GPG: A33E F5BC A9F6 DAFD 2021 6FB6 3EBF D45B EEB6 CA8B
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On Mon, 24 Mar 2014 15:25:12 +0100 Jeroen Roovers j...@gentoo.org wrote: On Mon, 24 Mar 2014 12:36:19 +0100 Jan Matejka y...@gentoo.org wrote: Categories are essentially tags, only less powerful as they can express relationship of 1:N while tags can express M:N No, categories are essentially directories. fixed: categories are essentially also directories. same thing, 1:N relationship without symlinks and other misuse of filesystem. I was asking about tags, not about categories. The original mails are: On Sun, 23 Mar 2014 15:46:09 +0100 Jeroen Roovers j...@gentoo.org wrote: On Sat, 22 Mar 2014 15:33:27 -0700 Alec Warner anta...@gentoo.org wrote: https://wiki.gentoo.org/wiki/Package_Tags This GLEP author would love to blight categories out of gentoo history as a giant mistake. Why? jer Categories are essentially tags, only less powerful as they can express relationship of 1:N while tags can express M:N How is this a question about tags and not categories? It appears it's very hard to answer the simple questions of why we need tags and how we would use them. The answers should typically involve some explanation of how you're going to use the things once you have them. jer -- Jan Matějka| Gentoo Developer https://gentoo.org | Gentoo Linux GPG: A33E F5BC A9F6 DAFD 2021 6FB6 3EBF D45B EEB6 CA8B
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On Mon, 24 Mar 2014 09:32:40 +0200 Alan McKinnon alan.mckin...@gmail.com wrote: Who is going to approve/disapprove taggable attributes and the tags themselves? How will you resolve disagreements people have? Sounds like a job for QA. What about the case of a package maintainer that simply can't be bothered doing tags at all? I'm not against tagging per se, they can be useful. But they do have to be strictly controlled otherwise things get out of hand very quickly. Every case I've seen of software that uses a freeform tagging mechanism fails almost instantly as it becomes very inconsistent. I have one of these apps in a corporate setting right now, have you any idea how many ways people can come up with to tag the concept of cloud? Some of these could probably be detected by meeting a threshold in Levenshtein distance (or some variant of it) and suggesting the existing alternative for consideration before the commit is made. -- Jan Matějka| Gentoo Developer https://gentoo.org | Gentoo Linux GPG: A33E F5BC A9F6 DAFD 2021 6FB6 3EBF D45B EEB6 CA8B
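The Levenshtein check suggested above could look roughly like this. It is a sketch only: the threshold of 2, the function names and the sample tags are assumptions chosen for illustration, and nothing here describes an existing commit hook.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (char_a != char_b),  # substitution
            ))
        previous = current
    return previous[-1]


def suggest_existing(new_tag, known_tags, max_distance=2):
    """Return known tags within max_distance edits of new_tag, excluding exact matches."""
    return sorted(
        tag for tag in known_tags
        if tag != new_tag and levenshtein(new_tag, tag) <= max_distance
    )


# A misspelled tag is flagged against the hypothetical known vocabulary.
print(suggest_existing("clould", {"cloud", "cli", "terminal"}))
```

Edit distance only catches near-spellings. A synonym or a translation of an existing tag is far away in edit distance, so cases like those would still need review or an alias.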
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On Sun, 23 Mar 2014 17:40:20 -0400 Joshua Kinard ku...@gentoo.org wrote: TBH, I don't like the use of XML at all. No, we don't need to go one level (format) deeper. The 'all' thing is probably unnecessary. What problem does having an 'all' tag solve? Seems pretty useless to me. Granted, a tag of dev offers no value (dev-python -> 'dev','python'), but if you were looking for a web browser versus a web server, having default tags of 'www','client' or 'www','servers' helps for packages in www-client and www-servers. This might be helpful, but rather as a one-time, initial, hand-picked generation on a case-by-case (by the category) basis. I've always wondered if we allowed portage to have one additional level of nesting if that'd help any (i.e., games-* -> games/*). Squashing games-*/ to just games/ and defining genre by tags. Seems pretty doable, I like this. -- Jan Matějka| Gentoo Developer https://gentoo.org | Gentoo Linux GPG: A33E F5BC A9F6 DAFD 2021 6FB6 3EBF D45B EEB6 CA8B
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On Sat, 22 Mar 2014 15:33:27 -0700 Alec Warner anta...@gentoo.org wrote: https://wiki.gentoo.org/wiki/Package_Tags Object or forever hold your peace. Or argue for 100 posts, either way. -A

It might be worthwhile to prototype this functionality as an external service.
1. It should be easier to crowdsource the tag definitions, assignments and QA, which helps to build a consistent database faster and more easily.
2. It decouples tagging from maintaining a package, which makes it more community driven and prevents the THAT'S MY PACKAGE, DON'T TOUCH IT YOU SCHMUCK type of quarrel.
3. Classic prototyping benefit: it might identify problematic areas for implementation.

The downside is that the implementation might be more time consuming, depending on architecture, mostly due to 1. I guess that could be handled by using git and pull requests for crowdsourcing, and then the implementation should be basically no different from using portage itself, besides using a different VCS. -- Jan Matějka| Developer https://gentoo.org | Gentoo Linux GPG: A33E F5BC A9F6 DAFD 2021 6FB6 3EBF D45B EEB6 CA8B
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
Jan Matejka: I've always wondered if we allowed portage to have one additional level of nesting if that'd help any (i.e., games-* -> games/*). Squashing games-*/ to just games/ and defining genre by tags. Seems pretty doable, I like this. Sounds like a huge waste of time. And have fun doing all the pkgmoves in CVS. However, I think we can agree that the GLEP needs some more work. But it seems the majority is interested, so don't give up.
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On Tue, 25 Mar 2014 08:03:08 +0100 Jan Matejka y...@gentoo.org wrote: No, categories are essentially directories. fixed: categories are essentially also directories. Also? No, categories are *essentially* directories: they keep files apart that should not go together. In precisely that way, their names happen to aid in building unique atoms, which you need to be able to tell a package manager (or development tool) which precise bunch of files you want to read/address/target/modify/etc. They are *also* other things, like identifiers for actual categories of packages (hence the name), which may or may not suit someone's needs in finding packages based on keywords. Stating in a GLEP that they're a giant mistake means you'll have to polish the document till you have rephrased that into something true and acceptable, or until you have purged every mention and reference of the giant mistake because it does not serve the purpose of the GLEP at all. Categories are *essential* to the way the repositories now work, and they're not going away, especially not by way of this GLEP. See below. I was asking about tags, not about categories. The original mails are: On Sun, 23 Mar 2014 15:46:09 +0100 Jeroen Roovers j...@gentoo.org wrote: On Sat, 22 Mar 2014 15:33:27 -0700 Alec Warner anta...@gentoo.org wrote: https://wiki.gentoo.org/wiki/Package_Tags This GLEP author would love to blight categories out of gentoo history as a giant mistake. Why? Categories are essentially tags, only less powerful as they can express relationship of 1:N while tags are can express M:N How is this a question about tags and not categories? The GLEP's statements about categories appear to be a straw man. 
It basically states that:
* we introduced categories to aid in finding packages
* but it turned out that categories suck at helping us find packages
* so now we need to add tags
* but we can keep categories because they have proven useful for other stuff

You seem to be repeating the same illogical argument here. Now don't fix it here: go fix the GLEP. jer
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On 24/03/2014 02:43, Tom Wijsman wrote: On Sun, 23 Mar 2014 23:47:22 +0200 Alan McKinnon alan.mckin...@gmail.com wrote: Tags work best when they describe narrow, clearly defined attributes, and the thing they are applied to can have one, two or more of these attributes or sometimes even none. Music and movie genres are an excellent example - there are only so many of them and for the most part one can tell whether a tag really is a genre or not. There are more ways to search for a music or a movie than a genre: Genre was just one example of tag usage for illustration. Doesn't mean there aren't other equally good or valid examples. What mood is it in? What are key elements of its plot or lyrics? Where does it take place? For which audience is it meant? Which praises has it received? What kind of style is it made in? What is it based on? What is the attitude of it? What looks or effects does it have? Is it appropriate for children? Does it contain explicit things? Let's do this for movies. I'm looking for a ... ... serial killer (key element) that is scary (mood)? Carrie, Halloween, Saw, Scream, ... ... musical (genre) that makes one feel good (mood)? Aaja Nachle, Frozen, Grease, The Sound of Music, ... ... good versus evil (plot) based on comics (based on)? Batman, Sin City, Superman, The Avengers, ... ... goofy (attitude) hero (key element) where nothing goes right (plot)? Due Date, Faulty Towers, Monty Python's Flying Circus, Mr Bean, ... These are results from an actual movie recommendation site; similarly, the same exists for music too, where you can for example look for a female american singer-songwriter singing catchy contemporary country. Getting back to Gentoo; when I would look for some package, I want it to be a lightweight, do audio recordings, organize these audio recordings and do effects on these audio recordings. 
So, I'll be looking for tags like lightweight, audio-recording, file-organization, sound-effects; if that's to broad, I can take two of them and test some of that. Thinking about the different types of things to search for; I'm thinking about ... ... what the characteristics of the software are (light/heavy, new/old, extensible/modular/nonstandard, ...), ... what the software can do (record audio, organize files, ...), ... what category (browser, development, DAW software, utility, ...), ... what kind of interface the software has to me (CLI, GUI, ...), ... what interconnectivity the software has (internet, bluetooth, ...), ... and so on ... We could make a list of types (some already mentioned above) and a list of possible tags for that type to shape the tag system somewhat. Have you considered just how much heavy lifting that is? Who is going to compile the list of tags? Who is going to approve/disapprove tagable attributes and the tags themselves? How will you resolve disagreements people have? What about the case of a package maintainer that simply can't be bothered doing tags at all? I'm not against tagging per se, they can be useful. But they do have to be strictly controlled otherwise things get out of hand very quickly. Every case I've seen of software that uses a freeform tagging mechanism fails almost instantly as it becomes very inconsistent. I have one of these apps in a corporate setting right now, have you any idea how many ways people can come up with to tag the concept of cloud? I have tags in there where someone translated the word cloud to a different language! It sounded like a good idea at the time to them All in all, tagging is a huge amount of work and the odds of failure are high. People need to be aware of this reality. Wyatt Epp's post at 03:25 expresses very nicely in a more formal language what I'm saying. -- Alan McKinnon alan.mckin...@gmail.com
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On Mon, 24 Mar 2014 09:32:40 +0200 Alan McKinnon alan.mckin...@gmail.com wrote: On 24/03/2014 02:43, Tom Wijsman wrote: On Sun, 23 Mar 2014 23:47:22 +0200 Alan McKinnon alan.mckin...@gmail.com wrote: Tags work best when they describe narrow, clearly defined attributes, and the thing they are applied to can have one, two or more of these attributes or sometimes even none. Music and movie genres are an excellent example - there are only so many of them and for the most part one can tell whether a tag really is a genre or not. There are more ways to search for a music or a movie than a genre: Genre was just one example of tag usage for illustration. Doesn't mean there aren't other equally good or valid examples. +1 Ah, in that case, what I've said backs up your thought. \o/ We could make a list of types (some already mentioned above) and a list of possible tags for that type to shape the tag system somewhat. Have you considered just how much heavy lifting that is? Who is going to compile the list of tags? +1 Yes, it's why I've stated before this should be crowd sourced. Who is going to approve/disapprove tagable attributes and the tags themselves? Approval by default (with a quick skim over it) where we disapprove what's not appropriate once we spot it could work. The tagging rules will make themselves here. Those whom are interested could do it; that is, I'd expect Alec to help out a bit, maybe I do too, maybe others? How will you resolve disagreements people have? Discussion and/or votes. What about the case of a package maintainer that simply can't be bothered doing tags at all? +1 [see crowd sourced idea] I'm not against tagging per se, they can be useful. +1, same thought; it's nice to have, but it needs to be good to work. But they do have to be strictly controlled otherwise things get out of hand very quickly. Every case I've seen of software that uses a freeform tagging mechanism fails almost instantly as it becomes very inconsistent. 
I have one of these apps in a corporate setting right now; have you any idea how many ways people can come up with to tag the concept of cloud? I have tags in there where someone translated the word cloud to a different language! It sounded like a good idea at the time to them. All in all, tagging is a huge amount of work and the odds of failure are high. People need to be aware of this reality. +1 As can be seen, it can be made to work with things like movie and music recommendation; it did take a while until they got to that point, but doing the work right keeps us from spending too much time on this. Wyatt Epp's post at 03:25 expresses very nicely in a more formal language what I'm saying. +1 -- With kind regards, Tom Wijsman (TomWij) Gentoo Developer E-mail address : tom...@gentoo.org GPG Public Key : 6D34E57D GPG Fingerprint : C165 AF18 AB4C 400B C3D2 ABF0 95B2 1FCD 6D34 E57D
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On Mon, 24 Mar 2014 12:36:19 +0100 Jan Matejka y...@gentoo.org wrote: Categories are essentially tags, only less powerful, as they can express a relationship of 1:N while tags can express M:N. No, categories are essentially directories. I was asking about tags, not about categories. It appears it's very hard to answer the simple questions of why we need tags and how we would use them. The answers should typically involve some explanation of how you're going to use the things once you have them. jer
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On 14-03-24 10:25 AM, Jeroen Roovers wrote: On Mon, 24 Mar 2014 12:36:19 +0100 Jan Matejka y...@gentoo.org wrote: Categories are essentially tags, only less powerful, as they can express a relationship of 1:N while tags can express M:N. No, categories are essentially directories. I was asking about tags, not about categories. It appears it's very hard to answer the simple questions of why we need tags and how we would use them. The answers should typically involve some explanation of how you're going to use the things once you have them. jer A lot of people already replied to this question: package search. A trivial example: a user wants to know all terminals available in portage. Of course he could try an `emerge --searchdesc terminal`, but then he would get anything mentioning terminal in the description, which would probably include a lot of terminal applications which are not terminals themselves... `emerge --search terminal` just doesn't cut it, as konsole wouldn't be a result but is a terminal emulator... On the other hand, terminals are spread through many categories (gnome-terminal in gnome-base, konsole in kde-base, to name the most obvious examples). Thus tags are a nice way for users to find the applications they want. Damien
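The terminal example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tag names, the `build_index` and `find_by_tags` helpers, and the `app-misc/example-tool` package are all hypothetical, and nothing here is part of portage or of the GLEP draft. It shows how a tag lookup finds packages regardless of which category they live in.

```python
from collections import defaultdict

# Hypothetical tag assignments; the tag names are invented for illustration.
PACKAGE_TAGS = {
    "gnome-base/gnome-terminal": {"terminal-emulator", "gnome"},
    "kde-base/konsole": {"terminal-emulator", "kde"},
    "app-misc/example-tool": {"terminal-application"},  # made-up package
}

def build_index(package_tags):
    """Invert a package -> tags mapping into tag -> packages."""
    index = defaultdict(set)
    for package, tags in package_tags.items():
        for tag in tags:
            index[tag].add(package)
    return index

def find_by_tags(index, *tags):
    """Return the packages that carry every requested tag, sorted by name."""
    if not tags:
        return []
    matches = set.intersection(*(index.get(tag, set()) for tag in tags))
    return sorted(matches)

index = build_index(PACKAGE_TAGS)
print(find_by_tags(index, "terminal-emulator"))
print(find_by_tags(index, "terminal-emulator", "kde"))
```

A description search for "terminal" would also return the made-up `app-misc/example-tool`; the tag lookup does not, because that package carries a different tag.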
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On Mon, 24 Mar 2014 10:55:38 -0400 Damien Levac damien.le...@gmail.com wrote: A lot of people already replied to this question: package search. I didn't ask for an explanation on the mailing list. I quoted [1] because it needs to be more specific exactly where it needs to be more specific. The GLEP still doesn't explain properly why it exists in the first place. jer [1] https://wiki.gentoo.org/wiki/Package_Tags
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On Mon, 24 Mar 2014 10:55:38 -0400 Damien Levac damien.le...@gmail.com wrote: A lot of people already replied to this question: package search. Sure, but can you point to prior examples of this kind of stuff actually working? -- Ciaran McCreesh signature.asc Description: PGP signature
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On 14-03-24 12:28 PM, Ciaran McCreesh wrote: On Mon, 24 Mar 2014 10:55:38 -0400 Damien Levac damien.le...@gmail.com wrote: A lot of people already replied to this question: package search. Sure, but can you point to prior examples of this kind of stuff actually working? I have no example for package searching... However, it is used a lot in multimedia search, since it is traditional to give tags to videos, for example. If you want a funny example of tags, see the tags for anime on anidb.net; this allows users to easily find anime that contain elements they enjoy seeing. That being said, I am surprised that having no example showing it works should be a deal breaker for trying it out. Wouldn't that mindset kill innovation? Personally, I expect it to be not so great at the beginning, as the tags chosen will most likely not be the most clever ones on the first try, and to get more and more useful as the tag conventions get better. Such a feature could later be used to create applications like *gasp* an intuitive GUI interface to portage or for statistical analysis of packages... Damien
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On Mon, Mar 24, 2014 at 12:28 PM, Ciaran McCreesh ciaran.mccre...@googlemail.com wrote: On Mon, 24 Mar 2014 10:55:38 -0400 Damien Levac damien.le...@gmail.com wrote: A lot of people already replied to this question: package search. Sure, but can you point to prior examples of this kind of stuff actually working? eix -C allows you to search for categories. It's horrendously under-powered, but almost a useful prototype of what could be. Pandora uses this general concept with superb granularity for graphing similarities in music. That the MGP data is only used for a streaming service is depressing. Alternativeto.net is software oriented and has a good bit of this. Results? http://alternativeto.net/tag/tiling/ Bam. Tiling window managers. (These are almost certainly all user-sourced; notice the innocent misuse in that list.) The various Danbooru-style sites will generally show off impressive community-sourced rigour as well as proving the efficacy of alias/implication at scale. I have a lot of respect for their collective pep. Most are NSFW, but this one probably won't be (much): http://safebooru.org/ The Library of Congress? (The modern library is practically built on this sort of metadata.) Regards, Wyatt
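The alias/implication idea mentioned above can be illustrated with a small Python sketch. It is a guess at the general technique, not a description of how any of the named sites implement it; the alias table, the implication table, and every tag name are made up. Aliases fold different spellings into one canonical tag, and implications add broader tags automatically.

```python
# Hypothetical alias and implication tables; every tag name here is made up.
ALIASES = {"wm": "window-manager", "nuage": "cloud"}
IMPLICATIONS = {
    "tiling-wm": {"window-manager"},
    "window-manager": {"desktop"},
}

def normalize(tags):
    """Map aliases to canonical tags, then add implied tags transitively."""
    result = set()
    pending = [ALIASES.get(tag, tag) for tag in tags]
    while pending:
        tag = pending.pop()
        if tag in result:
            continue  # already handled; also guards against implication cycles
        result.add(tag)
        pending.extend(ALIASES.get(t, t) for t in IMPLICATIONS.get(tag, ()))
    return result

print(sorted(normalize(["tiling-wm"])))
print(sorted(normalize(["nuage", "wm"])))
```

With tables like these, someone tagging a package `wm` and someone else tagging it `window-manager` end up with the same searchable tag, which is the consistency problem raised elsewhere in this thread.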
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On Mon, 24 Mar 2014 13:31:43 -0400 Damien Levac damien.le...@gmail.com wrote: That being said, I am surprised that having no example showing it works should be a deal breaker for trying it out. Wouldn't that mindset kill innovation? I ask, because this isn't the first time tags have been proposed as the magic solution to everything. Each previous time, everyone has had a slightly different, incompatible idea of what tags are, what they're supposed to do, and how they're supposed to do it. So I would like to see someone explain in detail, and without glossing over the inconvenient technicalities, just how tags will help with searching. Such a feature could later be used to create applications like *gasp* an intuitive GUI interface to portage or for statistical analysis of packages... Could, maybe. But the current lack of intuitive GUI and the lack of statistical analysis both have absolutely nothing to do with not having tags... -- Ciaran McCreesh signature.asc Description: PGP signature
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On Sat, 22 Mar 2014 23:48:06 + hasufell hasuf...@gentoo.org wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA512 Alec Warner: https://wiki.gentoo.org/wiki/Package_Tags Object or forever hold your peace. Or argue for 100 posts, either way. Sounds good, but how do we get consistency in there? I mean... this only works if we have some sort of consensus about tag names, at least more common ones. By aggregating a global list of tag names; that way, when you tag a package you can look for tags on the global list that apply to it, and if it happens two different ways to name something were brought up you can also discuss it with one another. I don't think the inconsistency would become of a size to be concerned about; but yes, at the very least we need to watch out in the beginning to not let it happen... Though, choosing the right tag naming early on might be a need for this to succeed; maybe we can brainstorm some examples of how packages would be tagged, to get an idea about it. -- With kind regards, Tom Wijsman (TomWij) Gentoo Developer E-mail address : tom...@gentoo.org GPG Public Key : 6D34E57D GPG Fingerprint : C165 AF18 AB4C 400B C3D2 ABF0 95B2 1FCD 6D34 E57D
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On Sat, 22 Mar 2014 15:33:27 -0700 Alec Warner anta...@gentoo.org wrote: https://wiki.gentoo.org/wiki/Package_Tags Object or forever hold your peace. Or argue for 100 posts, either way. A possible problem with this would be whether many maintainers would be concerned enough to spend their time on this. By spending time thinking up tags to give to a package, you lose some time working on a bug. Adding some quick tags one can think of does something when you're busy; but I'm not sure if limited time would yield a good set of tags. Crowdsourcing, as brought forward[1] by rich0, could yield a far richer set of tags, together with a small bit of moderation for quality. [1]: http://article.gmane.org/gmane.linux.gentoo.devel/90693 -- With kind regards, Tom Wijsman (TomWij) Gentoo Developer E-mail address : tom...@gentoo.org GPG Public Key : 6D34E57D GPG Fingerprint : C165 AF18 AB4C 400B C3D2 ABF0 95B2 1FCD 6D34 E57D
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
Alec Warner dixit: https://wiki.gentoo.org/wiki/Package_Tags Without expecting to have any weight on the discussion, I just wanted to let you know: As a system maintainer I like to use the categories, e.g. when doing 'eix -I media-fonts/' or in package.use 'media-fonts/* X'.
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On Sun, 23 Mar 2014 00:04:08 + hasufell hasuf...@gentoo.org wrote: Ciaran McCreesh: On Sat, 22 Mar 2014 15:33:27 -0700 Alec Warner anta...@gentoo.org wrote: https://wiki.gentoo.org/wiki/Package_Tags And do what with them? Right now this is a solution without a problem. Finding packages. Descriptions are not consistent, categories too generic. Please explain, with examples, how tags will help with this. -- Ciaran McCreesh signature.asc Description: PGP signature
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
Ciaran McCreesh: On Sun, 23 Mar 2014 00:04:08 + hasufell hasuf...@gentoo.org wrote: Ciaran McCreesh: On Sat, 22 Mar 2014 15:33:27 -0700 Alec Warner anta...@gentoo.org wrote: https://wiki.gentoo.org/wiki/Package_Tags And do what with them? Right now this is a solution without a problem. Finding packages. Descriptions are not consistent, categories too generic. Please explain, with examples, how tags will help with this. When thinking about games, it is pretty obvious and common practice on almost any gaming platform/service like steam. You can't just say this game is an rpg... it may be a mix of genres. Tags may even identify features like multiplay. USE flags cannot deliver that, because there is no multiplay or 3rd-person flag for obvious reasons. Descriptions try to be short and give an idea what the game is about, not list all the possible genres or common search terms. It also works the other way around. There are a lot of applications that are scattered across multiple categories like terminal emulators or file managers. The user ends up searching the web or wiki instead of using portage tools which would be far more efficient. so we have:
* tags to extend categories, e.g. when a package might fit in more than one
* tags to group similar packages which are scattered across multiple categories
* tags for features or attributes that cannot be expressed by USE flags and cannot be guaranteed to be part of the description
* and so on
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On Sat, 22 Mar 2014 15:33:27 -0700 Alec Warner anta...@gentoo.org wrote: https://wiki.gentoo.org/wiki/Package_Tags This GLEP author would love to blight categories out of gentoo history as a giant mistake. Why? jer
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On 23/03/14 15:46, Jeroen Roovers wrote: This GLEP author would love to blight categories out of gentoo history as a giant mistake. It does not matter. Just remove that line. It is irrelevant. -- Alexander berna...@gentoo.org https://secure.plaimi.net/~alexander
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On Sun, Mar 23, 2014 at 2:45 AM, Tom Wijsman tom...@gentoo.org wrote: On Sat, 22 Mar 2014 23:48:06 + hasufell hasuf...@gentoo.org wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA512 Alec Warner: https://wiki.gentoo.org/wiki/Package_Tags Object or forever hold your peace. Or argue for 100 posts, either way. Sounds good, but how do we get consistency in there? I mean... this only works if we have some sort of consensus about tag names, at least more common ones. By aggregating a global list of tag names; that way, when you tag a package you can look for tags on the global list that apply to it, and if it happens two different ways to name something were brought up you can also discuss it with one another. I don't think the inconsistency would become of a size to be concerned about; but yes, at the very least we need to watch out in the beginning to not let it happen... Though, choosing the right tag naming early on might be a need for this to succeed; maybe we can brainstorm some examples of how packages would be tagged, to get an idea about it. This is basically the same problem with USE flags. Personally I also dislike global USE flags on multiple levels, so I'm not entirely interested in tag consistency. That being said, I wouldn't object to such a feature very strongly. I don't consider it a blocker to GLEP adoption, merely a concern that we can address later. -A -- With kind regards, Tom Wijsman (TomWij) Gentoo Developer E-mail address : tom...@gentoo.org GPG Public Key : 6D34E57D GPG Fingerprint : C165 AF18 AB4C 400B C3D2 ABF0 95B2 1FCD 6D34 E57D
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On Sun, Mar 23, 2014 at 2:57 AM, Tom Wijsman tom...@gentoo.org wrote: On Sat, 22 Mar 2014 15:33:27 -0700 Alec Warner anta...@gentoo.org wrote: https://wiki.gentoo.org/wiki/Package_Tags Object or forever hold your peace. Or argue for 100 posts, either way. A possible problem with this would be whether many maintainers would be concerned enough to spend their time on this. By spending time thinking up tags to give to a package, you lose some time working on a bug. Adding some quick tags one can think of does something when you're busy; but I'm not sure if limited time would yield a good set of tags. Crowdsourcing, as brought forward[1] by rich0, could yield a far richer set of tags, together with a small bit of moderation for quality. Crowdsourcing poses its own set of problems, most of which I'm not eager to design software around. I'd rather deploy the GLEP, wait 6 months, see if it failed, and if so, do something else (or nothing else, and simply repeal it). -A [1]: http://article.gmane.org/gmane.linux.gentoo.devel/90693 -- With kind regards, Tom Wijsman (TomWij) Gentoo Developer E-mail address : tom...@gentoo.org GPG Public Key : 6D34E57D GPG Fingerprint : C165 AF18 AB4C 400B C3D2 ABF0 95B2 1FCD 6D34 E57D
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
Alec Warner: On Sun, Mar 23, 2014 at 2:45 AM, Tom Wijsman tom...@gentoo.org wrote: so I'm not entirely interested in tag consistency What are they for then if I cannot efficiently use them to search for software? (which I cannot, if there is no consistency)
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
hasufell: Alec Warner: On Sun, Mar 23, 2014 at 2:45 AM, Tom Wijsman tom...@gentoo.org wrote: so I'm not entirely interested in tag consistency What are they for then if I cannot efficiently use them to search for software? (which I cannot, if there is no consistency) That said... if it's not about search terms, but rather about listing all tags of a single package, then it's pretty useless and people should rather use longdescription.
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
Dnia 2014-03-22, o godz. 15:33:27 Alec Warner anta...@gentoo.org napisał(a): https://wiki.gentoo.org/wiki/Package_Tags Object or forever hold your peace. Honestly, I don't think metadata.xml is a good place for it. While I like the consistency with general use of that file, I feel like it's going to make the application of tags more cumbersome, more noisy and make it harder to maintain consistency. As I see it, tags are not the same kind of package property as the description or package name. As I see it, current metadata.xml properties are somehow constant. They are usually set by the maintainer, do not change often and are strictly related only to the package. Tags, on the other hand, are more 'live'. They place the package somewhere in the 'global' tag hierarchy that can change over time. I expect that people other than maintainers will be adding tags to packages (and changing them), and that people will invent new tags and apply them to more packages. So, first of all, your solution would mean that every commit adding a new tag or changing one of the tags would modify the package metadata.xml. This means a Manifest update and a ChangeLog entry (please don't get into more rules for ChangeLogs now), and this means it will be harder to find actually useful entries there. So we make tag updates harder, and increase time and size of rsync. Secondly, since tags for every package will be held in different files, people will need dedicated tools to collect tags from all those files and add matching tags to their own packages. Long story short, we're going to have many 'duplicate' tags that will require even more commits with ChangeLog entries and Manifest updates. Worse than that, your GLEP doesn't even have any basic rules for naming tags -- like what language form to use and, say, which character to use instead of space. 
This sounds like the sort of things that's going to make it even harder to get some consistency, especially if some developers are going to follow someone else committing earlier and some will follow their own rules. I'd honestly prefer that -- if we should really keep tags in the tree -- to do that with a single 'metadata/tags' file or some kind of hierarchy there. Keep them outside the package directory -- bind packages to tags, rather than tags to packages. Keep all the commits in a single place without altering the ebuild work flow. -- Best regards, Michał Górny signature.asc Description: PGP signature
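The "bind packages to tags" layout proposed above might look roughly like the following Python sketch. The file format (one `tag: package package ...` line per tag, with `#` comments) is purely an assumption made for illustration; the message does not specify one, and `parse_tags_file` and `tags_for` are hypothetical helper names. The package names are examples only.

```python
def parse_tags_file(text):
    """Parse lines of the form 'tag: cat/pkg cat/pkg ...' into a dict of sets."""
    tag_to_packages = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        tag, sep, packages = line.partition(":")
        if not sep:
            raise ValueError(f"line {lineno}: expected 'tag: packages'")
        tag_to_packages.setdefault(tag.strip(), set()).update(packages.split())
    return tag_to_packages

def tags_for(tag_to_packages, package):
    """Return the sorted list of tags that a package is bound to."""
    return sorted(t for t, pkgs in tag_to_packages.items() if package in pkgs)

# Assumed contents of a single tree-wide tags file (format is hypothetical).
EXAMPLE = """\
# tag: packages bound to it
mail: kde-base/kmail mail-client/evolution
kde: kde-base/kmail
gnome: mail-client/evolution
"""

parsed = parse_tags_file(EXAMPLE)
print(tags_for(parsed, "kde-base/kmail"))
```

With this layout a tag change touches one file and no package directory, which is the workflow point made in the message.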
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
Michał Górny: On 2014-03-22 at 15:33:27, Alec Warner anta...@gentoo.org wrote: https://wiki.gentoo.org/wiki/Package_Tags Object or forever hold your peace. I'd honestly prefer that -- if we should really keep tags in the tree -- to do that with a single 'metadata/tags' file or some kind of hierarchy there. Keep them outside the package directory -- bind packages to tags, rather than tags to packages. Keep all the commits in a single place without altering the ebuild work flow. That sounds better. That way it is also easier to get some consistency. E.g. tags can be discussed... but adding packages to tags is up to the maintainers. The GLEP should maybe cover a basic set of tags. Then projects like games, science etc could add their sets as well which may be a bit more specific... instead of random maintainers adding random tags.
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On 03/23/2014 15:44, Michał Górny wrote: Dnia 2014-03-22, o godz. 15:33:27 Alec Warner anta...@gentoo.org napisał(a): https://wiki.gentoo.org/wiki/Package_Tags Object or forever hold your peace. Honestly, I don't think metadata.xml is a good place for it. While I like the consistency with general use of that file, I feel like it's going to make the application of tags more cumbersome, more noisy and make it harder to maintain consistency. As I see it, tags are not the same kind of package property as the description or package name. As I see it, current metadata.xml properties are somehow constant. They are usually set by the maintainer, do not change often and are strictly related only to the package. IMHO, metadata.xml is actually the best place to describe a package with tags, but I am not so sure it's the best place to define a tag. I guess if we automate the indexing of tags, much like how use.desc.local is generated from metadata.xml, then that might eliminate some of the maintenance overhead. The only way tags are going to work well is to keep the management of them as automated as possible. They should only be involved in searches for packages, and nothing else. E.g., hypothetical emerge command might be: emerge -T mail,client, which will show me all packages with the tag of 'mail' and 'client' (I didn't check emerge to see if -T already has a purpose, btw). And I think we should limit the number of tags allowed per package to a reasonable number. Maybe five tags maximum? StackOverflow is one example where they restrict questions to five tags. In addition, SO tries to suggest to you already-existing tags so that you reuse them instead of creating new ones all the time. Repoman could be extended to issue a warning when metadata.xml contains previously undefined tags and optionally display a match of similarly-named, existing tags (if only to catch misspellings, 'mial' or 'cleint' instead of 'mail' and 'client'). Tags, on the other hand, are more 'live'. 
They place the package somewhere in the 'global' tag hierarchy that can change over time. I expect that people other than maintainers will be adding tags to packages (and changing them), and that people will invent new tags and apply them to more packages. So, first of all, your solution would mean that every commit adding a new tag or changing one of the tags would modify the package metadata.xml. This means a Manifest update and a ChangeLog entry (please don't get into more rules for ChangeLogs now), and this means it will be harder to find actually useful entries there. So we make tag updates harder, and increase time and size of rsync. Instead of individual <tag> lines in metadata.xml for each tag, why not a single <tags> line that contains a comma-delimited list of up to five tags, whitespace optional? That should help reduce the fluff of the tree by adding this feature. E.g., <tags>one,two,three,four,five</tags> vs. <tag>one</tag> <tag>two</tag> <tag>three</tag> <tag>four</tag> <tag>five</tag> (36 bytes vs. 82 bytes) Secondly, since tags for every package will be held in different files, people will need dedicated tools to collect tags from all those files and add matching tags to their own packages. Long story short, we're going to have many 'duplicate' tags that will require even more commits with ChangeLog entries and Manifest updates. If we automate the generation of a master tag index file, like use.desc.local, this can be avoided. emerge can simply go rummage through the master index for matching tag entries instead of going through the entire tree. Because if we wanted to sift through the entire tree, grep would be a far better method (compiled C and probably better text-matching algorithms than emerge). Worse than that, your GLEP doesn't even have any basic rules for naming tags -- like what language form to use and, say, which character to use instead of space. 
This sounds like the sort of things that's going to make it even harder to get some consistency, especially if some developers are going to follow someone else committing earlier and some will follow their own rules. Easy: ASCII, alphanumeric only, must start with a letter, lowercase, no spaces. A lot of problems are avoided if we keep tags to one-word descriptors only. E.g., for mail clients, they would carry both 'mail' and 'client' as two of their five tags. For kmail, a third tag would be 'kde' and Evolution would have 'gnome' instead. I'd honestly prefer that -- if we should really keep tags in the tree -- to do that with a single 'metadata/tags' file or some kind of hierarchy there. Keep them outside the package directory -- bind packages to tags, rather than tags to packages. Keep all the commits in a single place without altering the ebuild work flow. While I definitely like the idea of a single, master file, I feel this could run away pretty quickly as it's continuously updated. For example, adding a new package,
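The naming rule proposed in this message (lowercase ASCII, alphanumeric, starting with a letter), the five-tag limit, and the repoman-style misspelling warning can be sketched together in Python. This is an assumption-laden illustration, not repoman code: `check_tags`, `TAG_RE`, and `MAX_TAGS` are hypothetical names, and the similarity matching uses Python's standard `difflib` rather than anything repoman is known to do.

```python
import difflib
import re

TAG_RE = re.compile(r"[a-z][a-z0-9]*")  # lowercase ASCII, starts with a letter
MAX_TAGS = 5  # the per-package limit floated in the message above

def check_tags(tags, known_tags):
    """Return a list of warning strings for one package's tags."""
    warnings = []
    if len(tags) > MAX_TAGS:
        warnings.append(f"too many tags: {len(tags)} > {MAX_TAGS}")
    for tag in tags:
        if not TAG_RE.fullmatch(tag):
            warnings.append(f"invalid tag name: {tag!r}")
        elif tag not in known_tags:
            # Suggest similarly spelled existing tags to catch typos.
            close = difflib.get_close_matches(tag, sorted(known_tags), n=3)
            hint = f" (did you mean {', '.join(close)}?)" if close else ""
            warnings.append(f"unknown tag: {tag!r}{hint}")
    return warnings

KNOWN = {"mail", "client", "kde", "gnome"}
for warning in check_tags(["mial", "client", "Mail Client"], KNOWN):
    print(warning)
```

The check only warns; whether an unknown tag should be rejected or accepted as a new tag is a policy question the thread leaves open.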
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On 2014-03-23 at 16:27:43, Joshua Kinard ku...@gentoo.org wrote: On 03/23/2014 15:44, Michał Górny wrote: Tags, on the other hand, are more 'live'. They place the package somewhere in the 'global' tag hierarchy that can change over time. I expect that people other than maintainers will be adding tags to packages (and changing them), and that people will invent new tags and apply them to more packages. So, first of all, your solution would mean that every commit adding a new tag or changing one of the tags would modify the package metadata.xml. This means a Manifest update and a ChangeLog entry (please don't get into more rules for ChangeLogs now), and this means it will be harder to find actually useful entries there. So we make tag updates harder, and increase time and size of rsync. Instead of individual <tag> lines in metadata.xml for each tag, why not a single <tags> line that contains a comma-delimited list of up to five tags, whitespace optional? That should help reduce the fluff of the tree by adding this feature. E.g., <tags>one,two,three,four,five</tags> Either use XML, or don't use XML. Don't make this some kind of ugly mixture of XML with non-XML. So: <tags> <tag>one</tag> <tag>two</tag> </tags> if we're really going for this. But I guess our DTD doesn't allow easy definition of single <tags/> with no forced position. Secondly, since tags for every package will be held in different files, people will need dedicated tools to collect tags from all those files and add matching tags to their own packages. Long story short, we're going to have many 'duplicate' tags that will require even more commits with ChangeLog entries and Manifest updates. If we automate the generation of a master tag index file, like use.desc.local, this can be avoided. emerge can simply go rummage through the master index for matching tag entries instead of going through the entire tree. 
Because if we wanted to sift through the entire tree, grep would be a far better method (compiled C and probably better text-matching algorithms than emerge). And this goes pretty much backwards to what we were aiming at. We should finally kill use.desc.local, not get inspired by the redundancy. Worse than that, your GLEP doesn't even have any basic rules for naming tags -- like what language form to use and, say, which character to use instead of space. This sounds like the sort of things that's going to make it even harder to get some consistency, especially if some developers are going to follow someone else committing earlier and some will follow their own rules. Easy: ASCII, alphanumeric only, must start with a letter, lowercase, no spaces. A lot of problems are avoided if we keep tags to one-word descriptors only. E.g., for mail clients, they would carry both 'mail' and 'client' as two of their five tags. For kmail, a third tag would be 'kde' and Evolution would have 'gnome' instead. I'm pretty sure you will finally hit something that goes with two words. Protocol name or something. I'd also suggest that 'all' be considered a default, global tag for all packages, it be a reserved tag internal to emerge and other package managers, and not count against the number of allowed tags (meaning that technically, a package is allowed five tags + 'all'). As for default tags when a package does not define any, the package category gets split at the hyphen and becomes two independent tags. This is overridden when at least one tag is defined in metadata.xml. Will this have a real benefit? Sounds like unnecessary confusion for a minor gain to me. -- Best regards, Michał Górny signature.asc Description: PGP signature
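The default-tag rule quoted above (a reserved 'all' tag, plus tags derived from the category when a package declares none) can be written down as a short Python sketch. The function name `effective_tags` and the constant `RESERVED_TAG` are hypothetical, and the behaviour is only one reading of the quoted suggestion, not anything emerge implements.

```python
RESERVED_TAG = "all"  # implicit tag; assumed not to count against the limit

def effective_tags(category, declared_tags):
    """Return the tags a package manager might see for a package in `category`."""
    if declared_tags:
        # At least one declared tag overrides the category-derived defaults.
        tags = set(declared_tags)
    else:
        # Fallback: split the category name at the hyphen.
        tags = set(category.split("-"))
    tags.add(RESERVED_TAG)
    return tags

print(sorted(effective_tags("media-fonts", [])))
print(sorted(effective_tags("mail-client", ["mail", "client", "kde"])))
```

Writing it out makes the objection in the reply concrete: a package's tag set changes completely the moment its first explicit tag is added, since the category-derived tags disappear.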
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On 03/23/2014 17:05, Michał Górny wrote: On 2014-03-23 at 16:27:43, Joshua Kinard ku...@gentoo.org wrote: On 03/23/2014 15:44, Michał Górny wrote: Tags, on the other hand, are more 'live'. They place the package somewhere in the 'global' tag hierarchy that can change over time. I expect that people other than maintainers will be adding tags to packages (and changing them), and that people will invent new tags and apply them to more packages. So, first of all, your solution would mean that every commit adding a new tag or changing one of the tags would modify the package metadata.xml. This means a Manifest update and a ChangeLog entry (please don't get into more rules for ChangeLogs now), and this means it will be harder to find actually useful entries there. So we make tag updates harder, and increase time and size of rsync. Instead of individual <tag> lines in metadata.xml for each tag, why not a single <tags> line that contains a comma-delimited list of up to five tags, whitespace optional? That should help reduce the fluff of the tree by adding this feature. E.g., <tags>one,two,three,four,five</tags> Either use XML, or don't use XML. Don't make this some kind of ugly mixture of XML with non-XML. So: <tags> <tag>one</tag> <tag>two</tag> </tags> if we're really going for this. But I guess our DTD doesn't allow easy definition of single <tags/> with no forced position. TBH, I don't like the use of XML at all. Never have and never will. I am a big fan of INI-style definitions (i.e., like Samba's config). XML just leads to a lot of unneeded fluff in what should be a really small file, which is why I was proposing a single <tags> element instead of multiple <tag> elements. E.g., for local USE, instead of this: <use> <flag name='foo'>FOO</flag> <flag name='bar'>BAR</flag> <flag name='baz'>BAZ</flag> </use> (96 bytes) This would be better: [local use] foo = FOO bar = BAR baz = BAZ (47 bytes) Not a complicated example, but would be 50% reduction in size. But, I digress... 
Secondly, since tags for every package will be held in different files, people will need dedicated tools to collect tags from all those files and add matching tags to their own packages. Long story short, we're going to have many 'duplicate' tags that will require even more commits with ChangeLog entries and Manifest updates. If we automate the generation of a master tag index file, like use.desc.local, this can be avoided. emerge can simply go rummage through the master index for matching tag entries instead of going through the entire tree. Because if we wanted to sift through the entire tree, grep would be a far better method (compiled C and probably better text-matching algorithms than emerge). And this goes pretty much backwards to what we were aiming at. We should finally kill use.desc.local, not get inspired by the redundancy. And what replaces it? What differentiates a global USE flag that has purpose across multiple packages (like 'ipv6') against a flag that only exists for a single package? I'll agree that USE flags have definitely gotten out of control, and the trend now seems to be moving sharply away from defining a global USE definition in make.conf instead to per-package USE flags in /etc/portage/package.use. Which, while offering more granular control, can be mind-numbingly annoying at times. The automated generation of use.local.desc definitely made maintenance of some things easier. We've gotta index USE flags some how, and separating them into global and local categories still makes sense to me. But, I'm probably just going senile... Worse than that, your GLEP doesn't even have any basic rules for naming tags -- like what language form to use and, say, which character to use instead of space. This sounds like the sort of things that's going to make it even harder to get some consistency, especially if some developers are going to follow someone else committing earlier and some will follow their own rules. 
Easy: ASCII, alphanumeric only, must start with a letter, lowercase, no spaces. A lot of problems are avoided if we keep tags to one-word descriptors only. E.g., for mail clients, they would carry both 'mail' and 'client' as two of their five tags. For kmail, a third tag would be 'kde' and Evolution would have 'gnome' instead. I'm pretty sure you will finally hit something that goes with two words. Protocol name or something. Perhaps, but we can fight that battle when we get there. Starting off with one-word tags keeps things simple for now and that'll make it easier to determine whether this experiment actually pans out or not. I'd also suggest that 'all' be considered a default, global tag for all packages, that it be a reserved tag internal to emerge and other package managers, and that it not count against the number of allowed tags (meaning that technically, a package is allowed five tags + 'all'). As for default tags when a package does
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On 23/03/2014 22:08, hasufell wrote: Michał Górny: Dnia 2014-03-22, o godz. 15:33:27 Alec Warner anta...@gentoo.org napisał(a): https://wiki.gentoo.org/wiki/Package_Tags Object or forever hold your peace. I'd honestly prefer that -- if we should really keep tags in the tree -- to do that with a single 'metadata/tags' file or some kind of hierarchy there. Keep them outside the package directory -- bind packages to tags, rather than tags to packages. Keep all the commits in a single place without altering the ebuild work flow. That sounds better. That way it is also easier to get some consistency. E.g. tags can be discussed... but adding packages to tags is up to the maintainers. The GLEP should maybe cover a basic set of tags. Then projects like games, science etc could add their sets as well which may be a bit more specific... instead of random maintainers adding random tags. Regular user/sysadmin chipping in: This topic seems a lot like a solution seeking a problem to solve, or alternatively a dev is looking for an easy way to describe stuff. Not that there's anything wrong with that, but the proposal as written is way too vague to be useful. Tags work best when they describe narrow, clearly defined attributes, and the thing they are applied to can have one, two or more of these attributes or sometimes even none. Music and movie genres are an excellent example - there are only so many of them and for the most part one can tell whether a tag really is a genre or not. Nothing resembling such limits are proposed in this GLEP, there's not even a recommendation of what the tags will describe or how everything will be tagged equally. What happens if someone zealously over-tags all of gnome and the same thing doesn't happen for kde? Does kde just not show up in tag searches anymore? So this just seems like a nice-to-have that hasn't been properly thought through. The main stated use of it is for packages that logically belong to more than one category. 
So instead of a general catch-all, do-whatever-you-want mechanism, let's rather solve that exact problem by, for example, adding a specific field to metadata, e.g. supplementary categories. Pick those that apply from a clearly defined list and store the data in a clearly defined place. Such a thing can be made more generic by making it a clear mechanism to describe extra metadata, where the things to be described go through a defined process before making it into the list. This concept is not present in the GLEP as currently written. -- Alan McKinnon alan.mckin...@gmail.com
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
Dnia 2014-03-23, o godz. 17:40:20 Joshua Kinard ku...@gentoo.org napisał(a): On 03/23/2014 17:05, Michał Górny wrote: Dnia 2014-03-23, o godz. 16:27:43 Joshua Kinard ku...@gentoo.org napisał(a): On 03/23/2014 15:44, Michał Górny wrote: Tags, on the other hand, are more 'live'. They place the package somewhere in the 'global' tag hierarchy that can change over time. I expect that people other than maintainers will be adding tags to packages (and changing them), and that people will invent new tags and apply them to more packages. So, first of all, your solution would mean that every commit adding a new tag or changing one of the tags would modify the package metadata.xml. This means a Manifest update and a ChangeLog entry (please don't get into more rules for ChangeLogs now), and this means it will be harder to find actually useful entries there. So we make tag updates harder, and increase time and size of rsync. Instead of individual tag lines in metadata.xml for each tag, why not a single tags line that contains a comma-delimited list of up to five tags, whitespace optional? That should help reduce the fluff of the tree by adding this feature. E.g., tagsone,two,three,four,five/tags Either use XML, or don't use XML. Don't make this some kind of ugly mixture of XML with non-XML. So: tags tagone/tag tagtwo/tag /tags if we're really going for this. But I guess our DTD doesn't allow easy definition of single tags/ with no forced position. TBH, I don't like the use of XML at all. Never have and never will. I am a big fan of INI-style definitions (i.e., like Samba's config). XML just leads to a lot of unneeded fluff in what should be a really small file, which is why I was proposing a single tags element instead of multiple tag elements. metadata.xml is XML at the moment, so you are supposed to obey its rules, whether you like them or not. if you want to replace it with something else, feel free to try. But don't make a shitsoup mixin out of it. 
Secondly, since tags for every package will be held in different files, people will need dedicated tools to collect tags from all those files and add matching tags to their own packages. Long story short, we're going to have many 'duplicate' tags that will require even more commits with ChangeLog entries and Manifest updates. If we automate the generation of a master tag index file, like use.desc.local, this can be avoided. emerge can simply go rummage through the master index for matching tag entries instead of going through the entire tree. Because if we wanted to sift through the entire tree, grep would be a far better method (compiled C and probably better text-matching algorithms than emerge). And this goes pretty much backwards to what we were aiming at. We should finally kill use.desc.local, not get inspired by the redundancy. And what replaces it? What differentiates a global USE flag that has purpose across multiple packages (like 'ipv6') against a flag that only exists for a single package? Applications are supposed to read metadata.xml for local flags. That's all about it. Having an extra index file doesn't really make sense there. Worse than that, your GLEP doesn't even have any basic rules for naming tags -- like what language form to use and, say, which character to use instead of space. This sounds like the sort of things that's going to make it even harder to get some consistency, especially if some developers are going to follow someone else committing earlier and some will follow their own rules. Easy: ASCII, alphanumeric only, must start with a letter, lowercase, no spaces. A lot of problems are avoided if we keep tags to one-word descriptors only. E.g., for mail clients, they would carry both 'mail' and 'client' as two of their five tags. For kmail, a third tag would be 'kde' and Evolution would have 'gnome' instead. I'm pretty sure you will finally hit something that goes with two words. Protocol name or something. 
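[Editor's sketch] The naming rule proposed above (ASCII, alphanumeric, leading letter, lowercase, no spaces, at most five tags) is small enough to express as a check. The following is only an illustration under those stated assumptions; the names TAG_RE, MAX_TAGS, is_valid_tag and check_tags are hypothetical and belong to no existing Portage tool.

```python
import re

# Assumed rule from the thread: a single lowercase ASCII word that starts
# with a letter and contains only letters and digits.
TAG_RE = re.compile(r"[a-z][a-z0-9]*")
MAX_TAGS = 5  # "up to five tags" per package, as suggested in the thread


def is_valid_tag(tag: str) -> bool:
    """Return True if the tag matches the proposed one-word rule."""
    return TAG_RE.fullmatch(tag) is not None


def check_tags(tags: list[str]) -> list[str]:
    """Return a list of problems; an empty list means the tags pass."""
    problems = [f"invalid tag: {t!r}" for t in tags if not is_valid_tag(t)]
    if len(tags) > MAX_TAGS:
        problems.append(f"too many tags: {len(tags)} > {MAX_TAGS}")
    return problems
```

Under this rule 'mail', 'client' and 'i18n' pass, while a two-word candidate such as 'power management' is rejected, which is exactly the case the objection about protocol names anticipates.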
Perhaps, but we can fight that battle when we get there. Starting off with one-word tags keeps things simple for now and that'll make it easier to determine whether this experiment actually pans out or not. If you introduce arbitrary limitations, people will either find a way around them (which means getting an even worse mess) or omit some tags. Either way, tags become less helpful. I'd also suggest that 'all' be considered a default, global tag for all packages, that it be a reserved tag internal to emerge and other package managers, and that it not count against the number of allowed tags (meaning that technically, a package is allowed five tags + 'all'). As for default tags when a package does not define any, the package category gets split at the hyphen and becomes two independent tags. This is overridden when at least one tag is defined in
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On 03/23/2014 17:51, Michał Górny wrote: Dnia 2014-03-23, o godz. 17:40:20 Joshua Kinard ku...@gentoo.org napisał(a): On 03/23/2014 17:05, Michał Górny wrote: Dnia 2014-03-23, o godz. 16:27:43 Joshua Kinard ku...@gentoo.org napisał(a): On 03/23/2014 15:44, Michał Górny wrote: Tags, on the other hand, are more 'live'. They place the package somewhere in the 'global' tag hierarchy that can change over time. I expect that people other than maintainers will be adding tags to packages (and changing them), and that people will invent new tags and apply them to more packages. So, first of all, your solution would mean that every commit adding a new tag or changing one of the tags would modify the package metadata.xml. This means a Manifest update and a ChangeLog entry (please don't get into more rules for ChangeLogs now), and this means it will be harder to find actually useful entries there. So we make tag updates harder, and increase time and size of rsync. Instead of individual tag lines in metadata.xml for each tag, why not a single tags line that contains a comma-delimited list of up to five tags, whitespace optional? That should help reduce the fluff of the tree by adding this feature. E.g., tagsone,two,three,four,five/tags Either use XML, or don't use XML. Don't make this some kind of ugly mixture of XML with non-XML. So: tags tagone/tag tagtwo/tag /tags if we're really going for this. But I guess our DTD doesn't allow easy definition of single tags/ with no forced position. TBH, I don't like the use of XML at all. Never have and never will. I am a big fan of INI-style definitions (i.e., like Samba's config). XML just leads to a lot of unneeded fluff in what should be a really small file, which is why I was proposing a single tags element instead of multiple tag elements. metadata.xml is XML at the moment, so you are supposed to obey its rules, whether you like them or not. if you want to replace it with something else, feel free to try. 
But don't make a shitsoup mixin out of it. I'm not proposing to change it now...bit too late for that. But if I ever come across a TARDIS on eBay, well... That said, Is XML that specific that every single atom has to be wrapped by an individual tag? A comma-separated list of values in its own XML tag is prohibited by the spec? I don't use XML often (if at all), so I am not familiar with its intrinsics. Secondly, since tags for every package will be held in different files, people will need dedicated tools to collect tags from all those files and add matching tags to their own packages. Long story short, we're going to have many 'duplicate' tags that will require even more commits with ChangeLog entries and Manifest updates. If we automate the generation of a master tag index file, like use.desc.local, this can be avoided. emerge can simply go rummage through the master index for matching tag entries instead of going through the entire tree. Because if we wanted to sift through the entire tree, grep would be a far better method (compiled C and probably better text-matching algorithms than emerge). And this goes pretty much backwards to what we were aiming at. We should finally kill use.desc.local, not get inspired by the redundancy. And what replaces it? What differentiates a global USE flag that has purpose across multiple packages (like 'ipv6') against a flag that only exists for a single package? Applications are supposed to read metadata.xml for local flags. That's all about it. Having an extra index file doesn't really make sense there. But they don't currently, do they? As far as I know, most everything parses the use.local.desc file. Wouldn't having portage apps read/parse every package's metadata.xml file introduce a lot of disk I/O to seek out those files across the entire tree? That would seem like a bigger step backwards if so. 
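[Editor's sketch] The trade-off discussed here (walk every metadata.xml on each query versus consult a generated index) can be made concrete. The code below shows one way such an index might be generated in a single pass. It assumes a hypothetical <tag> element inside metadata.xml and a <tree>/<category>/<package>/metadata.xml layout; the function name build_tag_index is made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from pathlib import Path


def build_tag_index(tree_root: Path) -> dict[str, list[str]]:
    """Map each tag to the sorted list of category/package names carrying it."""
    index = defaultdict(set)
    # Layout assumption: <tree_root>/<category>/<package>/metadata.xml
    for metadata in tree_root.glob("*/*/metadata.xml"):
        package = f"{metadata.parent.parent.name}/{metadata.parent.name}"
        root = ET.parse(metadata).getroot()
        # Hypothetical <tag> elements; the real schema was still undecided.
        for elem in root.iter("tag"):
            if elem.text and elem.text.strip():
                index[elem.text.strip()].add(package)
    return {tag: sorted(pkgs) for tag, pkgs in index.items()}
```

The walk touches every package directory once. Whether that cost is paid at query time or once at index-generation time is the point being argued in the message above.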
Worse than that, your GLEP doesn't even have any basic rules for naming tags -- like what language form to use and, say, which character to use instead of space. This sounds like the sort of things that's going to make it even harder to get some consistency, especially if some developers are going to follow someone else committing earlier and some will follow their own rules. Easy: ASCII, alphanumeric only, must start with a letter, lowercase, no spaces. A lot of problems are avoided if we keep tags to one-word descriptors only. E.g., for mail clients, they would carry both 'mail' and 'client' as two of their five tags. For kmail, a third tag would be 'kde' and Evolution would have 'gnome' instead. I'm pretty sure you will finally hit something that goes with two words. Protocol name or something. Perhaps, but we can fight that battle when we get there. starting off with one-word tags keeps things simple for now and that'll make it easier to determine whether this experiment actually pans out or not. If
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On 24 March 2014 11:54, Joshua Kinard ku...@gentoo.org wrote: That said, is XML that specific that every single atom has to be wrapped by an individual tag? A comma-separated list of values in its own XML tag is prohibited by the spec? I don't use XML often (if at all), so I am not familiar with its intrinsics. By nesting CSV inside XML, you've now got 2 formats to deal with instead of 1. In pure XML, you can get a properly decoded array of tag elements with a simple XPath query: //tag But with CSV-in-a-tag you have to extract the tag and subsequently parse it. So you're hand-implementing a parser to parse parts of XML that already convey data without needing to hand-parse. Which is more effort for everyone who touches the file, not less. Add to that automated ways to update the tags (again, having to implement a custom serialiser in addition to the custom parser) and it's just not worth the tiny amount of savings. Because really, if space efficiency were the #1 priority, we'd not be using XML at all, let alone XML with pesky whitespace indentation that consumes needless bytes. =) -- Kent
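[Editor's sketch] To illustrate the point about one-element-per-tag markup, here is how reading and updating tags might look with Python's standard xml.etree.ElementTree, which supports the limited XPath form `.//tag`. The <tags>/<tag> element names and the SAMPLE document are assumptions for illustration, not the actual metadata.xml schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: one <tag> element per tag inside a <tags> wrapper.
SAMPLE = (
    "<pkgmetadata><tags>"
    "<tag>mail</tag><tag>client</tag>"
    "</tags></pkgmetadata>"
)


def read_tags(xml_text: str) -> list[str]:
    """Return all tag values; the XML parser does the decoding."""
    root = ET.fromstring(xml_text)
    return [elem.text for elem in root.findall(".//tag")]


def add_tag(xml_text: str, new_tag: str) -> str:
    """Append a tag and serialise again without any custom writer."""
    root = ET.fromstring(xml_text)
    tags_elem = root.find("tags")
    if tags_elem is None:
        # No <tags> wrapper yet: create one under the root element.
        tags_elem = ET.SubElement(root, "tags")
    ET.SubElement(tags_elem, "tag").text = new_tag
    return ET.tostring(root, encoding="unicode")
```

Both reading and writing stay inside the XML library here, so no second format has to be parsed or emitted by hand.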
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On 03/23/2014 19:18, Kent Fredric wrote: On 24 March 2014 11:54, Joshua Kinard ku...@gentoo.org wrote: That said, Is XML that specific that every single atom has to be wrapped by an individual tag? A comma-separated list of values in its own XML tag is prohibited by the spec? I don't use XML often (if at all), so I am not familiar with its intrinsics. By nesting CSV inside XML, you've now got 2 formats to deal with instead of 1. In pure XML, you can get a properly decoded array of tag elements with a simple XPath query: //tag But with CSV-in-a-tag you have to extract the tag and subsequently parse it. I am probably thinking from a Python perspective then. All you have to do is grab the value of tags and then split it on the comma. No custom parsing needed, since that function is built into Python. I guess this might not be the case with other languages, though, and it really just adds to my distaste of XML as a format for metadata.xml in the first place. So you're hand implementing a parser to parse parts of XML that already convey data without needing to hand-parse. Which is more effort for everyone who touches the file, not less. Add to that automated ways to update the tags ( again, having to implement a custom serialiser in addition to the custom parser ) and its just not worth the tiny amount of savings. Because really, if space efficiency was #1 priority, we'd not be using XML at all, let alone XML with pesky whitespace indentation that consumes needless bytes. =) I guess I need to start looking for used TARDISes then... Thanks for the explanation. -- Joshua Kinard Gentoo/MIPS ku...@gentoo.org 4096R/D25D95E3 2011-03-28 The past tempts us, the present confuses us, the future frightens us. And our lives slip away, moment by moment, lost in that vast, terrible in-between. --Emperor Turhan, Centauri Republic
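[Editor's sketch] For comparison, the Python approach described in this message (grab the text of a single tags element and split it on commas) might look like the following. The <tags> element and the function name read_csv_tags are hypothetical; the stripping and empty-entry handling reflect the "whitespace optional" wording of the original proposal.

```python
import xml.etree.ElementTree as ET


def read_csv_tags(xml_text: str) -> list[str]:
    """Read a comma-delimited tag list from a single <tags> element."""
    root = ET.fromstring(xml_text)
    elem = root.find(".//tags")
    if elem is None or not elem.text:
        return []
    # Whitespace around commas is optional, so strip each part and
    # drop empty entries left by stray or doubled commas.
    return [part.strip() for part in elem.text.split(",") if part.strip()]
```

The split itself is a one-liner, but the whitespace and empty-entry rules are decisions every consumer of the file has to repeat, which is the "second format" cost raised in the reply being quoted.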
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On Sun, 23 Mar 2014 23:47:22 +0200 Alan McKinnon alan.mckin...@gmail.com wrote: Tags work best when they describe narrow, clearly defined attributes, and the thing they are applied to can have one, two or more of these attributes or sometimes even none. Music and movie genres are an excellent example - there are only so many of them and for the most part one can tell whether a tag really is a genre or not. There are more ways to search for music or a movie than by genre: What mood is it in? What are key elements of its plot or lyrics? Where does it take place? For which audience is it meant? What praise has it received? What kind of style is it made in? What is it based on? What is the attitude of it? What looks or effects does it have? Is it appropriate for children? Does it contain explicit things? Let's do this for movies. I'm looking for a ... ... serial killer (key element) that is scary (mood)? Carrie, Halloween, Saw, Scream, ... ... musical (genre) that makes one feel good (mood)? Aaja Nachle, Frozen, Grease, The Sound of Music, ... ... good versus evil (plot) based on comics (based on)? Batman, Sin City, Superman, The Avengers, ... ... goofy (attitude) hero (key element) where nothing goes right (plot)? Due Date, Fawlty Towers, Monty Python's Flying Circus, Mr Bean, ... These are results from an actual movie recommendation site; similarly, the same exists for music too, where you can for example look for a female American singer-songwriter singing catchy contemporary country. Getting back to Gentoo: when I look for some package, I want it to be lightweight, do audio recordings, organize these audio recordings and do effects on these audio recordings. So, I'll be looking for tags like lightweight, audio-recording, file-organization, sound-effects; if that's too broad, I can take two of them and test some of that. Thinking about the different types of things to search for; I'm thinking about ... ...
what the characteristics of the software are (light/heavy, new/old, extensible/modular/nonstandard, ...), ... what the software can do (record audio, organize files, ...), ... what category (browser, development, DAW software, utility, ...), ... what kind of interface the software has to me (CLI, GUI, ...), ... what interconnectivity the software has (internet, bluetooth, ...), ... and so on ... We could make a list of types (some already mentioned above) and a list of possible tags for that type to shape the tag system somewhat. -- With kind regards, Tom Wijsman (TomWij) Gentoo Developer E-mail address : tom...@gentoo.org GPG Public Key : 6D34E57D GPG Fingerprint : C165 AF18 AB4C 400B C3D2 ABF0 95B2 1FCD 6D34 E57D
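[Editor's sketch] The multi-tag lookup described above can be illustrated with plain set operations: an exact search requires every wanted tag, and a ranked search orders packages by how many wanted tags they carry. The package names and the functions search and rank are hypothetical placeholders, not part of any proposed tool.

```python
def search(index: dict[str, set[str]], wanted: set[str]) -> list[str]:
    """Return packages whose tag set contains every wanted tag."""
    return sorted(pkg for pkg, tags in index.items() if wanted <= tags)


def rank(index: dict[str, set[str]], wanted: set[str]) -> list[str]:
    """Return packages ordered by number of matching tags, best first."""
    scored = [(len(wanted & tags), pkg) for pkg, tags in index.items()]
    ordered = sorted(scored, key=lambda item: (-item[0], item[1]))
    # Packages that match none of the wanted tags are left out.
    return [pkg for score, pkg in ordered if score > 0]


# Hypothetical packages, for illustration only.
EXAMPLE_INDEX = {
    "pkg-a": {"lightweight", "audio-recording"},
    "pkg-b": {"audio-recording", "sound-effects", "file-organization"},
    "pkg-c": {"browser"},
}
```

When the full four-tag query returns nothing, the ranked form still surfaces the closest candidates, which matches the "take two of them and test" workflow described in the message.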
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On Sun, 23 Mar 2014 16:03:38 +0100 Alexander Berntsen berna...@gentoo.org wrote: On 23/03/14 15:46, Jeroen Roovers wrote: This GLEP author would love to blight categories out of gentoo history as a giant mistake. That's not what I wrote. It's a quotation. It does not matter. Just remove that line. It is irrelevant. The point in asking why it's there was to establish why the GLEP as a whole is relevant. In other words: it would be trivial[1] yet painstaking[2] to establish an alternative means of pointing the package manager at package targets, but why would we want to do it? Examples of where atoms fail and where tags do better could enlighten us. jer [1] Set up a PM wrapper that translates tags into atoms. [2] Set up a database of such translations, with a really easy fail-over to ordinary atoms where the database is incomplete.
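[Editor's sketch] The wrapper idea in footnotes [1] and [2] can be sketched in a few lines: look each target up in a tag database and fall back to treating it as an ordinary atom when no entry exists. The database contents and the name resolve_targets are illustrative assumptions; no such wrapper is described elsewhere in the thread.

```python
# Hypothetical translation database: tag -> list of package atoms.
TAG_DB = {
    "mta": ["mail-mta/postfix", "mail-mta/exim"],
}


def resolve_targets(targets: list[str], tag_db: dict[str, list[str]]) -> list[str]:
    """Expand known tags into atoms; pass everything else through unchanged."""
    atoms = []
    for target in targets:
        # Fail-over: a target missing from the database is assumed to be
        # an ordinary atom and handed to the package manager as-is.
        atoms.extend(tag_db.get(target, [target]))
    return atoms
```

A wrapper would then hand the returned list to the package manager. The sketch deliberately ignores the ambiguity of a tag that shares its name with a package.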
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On Sat, Mar 22, 2014 at 6:33 PM, Alec Warner anta...@gentoo.org wrote: https://wiki.gentoo.org/wiki/Package_Tags Ack, this had to happen on a weekend when I wasn't paying attention! And you beat me to it, too-- I was working on something in this vein, but wasn't quite satisfied with the design yet. Oh well. You're sort of on the right track, but there are some very important aspects missing that will make the whole thing collapse with their absence. (This thread has been in various places, but I frankly don't feel like finding the relevant snippets, so you get a text dump. Sorry about that.) The first thing missing is aliasing (most proposals for this sort of system miss this at first; don't feel too bad). There are many, many, many cases where you want more than one single tag query to resolve to the same canonical tag. The ability to define aliases that take care of this automatically is critical. In my notes on this, I had a global alias file, and users can have an /etc/portage/tag.alias. It's just text -- nothing special -- that defines antecedent = consequent relationships. This means the antecedent is _replaced_ by the consequent. As a quick example, cpp = c++ This also allows for simple changes to the canonical name. Second, implication is important for decreasing maintenance burden. An implication is an antecedent - consequent relationship where the consequent is automatically added if the antecedent is present. Unlike aliasing, the consequent doesn't _replace_ the antecedent. An example of this is acpi - power_management, because acpi is a distinct aspect of power management, and has value on its own. Over time, this significantly lowers the maintenance burden of an expanding vocabulary and tree. With that in place, I want to make something clear: consistency in the vocabulary is absolutely critical. I cannot overemphasise how important this is. 
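[Editor's sketch] The difference between the two relationships described above (an alias replaces the antecedent, an implication adds the consequent while keeping the antecedent) can be shown with a small resolver. The data structures and the name normalize are assumptions made for illustration; the message does not specify an implementation.

```python
def normalize(tags, aliases, implications):
    """Resolve aliases to canonical tags, then add implied tags."""
    resolved = set()
    pending = list(tags)
    while pending:
        tag = pending.pop()
        # Alias: follow the chain to the canonical name, replacing the
        # original. The 'seen' set guards against accidental alias cycles.
        seen = set()
        while tag in aliases and tag not in seen:
            seen.add(tag)
            tag = aliases[tag]
        if tag in resolved:
            continue
        resolved.add(tag)
        # Implication: keep the tag and queue whatever it implies.
        pending.extend(implications.get(tag, ()))
    return resolved
```

With aliases {'cpp': 'c++'} and implications {'acpi': ['power_management']}, the input {'cpp', 'acpi'} resolves to {'c++', 'acpi', 'power_management'}: 'cpp' disappears in favour of its canonical form, while 'acpi' is kept alongside the tag it implies.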
Adding tags without any sort of discipline leads to an unmaintainable vocabulary, which makes the whole thing as worthless as some people think. So there needs to be some sort of basic canonical list of tags with their descriptions, and yes, people should be expected to be rigorous in how they approach this. I've attached a rough draft of descriptions and aliases that I pulled together a while ago (analogous to /etc/portage/profiles/use.desc). This is where aliasing becomes essential, because it allows us to guarantee some amount of consistency. We're only human and can't be expected to cover every situation, but there's plenty of low-hanging fruit in this area. e.g.:

app = application  # Alias abbreviation to full tag
editors = editor  # Make plural - singular aliases standard where sensible.
                  # Rule of thumb 1: This is a(n)...
admin = administration  # Rule of thumb 2: This is a(n)... ...tool
backup = back-up  # Can use hyphenated forms
benchmark = benchmarking  # As with admin, only gerund form.
cdr = disk_authoring  # Spaces replaced with underscores at word boundaries
i18n = internationalisation  # Will need to come to a consensus on the s/z spelling and make some aliases.
cpp = c++  # Valid tags should be restricted to basic ASCII minus spaces (replaced with underscores) for our own sanity
.net = dotnet  # This could go either way, but the leading period makes my Unix blood distrust it.
gamedev = game_development  # games becomes ambiguous with game so prefer a more-clear form.
lang = language = programming_language  # Not to be confused with the i18n language support. Avoid confusion with clear naming
version_control = source_control = vcs  # Well known abbreviations can be used in place of their expansions
mail = email  # No sense not being clear
mail_server = mail_transfer_agent = mta  # Multiple aliases to the same thing are acceptable
nntp = {{newsreader usenet}}  # The braced notation denotes an intersection of two tags. Need to decide if this sort of alias is legal. I'm thinking no, honestly.
sys = system  # BUT it's in conflict with @system! Don't do that.
www = web  # These are all things that deal with the web specifically.
apache = apache_module  # Classes of packages that have their own categories are exactly why this is a good idea.

The above is just an excerpt copied directly from my notes on aliasing. Some other stuff:
- Query syntax and semantics can be addressed in greater detail later. There's some nice sugar to be had here.
- Likewise, tools. Something along the lines of quse and equery would be handy in support of this.
- Aliases for reasonable search terms are not a bad idea.
- I've stated at various points in the past, but categories are already tags after a fashion. They're
[gentoo-dev] RFC GLEP 1005: Package Tags
https://wiki.gentoo.org/wiki/Package_Tags Object or forever hold your peace. Or argue for 100 posts, either way. -A
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
Alec Warner: https://wiki.gentoo.org/wiki/Package_Tags Object or forever hold your peace. Or argue for 100 posts, either way. -A Sounds good, but how do we get consistency in there? I mean... this only works if we have some sort of consensus about tag names, at least more common ones.
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On Sat, 22 Mar 2014 15:33:27 -0700 Alec Warner anta...@gentoo.org wrote: https://wiki.gentoo.org/wiki/Package_Tags And do what with them? Right now this is a solution without a problem. -- Ciaran McCreesh
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
On Sat, Mar 22, 2014 at 7:48 PM, hasufell hasuf...@gentoo.org wrote: https://wiki.gentoo.org/wiki/Package_Tags Sounds good, but how do we get consistency in there? I mean... this only works if we have some sort of consensus about tag names, at least more common ones. The alternative to consistency/etc is crowdsourcing, but there is no way to get that in metadata.xml (unless you allow it to be edited by some more accessible service). The whole build-it-and-they-will-come approach tends to work better if you're doing crowdsourcing. As it stands this is more of a top-down approach, and that usually works better if the use scenarios are understood in advance. Rich
Re: [gentoo-dev] RFC GLEP 1005: Package Tags
Ciaran McCreesh: On Sat, 22 Mar 2014 15:33:27 -0700 Alec Warner anta...@gentoo.org wrote: https://wiki.gentoo.org/wiki/Package_Tags And do what with them? Right now this is a solution without a problem. Finding packages. Descriptions are not consistent, categories too generic.