Re: META.yml keywords
Smylers wrote:
> And I still reckon most humans are approximately appalling at picking appropriate keywords anyway. A system like you're proposing still requires an individual module's author to think of the right keywords and bother to do this, which is putting a single point of failure in the system.

This list already advises authors (those who ask) on appropriate names; it could readily name the top 10 keywords as well. Those who don't ask cannot be helped w/o undue complexity.

OTGH, it's not impractical for the many regulars here to moderate/gatekeep which modules are publicized. New uploads (i.e. new dists) result in a reply to the user, and a post to module-authors, flagging a new dist for name approval.

KISS. (Moderation may not be sufficiently simple.)
Re: META.yml keywords
Ken Williams writes:
> I think we could get far more mileage out of tuning the search engine better for the needs of perl-module searching,

You speak great sense. Now it's back to Simon's point about the source code for it not seeming to be available.

> Smylers, I think your proposal is interesting ...

Proposal is flattering to my message -- thinking out loud was more like it. I wasn't seriously expecting anybody to start implementing any of the things I said, but was just pointing out that there are many different ways of approaching the problem which don't involve keywords ...

Smylers
Re: META.yml keywords
On Sat, Jul 17, 2004 at 03:40:52PM +0200, A. Pagaltzis wrote:
> Which was exactly the purpose: to be able to make sure that the list with official keywords really does only contain official keywords, so a release tool can complain about misspellings f.ex. If you simply allow both in a single list, then netwrok will go unnoticed and make your module invisible to searches with the correct keyword.
>
> I don't think the existence of two lists should matter to the indexer -- official keywords in the freeform list should have the same value as official ones in the fixed keys list. That sort of defeats the above point, I guess, but a list for fixed keys only still helps those who want its benefits.
>
> It might suffice to have the release tool check the list and tell the user which keywords are official and which aren't, but I don't know if that is helpful enough -- I personally would like to be able to tell it to choke on all mistakes *except* those I specifically declared as known non-official ones.

The only benefit I can see is that of spell-checking, and that would be better done by an actual spell-checker. Isn't it important not to misspell any keywords, regardless of their officialness?

F
Re: META.yml keywords
* Randy W. Sims [EMAIL PROTECTED] [2004-07-17 12:45]:
> There is, however, another advantage to the category approach: Searching would likely be more consistent. It would help authors to place their modules so that they can be found with similar modules. It would also help ensure that users looking for a particular type of module will get back a result set that is likely to contain all/most of the modules of that type.

Why does it have to be either/or?

There could be two keyword lists, one with fixed keywords, and the other freeform. Their names would have to be chosen carefully to suggest this as the intended use (rather than filling both with the same keywords) -- maybe ``keywords'' and ``additional_keywords'' or something.

Regards,
-- 
Aristotle
If you can't laugh at yourself, you don't take life seriously enough.
Re: META.yml keywords
On Sat, Jul 17, 2004 at 01:32:36PM +0200, A. Pagaltzis wrote:
> * Randy W. Sims [EMAIL PROTECTED] [2004-07-17 12:45]:
> > There is, however, another advantage to the category approach: Searching would likely be more consistent. It would help authors to place their modules so that they can be found with similar modules. It would also help ensure that users looking for a particular type of module will get back a result set that is likely to contain all/most of the modules of that type.
>
> Why does it have to be either/or?
>
> There could be two keyword lists, one with fixed keywords, and the other freeform. Their names would have to be chosen carefully to suggest this as the intended use (rather than filling both with the same keywords) -- maybe ``keywords'' and ``additional_keywords'' or something.

I agree that if there is to be an official list of keywords then it shouldn't be either/or. The officials haven't regenerated the module list for 2 years; there's no reason to think that the keyword officials will stay up to date.

That said, I don't think having 2 lists is useful. The author should supply a single list of keywords. Those that are on the official list are on the official list, those that aren't aren't. The search engine/indexer will be far better at figuring that out than the module author. Otherwise you are just obliging the authors to keep track of the official list and move keywords around in their meta info as the official list changes.

It would be up to the search engine to perhaps give more weight to official keywords. The search engine could also maintain official synonyms so that postgres and pg are indexed together.

F
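The single-list idea above, where the indexer rather than the author decides what is official and folds synonyms together, could be sketched roughly as follows. This is only an illustration in Python: the official list, the synonym table, and the two weights are all made-up assumptions, since no such list or weighting was defined in the thread.

```python
# Hypothetical data: neither an official keyword list nor a synonym
# table exists in the META.yml spec; these are placeholders.
OFFICIAL = {"postgresql", "network"}
SYNONYMS = {"postgres": "postgresql", "pg": "postgresql"}

OFFICIAL_WEIGHT = 2.0   # assumed boost for official keywords
FREEFORM_WEIGHT = 1.0   # assumed weight for everything else


def index_keywords(keywords):
    """Map an author's single keyword list to {canonical keyword: weight}."""
    weights = {}
    for kw in keywords:
        # Fold case and synonyms so "pg" and "Postgres" index together.
        canonical = SYNONYMS.get(kw.lower(), kw.lower())
        weight = OFFICIAL_WEIGHT if canonical in OFFICIAL else FREEFORM_WEIGHT
        # Keep the highest weight if a keyword appears more than once.
        weights[canonical] = max(weights.get(canonical, 0.0), weight)
    return weights


print(index_keywords(["pg", "Postgres", "oscar"]))
```

The author never has to know which keywords are official; if the official list changes, only the indexer's tables change.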
Re: META.yml keywords
* Fergal Daly [EMAIL PROTECTED] [2004-07-17 14:56]:
> That said, I don't think having 2 lists is useful. The author should supply a single list of keywords. Those that are on the official list are on the official list, those that aren't aren't. The search engine/indexer will be far better at figuring that out than the module author. Otherwise you are just obliging the authors to keep track of the official list and move keywords around in their meta info as the official list changes.

Which was exactly the purpose: to be able to make sure that the list with official keywords really does only contain official keywords, so a release tool can complain about misspellings f.ex. If you simply allow both in a single list, then netwrok will go unnoticed and make your module invisible to searches with the correct keyword.

I don't think the existence of two lists should matter to the indexer -- official keywords in the freeform list should have the same value as official ones in the fixed keys list. That sort of defeats the above point, I guess, but a list for fixed keys only still helps those who want its benefits.

It might suffice to have the release tool check the list and tell the user which keywords are official and which aren't, but I don't know if that is helpful enough -- I personally would like to be able to tell it to choke on all mistakes *except* those I specifically declared as known non-official ones.

Regards,
-- 
Aristotle
If you can't laugh at yourself, you don't take life seriously enough.
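For illustration, the release-tool check described above might look something like this minimal Python sketch. The field names ``keywords'' and ``additional_keywords'' follow the suggestion made earlier in the thread; the official list itself and the function name are hypothetical.

```python
# Hypothetical official list; nothing like it is defined by the spec.
OFFICIAL_KEYWORDS = {"network", "database", "xml"}


def unofficial_in_fixed_list(meta):
    """Return entries of the fixed list that are not official keywords.

    `meta` is a dict standing in for parsed META.yml data. Entries under
    "additional_keywords" are the author's declared non-official
    keywords, so they are never reported.
    """
    return [kw for kw in meta.get("keywords", [])
            if kw not in OFFICIAL_KEYWORDS]


meta = {
    "keywords": ["netwrok", "database"],   # meant to be official only
    "additional_keywords": ["oscar"],      # known non-official
}
print(unofficial_in_fixed_list(meta))
```

A release tool could refuse to build the distribution while this list is non-empty, which catches the netwrok case without complaining about the freeform entries.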
Re: META.yml keywords
On Jul 17, 2004, at 7:39 AM, Smylers wrote:
> Warning: Wild conjecture and multiple unrelated crazy ideas ahead.
>
> This is becoming _way_ too complicated.

Agreed.

> And I still reckon most humans are approximately appalling at picking appropriate keywords anyway. A system like you're proposing still requires an individual module's author to think of the right keywords and bother to do this, which is putting a single point of failure in the system.

Agreed.

> However, improved CPAN searching would be welcome.

Agreed. Note that the primary contribution of search.cpan.org is that it has a really nice interface and lots of functionality. But its actual search algorithm still isn't very good (it just uses WAIT, IIRC) - I think we could get far more mileage out of tuning the search engine better for the needs of perl-module searching, and perhaps doing stuff like grouping the results by distribution, than we'd ever get this way. And we wouldn't need to add complexity to the modules themselves.

Smylers, I think your proposal is interesting but I don't think it's necessary for improving searching over CPAN modules.

-Ken
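As a rough sketch of what "grouping the results by distribution" could mean, the Python below collapses per-module search hits into one entry per distribution. It says nothing about how search.cpan.org actually works; the tuple layout, the scores, and the Example-* names are all invented for the illustration.

```python
from collections import defaultdict


def group_by_distribution(hits):
    """Group (module, distribution, score) hits by distribution.

    Returns a list of (distribution, [modules]) pairs. Distributions are
    ordered by their best-scoring module, and modules within each
    distribution by descending score.
    """
    grouped = defaultdict(list)
    for module, dist, score in hits:
        grouped[dist].append((score, module))

    ranked = sorted(grouped.items(),
                    key=lambda item: max(score for score, _ in item[1]),
                    reverse=True)
    return [(dist, [module for _, module in sorted(mods, reverse=True)])
            for dist, mods in ranked]


# Hypothetical search hits.
hits = [
    ("Example::Alpha::Util", "Example-Alpha", 0.4),
    ("Example::Beta", "Example-Beta", 0.7),
    ("Example::Alpha", "Example-Alpha", 0.9),
]
for dist, modules in group_by_distribution(hits):
    print(dist, modules)
```

The point is that a distribution with many matching modules takes up one slot in the results instead of crowding out other distributions.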
Re: META.yml keywords (was: Re: Finding prior art Perl modules)
On Jul 14, 2004, at 2:11 PM, Randy W. Sims wrote:
> The spec doesn't currently provide for keywords. I do think it would be a good idea, BUT I think it needs to be done in a way to avoid abuse.

I think maybe it would be better to put keywords right in the pod for the module, so they become part of the documentation too. This is similar to the way they often appear in academic papers, right after the abstract - and people often find it useful for pinpointing the subject matter.

-Ken
Re: META.yml keywords
On 7/14/2004 3:44 PM, Mark Stosberg wrote:
> On Wed, Jul 14, 2004 at 03:11:11PM -0400, Randy W. Sims wrote:
> > Fergal Daly wrote:
> > > Does META.yaml have a place for keywords?
> >
> > The spec doesn't currently provide for keywords. I do think it would be a good idea, BUT I think it needs to be done in a way to avoid abuse. I'd hate to see META.yml files grow by the kb as authors add every conceivable keyword they can think of and try to manipulate the search.
>
> The search algorithm could pay attention to the first X keywords and ignore the rest. Or at least, it could heavily weight the first few. I think this is part of how search engines prevent the same kind of abuse of the meta-tag keyword system. You can put as many keywords as you want into the list, but I think the search engines only really care about the first few.

That seems a reasonable approach to overcoming the abuse problem. There is, however, another advantage to the category approach: Searching would likely be more consistent. It would help authors to place their modules so that they can be found with similar modules. It would also help ensure that users looking for a particular type of module will get back a result set that is likely to contain all/most of the modules of that type.

> I would prefer something like this over the choosing from the fixed list idea. Having free-form tags is a feature I like on: http://del.icio.us/ It allows new classifications to spontaneously appear.

I will concede that there are definite advantages to the keyword approach.

Randy.
Re: META.yml keywords
On 7/16/2004 8:54 AM, Ken Williams wrote:
> On Jul 14, 2004, at 2:11 PM, Randy W. Sims wrote:
> > The spec doesn't currently provide for keywords. I do think it would be a good idea, BUT I think it needs to be done in a way to avoid abuse.
>
> I think maybe it would be better to put keywords right in the pod for the module, so they become part of the documentation too. This is similar to the way they often appear in academic papers, right after the abstract - and people often find it useful for pinpointing the subject matter.

Is it useful to have in the pod? It seems like meta information to me; i.e. it seems like it would only be useful to indexers. I am concerned that META.yml might become a huge document if everything gets put in there, but I think if we agree that it would be useful to have keywords/categories, then the best place is in META.yml. This also has the benefit that you can have a lightweight indexer that only has to look in one place.

Randy.
META.yml keywords (was: Re: Finding prior art Perl modules)
Fergal Daly wrote:
> Does META.yaml have a place for keywords?

The spec doesn't currently provide for keywords. I do think it would be a good idea, BUT I think it needs to be done in a way to avoid abuse. I'd hate to see META.yml files grow by the kb as authors add every conceivable keyword they can think of and try to manipulate the search.

As limiting and as clumsy as it seems, I think that if keywords are added then it should be from a limited set of keywords, i.e. more of a classification scheme, really, where modules can appear under multiple classifications.

Randy.
Re: META.yml keywords (was: Re: Finding prior art Perl modules)
On Jul 14, 2004, at 12:11, Randy W. Sims wrote:
> Fergal Daly wrote:
> > Does META.yaml have a place for keywords?
>
> As limiting and as clumsy as it seems, I think that if keywords are added then it should be from a limited set of keywords, i.e. more of a classification scheme, really, where modules can appear under multiple classifications.

Keywords are necessarily specific to the domain of the module, so I don't think that any global entity can designate an appropriate fixed set. For instance, my module Net::OSCAR implements the protocol used by AOL Instant Messenger, so I'd give it keywords [OSCAR, AIM, IM, AOL Instant Messenger, instant messenger, instant messaging, chat].
Re: META.yml keywords (was: Re: Finding prior art Perl modules)
* Randy W. Sims ml-perl at thepierianspring.org [2004/07/14 15:11]:
> Fergal Daly wrote:
> > Does META.yaml have a place for keywords?
>
> The spec doesn't currently provide for keywords.

Is anyone generating META.yaml files by hand? I thought they were all generated (and regenerated) by Module::Build/MakeMaker? How would that work in the case of keywords?

(darren)

-- 
I interpret advertising as damage and route around it.
Re: META.yml keywords (was: Re: Finding prior art Perl modules)
On Wed, Jul 14, 2004 at 03:11:11PM -0400, Randy W. Sims wrote:
> Fergal Daly wrote:
> > Does META.yaml have a place for keywords?
>
> The spec doesn't currently provide for keywords. I do think it would be a good idea, BUT I think it needs to be done in a way to avoid abuse. I'd hate to see META.yml files grow by the kb as authors add every conceivable keyword they can think of and try to manipulate the search.

The search algorithm could pay attention to the first X keywords and ignore the rest. Or at least, it could heavily weight the first few. I think this is part of how search engines prevent the same kind of abuse of the meta-tag keyword system. You can put as many keywords as you want into the list, but I think the search engines only really care about the first few.

I would prefer something like this over the choosing from the fixed list idea. Having free-form tags is a feature I like on: http://del.icio.us/ It allows new classifications to spontaneously appear.

Mark

-- 
Mark Stosberg, Principal Developer
[EMAIL PROTECTED]
Summersault, LLC
765-939-9301 ext 202
database driven websites
http://www.summersault.com/
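One way to read the "first X keywords" suggestion, sketched in Python: keep only the first few keywords and let the weight fall off with position. The cutoff of 5 and the halving factor are arbitrary assumptions, not anything proposed in the thread or used by a real search engine.

```python
def keyword_weights(keywords, top_n=5, decay=0.5):
    """Weight keywords by position and ignore everything past `top_n`.

    The first keyword gets weight 1.0, the next `decay`, then decay**2,
    and so on. A repeated keyword keeps the weight of its first
    occurrence, so repeating it gains the author nothing.
    """
    weights = {}
    for position, kw in enumerate(keywords[:top_n]):
        weights.setdefault(kw, decay ** position)
    return weights


# With top_n=2 only the first two keywords count at all.
print(keyword_weights(["chat", "aim", "im", "oscar"], top_n=2))
```

Padding the list with extra keywords then has no effect on search ranking, which removes the incentive to stuff META.yml.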
Re: META.yml keywords
Matthew Sachs wrote:
> On Jul 14, 2004, at 12:11, Randy W. Sims wrote:
> > Fergal Daly wrote:
> > > Does META.yaml have a place for keywords?
> >
> > As limiting and as clumsy as it seems, I think that if keywords are added then it should be from a limited set of keywords, i.e. more of a classification scheme, really, where modules can appear under multiple classifications.
>
> Keywords are necessarily specific to the domain of the module, so I don't think that any global entity can designate an appropriate fixed set. For instance, my module Net::OSCAR implements the protocol used by AOL Instant Messenger, so I'd give it keywords [OSCAR, AIM, IM, AOL Instant Messenger, instant messenger, instant messaging, chat].

Classification for a module would probably be something like:

Net :: Protocol
Communications :: Chat :: AOL Instant Messenger

(That last comes from sf.net's topic system)

With the classification above AND a good one line synopsis of the module (which is already part of META.yml) most, if not all, of your keywords are covered.

Randy.
Re: META.yml keywords (was: Re: Finding prior art Perl modules)
Mark Stosberg [EMAIL PROTECTED] writes:
> [...] The search algorithm could pay attention to the first X keywords and ignore the rest. Or at least, it could heavily weight the first few. I think this is part of how search engines prevent the same kind of abuse of the meta-tag keyword system. You can put as many keywords as you want into the list, but I think the search engines only really care about the first few.

My understanding is that nowadays, most search engines ignore keywords altogether, because they were so badly abused they became worthless.

ScottG.