Re: META.yml keywords

2004-07-18 Thread Jim Cromie
Smylers wrote:
And I still reckon most humans are approximately appalling at picking
appropriate keywords anyway.  A system like you're proposing still
requires an individual module's author to think of the right keywords
and bother to do this, which is putting a single-point of failure in the
system.
 

This list already advises authors (those that ask) on appropriate names;
it could readily name the top 10 keywords also.  Those that don't ask cannot
be helped without undue complexity.
OTGH, it's not impractical for the many regulars here to moderate/gatekeep
what modules are publicized.  New uploads (i.e. new dists) would result in a
reply to the user and a post to module-authors, flagging a new dist for name
approval.

KISS.  (moderation may not be sufficiently simple)



Re: META.yml keywords

2004-07-18 Thread Smylers
Ken Williams writes:

 I think we could get far more mileage out of tuning the search engine
 better for the needs of perl-module searching,

You speak great sense.  Now it's back to Simon's point about the source
code for it not seeming to be available.

 Smylers, I think your proposal is interesting ...

Proposal is flattering to my message -- thinking out loud was more
like it.  I wasn't seriously expecting anybody to start implementing any
of the things I said, but was just pointing out that there are many
different ways of approaching the problem which don't involve keywords
...

Smylers



Re: META.yml keywords

2004-07-18 Thread Fergal Daly
On Sat, Jul 17, 2004 at 03:40:52PM +0200, A. Pagaltzis wrote:
 Which was exactly the purpose: to be able to make sure that the
 list with official keywords really does only contain official
 keywords, so a release tool can complain about misspellings f.ex.
 If you simply allow both in a single list, then netwrok will go
 unnoticed and make your module invisible to searches with the
 correct keyword.
 
 I don't think the existence of two lists should matter to the
 indexer -- official keywords in the freeform list should have the
 same value as official ones in the fixed keys list. That sort of
 defeats the above point, I guess, but a list for fixed keys only
 still helps those who want its benefits.
 
 It might suffice to have the release tool check the list and tell
 the user which keywords are official and which aren't, but I
 don't know if that is helpful enough -- I personally would like
 to be able to tell it to choke on all mistakes *except* those I
 specifically declared as known non-official ones.

The only benefit I can see is that of spell-checking, and that would be
better done by an actual spell-checker. Isn't it important not to misspell any
keywords, regardless of their officialness?

F


Re: META.yml keywords

2004-07-17 Thread A. Pagaltzis
* Randy W. Sims [EMAIL PROTECTED] [2004-07-17 12:45]:
 There is, however, another advantage to the category approach:
 Searching would likely be more consistent. It would help
 authors to place their modules so that they can be found with
 similar modules. It would also help ensure that users looking
 for a particular type module will get back a result set that is
 likely to contain all/most of the modules of that type.

Why does it have to be either/or?

There could be two keyword lists, one with fixed keywords, and
the other freeform. Their names would have to be chosen carefully
to suggest this as the intended use (rather than filling both
with the same keywords) -- maybe ``keywords'' and
``additional_keywords'' or something.

Regards,
-- 
Aristotle
If you can't laugh at yourself, you don't take life seriously enough.


Re: META.yml keywords

2004-07-17 Thread Fergal Daly
On Sat, Jul 17, 2004 at 01:32:36PM +0200, A. Pagaltzis wrote:
 * Randy W. Sims [EMAIL PROTECTED] [2004-07-17 12:45]:
  There is, however, another advantage to the category approach:
  Searching would likely be more consistent. It would help
  authors to place their modules so that they can be found with
  similar modules. It would also help ensure that users looking
  for a particular type module will get back a result set that is
  likely to contain all/most of the modules of that type.
 
 Why does it have to be either/or?
 
 There could be two keyword lists, one with fixed keywords, and
 the other freeform. Their names would have to be chosen carefully
 to suggest this as the intended use (rather than filling both
 with the same keywords) -- maybe ``keywords'' and
 ``additional_keywords'' or something.

I agree that if there is to be an official list of keywords then it
shouldn't be either/or. The officials haven't regenerated the module list
for 2 years, there's no reason to think that the keyword officials will stay
up to date.

That said, I don't think having 2 lists is useful. The author should supply
a single list of keywords. Those that are on the official list are on the
official list, those that aren't aren't. The search engine/indexer will be
far better at figuring that out than the module author. Otherwise you are
just obliging the authors to keep track of the official list and move
keywords around in their meta info as the official list changes.

It would be up to the search engine to perhaps give more weight to official
keywords. The search engine could also maintain official synonyms so that
postgres and pg are indexed together.
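
A minimal sketch of that indexer-side idea, in Python purely for illustration. The official list, the synonym table and the weights are all assumptions; no such lists exist in any real CPAN tool.

```python
# Hypothetical data: an "official" keyword list and a synonym table.
OFFICIAL_KEYWORDS = {"postgresql", "network", "chat"}
SYNONYMS = {"postgres": "postgresql", "pg": "postgresql"}

OFFICIAL_WEIGHT = 2.0   # assumed boost for official keywords
FREEFORM_WEIGHT = 1.0   # assumed weight for everything else


def normalize(keyword):
    """Lower-case a keyword and map known synonyms to one canonical form."""
    key = keyword.strip().lower()
    return SYNONYMS.get(key, key)


def index_keywords(keywords):
    """Return {canonical keyword: weight} for one module's single keyword list."""
    weights = {}
    for raw in keywords:
        key = normalize(raw)
        weight = OFFICIAL_WEIGHT if key in OFFICIAL_KEYWORDS else FREEFORM_WEIGHT
        # If two spellings collapse to the same keyword, keep the higher weight.
        weights[key] = max(weights.get(key, 0.0), weight)
    return weights
```

The author supplies one list; the indexer alone decides what counts as official, so nothing in the module's meta info has to move when the official list changes.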

F



Re: META.yml keywords

2004-07-17 Thread A. Pagaltzis
* Fergal Daly [EMAIL PROTECTED] [2004-07-17 14:56]:
 That said, I don't think having 2 lists is useful. The author
 should supply a single list of keywords. Those that are on the
 official list are on the official list, those that aren't
 aren't. The search engine/indexer will be far better at
 figuring that out than the module author. Otherwise you are
 just obliging the authors to keep track of the official list
 and move keywords around in their meta info as the official
 list changes.

Which was exactly the purpose: to be able to make sure that the
list with official keywords really does only contain official
keywords, so a release tool can complain about misspellings f.ex.
If you simply allow both in a single list, then netwrok will go
unnoticed and make your module invisible to searches with the
correct keyword.

I don't think the existence of two lists should matter to the
indexer -- official keywords in the freeform list should have the
same value as official ones in the fixed keys list. That sort of
defeats the above point, I guess, but a list for fixed keys only
still helps those who want its benefits.

It might suffice to have the release tool check the list and tell
the user which keywords are official and which aren't, but I
don't know if that is helpful enough -- I personally would like
to be able to tell it to choke on all mistakes *except* those I
specifically declared as known non-official ones.
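
A rough sketch of such a release-tool check, in Python for illustration only. The official list and the function name are hypothetical; the fuzzy matching uses the standard library's difflib, which is one possible choice and not something proposed in this thread.

```python
import difflib

# Hypothetical official keyword list.
OFFICIAL_KEYWORDS = ["network", "database", "chat"]


def check_keywords(keywords, known_unofficial=()):
    """Return a list of complaints; an empty list means nothing to choke on.

    `known_unofficial` holds keywords the author has explicitly declared
    as non-official, so they are accepted without complaint.
    """
    allowed = set(known_unofficial)
    problems = []
    for kw in keywords:
        if kw in OFFICIAL_KEYWORDS or kw in allowed:
            continue
        # Suggest the closest official keyword, if any is reasonably similar.
        close = difflib.get_close_matches(kw, OFFICIAL_KEYWORDS, n=1)
        if close:
            problems.append(f"{kw!r} is not official; did you mean {close[0]!r}?")
        else:
            problems.append(f"{kw!r} is not official and not declared as known")
    return problems
```

Called as check_keywords(["netwrok", "OSCAR"], known_unofficial=["OSCAR"]), this complains only about netwrok and lets the declared keyword through.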

Regards,
-- 
Aristotle
If you can't laugh at yourself, you don't take life seriously enough.


Re: META.yml keywords

2004-07-17 Thread Ken Williams
On Jul 17, 2004, at 7:39 AM, Smylers wrote:
Warning:  Wild conjecture and multiple unrelated crazy ideas ahead.
This is becoming _way_ too complicated.
Agreed.

And I still reckon most humans are approximately appalling at picking
appropriate keywords anyway.  A system like you're proposing still
requires an individual module's author to think of the right keywords
and bother to do this, which is putting a single-point of failure in
the system.
Agreed.
However, improved CPAN searching would be welcome.
Agreed.  Note that the primary contribution of search.cpan.org is that 
it has a really nice interface and lots of functionality.  But its 
actual search algorithm still isn't very good (it just uses WAIT, IIRC) 
- I think we could get far more mileage out of tuning the search engine 
better for the needs of perl-module searching, and perhaps doing stuff 
like grouping the results by distribution, than we'd ever get this way. 
 And we wouldn't need to add complexity to the modules themselves.
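
To make "grouping the results by distribution" concrete, here is a small Python sketch. It assumes the search engine already yields (module, distribution, score) tuples; that shape and the function name are assumptions, not how search.cpan.org works.

```python
def group_by_distribution(hits):
    """Group search hits by distribution.

    hits: iterable of (module, distribution, score) tuples in any order.
    Returns a list of (distribution, [modules]) pairs, with distributions
    ranked by their best-scoring module and modules ranked by score.
    """
    groups = {}
    for module, dist, score in hits:
        groups.setdefault(dist, []).append((score, module))
    # Rank distributions by the best score among their modules.
    ranked = sorted(
        groups.items(),
        key=lambda item: max(score for score, _ in item[1]),
        reverse=True,
    )
    return [
        (dist, [module for _, module in
                sorted(members, key=lambda sm: sm[0], reverse=True)])
        for dist, members in ranked
    ]
```

A distribution with many matching modules then takes up one slot in the results instead of crowding out everything else.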

Smylers, I think your proposal is interesting but I don't think it's 
necessary for improving searching over CPAN modules.

 -Ken


Re: META.yml keywords (was: Re: Finding prior art Perl modules)

2004-07-16 Thread Ken Williams
On Jul 14, 2004, at 2:11 PM, Randy W. Sims wrote:
The spec doesn't currently provide for keywords. I do think it would 
be a good idea, BUT I think it needs to be done in a way to avoid 
abuse.
I think maybe it would be better to put keywords right in the pod for 
the module, so they become part of the documentation too.  This is 
similar to the way they often appear in academic papers, right after 
the abstract - and people often find it useful for pinpointing the 
subject matter.
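
For illustration, an indexer could pull such keywords out of the pod with a few lines of Python. This sketch assumes a "=head1 KEYWORDS" section holding a comma-separated list; that section name is hypothetical and not an existing pod convention.

```python
import re


def pod_keywords(pod_text):
    """Return the keywords listed under a '=head1 KEYWORDS' section, if any."""
    match = re.search(
        r"^=head1\s+KEYWORDS\s*\n(.*?)(?=^=\w|\Z)",
        pod_text,
        re.MULTILINE | re.DOTALL,
    )
    if not match:
        return []
    body = match.group(1)
    # Keywords may be separated by commas or spread over several lines.
    return [kw.strip() for kw in re.split(r"[,\n]", body) if kw.strip()]
```

Because the section is ordinary pod, the same text shows up in the rendered documentation for human readers.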

 -Ken


Re: META.yml keywords

2004-07-16 Thread Randy W. Sims
On 7/14/2004 3:44 PM, Mark Stosberg wrote:
On Wed, Jul 14, 2004 at 03:11:11PM -0400, Randy W. Sims wrote:
Fergal Daly wrote:

Does META.yaml have a place for keywords?
The spec doesn't currently provide for keywords. I do think it would be 
a good idea, BUT I think it needs to be done in a way to avoid abuse. 
I'd hate to see META.yml files grow by the kb as authors add every 
conceivable keyword they can think of and try to manipulate the search. 

The search algorithm could pay attention to the first X keywords and
ignore the rest. Or at least, it could heavily weight the first few.
I think this is part of how search engines prevent the same kind of
abuse of the meta-tag keyword system. You can put as many keywords as
you want into the list, but I think the search engines only really care
about the first few.
That seems a reasonable approach to overcoming the abuse problem. There 
is, however, another advantage to the category approach: Searching would 
likely be more consistent. It would help authors to place their modules 
so that they can be found with similar modules. It would also help 
ensure that users looking for a particular type module will get back a 
result set that is likely to contain all/most of the modules of that type.

I would prefer something like this over the choosing from the fixed list
idea.
Having free-form tags is a feature I like on: http://del.icio.us/
It allows new classifications to spontaneously appear.
I will concede that there are definite advantages to the keyword approach.
Randy.


Re: META.yml keywords

2004-07-16 Thread Randy W. Sims
On 7/16/2004 8:54 AM, Ken Williams wrote:
On Jul 14, 2004, at 2:11 PM, Randy W. Sims wrote:
The spec doesn't currently provide for keywords. I do think it would 
be a good idea, BUT I think it needs to be done in a way to avoid abuse.

I think maybe it would be better to put keywords right in the pod for 
the module, so they become part of the documentation too.  This is 
similar to the way they often appear in academic papers, right after the 
abstract - and people often find it useful for pinpointing the subject 
matter.
Is it useful to have in the pod? It seems like meta information to me; 
i.e. it seems like it would only be useful to indexers. I am concerned 
that META.yml might become a huge document if everything gets put in 
there, but I think if we agree that it would be useful to have 
keywords/categories, then the best place is in META.yml. This also has 
the benefit that you can have a lightweight indexer that only has to 
look in one place.

Randy.


META.yml keywords (was: Re: Finding prior art Perl modules)

2004-07-14 Thread Randy W. Sims
Fergal Daly wrote:
Does META.yaml have a place for keywords?
The spec doesn't currently provide for keywords. I do think it would be 
a good idea, BUT I think it needs to be done in a way to avoid abuse. 
I'd hate to see META.yml files grow by the kb as authors add every 
conceivable keyword they can think of and try to manipulate the search. 
As limiting and as clumsy as it seems, I think that if keywords are 
added then it should be from a limited set of keywords, i.e. more of a 
classification scheme, really, where modules can appear under multiple 
classifications.

Randy.


Re: META.yml keywords (was: Re: Finding prior art Perl modules)

2004-07-14 Thread Matthew Sachs
On Jul 14, 2004, at 12:11, Randy W. Sims wrote:
Fergal Daly wrote:
Does META.yaml have a place for keywords?
As limiting and as clumsy as it seems, I think that if keywords are 
added then it should be from a limited set of keywords, i.e. more of a 
classification scheme, really, where modules can appear under multiple 
classifications.
Keywords are necessarily specific to the domain of the module, so I 
don't think that any global entity can designate an appropriate fixed 
set.  For instance, my module Net::OSCAR implements the protocol used 
by AOL Instant Messenger, so I'd give it keywords [OSCAR, AIM, 
IM, AOL Instant Messenger, instant messenger, instant 
messaging, chat].


Re: META.yml keywords (was: Re: Finding prior art Perl modules)

2004-07-14 Thread darren chamberlain
* Randy W. Sims ml-perl at thepierianspring.org [2004/07/14 15:11]:
 Fergal Daly wrote:
 
 Does META.yaml have a place for keywords?
 
 The spec doesn't currently provide for keywords.

Is anyone generating META.yaml files by hand?  I thought they were all
generated (and regenerated) by Module::Build/MakeMaker?  How would that
work in the case of keywords?

(darren)

-- 
I interpret advertising as damage and route around it.




Re: META.yml keywords (was: Re: Finding prior art Perl modules)

2004-07-14 Thread Mark Stosberg
On Wed, Jul 14, 2004 at 03:11:11PM -0400, Randy W. Sims wrote:
 Fergal Daly wrote:
 
 Does META.yaml have a place for keywords?
 
 The spec doesn't currently provide for keywords. I do think it would be 
 a good idea, BUT I think it needs to be done in a way to avoid abuse. 
 I'd hate to see META.yml files grow by the kb as authors add every 
 conceivable keyword they can think of and try to manipulate the search. 

The search algorithm could pay attention to the first X keywords and
ignore the rest. Or at least, it could heavily weight the first few.

I think this is part of how search engines prevent the same kind of
abuse of the meta-tag keyword system. You can put as many keywords as
you want into the list, but I think the search engines only really care
about the first few.
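
One way to picture the weighting, as a Python sketch. The cutoff of five keywords and the halving factor are arbitrary assumptions chosen for illustration, not values anyone has proposed.

```python
def keyword_weights(keywords, full_weight_count=5, decay=0.5):
    """Weight keywords by position in the author's list.

    The first `full_weight_count` keywords get weight 1.0; each keyword
    after that is multiplied by `decay` once more. Setting decay to 0.0
    ignores everything past the cutoff.
    """
    weights = {}
    for position, kw in enumerate(keywords):
        key = kw.lower()
        if key in weights:
            continue  # repeating a keyword earns nothing extra
        extra = max(0, position - full_weight_count + 1)
        weights[key] = decay ** extra
    return weights
```

With this scheme, stuffing the list with every conceivable keyword buys almost nothing, because only the first few entries carry real weight.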

I would prefer something like this over the choosing from the fixed list
idea.

Having free-form tags is a feature I like on: http://del.icio.us/
It allows new classifications to spontaneously appear.

Mark

--
 . . . . . . . . . . . . . . . . . . . . . . . . . . . 
   Mark Stosberg            Principal Developer
   [EMAIL PROTECTED]        Summersault, LLC
   765-939-9301 ext 202     database driven websites
 . . . . . http://www.summersault.com/ . . . . . . . .


Re: META.yml keywords

2004-07-14 Thread Randy W. Sims
Matthew Sachs wrote:
On Jul 14, 2004, at 12:11, Randy W. Sims wrote:
Fergal Daly wrote:
Does META.yaml have a place for keywords?

As limiting and as clumsy as it seems, I think that if keywords are 
added then it should be from a limited set of keywords, i.e. more of a 
classification scheme, really, where modules can appear under multiple 
classifications.

Keywords are necessarily specific to the domain of the module, so I 
don't think that any global entity can designate an appropriate fixed 
set.  For instance, my module Net::OSCAR implements the protocol used by 
AOL Instant Messenger, so I'd give it keywords [OSCAR, AIM, IM, 
AOL Instant Messenger, instant messenger, instant messaging, chat].
Classification for a module would probably be something like:
Net :: Protocol
Communications :: Chat :: AOL Instant Messenger
(That last comes from sf.net's topic system)
With the classification above AND a good one line synopsis of the module 
(which is already part of META.yml) most, if not all, of your keywords 
are covered.
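
As a sketch of how such a scheme could be stored and queried (Python for illustration; the data structure and function name are assumptions), each module lists one or more classification paths and a search matches on any path prefix:

```python
# Each module may appear under several classification paths.
CLASSIFICATIONS = {
    "Net::OSCAR": [
        ("Net", "Protocol"),
        ("Communications", "Chat", "AOL Instant Messenger"),
    ],
}


def modules_under(prefix, classifications=CLASSIFICATIONS):
    """Return the modules filed under `prefix` or any of its sub-categories."""
    prefix = tuple(prefix)
    return sorted(
        module
        for module, paths in classifications.items()
        if any(path[:len(prefix)] == prefix for path in paths)
    )
```

Looking up ("Communications", "Chat") finds Net::OSCAR, and so does ("Net",), which is the "modules can appear under multiple classifications" property.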

Randy.


Re: META.yml keywords (was: Re: Finding prior art Perl modules)

2004-07-14 Thread Scott W Gifford
Mark Stosberg [EMAIL PROTECTED] writes:

[...]

 The search algorithm could pay attention to the first X keywords and
 ignore the rest. Or at least, it could heavily weight the first few.
 
 I think this is part of how search engines prevent the same kind of
 abuse of the meta-tag keyword system. You can put as many keywords as
 you want into the list, but I think the search engines only really care
 about the first few.

My understanding is that nowadays, most search engines ignore keywords
altogether, because they were so badly abused they became worthless.

ScottG.

