-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Platonides wrote:
> Farkas, Illes wrote:
>> Dear All,
>>
>> Is the dump file containing the page abstracts for Yahoo produced by
>> humans or by machines?
>>
>> Thanks
> 
> It's produced by a machine, extracting the beginning of all articles 
> (which are human-created).

It's a machine attempting to pull the first two sentences of the article
as plaintext, sometimes more successfully than others. :)
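
To make the idea concrete, here is a minimal sketch in Python. It is not the
ActiveAbstract code (that extension is PHP and also has to strip wiki markup
first). The function name first_sentences and the boundary rule are
assumptions made purely for illustration, and the input is assumed to be
plaintext already.

```python
import re


def first_sentences(text, count=2):
    """Return up to `count` leading sentences of already-plain text.

    Hypothetical helper for illustration only; not part of any extension.
    """
    # Collapse whitespace runs so hard line breaks do not matter.
    flat = " ".join(text.split())
    # Naive boundary rule (an assumption): '.', '!' or '?' followed by a
    # single space and an uppercase ASCII letter starts a new sentence.
    parts = re.split(r"(?<=[.!?]) (?=[A-Z])", flat)
    return " ".join(parts[:count])


if __name__ == "__main__":
    print(first_sentences("Foo is a bar. It was made in 1999. It is old."))
    # An abbreviation fools the naive rule: "Dr." counts as a sentence.
    print(first_sentences("Dr. Smith was born in Ohio. He moved later."))
```

The second call shows the "sometimes more successfully than others" part:
the abbreviation is counted as a sentence, so only one real sentence comes
back instead of two.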

I'm not sure these files are actually still being used, though.

You can find the code in:
http://svn.wikimedia.org/viewvc/mediawiki/trunk/extensions/ActiveAbstract/


But I think the newer code here, which pulls just the first sentence, is
more reliable (it requires a current MediaWiki with the new parser):
http://svn.wikimedia.org/viewvc/mediawiki/trunk/extensions/OpenSearchXml/

- -- brion
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.8 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkklvocACgkQwRnhpk1wk458QgCfQythKEvXp9ssRsILQOejNQ09
bWoAn31APe3W773YkBTy2UuKOE2drQJ9
=MGM8
-----END PGP SIGNATURE-----

_______________________________________________
MediaWiki-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/mediawiki-l
