Thanks a bunch to everyone who chimed in here. These hints brought us
forward quite a bit!

Bob

On Wed, Sep 21, 2016 at 12:50 AM, Giuseppe Profiti
<profgiuse...@gmail.com> wrote:
> [forwarding my answer from the analytics ml; I forgot to subscribe to this
> list too]
>
> Hi Robert,
> one solution may be to use a query on Wikidata to retrieve the name
> for the stubs category in all the different languages. Then you could
> use a tool like PetScan to retrieve all the pages in such categories,
> or write your own tool by using either a query on the database or
> the MediaWiki API.
> You can find a sample solution here:
> http://paws-public.wmflabs.org/paws-public/3270/Stub%20categories.ipynb
>
> I wrote that thing while on a train, so it may be messy and/or sub-optimal.
> I would like to thank Alex Monk and Yuvi Panda for their help with SQL
> on paws today.
>
> Best,
> Giuseppe
>
> 2016-09-20 11:26 GMT+02:00 Robert West <w...@cs.stanford.edu>:
>> Hi everyone,
>>
>> Does anyone know if there's a straightforward (ideally language-independent)
>> way of identifying stub articles in Wikipedia?
>>
>> Whatever works is ok, whether it's publicly available data or data
>> accessible only on the WMF cluster.
>>
>> I've found lists for various languages (e.g., Italian or English), but the
>> lists are in different formats, so separate code is required for each
>> language, which doesn't scale.
>>
>> I guess in the worst case, I'll have to grep for the respective stub
>> templates in the respective wikitext dumps, but even this requires knowing
>> for each language what the respective template is. So if anyone could point
>> me to a list of stub templates in different languages, that would also be
>> appreciated.
>>
>> Thanks!
>> Bob
>>
>> --
>> Up for a little language game? -- http://www.unfun.me
>>
>> _______________________________________________
>> Analytics mailing list
>> analyt...@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>
>
> _______________________________________________
> Wiki-research-l mailing list
> Wiki-research-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
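
For readers of the archive, the two steps Giuseppe describes (look up the
stub category's name in every language on Wikidata, then list the pages in
each of those categories through the MediaWiki API) can be sketched roughly
as follows. This is not the code from the linked notebook. It is a minimal
sketch that uses the Wikidata wbgetentities API rather than a query, and the
function names, the USER_AGENT string and the category_qid parameter are
made up for illustration. The Wikidata item ID of the stub category is not
given in this thread, so it has to be looked up and passed in.

```python
import json
import urllib.parse
import urllib.request

WIKIDATA_API = "https://www.wikidata.org/w/api.php"
# Hypothetical User-Agent string; replace it with one that identifies you.
USER_AGENT = "stub-category-sketch/0.1 (example script)"


def fetch_json(api_url, params):
    """GET a MediaWiki-style API endpoint and decode the JSON reply."""
    url = api_url + "?" + urllib.parse.urlencode(params)
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def sitelink_params(category_qid):
    """Parameters asking Wikidata for all sitelinks (with URLs) of one item."""
    return {
        "action": "wbgetentities",
        "ids": category_qid,  # assumption: the caller knows the item ID
        "props": "sitelinks/urls",
        "format": "json",
    }


def parse_sitelinks(reply, category_qid):
    """Map each Wikipedia host name to the local title of the category."""
    sitelinks = reply["entities"][category_qid].get("sitelinks", {})
    categories = {}
    for link in sitelinks.values():
        host = urllib.parse.urlparse(link["url"]).netloc
        # Keep Wikipedia editions only; skip Commons, Wikiquote, and so on.
        if host.endswith(".wikipedia.org"):
            categories[host] = link["title"]
    return categories


def category_member_params(category_title, continuation=None):
    """Parameters for one batch of article-namespace category members."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": category_title,
        "cmnamespace": "0",
        "cmlimit": "500",
        "format": "json",
    }
    # The API returns a "continue" object that is sent back unchanged.
    params.update(continuation or {})
    return params


def iter_category_members(host, category_title, fetch=fetch_json):
    """Yield the titles of all articles directly inside one category."""
    continuation = None
    while True:
        reply = fetch(
            f"https://{host}/w/api.php",
            category_member_params(category_title, continuation),
        )
        for page in reply["query"]["categorymembers"]:
            yield page["title"]
        continuation = reply.get("continue")
        if not continuation:
            break
```

To use it, call fetch_json(WIKIDATA_API, sitelink_params(qid)), pass the
reply to parse_sitelinks, and then run iter_category_members once per
host/title pair. Note that this lists direct members only. On wikis that
sort stubs into topic subcategories, the subcategories would have to be
walked as well.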



-- 
Up for a little language game? -- http://www.unfun.me

