Thanks again for all the feedback. Since I'm short on time, I went ahead and submitted a task to the Query Service:
https://jira.toolserver.org/browse/DBQ-140

I don't know where this goes from here, but if anyone has any suggestions, please share.

Thanks,
Jim

On Mon, May 2, 2011 at 9:12 AM, Jim Hutchinson <[email protected]> wrote:

> On Fri, Apr 29, 2011 at 7:58 AM, Manish Goregaokar <[email protected]> wrote:
>
>>> 1. Select 200 random articles.
>>> 2. Get the top contributors for each of them.
>>> 3. Get the edit counts for those contributors.
>>
>> I think he has the list(s) of 200 articles, and does not want random ones.
>> Plus, he doesn't want the editcounts, he wants their top edited articles,
>> with the editcount per article.
>>
>> My personal opinion is that this HAS to be done via php (though I can't
>> comment on server load).
>> Use php-mysql to determine the list of top contributors per given article,
>> then loop for each contributor, and give *his* top edited articles...
>> Shouldn't be hard, though you might want to clarify what you mean by "top".
>> (Top 3? More than X edits? More than X% edits per day/week/month/beginning
>> of time? More than X% edits of the top editor?).
>
> Thanks again for the info. Yes, this is basically correct. I am looking to
> collect this info based on 100 articles from the Wikipedia science series.
> If the data proves relatively easy to collect, I would like to collect data
> on all articles in the science series, which is around 200 articles. Top
> contributors for me are those with 10 or more edits in the sampled article
> from the science series. For the sake of clarity, here is a short sample of
> the data I'm looking for.
>
> From the "science" article http://en.wikipedia.org/wiki/Science
>
> Clicking "view history" and then "contributors" gives a list of all
> contributors ranked by number of edits.
>
> http://toolserver.org/~daniel/WikiSense/Contributors.php?wikilang=en&wikifam=.wikipedia.org&grouped=on&page=Science
>
> The top three editors (let's call them A, B, and C) currently have 445, 73
> and 70 edits respectively. Clicking on contributor "A" to see their user
> page and then "user contributions" from the tool box shows all their
> edits. For example, he/she has several edits to the articles "intelligent
> design" and "southern poverty law center", etc., and user "B" has edits to
> "rock formations" and "human evolution". I would like to count the frequency
> of all these edits across the top users for the sampled (e.g. science)
> articles, sorted by article title.
>
> I don't know what the best way to arrange the data would be, but below is a
> Google Doc Spreadsheet that sort of shows what I think it would look like.
>
> http://goo.gl/VIWd6
>
> If the Query Service seems the best approach (is this done using the
> php-mysql referenced above or is it a different process?) then I will go
> ahead and create a task on https://jira.toolserver.org/browse/DBQ. If this
> is not the best or correct way to go, any guidance is appreciated.
>
> Thanks.
>
> --
> Jim
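
For reference, the two-step process described in the quoted message (find the contributors with 10 or more edits to a sampled article, then count each of those contributors' edits per article) could be sketched roughly as below. This is only an illustration in Python on made-up in-memory data, not the Query Service's method or the Toolserver schema: the function names, the (user, article) tuple layout, and the sample edit counts are all assumptions chosen for the example.

```python
from collections import Counter, defaultdict

MIN_EDITS = 10  # "top contributor" threshold stated in the thread


def top_contributors(revisions, article, min_edits=MIN_EDITS):
    """Return {user: edit_count} for users with at least min_edits edits to article."""
    counts = Counter(user for user, title in revisions if title == article)
    return {user: n for user, n in counts.items() if n >= min_edits}


def edits_by_article(revisions, users):
    """Return {user: Counter({article: edit_count})} for the given users."""
    result = defaultdict(Counter)
    for user, title in revisions:
        if user in users:
            result[user][title] += 1
    return dict(result)


def contributor_profile(revisions, sampled_articles, min_edits=MIN_EDITS):
    """Map each sampled article to its top contributors' per-article edit counts."""
    profile = {}
    for article in sampled_articles:
        top = top_contributors(revisions, article, min_edits)
        profile[article] = edits_by_article(revisions, set(top))
    return profile


# Hypothetical sample data: one (user, article) tuple per edit.
revisions = (
    [("A", "Science")] * 12
    + [("A", "Intelligent design")] * 3
    + [("B", "Science")] * 10
    + [("B", "Human evolution")] * 4
    + [("C", "Science")] * 2  # below the threshold, so C is excluded
)

profile = contributor_profile(revisions, ["Science"])
for user, counts in sorted(profile["Science"].items()):
    for title, n in sorted(counts.items()):  # sorted by article title
        print(user, title, n)
```

With the sample data above, users A and B qualify as top contributors to "Science" and C does not. A real run would replace the hypothetical `revisions` list with rows pulled from the wiki database.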
_______________________________________________
Toolserver-l mailing list ([email protected])
https://lists.wikimedia.org/mailman/listinfo/toolserver-l
Posting guidelines for this list: https://wiki.toolserver.org/view/Mailing_list_etiquette
