A plain text file would be best, with one article title per line.
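
To make the format concrete, here is a minimal Python sketch of how such a list could be read. It is only an illustration, not the actual tool's code: the function names are hypothetical, and it assumes the file is UTF-8 and that underscores and spaces in titles are interchangeable.

```python
from pathlib import Path


def parse_titles(text):
    """Return unique, non-empty article titles, one per input line, in order."""
    seen = set()
    titles = []
    for raw in text.splitlines():
        # Assumption: "Michael_Jackson" and "Michael Jackson" name the same page.
        title = raw.strip().replace("_", " ")
        if title and title not in seen:
            seen.add(title)
            titles.append(title)
    return titles


def read_titles(path):
    """Read a one-title-per-line file (assumed UTF-8, optional BOM)."""
    return parse_titles(Path(path).read_text(encoding="utf-8-sig"))
```

Blank lines and repeated titles are skipped, so the list does not need to be cleaned up before it is posted.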

On Tue, Sep 9, 2014 at 10:43 AM, Navino Evans <[email protected]> wrote:

> Most definitely! That would be absolutely fantastic.
>
> What format of list would be most useful for you to work with?
>
> On 9 Sep 2014 15:38, "John" <[email protected]> wrote:
>
>> It's not that big of a deal once I set the system up. Is it possible to
>> have you post the list in a static location on your webserver? I could
>> then just have the bot grab and use that list.
>>
>> On Tue, Sep 9, 2014 at 10:35 AM, Navino Evans <[email protected]> wrote:
>>
>>> That's great to know, thank you.
>>>
>>> We'll make sure we only use the API within that limit - basically just
>>> for individual calls when a user adds a new event to our database.
>>>
>>> For the bulk processing, we would need to update the backlinks
>>> information as a monthly maintenance task, so I wouldn't want to trouble
>>> you with this each time.
>>>
>>> Would you rather we stick with data dump processing for the large-scale
>>> stuff?
>>>
>>> On 9 Sep 2014 15:05, "John" <[email protected]> wrote:
>>>
>>>> If you want a report on that many pages, drop me a list of those titles
>>>> and I can write a report for you, given that volume of affected pages.
>>>>
>>>> I would say 1-2 seconds between queries should be reasonable for a
>>>> moderate volume of queries. Any large-scale request I will do server-side,
>>>> to avoid hammering the web servers for something that is better batched.
>>>>
>>>> On Tue, Sep 9, 2014 at 9:58 AM, Navino Evans <[email protected]> wrote:
>>>>
>>>>> Once again, a huge thank you for taking the time to do this, John -
>>>>> that's exactly what I was looking for!
>>>>> - the helpfulness of this community never ceases to amaze me :)
>>>>>
>>>>> Hopefully I haven't initiated a journey down the rabbit hole into a
>>>>> fully fledged multi-language counting machine ;)
>>>>>
>>>>> Can I just ask what the limit of reasonable use would be for making
>>>>> API calls to this new tool (e.g. number of calls per day)?
>>>>>
>>>>> It would be incredibly useful if we could use it to update the events
>>>>> in our database once a month (we are using it to rank historical events
>>>>> by 'importance'), but we already have approximately 1.5 million events,
>>>>> so I am aware this may be way beyond what would be acceptable.
>>>>>
>>>>> On Tue, Sep 9, 2014 at 2:56 PM, John <[email protected]> wrote:
>>>>>
>>>>>> That's doable; however, it will require a little more time, as I need
>>>>>> to unearth some old code to handle multiple projects/languages.
>>>>>>
>>>>>> On Tue, Sep 9, 2014 at 9:51 AM, Jan Ainali <[email protected]> wrote:
>>>>>>
>>>>>>> Awesome, John!
>>>>>>>
>>>>>>> Now I only wish that one could specify a language code as well ;)
>>>>>>>
>>>>>>> Kind regards,
>>>>>>> Jan Ainali
>>>>>>>
>>>>>>> Executive Director, Wikimedia Sverige
>>>>>>> <http://se.wikimedia.org/wiki/Huvudsida>
>>>>>>> 0729 - 67 29 48
>>>>>>>
>>>>>>> Imagine a world where every human being has free access to the sum
>>>>>>> of human knowledge. That is what we do.
>>>>>>> Become a member. <http://blimedlem.wikimedia.se>
>>>>>>>
>>>>>>> 2014-09-09 15:34 GMT+02:00 John <[email protected]>:
>>>>>>>
>>>>>>>> Per request, it's no-frills but what you asked for:
>>>>>>>> http://tools.wmflabs.org/betacommand-dev/cgi-bin/backlinks
>>>>>>>>
>>>>>>>> On Tue, Sep 9, 2014 at 8:32 AM, Navino Evans <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> That is fantastic news... I'm incredibly grateful for the help and
>>>>>>>>> advice.
>>>>>>>>>
>>>>>>>>> On Tue, Sep 9, 2014 at 1:27 PM, John <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Given the overhead of the API, and that he only needs a count,
>>>>>>>>>> getting that info should be fairly easy via a Python CGI wrapper
>>>>>>>>>> around an SQL query.
>>>>>>>>>>
>>>>>>>>>> The only thing that I cannot do is #3, since the software does not
>>>>>>>>>> differentiate between links in templates and links not in templates.
>>>>>>>>>> It's been a requested feature for years now.
>>>>>>>>>>
>>>>>>>>>> Give me a few hours and I'll get you the tool you want. This
>>>>>>>>>> should be less than 30 minutes' work.
>>>>>>>>>>
>>>>>>>>>> On Tue, Sep 9, 2014 at 7:55 AM, Jan Ainali <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Related tip: in the API you can get a list of backlinks (but you
>>>>>>>>>>> have to count them yourself) from the main namespace, including all
>>>>>>>>>>> redirects, with a query like this:
>>>>>>>>>>>
>>>>>>>>>>> https://en.wikipedia.org/w/api.php?action=query&list=backlinks&format=json&bltitle=Example&blnamespace=0&blfilterredir=all&bllimit=250&blredirect=
>>>>>>>>>>>
>>>>>>>>>>> More info at: https://www.mediawiki.org/wiki/API:Backlinks
>>>>>>>>>>>
>>>>>>>>>>> Kind regards,
>>>>>>>>>>> Jan Ainali
>>>>>>>>>>>
>>>>>>>>>>> Executive Director, Wikimedia Sverige
>>>>>>>>>>> <http://se.wikimedia.org/wiki/Huvudsida>
>>>>>>>>>>> 0729 - 67 29 48
>>>>>>>>>>>
>>>>>>>>>>> Imagine a world where every human being has free access to the sum
>>>>>>>>>>> of human knowledge. That is what we do.
>>>>>>>>>>> Become a member. <http://blimedlem.wikimedia.se>
>>>>>>>>>>>
>>>>>>>>>>> 2014-09-09 13:41 GMT+02:00 Navino Evans <[email protected]>:
>>>>>>>>>>>
>>>>>>>>>>>> Wow!
>>>>>>>>>>>> That would be awesome :)
>>>>>>>>>>>>
>>>>>>>>>>>> The API we are looking for can be as simple as sending a GET
>>>>>>>>>>>> request to a URL
>>>>>>>>>>>> (http://www.somewhere.com/api/count?t=wikipedia_title_goes_here),
>>>>>>>>>>>> returning a number in "text/plain" format.
>>>>>>>>>>>>
>>>>>>>>>>>> The actual count that we're interested in is for English Wikipedia
>>>>>>>>>>>> only, and would ideally cover the following, all added up into a
>>>>>>>>>>>> single number:
>>>>>>>>>>>>
>>>>>>>>>>>> 1) All links from articles in the Main namespace only (for our
>>>>>>>>>>>> purpose it would be better not to include links from User pages,
>>>>>>>>>>>> Talk pages etc. if possible)
>>>>>>>>>>>>
>>>>>>>>>>>> 2) Including links from Redirect pages (e.g. counting a link
>>>>>>>>>>>> from the "Michel Jackson" redirect as part of the count for the
>>>>>>>>>>>> article "Michael Jackson")
>>>>>>>>>>>>
>>>>>>>>>>>> 3) Excluding links that are within a template transcluded in an
>>>>>>>>>>>> article (so we don't need to count the links inside Navboxes
>>>>>>>>>>>> within an article, for example)
>>>>>>>>>>>>
>>>>>>>>>>>> 4) For our purpose, it doesn't really matter whether
>>>>>>>>>>>> transclusions of the actual page that is called are included in
>>>>>>>>>>>> the count (we generally won't be using it for checking templates,
>>>>>>>>>>>> timeline and list articles).
>>>>>>>>>>>>
>>>>>>>>>>>> Just to give the full picture for this request - my use of
>>>>>>>>>>>> this tool will be for a company (www.histropedia.com), so I
>>>>>>>>>>>> wouldn't want to take up your time with this unless it's something
>>>>>>>>>>>> you feel should be available for wider use.
>>>>>>>>>>>> My plan was to get the developer working on our site to make
>>>>>>>>>>>> this tool for the community if it didn't exist somewhere, but we
>>>>>>>>>>>> would be reliant on data dumps, so could not get live information
>>>>>>>>>>>> (which would be incredibly useful for us and, I hope, many others).
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Sep 8, 2014 at 8:10 PM, John <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> What numbers/data do you want? I can whip up a replacement for
>>>>>>>>>>>>> it.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Monday, September 8, 2014, Navino Evans <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Does anyone know if there is a tool currently available for
>>>>>>>>>>>>>> counting backlinks to Wikipedia articles via an API? I have
>>>>>>>>>>>>>> been using this tool
>>>>>>>>>>>>>> http://dispenser.homenet.org/~dispenser/cgi-bin/backlinkscount.py
>>>>>>>>>>>>>> - but it seems to have finally gone offline completely following
>>>>>>>>>>>>>> some recent controversy with user:Dispenser.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Any advice much appreciated!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Navino
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Histropedia
>>>>>>>>>>>> The Timeline for all of History
>>>>>>>>>>>> www.histropedia.com
>>>>>>>>>>>>
>>>>>>>>>>>> Follow us on:
>>>>>>>>>>>> Twitter <https://twitter.com/Histropedia>
>>>>>>>>>>>> Facebook <https://www.facebook.com/Histropedia>
>>>>>>>>>>>> Google+ <https://plus.google.com/u/0/b/104484373317792180682/104484373317792180682/posts>
>>>>>>>>>>>> LinkedIn <http://www.linkedin.com/company/histropedia-ltd>
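
For anyone who does want to count client-side with the API query Jan quotes in the thread, the following is a rough Python sketch, not the code behind the tool. Assumptions are labelled in the comments: the function names, the User-Agent string and the 1.5 second default delay (picked from the 1-2 second guideline in the thread) are illustrative, and the handling of the nested `redirlinks` entries reflects my understanding of the response shape when `blredirect` is set. It counts distinct linking pages, redirect pages included, and cannot exclude links that come from templates (point 3 in the thread).

```python
import json
import time
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def http_fetch(params):
    """Perform one API request and return the decoded JSON response."""
    url = API_URL + "?" + urllib.parse.urlencode(params)
    # Hypothetical User-Agent; replace it with real contact details.
    request = urllib.request.Request(
        url, headers={"User-Agent": "backlink-count-sketch/0.1"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def count_backlinks(title, fetch=http_fetch, delay=1.5):
    """Count distinct main-namespace pages linking to `title`.

    Pages that link through a redirect are included; the redirect pages
    themselves are counted as well.
    """
    params = {
        "action": "query",
        "list": "backlinks",
        "format": "json",
        "bltitle": title,
        "blnamespace": "0",
        "blfilterredir": "all",
        "bllimit": "250",
        "blredirect": "",
    }
    page_ids = set()
    cont = {}
    while True:
        data = fetch({**params, **cont})
        for page in data.get("query", {}).get("backlinks", []):
            page_ids.add(page["pageid"])
            # Assumption: links via a redirect are nested under "redirlinks".
            for linked in page.get("redirlinks", []):
                page_ids.add(linked["pageid"])
        cont = data.get("continue")
        if not cont:
            return len(page_ids)
        # Pause between requests to stay within the suggested rate.
        time.sleep(delay)
```

Collecting page ids in a set avoids double counting when a redirect shows up in more than one batch. The `fetch` parameter exists so the loop can be exercised without touching the live servers.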
_______________________________________________
Labs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/labs-l
