Re: [Wikitech-l] Reason for actionthrottledtext API blocking?

2020-01-13 Thread Gergő Tisza
On Mon, Jan 13, 2020 at 6:57 AM Baskauf, Steven James <
steve.bask...@vanderbilt.edu> wrote:

> The other thing that is different about what I'm doing and what is being
> done by the other user who is not encountering this problem is that I'm
> authenticating directly by establishing a session when the script starts
> (lines 347-349).


As Brad said, you should use OAuth:
https://www.mediawiki.org/wiki/Manual:Pywikibot/OAuth/Wikimedia
It won't help with the throttling, but it's simpler and more secure (and if
you do use PAWS, it should work without any setup).

> Eventually I will probably apply for a bot flag, but I doubt that this bot
> will ever be autonomous, so is that really necessary?


While the ultimate authority on this is always the community of the given
wiki (via its bureaucrats, or a bot approval committee in some cases), IMO
semi-autonomous tools are the ones where a human reviews every edit, which
does not seem to be the case here.
In any case, it is necessary if you want to make several dozen edits a
minute and are not in any other group which has the noratelimit right.


> Would it matter if I used my own account instead of a separate bot account?
>

Depends on the exact limit you've hit, but probably not unless you are an
administrator on that wiki.
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Reason for actionthrottledtext API blocking?

2020-01-13 Thread Brad Jorsch (Anomie)
On Mon, Jan 13, 2020 at 9:57 AM Baskauf, Steven James <
steve.bask...@vanderbilt.edu> wrote:

> 1. The bot account doesn't have a bot flag.
> https://www.wikidata.org/wiki/Wikidata:Requests_for_permissions/Bot/DanmicholoBot
> implies that the lack of a bot flag might be the problem. But the script is
> not an autonomous bot and is still under development, so I didn't think a
> bot flag was necessary under those circumstances.
>

This is relevant only in that the bot group includes the "noratelimit"
right, which bypasses rate limiting in many cases.


> 2. I'm not running the script from paws.wmflabs.org .  Is it a problem
> that the API calls are coming from an unknown IP address? Does that matter?
> 3. I'm not using Pywikibot.  But that shouldn't be necessary.
>

Neither of these should be relevant.

On the other hand, a relevant factor is that only certain actions are
subject to the rate limits, and different actions have different limits.
The other user you discussed it with may have been using different actions
that aren't limited, or that have limits he wasn't hitting.
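One way to see which limits apply to an account is to ask the API itself
with meta=userinfo and uiprop=ratelimits|rights. The sketch below only
builds the query parameters and interprets a response; the function names
are hypothetical, and the response shape (a "ratelimits" mapping of action
to group to hits/seconds, plus a "rights" list) is an assumption based on
the userinfo module's documented output. Treating noratelimit as "no
delay needed" is a simplification, since that right bypasses limits in
many cases but not necessarily all.

```python
def userinfo_params():
    """Query parameters asking the API for the caller's rate limits and rights."""
    return {
        "action": "query",
        "meta": "userinfo",
        "uiprop": "ratelimits|rights",
        "format": "json",
    }


def min_interval(userinfo, action):
    """Return the seconds to wait between calls of `action`.

    `userinfo` is assumed to be the response's query.userinfo object.
    Returns 0.0 if the account holds noratelimit, None if no limit is
    listed for the action, otherwise the strictest seconds-per-hit value.
    """
    if "noratelimit" in userinfo.get("rights", []):
        return 0.0
    limits = userinfo.get("ratelimits", {}).get(action, {})
    intervals = [
        limit["seconds"] / limit["hits"]
        for limit in limits.values()
        if limit.get("hits")
    ]
    return max(intervals) if intervals else None
```

The values returned depend entirely on what the wiki reports for the
account; nothing here is specific to Wikidata.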


> The other thing that is different about what I'm doing and what is being
> done by the other user who is not encountering this problem is that I'm
> authenticating directly by establishing a session when the script starts
> (lines 347-349).  The other user apparently somehow doesn't have to
> authenticate when using Pywikibot as long as he's logged in to
> paws.wmflabs.org .  But I haven't dug in to find out how Pywikibot works,
> so I don't really understand how that is possible or whether that's
> important in establishing that the bot is legitimate and not a vandal/spam
> bot.
>

He may be using OAuth. See
https://www.mediawiki.org/wiki/OAuth/For_Developers and
https://www.mediawiki.org/wiki/OAuth/Owner-only_consumers for some details.

-- 
Brad Jorsch (Anomie)
Senior Software Engineer
Wikimedia Foundation

[Wikitech-l] Reason for actionthrottledtext API blocking?

2020-01-13 Thread Baskauf, Steven James
I am working on a script to load and update Vanderbilt researcher records in 
Wikidata via the API (see 
https://github.com/HeardLibrary/linked-data/tree/master/publications for 
information and 
https://github.com/HeardLibrary/linked-data/blob/master/publications/process_csv_metadata_full.py
 for the relevant script).

The script is working adequately; however, after about 25 writes I get blocked 
with an actionthrottledtext error with the message "As an anti-abuse measure, 
you are limited from performing this action too many times in a short space of 
time, and you have exceeded this limit.\nPlease try again in a few minutes."  I 
am able to avoid being blocked if I delay three seconds between writes, but 
with a delay of only one second I am still blocked.
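For illustration, the fixed delay between writes can be expressed as a
small pacing helper. This is a minimal sketch, not the code in the linked
script: the class name WritePacer and its parameters are hypothetical, and
the clock and sleep functions are injectable only so the behaviour can be
checked without actually waiting.

```python
import time


class WritePacer:
    """Enforce a minimum interval between successive write calls."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous write, None before the first

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling pacer.wait() immediately before each write, with
WritePacer(3.0), would reproduce the three-second spacing described
above while not adding delay when the script is already slower than that.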

I am observing maxlag (lines 282-313) and am sending a User-Agent header, 
VanderBot/0.1 (steve.bask...@vanderbilt.edu).  I'm making the API calls using 
the bot account https://www.wikidata.org/wiki/User:VanderBot, which does not 
have a bot flag.
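As a rough sketch of what observing maxlag and sending a User-Agent
involves (this is not the code at lines 282-313 of the script): the
helper names below are hypothetical, the default of maxlag=5 is the value
commonly recommended in the MediaWiki manual, and the error code "maxlag"
with an accompanying Retry-After header is assumed from the API's
documented behaviour.

```python
def build_request(params, user_agent, maxlag=5):
    """Return (params, headers) with maxlag and a User-Agent header added."""
    merged = dict(params)
    merged["maxlag"] = str(maxlag)
    merged["format"] = "json"
    headers = {"User-Agent": user_agent}
    return merged, headers


def maxlag_retry_delay(body, retry_after=None, default=5.0):
    """Return seconds to wait if `body` is a maxlag error, else None.

    `body` is the decoded JSON response; `retry_after` is the raw value
    of the Retry-After response header, if the server sent one.
    """
    if body.get("error", {}).get("code") != "maxlag":
        return None
    try:
        return float(retry_after)
    except (TypeError, ValueError):
        # Header missing or not a plain number: fall back to the default.
        return default
```

A caller would sleep for the returned delay and resend the same request
when the result is not None, and otherwise treat the response as final.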

I've been talking to another user who is running a script from paws.wmflabs.org 
using Pywikibot without a bot flag and he has not encountered this error.  So I 
am wondering what I am doing wrong that is causing this error and what I can do 
to stop being blocked.  The possible causes I can imagine are:

1. The bot account doesn't have a bot flag.  
https://www.wikidata.org/wiki/Wikidata:Requests_for_permissions/Bot/DanmicholoBot
 implies that the lack of a bot flag might be the problem. But the script is 
not an autonomous bot and is still under development, so I didn't think a bot 
flag was necessary under those circumstances.
2. I'm not running the script from paws.wmflabs.org .  Is it a problem that the 
API calls are coming from an unknown IP address? Does that matter?
3. I'm not using Pywikibot.  But that shouldn't be necessary.

The other thing that is different about what I'm doing and what is being done 
by the other user who is not encountering this problem is that I'm 
authenticating directly by establishing a session when the script starts (lines 
347-349).  The other user apparently somehow doesn't have to authenticate when 
using Pywikibot as long as he's logged in to paws.wmflabs.org .  But I haven't 
dug in to find out how Pywikibot works, so I don't really understand how that 
is possible or whether that's important in establishing that the bot is 
legitimate and not a vandal/spam bot.

Any advice you can give about how to stop being blocked would be appreciated.  
Eventually I will probably apply for a bot flag, but I doubt that this bot will 
ever be autonomous, so is that really necessary?  Would it matter if I used my 
own account instead of a separate bot account?

Steve

--
Steven J. Baskauf, Ph.D.
Data Science and Data Curation Specialist
Jean & Alexander Heard Libraries, Vanderbilt University
Nashville, TN 37235, USA

Office: Eskind Biomedical Library, EMB 111
Phone: (615) 343-4582
https://my.vanderbilt.edu/baskauf/
