It turns out this is a little more complicated than it appeared at first:
usercontribs and list=users have different concepts of "invalid". If you ask
for usercontribs on "1.2.3.4", it's valid. If you pass in "1.2.3.0/24", you
get baduser. But list=users returns this for "1.2.3.4":
{
    "batchcomplete": "",
    "query": {
        "users": [
            {
                "name": "1.2.3.4",
                "invalid": ""
            }
        ]
    }
}
which I guess makes sense in that context, since it can't map an IP address to
a userid. I can work around this, but I'm mentioning it for the sake of some
poor developer searching the archives N years from now trying to figure it
out :-)
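
For that future developer, here is a rough sketch of one possible workaround,
not necessarily what my tool ends up doing. It batches names through
list=users 50 at a time and then treats anything flagged "invalid" that parses
as a single IP address as acceptable, on the assumption that usercontribs would
take it. The helper names, the status strings, and the User-Agent value are all
made up for illustration. The flag check accepts both the "" form shown above
and the true form that formatversion=2 returns.

```python
import ipaddress
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"  # assumption: English Wikipedia
BATCH_SIZE = 50  # limit without the "apihighlimits" right


def chunks(names, size=BATCH_SIZE):
    """Yield successive batches of at most `size` names."""
    for i in range(0, len(names), size):
        yield names[i:i + size]


def has_flag(user, key):
    """True if the flag is set, as "" (formatversion=1) or true (formatversion=2)."""
    return key in user and user[key] is not False


def is_single_ip(name):
    """True for "1.2.3.4", False for ranges such as "1.2.3.0/24"."""
    try:
        ipaddress.ip_address(name)
        return True
    except ValueError:
        return False


def classify(response):
    """Map each returned name to "ok", "missing", "ip", or "invalid"."""
    result = {}
    for user in response.get("query", {}).get("users", []):
        name = user["name"]
        if has_flag(user, "invalid"):
            # list=users flags bare IPs as invalid; assume usercontribs accepts them.
            result[name] = "ip" if is_single_ip(name) else "invalid"
        elif has_flag(user, "missing"):
            result[name] = "missing"
        else:
            result[name] = "ok"
    return result


def fetch_users(names):
    """Hypothetical helper: one list=users request for a single batch."""
    params = urllib.parse.urlencode({
        "action": "query",
        "list": "users",
        "format": "json",
        "ususers": "|".join(names),
    })
    request = urllib.request.Request(
        f"{API_URL}?{params}",
        headers={"User-Agent": "spi-username-check-sketch/0.1"},  # placeholder
    )
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)


def check_usernames(names, fetch=fetch_users):
    """Classify all names, making one API call per batch instead of per name."""
    statuses = {}
    for batch in chunks(list(names)):
        statuses.update(classify(fetch(batch)))
    return statuses
```

One caveat: the keys in the result are the names as the API returns them, which
may be normalized (in Bryan's example below, "bd808" comes back as "Bd808"), so
matching results back to the input names needs some care.
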
> On Aug 19, 2021, at 6:21 PM, Bryan Davis <[email protected]> wrote:
>
> On Thu, Aug 19, 2021 at 4:04 PM Roy Smith <[email protected]> wrote:
>>
>> I've got a tool which parses sockpuppet investigation (SPI) pages and does
>> some analysis. One of the steps is I need to validate that all of the
>> usernames found in the SPI report are valid. I do that by sequentially
>> calling usercontribs on each name with uclimit=1 and seeing if I get a
>> baduser error.
>>
>> This works, but it's slow because I need to make 1 API call for each user.
>> For a big SPI case, the time to do this swamps everything else. Is there a
>> more efficient way to do this? Some API call where I can give it a bunch of
>> usernames in a batch and have it tell me which ones are invalid?
>> Alternatively, is there a regex I could apply on the client side to test if
>> a username is valid?
>>
>> The most common type of invalid name I see is when somebody puts down an
>> iprange (i.e. 1.2.4.0/24) as a username. Testing for that client-side would
>> be trivial, but it might miss some others.
>
> You can do lookups in batches of 50 (500 if you have the
> "apihighlimits" right which is commonly granted by the "Bots" group on
> movement wikis) with
> <https://en.wikipedia.org/w/api.php?action=help&modules=query%2Busers>.
>
> Here's a quick example:
> <https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&list=users&format=json&utf8=1&formatversion=2&ususers=Bryan%20Davis%7CBryanDavis%7CBDavis%20(WMF)%7Cbd808>
>
> The results will look something like:
> ```
> {
>     "batchcomplete": true,
>     "query": {
>         "users": [
>             {
>                 "name": "Bryan Davis",
>                 "missing": true
>             },
>             {
>                 "userid": 2619078,
>                 "name": "BryanDavis"
>             },
>             {
>                 "userid": 19474624,
>                 "name": "BDavis (WMF)"
>             },
>             {
>                 "userid": 24257381,
>                 "name": "Bd808"
>             }
>         ]
>     }
> }
> ```
>
> Bryan
> --
> Bryan Davis Technical Engagement Wikimedia Foundation
> Principal Software Engineer Boise, ID USA
> [[m:User:BDavis_(WMF)]] irc: bd808
> _______________________________________________
> Cloud mailing list -- [email protected]
> List information:
> https://lists.wikimedia.org/postorius/lists/cloud.lists.wikimedia.org/
>