Sigh. It's even more complicated than that. It looks like the "name" entry doesn't always match the name you passed in the API call, but is subject to case mapping, trailing whitespace stripping, and maybe a few other things?

$ curl -s 'https://en.wikipedia.org/w/api.php?action=query&format=json&list=users&usprop=groups%7Ceditcount%7Cgender&ususers=roySmith|roysmith' | json_pp
{
   "query" : {
      "users" : [
         {
            "name" : "RoySmith",
            "gender" : "unknown",
            "groups" : [
               "sysop",
               "*",
               "user",
               "autoconfirmed"
            ],
            "userid" : 130326,
            "editcount" : 58645
         },
         {
            "name" : "Roysmith",
            "missing" : ""
         }
      ]
   },
   "batchcomplete" : ""
}

I'm assuming the entries in the returned "users" list are guaranteed to be in the same order as the input parameters?  I can't find anyplace that says this, but it seems logical.  Can somebody confirm that it's true?

> On Sep 4, 2021, at 6:46 PM, Roy Smith <[email protected]> wrote:
>
> It turns out this is a little more complicated than it appeared at first;
> usercontribs and list users have different concepts of "invalid".  If you ask
> for usercontribs on "1.2.3.4", it's valid.  If you pass in "1.2.3.0/24", you
> get baduser.  But list users returns:
>
> {
>     "batchcomplete": "",
>     "query": {
>         "users": [
>             {
>                 "name": "1.2.3.4",
>                 "invalid": ""
>             }
>         ]
>     }
> }
>
> which I guess makes sense in that context since it can't map it to a userid.
> I can work around this, but mentioning it for the sake of some poor developer
> searching the archives N years from now trying to figure it out :-)
>
>
>> On Aug 19, 2021, at 6:21 PM, Bryan Davis <[email protected]> wrote:
>>
>> On Thu, Aug 19, 2021 at 4:04 PM Roy Smith <[email protected]> wrote:
>>>
>>> I've got a tool which parses sockpuppet investigation (SPI) pages and does
>>> some analysis.  One of the steps is I need to validate that all of the
>>> usernames found in the SPI report are valid.  I do that by sequentially
>>> calling usercontribs on each name with uclimit=1 and seeing if I get a
>>> baduser error.
>>>
>>> This works, but it's slow because I need to make 1 API call for each user.
>>> For a big SPI case, the time to do this swamps everything else.  Is there a
>>> more efficient way to do this?  Some API call where I can give it a bunch
>>> of usernames in a batch and have it tell me which ones are invalid?
>>> Alternatively, is there a regex I could apply on the client side to test if
>>> a username is valid?
>>>
>>> The most common type of invalid name I see is when somebody puts down an
>>> iprange (e.g. 1.2.4.0/24) as a username.  Testing for that client-side
>>> would be trivial, but it might miss some others.
>>
>> You can do lookups in batches of 50 (500 if you have the
>> "apihighlimits" right which is commonly granted by the "Bots" group on
>> movement wikis) with
>> <https://en.wikipedia.org/w/api.php?action=help&modules=query%2Busers>.
>>
>> Here's a quick example:
>> <https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&list=users&format=json&utf8=1&formatversion=2&ususers=Bryan%20Davis%7CBryanDavis%7CBDavis%20(WMF)%7Cbd808>
>>
>> The results will look something like:
>> ```
>> {
>>   "batchcomplete": true,
>>   "query": {
>>     "users": [
>>       {
>>         "name": "Bryan Davis",
>>         "missing": true
>>       },
>>       {
>>         "userid": 2619078,
>>         "name": "BryanDavis"
>>       },
>>       {
>>         "userid": 19474624,
>>         "name": "BDavis (WMF)"
>>       },
>>       {
>>         "userid": 24257381,
>>         "name": "Bd808"
>>       }
>>     ]
>>   }
>> }
>> ```
>>
>> Bryan
>> --
>> Bryan Davis              Technical Engagement      Wikimedia Foundation
>> Principal Software Engineer                               Boise, ID USA
>> [[m:User:BDavis_(WMF)]]                                      irc: bd808
>> _______________________________________________
>> Cloud mailing list -- [email protected]
>> List information:
>> https://lists.wikimedia.org/postorius/lists/cloud.lists.wikimedia.org/
>>
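
For anyone finding this in the archives later: here is a rough Python sketch of how the batched list=users results discussed above could be consumed without relying on the order of the returned "users" list. The helper names (chunked, normalize, classify, status_for) are made up for illustration. normalize only approximates the server-side behaviour seen in this thread (first character upper-cased, surrounding whitespace stripped); the underscore-to-space step is an extra assumption on my part. The sketch makes no network calls; it only processes an already-decoded JSON response.

```python
from __future__ import annotations

from typing import Iterator


def chunked(names: list[str], size: int = 50) -> Iterator[list[str]]:
    """Yield batches of names; 50 is the non-bot batch limit quoted above."""
    for start in range(0, len(names), size):
        yield names[start:start + size]


def normalize(name: str) -> str:
    """Guess at the name the server will echo back (an approximation).

    Assumptions: underscores are treated as spaces, surrounding whitespace
    is stripped, and only the first character is upper-cased.
    """
    cleaned = name.replace("_", " ").strip()
    return cleaned[:1].upper() + cleaned[1:]


def classify(response: dict) -> dict[str, str]:
    """Map each returned "name" to 'ok', 'missing', or 'invalid'."""
    statuses: dict[str, str] = {}
    for entry in response.get("query", {}).get("users", []):
        if "invalid" in entry:
            status = "invalid"
        elif "missing" in entry:
            status = "missing"
        else:
            status = "ok"
        statuses[entry.get("name", "")] = status
    return statuses


def status_for(inputs: list[str], response: dict) -> dict[str, str]:
    """Look up each input by its normalized name, not by list position."""
    by_name = classify(response)
    return {name: by_name.get(normalize(name), "unknown") for name in inputs}


# Response data copied (trimmed) from the curl example in this message.
sample = {
    "query": {
        "users": [
            {"name": "RoySmith", "userid": 130326, "editcount": 58645},
            {"name": "Roysmith", "missing": ""},
        ]
    },
    "batchcomplete": "",
}
print(status_for(["roySmith", "roysmith"], sample))
# {'roySmith': 'ok', 'roysmith': 'missing'}
```

Any input that comes back as "unknown" is one where this client-side guess at normalization did not match the name the server returned, so those are the cases that still need a closer look.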
