Re: [Wiki-research-l] sockpuppets and how to find them sooner

2019-08-26 Thread Leila Zia
Kerry, thanks for kicking this off. One update on our end: there is general alignment among a few different teams/departments in WMF that this is an important problem, and that we should support checkusers with it better than we do today. I gave a presentation at Wikimania about the research on

Re: [Wiki-research-l] sockpuppets and how to find them sooner

2019-08-26 Thread Jonathan Morgan
Nemo, Can you please elaborate on what use of language, and whose use of language, you are criticizing? It is not clear from your email what "jargon" you refer to, and why you feel it is inappropriate. Jonathan On Mon, Aug 26, 2019 at 12:59 AM Federico Leva (Nemo) wrote: > Please everyone

Re: [Wiki-research-l] sockpuppets and how to find them sooner

2019-08-26 Thread Federico Leva (Nemo)
Please everyone avoid using jargon specific to the English Wikipedia on this cross-language and cross-wiki mailing list. Aaron Halfaker, 23/08/19 17:36: I think embeddings[1] would be a nice way to create a signature. There is some discussion of acceptable user fingerprinting (presumably to

Re: [Wiki-research-l] sockpuppets and how to find them sooner

2019-08-24 Thread Timothy Wood
Is that what they do? I thought we mostly did that. TJW/GMG On Sat, Aug 24, 2019, 06:20 Nick Wilson (Quiddity) wrote: > On Fri, Aug 23, 2019 at 5:23 PM Kerry Raymond > wrote: > > > That's why I think we need "signatures" which is my shorthand for things > > like a hash function or a bounding

Re: [Wiki-research-l] sockpuppets and how to find them sooner

2019-08-24 Thread Nick Wilson (Quiddity)
On Fri, Aug 23, 2019 at 5:23 PM Kerry Raymond wrote: > That's why I think we need "signatures" which is my shorthand for things > like a hash function or a bounding box, a means by which many non-matching > accounts can be eliminated at low cost, reserving the high cost comparisons > (machine or

Re: [Wiki-research-l] sockpuppets and how to find them sooner

2019-08-23 Thread Timothy Wood
Then again, apparently the Foundation has a PR team whose only job is to compile the latest marketing buzzwords, and they seem to really love AI. You might get some buy-in. Never know. V/r TJW/GMG On Fri, Aug 23, 2019, 11:23 Kerry Raymond wrote: > That's why I think we need "signatures" which

Re: [Wiki-research-l] sockpuppets and how to find them sooner

2019-08-23 Thread Kerry Raymond
That's why I think we need "signatures" which is my shorthand for things like a hash function or a bounding box, a means by which many non-matching accounts can be eliminated at low cost, reserving the high cost comparisons (machine or human) only for high probability candidates. It is
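A minimal sketch of the "eliminate cheaply, compare expensively" idea described above. Everything here is an assumption for illustration and not something proposed in the thread: the `Signature` fields (a bounding box over UTC edit hours plus the set of namespaces edited), the function names, and the pluggable `expensive_compare` callback are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signature:
    """Hypothetical cheap summary of one account's edits."""
    hour_min: int          # earliest UTC hour in which the account edited
    hour_max: int          # latest UTC hour in which the account edited
    namespaces: frozenset  # namespace ids the account has edited


def make_signature(edits):
    """Build a signature from (utc_hour, namespace_id) pairs."""
    edits = list(edits)
    hours = [hour for hour, _ in edits]
    return Signature(min(hours), max(hours),
                     frozenset(ns for _, ns in edits))


def may_match(a, b):
    """Cheap test: hour ranges overlap and namespaces intersect.

    Simplification: the hour range does not wrap around midnight.
    """
    hours_overlap = a.hour_min <= b.hour_max and b.hour_min <= a.hour_max
    return hours_overlap and bool(a.namespaces & b.namespaces)


def candidates(target, others, expensive_compare, threshold):
    """Run expensive_compare only on accounts that pass the cheap test."""
    target_sig = make_signature(target)
    results = []
    for name, edits in others.items():
        if not may_match(target_sig, make_signature(edits)):
            continue  # eliminated at low cost
        score = expensive_compare(target, edits)
        if score >= threshold:
            results.append((name, score))
    return sorted(results, key=lambda item: -item[1])
```

The cheap test is deliberately permissive: it should rarely reject a true match, and it only needs to shrink the pool that reaches the costly machine or human comparison.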

Re: [Wiki-research-l] sockpuppets and how to find them sooner

2019-08-23 Thread Aaron Halfaker
I think embeddings[1] would be a nice way to create a signature. Essentially, we could dump data about a person's activities into it (words added, namespaces edited, time of day of edits, temporal frequency of editing, # of revisions per session, frequency of citation by type, etc.) and get a
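As a rough illustration of turning activity data into a comparable vector, the sketch below builds a hand-made feature vector and compares two accounts by cosine similarity. This is a stand-in, not a learned embedding: the choice of features (hour-of-day and namespace histograms), the bucket count `n_namespaces`, and both function names are assumptions made for the example.

```python
import math


def activity_vector(edits, n_namespaces=16):
    """Return a normalized histogram: 24 hour bins plus namespace buckets.

    edits is an iterable of (utc_hour, namespace_id) pairs. Namespace ids
    are folded into n_namespaces buckets by modulo, which is an
    illustrative simplification.
    """
    hours = [0.0] * 24
    spaces = [0.0] * n_namespaces
    count = 0
    for hour, namespace in edits:
        hours[hour % 24] += 1
        spaces[namespace % n_namespaces] += 1
        count += 1
    if count == 0:
        return hours + spaces
    return [value / count for value in hours + spaces]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)
```

A learned embedding would replace `activity_vector` with a model trained on many accounts; the comparison step could stay the same.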

Re: [Wiki-research-l] sockpuppets and how to find them sooner

2019-08-23 Thread Timothy Wood
You are correct that in all but the most obvious cases, filing an SPI can be exceptionally time-consuming. I'm afraid there is no obvious technical solution there that would not involve a complicated AI that is probably beyond the ability of the Foundation to produce. There is quite a bit of data

Re: [Wiki-research-l] sockpuppets and how to find them sooner

2019-08-23 Thread RhinosF1
Just a note that you can still go through warnings for vandalism etc. and report to AIV. Or, at that edit speed, you may have a chance at AN by reporting the bot-like edits, which will draw attention to the account. If you ever need help, things like #wikipedia-en-help on Freenode IRC exist so you

Re: [Wiki-research-l] sockpuppets and how to find them sooner

2019-08-23 Thread Kerry Raymond
To reply to my own question: can we find a way to create a "signature" of an account's pattern of editing? Perhaps it might be a set of signatures, maybe one for the categories that the account appears to be active in, another for the type of edit, etc. Then if these signatures were
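One way to picture a "set of signatures" is to keep a separate set of observed values per facet and score each facet on its own. The sketch below assumes Jaccard similarity as the per-facet measure; the facet names (`categories`, `edit_types`) and the function names are hypothetical, chosen only to mirror the examples in the message.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections; 0.0 when both are empty."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def facet_similarity(account_a, account_b):
    """Score each facet that both accounts have.

    Each account is a dict mapping a facet name to the set of values
    observed for that account.
    """
    shared = account_a.keys() & account_b.keys()
    return {facet: jaccard(account_a[facet], account_b[facet])
            for facet in shared}
```

Keeping the facets separate lets a reviewer see which aspect of the editing pattern matches, instead of collapsing everything into a single score.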