[Wikimedia-l] Re: Paid editing dashboard and metrics?

2021-09-08 Thread Samuel Klein
Thank you Mario!  I'll ask further there. S

On Wed, Sep 8, 2021 at 4:29 PM Mario Gómez  wrote:

> You might want to discuss this topic at
> https://meta.wikimedia.org/wiki/Talk:Wikiproject:Antispam
>
> This page is watched by many people working against UPE in various
> projects. Many are familiar with large-scale UPE operations. Some of us
> already work with tools for automated and semi-automated detection. So it
> might be more fruitful to discuss it with contributors experienced in that
> area.
>
> Best,
>
> Mario
>
> On Wed, Sep 8, 2021 at 7:22 PM Samuel Klein  wrote:
>
>> Hi Steven :)  Good points.  I agree with Adam that this is a major energy
>> and enthusiasm drain for editors.
>>
>> As to how we could start with data collection:
>>
>> * Monitor the market.
>>   a) Work with groups that are in the market and completely transparent
>> about their work to maintain a sense of rates and volume
>>   b) Search general contracting sites, general search engines, and
>> specific reputation brokers for new options; maintain a catalog
>>   c) Spot-check and commission work. As with Böhmermann's show - he spent
>> under 500 Euros and identified two networks of UPE.
>>
>>  * Build better tools for tracking and countering undisclosed paid editing
>>   a)  Tracking: automated scoring, as with ORES
>>   b)  Countering: As you say: tools to help people coordinate work,
>> making it more fun and collaborative to take on UPE. Especially for
>> often-targeted categories -- politicians + companies.
>>   c)  Both: Focus on tools for detecting large farms over time, and
>> cleaning up the mess left by a farm.
>>
>> You're right about community size being a defense.  But only for a time
>> -- the growing demand for this actively subverts community members. Olaf
>> was one!  So we also need to think of ways to reduce and divert that demand
>> into constructive channels.
>>
>> SJ
>>
>> On Tue, Sep 7, 2021 at 11:11 PM Steven Walling 
>> wrote:
>>
>>> Given that it’s completely trivial to make new pseudonymous accounts how
>>> would you propose even remotely accurate data collection to measure paid
>>> editing?
>>>
>>> If we are worried about the impact of paid editors on the integrity of
>>> content, we are much better served investing even more in efforts to
>>> dramatically strengthen our volunteer community’s ability to defend the
>>> projects. That means better software to help each editor do more, making it
>>> fun, easy and welcoming for new contributors, and fighting the attrition in
>>> admins and other functionaries. If our volunteer community was larger and
>>> healthier, the threat of paid interference would be less scary.
>>>
>>> On Tue, Sep 7, 2021 at 7:20 PM Samuel Klein  wrote:
>>>
>>>> Aha -- I was pointed to en:wp's List of paid editing companies.
>>>> (thanks!)  This is a great resource and deserves to be better linked.   The
>>>> page is semi-active - 4 additions in the last month, including the Olaf
>>>> case. I've cleaned it up a bit and linked it to the German page. This
>>>> really needs some automated scripting and tracking, at the scale of ORES...
>>>>
>>>> Is there any routine analysis / stats compiled of edits associated with
>>>> these orgs, or of their activity online?
>>>>
>>>> On Tue, Sep 7, 2021 at 2:19 PM Samuel Klein  wrote:
>>>>
>>>>> Jan Böhmermann published an amazing exposé on political WP editing in
>>>>> Germany; it gets good around 15 minutes in. In the video he exposed the
>>>>> workings of a paid editing farm run by Olaf Kosinsky (Wikidata; CheckUser
>>>>> discussion; archived PR-services site), an excellent long-time editor
>>>>> with over 3 million edits.
>>>>>
>>>>> *We need to distinguish paid editing from general COI editing*.  Paid
>>>>> editing is COI editing by professionals, who have strong external
>>>>> incentives to persist, no leeway in the outcome they are aiming
>>>>> for, experience in doing this in dozens of cases, and may have colleagues
>>>>> who can drop in as 'uninvolved' editors to forge consensus or social
>>>>> proof.[1]
>>>>>
>>>>> This is one of our great recurring challenges, siphoning off both our
>>>>> reputation and our community.  There are many things we can do about paid
>>>>> editing, starting with maintaining *paid-editing metrics and a
>>>>> dashboard* of known and estimated paid editing.  We can estimate its
>>>>> prevalence by the availability of services online[2]; and look for patterns
>>>>> of such editing on wiki.  Even with large error margins,

[Wikimedia-l] Re: Paid editing dashboard and metrics?

2021-09-08 Thread Mario Gómez
You might want to discuss this topic at
https://meta.wikimedia.org/wiki/Talk:Wikiproject:Antispam

This page is watched by many people working against UPE in various
projects. Many are familiar with large-scale UPE operations. Some of us
already work with tools for automated and semi-automated detection. So it
might be more fruitful to discuss it with contributors experienced in that
area.

Best,

Mario

On Wed, Sep 8, 2021 at 7:22 PM Samuel Klein  wrote:

> Hi Steven :)  Good points.  I agree with Adam that this is a major energy
> and enthusiasm drain for editors.
>
> As to how we could start with data collection:
>
> * Monitor the market.
>   a) Work with groups that are in the market and completely transparent
> about their work to maintain a sense of rates and volume
>   b) Search general contracting sites, general search engines, and
> specific reputation brokers for new options; maintain a catalog
>   c) Spot-check and commission work. As with Böhmermann's show - he spent
> under 500 Euros and identified two networks of UPE.
>
>  * Build better tools for tracking and countering undisclosed paid editing
>   a)  Tracking: automated scoring, as with ORES
>   b)  Countering: As you say: tools to help people coordinate work, making
> it more fun and collaborative to take on UPE. Especially for often-targeted
> categories -- politicians + companies.
>   c)  Both: Focus on tools for detecting large farms over time, and
> cleaning up the mess left by a farm.
>
> You're right about community size being a defense.  But only for a time --
> the growing demand for this actively subverts community members. Olaf was
> one!  So we also need to think of ways to reduce and divert that demand
> into constructive channels.
>
> SJ
>
> On Tue, Sep 7, 2021 at 11:11 PM Steven Walling 
> wrote:
>
>> Given that it’s completely trivial to make new pseudonymous accounts how
>> would you propose even remotely accurate data collection to measure paid
>> editing?
>>
>> If we are worried about the impact of paid editors on the integrity of
>> content, we are much better served investing even more in efforts to
>> dramatically strengthen our volunteer community’s ability to defend the
>> projects. That means better software to help each editor do more, making it
>> fun, easy and welcoming for new contributors, and fighting the attrition in
>> admins and other functionaries. If our volunteer community was larger and
>> healthier, the threat of paid interference would be less scary.
>>
>> On Tue, Sep 7, 2021 at 7:20 PM Samuel Klein  wrote:
>>
>>> Aha -- I was pointed to en:wp's List of paid editing companies.
>>> (thanks!)  This is a great resource and deserves to be better linked.   The
>>> page is semi-active - 4 additions in the last month, including the Olaf
>>> case. I've cleaned it up a bit and linked it to the German page. This
>>> really needs some automated scripting and tracking, at the scale of ORES...
>>>
>>> Is there any routine analysis / stats compiled of edits associated with
>>> these orgs, or of their activity online?
>>>
>>> On Tue, Sep 7, 2021 at 2:19 PM Samuel Klein  wrote:
>>>
>>>> Jan Böhmermann published an amazing exposé on political WP editing in
>>>> Germany; it gets good around 15 minutes in. In the video he exposed the
>>>> workings of a paid editing farm run by Olaf Kosinsky (Wikidata; CheckUser
>>>> discussion; archived PR-services site), an excellent long-time editor
>>>> with over 3 million edits.
>>>>
>>>> *We need to distinguish paid editing from general COI editing*.  Paid
>>>> editing is COI editing by professionals, who have strong external
>>>> incentives to persist, no leeway in the outcome they are aiming
>>>> for, experience in doing this in dozens of cases, and may have colleagues
>>>> who can drop in as 'uninvolved' editors to forge consensus or social
>>>> proof.[1]
>>>>
>>>> This is one of our great recurring challenges, siphoning off both our
>>>> reputation and our community.  There are many things we can do about paid
>>>> editing, starting with maintaining *paid-editing metrics and a
>>>> dashboard* of known and estimated paid editing.  We can estimate its
>>>> prevalence by the availability of services online[2]; and look for patterns
>>>> of such editing on wiki.  Even with large error margins, this would be a
>>>> step above simply waiting for outbreaks to be discovered and reacting to
>>>> the visible bits of the iceberg.
>>>>
>>>> What sort of metrics like this do we have already?  Who is

[Wikimedia-l] Re: Paid editing dashboard and metrics?

2021-09-08 Thread Samuel Klein
Hi Steven :)  Good points.  I agree with Adam that this is a major energy
and enthusiasm drain for editors.

As to how we could start with data collection:

* Monitor the market.
  a) Work with groups that are in the market and completely transparent
about their work to maintain a sense of rates and volume
  b) Search general contracting sites, general search engines, and specific
reputation brokers for new options; maintain a catalog
  c) Spot-check and commission work. As with Böhmermann's show - he spent under
500 Euros and identified two networks of UPE.

 * Build better tools for tracking and countering undisclosed paid editing
  a)  Tracking: automated scoring, as with ORES
  b)  Countering: As you say: tools to help people coordinate work, making
it more fun and collaborative to take on UPE. Especially for often-targeted
categories -- politicians + companies.
  c)  Both: Focus on tools for detecting large farms over time, and
cleaning up the mess left by a farm.

You're right about community size being a defense.  But only for a time --
the growing demand for this actively subverts community members. Olaf was
one!  So we also need to think of ways to reduce and divert that demand
into constructive channels.

SJ

On Tue, Sep 7, 2021 at 11:11 PM Steven Walling 
wrote:

> Given that it’s completely trivial to make new pseudonymous accounts how
> would you propose even remotely accurate data collection to measure paid
> editing?
>
> If we are worried about the impact of paid editors on the integrity of
> content, we are much better served investing even more in efforts to
> dramatically strengthen our volunteer community’s ability to defend the
> projects. That means better software to help each editor do more, making it
> fun, easy and welcoming for new contributors, and fighting the attrition in
> admins and other functionaries. If our volunteer community was larger and
> healthier, the threat of paid interference would be less scary.
>
> On Tue, Sep 7, 2021 at 7:20 PM Samuel Klein  wrote:
>
>> Aha -- I was pointed to en:wp's List of paid editing companies.
>> (thanks!)  This is a great resource and deserves to be better linked.   The
>> page is semi-active - 4 additions in the last month, including the Olaf
>> case. I've cleaned it up a bit and linked it to the German page. This
>> really needs some automated scripting and tracking, at the scale of ORES...
>>
>> Is there any routine analysis / stats compiled of edits associated with
>> these orgs, or of their activity online?
>>
>> On Tue, Sep 7, 2021 at 2:19 PM Samuel Klein  wrote:
>>
>>> Jan Böhmermann published an amazing exposé on political WP editing in
>>> Germany; it gets good around 15 minutes in. In the video he exposed the
>>> workings of a paid editing farm run by Olaf Kosinsky (Wikidata; CheckUser
>>> discussion; archived PR-services site), an excellent long-time editor
>>> with over 3 million edits.
>>>
>>> *We need to distinguish paid editing from general COI editing*.  Paid
>>> editing is COI editing by professionals, who have strong external
>>> incentives to persist, no leeway in the outcome they are aiming
>>> for, experience in doing this in dozens of cases, and may have colleagues
>>> who can drop in as 'uninvolved' editors to forge consensus or social
>>> proof.[1]
>>>
>>> This is one of our great recurring challenges, siphoning off both our
>>> reputation and our community.  There are many things we can do about paid
>>> editing, starting with maintaining *paid-editing metrics and a
>>> dashboard* of known and estimated paid editing.  We can estimate its
>>> prevalence by the availability of services online[2]; and look for patterns
>>> of such editing on wiki.  Even with large error margins, this would be a
>>> step above simply waiting for outbreaks to be discovered and reacting to
>>> the visible bits of the iceberg.
>>>
>>> What sort of metrics like this do we have already?  Who is working on
>>> such things?
>>> Since the above video came out, de:wp started a table of WP editing
>>> services.
>>> It currently includes an initial dozen examples, with no estimate of
>>> activity (the 1 account known to be associated with each is in most cases
>>> blocked; but most have active websites soliciting work). This would be
>>> useful in all languages.
>>>
>>> SJ
>>>
>>>  [1] as Melmann wrote
>>> 

[Wikimedia-l] Re: Paid editing dashboard and metrics?

2021-09-07 Thread Adam Wight
Hi SJ,

Just passing along a link shared by @Waltercolor during another discussion
about paid editing,
https://fr.wikipedia.org/wiki/Wikip%C3%A9dia:Mois_anti-pub

I completely agree that we need more ways to identify and reverse
undisclosed paid editing.  For a moment, the WMF's Scoring Platform team
was hoping to build an automated model trained on known promotional
editing—we suspected that it had stereotypical grammar and other flaws that
make it possible to find using this approach.  Personally, I think that
increasing visibility of paid editing might be helpful: "sunshine" or the
threat of showing up in the newspaper might make it feel riskier for
clients to decide to pay for these services.

In response to Steven's point, we should certainly make general
improvements to the editing experience but note that fighting paid editing
*is* one of these improvements.  Interactions with paid editors and their
socks can be exhausting and unrewarding.

-[[mw:User:Adamw]]

On Wed 8. Sep 2021 at 04:20, Samuel Klein  wrote:

> Aha -- I was pointed to en:wp's List of paid editing companies.
> (thanks!)  This is a great resource and deserves to be better linked.   The
> page is semi-active - 4 additions in the last month, including the Olaf
> case. I've cleaned it up a bit and linked it to the German page. This
> really needs some automated scripting and tracking, at the scale of ORES...
>
> Is there any routine analysis / stats compiled of edits associated with
> these orgs, or of their activity online?
>
> On Tue, Sep 7, 2021 at 2:19 PM Samuel Klein  wrote:
>
>> Jan Böhmermann published an amazing exposé on political WP editing in
>> Germany; it gets good around 15 minutes in. In the video he exposed the
>> workings of a paid editing farm run by Olaf Kosinsky (Wikidata; CheckUser
>> discussion; archived PR-services site), an excellent long-time editor
>> with over 3 million edits.
>>
>> *We need to distinguish paid editing from general COI editing*.  Paid
>> editing is COI editing by professionals, who have strong external
>> incentives to persist, no leeway in the outcome they are aiming
>> for, experience in doing this in dozens of cases, and may have colleagues
>> who can drop in as 'uninvolved' editors to forge consensus or social
>> proof.[1]
>>
>> This is one of our great recurring challenges, siphoning off both our
>> reputation and our community.  There are many things we can do about paid
>> editing, starting with maintaining *paid-editing metrics and a dashboard*
>> of known and estimated paid editing.  We can estimate its prevalence by the
>> availability of services online[2]; and look for patterns of such editing on
>> wiki.  Even with large error margins, this would be a step above simply
>> waiting for outbreaks to be discovered and reacting to the visible bits of
>> the iceberg.
>>
>> What sort of metrics like this do we have already?  Who is working on
>> such things?
>> Since the above video came out, de:wp started a table of WP editing
>> services.
>> It currently includes an initial dozen examples, with no estimate of
>> activity (the 1 account known to be associated with each is in most cases
>> blocked; but most have active websites soliciting work). This would be
>> useful in all languages.
>>
>> SJ
>>
>>  [1] as Melmann wrote recently:
>> "*in my experience, **all the most difficult edits are WP:PAID**. Most non-paid COI
>> comes from a place of desire to make things better, and often can be
>> relatively easily guided towards a better place... [or] it is relatively
>> easy to use existing enforcement mechanisms to correct and ultimately
>> control their behaviours. PR professionals, on the other hand, are subtle
>> and sometimes downright deceptive, and it takes lots of effort to check
>> their edits when most of the time you lack context and expertise and you
>> really have to research in depth to see their edits for what they really
>> are. I think that one of the fundamental mistakes of the current policy is
>> lumping paid editors with general COI editing as paid editors are
>> fundamentally playing on a different level in terms of PR expertise and
>> incentives*"
>>
>> [2] Just searching for this online led to ads from dozens of services.
>> The first 10 below seem 

[Wikimedia-l] Re: Paid editing dashboard and metrics?

2021-09-07 Thread Steven Walling
Given that it’s completely trivial to make new pseudonymous accounts how
would you propose even remotely accurate data collection to measure paid
editing?

If we are worried about the impact of paid editors on the integrity of
content, we are much better served investing even more in efforts to
dramatically strengthen our volunteer community’s ability to defend the
projects. That means better software to help each editor do more, making it
fun, easy and welcoming for new contributors, and fighting the attrition in
admins and other functionaries. If our volunteer community was larger and
healthier, the threat of paid interference would be less scary.

On Tue, Sep 7, 2021 at 7:20 PM Samuel Klein  wrote:

> Aha -- I was pointed to en:wp's List of paid editing companies.
> (thanks!)  This is a great resource and deserves to be better linked.   The
> page is semi-active - 4 additions in the last month, including the Olaf
> case. I've cleaned it up a bit and linked it to the German page. This
> really needs some automated scripting and tracking, at the scale of ORES...
>
> Is there any routine analysis / stats compiled of edits associated with
> these orgs, or of their activity online?
>
> On Tue, Sep 7, 2021 at 2:19 PM Samuel Klein  wrote:
>
>> Jan Böhmermann published an amazing exposé on political WP editing in
>> Germany; it gets good around 15 minutes in. In the video he exposed the
>> workings of a paid editing farm run by Olaf Kosinsky (Wikidata; CheckUser
>> discussion; archived PR-services site), an excellent long-time editor
>> with over 3 million edits.
>>
>> *We need to distinguish paid editing from general COI editing*.  Paid
>> editing is COI editing by professionals, who have strong external
>> incentives to persist, no leeway in the outcome they are aiming
>> for, experience in doing this in dozens of cases, and may have colleagues
>> who can drop in as 'uninvolved' editors to forge consensus or social
>> proof.[1]
>>
>> This is one of our great recurring challenges, siphoning off both our
>> reputation and our community.  There are many things we can do about paid
>> editing, starting with maintaining *paid-editing metrics and a dashboard*
>> of known and estimated paid editing.  We can estimate its prevalence by the
>> availability of services online[2]; and look for patterns of such editing on
>> wiki.  Even with large error margins, this would be a step above simply
>> waiting for outbreaks to be discovered and reacting to the visible bits of
>> the iceberg.
>>
>> What sort of metrics like this do we have already?  Who is working on
>> such things?
>> Since the above video came out, de:wp started a table of WP editing
>> services.
>> It currently includes an initial dozen examples, with no estimate of
>> activity (the 1 account known to be associated with each is in most cases
>> blocked; but most have active websites soliciting work). This would be
>> useful in all languages.
>>
>> SJ
>>
>>  [1] as Melmann wrote recently:
>> "*in my experience, **all the most difficult edits are WP:PAID**. Most non-paid COI
>> comes from a place of desire to make things better, and often can be
>> relatively easily guided towards a better place... [or] it is relatively
>> easy to use existing enforcement mechanisms to correct and ultimately
>> control their behaviours. PR professionals, on the other hand, are subtle
>> and sometimes downright deceptive, and it takes lots of effort to check
>> their edits when most of the time you lack context and expertise and you
>> really have to research in depth to see their edits for what they really
>> are. I think that one of the fundamental mistakes of the current policy is
>> lumping paid editors with general COI editing as paid editors are
>> fundamentally playing on a different level in terms of PR expertise and
>> incentives*"
>>
>> [2] Just searching for this online led to ads from dozens of services.
>> The first 10 below seem to be clones of the same service (perhaps run by
>> the same farm)
>>  Elite Wiki Writers
>>  Wiki Curators
>>  Wiki Genies
>>  Wikipedia Legends
>>  Wiki Page Writing
>>  Wiki Page Creator
>>  WikiProfs
>>  Wiki Specialist LLC
>>  Wiki Writers Workshop
>>  Wikipedia Publisher
>>  Wikipedia Services
>>  360 

[Wikimedia-l] Re: Paid editing dashboard and metrics?

2021-09-07 Thread Samuel Klein
Aha -- I was pointed to en:wp's List of paid editing companies.
(thanks!)  This is a great resource and deserves to be better linked.   The
page is semi-active - 4 additions in the last month, including the Olaf
case. I've cleaned it up a bit and linked it to the German page. This
really needs some automated scripting and tracking, at the scale of ORES...

Is there any routine analysis / stats compiled of edits associated with
these orgs, or of their activity online?

On Tue, Sep 7, 2021 at 2:19 PM Samuel Klein  wrote:

> Jan Böhmermann published an amazing exposé on political WP editing in
> Germany; it gets good around 15 minutes in. In the video he exposed the
> workings of a paid editing farm run by Olaf Kosinsky (Wikidata; CheckUser
> discussion; archived PR-services site), an excellent long-time editor
> with over 3 million edits.
>
> *We need to distinguish paid editing from general COI editing*.  Paid
> editing is COI editing by professionals, who have strong external
> incentives to persist, no leeway in the outcome they are aiming
> for, experience in doing this in dozens of cases, and may have colleagues
> who can drop in as 'uninvolved' editors to forge consensus or social
> proof.[1]
>
> This is one of our great recurring challenges, siphoning off both our
> reputation and our community.  There are many things we can do about paid
> editing, starting with maintaining *paid-editing metrics and a dashboard*
> of known and estimated paid editing.  We can estimate its prevalence by the
> availability of services online[2]; and look for patterns of such editing on
> wiki.  Even with large error margins, this would be a step above simply
> waiting for outbreaks to be discovered and reacting to the visible bits of
> the iceberg.
>
> What sort of metrics like this do we have already?  Who is working on such
> things?
> Since the above video came out, de:wp started a table of WP editing
> services.
> It currently includes an initial dozen examples, with no estimate of
> activity (the 1 account known to be associated with each is in most cases
> blocked; but most have active websites soliciting work). This would be
> useful in all languages.
>
> SJ
>
>  [1] as Melmann wrote recently:
> "*in my experience, **all the most difficult edits are WP:PAID**. Most non-paid COI comes
> from a place of desire to make things better, and often can be relatively
> easily guided towards a better place... [or] it is relatively easy to use
> existing enforcement mechanisms to correct and ultimately control their
> behaviours. PR professionals, on the other hand, are subtle and sometimes
> downright deceptive, and it takes lots of effort to check their edits when
> most of the time you lack context and expertise and you really have to
> research in depth to see their edits for what they really are. I think that
> one of the fundamental mistakes of the current policy is lumping paid
> editors with general COI editing as paid editors are fundamentally playing
> on a different level in terms of PR expertise and incentives*"
>
> [2] Just searching for this online led to ads from dozens of services.
> The first 10 below seem to be clones of the same service (perhaps run by
> the same farm)
>  Elite Wiki Writers
>  Wiki Curators
>  Wiki Genies
>  Wikipedia Legends
>  Wiki Page Writing
>  Wiki Page Creator
>  WikiProfs
>  Wiki Specialist LLC
>  Wiki Writers Workshop
>  Wikipedia Publisher
>  Wikipedia Services
>  360 Ghostwriting
>  Contentfly
>  Otter PR
>  Premium Content Writing
>  ReputationX
>  Upwork
>
>
>

-- 
Samuel Klein  @metasj   w:user:sj  +1 617 529 4266
___
Public archives at 
https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/7YUU6NFVERF6ZKWKKZXQV3VONDQ55TWS/