Re: [Wikimedia-l] Copyright workflows - research (Was: Re: Foundation management of volunteers)

2019-06-18 Thread Yann Forget
Yes, that would be welcomed by all contributors reviewing images.

Regards,
Yann


Re: [Wikimedia-l] Copyright workflows - research (Was: Re: Foundation management of volunteers)

2019-06-18 Thread James Heilman
So Yann, should we as a community just build something as a proof of
concept? If we are talking less than 250 USD per month, I am sure we can
scrounge up the money for a 6-month trial.

James




-- 
James Heilman
MD, CCFP-EM, Wikipedian
___
Wikimedia-l mailing list, guidelines at: 
https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and 
https://meta.wikimedia.org/wiki/Wikimedia-l
New messages to: Wikimedia-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, 


Re: [Wikimedia-l] Copyright workflows - research (Was: Re: Foundation management of volunteers)

2019-06-18 Thread Yann Forget
Hi,

Yes, James' pricing doesn't match the actual cost.
We do not need to check all images uploaded to Commons, only the suspicious
ones (small images without EXIF data).
If we check 2,000 images a day (more than enough IMO), that would cost $7 a
day at $3.50 per 1,000 queries, so about $210 a month.
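
As a rough illustration of the pre-filter described above, here is a
minimal sketch that flags small images without EXIF data and caps the
daily batch at 2,000. The Upload structure, the one-megapixel cut-off and
the function names are all assumptions made for the example; the thread
does not define "small" or say how uploads would be represented.

```python
from dataclasses import dataclass


@dataclass
class Upload:
    """Minimal description of an uploaded file (hypothetical structure)."""
    title: str
    width: int
    height: int
    has_exif: bool


# Assumed cut-off: the thread only says "small", without giving a number.
MAX_SMALL_PIXELS = 1_000_000


def is_suspicious(upload: Upload, max_pixels: int = MAX_SMALL_PIXELS) -> bool:
    """Flag small images that carry no EXIF data."""
    is_small = upload.width * upload.height <= max_pixels
    return is_small and not upload.has_exif


def select_for_check(uploads, daily_budget=2_000):
    """Return at most daily_budget suspicious uploads, in input order."""
    flagged = [u for u in uploads if is_suspicious(u)]
    return flagged[:daily_budget]
```

Only the files returned by select_for_check would be sent to a paid
reverse image search, which is what keeps the daily query count bounded.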

Regards,
Yann





-- 
Jai Jagat 2020 Grand March Coordination Team
https://www.jaijagat2020.org/
+91-74 34 93 33 58 (also WhatsApp)


Re: [Wikimedia-l] Copyright workflows - research (Was: Re: Foundation management of volunteers)

2019-06-17 Thread effe iets anders
The landscape has changed quite a bit since 2012, and there are a number of
players that could offer a service like this by now. It may be worthwhile
exploring them briefly (including but not limited to Google), if we believe
this is important enough to invest time in (and I agree that there are a
number of use cases from the community point of view at least).

Lodewijk



Re: [Wikimedia-l] Copyright workflows - research (Was: Re: Foundation management of volunteers)

2019-06-17 Thread James Salsman
Google has been offering reverse image search as part of their Vision API:

https://cloud.google.com/vision/docs/internet-detection

The pricing is $3.50 per 1,000 queries for up to 5,000,000 queries per month:

https://cloud.google.com/vision/pricing

Above that quantity "Contact Google for more information":

https://cloud.google.com/contact/
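
To make the quoted rate concrete, here is a minimal sketch of the
arithmetic. It uses only the flat $3.50 per 1,000 queries and the
5,000,000 queries per month ceiling given above; the function name, the
30-day month and the 2,000 queries a day in the example are assumptions
for illustration, and any free tier or other discount is ignored.

```python
# Flat rate quoted above: $3.50 per 1,000 queries,
# valid up to 5,000,000 queries per month.
PRICE_PER_1000_USD = 3.50
MONTHLY_QUERY_CAP = 5_000_000


def monthly_cost_usd(queries_per_day: int, days_per_month: int = 30) -> float:
    """Estimate the monthly bill for a given daily query volume."""
    if queries_per_day < 0 or days_per_month < 0:
        raise ValueError("query volume and day count must be non-negative")
    queries = queries_per_day * days_per_month
    if queries > MONTHLY_QUERY_CAP:
        # Above the cap the price list only says to contact Google,
        # so no figure can be computed here.
        raise ValueError("above 5,000,000 queries/month: price not published")
    return queries / 1000 * PRICE_PER_1000_USD


if __name__ == "__main__":
    # Illustrative volume only: 2,000 queries a day over a 30-day month.
    print(monthly_cost_usd(2_000))
```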




Re: [Wikimedia-l] Copyright workflows - research (Was: Re: Foundation management of volunteers)

2019-06-17 Thread James Forrester
On Mon, 17 Jun 2019 at 06:28, Yann Forget  wrote:

> It has been suggested many times to ask Google for an access to their API
> for searching images,
> so that we could have a bot tagging copyright violations (no free access
> for automated search).
> That would be the single best improvement in Wikimedia Commons workflow in
> years.
> And it would benefit all Wikipedia projects, big or small.
>

Yann,

As you should remember, we asked Google for API access to their reverse
image search system, years ago (maybe 2013?). They said that there isn't
such an API any more (they killed it off in ~2012, I think), and that they
wouldn't make a custom one for us. The only commercial alternative we found
at the time would have cost us approximately US$3m a month at the Commons
upload rate of the time, and when contacted, the provider said they wouldn't
offer any discounts for Wikimedia. Obviously, this is far too much for the
Foundation's budget (it would be even more now), and an inappropriate way
to spend donor funds. Providing the service in-house would involve building
a search index of the entire Internet's (generally non-free) images and
media, which would cost a fortune and is totally incompatible with the
mission of the movement. This was relayed to Commons volunteers at the
time, I'm pretty sure.

Obviously Google might have changed their mind, though it seems unlikely. I
imagine that Google engineers and product owners don't follow this list, so
it's unlikely that they will re-create the API without being asked directly.

J.
-- 
*James D. Forrester* (he/him or they/themself)
Wikimedia Foundation


Re: [Wikimedia-l] Copyright workflows - research (Was: Re: Foundation management of volunteers)

2019-06-17 Thread Yann Forget
Hi,

It has been suggested many times to ask Google for access to their API
for searching images,
so that we could have a bot tagging copyright violations (there is no free
access for automated search).
That would be the single best improvement in the Wikimedia Commons workflow
in years.
And it would benefit all Wikipedia projects, big or small.

Regards,
Yann


-- 
Jai Jagat 2020 Grand March Coordination Team
https://www.jaijagat2020.org/
+91-74 34 93 33 58 (also WhatsApp)


Re: [Wikimedia-l] Copyright workflows - research (Was: Re: Foundation management of volunteers)

2019-06-17 Thread Mister Thrapostibongles
Leila

Since I raised this particular issue, I'll take the liberty of giving an
answer to this question, even though you addressed it to Benjamin.  The
failure that I was pointing to was not the failure to identify copyright
violations, but the failure to address the huge backlog of probable
infringements identified at, for example,
https://en.wikipedia.org/wiki/Wikipedia:Contributor_copyright_investigations/2008
where there is a backlog of *thousands* of articles created by *one* user.  In
the absence of any coordinated management of the workload, at the current
rate of progress it will take about another decade to clear this single
case.  My analysis is that the pressing issue here is precisely that there
is no-one for whom this is a pressing issue: no-one is responsible for
clearing up the mess, and if there were, there are no resources available
to be allocated to it, and if there were, there is no way of deciding where
to allocate those resources.

Thrapostibongles



[Wikimedia-l] Copyright workflows - research (Was: Re: Foundation management of volunteers)

2019-06-17 Thread Leila Zia
Hi Benjamin,

My name is Leila and I'm on the Research team at the Wikimedia Foundation.
Please see below.

On Mon, Jun 17, 2019 at 12:59 AM Benjamin Lees  wrote:
>
> The community has been working on copyright violation issues for a long
> time.[2]  There are probably ways the WMF could support improvements in
> this area.  Maybe the WMF could even design some system that would
> magically solve the problem.  But it's certainly not the community standing
> in the way.

While I understand that you brought this up as one example within a
broader context and set of challenges, now that you have brought it
up, I'd like to ask you for specific guidance. Can you help me
understand, in your view, what are some of the most pressing issues on
this front from the perspective of those who work to detect and
address copyright violations? (Not knowing a lot about this space, my
first thought is to have better algorithms to detect copyright
violations in Wikipedia (?) text (?) across many languages. Is this
the most pressing issue?)

Some more info about how we work at the end of this email.[4]

Best,
Leila

> [1]
> https://en.wikipedia.org/wiki/Wikipedia:Autoconfirmed_article_creation_trial
> [2] https://en.wikipedia.org/wiki/Wikipedia:Copyright_violations#Resources
> Also consider
> https://lists.wikimedia.org/pipermail/wikimedia-l/2013-November/128777.html
> back in 2013.
[3] https://www.mediawiki.org/wiki/Wikimedia_Research/Formal_collaborations
[4]
To give you some more information about the context I operate in:

* Part of the work of our team is to listen to community conversations
in lists such as wikimedia-l to find research questions/directions to
work on. If we can understand the problem space clearly and define
research questions based on it, we can work on priorities with the
corresponding communities and start the research on these questions
ourselves or through our Formal Collaborations program [3].

* The types of problems that we can work (relatively) more quickly on
are those for which the output can be an API, data-set, or knowledge.

* We won't start the research based on hearing the most pressing
issues from you. If we see that based on your response there is a
promising direction for further research, we will follow up (with the
corresponding parts of the community involved in this space) to learn
more about the general and specific problems.
