Re: [Wikitech-l] captcha idea: proposal for gnome outreach for women 14

2014-03-03 Thread Happy Melon
On 28 February 2014 18:29, Brad Jorsch (Anomie) bjor...@wikimedia.org wrote:

 On Fri, Feb 28, 2014 at 12:07 PM, Mansi Gokhale gokhalemans...@gmail.com
 wrote:


 Then there's the issue of different interpretation. Take for example
 https://www.mediawiki.org/wiki/File:Find-all-captcha-idea.png. Is the
 second image wearing glasses? Or is that a lorgnette or something like
 opera glasses, both of which are held in front of the eyes rather than
 worn?

 https://www.mediawiki.org/wiki/File:Find-the-different-captcha-idea.png has
 a similar problem. The first image is the only one with a cigarette, and
 the only one with non-realistic coloring. The second is the only bald one,
 and the only one with something resembling a lorgnette, and the only one
 not looking in the general direction of the camera, and the only one with a
 book. The fourth is the only child. The sixth is the only obvious female
 (I'm not sure about the cat). The eighth is the only one smiling, and the
 only one with visible teeth.


I think this is oversimplifying.  Of course some people can interpret a
picture puzzle in slightly different ways - the whole *point* of a captcha
is to distinguish between the intuitive reasoning of a human and the
formulaic reasoning of a computer; if there was absolutely no ambiguity, it
would be a very poor captcha.  In exactly the same way that the letters on
a captcha will sometimes be distorted in such a way that humans genuinely
make a mistake, sometimes the questions in a picture puzzle can be
sufficiently distorted to the point that they are answered incorrectly.
The 'difficulty' of *any* captcha obviously needs to be carefully
calibrated to hit the sweet spot between mundanity and ambiguity.  But
putting out nine pictures of humans and one picture of a cat and asking for
the odd one out is no easier to misinterpret than a squiggle that might
be a G or might be a 6.

--HM
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] captcha idea: proposal for gnome outreach for women 14

2014-03-03 Thread Steven Walling
On Fri, Feb 28, 2014 at 6:29 PM, Brad Jorsch (Anomie) bjor...@wikimedia.org
 wrote:

 If you display 8 images and the user has to pick one, then even by random
 guessing the attacker has a 12.5% chance of passing the captcha. That's not
 good at all. Finding all matching is slightly better since it reduces the
 guessability (1/256 for 8 images), but still not very good. A traditional
 captcha using only A-Z is 1/308915776. To do as well with image picking,
 you'd need to ask the user to choose the matches from a set of about 28.
 Adding in numbers 2-9 is 1/1544804416, needing a set of about 31 images.

 The set of possible images also needs to be very large and the
 categorization private.

 https://www.mediawiki.org/wiki/Talk:Requests_for_comment/CAPTCHA#Issue:_image_classification_CAPTCHAs_need_a_secret_corpus
 goes into much more detail on this issue.


A recent example that springs to mind with image-based CAPTCHAs (instead of
text) is Snapchat's Find the Ghost, which is very fun for users and
apparently was broken very quickly.[1] A lot of times I hear people also
suggest we try a honeypot on login/signup instead of text-based CAPTCHAs,
and like the Snapchat example, one of the weaknesses here is just not
accounting for the fact that people will target popular sites/apps
directly. They'll inspect the DOM to find honeypots, they'll notice you use
the same logo shape and use computer vision to find that shape, etc.

However, it is not overstating it to say that the text-based CAPTCHA we use
now is the single most frustrating part of creating an account or logging
in (if you misremember your password, which users do all the time). To
quote one of our usability tests during the last login/signup redesign:
"This is ridiculous. I can't even see this."[2]

One simpler thing we might try and do right now is regenerate our current
pool of CAPTCHAs to make them a bit less hard to read. We've done this kind
of tweaking before without too much trouble I think?[3]

1. techcrunch.com/2014/01/21/snaptcha/
2. https://www.mediawiki.org/wiki/Account_creation_user_experience/User_testing
3. See bug 43546 which Aaron Schulz kindly took care of. He may be able to
elaborate more.

Re: [Wikitech-l] captcha idea: proposal for gnome outreach for women 14

2014-03-03 Thread Brad Jorsch (Anomie)
On Mon, Mar 3, 2014 at 6:05 AM, Happy Melon happy.melon.w...@gmail.com wrote:

 But
 putting out nine pictures of humans and one picture of a cat and asking for
 the odd one out is no easier to misinterpret than a squiggle that might
 be a G or might be a 6.


It seems to me that putting nine pictures of humans and one picture of a
cat is probably not much harder of a computer vision task than trying to
determine which letter a particular squiggle corresponds to, either. (And
that's leaving aside the fact that a 10% success rate for random guessing
seems pretty bad for a captcha.)

So naturally I thought that the real captchas would have a subtler level of
intended oddness, so that the possibility for unintended oddness to confuse
people would be greater.


-- 
Brad Jorsch (Anomie)
Software Engineer
Wikimedia Foundation

Re: [Wikitech-l] captcha idea: proposal for gnome outreach for women 14

2014-03-02 Thread Matthew Flaschen
I'm adding the design list.  I talked about this recently with a couple 
of the designers.


Matt Flaschen

On 02/28/2014 12:07 PM, Mansi Gokhale wrote:

Hello,

These are some approaches I can think of instead of a text-based captcha.

The image idea, where users are asked to spot the odd one out as
demonstrated, or to find all the similar images as mentioned here:
https://www.mediawiki.org/wiki/CAPTCHA

Also, a picture with a part chipped out could be shown, and chipped pieces
could be given as options, like finding the missing part of a jigsaw puzzle.

The image which would be shown is http://imgur.com/uefeb08

http://imgur.com/KEJqCg3 is the picture which would be the correct option.

The other options could be rotated versions of this, which would not be so
easy for a bot to match (unless it somehow ran some digital processing
algorithm and matched the color gradients or something like that).

This is a good option for people who do not know English or are illiterate
and maybe would not understand questions like "Is this a bird, a plane, or
Superman?" after being shown a picture.

Tell me what you think.

(Sorry for uploading those images to imgur. I don't know how to put them on
the wiki. Hope that is OK.)

I have also posted this on the CAPTCHA talk page:
https://www.mediawiki.org/wiki/Talk:CAPTCHA

Re: [Wikitech-l] captcha idea: proposal for gnome outreach for women 14

2014-02-28 Thread Arthur Richards
I think this is an intriguing approach - particularly for use cases on
mobile devices. We display captchas as necessary through MobileFrontend
when they are triggered, but the mobile experience is horrible (arguably
the whole captcha experience is horrible regardless of the medium, but
that's another conversation). As long as we need to surface captchas,
something non-text based, especially if it didn't require typing, would be
preferable.


On Fri, Feb 28, 2014 at 10:07 AM, Mansi Gokhale gokhalemans...@gmail.com wrote:

 Hello,

 These are some approaches I can think of instead of a text-based captcha.

 The image idea, where users are asked to spot the odd one out as
 demonstrated, or to find all the similar images as mentioned here:
 https://www.mediawiki.org/wiki/CAPTCHA

 Also, a picture with a part chipped out could be shown, and chipped pieces
 could be given as options, like finding the missing part of a jigsaw puzzle.

 The image which would be shown is http://imgur.com/uefeb08

 http://imgur.com/KEJqCg3 is the picture which would be the correct option.

 The other options could be rotated versions of this, which would not be so
 easy for a bot to match (unless it somehow ran some digital processing
 algorithm and matched the color gradients or something like that).

 This is a good option for people who do not know English or are illiterate
 and maybe would not understand questions like "Is this a bird, a plane, or
 Superman?" after being shown a picture.

 Tell me what you think.

 (Sorry for uploading those images to imgur. I don't know how to put them on
 the wiki. Hope that is OK.)

 I have also posted this on the CAPTCHA talk page:
 https://www.mediawiki.org/wiki/Talk:CAPTCHA




-- 
Arthur Richards
Software Engineer, Mobile
[[User:Awjrichards]]
IRC: awjr
+1-415-839-6885 x6687

Re: [Wikitech-l] Captcha Idea Proposal for GSOC 2014

2014-02-28 Thread Sumana Harihareswara
Hi and thanks for being interested in Wikimedia!

Please take a look at how your email looked to a lot of people:
http://imgur.com/4OuPSyN

(You can see it in our mailing list archives:
http://lists.wikimedia.org/pipermail/wikitech-l/2014-February/074812.html )

Could you re-send it with your numbered points separated better, so we can
read it?  Thanks!

Sumana Harihareswara
Engineering Community Manager
Wikimedia Foundation

Re: [Wikitech-l] Captcha Idea Proposal for GSOC 2014

2014-02-28 Thread Aalekh Nigam
I figured out the following ways we can approach the project:

1) Alphabetical order captcha: We can use HTML5's drag and drop API to sort
a particular set of images into categories. For example, in the demo
mentioned here, I made a collection of different words starting with the
letters A, B and C; as output, I grouped the words starting with A
separately from the words starting with B and C. I used text in this
example, but we can use images of different animals, such as cats and dogs,
and by drag and drop group the images of cats and those of dogs into
different categories.

2) Annotation captcha: We can use images with annotations from Commons,
determine the subcategory the annotations belong to, and then give relevant
options to the users. For example, in the file we can look up on Wikipedia
what the names of the different annotations correspond to (the names given
here are those of mountains) and then offer options that are relevant to
the image.

3) Effect captcha: We can use as the question an image altered by an effect
produced with PHP's GD library, then use the same file with another effect
and ask the user to match the two. For example, image1 can be used as the
question, asking the user to click on the image that matches it, and as the
answer we can give this spiral image of the original image. Similarly, we
can apply filters to different images to produce the other options, asking
the user for the right answer.

4) Direct captcha: We can ask the user direct questions, such as selecting
the cat out of options consisting of images of cats and humans. An example
by pginer demonstrates this.

5) Ask the user to click on a given effect: Ask the user to click on the
images with a spiral effect, out of options consisting of images with
spiral and other effects (for example, greyscale).

6) Drag and drop characters into the correct place: We can use the HTML5
drag and drop API to ask the user to form a particular letter or number out
of the provided pieces of characters. Here is an example that forms the
character A and the digit 8 out of the same pieces.
This drag and drop capability can be further enhanced to form particular
shapes, for example forming a clip art from a particular set of shapes. For
example, the image given here inserts the correct nose, as asked in the
question, out of the possible options provided.

Most importantly, I think creating an index system would be fruitful, since
it would rank inappropriate images on the basis of users' responses to a
given captcha (an image's rank goes down if the user needs to reload the
captcha). As time passes, this will leave us with relevant images which are
user friendly and equally secure to use.
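
A rough sketch of how such an index could work is below. It is only an
illustration: the class and method names are hypothetical, and the idea of
rewarding images from solved captchas is an added assumption, since the
proposal itself only says that a reload lowers an image's rank.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


class ImageIndex:
    """Hypothetical per-image score table; not part of any existing extension."""

    def __init__(self) -> None:
        self.scores: Dict[str, int] = defaultdict(int)

    def record_reload(self, shown: Iterable[str]) -> None:
        # From the proposal: a reload counts against every image that was shown.
        for image_id in shown:
            self.scores[image_id] -= 1

    def record_solved(self, shown: Iterable[str]) -> None:
        # Assumption (not in the proposal): a solved captcha counts in favour.
        for image_id in shown:
            self.scores[image_id] += 1

    def ranked(self, candidates: Iterable[str], min_score: int = 0) -> List[str]:
        """Return the candidates scoring at least min_score, best first."""
        kept = [c for c in candidates if self.scores[c] >= min_score]
        return sorted(kept, key=lambda c: self.scores[c], reverse=True)
```

Images whose score drops below the threshold simply stop being served, so
the pool drifts towards images that users rarely need to reload.
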
In addition, I sincerely appreciate the point made by Gmansi about creating
a jigsaw puzzle from the images, but in my view there should be a list of
some particular categories of images, and those ranked higher in the
indexing system would be used for the jigsaw puzzle.
As additional help, we can use Extension Assira to make our extension smarter.
Please give your valuable suggestions at
https://www.mediawiki.org/wiki/Talk:CAPTCHA so we can work to improve this
amazing project. :)

Thank you,
Aalekh Nigam
aalekhN

From: Aalekh Nigam <aalekh1...@rediffmail.com>
Sent: Fri, 28 Feb 2014 23:32:16
To: wikitech-l@lists.wikimedia.org <wikitech-l@lists.wikimedia.org>
Subject: Captcha Idea Proposal for GSOC 2014

Re: [Wikitech-l] captcha idea: proposal for gnome outreach for women 14

2014-02-28 Thread Brad Jorsch (Anomie)
On Fri, Feb 28, 2014 at 12:07 PM, Mansi Gokhale gokhalemans...@gmail.com wrote:

 The image idea where users are asked to spot the odd one out like
 demonstrated or find all the similar images like mentioned
 here: https://www.mediawiki.org/wiki/CAPTCHA


If you display 8 images and the user has to pick one, then even by random
guessing the attacker has a 12.5% chance of passing the captcha. That's not
good at all. Finding all matching is slightly better since it reduces the
guessability (1/256 for 8 images), but still not very good. A traditional
captcha using only A-Z is 1/308915776. To do as well with image picking,
you'd need to ask the user to choose the matches from a set of about 28.
Adding in numbers 2-9 is 1/1544804416, needing a set of about 31 images.
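
The figures above can be reproduced with a few lines of Python. This is
only a sketch under stated assumptions: the text captcha is taken to be 6
characters long (as clarified in a follow-up message in this thread), a
"find all matching" puzzle is modelled as an independent yes/no choice per
image, and the function names are hypothetical.

```python
import math


def pick_one_guess_chance(num_images: int) -> float:
    """Chance of passing by clicking one image at random."""
    return 1 / num_images


def find_all_search_space(num_images: int) -> int:
    """Number of possible selections when each image is either picked or not."""
    return 2 ** num_images


def text_search_space(alphabet_size: int, length: int) -> int:
    """Number of possible strings of the given length over the alphabet."""
    return alphabet_size ** length


def equivalent_image_count(search_space: int) -> float:
    """How many yes/no images give roughly the same number of possibilities."""
    return math.log2(search_space)


letters_only = text_search_space(26, 6)       # A-Z, 6 characters
letters_and_digits = text_search_space(34, 6)  # A-Z plus 2-9, 6 characters

print(pick_one_guess_chance(8))
print(find_all_search_space(8))
print(letters_only, equivalent_image_count(letters_only))
print(letters_and_digits, equivalent_image_count(letters_and_digits))
```

The image counts come out fractional (a little over 28 and about 30.5),
which is why the message says "about 28" and "about 31".
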

The set of possible images also needs to be very large and the
categorization private.
https://www.mediawiki.org/wiki/Talk:Requests_for_comment/CAPTCHA#Issue:_image_classification_CAPTCHAs_need_a_secret_corpus
goes into much more detail on this issue.

Then there's the issue of different interpretation. Take for example
https://www.mediawiki.org/wiki/File:Find-all-captcha-idea.png. Is the
second image wearing glasses? Or is that a lorgnette or something like
opera glasses, both of which are held in front of the eyes rather than worn?

https://www.mediawiki.org/wiki/File:Find-the-different-captcha-idea.png has
a similar problem. The first image is the only one with a cigarette, and
the only one with non-realistic coloring. The second is the only bald one,
and the only one with something resembling a lorgnette, and the only one
not looking in the general direction of the camera, and the only one with a
book. The fourth is the only child. The sixth is the only obvious female
(I'm not sure about the cat). The eighth is the only one smiling, and the
only one with visible teeth.

 Also a picture with a part chipped in could be shown and chipped pictures
 could be given as options like find the missing part from a jigsaw puzzle.


 The image which would be shown is http://imgur.com/uefeb08

 http://imgur.com/KEJqCg3 is the picture which would be the correct option.

 The other options could be rotated versions of this , which would not be
 so easy for the bot to match. (unless it somehow worked some digital
 processing algorithm and matched the color gradients or something like
 that).


That seems very simple for a computer to solve. Just find the option with
minimal difference along the join edges, which is probably easier than what
they already do for OCRing text captchas.
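
To make the edge-matching idea concrete, here is a minimal sketch. It
assumes a grayscale image stored as a list of rows, a square hole whose
position is known, and candidate pieces of the same size; all names are
hypothetical, and a real attack would more likely work on colour images
with an imaging library.

```python
from typing import List

Grid = List[List[int]]


def rotate90(piece: Grid) -> Grid:
    """Rotate a square grid 90 degrees clockwise."""
    return [list(row) for row in zip(*piece[::-1])]


def edge_mismatch(image: Grid, top: int, left: int, piece: Grid) -> int:
    """Sum of absolute differences between the piece's border pixels and
    the image pixels just outside the hole at (top, left)."""
    size = len(piece)
    total = 0
    for i in range(size):
        if top > 0:  # join along the top edge
            total += abs(piece[0][i] - image[top - 1][left + i])
        if top + size < len(image):  # bottom edge
            total += abs(piece[size - 1][i] - image[top + size][left + i])
        if left > 0:  # left edge
            total += abs(piece[i][0] - image[top + i][left - 1])
        if left + size < len(image[0]):  # right edge
            total += abs(piece[i][size - 1] - image[top + i][left + size])
    return total


def best_option(image: Grid, top: int, left: int, options: List[Grid]) -> int:
    """Index of the option with the smallest mismatch along the join edges."""
    scores = [edge_mismatch(image, top, left, piece) for piece in options]
    return scores.index(min(scores))
```

Rotated decoys tend to score badly because their borders no longer
continue the surrounding gradients, which is exactly the signal the bot
would use.
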


As far as captchas, I still think https://xkcd.com/810/ is the way to go.


-- 
Brad Jorsch (Anomie)
Software Engineer
Wikimedia Foundation

Re: [Wikitech-l] captcha idea: proposal for gnome outreach for women 14

2014-02-28 Thread Brad Jorsch (Anomie)
On Fri, Feb 28, 2014 at 1:29 PM, Brad Jorsch (Anomie) bjor...@wikimedia.org
 wrote:

 A traditional captcha using only A-Z is 1/308915776.


That should be a traditional *6 letter* captcha using only A-Z.

Sorry for the noise.

-- 
Brad Jorsch (Anomie)
Software Engineer
Wikimedia Foundation

Re: [Wikitech-l] Captcha Idea Proposal for GSOC 2014

2014-02-28 Thread Brad Jorsch (Anomie)
Your links didn't work at all, so I can't give specific comments.

On Fri, Feb 28, 2014 at 1:02 PM, Aalekh Nigam aalekh1...@rediffmail.com wrote:

 1) Alphabetical order captcha: We can use HTML5's drag and drop API to sort
 a particular set of images into categories. For example, in the demo
 mentioned here, I made a collection of different words starting with the
 letters A, B and C; as output, I grouped the words starting with A
 separately from the words starting with B and C. I used text in this
 example, but we can use images of different animals, such as cats and
 dogs, and by drag and drop group the images of cats and those of dogs
 into different categories.


What if someone thinks your picture of a dog is a "wolf", or "puppy", or
"hound", or "terrier", or "animal", etc.? Or what if the user identifies
your images in Spanish or Chinese rather than English, resulting in a
different order?

Also, how easy would it be for the spambot to download the entire list of
images+names and just brute force it?

And what are the bot's chances by randomly guessing? If there are 8 images
to sort, it's a 1/40320 chance. That isn't very good as far as captchas
go: 6 letters A-Z is 1/308915776.
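
As a quick check of the ordering figure, the sketch below counts the
possible orderings of n images and finds how many images would have to be
sorted to match a 6-letter A-Z captcha. It assumes every image has exactly
one correct position; the function names are hypothetical.

```python
import math


def ordering_search_space(num_items: int) -> int:
    """Number of distinct orderings of num_items images."""
    return math.factorial(num_items)


def items_needed(target_space: int) -> int:
    """Smallest number of images whose orderings reach target_space."""
    n = 1
    while math.factorial(n) < target_space:
        n += 1
    return n


print(ordering_search_space(8))   # orderings of 8 images
print(items_needed(26 ** 6))      # images needed to match 6 letters A-Z
```

Under these assumptions a user would have to sort 12 images correctly,
which is a lot to ask of a human.
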


 2) Annotation captcha: We can use images with annotations from Commons,
 determine the subcategory the annotations belong to, and then give
 relevant options to the users. For example, in the file we can look up on
 Wikipedia what the names of the different annotations correspond to (the
 names given here are those of mountains) and then offer options that are
 relevant to the image.


What's to stop the spambot from finding the image on Commons?

And looking at that category, are users really going to be able to reliably
identify the Fiat Grande Punto in
https://commons.wikimedia.org/wiki/File:%22_01_-_ITALY_-_ALFA_ROMEO_SPIDER_SILVER_15.jpg,
or figure out WTF UP 5 and UP 6 are supposed to be in
https://commons.wikimedia.org/wiki/File:%22_12_-_ITALY_-_Serie_UP_di_Gaetano_Pesce_UP_5_e_6_al_Triennale_Design_Museum_di_Milano_4.jpg,
or Colli Euganei in
https://commons.wikimedia.org/wiki/File:%22_12_-_ITALY-_Sunset_in_Cavarzere_08.JPG,
or identify the birds by scientific name in
https://commons.wikimedia.org/wiki/File:-_Plastic_boxes_-.jpg, or guess
which chloroplast (in German!) to pick in
https://commons.wikimedia.org/wiki/File:03-10_Mnium2.jpg?



 3) Effect captcha: We can use as the question an image altered by an
 effect produced with PHP's GD library, then use the same file with another
 effect and ask the user to match the two. For example, image1 can be used
 as the question, asking the user to click on the image that matches it,
 and as the answer we can give this spiral image of the original image.
 Similarly, we can apply filters to different images to produce the other
 options, asking the user for the right answer.


Spambots already solve this sort of thing when OCRing text captchas.


 4) Direct captcha: We can ask the user direct questions, such as selecting
 the cat out of options consisting of images of cats and humans. An example
 by pginer demonstrates this.


I just replied to this idea at
http://lists.wikimedia.org/pipermail/wikitech-l/2014-February/074816.html


 5) Ask the user to click on a given effect: Ask the user to click on the
 images with a spiral effect, out of options consisting of images with
 spiral and other effects (for example, greyscale).


That requires that people actually know what the effects' names are, which
doesn't seem particularly accessible. And again, OCRing is probably harder
for bots.


 6) Drag and drop characters into the correct place: We can use the HTML5
 drag and drop API to ask the user to form a particular letter or number
 out of the provided pieces of characters. Here is an example that forms
 the character A and the digit 8 out of the same pieces. This drag and drop
 capability can be further enhanced to form particular shapes, for example
 forming a clip art from a particular set of shapes. For example, the image
 given here inserts the correct nose, as asked in the question, out of the
 possible options provided.
 Most importantly, I think creating an index system would be fruitful,
 since it would rank inappropriate images on the basis of users' responses
 to a given captcha (an image's rank goes down if the user needs to reload
 the captcha). As time passes, this will leave us with relevant images
 which are user friendly and equally secure to use.
 In addition, I sincerely appreciate the point made by Gmansi about
 creating a jigsaw puzzle from the images, but in my view there should be a
 list of some particular categories of images, and those ranked higher in
 the indexing system would be used for the jigsaw puzzle.
 As additional help, we can use Extension Assira to make our extension
 smarter. Please give your valuable suggestions as we can work to improve
 this amazing project. at

Re: [Wikitech-l] Captcha filter list

2014-01-04 Thread Nathan Larson
On Wed, Jan 1, 2014 at 2:31 AM, Benjamin Lees emufarm...@gmail.com wrote:

 On Wed, Jan 1, 2014 at 2:00 AM, Tim Landscheidt t...@tim-landscheidt.de
 wrote:
 I checked out the registration form for a white supremacist forum, and they
 just use reCAPTCHA.  No doubt they'll be developing a CAPTJCA or CAPTMCA
 soon enough.


Conversely, one could use an empathy CAPTCHA to try to screen out that
type of crowd (although it might merely screen out the politically
incorrect). http://www.wired.com/threatlevel/2012/10/empathy-captcha/

Re: [Wikitech-l] Captcha filter list

2014-01-01 Thread Nathan Larson
On Wed, Jan 1, 2014 at 2:31 AM, Benjamin Lees emufarm...@gmail.com wrote:

 I checked out the registration form for a white supremacist forum, and they
 just use reCAPTCHA.  No doubt they'll be developing a CAPTJCA or CAPTMCA
 soon enough.

 There are likely a number of strings that should be added to the default
 blacklist.  Patches welcome!


Yes, sites that want to screen out users with certain political, religious,
etc. sensibilities could, with some small tweaks to the code, create a
ConfirmEdit whitelist that causes *all* CAPTCHAs to contain words
calculated to offend those groups. Actually, with QuestyCaptcha, this is
already possible. One could include in $wgCaptchaQuestions answers that
require the user to type in strings denigrating certain groups or deities;
swearing allegiance to certain causes or entities; or stating that one has
certain stigmatized attractions, preferences or desires or engages in
certain stigmatized behaviors. One could also use it to screen for a
certain level of intelligence, knowledge, etc. in a given field by asking
questions only certain people would be able to answer. The possibilities
are endless once one's main priority becomes to repel or screen out, rather
than attract, most potential users.

Re: [Wikitech-l] Captcha filter list

2013-12-31 Thread Steven Walling
On Tue, Dec 31, 2013 at 7:05 PM, George Herbert george.herb...@gmail.com wrote:

 Just got a report and screenshot that a new user got this string for their
 captcha on en.wikipedia nigerblew

 http://snag.gy/JpSUR.jpg

 Though several people are pointing out that Niger is a country, I think
 it's reasonable to try and avoid things close to the two-g version of that
 word; nobody's denigrated by avoiding possibly offensive if misinterpreted
 words.  I recall there's a filter list?...


I don't know if there's a filter list, but this has happened before. I've
seen many of our past and current CAPTCHAs when we were testing the
signup/login redesign. I recall seeing both headshits and obamadick.

The CAPTCHA is two randomly generated English words. Personally, I think it
might just be good to regenerate the CAPTCHAs entirely from time to time.
AFAIK it's not that hard and our CAPTCHAs are weak anyway.

Steven

Re: [Wikitech-l] Captcha filter list

2013-12-31 Thread Benjamin Lees
There's a blacklist that has been included with FancyCaptcha for a few
months, although I don't know whether it's the same as the one the WMF uses.

See https://bugzilla.wikimedia.org/show_bug.cgi?id=21025 and the associated
patches.

Re: [Wikitech-l] Captcha filter list

2013-12-31 Thread Tyler Romeo
It's a CAPTCHA, not an article or piece of actual content. If people are
actually getting offended by randomly generated CAPTCHAs I think they need
to find something more worthwhile to complain about.

-- 
Tyler Romeo
On Jan 1, 2014 12:27 AM, Benjamin Lees emufarm...@gmail.com wrote:

 There's a blacklist that has been included with FancyCaptcha for a few
 months, although I don't know whether it's the same as the one the WMF
 uses.

 See https://bugzilla.wikimedia.org/show_bug.cgi?id=21025 and the
 associated
 patches.

Re: [Wikitech-l] Captcha filter list

2013-12-31 Thread George Herbert
Tyler, websites everywhere blacklist offensive words (and with some
regularity, look and sound-alikes)  from the random captcha generator...

I don't personally care, you don't, but if we offend people needlessly it's
an oops.  We need some elements of the site to meet Lowest Common
Denominator rather than most enlightened participant.

That all said, it is possible the screenshot was a fake; the brand new user
posted two things on-wiki; one a faked source for a BBC actress article,
and two the screenshot for the captcha.

On-wiki conclusion was we were trolled.  Not sure if that means the
screenshot was a photoshop job or a real one, whether that was part of the
troll or not.  But the actual edit was a good solid troll.




On Tue, Dec 31, 2013 at 9:30 PM, Tyler Romeo tylerro...@gmail.com wrote:

 It's a CAPTCHA, not an article or piece of actual content. If people are
 actually getting offended by randomly generated CAPTCHAs I think they need
 to find something more worthwhile to complain about.

 --
 Tyler Romeo
 On Jan 1, 2014 12:27 AM, Benjamin Lees emufarm...@gmail.com wrote:

  There's a blacklist that has been included with FancyCaptcha for a few
  months, although I don't know whether it's the same as the one the WMF
  uses.
 
  See https://bugzilla.wikimedia.org/show_bug.cgi?id=21025 and the
  associated
  patches.




-- 
-george william herbert
george.herb...@gmail.com

Re: [Wikitech-l] Captcha filter list

2013-12-31 Thread Marc A. Pelletier
On 01/01/2014 12:34 AM, George Herbert wrote:
 Tyler, websites everywhere blacklist offensive words (and with some
 regularity, look and sound-alikes)  from the random captcha generator...

Yes, and then we end up with the Scunthorpe problem instead.

I agree it's a little bit silly, and also a losing proposition; even
trying to filter for /actual/ cuss words is hard enough (because the
list of word/fragments someone *might* find offensive is boundless); if
we try to also block misspelling, lookalikes or cognates we might as
well block /^[a-z]*$/.

-- Marc



Re: [Wikitech-l] Captcha filter list

2013-12-31 Thread Benjamin Lees
On Wed, Jan 1, 2014 at 12:42 AM, Marc A. Pelletier m...@uberbox.org wrote:

 Yes, and then we end up with the Scunthorpe problem instead.


The Scunthorpe problem is not actually a problem here, because we're just
limiting the CAPTCHAs we serve to users, not filtering their input.
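
A minimal sketch of that generation-time filtering follows. It is not the
ConfirmEdit implementation: the word list, the blacklist entry "spam" and
the function names are all hypothetical stand-ins. It assumes the captcha
is two dictionary words joined together, so the check runs on the joined
string to catch matches that straddle the word boundary.

```python
import random
from typing import List, Sequence


def is_acceptable(candidate: str, blacklist: Sequence[str]) -> bool:
    """Reject a candidate if any blacklisted string occurs anywhere in it."""
    lowered = candidate.lower()
    return not any(bad in lowered for bad in blacklist)


def generate_captcha(words: List[str], blacklist: Sequence[str],
                     rng: random.Random, max_tries: int = 1000) -> str:
    """Join two random words, retrying until the result passes the filter."""
    for _ in range(max_tries):
        candidate = rng.choice(words) + rng.choice(words)
        if is_acceptable(candidate, blacklist):
            return candidate
    raise RuntimeError("no acceptable captcha found")


words = ["wasp", "amber", "tree"]
blacklist = ["spam"]  # harmless placeholder for a real blacklist entry
print(is_acceptable("waspamber", blacklist))  # "spam" spans the word boundary
print(generate_captcha(words, blacklist, random.Random(0)))
```

A substring false positive here only removes one combination from the
pool of captchas served, which is cheap compared with rejecting something
a user typed.
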

Re: [Wikitech-l] Captcha filter list

2013-12-31 Thread Tim Landscheidt
Marc A. Pelletier m...@uberbox.org wrote:

 Tyler, websites everywhere blacklist offensive words (and with some
 regularity, look and sound-alikes)  from the random captcha generator...

 Yes, and then we end up with the Scunthorpe problem instead.

 I agree it's a little bit silly, and also a losing proposition; even
 trying to filter for /actual/ cuss words is hard enough (because the
 list of word/fragments someone *might* find offensive is boundless); if
 we try to also block misspelling, lookalikes or cognates we might as
 well block /^[a-z]*$/.

Not only that, the selection of blacklisted words may be
offensive itself.  For example,
https://git.wikimedia.org/blob/mediawiki%2Fextensions%2FConfirmEdit/master/blacklist
protects the Christian god with two entries, while Allah is
up for ridicule.  And it might even require Jews to type in
the tetragrammaton and thus effectively ban them from a
site.

IMHO if someone is offended by a captcha, they should click
on reload.

Tim



Re: [Wikitech-l] Captcha filter list

2013-12-31 Thread Benjamin Lees
On Wed, Jan 1, 2014 at 2:00 AM, Tim Landscheidt t...@tim-landscheidt.de wrote:

 Not only that, the selection of blacklisted words may be
 offensive itself.


That doesn't seem like a problem, since the list isn't visible in the user
interface.  Users will not be complaining that they aren't receiving
niggardly, cocky, and cumin in CAPTCHAs.


 For example,

 https://git.wikimedia.org/blob/mediawiki%2Fextensions%2FConfirmEdit/master/blacklist
 protects the Christian god with two entries, while Allah is
 up for ridicule.  And it might even require Jews to type in
 the tetragrammaton and thus effectively ban them from a
 site.


I checked out the registration form for a white supremacist forum, and they
just use reCAPTCHA.  No doubt they'll be developing a CAPTJCA or CAPTMCA
soon enough.

There are likely a number of strings that should be added to the default
blacklist.  Patches welcome!

Re: [Wikitech-l] CAPTCHA

2013-03-22 Thread Platonides
On 21/03/13 08:05, Federico Leva (Nemo) wrote:
 Restrictive wikis for captchas are only a handful (plus pt.wiki which is
 in permanent emergency mode).
 https://meta.wikimedia.org/wiki/Newly_registered_user
 For them you could request confirmed flag at
 https://meta.wikimedia.org/wiki/SRP
 Personally I found it easier to do the required 10, 50 or whatever edits
 on a userpage. 5 min at most and you're done.
 
 Nemo

Their problem is likely that their accounts are new, not that those
wikis additionally require a minimum number of edits (only a handful of
wikis have that).



Re: [Wikitech-l] CAPTCHA

2013-03-22 Thread Steven Walling
On Wed, Mar 20, 2013 at 8:23 PM, James Heilman jmh...@gmail.com wrote:

 Hey All

 I have someone helping me add translation done by Translators Without
 Borders of key medical articles. An issue that slows the work is that
 many languages require CAPTCHA to save the edits. Is there any way
 around this (i.e. to get an account confirmed in all languages)?


This doesn't quite solve your problem, but one enhancement that may reduce
frustration is the addition of a refresh button on the CAPTCHA (
https://bugzilla.wikimedia.org/show_bug.cgi?id=14230).

This is slowly but surely being worked on at
https://gerrit.wikimedia.org/r/#/c/44376/

Steven

Re: [Wikitech-l] CAPTCHA

2013-03-21 Thread Federico Leva (Nemo)
Restrictive wikis for captchas are only a handful (plus pt.wiki which is 
in permanent emergency mode). 
https://meta.wikimedia.org/wiki/Newly_registered_user
For them you could request confirmed flag at 
https://meta.wikimedia.org/wiki/SRP
Personally I found it easier to do the required 10, 50 or whatever edits 
on a userpage. 5 min at most and you're done.


Nemo


Re: [Wikitech-l] CAPTCHA

2013-03-20 Thread Tyler Romeo
Technically it should be possible. I believe there's a Request for
permissions page or something of the sort on meta-wiki for this purpose.
Somebody with more knowledge can correct me if I'm wrong.

*-- *
*Tyler Romeo*
Stevens Institute of Technology, Class of 2015
Major in Computer Science
www.whizkidztech.com | tylerro...@gmail.com


On Wed, Mar 20, 2013 at 11:23 PM, James Heilman jmh...@gmail.com wrote:

 Hey All

 I have someone helping me add translation done by Translators Without
 Borders of key medical articles. An issue that slows the work is that
 many languages require CAPTCHA to save the edits. Is there any way
 around this (i.e. to get an account confirmed in all languages)?

 Project is here

 http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Medicine/Translation_task_force/RTT

 --
 James Heilman
 MD, CCFP-EM, Wikipedian

 The Wikipedia Open Textbook of Medicine
 www.opentextbookofmedicine.com


Re: [Wikitech-l] Captcha for non-English speakers II

2012-08-02 Thread Pau Giner
I made some mockups to illustrate some of the ideas on captchas that could
be less problematic for non-English speakers, improve the general UX and
rely on images from commons.

- Panorama captcha:
http://commons.wikimedia.org/wiki/File:Panorama-captcha-idea.png
Based on tagging parts of a panorama picture with the appropriate word (in
the UI language or Basic English words).

- 'Who is who' captcha:
http://commons.wikimedia.org/wiki/File:Find-all-captcha-idea.png
Based on finding, from a set of similar images, the ones that fit a specific
criterion (with an image also describing the criterion).

- 'Find the different' captcha:
http://commons.wikimedia.org/wiki/File:Find-the-different-captcha-idea.png
Based on finding the image that is different from a set of images.


These captchas will probably generate new problems on the technical side,
require adjustments to reduce the chance of a machine solving them, or may
just be unfeasible to generate, but I wanted to provide these ideas in case
anybody else can use them as a base, improve on any technical weaknesses
they may have, and make them at least as hard for a machine to solve as
text-based captchas are.

Pau

On Wed, Aug 1, 2012 at 5:53 PM, Helder . helder.w...@gmail.com wrote:

 On Thu, Jul 26, 2012 at 10:53 AM, Everton Zanella Alvarenga
 ezalvare...@wikimedia.org wrote:
  After working on campus with new
  editors in Brazil, I've confirmed that this is a real obstacle, since most
  people here cannot read English at all.
 
  I'd like to know if there are plans to solve this issue - I hope I
  don't sound rude, maybe this can be a minor issue when we don't see
  the difficulties people from a different place can face. I think this
  is important for Wikipedias other than the English one (just read
  people's comments in the bug) and we may be losing new contributors
  because of their first impressions. Thanks,

 It should be noted that this is the only Wikipedia where the captcha
 is triggered for any edits made by anonymous or unconfirmed users[1],
 not just for edits which add urls.
 So, those users are affected by any issues of captcha on ALL their edits.

 Best regards,
 Helder

 [1]
 https://gerrit.wikimedia.org/r/gitweb?p=operations/mediawiki-config.git;a=blob;f=wmf-config/InitialiseSettings.php;h=5dd1bb63d399f0ba7bc49a4656dfed9109b550da;hb=HEAD#l8470





-- 
Pau Giner
Interaction Designer
Wikimedia Foundation


Re: [Wikitech-l] Captcha for non-English speakers II

2012-08-02 Thread Steven Walling
On Thu, Aug 2, 2012 at 1:59 AM, Pau Giner pgi...@wikimedia.org wrote:

 I made some mockups to illustrate some of the ideas on captchas that could
 be less problematic for non-English speakers, improve the general UX and
 rely on images from commons.

 - Panorama captcha:
 http://commons.wikimedia.org/wiki/File:Panorama-captcha-idea.png
 Based on tagging parts of a panorama picture with the appropriate word (in
 the UI language or Basic English words).

 - 'Who is who' captcha:
 http://commons.wikimedia.org/wiki/File:Find-all-captcha-idea.png
 Based on finding, from a set of similar images, the ones that fit a specific
 criterion (with an image also describing the criterion).

 - 'Find the different' captcha:
 http://commons.wikimedia.org/wiki/File:Find-the-different-captcha-idea.png
 Based on finding the image that is different from a set of images.


 These captchas will probably generate new problems on the technical side,
 require adjustments to reduce the chance of a machine solving them, or may
 just be unfeasible to generate, but I wanted to provide these ideas in case
 anybody else can use them as a base, improve on any technical weaknesses
 they may have, and make them at least as hard for a machine to solve as
 text-based captchas are.


Thanks Pau, that's really helpful. :)

Since we've progressed from just the idea stage to mockups, but we still
have a lot of different options, I've started an RfC on MediaWiki.org where
we can list all the issues and potential solutions. I don't personally
think we need to come to a consensus right now, but CAPTCHAs are going to
keep coming up even if no action is taken in the short term, so we should
document all our ideas.

https://www.mediawiki.org/wiki/Requests_for_comment/CAPTCHA

Steven


Re: [Wikitech-l] Captcha for non-English speakers II

2012-08-01 Thread Risker
On 31 July 2012 22:25, MZMcBride z...@mzmcbride.com wrote:

 Risker wrote:
  Putting on my checkuser hat for a moment - yes, please please look at
  finding a different CAPTCHA process - the cross-wiki spamming by bots that
  are able to break the CAPTCHA is becoming overwhelming.  This issue has
  been reported separately, and there may be a different fix, but this is a
  pretty big deal as a few hundred volunteer hours a month are going into the
  despamming effort.

 Reported separately where?


Bugzilla, I understand, by one or more of the checkusers


 CAPTCHAs were designed for test if you're human, not test if you're
 spam. It's a wonder they've worked this long. I imagine better anti-spam
 tools are needed (which may be a new extension, new AbuseFilter filters,
 better user scripts, etc.).


They're spambots and thus both non-human and spammy.


 If the situation is as dire as it sounds, it shouldn't be difficult to find
 a few resources to throw at the problem. In a discussion like this, examples
 of particular problematic behavior (links!) are always most helpful to
 developers, I've found. This is the bad behavior we're seeing and want to
 stop. How should we do that? :-)


I've been advised some of WMF's best are already working on it; however, if
you want to see examples you could look at steward Billinghurst's block
logs, mostly on non-English wikis.

Risker/Anne


Re: [Wikitech-l] Captcha for non-English speakers II

2012-08-01 Thread Chris Steipp
On Wed, Aug 1, 2012 at 4:30 AM, Risker risker...@gmail.com wrote:

 If the situation is as dire as it sounds, it shouldn't be difficult to find
 a few resources to throw at the problem. In a discussion like this, examples
 of particular problematic behavior (links!) are always most helpful to
 developers, I've found. This is the bad behavior we're seeing and want to
 stop. How should we do that? :-)


 I've been advised some of WMF's best are already working on it; however, if
 you want to see examples you could look at steward Billinghurst's block
 logs, mostly on non-English wikis.

There are indeed a few of us working to improve the tools that we're
using to prevent, detect, and react/recover from spam on WMF wikis.
None of our tools are going to be perfect, and that's why we need all
approaches.

Preventing spam bots from editing using captchas is just one tool that
we're using to prevent spam, so it's not at the top of my list for
development projects. But this conversation has been helpful. If there
are any ways that we can make captchas easier for legitimate users,
prevent more bots, and decrease the amount of spam that AbuseFilter
has to catch, then it's a win for us. If there is strong consensus for
ways to improve our captchas, then I think we can certainly add it to
our list of projects and prioritize it with our available resources
(or help find volunteers to implement).



Re: [Wikitech-l] Captcha for non-English speakers II

2012-08-01 Thread Helder .
On Thu, Jul 26, 2012 at 10:53 AM, Everton Zanella Alvarenga
ezalvare...@wikimedia.org wrote:
 After working on campus with new
 editors in Brazil, I've confirmed that this is a real obstacle, since most
 people here cannot read English at all.

 I'd like to know if there are plans to solve this issue - I hope I
 don't sound rude, maybe this can be a minor issue when we don't see
 the difficulties people from a different place can face. I think this
 is important for Wikipedias other than the English one (just read
 people's comments in the bug) and we may be losing new contributors
 because of their first impressions. Thanks,

It should be noted that this is the only Wikipedia where the captcha
is triggered for any edits made by anonymous or unconfirmed users[1],
not just for edits which add urls.
So, those users are affected by any issues of captcha on ALL their edits.
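
The difference between the two trigger policies can be sketched as follows.
This is only an illustration in Python: the function names, parameters, and
URL pattern are hypothetical and are not ConfirmEdit's actual configuration
or code.

```python
import re

# Simplistic URL matcher, assumed for illustration only.
URL_PATTERN = re.compile(r"https?://[^\s\]]+")


def added_urls(old_text, new_text):
    """Return the URLs present in the new text but not in the old text."""
    return set(URL_PATTERN.findall(new_text)) - set(URL_PATTERN.findall(old_text))


def needs_captcha(old_text, new_text, confirmed, trigger_on_every_edit):
    """Decide whether an edit should be challenged with a captcha."""
    if confirmed:
        return False  # confirmed users are never challenged
    if trigger_on_every_edit:
        return True  # the policy described above for this one wiki
    return bool(added_urls(old_text, new_text))  # the usual policy


# An anonymous typo fix: challenged only under the "every edit" policy.
print(needs_captcha("teh cat", "the cat", confirmed=False, trigger_on_every_edit=False))
print(needs_captcha("teh cat", "the cat", confirmed=False, trigger_on_every_edit=True))
```

Under the second policy every captcha usability problem is multiplied by the
number of edits a new user makes.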

Best regards,
Helder

[1] 
https://gerrit.wikimedia.org/r/gitweb?p=operations/mediawiki-config.git;a=blob;f=wmf-config/InitialiseSettings.php;h=5dd1bb63d399f0ba7bc49a4656dfed9109b550da;hb=HEAD#l8470



Re: [Wikitech-l] Captcha for non-English speakers II

2012-07-31 Thread Tei
It sounds like captchas are something you want to make plug and play,
using some external project that evolves quickly enough to stay on the
winning side of an arms race.
It also sounds like captchas are something you want handled by
locals, to avoid situations like a Chinese wiki with an English captcha.

It is pretty much proven that small self-made captchas don't work for
something like MediaWiki, because it is a huge, delicious target and
attackers go after it.


As people's experience with AI and available computing power rise, perhaps
this will become a lost battle*. The other options are that anons can't edit
articles, that anon edits stay invisible while waiting for moderation, or
that anon changes are sanitized in some way (perhaps by not allowing new
external links or modifying links).


* I can imagine the ability of bots to understand captchas will grow,
but not the ability of humans.


-- 
--
ℱin del ℳensaje.


Re: [Wikitech-l] Captcha for non-English speakers II

2012-07-31 Thread James Forrester
On 30 July 2012 15:22, Platonides platoni...@gmail.com wrote:
 On 30/07/12 15:28, Pau Giner wrote:
 From the UX perspective, a captcha is always an obstacle for the
 interaction flow.

 I agree. But when you're spammed to death if there's no captcha,
 you end up accepting it as a necessary evil.

Just to jump in here, it's not actually clear that our CAPTCHAs work
at all at this point (per Tim's e-mail from last year of being able to
robotically break ours 75% of the time).

On https://www.mediawiki.org/wiki/Admin_tools_development (created
last week), we in WMF Engineering noted that we'd want to look
properly at some data around these CAPTCHAs and how they're working.
This might show us that it would be sensible to just turn them off
(which of course would help usability for all users), as long as we're
happy that the tools for preventing the vandalism they were intended
to stop are working well.

Yours,
-- 
James D. Forrester
Product Manager for Visual Editor and Flagged Revisions
Wikimedia Foundation, Inc.

jforres...@wikimedia.org | @jdforrester | +1 415-839-6885 x6844



Re: [Wikitech-l] Captcha for non-English speakers II

2012-07-31 Thread Risker
On 31 July 2012 13:53, James Forrester jforres...@wikimedia.org wrote:

 On 30 July 2012 15:22, Platonides platoni...@gmail.com wrote:
  On 30/07/12 15:28, Pau Giner wrote:
  From the UX perspective, a captcha is always an obstacle for the
  interaction flow.
 
  I agree. But when you're spammed to death if there's no captcha,
  you end up accepting it as a necessary evil.

 Just to jump in here, it's not actually clear that our CAPTCHAs work
 at all at this point (per Tim's e-mail from last year of being able to
 robotically break ours 75% of the time).

 On https://www.mediawiki.org/wiki/Admin_tools_development (created
 last week), we in WMF Engineering noted that we'd want to look
 properly at some data around these CAPTCHAs and how they're working.
 This might show us that it would be sensible to just turn them off
 (which of course would help usability for all users), as long as we're
 happy that the tools for preventing the vandalism they were intended
 to stop are working well.

 Yours,
 -


Putting on my checkuser hat for a moment - yes, please please look at
finding a different CAPTCHA process - the cross-wiki spamming by bots that
are able to break the CAPTCHA is becoming overwhelming.  This issue has
been reported separately, and there may be a different fix, but this is a
pretty big deal as a few hundred volunteer hours a month are going into the
despamming effort.

Risker/Anne


Re: [Wikitech-l] Captcha for non-English speakers II

2012-07-31 Thread Platonides
On 31/07/12 19:53, James Forrester wrote:
 I agree. But when you're spammed to death if there's no captcha,
 you end up accepting it as a necessary evil.
 
 Just to jump in here, it's not actually clear that our CAPTCHAs work
 at all at this point (per Tim's e-mail from last year of being able to
 robotically break ours 75% of the time).
 
 On https://www.mediawiki.org/wiki/Admin_tools_development (created
 last week), we in WMF Engineering noted that we'd want to look
 properly at some data around these CAPTCHAs and how they're working.
 This might show us that it would be sensible to just turn them off
 (which of course would help usability for all users), as long as we're
 happy that the tools for preventing the vandalism they were intended
 to stop are working well.
 
 Yours,

I went to a certain site I was recently pointed to.
The site is plain MediaWiki. No antispam extensions installed. Bot ips
weren't blocked either. Bots seem to have been editing a single article.

Article created in 2011
10 July 2012: First vandalism edit. Page replaced with gibberish,
including gibberish links. This looks like a test to see if it is
patrolled or not.
On 2012-07-15 they start replacing with working domains and keywords.
From 2012-07-15 to 2012-07-30 there are 500-600 spammy edits *per day*.
Today (2012-07-31) the edit count rose to 1643 edits. That's a rate of
1.14 edits per minute!
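
As a quick check of that rate, assuming the 1643 edits are spread over a
full 24-hour day:

```python
edits_today = 1643
minutes_per_day = 24 * 60  # 1440 minutes

rate = edits_today / minutes_per_day
print(f"{rate:.2f} edits per minute")  # 1.14 edits per minute
```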

Those look like generic bots, though. SimpleAntiSpam or MathCaptcha may
be able to stop them. It may be worth preparing some honeypots for them
and observing their behavior.

Our wikis are much better protected, though. Any such bot would be
blocked, the article protected, the ips added to the SpamBlacklist, and
an EditFilter written to autoblock it every time.
But it is useful to see the sharks that are out there. And even with
many wikignomes, they can easily get overwhelmed when trying to stop it
the first time.

Regards




Re: [Wikitech-l] Captcha for non-English speakers II

2012-07-31 Thread MZMcBride
Risker wrote:
 Putting on my checkuser hat for a moment - yes, please please look at
 finding a different CAPTCHA process - the cross-wiki spamming by bots that
 are able to break the CAPTCHA is becoming overwhelming.  This issue has
 been reported separately, and there may be a different fix, but this is a
 pretty big deal as a few hundred volunteer hours a month are going into the
 despamming effort.

Reported separately where?

CAPTCHAs were designed for "test if you're human," not "test if you're
spam." It's a wonder they've worked this long. I imagine better anti-spam
tools are needed (which may be a new extension, new AbuseFilter filters,
better user scripts, etc.).

If the situation is as dire as it sounds, it shouldn't be difficult to find
a few resources to throw at the problem. In a discussion like this, examples
of particular problematic behavior (links!) are always most helpful to
developers, I've found. This is the bad behavior we're seeing and want to
stop. How should we do that? :-)

MZMcBride





Re: [Wikitech-l] Captcha for non-English speakers II

2012-07-31 Thread Dmitriy Sintsov



On 1 August 2012 at 6:26:02, MZMcBride (z...@mzmcbride.com) wrote:

Risker wrote:
 Putting on my checkuser hat for a moment - yes, please please look at
 finding a different CAPTCHA process - the cross-wiki spamming by bots that
 are able to break the CAPTCHA is becoming overwhelming.  This issue has
 been reported separately, and there may be a different fix, but this is a
 pretty big deal as a few hundred volunteer hours a month are going into the
 despamming effort.

Reported separately where?

CAPTCHAs were designed for test if you're human, not test if you're
spam. It's a wonder they've worked this long. I imagine better anti-spam
tools are needed (which may be a new extension, new AbuseFilter filters,
better user scripts, etc.).

If the situation is as dire as it sounds, it shouldn't be difficult to find
a few resources to throw at the problem. In a discussion like this, examples
of particular problematic behavior (links!) are always most helpful to
developers, I've found. This is the bad behavior we're seeing and want to
stop. How should we do that? :-)


How about an image-based approach?
http://en.wikipedia.org/wiki/CAPTCHA#Interaction_with_images_as_an_alternative_to_texting_.28text_typing.29
http://www.picatcha.com/captcha/
Dmitriy


Re: [Wikitech-l] Captcha for non-English speakers II

2012-07-30 Thread Pau Giner
From the UX perspective, a captcha is always an obstacle for the
interaction flow.
Reducing the complexity of user interaction when solving the captcha can
benefit all kinds of users but also solve problems for non-English speakers.

Checkbox and honeypot-based captchas avoid most of the problems of
text-based captchas since interaction is simplified to the minimum for the
user:
http://uxmovement.com/forms/captchas-vs-spambots-why-the-checkbox-captcha-wins/


Simple questions where the user can select an answer (not type) will solve
some of the input-related issues for non-English speakers.
These questions can be of different kinds (e.g., Which one does not belong
to the group: Red, Green, Skateboard, Blue?, Is fire hot or cold?) and
they can be based on text or image selection.
An example of image-based captcha is available at
http://www.picatcha.com/captcha/
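
A "which one does not belong" question of the kind described above could be
generated roughly like this. It is a minimal sketch in Python: the category
data and function name are made up for illustration, and a real deployment
would need a far larger and localised pool to resist guessing.

```python
import random

# Hypothetical categories; every word belongs to exactly one category.
CATEGORIES = {
    "colours": ["Red", "Green", "Blue", "Yellow"],
    "animals": ["Cat", "Horse", "Rabbit", "Sparrow"],
    "vehicles": ["Skateboard", "Bicycle", "Train", "Canoe"],
}


def make_question(rng):
    """Return (options, answer): three words from one category plus one intruder."""
    group_name, intruder_name = rng.sample(sorted(CATEGORIES), 2)
    options = rng.sample(CATEGORIES[group_name], 3)
    answer = rng.choice(CATEGORIES[intruder_name])
    options.append(answer)
    rng.shuffle(options)  # so the intruder is not always last
    return options, answer


options, answer = make_question(random.Random())
print("Which one does not belong to the group:", ", ".join(options) + "?")
```

The user selects one of the four options instead of typing, which avoids the
keyboard-layout issues of text captchas. With only four options a bot
guessing at random succeeds 25% of the time, so this alone would be weak.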

Tagging media can also be used as a captcha. Google has been experimenting
with asking users to tag videos as a captcha:
http://cups.cs.cmu.edu/soups/2009/proceedings/a14-kleuver.pdf  [PDF]


In any case, some experimentation would be required to determine whether any of the
above approaches (or a combination of several) provides an appropriate
security-usability balance for the specific needs of Wikipedia.


Pau



On Sat, Jul 28, 2012 at 8:29 PM, Platonides platoni...@gmail.com wrote:

 On 28/07/12 16:55, Everton Zanella Alvarenga wrote:
  In the conclusion:
 
  Contrary to the common belief, text-based CAPTCHAs can be difficult
  for foreigners.
 
  It is worth reading, and likely the same goes for the references therein. The
  first sentence is similar to what I have experienced in 3 classes. And
  people begin to get anxious and usually say "If I type wrongly again,
  I'll give up." I've seen 3 students saying this to me.
 
  Even if an experiment hypothetically found that only 1% of foreigners
  face difficulties with CAPTCHA in a foreign language (I bet it's
  much more from real life experience), how many users would this
  represent on one of the most accessed sites in the world?
 
  Tom

 There are two types of foreigners here:
 - One are speakers of another language written in latin1 (such as
 Brazilians).
 - Another are those who use a different writing script, such as Russians
 or Greeks.

 In the first case, they should have little problem. Native speakers of
 the language used for the wordlist have an extra help, because they are
 more likely to recognise the words and it can also help them perform
 error recovery.

 It would be nice to provide a captcha with a native wordlist, but by
 limiting to ascii characters, it can get pretty universal.

 Distortion where a letter looks like a different one is still
 problematic. Even people with English knowledge can have trouble with
 it, so being a native speaker doesn't magically make you invulnerable to
 captcha errors.
 On 16th July of 2007 Arnomane reported a case where distortion of an "o" made
 it look like an "a"; in August I reported another where an "s" looked
 like a "g".
 I expect that using random characters would make it worse, though.

 People with other scripts are a different matter.
 * They may not be able to recognise the latin characters.
 * You may be forcing them to change keyboard layouts to solve the
 captcha.
 * Foreign visitors may not be able to pass your captcha.
 ** Lack of an appropriate keyboard layout.
 ** Unable to differentiate the characters (you want me to differentiate
 ت  and ث distorted in a noisy background?)
 ** No fonts installed for viewing the characters (e.g. 〝 vs 〞), as
 if you were trying to browse the script's characters in the character map
 (potentially hundreds!) looking for a visual match.

 Yet, there are reports such as this by Liangent (native Chinese speaker)
 on this list on 5th February 2011:
  I hate the case that I'm asked with a Chinese captcha when I'm surfing
  some Chinese websites without IME available.
 
  Besides I don't prefer Chinese captchas personally because Chinese
  characters usually require more key hits.


 At least for those languages I think we would need a switch to get a
 captcha in the different language.

 We should also add the button to get a new captcha (bug 14230), which
 should help when you get the wrong captcha.
 And I think we should also add a "Problems solving the captcha? Mail us"
 link for those cases when people can't pass the captcha.
 Not that it would solve their problems, but it would at least provide a
 way to lighten their frustration.






-- 
Pau Giner
Interaction Designer
Wikimedia Foundation

Re: [Wikitech-l] Captcha for non-English speakers II

2012-07-30 Thread Daniel Friesen

Those checkbox and honeypot captchas look like junk to me.

Firstly the checkbox captcha. It relies entirely on the assumption that  
spambots don't have JavaScript. It also assumes that spambots won't simply  
get wise and throw a few regexp tests to figure out when the plugin is  
sitting on the page inserting a form element. If people actually start  
using checkbox captchas they will inevitably become useless.
Additionally it imposes the requirement that the client has JavaScript  
enabled simply to make an edit. This is something we consider unacceptable.


honeypot-captchas... yeah, we already have that:
https://www.mediawiki.org/wiki/Extension:SimpleAntiSpam
If it weren't for the fact that it's useless for login-only and private  
wikis I'd bake it right into core.
honeypot-captchas aren't actually captchas. As a testament to that a real  
captcha and SimpleAntiSpam can be installed at the same time.
And I do recommend you do that. SimpleAntiSpam trips up the trivial bots  
while the captcha deals with the non-trivial link inserting bots.
But that's all they do. Beyond the most worthless of spambots,  
honeypot-captchas have absolutely no value. If a bot is capable of  
breaking any normal captcha it is already sophisticated enough that a  
honeypot-captcha will do absolutely nothing.
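
For readers unfamiliar with the technique, the honeypot idea can be sketched
in a few lines of Python. This is not SimpleAntiSpam's actual code; the field
name and function names are hypothetical. The form carries a field hidden
from humans, and any submission that fills it in is treated as a trivial bot:

```python
# Hypothetical name of a form field that is hidden from human users via CSS.
HONEYPOT_FIELD = "contact_url"


def is_trivial_bot(form_data):
    """A human never sees the hidden field, so any value in it suggests a bot."""
    return bool(form_data.get(HONEYPOT_FIELD, "").strip())


def handle_submission(form_data):
    """Reject honeypot hits; everything else still has to pass the real captcha."""
    if is_trivial_bot(form_data):
        return "rejected"
    return "continue to captcha"


print(handle_submission({"text": "Fixed a typo", HONEYPOT_FIELD: ""}))
print(handle_submission({"text": "Buy pills", HONEYPOT_FIELD: "http://spam.example"}))
```

A bot that parses the form and leaves the hidden field empty passes this
check untouched, which is why it only stops the most trivial bots.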


Need I remind people we have bots walking around that know how to register  
and login to MediaWiki. Know how to deal with image captchas. Know how to  
wait for autoconfirmed status. Know how to confirm an AbuseFilter warning  
page. And even know how to upload an image and use it in wikitext.


--
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]
On Mon, 30 Jul 2012 06:28:13 -0700, Pau Giner pgi...@wikimedia.org wrote:


From the UX perspective, a captcha is always an obstacle for the
interaction flow.
Reducing the complexity of user interaction when solving the captcha can
benefit all kinds of users but also solve problems for non-English speakers.

Checkbox and honeypot-based captchas avoid most of the problems of
text-based captchas since interaction is simplified to the minimum for the
user:
http://uxmovement.com/forms/captchas-vs-spambots-why-the-checkbox-captcha-wins/


Simple questions where the user can select an answer (not type) will solve
some of the input-related issues for non-English speakers.
These questions can be of different kinds (e.g., Which one does not belong
to the group: Red, Green, Skateboard, Blue?, Is fire hot or cold?) and
they can be based on text or image selection.
An example of image-based captcha is available at
http://www.picatcha.com/captcha/

Tagging media can also be used as a captcha. Google has been experimenting
with asking users to tag videos as a captcha:
http://cups.cs.cmu.edu/soups/2009/proceedings/a14-kleuver.pdf  [PDF]


In any case, some experimentation would be required to determine whether any of the
above approaches (or a combination of several) provides an appropriate
security-usability balance for the specific needs of Wikipedia.


Pau



On Sat, Jul 28, 2012 at 8:29 PM, Platonides platoni...@gmail.com wrote:


On 28/07/12 16:55, Everton Zanella Alvarenga wrote:
 In the conclusion:

 Contrary to the common belief, text-based CAPTCHAs can be difficult
 for foreigners.

 It is worth reading, and likely the same goes for the references therein. The
 first sentence is similar to what I have experienced in 3 classes. And
 people begin to get anxious and usually say "If I type wrongly again,
 I'll give up." I've seen 3 students saying this to me.

 Even if an experiment hypothetically found that only 1% of foreigners
 face difficulties with CAPTCHA in a foreign language (I bet it's
 much more from real life experience), how many users would this
 represent on one of the most accessed sites in the world?

 Tom

There are two types of foreigners here:
- One are speakers of another language written in latin1 (such as
Brazilians).
- Another are those who use a different writing script, such as Russians
or Greeks.

In the first case, they should have little problem. Native speakers of
the language used for the wordlist have an extra help, because they are
more likely to recognise the words and it can also help them perform
error recovery.

It would be nice to provide a captcha with a native wordlist, but by
limiting to ascii characters, it can get pretty universal.

Distortion where a letter looks like a different one is still
problematic. Even people with English knowledge can have trouble with
it, so being a native speaker doesn't magically make you invulnerable to
captcha errors.
On 16th July of 2007 Arnomane reported a case where o distortion made
it look like an a, on August I reported another where an s looked
like a g.
I expect that using random characters would make it worse, though.

People with other scripts are a different matter.
* They may not be able to recognise the latin characters.
* You may be forcing them to change the language 

Re: [Wikitech-l] Captcha for non-English speakers II

2012-07-30 Thread Platonides
On 30/07/12 15:28, Pau Giner wrote:
 From the UX perspective, a captcha is always an obstacle for the
 interaction flow.
I agree.
But when you're spammed to death if there's no captcha, you end up
accepting it as a necessary evil.
But don't let this pessimistic view stop you from proposing new
alternatives.


 Reducing the complexity of user interaction when solving the captcha can
 benefit all kinds of users but also solve problems for non-English speakers.
 
 Checkbox and honeypot-based captchas avoid most of the problems of
 text-based captchas since interaction is simplified to the minimum for the
 user:
 http://uxmovement.com/forms/captchas-vs-spambots-why-the-checkbox-captcha-wins/

No. Those work against generic spambots. For a small site, pretty much
any custom-made captcha will work.
When someone designs a bot against your captcha, you need to provide a hard test.
If we were comparing against a math captcha, a checkbox is more usable
while only slightly weaker. Neither of them stands a chance against a bot
designed against them.
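
To make the distinction concrete, here is a minimal sketch of a honeypot check, assuming a form that carries a hidden field (the name `website` is hypothetical) which humans never see and so leave empty. A generic spambot that fills in every field trips it; a bot written for this particular form simply leaves the field blank and passes.

```python
def looks_like_generic_bot(form, honeypot_field="website"):
    """Return True when the hidden honeypot field was filled in.

    `form` is a plain dict of submitted fields. The field name is an
    assumption for illustration, not part of any real extension.
    """
    return bool(form.get(honeypot_field, "").strip())


# A generic bot fills every field it finds, including the hidden one.
generic_bot = {"username": "buy-pills", "website": "http://spam.example"}
# A targeted bot knows the form and leaves the honeypot empty.
targeted_bot = {"username": "buy-pills", "website": ""}

print(looks_like_generic_bot(generic_bot))
print(looks_like_generic_bot(targeted_bot))
```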

If you run Wikipedia, bad guys will work to defeat your captcha and
spam/vandalise/annoy you.
If you are developing MediaWiki, a wiki used on thousands of sites [1],
spammers will work to make bots capable of spamming those many MediaWiki
installs (cf. DantMan reply)
If you are Open Source, then it's much harder to make (you lose not only
the security by obscurity of the code, but also that of the challenges
themselves...).


1- http://www.google.com/search?q=%22powered%20by%20mediawiki%22
~201.000.000 results



 Simple questions where the user can select an answer (not type) will solve
 some of the input-related issues for non-English speakers.
 These questions can be of different kinds (e.g., Which one does not belong
 to the group: Red, Green, Skateboard, Blue?, Is fire hot or cold?) and
 they can be based on text or image selection.
 An example of image-based captcha is available at
 http://www.picatcha.com/captcha/

No.
Those are *harder* since you need knowledge of the English language and its terms.

I can fill in a text captcha on a foreign-language site since its very
appearance (after being trained by hundreds of sites!) shows what is
expected of me.
If I go to http://www.picatcha.com/captcha/, I am asked to "Select ALL
the images of «concept»". Which is fine, but requires me to know what
that «concept» is. I might e.g. think that hourglasses are a kind of
spectacles (eyeglasses) and get very annoyed at not being able to pass it.

Also, making good questions is tricky. You need to produce loads of
questions of that kind, with their answers. If you made just a few hundred
(e.g. because it's done by a human), I could make a list of the questions with
their answers (manually solved) and spam you as many times as I want.
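
A minimal sketch of that lookup-table attack, assuming the site draws from a small fixed pool of questions. The two entries below are the examples from this thread; the names `SOLVED` and `bot_answer` are made up for illustration.

```python
# Questions solved once by hand, then reused for every later request.
SOLVED = {
    "Is fire hot or cold?": "hot",
    "Which one does not belong to the group: Red, Green, Skateboard, Blue?": "Skateboard",
}


def normalise(question):
    """Lower-case and collapse whitespace so trivial variations still match."""
    return " ".join(question.lower().split())


LOOKUP = {normalise(q): a for q, a in SOLVED.items()}


def bot_answer(question):
    """Return the stored answer, or None if the question has not been seen."""
    return LOOKUP.get(normalise(question))


print(bot_answer("is fire  HOT or cold?"))
```

Once every question in the pool has been seen, the captcha costs the bot nothing, which is why the pool has to be large and machine-generated rather than hand-written.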

You want to make intelligent questions that are hard for bots, but anyone
should be able to solve them, even if they are young, uneducated or foreign.
I may know that I have to rule the colors out, but I don't know which of
skateboard vs turquoise is the color.
And yet, you can't dumb it down so much that a computer will be able to
answer it.

Suppose you are asking questions of the type "Is X Y or Z?" and have
made thousands of pairs (that you can't share!).
A naive approach would be just to answer Y or Z at random, accepting a 50%
failure rate (bots don't mind resending their requests many times; a captcha
that blocks only 50% is broken). But we can do better: when you ask my bot
"Is fire hot or cold?" it could go and search Google for those concepts:
* fire hot 1.210.000.000 results
* fire cold 656.000.000 results

There's a very clear correlation of fire with hot rather than with cold,
thus it chooses 'hot', and defeats your captcha. :)
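
Both strategies can be sketched in a few lines. This is only an illustration: the hit counts are the two figures quoted above, hard-coded in a table, whereas a real bot would fetch them from a search engine; the names `HIT_COUNTS`, `guess` and `pass_within` are hypothetical.

```python
import re

# Hit counts copied from the figures above; a stand-in for a live search query.
HIT_COUNTS = {
    ("fire", "hot"): 1_210_000_000,
    ("fire", "cold"): 656_000_000,
}

QUESTION = re.compile(r"Is (\w+) (\w+) or (\w+)\?")


def guess(question, hits=HIT_COUNTS):
    """Pick the option that co-occurs more often with the subject."""
    match = QUESTION.fullmatch(question.strip())
    if match is None:
        return None
    subject, first, second = (word.lower() for word in match.groups())
    return max((first, second), key=lambda option: hits.get((subject, option), 0))


def pass_within(tries, p=0.5):
    """Chance that random guessing succeeds at least once in `tries` attempts."""
    return 1 - (1 - p) ** tries


print(guess("Is fire hot or cold?"))
print(pass_within(10))
```

Even without the search trick, ten random attempts on a two-option question succeed with probability 1 - 0.5^10, i.e. better than 99.9%.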



 Tagging media can be also used as a captcha. Google has been experimenting
 with asking users to tag videos as a captcha:
 http://cups.cs.cmu.edu/soups/2009/proceedings/a14-kleuver.pdf  [PDF]

If we were doing this with Wikimedia Commons videos
a) The video set is known, as are the descriptions. Ergo, match the
video with its file and .
b) IMHO having to watch a video (even if short) is *more* annoying than
typing a text captcha.*
c) No/poor localisation.


* This needs to be balanced against how much you want to enter the
captcha-walled garden, of course. I may accept watching your CEO
boasting about your service (about which you then ask me the captcha**)
in exchange for a gmail-like mail account or multigigabyte dropbox
storage, but not watching one every time I sign in!

** Don't complain if he's tagged by most users as 'boring'. :)


 In any case, some experimentation would be required to determine any of the
 above approaches (or combination of several) provides an appropriate
 security-usability balance for the specific needs of the Wikipedia.

We would first need an evaluation of what is considered spam, and how to
measure. If we get lots of bots the next day you enable it, it's clearly
broken, but how much time would we need before being x% confident that
it is secure enough, when you are 

Re: [Wikitech-l] Captcha for non-English speakers II

2012-07-28 Thread Everton Zanella Alvarenga
"Usability of CAPTCHAs Or usability issues in CAPTCHA design", Jeff
Yan and Ahmad Salah El Ahmad (Newcastle University, UK)

http://homepages.cs.ncl.ac.uk/jeff.yan/soups08.pdf

Pages 3 and 4:

Friendly to foreigners? In theory, text-based CAPTCHAs are
intuitive to world-wide users and have little localization issues –
these were recognised by many researchers (e.g. [5]) as major
advantages of text-based CAPTCHAs over other schemes.
However, in a small scale test carried out with 20 students in the
first author’s class in October 2007, we observed that many
foreign students whose mother tongue does not use the Latin
alphabet performed much worse than those whose first language
is based on Latin alphabet (e.g. native English speakers), when
asked to recognise distorted challenges generated by BaffleText
[6], an early text-based scheme. The former found it hard to
recognise (or even guess) distorted letters in the scheme.

[...]

The performance difference between foreigners and natives does
not appear to be large in the case of reCAPTCHA. However,
given the size of population using this service (hundreds of
thousands websites serving millions of people at least, for
example, popular sites such as Facebook and Twitter are amongst
subscribers of this service), this “being friendly to foreigners”
issue can be a serious usability concern. Moreover, for schemes
whose designers were unaware of this issue, usability problems
caused can be even worse.

[...]

In the conclusion:

Contrary to the common belief, text-based CAPTCHAs can be difficult
for foreigners.

It is worth reading, and likely the same goes for the references therein. The
first sentence is similar to what I have experienced in 3 classes. People
begin to get anxious and usually say "If I type it wrongly again,
I'll give up". I've had 3 students say this to me.

Even if, hypothetically, an experiment showed that only 1% of foreigners
face difficulties with a CAPTCHA in a foreign language (I bet it's
much more, from real-life experience), how many users would this
represent on one of the most accessed sites in the world?

Tom

-- 
Everton Zanella Alvarenga (also Tom)
Wikimedia Brasil
Wikimedia Foundation

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Captcha for non-English speakers II

2012-07-28 Thread Platonides
On 28/07/12 16:55, Everton Zanella Alvarenga wrote:
 In the conclusion:
 
 Contrary to the common belief, text-based CAPTCHAs can be difficult
 for foreigners.
 
 It is worth reading, and likely the same goes for the references therein. The
 first sentence is similar to what I have experienced in 3 classes. People
 begin to get anxious and usually say "If I type it wrongly again,
 I'll give up". I've had 3 students say this to me.
 
 Even if, hypothetically, an experiment showed that only 1% of foreigners
 face difficulties with a CAPTCHA in a foreign language (I bet it's
 much more, from real-life experience), how many users would this
 represent on one of the most accessed sites in the world?
 
 Tom

There are two types of foreigners here:
- Speakers of another language written in the Latin script (such as
Brazilians).
- Those who use a different writing script, such as Russians
or Greeks.

In the first case, they should have little problem. Native speakers of
the language used for the wordlist have some extra help, because they are
more likely to recognise the words, which can also help them perform
error recovery.

It would be nice to provide a captcha with a native wordlist, but by
limiting it to ASCII characters, it can get pretty universal.

Distortion where a letter looks like a different one is still
problematic. Even people with English knowledge can have trouble with
it, so being a native speaker doesn't magically make you invulnerable to
captcha errors.
On 16th July 2007 Arnomane reported a case where distortion of an 'o' made
it look like an 'a'; in August I reported another where an 's' looked
like a 'g'.
I expect that using random characters would make it worse, though.

People with other scripts are a different matter.
* They may not be able to recognise the Latin characters.
* You may be forcing them to change keyboard layouts to solve the
captcha.
* Foreign visitors may not be able to pass your captcha.
** Lack of an appropriate keyboard layout.
** Unable to differentiate the characters (you want me to differentiate
ت  and ث distorted in a noisy background?)
** No fonts installed for viewing the characters (eg. 〝 vs 〞), such as
if you were trying to browse the script characters of the language in the
character map (potentially hundreds!) looking for a visual match.

Yet, there are reports such as this by Liangent (native Chinese speaker)
on this list on 5th February 2011:
 I hate the case that I'm asked with a Chinese captcha when I'm surfing
 some Chinese websites without IME available.
 
 Besides I don't prefer Chinese captchas personally because Chinese
 characters usually require more key hits.


At least for those languages I think we would need a switch to get a
captcha in a different language.

We should also add the button to get a new captcha (bug 14230), which
should help when you get the wrong captcha.
And I think we should also add a "Problems solving the captcha? Mail us"
link for those cases when people can't pass the captcha.
Not that it would solve their problems, but it would at least provide a
way to ease their frustration.



Re: [Wikitech-l] Captcha for non-English speakers II

2012-07-27 Thread Everton Zanella Alvarenga
2012/7/26 Platonides platoni...@gmail.com:

 They don't need to read English. They just need to type the letters they
 see in the image. Sure, you can have a small advantage if you know which
 letters could make a valid English word (or if you have the captcha
 dictionary installed), but a Brazilian who can read Wikipedia should
 have no problems typing the captcha.

If that is the case, why don't we change the CAPTCHA to random letters?

-- 
Everton Zanella Alvarenga (also Tom)
Wikimedia Brasil
Wikimedia Foundation



Re: [Wikitech-l] Captcha for non-English speakers II

2012-07-27 Thread Yury Katkov
I think that making Russian, Korean and Arabic captchas is a really bad idea.
An English keyboard layout is installed by default in all operating systems, as
far as I know. Moreover, very interesting problems can appear if this
feature is implemented. Who will decide which captcha language is
used? We can look at the user's IP address - then sometimes foreigners will
be in trouble. We can use a Ukrainian captcha for Ukrainian websites - thus
assuming that every person who knows Ukrainian has the Ukrainian keyboard
layout, which is not true.
I think the assumption that everyone on the internet is able to type
English letters while looking at a noisy sample of them is not a very bold one.
On 26.07.2012 17:53, Everton Zanella Alvarenga
ezalvare...@wikimedia.org wrote:

 Hi all,

 how are you? I'd like to know about the possibility of solving an old
 issue with CAPTCHA for Wikipedias in languages other than English.
 This bug

 https://bugzilla.wikimedia.org/show_bug.cgi?id=5309

 was created in 2006. There is a discussion here about having CAPTCHA
 in other languages from February 2012


 http://thread.gmane.org/gmane.science.linguistics.wikipedia.technical/51951/

 but it seems there was no conclusion. After working on campus with new
 editors in Brazil, I've found this is a real obstacle, since most
 people here cannot read English at all.

 I'd like to know if there are plans to solve this issue - I hope I
 don't sound rude; maybe this can seem a minor issue when we don't see
 the difficulties people from a different place can face. I think this
 is important for Wikipedias other than the English one (just read
 people's comments in the bug) and we may be losing new contributors
 because of their first impressions. Thanks,

 Tom

 --
 Everton Zanella Alvarenga (also Tom)
 Wikimedia Brasil
 Wikimedia Foundation


Re: [Wikitech-l] Captcha for non-English speakers II

2012-07-27 Thread Martijn Hoekstra
Maybe present three or four different captchas in different scripts,
requiring only one to be filled out?


Re: [Wikitech-l] Captcha for non-English speakers II

2012-07-27 Thread Max Semenik
On 27.07.2012, 22:09 Yury wrote:

 I think that making Russian, Korean and Arabic captchas is a really bad idea.
 An English keyboard layout is installed by default in all operating systems, as
 far as I know. Moreover, very interesting problems can appear if this
 feature is implemented. Who will decide which captcha language is
 used? We can look at the user's IP address - then sometimes foreigners will
 be in trouble. We can use a Ukrainian captcha for Ukrainian websites - thus
 assuming that every person who knows Ukrainian has the Ukrainian keyboard
 layout, which is not true.
 I think the assumption that everyone on the internet is able to type
 English letters while looking at a noisy sample of them is not a very bold one.

Even funnier: imagine a European trying to just read a Chinese
captcha:)

-- 
Best regards,
  Max Semenik ([[User:MaxSem]])




Re: [Wikitech-l] Captcha for non-English speakers II

2012-07-27 Thread Strainu
2012/7/28 Max Semenik maxsem.w...@gmail.com:
 On 27.07.2012, 22:09 Yury wrote:

 I think that making Russian, Korean and Arabic captchas is a really bad idea.
 An English keyboard layout is installed by default in all operating systems, as
 far as I know. Moreover, very interesting problems can appear if this
 feature is implemented. Who will decide which captcha language is
 used? We can look at the user's IP address - then sometimes foreigners will
 be in trouble. We can use a Ukrainian captcha for Ukrainian websites - thus
 assuming that every person who knows Ukrainian has the Ukrainian keyboard
 layout, which is not true.
 I think the assumption that everyone on the internet is able to type
 English letters while looking at a noisy sample of them is not a very bold one.

 Even funnier: imagine a European trying to just read a Chinese
 captcha:)

Funny as it may be, this is a non-problem. You can easily have a "give
me an English CAPTCHA" link... And that would be one more step for a
robot to learn, that is, one more (thin) line of defence.

Strainu



Re: [Wikitech-l] Captcha for non-English speakers II

2012-07-27 Thread Platonides
On 27/07/12 16:31, Everton Zanella Alvarenga wrote:
 2012/7/26 Platonides platoni...@gmail.com:
 
 They don't need to read English. They just need to type the letters they
 see in the image. Sure, you can have a small advantage if you know which
 letters could make a valid English word (or if you have the captcha
 dictionary installed), but a Brazilian who can read Wikipedia should
 have no problems typing the captcha.
 
 If that is the case, why don't we change the CAPTCHA to random letters?

You should probably ask Neil Harris, the author of the captcha generator
we use.

from his 06/02/2011 mail:
 The wordlists themselves need not be secret: they are only needed to 
 create easily-typed strings that are sufficiently large in number to 
 provide a moderate challenge to brute force guessing.


I have added a random captcha at http://test.wikipedia.beta.wmflabs.org/
You can try adding URLs at
http://test.wikipedia.beta.wmflabs.org/w/index.php?title=Main_Page&action=edit
and http://en.wikipedia.beta.wmflabs.org/wiki/Wikipedia:Sandbox to
compare the presented captchas.

(yes, testwikibeta is quite broken right now, but the captchas show)




Re: [Wikitech-l] Captcha for non-English speakers II

2012-07-26 Thread Everton Zanella Alvarenga
2012/7/26 Everton Zanella Alvarenga ezalvare...@wikimedia.org:

 was created in 2006. There is a discussion here about having CAPTCHA
 in other languages from February 2012

 http://thread.gmane.org/gmane.science.linguistics.wikipedia.technical/51951/

Sorry, I meant 2011.

-- 
Everton Zanella Alvarenga (also Tom)
Wikimedia Brasil
Wikimedia Foundation



Re: [Wikitech-l] Captcha for non-English speakers II

2012-07-26 Thread Hunter Fernandes
Is there such a thing as localized captchas?

And should turning off account/IP creation throttling for events also
turn off the captcha requirement?
- Hunter F.




Re: [Wikitech-l] Captcha for non-English speakers II

2012-07-26 Thread Federico Leva (Nemo)
Ehm, I know that I'll sound like a broken record, but look at the 
WikiCAPTCHA proposal: it's just a proposal, but it could address the 
problem just by fetching books from the relevant Wikisource.

Links in: https://www.mediawiki.org/wiki/CAPTCHA

Nemo



Re: [Wikitech-l] Captcha for non-English speakers II

2012-07-26 Thread Neil Harris

On 26/07/12 14:58, Hunter Fernandes wrote:

Is there such a thing as localized captchas?

And should turning off account/IP creation throttling for events also
turn off the captcha requirement?
- Hunter F.



It's really a matter of configuration; the core captcha code is 
intrinsically language-agnostic.


The existing captcha code takes input from a file containing a few thousand 
short words, then generates each captcha from a pair of those words.


To localize the captcha, all that is needed is to arrange that a 
different word list (and image pool) is used for each language.


If you have a language you want the captcha implemented in, a good first 
thing to do would be to create a list of, say, 4,000 to 5,000 short words in 
that language for use by the captcha code.
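

As a rough sketch of what that configuration amounts to (this is not the actual generator code): one word list per language, two words drawn from it. The tiny lists, the function names and the choice to simply concatenate the two words are all assumptions for illustration.

```python
import secrets

# Hypothetical word lists; real ones would hold a few thousand short words.
WORDLISTS = {
    "en": ["river", "stone", "cloud", "maple"],
    "pt": ["rio", "pedra", "nuvem", "folha"],
}


def captcha_text(lang, wordlists=WORDLISTS, fallback="en"):
    """Build the captcha string from two words of the requested language."""
    words = wordlists.get(lang) or wordlists[fallback]
    return secrets.choice(words) + secrets.choice(words)


def search_space(wordlist_size):
    """Number of distinct ordered word pairs a brute-force guesser faces."""
    return wordlist_size ** 2


print(captcha_text("pt"))
print(search_space(5000))
```

With a 5,000-word list there are 5,000 x 5,000 = 25,000,000 ordered pairs, so the list size matters more than the language it is written in.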


-- N.




Re: [Wikitech-l] Captcha for non-English speakers II

2012-07-26 Thread Platonides
On 26/07/12 15:53, Everton Zanella Alvarenga wrote:
 Hi all,
 
 how are you? I'd like to know about the possibility of solving an old
 issue with CAPTCHA for Wikipedias in languages other than English.
 This bug
 
 https://bugzilla.wikimedia.org/show_bug.cgi?id=5309
 
 was created in 2006. There is a discussion here about having CAPTCHA
 in other languages from February 2012
 
 http://thread.gmane.org/gmane.science.linguistics.wikipedia.technical/51951/
 
 but it seems there was no conclusion. After working on campus with new
 editors in Brazil, I've found this is a real obstacle, since most
 people here cannot read English at all.

They don't need to read English. They just need to type the letters they
see in the image. Sure, you can have a small advantage if you know which
letters could make a valid English word (or if you have the captcha
dictionary installed), but a Brazilian who can read Wikipedia should
have no problems typing the captcha.

That said, it's easy enough to make a different set of captchas if we
are provided with a suitable dictionary of words (note that we don't want
non-ASCII letters such as ç in the captcha, in case it's seen by a foreign
user who doesn't have that letter on their keyboard).
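
A minimal sketch of how such a dictionary could be filtered, assuming a plain list of candidate words. The length bounds and the name `typeable_words` are made up for illustration; the only rule taken from the message is that every letter must be typeable on a plain Latin keyboard.

```python
import string

ALLOWED = set(string.ascii_lowercase)


def typeable_words(words, min_len=3, max_len=6):
    """Keep short words made only of the letters a-z."""
    kept = []
    for word in words:
        candidate = word.strip().lower()
        if min_len <= len(candidate) <= max_len and set(candidate) <= ALLOWED:
            kept.append(candidate)
    return kept


# "maçã" and "coração" are dropped because of ç and ã.
print(typeable_words(["maçã", "pedra", "rio", "coração", "nuvem"]))
```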





Re: [Wikitech-l] CAPTCHA spell checker

2011-02-06 Thread Andrew Garrett
On Sat, Feb 5, 2011 at 5:32 PM, Tim Starling tstarl...@wikimedia.org wrote:

 My original idea was to search for near matches and to provide an
 autocomplete drop-down, but the necessary UI code for that seemed a
 bit too complicated for a quick weekend project. Maybe later.

We have the UI code already written, in
phase3/resources/jquery/jquery.suggestions.js. Roan wrote it for the
search box :-)

-- 
Andrew Garrett
http://werdn.us/



Re: [Wikitech-l] CAPTCHA spell checker

2011-02-06 Thread Tim Starling
On 06/02/11 21:39, Andrew Garrett wrote:
 On Sat, Feb 5, 2011 at 5:32 PM, Tim Starling tstarl...@wikimedia.org wrote:

 My original idea was to search for near matches and to provide an
 autocomplete drop-down, but the necessary UI code for that seemed a
 bit too complicated for a quick weekend project. Maybe later.
 
 We have the UI code already written, in
 phase3/resources/jquery/jquery.suggestions.js. Roan wrote it for the
 search box :-)

There's also jquery.ui.autocomplete, and I looked at the old
mwsuggest.js. But GreaseMonkey scripts run in a privileged mode with
access to the window via a special sandbox. So it wasn't clear (with
my limited GreaseMonkey experience) whether any of these solutions
could be trivially ported to that environment.
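
Leaving the UI question aside, the near-match search itself is small. The sketch below uses Python's standard `difflib` rather than anything from the GreaseMonkey script, and the four-word list and the name `suggest` are hypothetical.

```python
import difflib

# Hypothetical stand-in for the captcha word list.
WORDS = ["river", "stone", "cloud", "maple"]


def suggest(typed, words=WORDS, limit=3):
    """Return dictionary words that closely resemble what the user typed."""
    return difflib.get_close_matches(typed.lower(), words, n=limit, cutoff=0.6)


print(suggest("stome"))
```

An autocomplete drop-down would then only need to display whatever list `suggest` returns for the current input.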

-- Tim Starling


