Re: [PHP] Detecting naughty sites

2006-12-01 Thread Robert Cummings
On Fri, 2006-12-01 at 20:41 +0100, Satyam wrote:
> ----- Original Message -----
> From: "Satyam" <[EMAIL PROTECTED]>
> 
> 
> > The Wikipedia article of the day provides some interesting facts about 
> > when it became naughty:
> >
> > http://en.wikipedia.org/wiki/History_of_erotic_depictions
> >
> > -- 
> 
> May I add something that came to my mind related to this.   As the Wikipedia 
> article says, the idea of pornography was developed mostly in the Victorian 
> era.  It is fitting to mention that at the same time, those we would now 
> call scientists (the term wasn't popular then) devoted much of their time to 
> screwing. Pretty much as anybody else, you would say.   Not quite, the term 
> meant something else then.
> 
> The Industrial Revolution was facing a big problem: the lack of standards.
> Each and every machine was unique; when something broke, a replacement had
> to be crafted, just like the original piece had been.   One of the most 
> important pieces, due to the quantity used in each and every machine, were 
> screws, bolts and nuts.  Thus, it was important to develop standards for 
> screws, length, diameter, threads per inch, depth and profile of the thread 
> and so on, so they could be manufactured in quantities.
> 
> This subject was referred to as 'screwing' and I am sure Queen Victoria 
> would have wholeheartedly approved of it.

That's really screwed up! :B

Cheers,
Rob.
-- 
.------------------------------------------------------------.
| InterJinn Application Framework - http://www.interjinn.com |
:------------------------------------------------------------:
| An application and templating framework for PHP. Boasting  |
| a powerful, scalable system for accessing system services  |
| such as forms, properties, sessions, and caches. InterJinn |
| also provides an extremely flexible architecture for       |
| creating re-usable components quickly and easily.          |
`------------------------------------------------------------'

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] Detecting naughty sites

2006-12-01 Thread Satyam
----- Original Message -----
From: "Satyam" <[EMAIL PROTECTED]>



The Wikipedia article of the day provides some interesting facts about 
when it became naughty:


http://en.wikipedia.org/wiki/History_of_erotic_depictions

--


May I add something that came to my mind related to this.   As the Wikipedia 
article says, the idea of pornography was developed mostly in the Victorian 
era.  It is fitting to mention that at the same time, those we would now 
call scientists (the term wasn't popular then) devoted much of their time to 
screwing. Pretty much as anybody else, you would say.   Not quite, the term 
meant something else then.


The Industrial Revolution was facing a big problem: the lack of standards.
Each and every machine was unique; when something broke, a replacement had
to be crafted, just like the original piece had been.   One of the most 
important pieces, due to the quantity used in each and every machine, were 
screws, bolts and nuts.  Thus, it was important to develop standards for 
screws, length, diameter, threads per inch, depth and profile of the thread 
and so on, so they could be manufactured in quantities.


This subject was referred to as 'screwing' and I am sure Queen Victoria 
would have wholeheartedly approved of it.


Satyam




Re: [PHP] Detecting naughty sites

2006-11-30 Thread Satyam
The Wikipedia article of the day provides some interesting facts about when 
it became naughty:


http://en.wikipedia.org/wiki/History_of_erotic_depictions




Re: [PHP] Detecting naughty sites

2006-11-29 Thread tedd

At 8:57 PM +0100 11/28/06, Rory Browne wrote:

> I didn't mean something quite that simple, or as an absolute solution.
>
> I meant something slightly more advanced, but based on that idea.
>
> From a robot point of view, what do you think is the difference
> between the php archives and a porn site?



There's a difference?

Both provide things you can't get?

tedd
--
---
http://sperling.com  http://ancientstones.com  http://earthstones.com




Re: [PHP] Detecting naughty sites

2006-11-29 Thread Paul Novitski

At 11/29/2006 01:51 AM, Robin Vickery wrote:

> Cubist Porn - very big in certain 'artistic' circles.


What, both eggs on the same side of the sausage? 





Re: [PHP] Detecting naughty sites

2006-11-29 Thread Robin Vickery

On 29/11/06, Travis Doherty <[EMAIL PROTECTED]> wrote:

> Tom Chubb wrote:
>
> > On 28/11/06, Dave Goodchild <[EMAIL PROTECTED]> wrote:
> >
> >> Hi all. I am building a web app and as part of it advertisers can upload
> >> their ad image and website URL to go with their ad. Is there a good way to
> >> detect whether that site is a porn site via php?
> >>
> >> --
> >> http://www.web-buddha.co.uk
> >>
> >>
> > I remember seeing something that used GD to detect colours similar to
> > flesh within an image and thinking it was funny that someone had taken
> > so much time to do it, but I can't remember where it was.
> > I think it was on phpclasses.org but can't find it. Maybe someone else
> > remembers it?
> >
> And more recently a commercial vendor is performing something along the
> lines of 'curve recognition' to the same effect.  That's fine for any
> vector graphics (banner ads might fall here.)  False positives will be
> high with beach photos from family vacation, for example.


Cubist Porn - very big in certain 'artistic' circles.

-robin




Re: [PHP] Detecting naughty sites

2006-11-28 Thread Travis Doherty
Tom Chubb wrote:

> On 28/11/06, Dave Goodchild <[EMAIL PROTECTED]> wrote:
>
>> Hi all. I am building a web app and as part of it advertisers can upload
>> their ad image and website URL to go with their ad. Is there a good way to
>> detect whether that site is a porn site via php?
>>
>> -- 
>> http://www.web-buddha.co.uk
>>
>>
> I remember seeing something that used GD to detect colours similar to
> flesh within an image and thinking it was funny that someone had taken
> so much time to do it, but I can't remember where it was.
> I think it was on phpclasses.org but can't find it. Maybe someone else
> remembers it?
>
And more recently a commercial vendor is performing something along the
lines of 'curve recognition' to the same effect.  That's fine for any
vector graphics (banner ads might fall here.)  False positives will be
high with beach photos from family vacation, for example.

Travis




Re: [PHP] Detecting naughty sites

2006-11-28 Thread Paul Novitski

At 11/28/2006 11:57 AM, Rory Browne wrote:

> I didn't mean something quite that simple, or as an absolute solution.
>
> I meant something slightly more advanced, but based on that idea.
>
> From a robot point of view, what do you think is the difference
> between the php archives and a porn site?



What exactly do you mean by "porn"?  Certainly there are websites 
that 99 people in a room of 100 would label as pornography, but the 
grey area is the killer -- an enormous volume of material that 
various people will label "pornographic" and others won't.  Whose 
opinion will you use in crafting your software?


Only when you define "porn site" with sufficient specificity can you 
attempt to write an algorithm to recognize one.  When you've 
accomplished all that, you can apply similar logic to recognizing 
truly good movies, really delightful reads, absolutely delicious 
recipes, and undeniably correct political opinions.  You'll be a 
genius -- providing you can find enough people to agree with you.


If we humans can't reliably define the boundaries of a set, how can 
we write software to recognize its (pardon the expression) 
members?  How would we know when the software was functioning 
correctly?  Who would judge the accuracy of its findings?


You can work for weeks developing an algorithm to detect "porn" and 
it will take others an hour or a day to find the gaps in the 
definition.  I think any program you could write would have so many 
false positives & false negatives that you'd end up having to 
manually moderate the process anyway.
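
The false positives and false negatives described above can at least be counted by comparing a filter's verdicts with a human moderator's. What follows is a minimal Python sketch, not a recommendation of any particular filter; the function name and the sample verdicts are hypothetical and made up purely for illustration.

```python
def error_counts(predicted, actual):
    """Count filter mistakes against human verdicts.

    predicted: booleans, True where the filter flagged the item.
    actual:    booleans, True where a human judged the item unacceptable.
    Returns (false_positives, false_negatives).
    """
    pairs = list(zip(predicted, actual))
    false_positives = sum(1 for p, a in pairs if p and not a)
    false_negatives = sum(1 for p, a in pairs if a and not p)
    return false_positives, false_negatives


# Hypothetical verdicts for four submissions (illustrative only).
filter_says = [True, True, False, False]
human_says = [True, False, True, False]
fp, fn = error_counts(filter_says, human_says)
print(f"false positives: {fp}, false negatives: {fn}")
```

The human verdicts remain the reference here, which is the point made above: someone still has to decide what counts as a correct answer.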


Please understand that I love (if I may use that verb ever so 
delicately) writing software that parses human expression in search 
of patterns and specific content.  I love bestowing on a program the 
flexibility and grace it requires to enter that messy jungle and 
return with a map or a fact.  I could write a spider that flagged 
websites containing certain words (in English, at least, without 
assistance), but I'm not as comfortable with the prospect of writing 
a sexual content filter so dependable that I'd be happy to leave it 
to guard a gate on its own.  I'm sure it would slam the door on many 
undeserving people and would happily let in others my client wouldn't 
want.  For a commercial site hoping to make money from advertisers, 
it wouldn't pay to have a near-sighted or illiterate gatekeeper.


Perhaps the only way to do what you're suggesting is to write an 
image pattern recognition algorithm so sophisticated that it can 
differentiate a photograph of a hand caressing a breast from a 
photograph of a breast self-exam.


Or are photos of breast self-exams pornographic?

Yikes,
Paul 





Re: [PHP] Detecting naughty sites

2006-11-28 Thread Rory Browne

I didn't mean something quite that simple, or as an absolute solution.

I meant something slightly more advanced, but based on that idea.


From a robot point of view, what do you think is the difference
between the php archives and a porn site?

On 11/28/06, Paul Novitski <[EMAIL PROTECTED]> wrote:


>Hi all. I am building a web app and as part of it advertisers can upload
>their ad image and website URL to go with their ad. Is there a good way to
>detect whether that site is a porn site via php?

>If the site's home page contains the words sex, babes, and a few other
>choice words, which I'll leave to your imagination, then chances are
>it's a porn site.


What chances are those, exactly?  One in a blizzard?

This is exactly why filtering realistically for "pornography" is
virtually impossible -- we can't define the problem sufficiently to
derive realistic solutions, and our inherently flawed solutions are weak.

This listserve thread, containing as it does the words "sex" and
"babes" and "porn," has now flagged the PHP list archives as a "porn"
site -- for anyone silly enough to use a simple keyword match to
identify "porn."  Such a trap would also catch websites discussing
the social & historical significance of "porn," sites that detail
ways to identify "porn" which might include the FBI's, dictionaries
and encyclopedias that explain "porn," vendors who try to use sexy
keywords to attract visitors to their non-"porn" sites, websites on
human sexuality, websites about safe sex, sites about scientific
research in human sexual response, and on and on and on.

Such a simplistic filter would overlook websites written by people
smart enough to obfuscate the key words, say by imagizing them,
misspelling them, or using metaphorical language.

More to the point, though, "pornography" isn't one concrete thing out
there in the world.  It's nebulous, self-defined, ambiguous,
ever-changing, and psychologically and culturally dependent.  This is
why anti-pornography laws are pissing into the wind (oops, did I just
commit "porn"?) -- they want to legislate human desire by attempting
to define one corner of creative expression, then discover that
that's like trying to contain any aspect of the human spirit.  You
can only accomplish it partially and temporarily by brute force or
intellectual repression or both.

Better to challenge those aspects of our culture that breed men who
take and take with no empathy for their victims.

I don't think an automated solution (PHP or otherwise) is
feasible.  The best you can do is to create a club advertisers can
ask to join but can remain in only if their ads meet your
approval.  There's no machine that can judge what's "porn" --
machines get turned on and disgusted by a whole different set of
words and images than we do -- you know, like muddy screwdrivers and
oily vises -- you're going to have to do it yourself.  Look at each
image and judge for yourself.  At least you can rest assured that
your own judgement is sound.

Regards,
Paul




Re: [PHP] Detecting naughty sites

2006-11-28 Thread Paul Novitski



>> Hi all. I am building a web app and as part of it advertisers can upload
>> their ad image and website URL to go with their ad. Is there a good way to
>> detect whether that site is a porn site via php?
>
> If the site's home page contains the words sex, babes, and a few other
> choice words, which I'll leave to your imagination, then chances are
> it's a porn site.



What chances are those, exactly?  One in a blizzard?

This is exactly why filtering realistically for "pornography" is 
virtually impossible -- we can't define the problem sufficiently to 
derive realistic solutions, and our inherently flawed solutions are weak.


This listserve thread, containing as it does the words "sex" and 
"babes" and "porn," has now flagged the PHP list archives as a "porn" 
site -- for anyone silly enough to use a simple keyword match to 
identify "porn."  Such a trap would also catch websites discussing 
the social & historical significance of "porn," sites that detail 
ways to identify "porn" which might include the FBI's, dictionaries 
and encyclopedias that explain "porn," vendors who try to use sexy 
keywords to attract visitors to their non-"porn" sites, websites on 
human sexuality, websites about safe sex, sites about scientific 
research in human sexual response, and on and on and on.


Such a simplistic filter would overlook websites written by people 
smart enough to obfuscate the key words, say by imagizing them, 
misspelling them, or using metaphorical language.
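
Both failure modes are easy to reproduce. The Python sketch below is a deliberately naive keyword matcher, assuming a hypothetical threshold of two matching words; the keyword set is just the three words quoted in this thread, and the sample strings are invented for illustration.

```python
import re

# The three words quoted in this thread; a real list would be longer.
KEYWORDS = {"sex", "babes", "porn"}


def naive_flag(text, threshold=2):
    """Flag text when at least `threshold` distinct keywords appear as whole words."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return len(words & KEYWORDS) >= threshold


# A post that merely discusses the words gets flagged (false positive).
archive_post = 'This thread contains the words "sex" and "babes" and "porn".'

# Lightly obfuscated spelling slips through (false negative).
obfuscated = "Hot b4bes and s3x pics inside"

print(naive_flag(archive_post))  # True
print(naive_flag(obfuscated))    # False
```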


More to the point, though, "pornography" isn't one concrete thing out 
there in the world.  It's nebulous, self-defined, ambiguous, 
ever-changing, and psychologically and culturally dependent.  This is 
why anti-pornography laws are pissing into the wind (oops, did I just 
commit "porn"?) -- they want to legislate human desire by attempting 
to define one corner of creative expression, then discover that 
that's like trying to contain any aspect of the human spirit.  You 
can only accomplish it partially and temporarily by brute force or 
intellectual repression or both.


Better to challenge those aspects of our culture that breed men who 
take and take with no empathy for their victims.


I don't think an automated solution (PHP or otherwise) is 
feasible.  The best you can do is to create a club advertisers can 
ask to join but can remain in only if their ads meet your 
approval.  There's no machine that can judge what's "porn" -- 
machines get turned on and disgusted by a whole different set of 
words and images than we do -- you know, like muddy screwdrivers and 
oily vises -- you're going to have to do it yourself.  Look at each 
image and judge for yourself.  At least you can rest assured that 
your own judgement is sound.


Regards,
Paul 





Re: [PHP] Detecting naughty sites

2006-11-28 Thread Tom Chubb

On 28/11/06, Dave Goodchild <[EMAIL PROTECTED]> wrote:

> Hi all. I am building a web app and as part of it advertisers can upload
> their ad image and website URL to go with their ad. Is there a good way to
> detect whether that site is a porn site via php?
>
> --
> http://www.web-buddha.co.uk



I remember seeing something that used GD to detect colours similar to
flesh within an image and thinking it was funny that someone had taken
so much time to do it, but I can't remember where it was.
I think it was on phpclasses.org but can't find it. Maybe someone else
remembers it?
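
The class mentioned above was not located, so the following is only a rough Python sketch of the general idea: count the share of pixels whose colour falls inside a crude "skin-like" RGB range. The thresholds and function names are assumptions for illustration, not the phpclasses.org code. In PHP the pixel values could be read with GD's imagecolorat(); here the pixels are passed in as plain (r, g, b) tuples to keep the sketch self-contained.

```python
def looks_like_skin(r, g, b):
    """Crude RGB skin-tone test; the thresholds are illustrative assumptions."""
    return (
        r > 95 and g > 40 and b > 20
        and r > g and r > b
        and max(r, g, b) - min(r, g, b) > 15
        and abs(r - g) > 15
    )


def skin_ratio(pixels):
    """Return the fraction of (r, g, b) pixels that pass looks_like_skin."""
    pixels = list(pixels)
    if not pixels:
        return 0.0
    skin = sum(1 for (r, g, b) in pixels if looks_like_skin(r, g, b))
    return skin / len(pixels)


# Three skin-like pixels and one pure blue pixel.
sample = [(220, 170, 140)] * 3 + [(0, 0, 255)]
print(skin_ratio(sample))  # 0.75
```

A high ratio says nothing about what the image actually shows, so a score like this could only ever rank images for human review, not decide on its own.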




RE: [PHP] Detecting naughty sites

2006-11-28 Thread Arno Kuhl
One way that could work is to check the urls against a list of keywords and
if a match is found put these submissions into a pending state waiting for
you to check and approve or decline. You could also have a second "hot"
keyword list that rejects the submissions outright if you find a match in
the urls. You can back this up with a proper T&C. If you've got something
like this in place you can fine-tune the keyword lists as the site develops
and you find consistent false matches or urls that slip through.
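
The two-list idea can be sketched in a few lines. This is a minimal Python illustration, assuming simple substring matching on the lowercased URL; the keyword lists and the function name are hypothetical placeholders that would be tuned as the site develops.

```python
# Hypothetical keyword lists; real ones would be tuned over time.
HOT_KEYWORDS = ("porn", "xxx")             # reject outright
REVIEW_KEYWORDS = ("sex", "babes", "adult")  # hold for manual approval


def triage_url(url):
    """Return 'reject', 'pending', or 'accept' for a submitted URL."""
    text = url.lower()
    if any(word in text for word in HOT_KEYWORDS):
        return "reject"
    if any(word in text for word in REVIEW_KEYWORDS):
        return "pending"
    return "accept"


print(triage_url("http://www.example.com/gardening"))  # accept
print(triage_url("http://essex-tourism.example/"))     # pending
print(triage_url("http://xxx-videos.example/"))        # reject
```

Substring matching produces false matches such as "essex", which is why the broader list only holds a submission for review instead of rejecting it.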

Arno


-----Original Message-----
From: Dave Goodchild [mailto:[EMAIL PROTECTED]
Sent: 28 November 2006 03:54
To: PHP Mailing
Subject: [PHP] Detecting naughty sites


Hi all. I am building a web app and as part of it advertisers can upload
their ad image and website URL to go with their ad. Is there a good way to
detect whether that site is a porn site via php?




Re: [PHP] Detecting naughty sites

2006-11-28 Thread Rory Browne

If the site's home page contains the words sex, babes, and a few other
choice words, which I'll leave to your imagination, then chances are
it's a porn site.

On 11/28/06, Jochem Maas <[EMAIL PROTECTED]> wrote:

Dave Goodchild wrote:
> Hi all. I am building a web app and as part of it advertisers can upload
> their ad image and website URL to go with their ad. Is there a good way to
> detect whether that site is a porn site via php?

buy a screen, a PC, a heartrate monitor and a small man from Thailand - plug in
the PC, sit the man down behind it and hook both him and the heartrate monitor
up to the PC - now start feeding him urls to watch and sync that with the output
of the heartrate monitor. you may need to run a number of these systems in
parallel to counter extreme sexual preference in any one unit.



on a more serious note: NO.
neither Yahoo nor Google is capable of successfully filtering pron - if they
can't do it neither can you (there is a million to 1 chance you're god's own
programmer and that you can/will come up with a rock solid solution but I'm not
holding my breath).

e.g.: http://www.theregister.co.uk/2006/11/23/yahoo_search_result/

(no bias against Yahoo intended, it just happened to be a relevant example that
was still floating around my short-term memory.)



I would suggest using a combination of:

1. solid, legally-sound T&C
2. require real address, etc with registration - and use a CAPTCHA technique,
   which may not be nice and accessible but then again when did you last hear of
   a blind man uploading a banner ad image?
3. delay publication of ads until each image has been verified by a human
4. to counter the annoyance of no.3 you could add a 'niceguy' flag to your
   userdata so that people you trust not to upload pron don't have to wait to be
   verified.

alternatively only allow text ads (works for Google ;-)

>




Re: [PHP] Detecting naughty sites

2006-11-28 Thread Jochem Maas
Dave Goodchild wrote:
> Hi all. I am building a web app and as part of it advertisers can upload
> their ad image and website URL to go with their ad. Is there a good way to
> detect whether that site is a porn site via php?

buy a screen, a PC, a heartrate monitor and a small man from Thailand - plug in
the PC, sit the man down behind it and hook both him and the heartrate monitor
up to the PC - now start feeding him urls to watch and sync that with the output
of the heartrate monitor. you may need to run a number of these systems in
parallel to counter extreme sexual preference in any one unit.



on a more serious note: NO.
neither Yahoo nor Google is capable of successfully filtering pron - if they
can't do it neither can you (there is a million to 1 chance you're god's own
programmer and that you can/will come up with a rock solid solution but I'm not
holding my breath).

e.g.: http://www.theregister.co.uk/2006/11/23/yahoo_search_result/

(no bias against Yahoo intended, it just happened to be a relevant example that
was still floating around my short-term memory.)



I would suggest using a combination of:

1. solid, legally-sound T&C
2. require real address, etc with registration - and use a CAPTCHA technique,
   which may not be nice and accessible but then again when did you last hear of
   a blind man uploading a banner ad image?
3. delay publication of ads until each image has been verified by a human
4. to counter the annoyance of no.3 you could add a 'niceguy' flag to your
   userdata so that people you trust not to upload pron don't have to wait to be
   verified.

alternatively only allow text ads (works for Google ;-)
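
Points 3 and 4 above amount to a small moderation queue. The Python sketch below shows one way it might look; the class names, the in-memory lists, and the string return values are assumptions for illustration, with only the 'niceguy' flag taken from the suggestion itself.

```python
from dataclasses import dataclass, field


@dataclass
class Advertiser:
    name: str
    niceguy: bool = False  # trusted advertisers skip the review delay (point 4)


@dataclass
class ModerationQueue:
    pending: list = field(default_factory=list)    # (advertiser, ad) awaiting a human
    published: list = field(default_factory=list)  # ads visible on the site

    def submit(self, advertiser, ad):
        """Publish immediately for trusted advertisers, otherwise hold for review."""
        if advertiser.niceguy:
            self.published.append(ad)
            return "published"
        self.pending.append((advertiser, ad))
        return "pending"

    def review(self, ad, approved):
        """Record a human decision on a pending ad (point 3)."""
        for i, (_advertiser, queued) in enumerate(self.pending):
            if queued == ad:
                del self.pending[i]
                if approved:
                    self.published.append(ad)
                return approved
        raise ValueError("ad is not awaiting review")


queue = ModerationQueue()
print(queue.submit(Advertiser("newcomer"), "banner1.png"))               # pending
print(queue.submit(Advertiser("regular", niceguy=True), "banner2.png"))  # published
queue.review("banner1.png", approved=True)
print(queue.published)  # ['banner2.png', 'banner1.png']
```

In a real application the queue would live in the database alongside the user data, and the flag would be set by hand once an advertiser has a clean history.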

> 
