Re: security quirk

2013-01-31 Thread Gandalf Parker
RichD r_delaney2...@yahoo.com contributed wisdom to
news:badd4188-196b-45e3-ba8a-511d47128...@nh8g2000pbc.googlegroups.com:

 On Jan 30, Gandalf  Parker gand...@the.dead.isp.of.community.net
 wrote:
  Web gurus, what's going on?

 That is the fault of the site itself.
 If they are going to block access to users then they should also block
 access to the automated spiders that hit the site to collect data.
 
 well yeah, but what's going on, under the hood?
 How does it get confused?  How could this
 happen?  I'm looking for some insight, regarding a
 hypothetical programming glitch -

(from alt.hacker)

You don't understand. It is not in the code. It is in the site.
It is as if someone comes and picks fruit off of your tree, and you are 
questioning the tree for how it bears fruit. 

The site creates web pages. 
Google collects web pages.
The site needs to set things like robots.txt to tell Google NOT to collect
the pages in the archives. That is not absolute protection, but at least
it's an effort that works for most sites.
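
As a rough illustration of the robots.txt approach described above, the
following sketch uses Python's standard urllib.robotparser module to show
how a well-behaved crawler would read such a file. The /archives/ path, the
example.com URLs, and the build_parser helper are assumptions made up for
this example, not anything taken from the site in question.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the /archives/ path is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /archives/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text the way a polite crawler would."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A compliant crawler asks before fetching each URL.
archived = parser.can_fetch(
    "Googlebot", "http://example.com/archives/2013/article.html")
front_page = parser.can_fetch("Googlebot", "http://example.com/index.html")

print(archived)    # False: the path falls under Disallow: /archives/
print(front_page)  # True: nothing disallows this path
```

Honoring the file is voluntary on the crawler's side, which is why it is
an effort rather than an absolute protection.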
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: security quirk

2013-01-30 Thread Martin Musatov
On Jan 29, 8:55 pm, RichD r_delaney2...@yahoo.com wrote:
 I read Wall Street Journal, and occasionally check
 [pasted Notepad++ UserLang definition snipped: name=MUSATOV, ext=.myl, udlVersion=2.0]

 articles on their Web site.  It's mostly free, with some items
 available to subscribers only.  It seems random, which ones
 they block, about 20%.

 Anywho, sometimes I use their search utility, the usual author
 or title search, and it blocks, then I look it up on Google, and
 link from there, and it loads!  ok, Web 

Re: security quirk

2013-01-30 Thread Gandalf Parker
RichD r_delaney2...@yahoo.com contributed wisdom to
news:b968c6c6-5aa9-4584-bd7a-5b097f17c...@pu9g2000pbc.googlegroups.com:

 Web gurus, what's going on?
 

That is the fault of the site itself.
If they are going to block access to users then they should also block 
access to the automated spiders that hit the site to collect data.


Re: security quirk

2013-01-30 Thread RichD
On Jan 30, Gandalf  Parker gand...@the.dead.isp.of.community.net
wrote:
  Web gurus, what's going on?

 That is the fault of the site itself.
 If they are going to block access to users then they should also block
 access to the automated spiders that hit the site to collect data.

well yeah, but what's going on, under the hood?
How does it get confused?  How could this
happen?  I'm looking for some insight, regarding a
hypothetical programming glitch -


--
Rich


Re: security quirk

2013-01-30 Thread Joel Goldstick
On Wed, Jan 30, 2013 at 2:39 PM, RichD r_delaney2...@yahoo.com wrote:

 On Jan 30, Gandalf  Parker gand...@the.dead.isp.of.community.net
 wrote:
   Web gurus, what's going on?
 
  That is the fault of the site itself.
  If they are going to block access to users then they should also block
  access to the automated spiders that hit the site to collect data.

 well yeah, but what's going on, under the hood?
 How does it get confused?  How could this
 happen?  I'm looking for some insight, regarding a
 hypothetical programming glitch -


 --
 Rich
 --

As was pointed out, this really is off topic for this group.  You might
try googling.  I believe the NYTimes makes articles available by adding a
parameter to the tail of the URL.


-- 
Joel Goldstick
http://joelgoldstick.com


Re: security quirk

2013-01-30 Thread Big Bad Bob

On 01/29/13 20:55, RichD so wittily quipped:

I read Wall Street Journal, and occasionally check
articles on their Web site.  It's mostly free, with some items
available to subscribers only.  It seems random, which ones
they block, about 20%.

Anywho, sometimes I use their search utility, the usual author
or title search, and it blocks, then I look it up on Google, and
link from there, and it loads!  ok, Web gurus, what's going on?


in my last post, I quoted an article from 'The Register' where they talk 
about how Facebook (literally) broke that feature.


[this works in a LOT of places, but sometimes you have to enable cookies 
or javascript to actually see the content]




Re: security quirk

2013-01-30 Thread Auric__
Martin Musatov wrote:

 On Jan 29, 8:55 pm, RichD r_delaney2...@yahoo.com wrote:
 I read Wall Street Journal, and occasionally checkNotepadPlus
 UserLang name=MUSATOV ext=.myl udlVersion=2.0
[snip]
 /UserLang
 /NotepadPlus

Ignoring the big ol' unnecessary crosspost... What the fuck?

-- 
Oooh, I just learned a new euphemism.


Re: security quirk

2013-01-30 Thread alex23
On Jan 31, 5:39 am, RichD r_delaney2...@yahoo.com wrote:
 well yeah, but what's going on, under the hood?
 How does it get confused?  How could this
 happen?  I'm looking for some insight, regarding a
 hypothetical programming glitch -

As has been stated, this has nothing to do with Python, so please stop
posting your questions here.

However, here's an answer to get you to stop repeating yourself: it's
not uncommon to find that content you're restricted from accessing via
a site's own search is available to you through Google. This has to do
with Google's policy of _requiring_ that pages that it is allowed to
index _must_ be available for view. Any site that allows Google to
index its pages and then blocks you from viewing them will swiftly
find itself web site-a non grata in Google search. As most
websites are attention whores, they'll do anything to ensure they
remain within Google's indices.



Re: security quirk

2013-01-30 Thread Arne Vajhøj

On 1/29/2013 11:55 PM, RichD wrote:

I read Wall Street Journal, and occasionally check
articles on their Web site.  It's mostly free, with some items
available to subscribers only.  It seems random, which ones
they block, about 20%.

Anywho, sometimes I use their search utility, the usual author
or title search, and it blocks, then I look it up on Google, and
link from there, and it loads!  ok, Web gurus, what's going on?


WSJ wants its articles to be findable from Google.

So they open them up for Google to index.

If they require any type of registration to see an article,
then Google will remove the link.

Therefore WSJ (and many other web sites!) gives more access
if you come from Google than if you do not.
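
One way a site could give more access to visitors arriving from Google is
to look at the HTTP Referer header. The sketch below is only a guess at the
general idea, not WSJ's actual implementation; the function names
came_from_google and can_view_full_article, and the is_subscriber flag, are
hypothetical.

```python
from typing import Optional
from urllib.parse import urlparse


def came_from_google(referer: Optional[str]) -> bool:
    """Return True if the Referer header names a google.com host."""
    if not referer:
        return False
    host = urlparse(referer).hostname or ""
    return host == "google.com" or host.endswith(".google.com")


def can_view_full_article(is_subscriber: bool, referer: Optional[str]) -> bool:
    """Subscribers always get in; others only when arriving from Google."""
    return is_subscriber or came_from_google(referer)


# A non-subscriber clicking through from a Google results page.
from_google = can_view_full_article(
    False, "http://www.google.com/search?q=some+article+title")

# The same non-subscriber arriving from the site's own search page
# (example.com stands in for the site's own host).
from_site_search = can_view_full_article(False, "http://example.com/search")

print(from_google)       # True
print(from_site_search)  # False
```

That would match the behavior in the original question: the site's own
search blocks, the Google link loads. The Referer header is supplied by the
browser, so a check like this is a courtesy gate rather than real security.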

Arne




security quirk

2013-01-29 Thread RichD
I read Wall Street Journal, and occasionally check
articles on their Web site.  It's mostly free, with some items
available to subscribers only.  It seems random, which ones
they block, about 20%.

Anywho, sometimes I use their search utility, the usual author
or title search, and it blocks, then I look it up on Google, and
link from there, and it loads!  ok, Web gurus, what's going on?


--
Rich


Signal versus noise (was: security quirk)

2013-01-29 Thread Ben Finney
RichD r_delaney2...@yahoo.com writes:

 Anywho, sometimes I use their search utility, the usual author
 or title search, and it blocks, then I look it up on Google, and
 link from there, and it loads!  ok, Web gurus, what's going on?

That evidently has nothing in particular to do with the topic of this
forum: the Python programming language.

If you want to just comment on arbitrary things with the internet at
large, you have many other forums available. Please at least try to keep
this forum on-topic.

-- 
 \ “Outside of a dog, a book is man's best friend. Inside of a |
  `\dog, it's too dark to read.” —Groucho Marx |
_o__)  |
Ben Finney



Re: security quirk

2013-01-29 Thread Rodrick Brown
On Tue, Jan 29, 2013 at 11:55 PM, RichD r_delaney2...@yahoo.com wrote:

 I read Wall Street Journal, and occasionally check
 articles on their Web site.  It's mostly free, with some items
 available to subscribers only.  It seems random, which ones
 they block, about 20%.

 Anywho, sometimes I use their search utility, the usual author
 or title search, and it blocks, then I look it up on Google, and
 link from there, and it loads!  ok, Web gurus, what's going on?


It's Gremlins! I tell you, Gremlins!!!



 --
 Rich
 --
 http://mail.python.org/mailman/listinfo/python-list



Re: security quirk

2013-01-29 Thread Chris Rebert
On Tue, Jan 29, 2013 at 8:55 PM, RichD r_delaney2...@yahoo.com wrote:
 I read Wall Street Journal, and occasionally check
 articles on their Web site.  It's mostly free, with some items
 available to subscribers only.  It seems random, which ones
 they block, about 20%.

 Anywho, sometimes I use their search utility, the usual author
 or title search, and it blocks, then I look it up on Google, and
 link from there, and it loads!  ok, Web gurus, what's going on?

http://www.google.com/search?btnG=1&pws=0&q=first+click+free

BTW, this has absolutely jack squat to do with Python. Please direct
similar future inquiries to a more relevant forum.

Regards,
Chris