> I think somebody requested a --user-agent=??? feature, I don't know if
> someone has committed a patch, but it shouldn't be a big problem to do
> it, and I think the 'Sorry...' is based on user-agent info in the HTTP
> request.
That was probably me.
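For anyone wondering what a --user-agent option would boil down to: it
only changes one header in the HTTP request. Below is a minimal Python
sketch of the idea. It assumes nothing about how sitescooper itself would
implement the option; the function name, the URL and the agent string are
all made up for illustration.

```python
import urllib.request


def build_request(url, user_agent):
    """Return a Request that announces itself with the given User-Agent."""
    # The User-Agent header is the only thing that differs from a default
    # request; the server sees whatever string is supplied here.
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


if __name__ == "__main__":
    # Hypothetical URL and agent string, purely for illustration.
    req = build_request("http://www.example.com/", "ExampleReader/1.0")
    print(req.get_header("User-agent"))
```

Passing the returned object to urllib.request.urlopen() would send the
request with that header.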
I've been very successful at getting through these paper-thin
walls of protection these sites put up. Bloomberg's was easy. Because of
the Bloomberg thing this afternoon, I whipped up a website that is now
going to be used to test these sites, complete with all routing/forging
rules intact. It works flawlessly for Bloomberg, and you can actually
click and navigate through the links on their site, all without leaving
mine. It transparently proxies the links through to them.
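To make "transparently proxies the links" concrete: the usual trick is
to rewrite every href in the fetched page so that it points back at the
proxy, with the real target carried along as a parameter. The Python
sketch below is only a rough illustration of that idea, not the code
running on the site described above; the proxy prefix, the function name
and the regex-based parsing are all assumptions.

```python
import re
from urllib.parse import quote, urljoin

# Matches href="..." or href='...'; a real proxy would want a proper HTML
# parser, this regex is just enough for a sketch.
HREF_RE = re.compile(r'(href\s*=\s*)(["\'])(.*?)\2', re.IGNORECASE)


def rewrite_links(html, base_url, proxy_prefix):
    """Point every href in html back at the proxy.

    base_url is the page the HTML came from (used to resolve relative
    links); proxy_prefix is a hypothetical URL on the proxying site that
    takes the real target as its last parameter.
    """
    def replace(match):
        attr, quote_char, target = match.groups()
        absolute = urljoin(base_url, target)            # resolve relative links
        proxied = proxy_prefix + quote(absolute, safe="")  # escape the target
        return attr + quote_char + proxied + quote_char

    return HREF_RE.sub(replace, html)


if __name__ == "__main__":
    page = '<a href="/news">News</a>'
    print(rewrite_links(page, "http://www.example.com/",
                        "http://proxy.example/fetch?url="))
```

The proxy then fetches whatever URL arrives in that parameter, rewrites
the result the same way, and the reader never leaves the proxying site.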
So there's the Bloombergs, the space.com's
(http://www.space.com/syn/avantgo), and the fedbuzz's
(http://www.fedbuzz.com/avantgo/). Each uses its own mechanism
for blocking, all of which are bypassable at some level. FedBuzz is fun;
I'm down to the masquerade level with them. I'm just having fun!
But remember, the fact that they are locking this stuff down
*LIMITS* their audience and distribution. It is for this reason that I've
started the PODS project (Palm Open Directory Syndicate, name subject to
change). It is basically a dmoz-style directory of links and URLs which
are formatted for the Palm and handheld devices. I've got ~400 sites in
there now, and will be adding some new things so that you can log in and
select which sites you want formatted in which way (Sitescooper vs.
Plucker vs. DOC vs. ...) and download the content that way.
Sometimes the best way to get to some of this content is by
walking right in the front door. I've had sites which blocked me from
getting content, but then I email them and ask for their URI politely, and
tell them what I am using to read it, and they usually are cooperative in
giving it to me (this just happened yesterday with a Bible site for a user
that emailed me wanting to get into it).
AvantGo (and I know they're listening) doesn't own the web. They
are limiting the audience of their clients and customers. Shrouding URLs
in some sort of locked-down system (which actually is fairly easy to defeat)
is no way to treat users who are trying to get to this content to enrich
their lives, and learn more.
PODS will grow fast, I believe, and once we hook in the doc
conversion stuff and proxy out editors to it to keep it updated, it will
only keep growing.
So here's something I found too:
http://www.palmstation.com/read_comment.py?comment=17077&article=3693
I tried this one day and it worked, then the next day it didn't.
They're on these lists, and I'm glad they're participating. They also have
to remember that I'm eventually going to get around every obstacle
they throw in my direction. We should be actively emailing the server
admins and webmasters asking them to help us support a larger userbase,
open up the content, and share the information.
/d
_______________________________________________
Sitescooper-talk mailing list
[EMAIL PROTECTED]
http://lists.sourceforge.net/lists/listinfo/sitescooper-talk