Thanks for the advice! But it didn't seem to work :( The page is no
longer served from the cache, but the content is just as stale as before.
I fetch the page every 5 minutes, while in fact it is refreshed every
15 minutes or so. I wonder if this could be the cause, since I notice
that when the page gets refreshed, the fetcher returns the correct
content. But then I wonder WHERE GAE gets the incorrect page from??
I 2011-02-12 22:07:06.197
[keepeyeon/1.348315101922450339].<stdout>: 06:07:06.196 [pool-4-
thread-1][keepEyeOn] DEBUG wowpop.fetchers.UrlFetcher - Headers:
{server=[cloudflare-nginx], date=[Sun, 13 Feb 2011 06:07:06 GMT],
content-type=[text/html], connection=[keep-alive], x-powered-by=[PHP/
5.2.15], set-
cookie=[__cfduid=d793f42a886fded5ae8ddf0892e63978d1297577226;
expires=Mon, 23 Dec 2019 23:50:00 GMT; path=/; domain=.gamer-
district.com, __cfduid=d793f42a886fded5ae8ddf0892e63978d1297577226;
expires=Mon, 23 Dec 2019 23:50:00 GMT; path=/; domain=.www.gamer-
district.com], x-google-cache-control=[remote-fetch], via=[HTTP/1.1
GWA]}
I 2011-02-12 22:07:06.197
[keepeyeon/1.348315101922450339].<stdout>: 06:07:06.197 [pool-4-
thread-1][keepEyeOn] DEBUG wowpop.fetchers.UrlFetcher - fetched
500chars: <table style="width: 100%; border: 0; padding: 1px"> <tr>
<td><b style="color: #fff">Gamer District 7x</b></td> <td style="text-
align: right;"> <img src="http://www.gamer-district.com/modules/
mod_realmcore/wow_on.png"></td></tr></table><table style="width: 100%;
border: 0; padding: 3"> <tr> <td>Uptime:</td>
<td>1 hours 19
minutes</td> </tr> <tr> <td>Players
online:</td> <td><b>713</b>
<span style="color: #ADDFFF">318</font> / <span
style="color: #F62817">395</font></td>
when in fact the page's actual content is:
<table style="width: 100%; border: 0; padding: 1px">
<tr>
<td><b style="color: #fff">Gamer District 7x</b></td> <td style="text-
align: right;">
<img src="http://www.gamer-district.com/modules/mod_realmcore/
wow_on.png"></td></tr></table><table style="width: 100%; border: 0;
padding: 3">
<tr>
<td>Uptime:</td>
<td>34 minutes</td> </tr>
<tr>
<td>Players online:</td>
<td><b>375</b> <span style="color:
#ADDFFF">142</
font> / <span style="color: #F62817">233</font></td>
It also happens ONLY for this page... Could something be wrong with
the page itself? Something not in the headers, like the modification
time, etc.?
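
For reference, here is a minimal sketch of the no-cache request suggested in the quoted reply below. It assumes a plain java.net.HttpURLConnection; the class name FreshFetch and the method name openUncached are made up for illustration and are not my actual fetcher code. The two header values are copied from the suggestion as-is.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URL;

public class FreshFetch {

    // Hypothetical helper: prepares (but does not yet send) a request that
    // carries the two no-cache headers from the quoted suggestion.
    static HttpURLConnection openUncached(String address) {
        try {
            URL url = URI.create(address).toURL();
            // openConnection() only creates the connection object; nothing is
            // sent until the response is read (e.g. via getInputStream()).
            HttpURLConnection connection = (HttpURLConnection) url.openConnection();
            connection.addRequestProperty("Cache-Control", "no-cache,max-age=0");
            connection.addRequestProperty("Pragma", "no-cache");
            return connection;
        } catch (IOException e) {
            // MalformedURLException is an IOException, so it lands here too.
            throw new UncheckedIOException(e);
        }
    }
}
```

Reading connection.getInputStream() on the returned object performs the actual fetch. Whether the intermediate cache honors these headers is exactly what I can't confirm from my logs.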
On Feb 12, 9:46 pm, Fabrizio Accatino <[email protected]> wrote:
> Afaik UrlFetch uses an internal cache.
>
> Note the two different headers in your logs:
> x-google-cache-control=[remote-cache-hit]
> x-google-cache-control=[remote-fetch]
>
> To force a fresh urlfetch every time, add these headers to your request.
>
> connection.addRequestProperty("Cache-Control", "no-cache,max-age=0");
> connection.addRequestProperty("Pragma", "no-cache");
>
> Fabrizio
>
> On Sat, Feb 12, 2011 at 12:08 PM, aka1g <[email protected]> wrote:
> > Hi!
> > I'm fetching web pages and parsing them. I'm seeing weird behavior
> > that looks like a GAE bug.
> > One of the pages is retrieved with totally different content.
> > I'm getting the
> > http://www.gamer-district.com/modules/mod_realmcore/mod_realmcore.php
> > page, and most of the time it is retrieved incorrectly.
> > It may be that some stale/cached version is returned.
> > I am logging the HTTP headers and page content. This only happens for
> > this particular URL.
--
You received this message because you are subscribed to the Google Groups
"Google App Engine for Java" group.