Hi!
I'm fetching web pages and parsing them, and I'm seeing weird behavior that
looks like a GAE bug.
One of the pages I retrieve comes back with totally different content than
expected. I'm fetching
http://www.gamer-district.com/modules/mod_realmcore/mod_realmcore.php
and most of the time the response is incorrect.
It looks like a stale/cached version is being returned.
I am logging the HTTP headers and the page content. This only happens for this
particular URL.
The logs when the content is incorrect:
2011-02-12 02:59:35.159

[keepeyeon/1.348283144444156083].<stdout>: 10:59:35.159 [pool-4-thread-1][keepEyeOn]
DEBUG wowpop.fetchers.UrlFetcher - Headers:
{server=[cloudflare-nginx], date=[Sat, 12 Feb 2011 10:44:31 GMT],
content-type=[text/html], connection=[keep-alive],
x-powered-by=[PHP/5.2.15], content-length=[1190], age=[904],
x-google-cache-control=[remote-cache-hit],
via=[HTTP/1.1 GWA (remote cache hit)]}

2011-02-12 02:59:35.159

[keepeyeon/1.348283144444156083].<stdout>: 10:59:35.159 [pool-4-thread-1][keepEyeOn]
DEBUG wowpop.fetchers.UrlFetcher - fetched 500chars:
<table style="width: 100%; border: 0; padding: 1px">
  <tr> <td><b style="color: #fff">Gamer District 7x</b></td>
  <td style="text-align: right;"> <img src="http://www.gamer-district.com/modules/mod_realmcore/wow_on.png"></td></tr>
</table>
<table style="width: 100%; border: 0; padding: 3">
  <tr> <td>Uptime:</td> <td>1 hours 15 minutes</td> </tr>
  <tr> <td>Players online:</td>
  <td><b>457</b> &nbsp; &nbsp;<span style="color: #ADDFFF">187</font> / <span style="color: #F62817">270</font></td>
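
The headers above already hint at where the stale copy comes from: the log entry is written at 10:59:35, the date header says 10:44:31, and the difference (15 min 4 s = 904 s) is exactly the age=[904] value. A minimal sketch of a check that flags such responses follows. The class and method names (CacheHitCheck, looksCached, ageSeconds) are made up for illustration, and treating "cache-hit" in x-google-cache-control as a cache marker is an assumption based only on the logs in this post, not on documented behavior.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class CacheHitCheck {

    /** Returns the first value of a header (name matched case-insensitively), or null. */
    public static String first(Map<String, List<String>> headers, String name) {
        for (Map.Entry<String, List<String>> e : headers.entrySet()) {
            // HttpURLConnection.getHeaderFields() uses a null key for the status line.
            if (e.getKey() != null && e.getKey().equalsIgnoreCase(name)
                    && e.getValue() != null && !e.getValue().isEmpty()) {
                return e.getValue().get(0);
            }
        }
        return null;
    }

    /** Parses the standard HTTP Age header; returns 0 if it is missing or malformed. */
    public static long ageSeconds(Map<String, List<String>> headers) {
        String age = first(headers, "age");
        if (age == null) {
            return 0L;
        }
        try {
            return Math.max(0L, Long.parseLong(age.trim()));
        } catch (NumberFormatException e) {
            return 0L;
        }
    }

    /** True if the headers suggest that an intermediate cache answered the request. */
    public static boolean looksCached(Map<String, List<String>> headers) {
        // Assumption: this header value marks a cache hit, as seen in the logs above.
        String google = first(headers, "x-google-cache-control");
        if (google != null && google.toLowerCase(Locale.ROOT).contains("cache-hit")) {
            return true;
        }
        return ageSeconds(headers) > 0;
    }

    public static void main(String[] args) {
        // Header values copied from the incorrect response in the log.
        Map<String, List<String>> stale = new HashMap<>();
        stale.put("age", Arrays.asList("904"));
        stale.put("x-google-cache-control", Arrays.asList("remote-cache-hit"));
        System.out.println(looksCached(stale)); // prints true
        System.out.println(ageSeconds(stale));  // prints 904
    }
}
```

Logging the result of such a check next to the parsed values would make it easy to discard or retry responses that came from a cache.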


Five minutes earlier the same page was retrieved correctly. Log:
2011-02-12 02:54:34.244

[keepeyeon/1.348283144444156083].<stdout>: 10:54:34.243 [pool-4-thread-1][keepEyeOn]
DEBUG wowpop.fetchers.UrlFetcher - Headers:
{server=[cloudflare-nginx], date=[Sat, 12 Feb 2011 10:54:34 GMT],
content-type=[text/html], connection=[keep-alive],
x-powered-by=[PHP/5.2.15],
set-cookie=[__cfduid=db133a16aaad88bded9766b18448779691297508074;
expires=Mon, 23 Dec 2019 23:50:00 GMT; path=/; domain=.gamer-district.com,
__cfduid=db133a16aaad88bded9766b18448779691297508074;
expires=Mon, 23 Dec 2019 23:50:00 GMT; path=/; domain=.www.gamer-district.com],
x-google-cache-control=[remote-fetch], via=[HTTP/1.1 GWA]}

2011-02-12 02:54:34.244

[keepeyeon/1.348283144444156083].<stdout>: 10:54:34.244 [pool-4-thread-1][keepEyeOn]
DEBUG wowpop.fetchers.UrlFetcher - fetched 500chars:
<table style="width: 100%; border: 0; padding: 1px">
  <tr> <td><b style="color: #fff">Gamer District 7x</b></td>
  <td style="text-align: right;"> <img src="http://www.gamer-district.com/modules/mod_realmcore/wow_on.png"></td></tr>
</table>
<table style="width: 100%; border: 0; padding: 3">
  <tr> <td>Uptime:</td> <td>2 hours 35 minutes</td> </tr>
  <tr> <td>Players online:</td>
  <td><b>1400</b> &nbsp; &nbsp;<span style="color: #ADDFFF">630</font> / <span style="color: #F62817">770</font></td


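The headers are DIFFERENT and carry a different TIME: in the incorrect
response the date header (10:44:31 GMT) is about 15 minutes older than the
log entry (10:59:35), while in the correct response the two match. Also note
the UPTIME value in the HTML: the later fetch reports a shorter uptime
(1 hours 15 minutes) than the earlier one (2 hours 35 minutes).
I'm not caching anything myself.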
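
One thing that could be tried is asking every cache on the path for a fresh copy by sending no-cache request headers. The sketch below uses plain java.net.HttpURLConnection, which App Engine supports. FreshFetch and openUncached are hypothetical names, and whether Google's fetch proxy (the "GWA" in the via header) honors these request headers is an assumption that would need to be verified against the logs.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

class FreshFetch {

    /**
     * Opens (but does not yet connect) an HTTP connection whose request
     * headers ask intermediate caches not to serve a stored copy.
     */
    public static HttpURLConnection openUncached(String url) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        // HTTP/1.1 directive: caches should revalidate instead of serving a stored copy.
        conn.setRequestProperty("Cache-Control", "no-cache, max-age=0");
        // HTTP/1.0 equivalent for older intermediaries.
        conn.setRequestProperty("Pragma", "no-cache");
        // Also disable any local protocol-level cache.
        conn.setUseCaches(false);
        return conn;
    }

    public static void main(String[] args) throws IOException {
        HttpURLConnection conn = openUncached(
                "http://www.gamer-district.com/modules/mod_realmcore/mod_realmcore.php");
        // No network traffic has happened yet; this only shows the configured header.
        System.out.println(conn.getRequestProperty("Cache-Control"));
    }
}
```

If the response still comes back with x-google-cache-control=[remote-cache-hit], the request headers are evidently not enough and the problem lies further upstream.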

-- 
You received this message because you are subscribed to the Google Groups 
"Google App Engine for Java" group.