On Tue, Aug 25, 2009 at 06:30:40AM -0700, Ken Krugler wrote:
> Hi Melroyr,
>
> On Aug 25, 2009, at 3:19am, melroyr wrote:
>
>> Ken, Thanks for your response.
>> If you look at the source at
>> http://flyer.harristeeter.com/HT_eVIC/ThisWeek/ReviewAllSpecials.jsp?ToCat=0
>> thru 13, the page changes its content. However, when I download the
>> same pages through HttpClient, I get a message saying that the browser
>> does not support framesets, and there is no content.
>
> The use of the frameset tag isn't the issue.
>
> Your problem is that this site sets a cookie (StoreNumberCK) with a  
> store id. If that's set, then you get a page with full content.
>
> If it's not set, you get the page that you sent to the list, which  
> contains a link that, when clicked, will let you pick your local store.
>
> You'll need to figure out what content to set in that cookie, and
> programmatically create it before making the HTTP GET request.
>
> -- Ken
>

Melroyr,

Please have a look at the HttpClient primer

http://hc.apache.org/httpcomponents-client/primer.html

Oleg
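
Ken's suggestion above (create the StoreNumberCK cookie before issuing the
GET) could be sketched roughly as follows. This is only an illustration under
several assumptions: it uses the JDK's own java.net.http client (Java 11+)
rather than Apache HttpClient, so that it runs without extra jars; the class
name StoreCookieFetch and the helper buildRequest are made up for this
example; and the store id is passed on the command line because the thread
does not say what value the site expects.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical example class; not part of any library mentioned in the thread.
class StoreCookieFetch {
    // URL quoted in the original post.
    static final String FLYER_URL =
        "http://flyer.harristeeter.com/HT_eVIC/ThisWeek/ReviewAllSpecials.jsp";

    // Builds a GET request that already carries the StoreNumberCK cookie.
    // The cookie value format is an assumption; the real value has to be
    // discovered by inspecting what a browser sends after a store is picked.
    static HttpRequest buildRequest(String url, String storeId) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("Cookie", "StoreNumberCK=" + storeId)
                .GET()
                .build();
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        if (args.length != 1) {
            System.err.println("usage: java StoreCookieFetch <storeId>");
            return;
        }
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response = client.send(
                buildRequest(FLYER_URL, args[0]),
                HttpResponse.BodyHandlers.ofString());
        // Print the status and the page so the result can be compared with
        // the frameset-only page quoted in this thread.
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```

The same idea should carry over to Apache HttpClient by placing a cookie with
that name in the client's cookie store before executing the request; the
primer linked above is the place to check the exact calls for your version.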

>>
>>
>> melroyr wrote:
>>>
>>> I have written a program to download html pages from harristeeter.
>>> However, when I run my program, I get the following
>>>
>>> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN"
>>> "http://www.w3.org/TR/html4/frameset.dtd">
>>> <html>
>>> <head>
>>> <title>Your Personal Shopping List</title>
>>> <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
>>>
>>> <script language='javascript'>
>>>
>>> if (top.location != self.location) {
>>>             top.location = self.location
>>> }
>>>
>>> if ('null' == 'null')
>>> {
>>>     var width = screen.width;
>>>     var height = screen.height;
>>>
>>>     var myWidth = 640, myHeight = 480;
>>>     if( typeof( window.innerWidth ) == 'number' ) {
>>>             //Non-IE
>>>             myWidth = window.innerWidth;
>>>             myHeight = window.innerHeight;
>>>     }
>>>     else if( document.documentElement &&
>>>       ( document.documentElement.clientWidth ||
>>> document.documentElement.clientHeight ) )
>>>     {
>>>             //IE 6+ in 'standards compliant mode'
>>>             myWidth = document.documentElement.clientWidth;
>>>             myHeight = document.documentElement.clientHeight;
>>>     }
>>>     else if( document.body &&
>>>              ( document.body.clientWidth || document.body.clientHeight ) )
>>>     {
>>>             //IE 4 compatible
>>>             myWidth = document.body.clientWidth;
>>>             myHeight = document.body.clientHeight;
>>>             height = screen.availHeight;
>>>             width = screen.availWidth;
>>>     }
>>>
>>>     
>>>
>>>     var x = 0;
>>>     var y = 0;
>>>
>>>     
>>>
>>>     var minWidth = (width < 960) ? width : 960;
>>>
>>>     if (myWidth < minWidth && width >= minWidth && myWidth > 0 && myHeight > 0)
>>>     {
>>>             if (navigator.appName=="Netscape") y = self.screenY;
>>>             else y = self.top;
>>>
>>>             var w = 800;
>>>             var h = myHeight;
>>>             var new_y = y;
>>>             if (screen.width > 1024) w = 1024;
>>>             else if (screen.width > 960) w = 960;
>>>             if (myHeight < (0.80) * height)
>>>             {
>>>                     h = (0.80)*height;
>>>                     new_y = (height - h)/2;
>>>             }
>>>
>>>             if (new_y < y) y = new_y;
>>>
>>>             x = (width - w)/2;
>>>
>>>             if (x < 0)
>>>             {
>>>                     w += x;
>>>                     x = 0;
>>>             }
>>>
>>>             if (y < 0)
>>>             {
>>>                     h += y;
>>>                     y = 0;
>>>             }
>>>
>>>             if (w > width) w = width;
>>>
>>>             
>>>
>>>             if (parseInt(navigator.appVersion)>3)
>>>             {
>>>                if (navigator.appName=="Netscape")
>>>                {
>>>                             self.outerWidth=w;
>>>                             self.outerHeight=h;
>>>                             self.moveTo(x,y);
>>>                }
>>>                else
>>>                {
>>>                             self.resizeTo(w,h);
>>>                             self.moveTo(0,0);
>>>                }
>>>             }
>>>     }
>>>
>>>
>>> location='index.jsp?screenwidth='+screen.width 
>>> +'&default_screenwidth=1&rand='+Math.random();
>>> }
>>>
>>> if ('false' == 'true')
>>> {
>>>     top.location='index.jsp?ID'+Math.round(Math.random()*10000);
>>> }
>>>
>>> </script>
>>>
>>> </head>
>>>
>>>
>>> <frameset rows="*,0" cols="*" frameborder="no" border="0"
>>> framespacing="0">
>>> <frameset rows="132,*" cols="*" frameborder="no" border="0"
>>> framespacing="0">
>>>  <frame src="top.jsp" name="topFrame" scrolling="no" noresize>
>>>  <frameset rows="*" cols="400,*" framespacing="0" frameborder="no"
>>> border="0">
>>>     <frame src="ReviewAllSpecials.jsp" name="mainFrame" scrolling="YES">
>>>     <frame src="list.jsp" name="rightFrame" scrolling="YES" noresize>
>>>  </frameset>
>>> </frameset>
>>> <frame src="actions.jsp" name="bottomFrame" scrolling="YES" noresize>
>>> </frameset>
>>>
>>> <noframes><body>
>>> This application requires the use of frames, which your browser does not
>>> support.
>>> </body></noframes>
>>>
>>> </html>
>>>
>>> The URL I am using to download the pages is
>>> http://flyer.harristeeter.com/HT_eVIC/ThisWeek/ReviewAllSpecials.jsp
>>>
>>> Please advise if there is some setting that I need to set in HttpClient.
>>> I have read about HtmlCleaner and stuff, but I do not think they will
>>> help.
>>>
>>> Thanks,
>>> Melroy
>>>
>>
>> -- 
>> View this message in context: 
>> http://www.nabble.com/Downloading-HTML-frameset-pages-via-HTTPClient-tp25121961p25131807.html
>> Sent from the HttpClient-User mailing list archive at Nabble.com.
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [email protected]
>> For additional commands, e-mail: [email protected]
>>
>
> --------------------------
> Ken Krugler
> TransPac Software, Inc.
> <http://www.transpac.com>
> +1 530-210-6378
>
