That library has a configuration setting that allows it to crawl HTTPS
pages. By default the library plays by the rules: it honors the
robots.txt file and does not try to crawl HTTPS pages. All of those
restrictions can be removed through the configuration file.
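
For the original question in this thread (listing the resources that make
up a page), here is a minimal sketch of the parser approach. It is written
in Python with only the standard library, not with the HTMLParserNet
library mentioned below, so the class and function names (ResourceCollector,
list_resources), the sample markup, and the example.com URL are all
illustrative assumptions. It only finds resources referenced statically in
the markup; anything pulled in by CSS @import or by script at run time
would not show up.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ResourceCollector(HTMLParser):
    """Collects URLs of stylesheets, images and scripts referenced by a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url  # used to resolve relative references
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and "stylesheet" in (attrs.get("rel") or "").lower().split():
            ref = attrs.get("href")
        elif tag in ("img", "script"):
            ref = attrs.get("src")
        else:
            ref = None
        if ref:
            self.resources.append(urljoin(self.base_url, ref))


def list_resources(html, base_url):
    """Return absolute URLs of the sub-resources found in the given HTML text."""
    collector = ResourceCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.resources


# Hypothetical page using the resource names from the original question.
SAMPLE = """
<html><head>
<link rel="stylesheet" href="Default.css">
<link rel="stylesheet" href="Modern.css">
<script src="Script.js"></script>
</head><body><img src="Image1.jpg"></body></html>
"""

if __name__ == "__main__":
    for url in list_resources(SAMPLE, "https://example.com/page1.aspx"):
        print(url)
```

In a real run the HTML would come from an HTTP(S) request rather than a
string; in Python, urllib.request.urlopen accepts https:// URLs when the
interpreter is built with SSL support, so the same parsing step applies to
pages served over HTTPS.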

-----Original Message-----
From: Discussion of advanced .NET topics.
[mailto:[EMAIL PROTECTED] On Behalf Of Paul Cowan
Sent: Wednesday, March 08, 2006 10:17 AM
To: ADVANCED-DOTNET@DISCUSS.DEVELOP.COM
Subject: Re: [ADVANCED-DOTNET] HTTP help

That does look interesting, Kohli.

One more problem is that some of the requests will be made over HTTPS,
and I just cannot see how we can handle the HTTPS calls.



[EMAIL PROTECTED]

>From: "Kohli, Naveen" <[EMAIL PROTECTED]>
>Reply-To: "Discussion of advanced .NET topics."
><ADVANCED-DOTNET@DISCUSS.DEVELOP.COM>
>To: ADVANCED-DOTNET@DISCUSS.DEVELOP.COM
>Subject: Re: [ADVANCED-DOTNET] HTTP help
>Date: Wed, 8 Mar 2006 09:54:09 -0500
>
>You can look at using an HTML parser. There are a few available out
>there. One that I have been using for some time is:
>
>http://www.netomatix.com/Products/DocumentManagement/HTMLParserNet.aspx
>
>N
>
>-----Original Message-----
>From: Discussion of advanced .NET topics.
>[mailto:[EMAIL PROTECTED] On Behalf Of Paul Cowan
>Sent: Wednesday, March 08, 2006 9:37 AM
>To: ADVANCED-DOTNET@DISCUSS.DEVELOP.COM
>Subject: [ADVANCED-DOTNET] HTTP help
>
>Hi all,
>
>Can anyone help me with the following requirements? We want to parse an
>HTTP request for a web page and display all the constituent parts that
>make up the web page. That is, I want to display all the additional
>requests that are made to make up the whole page (i.e. CSS, images and
>JavaScript files). Say I make a request for page1.aspx; the system would
>then log that it is made up of the following resources:
>
>Default.css
>Modern.css
>Image1.jpg
>Script.js
>Etc., etc.
>
>I have no idea how to achieve this. Does anybody know?
>
>Thanks
>
>Paul
>
>===================================
>This list is hosted by DevelopMentor(r)  http://www.develop.com
>
>View archives and manage your subscription(s) at
>http://discuss.develop.com
>
