RE: Parsing??

Michael She Sat, 16 Dec 2000 23:35:33 -0800
You'll have a fun time.

Unfortunately, HTML, unlike XML, is not well defined.  Basically, you'll have to 
hard-code patterns and rules to pull out specific data.
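
For what it's worth, here is a rough sketch of what hard-coding a pattern can look like. It is not anyone's production code: the sample page text and the two marker strings are made up for illustration, and pageText stands in for CFHTTP.FileContent. It looks for a start marker, then a stop marker after it, and only extracts when both are found.

```cfml
<!--- Hypothetical sample standing in for CFHTTP.FileContent --->
<cfset pageText = "<html><body><h1>News</h1><b>Saturday</b>: Example headline<hr><b>Friday</b>: Older item</body></html>">

<!--- Hypothetical markers; real ones depend on the target page's markup --->
<cfset startMarker = "<b>Saturday</b>: ">
<cfset stopMarker = "<hr>">
<cfset headline = "">

<cfset startPos = Find(startMarker, pageText)>
<cfif startPos GT 0>
  <!--- First character after the start marker --->
  <cfset contentStart = startPos + Len(startMarker)>
  <!--- Look for the stop marker only after the start marker --->
  <cfset stopPos = Find(stopMarker, pageText, contentStart)>
  <cfif stopPos GT contentStart>
    <cfset headline = Mid(pageText, contentStart, stopPos - contentStart)>
  </cfif>
</cfif>

<!--- Empty output means a marker was not found, e.g. the page layout changed --->
<cfoutput>#HTMLEditFormat(headline)#</cfoutput>
```

Checking that Find() did not return 0 before calling Mid() keeps the template from throwing an error when the target page changes; you just get an empty headline instead.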

The best option would be to switch over to an XML-based web page or data 
feed instead.  That way you can easily manipulate it.
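
As a sketch of why XML is easier to work with: the snippet below assumes a ColdFusion version that has XmlParse() and XmlSearch() (ColdFusion MX or later), and the feed layout and element names are invented for the example. With well-formed XML you ask for the element you want by path instead of counting characters.

```cfml
<!--- Hypothetical XML feed; element and attribute names are made up --->
<cfset feedXml = "<news><item day=""Saturday""><headline>Example headline</headline></item><item day=""Friday""><headline>Older item</headline></item></news>">

<!--- Parse the text into an XML document object --->
<cfset doc = XmlParse(feedXml)>

<!--- XPath query: the headline of the item whose day attribute is Saturday --->
<cfset hits = XmlSearch(doc, "/news/item[@day='Saturday']/headline")>

<cfif ArrayLen(hits) GT 0>
  <cfoutput>#HTMLEditFormat(hits[1].XmlText)#</cfoutput>
</cfif>
```

The query keeps working if the feed adds items or changes whitespace, which is exactly where the character-offset approach tends to break.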

At 12:49 PM 12/16/00 -0500, ibtoad wrote:

>No, I wish I did.
>Rich
>
>-----Original Message-----
>From: Craig Thomas [mailto:[EMAIL PROTECTED]]
>Sent: Saturday, December 16, 2000 12:37 PM
>To: CF-Talk
>Subject: RE: Parsing??
>
>
>Do you have access to the code of the target page?
>
>All my <cfset> tags and the find() and len() functions add and subtract
>characters from the CFHTTP.FileContent variable...giving me parts of the web
>page I want to display.  Watch out, if the target page changes, your code
>needs to change.
>
>In my code the "Start" variable is where I start, the "Stop" is where I stop
>and the "Headline" is that part of the page I want to display.
>
>Craig
>
>
>-----Original Message-----
>From: ibtoad [mailto:[EMAIL PROTECTED]]
>Sent: Saturday, December 16, 2000 12:18 PM
>To: CF-Talk
>Subject: RE: Parsing??
>
>
>I have successfully used the cfhttp tag to pull a webpage and then display
>it with CFHTTP.FileContent.  But how do I just display specific parts?
>
>Rich
>
>-----Original Message-----
>From: Craig Thomas [mailto:[EMAIL PROTECTED]]
>Sent: Saturday, December 16, 2000 11:10 AM
>To: CF-Talk
>Subject: RE: Parsing??
>
>
>Rich,
>
>To parse a web page you first grab the content with the <CFHTTP> tag, then
>"read" the content using the CFHTTP.FileContent variable.  In the code below
>I use the find and len functions to figure out where to start and stop the
>reading/parsing of the page.
>
>This code works only when the page you are parsing is set up just like the
>code.  Where to start and stop depends on what piece of what page you are
>trying to parse...if you want the entire page, it lives in the
>CFHTTP.FileContent variable returned by the <CFHTTP> call. The online help
>in CF Studio is worth a read to understand the <CFHTTP> tag.
>
>
><!--- Get today's news item from somewebsite.com --->
>
><!--- Day names used as start/stop markers in the page text --->
><cfset TodayIS = DateFormat(Now(), "dddd")>
><cfset YesterDayWas = DateFormat(DateAdd("d", -1, Now()), "dddd")>
>
><cfhttp method="get" url="http://www.somewebsite.com" resolveurl="Yes">
>
><!--- Position of today's day name in the fetched page --->
><cfset Start = Find(TodayIS, CFHTTP.FileContent)>
>
><!--- Keep the text that follows the marker; the 23 is specific to this page --->
><cfset Headline = Right(CFHTTP.FileContent,
>(Len(CFHTTP.FileContent)-Start)-23)>
>
><!--- Cut the text off before yesterday's day name; the 20 is specific to this page --->
><cfset Stop = Find(YesterDayWas, Headline)>
><cfset Headline = Left(Headline, Stop-20)>
>
>
>Craig,
>
>PK Interactive Inc.
>NY, NY 10001
>212.273.9623
>
>-----Original Message-----
>From: ibtoad [mailto:[EMAIL PROTECTED]]
>Sent: Saturday, December 16, 2000 10:48 AM
>To: CF-Talk
>Subject: Parsing??
>
>
>Can anyone tell me where I can find out how to parse a webpage and pull out
>specific information?  I have never done this before.
>
>Thanks,
>Rich
>