Yes, this is a site I have no control over, so if the HTML changes then I'm hosed, like you said.  So the decision was made not to spend my time on this particular application.

 

Thanks all!

 

Ron Mast

Webmaster

Truth Hardware

Ph: 507-444-4748

Fx: 507-444-5361

www.truth.com


From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Dave Shuck
Sent: Friday, December 09, 2005 11:33 AM
To: Dallas/Fort Worth ColdFusion User Group Mailing List
Subject: Re: [DFW CFUG] parsing question

 

Ron, the idea of using REFind was that you look for a pattern rather than an exact match.  It is a very powerful function and there is a lot of documentation out there.  Here is a good starting point, but you will need to familiarize yourself with regular expressions too.  Thankfully there is a lot of information available about those as well.

http://livedocs.macromedia.com/coldfusion/6/CFML_Reference/functions-pt262.htm

I am guessing you have no control over the HTML that you are scraping.  Is that correct?  If you do, it might be a good idea to just embed some IDs in elements or embed some HTML comments as reference points.  If you are scraping that page based on the attributes of that table, then if the site ever changes its styling you are hosed.  In general I think it is a bad idea to tie your business logic to someone else's HTML design that you have no control over.  REFind would help to that end a bit, but obviously not completely.

Alternatively, is this a partner site?  Is XML a possibility?  That seems like a much more precise method.

That'll be $0.02. :)

On 12/9/05, Ron Mast <[EMAIL PROTECTED]> wrote:

Me again,

I'm doing the following:

<cfhttp method="GET" url="http://66.13.185.22/doc/HW_ResultMemberItems.asp?ListItem=7&Size=Nike&textfield=BAU&textfield1=7&Submit=Show">

<cfset startOfTable = REFind('<table cellpadding=5 width=650 height="250" cellspacing=1>', "#cfhttp.FileContent#")>

<cfset lengthOfContent = len(cfhttp.FileContent)>

<cfset endOfTable = REFind("</table>", cfhttp.FileContent, 1, "TRUE")>

<cfdump var="#endOfTable#">

 

I did a search for </table> and there are 6 but the endOfTable is only showing me 1 position instead of 6. I guess I don't have the grasp on how to use REFind's reg_expression parameter. Can someone explain to me why?
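
A note on the behavior described above: a single REFind call reports only the first match at or after its start position, so it returns one position however many </table> tags the page contains. To get all of them, it has to be called repeatedly with an advancing start position. A minimal sketch, assuming a made-up pageHtml string standing in for cfhttp.FileContent (the variable names are illustrative, not from the original code):

```cfml
<!--- Sketch only: pageHtml is sample markup standing in for cfhttp.FileContent --->
<cfset pageHtml = "<table><tr><td>a</td></tr></table><p>x</p><table></table>">
<cfset closePositions = ArrayNew(1)>
<cfset searchFrom = 1>

<cfloop condition="searchFrom LTE Len(pageHtml)">
    <!--- With the 4th argument true, the result is a struct holding pos and len arrays --->
    <cfset match = REFindNoCase("</table>", pageHtml, searchFrom, true)>
    <!--- pos[1] is 0 when nothing more is found --->
    <cfif match.pos[1] EQ 0>
        <cfbreak>
    </cfif>
    <cfset ArrayAppend(closePositions, match.pos[1])>
    <!--- Resume just past this match so the next call finds the next tag --->
    <cfset searchFrom = match.pos[1] + match.len[1]>
</cfloop>

<cfdump var="#closePositions#">
```

REFindNoCase is used here on the assumption that the tag case on the page may vary; REFind would behave the same way for an exact-case match.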

 

Ron Mast



From: [EMAIL PROTECTED] [mailto: [EMAIL PROTECTED]] On Behalf Of Tom Woestman
Sent: Friday, December 09, 2005 9:12 AM
To: Dallas/Fort Worth ColdFusion User Group Mailing List
Subject: RE: [DFW CFUG] parsing question

 

Morning Ron,

 

REFind works well for situations like this if you put parentheses around the part of the expression you want to extract.  I suggest using REFind in a loop: extract all the values for one item first and place those in an array, then loop to extract the next item's values into another array, and so on until you have extracted all the values.
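
A rough sketch of that approach, for illustration only. The rowHtml string and the <td> pattern are assumptions made up for the example, not taken from the actual page, so the expression would need adjusting to the real markup:

```cfml
<!--- Sketch only: rowHtml is invented sample markup, not the real page --->
<cfset rowHtml = "<tr><td>Widget</td><td>AB-100</td><td>12</td></tr>">
<cfset cellValues = ArrayNew(1)>
<cfset searchFrom = 1>

<cfloop condition="searchFrom LTE Len(rowHtml)">
    <!--- pos[1]/len[1] describe the whole match; pos[2]/len[2] the parenthesized part --->
    <cfset match = REFindNoCase("<td[^>]*>([^<]*)</td>", rowHtml, searchFrom, true)>
    <cfif match.pos[1] EQ 0>
        <cfbreak>
    </cfif>
    <cfif match.len[2] GT 0>
        <!--- Pull out just the text captured by the parentheses --->
        <cfset ArrayAppend(cellValues, Trim(Mid(rowHtml, match.pos[2], match.len[2])))>
    <cfelse>
        <!--- Empty cell: store an empty string rather than calling Mid with a zero count --->
        <cfset ArrayAppend(cellValues, "")>
    </cfif>
    <!--- Continue searching after the end of this cell --->
    <cfset searchFrom = match.pos[1] + match.len[1]>
</cfloop>

<cfdump var="#cellValues#">
```

The same loop can be run once per row, giving one array per item as described above.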

 

Tom

 


From: Ron Mast [mailto:[EMAIL PROTECTED]]
Sent: Friday, December 09, 2005 6:43 AM
To: Dallas/Fort Worth ColdFusion User Group Mailing List
Subject: [DFW CFUG] parsing question

 

Good morning,

A while back there was a discussion on parsing cfhttp GET content. I can't seem to find it, and the application I created at the time, which retrieved golf scores etc., was deleted along the way... oops. Please look at this site and tell me the easiest way to pull the Description, Code, OnHand, Show, Price, and Qty using a cfhttp GET.  http://66.13.185.22/doc/HW_ResultMemberItems.asp?ListItem=7&Size=Nike&textfield=BAU&textfield1=7&Submit=Show

 

Thanks in advance,

 

Ron Mast


 

_______________________________

This e-mail and any files transmitted with it are confidential and are intended solely for the use of the individual to whom they are addressed.  If you are not the intended recipient or the individual responsible for delivering the e-mail to the intended recipient, please be advised that you have received this e-mail in error and that any use, dissemination, forwarding, printing, or copying of this e-mail is strictly prohibited.


_______________________________________________
Reply to DFWCFUG:
 [email protected]
Subscribe/Unsubscribe:
 http://lists1.safesecureweb.com/mailman/listinfo/list
List Archives:
  http://lists1.safesecureweb.com/mailman/private/list
  http://www.mail-archive.com/list%40list.dfwcfug.org/
 http://www.mail-archive.com/list%40dfwcfug.org/
DFWCFUG Sponsors:
 www.HostMySite.com
  www.teksystems.com/




--
~Dave Shuck
[EMAIL PROTECTED]
www.daveshuck.com
