The Apache Jakarta Taglibs project has a screen-scraping JSP tag library
(Scrape) that I've used with CF for some things:
http://jakarta.apache.org/taglibs/doc/scrape-doc/intro.html

On 12/9/05, Jacob Cameron <[EMAIL PROTECTED]> wrote:
>
> There may be an easier way, but a parser like this usually takes 1-3 hours
> to code. This one is a little complicated (the td attributes differ), but
> still pretty easy because the markup is uniform (it must be generated from
> a database).
>
> I would do a find for:  <table cellpadding=5 width=650 height="250"
> cellspacing=1>
>
> Then find the first </table> after that and cut the string there, so your
> result string contains only the table you want.
>
> With this string you can simply loop through all the <tr>...</tr> rows.
> Within that loop, you know that the first <td> is Description, the second
> is Code, the third is OnHand, and so forth. Since the <td> attributes
> change from cell to cell, I would parse each one by looking for its closing
> </td>. Alternatively, in the first cell the description starts at the 92nd
> or 93rd character of the <tr> string (92 or 93 depending on whether the
> line ending is a carriage return and line feed or just a carriage return).
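
A minimal sketch of the find-and-loop approach described above, written in
Python rather than CFML so it can be run standalone. The sample HTML, the
function name parse_items, and the column order are assumptions for
illustration only; the real page was not inspected, and the table marker is
simply the one quoted above.

```python
import re

# Start marker quoted in the message above; the real page may differ.
TABLE_START = '<table cellpadding=5 width=650 height="250"'
# Column order as described in the thread (assumed, not verified).
FIELDS = ["Description", "Code", "OnHand", "Show", "Price", "Qty"]


def parse_items(html):
    """Return one dict per data row of the first table matching the marker."""
    start = html.find(TABLE_START)
    if start == -1:
        return []
    # First </table> after the marker closes the table we want.
    end = html.find("</table>", start)
    if end == -1:
        return []
    table = html[start:end]

    rows = []
    # Loop through every <tr>...</tr>; re.S lets a row span several lines.
    for tr in re.findall(r"<tr\b[^>]*>(.*?)</tr>", table, re.I | re.S):
        # <td\b[^>]*> tolerates whatever attributes each cell carries.
        cells = re.findall(r"<td\b[^>]*>(.*?)</td>", tr, re.I | re.S)
        # Drop any nested tags and surrounding whitespace.
        cells = [re.sub(r"<[^>]+>", "", c).strip() for c in cells]
        # Rows without exactly six <td> cells (e.g. a <th> header) are skipped.
        if len(cells) == len(FIELDS):
            rows.append(dict(zip(FIELDS, cells)))
    return rows


# Hypothetical sample standing in for the cfhttp result.
SAMPLE = """<html><body>
<table><tr><td>navigation</td></tr></table>
<table cellpadding=5 width=650 height="250" cellspacing=1>
<tr><th>Description</th><th>Code</th><th>OnHand</th><th>Show</th><th>Price</th><th>Qty</th></tr>
<tr><td width=200>Sample item</td><td align=center>ABC123</td><td>4</td><td>Y</td><td>9.99</td><td>1</td></tr>
</table>
</body></html>"""

if __name__ == "__main__":
    for row in parse_items(SAMPLE):
        print(row)
```

Matching each cell up to its closing </td> avoids depending on the 92/93
character offset, so the line-ending difference stops mattering.
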
>
> It should take about 1 to 1.5 hours to write this quick little parser,
> test it, and get the data into a database.
>
> I've also used the XML functions to parse the data, but most HTML throws
> them for a loop. If it's an XHTML site, it's simple to parse. It's easier
> to use the XML functions when that is possible, because you can then just
> ask for the element you want.
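
To illustrate the XML route mentioned above, here is a rough sketch using
Python's standard xml.etree.ElementTree in place of the CF XML functions.
The XHTML sample and the helper names are made up for the example; it only
shows why well-formed XHTML is easy to query while ordinary HTML is rejected
by an XML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed XHTML standing in for a fetched page.
XHTML = """<html xmlns="http://www.w3.org/1999/xhtml"><body>
<table id="items">
<tr><td>Sample item</td><td>ABC123</td></tr>
<tr><td>Other item</td><td>XYZ789</td></tr>
</table>
</body></html>"""

# XHTML elements live in this namespace, so queries must name it.
NS = {"x": "http://www.w3.org/1999/xhtml"}


def cell_texts(xhtml):
    """Return the text of every <td>, grouped by <tr>."""
    root = ET.fromstring(xhtml)
    return [
        [(td.text or "").strip() for td in tr.findall("x:td", NS)]
        for tr in root.findall(".//x:table/x:tr", NS)
    ]


def parses_as_xml(markup):
    """True if an XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print(cell_texts(XHTML))
    # An unclosed <br> is fine in HTML but is not well-formed XML.
    print(parses_as_xml("<p>a<br>b</p>"))
```
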
>
> Jacob Cameron
>  Blue Lantern, Inc.
>  (972) 226-9595
>  [EMAIL PROTECTED]
>  http://www.bluelantern.com
>
>  ________________________________
>
>
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> On Behalf Of Ron Mast
>  Sent: Friday, December 09, 2005 8:43 AM
>  To: Dallas/Fort Worth ColdFusion User Group Mailing List
>  Subject: [DFW CFUG] parsing question
>
> Good morning,
>
> A while back there was a discussion on parsing cfhttp GET content. I
> can't seem to find it, and the application I created to retrieve golf
> scores etc. was deleted along the way… oops. Please look at this site
> and tell me the easiest way to pull the Description, Code, OnHand, Show,
> Price, and Qty using a cfhttp GET:
> http://66.13.185.22/doc/HW_ResultMemberItems.asp?ListItem=7&Size=Nike&textfield=BAU&textfield1=7&Submit=Show
>
> Thanks in advance,
>
> Ron Mast
> Webmaster
> Truth Hardware
> Ph: 507-444-4748
> Fx: 507-444-5361
> www.truth.com
>
> _______________________________________________
> Reply to DFWCFUG:
>   [email protected]
> Subscribe/Unsubscribe:
>   http://lists1.safesecureweb.com/mailman/listinfo/list
> List Archives:
>   http://lists1.safesecureweb.com/mailman/private/list
>   http://www.mail-archive.com/list%40list.dfwcfug.org/
>   http://www.mail-archive.com/list%40dfwcfug.org/
> DFWCFUG Sponsors:
>   www.HostMySite.com
>   www.teksystems.com/
>

--
Matt Woodward
[EMAIL PROTECTED]
http://www.mattwoodward.com