The Apache Jakarta Taglibs project has a screen-scraping JSP tag library (Scrape) that I've used with CF for some things: http://jakarta.apache.org/taglibs/doc/scrape-doc/intro.html
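
If you would rather hand-roll it, here is a rough sketch of the find-the-table, loop-the-rows approach described in the quoted message below. It is in Python rather than CFML, purely for illustration. Only the table start tag and the six column names come from the thread; the sample markup, the extract_rows name, and the choice to skip rows that do not have exactly six <td> cells are my own assumptions, so treat it as a starting point rather than a finished parser.

```python
import re

# The table start tag quoted in the thread. The match is case-sensitive and
# assumes the page emits this tag exactly as written.
TABLE_START = '<table cellpadding=5 width=650 height="250" cellspacing=1>'

# Column order as described in the thread.
FIELDS = ["Description", "Code", "OnHand", "Show", "Price", "Qty"]


def extract_rows(html):
    """Return one dict per six-cell row of the table that begins with TABLE_START."""
    start = html.find(TABLE_START)
    if start == -1:
        return []
    # The first </table> after the start tag closes the table we want.
    end = html.find("</table>", start)
    if end == -1:
        return []
    table = html[start + len(TABLE_START):end]

    rows = []
    for tr in re.findall(r"<tr\b[^>]*>(.*?)</tr>", table, re.I | re.S):
        # [^>]* absorbs whatever attributes each <td> happens to carry.
        cells = re.findall(r"<td\b[^>]*>(.*?)</td>", tr, re.I | re.S)
        # Drop any nested tags (for example <b>) and surrounding whitespace.
        cells = [re.sub(r"<[^>]+>", "", cell).strip() for cell in cells]
        # Rows without exactly six <td> cells (such as a <th> header) are skipped.
        if len(cells) == len(FIELDS):
            rows.append(dict(zip(FIELDS, cells)))
    return rows


# Made-up markup standing in for the cfhttp result; not taken from the real page.
SAMPLE = """
<html><body>
<table><tr><td>navigation</td></tr></table>
<table cellpadding=5 width=650 height="250" cellspacing=1>
<tr><th>Description</th><th>Code</th><th>OnHand</th><th>Show</th><th>Price</th><th>Qty</th></tr>
<tr><td width=200><b>Widget A</b></td><td>A100</td><td align=right>12</td><td>Y</td><td>4.50</td><td>1</td></tr>
<tr><td width=210>Widget B</td><td>B200</td><td align=right>0</td><td>N</td><td>7.25</td><td>2</td></tr>
</table>
</body></html>
"""

for row in extract_rows(SAMPLE):
    print(row)
```

Matching on </td> instead of counting characters sidesteps the 92-versus-93 offset problem, since the differing <td> attributes and line endings no longer matter.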
On 12/9/05, Jacob Cameron <[EMAIL PROTECTED]> wrote:
>
> There may be an easier way, but parsers usually take 1-3 hours to code.
> This one is a little complicated (different td parameters), but pretty
> easy because it is uniform code (must come from a database).
>
> I would do a find for:
> <table cellpadding=5 width=650 height="250" cellspacing=1>
>
> Then find the first </table> after that. That will change your result
> string to only contain the table you want.
>
> With this string you can simply loop through all <tr></tr>. Within that
> loop, you will know that the first <td> is Description, the second is
> Code, the third is OnHand and so forth. Since the <td> parameters
> change, I would parse each one looking for </td>. Alternatively, the
> description in the first element starts at the 92nd or 93rd character
> of the <tr> string (92 or 93, depending on whether the line ending is a
> carriage return and line feed or just a carriage return).
>
> It should take about an hour to 1.5 hours to write this quick little
> parser, test it and get the data into a database.
>
> I've also used the XML functions to parse the data, but most HTML throws
> them for a curve. If it's an XHTML site, it's simple to parse. It's
> easier to use the XML functions if that is possible, because you can
> then ask for a specific object.
>
> Jacob Cameron
> Blue Lantern, Inc.
> (972) 226-9595
> [EMAIL PROTECTED]
> http://www.bluelantern.com
>
> ________________________________
>
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> On Behalf Of Ron Mast
> Sent: Friday, December 09, 2005 8:43 AM
> To: Dallas/Fort Worth ColdFusion User Group Mailing List
> Subject: [DFW CFUG] parsing question
>
> Good morning,
>
> A while back there was a discussion on parsing cfhttp get content. I
> can't seem to find it, and the application I created that retrieved
> golf scores etc. was deleted along the way…oops.
> Please look at this site and tell me the easiest way to pull the
> Description, Code, OnHand, Show, Price, and Qty using cfhttp get.
> http://66.13.185.22/doc/HW_ResultMemberItems.asp?ListItem=7&Size=Nike&textfield=BAU&textfield1=7&Submit=Show
>
> Thanks in advance,
>
> Ron Mast
> Webmaster
> Truth Hardware
> Ph: 507-444-4748
> Fx: 507-444-5361
> www.truth.com
>
> _______________________________
>
> This e-mail and any files transmitted with it are confidential and are
> intended solely for the use of the individual to whom they are addressed.
> If you are not the intended recipient or the individual responsible for
> delivering the e-mail to the intended recipient, please be advised that
> you have received this e-mail in error and that any use, dissemination,
> forwarding, printing, or copying of this e-mail is strictly prohibited.
> _______________________________________________
> Reply to DFWCFUG:
> [email protected]
> Subscribe/Unsubscribe:
> http://lists1.safesecureweb.com/mailman/listinfo/list
> List Archives:
> http://lists1.safesecureweb.com/mailman/private/list
> http://www.mail-archive.com/list%40list.dfwcfug.org/
> http://www.mail-archive.com/list%40dfwcfug.org/
> DFWCFUG Sponsors:
> www.HostMySite.com
> www.teksystems.com/

--
Matt Woodward
[EMAIL PROTECTED]
http://www.mattwoodward.com
