Charlie,

Thanks for your reply. I did look at that one last night, and after reading
my documentation more closely I found that the requirements for this job are
very simple.

1) Crawl the web n levels deep, filtering with an exclude list.
2) Parse the HTML.
3) Extract documents (such as PDFs, Word files, etc.). At this stage I am
assuming full extraction, as there is a requirement to store them on the
hard drive, but I may try to talk them out of this option.
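
For illustration only, the three steps above could be sketched roughly as
follows in Python. This is a minimal sketch, not a finished tool: the names
crawl, fetch_html, is_excluded and DOC_EXTENSIONS are made up for the example,
the exclude list is assumed to be plain substrings matched against the URL,
and "documents" are assumed to be recognisable by file extension. It only
collects document links; saving them to disk is left out.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlparse
from urllib.request import urlopen

# Assumption: documents are identified purely by file extension.
DOC_EXTENSIONS = (".pdf", ".doc", ".docx")


class LinkParser(HTMLParser):
    """Collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_html(url):
    """Default fetcher: download a page and decode it as text."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def is_excluded(url, exclude):
    """True if any entry of the exclude list occurs in the URL."""
    return any(pattern in url for pattern in exclude)


def crawl(start_url, max_depth, exclude=(), fetch=fetch_html):
    """Breadth-first crawl up to max_depth links away from start_url.

    Returns (pages, documents): the HTML pages visited and the document
    links found, both in discovery order.
    """
    seen = {start_url}
    pages, documents = [], []
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        try:
            html = fetch(url)
        except Exception:
            continue  # skip pages that fail to load
        pages.append(url)
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment.
            link = urldefrag(urljoin(url, href)).url
            if link in seen or is_excluded(link, exclude):
                continue
            if urlparse(link).scheme not in ("http", "https"):
                continue
            seen.add(link)
            if urlparse(link).path.lower().endswith(DOC_EXTENSIONS):
                documents.append(link)  # step 3: candidate for extraction
            elif depth < max_depth:
                queue.append((link, depth + 1))
    return pages, documents
```

The fetch parameter is there so the crawler can be tried against a fake,
in-memory site before it is pointed at a real one. A real run would also need
to respect robots.txt and throttle its requests, which this sketch does not do.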

I have no problem coding this, but I don't want to reinvent the wheel if
there is already an open source or commercial solution that could be used
instead.



Andrew Scott
Senior Coldfusion Developer
Aegeon Pty. Ltd.
www.aegeon.com.au
Phone: +613 8676 4223
Mobile: 0404 998 273


