I've built spiders in CF and they work rather well. The question is less which tool
can do the job than which one does it better. A C++ spider may be more efficient, but
it is harder to write and maintain (i.e. upgrade), and that's besides the cost of
hiring a C++ programmer. I'd say write it in CF, use it, and if it's not to your
liking speed-wise, then go to a C++ guy. At that point you'll have done the work and
will know exactly which issues to bring to him.
As to how, this is the basic approach:
<cfhttp url="http://www.macromedia.com" method="GET">
<CFOUTPUT>#findnocase('<EMBED', cfhttp.filecontent)#</CFOUTPUT>
Note that as written it fails (no EMBED tag is found) because MM detects whether the
browser (in this case CFHTTP) has Flash enabled. Setting the useragent attribute makes
the site think that you're an IE browser, so it shows you a page that has the embed
tag in it.
<cfhttp url="http://www.macromedia.com" method="GET" resolveurl="false"
useragent="Mozilla/4.0 (compatible; MSIE 6.0b; Windows NT 4.0; .NET CLR 1.0.2914)">
Note the useragent. This can be any valid browser user-agent string. I just used the
one I have here at the moment.
You can get a lot more complicated and detect a lot more information if you want to,
but the above is a very simple spider.
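If you want to take that a step further, here is a rough sketch (not tested against
the live site) of counting every EMBED tag on a page and collecting its links so you
know where to go next. The variable names (page, embedCount, embedPos, links, start,
match) are just ones I made up, the href pattern only catches double-quoted links, and
I've assumed resolveurl="true" so relative links come back as full URLs.

```cfm
<!--- Sketch only: fetch one page, count <EMBED> tags, collect href values --->
<cfhttp url="http://www.macromedia.com" method="GET" resolveurl="true"
useragent="Mozilla/4.0 (compatible; MSIE 6.0b; Windows NT 4.0; .NET CLR 1.0.2914)">

<cfset page = cfhttp.filecontent>

<!--- Count every occurrence of <EMBED, not just the first one --->
<cfset embedCount = 0>
<cfset embedPos = FindNoCase("<EMBED", page)>
<cfloop condition="embedPos GT 0">
    <cfset embedCount = embedCount + 1>
    <cfset embedPos = FindNoCase("<EMBED", page, embedPos + 1)>
</cfloop>

<!--- Collect href values (double-quoted only); one link per line in the list --->
<cfset links = "">
<cfset start = 1>
<cfset match = REFindNoCase('href="([^"]+)"', page, start, true)>
<cfloop condition="match.pos[1] GT 0">
    <!--- pos[2]/len[2] describe the first subexpression, i.e. the URL itself --->
    <cfset links = ListAppend(links, Mid(page, match.pos[2], match.len[2]), Chr(10))>
    <cfset start = match.pos[1] + match.len[1]>
    <cfset match = REFindNoCase('href="([^"]+)"', page, start, true)>
</cfloop>

<cfoutput>
EMBED tags found: #embedCount#<br>
Links found: #ListLen(links, Chr(10))#
</cfoutput>
```

A real spider would loop over that links list, skip URLs it has already visited, and
run the same code on each new page, adding up embedCount per domain.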
At 01:08 PM 6/24/02, you wrote:
>Is CF the wrong tool to use to build a spider program? I want to build a
>program which goes out from a designated starting point and logs
>information about the sites it comes across. In particular, I am looking to
>find sites that have a specific tag on them.
>
>Let's say, for example, I am looking to return a list of sites which use the
><embed> tag and have Flash content on their site. I also want to calculate
>how many Flash movies they have across their domain/site.
>
>Is this something CF could do? Would using CFHTTP and Regex be an
>inefficient way of doing this? Should I hire a C++ programmer?
>
>If CF is suitable for this job, where would I start and are there any
>resources available? Thanx!
>
>Brook Davies
>maracasmedia inc.