At 04:06 AM 3/14/2013, tamouse mailing lists wrote:
>If the files are delivered via the web, by php or some other means, even if
>located outside webroot, they'd still be scrapeable.
Bots, however, being "mechanical" (i.e., hard-wired or programmed), behave in different ways than humans do, and that difference can be exploited in a script. Part of the rationale for putting the files outside the webroot is that they have no URLs, which eliminates one vulnerability: you can't scrape the URL of a file that has no URL.

Late last night I figured out why I was having trouble accessing those external files from my script, and now I'm working out the parsing details that let one script access multiple external files. My approach probably won't defeat all bad bots, but it will likely defeat most of them. You can't make code bulletproof, but you can wrap it in Kevlar.

Dale H. Cook, Member, NEHGS and MA Society of Mayflower Descendants;
Plymouth Co. MA Coordinator for the USGenWeb Project
Administrator of http://plymouthcolony.net

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
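[Editor's note: the delivery scheme described above, keeping files outside the webroot and handing them out only through a script keyed by an ID, can be sketched in PHP roughly as follows. The directory path, the `doc` query parameter, and the whitelist are illustrative assumptions for the sketch, not Dale's actual implementation.]

```php
<?php
// Sketch: serve files that live OUTSIDE the webroot, so they have no
// direct URLs to scrape. Only this script can reach them.

// Directory above the webroot (assumed path for this example).
define('DOC_DIR', '/home/example/private_docs/');

// Whitelist of deliverable files, keyed by an opaque ID. Links on the
// site carry only the ID, never a filesystem path or direct file URL.
$docs = array(
    'vr1850' => 'vital_records_1850.pdf',
    'town02' => 'town_records_vol2.pdf',
);

$id = isset($_GET['doc']) ? $_GET['doc'] : '';

// Unknown ID or unreadable file: respond as if nothing exists.
if (!isset($docs[$id]) || !is_readable(DOC_DIR . $docs[$id])) {
    header('HTTP/1.1 404 Not Found');
    exit;
}

$path = DOC_DIR . $docs[$id];

// Stream the file from outside the webroot to the client.
header('Content-Type: application/pdf');
header('Content-Length: ' . filesize($path));
readfile($path);
```

Because the script resolves IDs internally, a bot that harvests links from the pages only ever sees `download.php?doc=vr1850` (or whatever the script is named), and additional per-request checks (referrer, rate, session) can be layered into the same gatekeeper.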