on 11/27/02 7:54, [EMAIL PROTECTED] purportedly said:

> RE: Help! how is this called?
> Thank you but this won't help me I guess.
>
> I could find that info only from within the script, right?
>
> Well, I want to create a program like that Teleport Pro from Windows that
> spiders a web site and download all the pages from the site.
> To download the pages is very easy, but the biggest problem is to create the
> local file names, and to replace all the links from the downloaded pages to
> make them work locally.
>
> Until now, the only problem I found, is that I can't reliably find the file
> name from the path in all the cases.

Well, yes and no. The example URL provided:

> http://www.site.com/script.cfm/dir1/dir2/http://www.site.com/file.html

is technically a malformed URI. It should be:

http://www.site.com/script.cfm/dir1/dir2/http:%2F%2Fwww.site.com%2Ffile.html

or minimally:

http://www.site.com/script.cfm/dir1/dir2/http:%2F%2Fwww.site.com/file.html

You will always find that sites do stupid things, and you will have to find
ways around them. However, in the case of extra PATH_INFO or query strings, it
doesn't hurt to treat them as they are, and you will be successful most of
the time. Other than issues with URIs like the one above, you should have
minimal problems.

Keary Suska
Esoteritech, Inc.
"Leveraging Open Source for a better Internet"
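
As a rough illustration of the two points above, here is a minimal Python sketch. It is only one possible approach, not a complete spider: the function names `encode_embedded_url` and `local_path`, and the choice of `index.html` for directory URLs, are assumptions made for this example. The first function percent-encodes the slashes of a URL that is embedded in another URL's path; the second treats PATH_INFO and the query string "as they are" and maps the whole URL to a relative local file name.

```python
from urllib.parse import quote, urlsplit


def encode_embedded_url(base, embedded):
    """Append `embedded` to `base` as path data, percent-encoding its slashes.

    safe=":" keeps the colon readable; every "/" becomes %2F.
    """
    return base.rstrip("/") + "/" + quote(embedded, safe=":")


def local_path(url):
    """Map a URL to a relative local file name (hypothetical naming scheme).

    Extra PATH_INFO segments are kept as ordinary path segments, and the
    query string is folded into the last segment.
    """
    parts = urlsplit(url)
    path = parts.path
    # Assumption: directory-style URLs are saved as index.html
    if path == "" or path.endswith("/"):
        path += "index.html"
    segments = path.lstrip("/").split("/")
    if parts.query:
        segments[-1] += "?" + parts.query
    # Encode each segment so "?", "%", ":" and similar characters
    # cannot be misread by the file system
    return "/".join([parts.netloc] + [quote(s, safe="") for s in segments])


if __name__ == "__main__":
    print(encode_embedded_url(
        "http://www.site.com/script.cfm/dir1/dir2",
        "http://www.site.com/file.html",
    ))
    print(local_path("http://www.site.com/dir1/page.php?id=3"))
```

One wrinkle this sketch does not handle: a script name followed by PATH_INFO (such as `script.cfm/dir1`) becomes a local directory, which clashes if the bare script URL is also saved as a file. A real spider would need a rule for that case.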
