> Someone had offered (a while ago) to generate a URL that would be
> acceptable for password protected sites given a normal URL. I am not sure
> how that would work for all URLs that may be using different means to
> accept userid+passwords and authenticate you, but if anyone has thought
> through this topic and has any suggestions, I would be willing to listen
> or even communicate offline.



1. when not logged in, visit the page you want to pluck; site should prompt
you for a username/password.

2. view the source and dissect the form - use your browser's Find function
if you get lost.

For example:
<form action='login.php' method='get'>
Username:<input type='text' name='user' size=40>
<br>Password:<input type='password' name='pass'><br>
<input type='submit' value='Log In'>
</form>

Username textbox is named 'user'. Password textbox is named 'pass'. Whole
mess is sent to login.php.

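3. encode the form elements into a URL. this relies on the form's method
being 'get'; a 'post' form may not accept its fields in the URL.

format is filename.bin?name=value&name2=value2&name3=value3 etc., where
filename.bin is the form's action. special characters in a value have to
be escaped (a space becomes + or %20).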

http://www.mynewssite.com/login.php?user=Bob&pass=1337

4. try encoded URL in a Web browser. (freshly de-cookied, preferably.)
5. if it works, use that as a URL for Plucker.
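
Steps 2 and 3 can also be scripted. What follows is only a rough sketch in Python, not anything Plucker or JPluck ships. The class and function names are made up for illustration, and PAGE_URL is an assumed address for the page that carries the login form. It reads the form's action and input names, refuses anything that isn't a 'get' form, and builds the URL with proper escaping.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin


class LoginFormParser(HTMLParser):
    """Collect the action, method and named inputs of the first form."""

    def __init__(self):
        super().__init__()
        self.action = ""
        self.method = "get"      # HTML default when method is omitted
        self.fields = []         # input names, in document order
        self._in_form = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and not self._done:
            self._in_form = True
            self.action = attrs.get("action") or ""
            self.method = (attrs.get("method") or "get").lower()
        elif tag == "input" and self._in_form and attrs.get("name"):
            # The submit button in the example has no name, so it is skipped.
            self.fields.append(attrs["name"])

    def handle_endtag(self, tag):
        if tag == "form" and self._in_form:
            self._in_form = False
            self._done = True    # ignore any later forms on the page


def build_login_url(page_url, form_html, values):
    """Turn a 'get' login form plus field values into a single URL."""
    parser = LoginFormParser()
    parser.feed(form_html)
    if parser.method != "get":
        raise ValueError("form method is %r; a plain URL needs 'get'" % parser.method)
    # urlencode escapes spaces, '&', '=' and other special characters.
    query = urlencode({name: values.get(name, "") for name in parser.fields})
    # The action is resolved relative to the page that contains the form.
    return urljoin(page_url, parser.action) + "?" + query


# The form from step 2; PAGE_URL is a hypothetical location for it.
FORM_HTML = """<form action='login.php' method='get'>Username:<input type='text'
name='user' size=40>
<br>Password:<input type='password' name='pass'><br>
<input type='submit' value='Log In'></form>"""
PAGE_URL = "http://www.mynewssite.com/index.html"

url = build_login_url(PAGE_URL, FORM_HTML, {"user": "Bob", "pass": "1337"})
print(url)
```

With the example form this gives the same URL as in step 3. You still want to do step 4 by hand, since some sites only accept the login after setting a cookie.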

Or you could just log in with Mozilla/Netscape/Firebird/etc. and point
JPluck to your cookies.txt.
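
If you're curious what a cookies.txt actually carries, here is a small Python sketch that loads one in the Netscape format and shows the Cookie header it would add to a request. This is not how JPluck itself reads the file. The cookie name 'session', its value, and the page URL are all invented for the demo, and cookie_header_for is a hypothetical helper.

```python
import os
import tempfile
from http.cookiejar import MozillaCookieJar
from urllib.request import Request


def cookie_header_for(cookies_path, url):
    """Return the Cookie header a Netscape-format cookies.txt yields for url."""
    jar = MozillaCookieJar()
    # ignore_discard keeps session cookies, which logins often use.
    jar.load(cookies_path, ignore_discard=True)
    request = Request(url)
    jar.add_cookie_header(request)   # attaches only cookies matching the URL
    return request.get_header("Cookie")


# A made-up cookies.txt: header line, then tab-separated fields
# (domain, subdomains flag, path, secure flag, expiry, name, value).
SAMPLE = (
    "# Netscape HTTP Cookie File\n"
    ".mynewssite.com\tTRUE\t/\tFALSE\t2147483647\tsession\tabc123\n"
)

with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as fh:
    fh.write(SAMPLE)
try:
    header = cookie_header_for(fh.name, "http://www.mynewssite.com/news.html")
finally:
    os.remove(fh.name)

print(header)
```

The point is that the browser's login lives in that file as an ordinary cookie, so any tool that sends it along gets treated as logged in until the cookie expires.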

--
- combs
[EMAIL PROTECTED]


> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] Behalf Of
> [EMAIL PROTECTED]
> Sent: Friday, June 27, 2003 15:46
> To: [EMAIL PROTECTED]
> Subject: Using JPluck for spidering password protected sites

_______________________________________________
plucker-list mailing list
[EMAIL PROTECTED]
http://lists.rubberchicken.org/mailman/listinfo/plucker-list
