What about taking advantage of curl's built-in cookie handling? In particular, you should look at doing this as a two-step process using the CURLOPT_COOKIEJAR and CURLOPT_COOKIEFILE options. First, log in. Then, once the session has begun, grab the article itself.
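For what it's worth, here is a minimal sketch of that two-step flow. The function names (session_options, session_request, fetch_after_login) are made up for illustration, and the login URL, form field names, and cookie file path are whatever the caller passes in; none of them come from the real site. It assumes the login is a plain form POST, which may not match how a given site authenticates.

```php
<?php
// Sketch only: URLs, form field names, and the cookie file path are
// supplied by the caller and are placeholders, not real endpoints.

// Options shared by both requests so they use the same cookie store.
function session_options(string $cookieFile): array
{
    return [
        CURLOPT_RETURNTRANSFER => true,        // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,        // follow redirects
        CURLOPT_TIMEOUT        => 50,          // give up after 50 seconds
        CURLOPT_COOKIEJAR      => $cookieFile, // cookies are saved here once the handle is released
        CURLOPT_COOKIEFILE     => $cookieFile, // cookies are loaded from here; also turns on cookie handling
    ];
}

// Perform one request; pass $postFields to send a form POST instead of a GET.
function session_request(string $url, string $cookieFile, ?array $postFields = null)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, session_options($cookieFile));
    if ($postFields !== null) {
        curl_setopt($ch, CURLOPT_POST, true);
        curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($postFields));
    }
    $body = curl_exec($ch); // string on success, false on failure
    curl_close($ch);        // done with the handle; the cookie jar gets written
    return $body;
}

// Step 1 logs in and stores the session cookies; step 2 fetches the page.
function fetch_after_login(string $loginUrl, array $credentials, string $pageUrl, string $cookieFile)
{
    if (session_request($loginUrl, $cookieFile, $credentials) === false) {
        return false;
    }
    return session_request($pageUrl, $cookieFile);
}
```

You would call it along the lines of fetch_after_login($loginUrl, ['user' => '...', 'password' => '...'], $articleUrl, 'my_cookies.txt'), with the field names adjusted to whatever the login form actually expects.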

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://www.yoursite.com"); // set url to post to
curl_setopt($ch, CURLOPT_FAILONERROR, 1);
curl_setopt($ch, CURLOPT_REFERER, "http://www.wsj.com");
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // allow redirects
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return into a variable
curl_setopt($ch, CURLOPT_TIMEOUT, 50); // times out after 50s
curl_setopt($ch, CURLOPT_COOKIEJAR, "my_cookies.txt"); // initiates cookie file
curl_setopt($ch, CURLOPT_COOKIEFILE, "my_cookies.txt"); // uses previous session cookies
curl_setopt($ch, CURLOPT_VERBOSE, 1);
curl_setopt($ch, CURLOPT_USERAGENT, "mozilla/5.0 (x11; u; linux i686; en-us; rv:1.5a) gecko/20030728 mozilla firebird/0.6.1");

-- jon

-------------------
jon roig
web developer
email: [EMAIL PROTECTED]
phone: 888.230.7557

-----Original Message-----
From: Richard Miller [mailto:[EMAIL PROTECTED]
Sent: Monday, February 09, 2004 6:29 PM
To: [EMAIL PROTECTED]
Subject: [PHP] CURL and Cookies

I would appreciate any help you can give me with a problem I am having with PHP's CURL functions.

I want to use CURL to download news from Wall Street Journal Online. When you visit the WSJ home page, you're forwarded to an authentication page to enter your name and password, and then forwarded back to the home page. I want my CURL command to send the authentication cookie, so that when it's forwarded to the authentication page it is forwarded right back to the home page without having to enter the name and password.

I can get the following CURL command to run fine at the command prompt, but not in PHP:

THIS WORKS

curl --cookie "WSJIE_LOGIN=blahblahblah" -L -O "http://online.wsj.com/home/us"

THIS DOESN'T WORK

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://online.wsj.com/home/us");
curl_setopt($ch, CURLOPT_COOKIE, "WSJIE_LOGIN=blahblahblah");
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$content = curl_exec($ch);
curl_close($ch);

I used a packet sniffer to see how this works. When I request the home page (above) and send the WSJIE_LOGIN cookie, the home page redirects to the authentication page. The authentication page uses the WSJIE_LOGIN cookie to generate more cookies. Then these 5-6 cookies are sent back to the home page and give the user access to the content.

The WSJIE_LOGIN cookie is my own personal authentication cookie; the other cookies change from time to time. But I noticed that the PHP CURL isn't passing these other cookies along when it forwards back to the home page, like the command-line CURL does. Here are blocks from the packet capture:

CLI CURL
...
192.168.001.100.63745-206.157.193.068.00080: GET /home/us HTTP/1.1
User-Agent: curl/7.10.2 (powerpc-apple-darwin7.0) libcurl/7.10.2 OpenSSL/0.9.7b zlib/1.1.4
Cookie: WSJIE_LOGIN=abc
Host: online.wsj.com
Pragma: no-cache
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*
Cookie: fastlogin=xyz; wsjproducts=xyz; user_type=xyz; REMOTE_USER=xyz; UBID=xyz
...

PHP CURL
...
192.168.001.100.63750-206.157.193.068.00080: GET /home/us HTTP/1.1
Cookie: WSJIE_LOGIN=abc
Host: online.wsj.com
Pragma: no-cache
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*
...

PHP's curl doesn't forward the cookies that it is given at the previous page, so, of course, I don't get my content. Any ideas why?

Richard Miller

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php