Re: [CODE4LIB] calling another webpage within CGI script
Hi Ken

Are you behind a web proxy server or firewall? If so, you'll probably need to specify a proxy server in the script. If the proxy is defined in the environment variables on the server, then you can use...

    my $ua = LWP::UserAgent->new( timeout => 60 );
    $ua->env_proxy();

...otherwise, you might need to hardcode it into the script...

    my $ua = LWP::UserAgent->new( timeout => 60 );
    $ua->proxy( ['http'], 'http://squid.wittenberg.edu:3128' );

(replace squid.wittenberg.edu:3128 with whatever the proxy server name and port number actually are)

regards

Dave Pattern
University of Huddersfield

From: Code for Libraries [code4...@listserv.nd.edu] On Behalf Of Ken Irwin [kir...@wittenberg.edu]
Sent: 23 November 2009 19:41
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] calling another webpage within CGI script

Hi Joe,

That's really helpful, thanks. Actually getting to see the error message is nice:

    HTTP Error : 500 Can't connect to www.npr.org:80 (connect: Permission denied)

I've tried this with a few websites and always get the same error, which tells me that the problem is on my server's side. Any idea what I can change so I don't get a permission-denied rejection? I'm not even sure what system I should be looking at. I tried Vishwam's suggestion of granting 777 permissions to both the file and the directory, and I get the same response. Is there some Apache setting someplace that says "hey, don't you go making web calls while I'm in charge"? (This is a Fedora server running Apache, btw.) I don't know what to poke at!

Ken
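For reference, a minimal self-contained sketch of the proxy-aware request Dave describes (the proxy host and port are placeholders, and the target URL is just the test page from Ken's script):

    #!/usr/bin/perl
    # Fetch a page through a proxy with LWP::UserAgent.
    use strict;
    use warnings;
    use LWP::UserAgent;

    my $ua = LWP::UserAgent->new( timeout => 60 );
    $ua->env_proxy();    # honour http_proxy / no_proxy environment variables
    # ...or hardcode the proxy instead (placeholder host and port):
    # $ua->proxy( ['http'], 'http://proxy.example.edu:3128' );

    my $response = $ua->get('http://www.npr.org/');
    if ( $response->is_success() ) {
        print $response->decoded_content();
    } else {
        print "HTTP Error : ", $response->status_line(), "\n";
    }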
Re: [CODE4LIB] calling another webpage within CGI script
Hi, I had a similar problem a while back which was solved by disabling SELinux. http://www.crypt.gen.nz/selinux/disable_selinux.html -Greg
Re: [CODE4LIB] calling another webpage within CGI script - solved!
Hi,

We run many library / web / database applications on RedHat servers with SELinux enabled. Sometimes it takes a bit of investigation and horsing around, but I haven't yet found a situation where it had to be disabled. setsebool and chcon can solve most problems, and SELinux is an excellent enhancement to standard filesystem and ACL security.

-Graham

--
Graham Stewart
Network and Storage Services Manager, Information Technology Services
University of Toronto Library
130 St. George Street, Toronto, Ontario, Canada M5S 1A5
graham.stew...@utoronto.ca
Phone: 416-978-6337 | Mobile: 416-550-2806 | Fax: 416-978-1668

Ken Irwin wrote:
Hi all, Thanks for your extensive suggestions and comments. A few folks suggested that SELinux might be the issue. Tobin's suggestion to change one of the settings proved effective:

    # setsebool -P httpd_can_network_connect 1

Thanks to everyone who helped -- I learned a lot. Joys, Ken

-----Original Message-----
From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Greg McClellan
Sent: Tuesday, November 24, 2009 10:04 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] calling another webpage within CGI script

Hi, I had a similar problem a while back which was solved by disabling SELinux. http://www.crypt.gen.nz/selinux/disable_selinux.html -Greg
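As an illustration of the sort of setsebool/chcon investigation Graham describes (the boolean is the one that solved Ken's problem earlier in the thread; the script path is hypothetical):

    # Check whether Apache is allowed to make outbound network connections:
    getsebool httpd_can_network_connect

    # Enable it, persisting across reboots (-P):
    setsebool -P httpd_can_network_connect 1

    # Or repair the SELinux context on a CGI script so Apache may execute it:
    chcon -t httpd_sys_script_exec_t /var/www/cgi-bin/test.cgi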
Re: [CODE4LIB] calling another webpage within CGI script - solved!
On Tue, Nov 24, 2009 at 11:18 AM, Graham Stewart graham.stew...@utoronto.ca wrote:
We run many library / web / database applications on RedHat servers with SELinux enabled. Sometimes it takes a bit of investigation and horsing around, but I haven't yet found a situation where it had to be disabled. setsebool and chcon can solve most problems, and SELinux is an excellent enhancement to standard filesystem and ACL security.

Agreed that SELinux is useful, but it is a tee-total pain in the keister if you're ignorantly working against it because you didn't actually know it was there. It's sort of the perfect embodiment of the disconnect between the developer and the sysadmin.

And, if this sort of tension interests you, vote for Bess Sadler's presentation at Code4lib 2010, "Vampires vs. Werewolves: Ending the War Between Developers and Sysadmins with Puppet" (and anything else that interests you): http://vote.code4lib.org/election/index/13

-Ross "Bringin' it on home" Singer
Re: [CODE4LIB] calling another webpage within CGI script - solved!
An interesting topic ... heading out to cast my vote now.

In our environment, about 6 years ago we informally identified the gap (grey area, war, however it is described) between server / network managers and developers / librarians as an obstacle to our end goals, and we have put considerable effort into closing it. The key efforts have been communication (more planning, meetings, informal sessions), collaboration (no one works in a vacuum), and the willingness to expand/stretch job descriptions (programmers sometimes participate in hardware / OS work, and sysadmins attend interface / application planning meetings). Supportive management helps.

The end result is that sysadmins try as hard as possible to fully understand what an application is doing and requires on their hardware and networks, and programmers almost never run applications that sysadmins don't know about. So SELinux has never been a problem, because we know what a server needs to do before it ends up in a developer's hands, and developers know not to pound their heads against the desk for a day before talking to sysadmins about something that doesn't work. Well, for the most part, anyway ;-)

-Graham

Ross Singer wrote:
[snip]

--
Graham Stewart
Network and Storage Services Manager, Information Technology Services
University of Toronto Library
130 St. George Street, Toronto, Ontario, Canada M5S 1A5
graham.stew...@utoronto.ca
Phone: 416-978-6337 | Mobile: 416-550-2806 | Fax: 416-978-1668
[CODE4LIB] calling another webpage within CGI script
Hi all,

I'm moving to a new web server and struggling to get it configured properly. The problem of the moment: having a Perl CGI script call another web page in the background and make decisions based on its content. On the old server I used an antique Perl script called hcat (from the Pelican book, http://oreilly.com/openbook/webclient/ch04.html); I've also tried curl and LWP::Simple. In all three cases I get the same behavior: it works just fine on the command line, but when called by the web server through a CGI script, the LWP (or other socket connection) gets no results. It sounds like a permissions thing, but I don't know what kind of permissions setting to tinker with.

In the test script below, my command line outputs:

    Content-type: text/plain
    Getting URL: http://www.npr.org
    885 lines

Whereas the web output just says "Getting URL: http://www.npr.org" -- and doesn't even get to the "Couldn't get" error message.

Any clue how I can make use of a web page's contents from w/in a CGI script? (The actual application has to do with exporting data from our catalog, but I need to work out the basic mechanism first.) Here's the script I'm using:

    #!/bin/perl
    use LWP::Simple;
    print "Content-type: text/plain\n\n";
    my $url = "http://www.npr.org";
    print "Getting URL: $url\n";
    my $content = get $url;
    die "Couldn't get $url" unless defined $content;
    @lines = split(/\n/, $content);
    foreach (@lines) { $i++; }
    print "\n\n$i lines\n\n";

Any ideas?

Thanks
Ken
Re: [CODE4LIB] calling another webpage within CGI script
Ken,

The difference is that when you run the script from the command line you are executing the file as /owner/, but when you access it through the browser you are executing it as /other/. Looking at the error message you sent, I believe it might not be executing the complete script. Try setting permissions to 707 or 777 to start with. You may have to create a temporary directory to test with.

Let me know if you have any questions,

Vishwam

Vishwam Annam
Wright State University Libraries
120 Paul Laurence Dunbar Library
3640 Colonel Glenn Hwy.
Dayton, OH 45435
Office: 937-775-3262 FAX 937-775-2356

Ken Irwin wrote:
[snip]
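Concretely, Vishwam's suggestion would look something like this (the script path is a placeholder):

    # World-readable/writable/executable, for testing only -- not something to leave in place:
    chmod 777 /var/www/cgi-bin/test.cgi
    # Or the more restrictive variant he mentions (full access for owner and other, none for group):
    chmod 707 /var/www/cgi-bin/test.cgi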
Re: [CODE4LIB] calling another webpage within CGI script
Hi Ken,

This may be obvious, but when running from the command line, stdout and stderr are often interleaved together, whereas on the web server you see stdout in the browser and stderr in the web server error log. Your script is probably exiting with an error either at the 'get' line (line 6) or at the 'die' line (line 7) -- which is what 'die' does: terminate your script. Have you checked your web server error log to see what the error is on your 'get' call?

Matt

On Mon, Nov 23, 2009 at 7:17 AM, Ken Irwin kir...@wittenberg.edu wrote:
[snip] Here's the script I'm using:

    #!/bin/perl
    use LWP::Simple;
    print "Content-type: text/plain\n\n";
    my $url = "http://www.npr.org";
    print "Getting URL: $url\n";
    my $content = get $url;
    die "Couldn't get $url" unless defined $content;
    @lines = split(/\n/, $content);
    foreach (@lines) { $i++; }
    print "\n\n$i lines\n\n";
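Following Matt's suggestion, watching the log while re-running the request would look something like this (the log path is typical for Apache on Fedora/RedHat systems, but treat it as an assumption):

    tail -f /var/log/httpd/error_log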
Re: [CODE4LIB] calling another webpage within CGI script
Ken,

I tested your script on my server and it also worked for me on the command line and failed via my web server. All I did was add /usr to your path to perl and it worked:

    #!/usr/bin/perl

Roy

On 11/23/09 8:17 AM, Ken Irwin kir...@wittenberg.edu wrote:
[snip]
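A quick way to find the right interpreter path for a shebang line, plus a common portability idiom (not from the thread itself) that sidesteps the problem by searching the PATH:

    $ which perl
    /usr/bin/perl

    #!/usr/bin/env perl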
Re: [CODE4LIB] calling another webpage within CGI script
On Mon, 23 Nov 2009, Ken Irwin wrote:
[snip]

I'd suggest testing the results of the call, rather than just looking for content, as an empty response could be a result of the server you're connecting to (unlikely in this case, but it happens once in a while, particularly if you turn off redirection or support caching). Unfortunately, you might have to use LWP::UserAgent rather than LWP::Simple:

    #!/bin/perl --
    use strict;
    use warnings;
    use LWP::UserAgent;

    my $ua = LWP::UserAgent->new( timeout => 60 );
    my $response = $ua->get('http://www.npr.org/');

    if ( $response->is_success() ) {
        my $content = $response->decoded_content();
        ...
    } else {
        print "HTTP Error : ", $response->status_line(), "\n";
    }

    __END__

(and with the shebang line changed for my location of perl, your version worked via both CGI and command line)

oh ... and you don't need the foreach loop to count lines:

    my $i = @lines;

-Joe
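Pulling the thread's fixes together, Ken's test script might end up looking like this -- a sketch only, combining Roy's corrected shebang path with Joe's LWP::UserAgent error handling:

    #!/usr/bin/perl
    # Ken's line-counting test, with errors reported to the browser.
    use strict;
    use warnings;
    use LWP::UserAgent;

    print "Content-type: text/plain\n\n";

    my $url = 'http://www.npr.org/';
    print "Getting URL: $url\n";

    my $ua = LWP::UserAgent->new( timeout => 60 );
    my $response = $ua->get($url);

    if ( $response->is_success() ) {
        my @lines = split /\n/, $response->decoded_content();
        my $count = @lines;    # an array in scalar context yields its element count
        print "\n\n$count lines\n\n";
    } else {
        # A failed fetch now shows up in the browser instead of dying silently.
        print "HTTP Error : ", $response->status_line(), "\n";
    }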