Re: [CODE4LIB] calling another webpage within CGI script - solved!

2009-11-24 Thread Graham Stewart

An interesting topic ... heading out to cast my vote now.

In our environment, about six years ago, we informally identified the gap 
(grey area, war, however it is described) between server/network managers 
and developers/librarians as an obstacle to our end goals, and we have put 
considerable effort into closing it.  The key efforts have been 
communication (more planning, meetings, informal sessions), collaboration 
(no one is working in a vacuum), and a willingness to expand/stretch job 
descriptions (programmers sometimes participate in hardware/OS work, and 
sysadmins will attend interface/application planning meetings).  
Supportive management helps.


The end result is that sysadmins try as hard as possible to fully 
understand what an application does and requires on "their" 
hardware and networks, and programmers almost never run applications 
that sysadmins don't know about.


So, SELinux has never been a problem, because we know what a server needs 
to do before it ends up in a developer's hands, and developers know not 
to pound their heads against the desk for a day before talking to 
sysadmins about something that doesn't work.  Well, for the most part, 
anyway ;-)


-Graham

Ross Singer wrote:

On Tue, Nov 24, 2009 at 11:18 AM, Graham Stewart wrote:

We run many Library / web / database applications on RedHat servers with
SELinux enabled.  Sometimes it takes a bit of investigation and horsing
around, but I haven't yet found a situation where it had to be disabled.
setsebool and chcon can solve most problems, and SELinux is an excellent
enhancement to standard filesystem and ACL security.


Agreed that SELinux is useful, but it is a tee-otal pain in the keister
if you're ignorantly working against it because you didn't actually
know it was there.

It's sort of the perfect embodiment of the disconnect between the
developer and the sysadmin.  And, if this sort of tension interests
you, vote for Bess Sadler's presentation at Code4lib 2010: "Vampires
vs. Werewolves: Ending the War Between Developers and Sysadmins with
Puppet" and anything else that interests you.

http://vote.code4lib.org/election/index/13

-Ross "Bringin' it on home" Singer.


--
Graham Stewart
Network and Storage Services Manager, Information Technology Services
University of Toronto Library
130 St. George Street
Toronto, Ontario        graham.stew...@utoronto.ca
Canada  M5S 1A5         Phone: 416-978-6337 | Mobile: 416-550-2806 | Fax: 416-978-1668


Re: [CODE4LIB] calling another webpage within CGI script - solved!

2009-11-24 Thread Ross Singer
On Tue, Nov 24, 2009 at 11:18 AM, Graham Stewart wrote:
> We run many Library / web / database applications on RedHat servers with
> SELinux enabled.  Sometimes it takes a bit of investigation and horsing
> around, but I haven't yet found a situation where it had to be disabled.
> setsebool and chcon can solve most problems, and SELinux is an excellent
> enhancement to standard filesystem and ACL security.

Agreed that SELinux is useful, but it is a tee-otal pain in the keister
if you're ignorantly working against it because you didn't actually
know it was there.

It's sort of the perfect embodiment of the disconnect between the
developer and the sysadmin.  And, if this sort of tension interests
you, vote for Bess Sadler's presentation at Code4lib 2010: "Vampires
vs. Werewolves: Ending the War Between Developers and Sysadmins with
Puppet" and anything else that interests you.

http://vote.code4lib.org/election/index/13

-Ross "Bringin' it on home" Singer.


Re: [CODE4LIB] calling another webpage within CGI script - solved!

2009-11-24 Thread Graham Stewart

Hi,

We run many Library / web / database applications on RedHat servers with 
SELinux enabled.  Sometimes it takes a bit of investigation and horsing 
around, but I haven't yet found a situation where it had to be disabled. 
setsebool and chcon can solve most problems, and SELinux is an 
excellent enhancement to standard filesystem and ACL security.
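
As a sketch of the usual workflow on a Red Hat box (the directory path 
below is illustrative, not from any particular server):

  # see which SELinux booleans affect httpd
  getsebool -a | grep httpd

  # allow httpd to make outbound network connections (persistent across reboots)
  setsebool -P httpd_can_network_connect 1

  # give a directory the standard web-content context
  chcon -R -t httpd_sys_content_t /var/www/html/myapp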


-Graham

--
Graham Stewart
Network and Storage Services Manager, Information Technology Services
University of Toronto Library
130 St. George Street
Toronto, Ontario        graham.stew...@utoronto.ca
Canada  M5S 1A5         Phone: 416-978-6337 | Mobile: 416-550-2806 | Fax: 416-978-1668


Ken Irwin wrote:

Hi all,

Thanks for your extensive suggestions and comments. A few folks suggested that 
SELinux might be the issue. Tobin's suggestion to change one of the settings 
proved effective:
"# setsebool -P httpd_can_network_connect 1".

Thanks to everyone who helped -- I learned a lot.

Joys
Ken

-Original Message-
From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Greg 
McClellan
Sent: Tuesday, November 24, 2009 10:04 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] calling another webpage within CGI script

Hi,

I had a similar problem a while back which was solved by disabling 
SELinux. http://www.crypt.gen.nz/selinux/disable_selinux.html


-Greg


Re: [CODE4LIB] calling another webpage within CGI script

2009-11-24 Thread Greg McClellan

Hi,

I had a similar problem a while back which was solved by disabling 
SELinux. http://www.crypt.gen.nz/selinux/disable_selinux.html
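
In case that link goes stale, the usual approaches are roughly as follows 
(a sketch, and note the setsebool route discussed elsewhere in the thread 
avoids disabling SELinux entirely):

  # temporarily switch to permissive mode: denials are logged but not enforced
  setenforce 0

  # or, to disable permanently, edit /etc/selinux/config, set
  #   SELINUX=disabled
  # and reboot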


-Greg


Re: [CODE4LIB] calling another webpage within CGI script

2009-11-24 Thread David Pattern
Hi Ken

Are you behind a web proxy server or firewall?  If so, you'll probably need to 
specify a proxy server in the script.

If the proxy is defined in the environment variables on the server, then you 
can use...

  my $ua = LWP::UserAgent->new( timeout => 60 );
  $ua->env_proxy();

...otherwise, you might need to hardcode it into the script...

  my $ua = LWP::UserAgent->new( timeout => 60 );
  $ua->proxy(['http'], 'http://squid.wittenberg.edu:3128');

(replace "squid.wittenberg.edu:3128" with whatever the proxy server name and 
port number actually are)

regards
Dave Pattern
University of Huddersfield


From: Code for Libraries [code4...@listserv.nd.edu] On Behalf Of Ken Irwin 
[kir...@wittenberg.edu]
Sent: 23 November 2009 19:41
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] calling another webpage within CGI script

Hi Joe,

That's really helpful, thanks.
Actually getting to see the error message is nice:

HTTP Error : 500 Can't connect to www.npr.org:80 (connect: Permission denied)

I've tried this with a few websites and always get the same error, which tells 
me that the problem is on my server side. Any idea what I can change so I don't 
get a permission-denied rejection? I'm not even sure what system I should be 
looking at.

I tried Vishwam's suggestion of granting 777 permissions to both the file and 
the directory and I get the same response.

Is there some Apache setting someplace that says "hey, don't you go making web 
calls while I'm in charge"?

(This is a Fedora server running Apache, btw).

I don't know what to poke at!

Ken




Re: [CODE4LIB] calling another webpage within CGI script

2009-11-23 Thread Joe Hourcle

On Mon, 23 Nov 2009, Ken Irwin wrote:


Hi Joe,

That's really helpful, thanks.
Actually getting to see the error message is nice:

HTTP Error : 500 Can't connect to www.npr.org:80 (connect: Permission denied)

I've tried this with a few websites and always get the same error, which 
tells me that the problem is on my server side. Any idea what I can 
change so I don't get a permission-denied rejection? I'm not even sure 
what system I should be looking at.



I'm not even sure what could be causing the permission denied.
Normally, I get that response when the port's not open, or there's a 
firewall, and I can't think of a time when it'd work from the command 
line, but not from a CGI.


(Well, okay, one, but it's a really odd case that wouldn't happen for 
most people -- if you edit the web pages from a different machine than the 
one that actually serves the pages, the server's IP might be blocked from 
going outbound as a security measure.  Most security folks wouldn't even 
think of this, but I used to work with a former Wittenberg IT person when 
I worked on Fark, and Mike came up with some *very* interesting solutions 
to things.  One of 'em was adding special IP pools so our ISP's customers 
were served special messages by being routed through an invisible proxy 
that'd serve alternate pages, such as informing them that they were late 
in paying their bills.)


They could also be screwing with DNS, but I can't think of a reason anyone 
would do it, and again it'd be per-machine, not per-user.


Anyway, try running this from both the command line and via a CGI, and see 
if the output matches:


#!/bin/perl --
print "Content-type: text/plain\n\n";
print `uname -a`,"\n\n", `ifconfig -a`;
__END__

If you have to connect using one name to make modifications, but a 
different name for the webserver, that could be a sign as well.




I tried Vishwam's suggestion of granting 777 permissions to both the 
file and the directory and I get the same response.


Um ... you should _never_ need 777.  (Occasionally 1777, but I can't 
think of a time when 0777 is a good idea.)


777 = readable, writable, and executable by _everyone_.
755 = readable and executable by everyone, but writable only by you.

(1777 has the 'sticky' bit set, which is what lets a directory like /tmp 
be world-writable without letting you delete other people's files, as you 
could if it were 0777.)
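
A quick illustration (the file name here is made up):

  chmod 755 myscript.cgi   # rwxr-xr-x: everyone can read/execute, only you can write
  chmod 777 myscript.cgi   # rwxrwxrwx: anyone can overwrite it -- avoid
  ls -l myscript.cgi       # verify the mode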




Is there some Apache setting someplace that says "hey, don't you go making web calls 
while I'm in charge"?
(This is a Fedora server running Apache, btw).



It might be possible under suExec, but I'm not that familiar with it, as I 
used CGIwrap when I dealt with locking down multi-user systems.  (and to 
the best of my knowledge, it's not possible with CGIwrap).



...


And if all of this fails, you might want to consider asking on either:

http://stackoverflow.com/
http://serverfault.com/

(just ask on one; odds are, you'll ask on one, and they'll decide that 
it's more appropriate on the other one.)



-Joe


Re: [CODE4LIB] calling another webpage within CGI script

2009-11-23 Thread Ken Irwin
Hi Joe,

That's really helpful, thanks.
Actually getting to see the error message is nice:

HTTP Error : 500 Can't connect to www.npr.org:80 (connect: Permission denied)

I've tried this with a few websites and always get the same error, which tells 
me that the problem is on my server side. Any idea what I can change so I don't 
get a permission-denied rejection? I'm not even sure what system I should be 
looking at.

I tried Vishwam's suggestion of granting 777 permissions to both the file and 
the directory and I get the same response. 

Is there some Apache setting someplace that says "hey, don't you go making web 
calls while I'm in charge"? 

(This is a Fedora server running Apache, btw). 

I don't know what to poke at!
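
One way to see the underlying OS error directly is a bare socket connect 
from a CGI; a minimal sketch (untested on this particular server):

  #!/usr/bin/perl
  use strict; use warnings;
  use IO::Socket::INET;

  print "Content-type: text/plain\n\n";
  my $sock = IO::Socket::INET->new(
      PeerAddr => 'www.npr.org',
      PeerPort => 80,
      Timeout  => 10,
  );
  # $! carries the OS-level reason, e.g. "Permission denied" (EACCES)
  print $sock ? "connected OK\n" : "connect failed: $!\n";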

Ken


-Original Message-
From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Joe 
Hourcle
Sent: Monday, November 23, 2009 2:29 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] calling another webpage within CGI script


I'd suggest testing the results of the call, rather than just looking for 
content, as an empty response could be a result of the server you're 
connecting to.  (Unlikely in this case, but it happens once in a while, 
particularly if you turn off redirection or support caching.) 
Unfortunately, you might have to use LWP::UserAgent, rather than 
LWP::Simple:

#!/bin/perl --

use strict; use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new( timeout => 60 );

my $response = $ua->get('http://www.npr.org/');
if ( $response->is_success() ) {
    my $content = $response->decoded_content();
    ...
} else {
    print "HTTP Error : ", $response->status_line(), "\n";
}

__END__

(And after changing the shebang line to my location of perl, your version 
worked via both CGI and the command line.)


oh ... and you don't need the foreach loop:

my $i = @lines;

-Joe


Re: [CODE4LIB] calling another webpage within CGI script

2009-11-23 Thread Joe Hourcle

On Mon, 23 Nov 2009, Ken Irwin wrote:


Hi all,

I'm moving to a new web server and struggling to get it configured 
properly. The problem of the moment: having a Perl CGI script call another 
web page in the background and make decisions based on its content. On the 
old server I used an antique Perl script called "hcat" (from the Pelican 
book); I've also tried curl and LWP::Simple.

In all three cases, I get the same behavior: it works just fine on the command 
line, but when called by the web server through a CGI script, the LWP (or other 
socket connection) gets no results. It sounds like a permissions thing, but I 
don't know what kind of permissions setting to tinker with. In the test script 
below, my command line outputs:

Content-type: text/plain
Getting URL: http://www.npr.org
885 lines

Whereas the web output just says "Getting URL: http://www.npr.org" - and 
doesn't even get to the "Couldn't get" error message.

Any clue how I can make use of a web page's contents from w/in a CGI script? 
(The actual application has to do with exporting data from our catalog, but I 
need to work out the basic mechanism first.)

Here's the script I'm using.

#!/bin/perl
use LWP::Simple;
print "Content-type: text/plain\n\n";
my $url = "http://www.npr.org";
print "Getting URL: $url\n";
my $content = get $url;
die "Couldn't get $url" unless defined $content;
@lines = split (/\n/, $content);
foreach (@lines) { $i++; }
print "\n\n$i lines\n\n";

Any ideas?


I'd suggest testing the results of the call, rather than just looking for 
content, as an empty response could be a result of the server you're 
connecting to.  (Unlikely in this case, but it happens once in a while, 
particularly if you turn off redirection or support caching.) 
Unfortunately, you might have to use LWP::UserAgent, rather than 
LWP::Simple:


#!/bin/perl --

use strict; use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new( timeout => 60 );

my $response = $ua->get('http://www.npr.org/');
if ( $response->is_success() ) {
    my $content = $response->decoded_content();
    ...
} else {
    print "HTTP Error : ", $response->status_line(), "\n";
}

__END__

(And after changing the shebang line to my location of perl, your version 
worked via both CGI and the command line.)



oh ... and you don't need the foreach loop:

my $i = @lines;
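
That works because an array assigned to a scalar is evaluated in scalar 
context, which yields its element count; a minimal sketch:

  my @lines = split /\n/, $content;
  my $i     = @lines;        # scalar context: number of elements
  print "\n\n$i lines\n\n";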

-Joe


Re: [CODE4LIB] calling another webpage within CGI script

2009-11-23 Thread Roy Tennant
Ken,
I tested your script on my server and it also worked for me on the command
line and failed via my web server. All I did was add "/usr" to your path to
perl and it worked:

#!/usr/bin/perl

Roy



On 11/23/09 8:17 AM, "Ken Irwin" wrote:

> Hi all,
> 
> I'm moving to a new web server and struggling to get it configured properly.
> The problem of the moment: having a Perl CGI script call another web page in
> the background and make decisions based on its content. On the old server I
> used an antique Perl script called "hcat" (from the Pelican
> book); I've also tried curl
> and LWP::Simple.
> 
> In all three cases, I get the same behavior: it works just fine on the command
> line, but when called by the web server through a CGI script, the LWP (or
> other socket connection) gets no results. It sounds like a permissions thing,
> but I don't know what kind of permissions setting to tinker with. In the test
> script below, my command line outputs:
> 
> Content-type: text/plain
> Getting URL: http://www.npr.org
> 885 lines
> 
> Whereas the web output just says "Getting URL: http://www.npr.org" - and
> doesn't even get to the "Couldn't get" error message.
> 
> Any clue how I can make use of a web page's contents from w/in a CGI script?
> (The actual application has to do with exporting data from our catalog, but I
> need to work out the basic mechanism first.)
> 
> Here's the script I'm using.
> 
> #!/bin/perl
> use LWP::Simple;
> print "Content-type: text/plain\n\n";
> my $url = "http://www.npr.org";
> print "Getting URL: $url\n";
> my $content = get $url;
> die "Couldn't get $url" unless defined $content;
> @lines = split (/\n/, $content);
> foreach (@lines) { $i++; }
> print "\n\n$i lines\n\n";
> 
> Any ideas?
> 
> Thanks
> Ken
> 


Re: [CODE4LIB] calling another webpage within CGI script

2009-11-23 Thread Matt Jones
Hi Ken,

This may be obvious, but when running from the command line, stdout and
stderr are often interleaved, whereas on the web server you see stdout
in the browser and stderr in the web server error log.  Your script is
probably exiting with an error either at the 'get' line (line 6) or at the
'die' line (line 7) -- terminating the script is what 'die' does.
Have you checked your web server error log to see what the error is on your
'get' call?
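
While debugging, one way to surface those fatals in the browser as well is 
CGI::Carp; a minimal sketch:

  #!/usr/bin/perl
  use strict; use warnings;
  use CGI::Carp qw(fatalsToBrowser);  # route die() output to the browser, not just the error log
  use LWP::Simple;

  print "Content-type: text/plain\n\n";
  my $content = get('http://www.npr.org');
  die "Couldn't get the page" unless defined $content;  # message now shows up in the browser
  print "got ", length($content), " bytes\n";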

Matt

On Mon, Nov 23, 2009 at 7:17 AM, Ken Irwin wrote:

> Hi all,
>
> I'm moving to a new web server and struggling to get it configured
> properly. The problem of the moment: having a Perl CGI script call another
> web page in the background and make decisions based on its content. On the
> old server I used an antique Perl script called "hcat" (from the Pelican
> book); I've also tried
> curl and LWP::Simple.
>
> In all three cases, I get the same behavior: it works just fine on the
> command line, but when called by the web server through a CGI script, the
> LWP (or other socket connection) gets no results. It sounds like a
> permissions thing, but I don't know what kind of permissions setting to
> tinker with. In the test script below, my command line outputs:
>
> Content-type: text/plain
> Getting URL: http://www.npr.org
> 885 lines
>
> Whereas the web output just says "Getting URL: http://www.npr.org" - and
> doesn't even get to the "Couldn't get" error message.
>
> Any clue how I can make use of a web page's contents from w/in a CGI
> script? (The actual application has to do with exporting data from our
> catalog, but I need to work out the basic mechanism first.)
>
> Here's the script I'm using.
>
> #!/bin/perl
> use LWP::Simple;
> print "Content-type: text/plain\n\n";
> my $url = "http://www.npr.org";
> print "Getting URL: $url\n";
> my $content = get $url;
> die "Couldn't get $url" unless defined $content;
> @lines = split (/\n/, $content);
> foreach (@lines) { $i++; }
> print "\n\n$i lines\n\n";
>
> Any ideas?
>
> Thanks
> Ken
>


Re: [CODE4LIB] calling another webpage within CGI script

2009-11-23 Thread Vishwam Annam

Ken,

The difference is that when you run it from the command line you are 
executing the file as "owner", whereas you are "other" when you access it 
through the browser. Looking at the error message you sent, I believe it 
might not be executing the complete script. Try setting permissions to 707 
or 777 to start with. You may have to create a temporary directory to test 
with.


Let me know if you have any questions,

Vishwam
Vishwam Annam
Wright State University Libraries
120 Paul Laurence Dunbar Library
3640 Colonel Glenn Hwy.
Dayton, OH 45435
Office: 937-775-3262
FAX 937-775-2356


Ken Irwin wrote:

Hi all,

I'm moving to a new web server and struggling to get it configured 
properly. The problem of the moment: having a Perl CGI script call another 
web page in the background and make decisions based on its content. On the 
old server I used an antique Perl script called "hcat" (from the Pelican 
book); I've also tried curl and LWP::Simple.

In all three cases, I get the same behavior: it works just fine on the command 
line, but when called by the web server through a CGI script, the LWP (or other 
socket connection) gets no results. It sounds like a permissions thing, but I 
don't know what kind of permissions setting to tinker with. In the test script 
below, my command line outputs:

Content-type: text/plain
Getting URL: http://www.npr.org
885 lines

Whereas the web output just says "Getting URL: http://www.npr.org" - and 
doesn't even get to the "Couldn't get" error message.

Any clue how I can make use of a web page's contents from w/in a CGI script? 
(The actual application has to do with exporting data from our catalog, but I 
need to work out the basic mechanism first.)

Here's the script I'm using.

#!/bin/perl
use LWP::Simple;
print "Content-type: text/plain\n\n";
my $url = "http://www.npr.org";
print "Getting URL: $url\n";
my $content = get $url;
die "Couldn't get $url" unless defined $content;
@lines = split (/\n/, $content);
foreach (@lines) { $i++; }
print "\n\n$i lines\n\n";

Any ideas?

Thanks
Ken