> Is there another tool I could use for checking? I mean some tool in the
Perl universe?

Well, lwp-dump is a Perl utility; it comes with LWP, I believe, and it uses
LWP::UserAgent.  sil.org, for one, just returns 403 Forbidden for its own
policy reasons, but as far as your "is it up?" question goes, getting any
response at all should be answer enough.  To play fair (though it doesn't
help with sil.org) you should be checking /robots.txt first, as you're
creating a robot.
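
Something along these lines would separate "the server answered, but said
no" from "nothing answered at all".  It's only a sketch: the site_state
helper and the is_it_up/0.1 agent string are made up for illustration, and
it relies on LWP marking the responses it generates itself (failed
connection, timeout and so on) with a "Client-Warning: Internal response"
header.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use LWP::UserAgent;

# Hypothetical helper: classify an HTTP::Response object.
# LWP tags responses it had to invent itself (no connection, timeout, ...)
# with a "Client-Warning: Internal response" header.
sub site_state {
    my ($res) = @_;
    my $warning = $res->header('Client-Warning') // '';
    return 'DOWN' if $warning eq 'Internal response';
    return $res->is_success ? 'up' : 'up, but answered with an error';
}

# The agent string is an assumption; pick one that identifies your script.
my $ua = LWP::UserAgent->new(
    agent   => 'is_it_up/0.1',
    timeout => 10,
);

# URLs come from the command line; with no arguments nothing is fetched.
for my $url (@ARGV) {
    my $res = $ua->get($url);
    printf "%s is %s (%s)\n", $url, site_state($res), $res->status_line;
}
```

Run it as, say, "is_it_up.pl http://scripts.sil.org/robots.txt".  A 403 then
shows up as "up, but answered with an error" rather than DOWN.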

Pretty sure there are libcurl interfaces (Net::Curl and WWW::Curl, for two)
which might have better luck impersonating a proper user to get around the
policy.  But your URLs so far have shown some odd responses using wget, so
you may want to check them out by hand before your script has at them.
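
If you'd rather do that checking from Perl than from wget, a rough sketch
like this prints each hop LWP::UserAgent followed, oldest first.  The
hop_lines helper is a made-up name; it only uses the redirects method and
the Location header of the responses LWP hands back.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use LWP::UserAgent;

# Hypothetical helper: one line per hop, oldest first, final response last.
# A hop with a Location header is shown as "code -> target"; anything else
# is shown as its plain status line.
sub hop_lines {
    my ($res) = @_;
    return map {
        my $location = $_->header('Location');
        defined $location ? $_->code . " -> $location" : $_->status_line;
    } ($res->redirects, $res);
}

my $ua = LWP::UserAgent->new(timeout => 10);

# URLs come from the command line; with no arguments nothing is fetched.
for my $url (@ARGV) {
    print "$url\n";
    print "  $_\n" for hop_lines($ua->get($url));
}
```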

On Tue, Feb 13, 2018 at 2:34 PM, Manfred Lotz <ml_n...@posteo.de> wrote:

> On Tue, 13 Feb 2018 13:50:55 -0600
> Andy Bach <afb...@gmail.com> wrote:
>
> > $ wget http://scripts.sil.org/OFL
> > --2018-02-13 13:42:50--  http://scripts.sil.org/OFL
> > Resolving scripts.sil.org (scripts.sil.org)... 209.12.63.143
> > Connecting to scripts.sil.org (scripts.sil.org)|209.12.63.143|:80...
> > connected.
> > HTTP request sent, awaiting response... 302 Found
> > Location:
> > http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=OFL
> > [following] --2018-02-13 13:42:52--
> > http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=OFL
> > Reusing existing connection to scripts.sil.org:80.
> > HTTP request sent, awaiting response... 302 Moved Temporarily
> > Location: /cms/scripts/page.php?site_id=nrsi&id=OFL&_sc=1 [following]
> > --2018-02-13 13:42:52--
> > http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=OFL&_sc=1
> > Reusing existing connection to scripts.sil.org:80.
> > HTTP request sent, awaiting response... 302 Moved Temporarily
> > Location: /cms/scripts/page.php?site_id=nrsi&id=OFL [following]
> > --2018-02-13 13:42:53--
> > http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=OFL
> > Reusing existing connection to scripts.sil.org:80.
> > HTTP request sent, awaiting response... 200 OK
> > Length: unspecified [text/html]
> > Saving to: ‘OFL’
> >
> >     [ <=> ] 37,439      59.6KB/s   in 0.6s
> >
> > 2018-02-13 13:42:55 (59.6 KB/s) - ‘OFL’ saved [37439]
> >
> > so it may not be following the 302s. I'm not sure you're using the
> > correct tool here.  A little more straight forward
> >
> > andy@wiwmb-md-afb-mint:~/spam$ wget http://scripts.sil.org/robots.txt
> > --2018-02-13 13:47:27--  http://scripts.sil.org/robots.txt
> > Resolving scripts.sil.org (scripts.sil.org)... 209.12.63.143
> > Connecting to scripts.sil.org (scripts.sil.org)|209.12.63.143|:80...
> > connected.
> > HTTP request sent, awaiting response... 200 OK
> > Length: 36 [text/plain]
> > Saving to: ‘robots.txt’
> >
> > 100%[================================================>]
> > 36          --.-K/s   in 0s
> >
> > 2018-02-13 13:47:27 (2.99 MB/s) - ‘robots.txt’ saved [36/36]
> >
> > but
> > $ is_it_up.pl
> > http://scripts.sil.org/robots.txt is DOWN!!!!
> >
> > You might look at more LWP tools:
> > $ lwp-dump https://www.sil.org
> > HTTP/1.1 403 Forbidden
> > Cache-Control: max-age=10
> > Connection: keep-alive
> > Date: Tue, 13 Feb 2018 19:49:47 GMT
> > Server: cloudflare
> > Content-Type: text/html; charset=UTF-8
> > Expires: Tue, 13 Feb 2018 19:49:57 GMT
> > CF-RAY: 3eca501a5d569895-LAX
> > Expect-CT: max-age=604800, report-uri="
> > https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
> > Set-Cookie: __cfduid=dd8038f4f2c995fa4b4c7fa8beb2b42f31518551387;
> > expires=Wed, 13-Feb-19 19:49:47 GMT; path=/; domain=.sil.org; HttpOnly
> > X-Frame-Options: SAMEORIGIN
> >
> > <!DOCTYPE html>
> > <!--[if lt IE 7]> <html class="no-js ie6 oldie" lang="en-US">
> > <![endif]--> <!--[if IE 7]>    <html class="no-js ie7 oldie"
> > lang="en-US"> <![endif]--> <!--[if IE 8]>    <html class="no-js ie8
> > oldie" lang="en-US"> <![endif]--> <!--[if gt IE 8]><!--> <html
> > class="no-js" lang="en-US"> <!--<![endif]--> <head>
> > <title>Access denied | www.sil.org used Cloudflare to restrict
> > access</title>
> > <meta charset="UTF-8" />
> > <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
> > <meta http-equiv=...
> > (+ 2770 more bytes not shown)
> >
> > so it's up, but "forbidden" probably as the user agent isn't set or
> > some other policy reason.
> >
> >
>
> I tried WWW::Mechanize, and (of course) got also 403.
>
> Really strange.
>
> Is there another tool I could use for checking? I mean some tool in the
> Perl universe?
>
> --
> Manfred
>
>
>
> > On Tue, Feb 13, 2018 at 11:33 AM, Manfred Lotz <ml_n...@posteo.de>
> > wrote:
> >
> > > On Tue, 13 Feb 2018 10:47:42 -0600
> > > Andy Bach <afb...@gmail.com> wrote:
> > >
> > > > The site doesn't like 'head' requests? get works
> > > > #!/usr/bin/perl
> > > >
> > > > use strict;
> > > > use warnings;
> > > >
> > > > use LWP::Simple;
> > > > #  my $url="https://shlomif.github.io/";
> > > > my $url="http://www.notabug.org/";
> > > > print "$url is ", (
> > > >                 (! get($url)) ?  "DOWN!!!!"
> > > >                                 : "up"
> > > >                 ), "\n";
> > > >
> > > > $ is_it_up.pl
> > > > http://www.notabug.org/ is up
> > > >
> > >
> > > You are right.
> > >
> > > But am afraid this is not all of it. If I test
> > > http://scripts.sil.org/OFL then I get an error but it is fine in
> > > firefox.
> > >
> > > Very strange.
> > >
> > > --
> > > Manfred
> > >
> > >
> > >
> > > >
> > > > On Tue, Feb 13, 2018 at 5:25 AM, Manfred Lotz <ml_n...@posteo.de>
> > > > wrote:
> > > >
> > > > > Hi there,
> > > > > Somewhere I found an example how to check if a website is up.
> > > > >
> > > > > Here my sample:
> > > > >
> > > > > #! /usr/bin/perl
> > > > >
> > > > > use strict;
> > > > >
> > > > > use LWP::Simple;
> > > > > my $url="https://notabug.org";
> > > > > if (! head($url)) {
> > > > >     die "$url is DOWN!!!!"
> > > > > }
> > > > >
> > > > > Running above code I get
> > > > >   https://notabug.org is DOWN!!!! at ./check_url.pl line 8.
> > > > >
> > > > >
> > > > > However, firefox shows the site works ok.
> > > > >
> > > > >
> > > > > What am I doing wrong?
> > > > >
> > > > >
> > > > > --
> > > > > Thanks,
> > > > > Manfred
> > > > >
> > > > > --
> > > > > To unsubscribe, e-mail: beginners-unsubscr...@perl.org
> > > > > For additional commands, e-mail: beginners-h...@perl.org
> > > > > http://learn.perl.org/
> > > > >
> > > > >
> > > > >
> > > >
> > > >
> > >
> > >
> > >
> > >
> >
> >
>
>
>
>


-- 

a

Andy Bach,
afb...@gmail.com
608 658-1890 cell
608 261-5738 wk
