Re: UNIX - Installing Crypt::SSLeay

2008-02-15 Thread Reinier Post
On Wed, Feb 13, 2008 at 12:46:15PM -0500, David Moreno wrote:
 I think that, from the paths he pasted, it's a Sun4 Solaris.

In that case, the answer is probably to install a C compiler.

Last time I used Solaris, it didn't come with a C compiler,
but Sun offered CDs with optional additional packages that
include gcc.  (They may since have modernized the distribution
of this software.)  Install that and either you will have a cc,
or you will have a gcc and need to tell CPAN that gcc is your C compiler.

-- 
Reinier


Re: About LWP in general??

2006-09-12 Thread Reinier Post
On Fri, Aug 18, 2006 at 10:48:30AM +0100, chris choi wrote:
 Hi
 I'm new to Perl, but considering writing web robots/Spiders in Perl, but I'm
 not too sure if the LWP is out-dated or something cause I haven't heard
 anything about LWP recently, so I was wondering if you guys know if there is
 a new thing to write WEB robots with on PERL??

No, LWP is not at all outdated; as far as I'm aware, it is very much the
standard Web client library for Perl.  There is little development
activity going on, but in this case, as far as I'm aware, that is not a
sign of abandonment but of maturity: the library is finished and does
what it was designed to do.
Gisle Aas, its maintainer, monitors this list for feature requests and
bug reports, and occasionally creates a new LWP version in response.

Yes, there are newer libraries specifically for writing Web robots in Perl;
some of them are in CPAN, and are frequently mentioned here; WWW::Mechanize
is the name I see most often.  These libraries all use LWP, as far as I'm aware,
and you are definitely advised to use them instead of building your own directly
on top of LWP.  However, I do not use any of them at the moment, so I cannot
give you more information.
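
For a first impression, here is a minimal WWW::Mechanize sketch (untested;
the URL is a placeholder):

  use WWW::Mechanize;

  my $mech = WWW::Mechanize->new;
  $mech->get('http://example.com/');      # placeholder URL
  foreach my $link ($mech->links) {       # every link on the page
      print $link->url_abs, "\n";
  }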

 thanks
 Chris

-- 
Reinier Post
TU Eindhoven


Re: [Crypt::SSLeay] compile problems on Solaris

2005-11-29 Thread Reinier Post
On Wed, Nov 23, 2005 at 11:31:06AM +0100, Barden, Tino wrote:
 Hello,
 
 I have tried to compile Crypt-SSLeay-0.51 on a Solaris 9 machine and got the 
 following errors:
 
 UZKT3 # perl Makefile.PL
 Found OpenSSL (version OpenSSL 0.9.8) installed at /usr/local/ssl
 Which OpenSSL build path do you want to link against? [/usr/local/ssl] 

[...]

 LD_RUN_PATH=/usr/local/ssl/lib gcc  -G -L/usr/local/lib SSLeay.o  -o 
 blib/arch/auto/Crypt/SSLeay/SSLeay.so   -L/usr/local/ssl/lib -lssl -lcrypto 
 -lgcc   

The problem with compiling on Solaris is usually that a -Rdir has to be
inserted for every -Ldir.  So I'd guess that inserting a -R/usr/local/lib
on the gcc command line would fix the problem in your case.
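
So the link command would become something like this (untested; run-time
paths assumed to mirror the -L flags above):

  LD_RUN_PATH=/usr/local/ssl/lib gcc -G -L/usr/local/lib -R/usr/local/lib \
    SSLeay.o -o blib/arch/auto/Crypt/SSLeay/SSLeay.so \
    -L/usr/local/ssl/lib -R/usr/local/ssl/lib -lssl -lcrypto -lgcc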

-- 
Reinier


Re: HTML::Parser bug

2005-03-21 Thread Reinier Post
On Sun, Mar 20, 2005 at 01:51:25PM -0800, Bill Moseley wrote:
 On Sun, Mar 20, 2005 at 06:02:26PM +0300, [EMAIL PROTECTED] wrote:
  Hello libwww,
  
  using it to parse html-forms etc...
 noticed, that it recognizes strange comment
 like <!--> as starting of the comment,
 not like the whole empty comment, as IE.
 
 Doesn't seem like that's a valid comment.
 
 http://www.w3.org/TR/WD-html40-970917/intro/sgmltut.html#h-3.1.4

Well, the HTML::Parser perldoc says:

  HTML::Parser is not a generic SGML parser. We have tried to make it
  able to deal with the HTML that is actually out there, and it normally
  parses as closely as possible to the way the popular web browsers do it
  instead of strictly following one of the many HTML specifications from
  W3C. Where there is disagreement, there is often an option that you can
  enable to get the official behaviour.

But do all versions of IE parse this the same way?
What do other popular user agents do?

-- 
Reinier


Re: WWW::Mechanize caching

2005-03-09 Thread Reinier Post
On Fri, Feb 25, 2005 at 10:42:43AM +1000, Robert Barta wrote:
 On Thu, Feb 24, 2005 at 11:07:00PM +0100, Reinier Post wrote:
  On Mon, Feb 21, 2005 at 08:27:38AM +1000, Robert Barta wrote:
   Hi all,
   
   I hope I did not miss an obvious solution to the following:
   
   I want a *caching* version of WWW::Mechanize.
  
  Why don't you just use a caching proxy server?  Squid?
 
 First, we need a bit more control on the caching policy. Reconfiguring
 a squid remotely is a bit brittle :-) But, more importantly, we cannot
 assume that a proxy/cache is at every user site where the agent is
 running.

I was assuming you'd put it on the client side.
But perhaps Squid won't run there.

-- 
Reinier


Re: WWW::Mechanize caching

2005-02-24 Thread Reinier Post
On Mon, Feb 21, 2005 at 08:27:38AM +1000, Robert Barta wrote:
 Hi all,
 
 I hope I did not miss an obvious solution to the following:
 
 I want a *caching* version of WWW::Mechanize.

Why don't you just use a caching proxy server?  Squid?
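
Pointing LWP at such a proxy is a one-liner; a sketch with a placeholder
proxy URL:

  use LWP::UserAgent;

  my $ua = LWP::UserAgent->new;
  $ua->proxy('http', 'http://localhost:3128/');   # e.g. a local Squid

All requests made through $ua are then served via the cache.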

-- 
Reinier Post


Re: HTML::TreeBuilder/HTML::Parser - problem parsing tables

2004-04-06 Thread Reinier Post
On Mon, Apr 05, 2004 at 04:29:50PM +0200, Neven Luetic wrote:

 I wrote a small application to collect samples of pages from sites to do
 some usability checking offline. So it's necessary that the archived
 pages match the original exactly, when displayed.
 
 As some tests on the pages are going to be automated using tags or
 attributes as search criteria and as it was necessary to rewrite any
 links to pictures inside the pages, I decided to use HTML::TreeBuilder
 for this.
 
 However, I encountered a critical difference of pages read using
 HTML::TreeBuilder->parse() for parsing and HTML::TreeBuilder->as_HTML
 for writing to the original: in several german newspaper sites, that are
 using big tables for their layout, some tables are closed too early by
 the parser. The effect is, that from that point onward the table-cells
 are displayed row by row (this is true for every browser I tried -
 mozilla, firefox, opera, ie6), while the original page looks ok. 
 
 I tried setting HTML::TreeBuilder->implicit_tags(0) (this will be my
 default setting anyway), but it didn't change the behavior. So I
 suppose, the problem is not with some routine *adding* tags that are
 proposed to be missing, but with the parser itself, misinterpreting the
 tree.
 
 Does anybody have an idea about what the problem might be and how I
 could solve this?

I can only reply as a former HTML::TreeBuilder user.
I patched it a little to fix some of its behaviour.

HTML can be broken, as SGML or XML can.  HTML::Parser and
HTML::TreeBuilder were designed as heuristic parsers: they try to make
sense of broken HTML, and they even try to make sense of it in the same
way that other applications (major browsers) do.  By design, they
transform anything that vaguely looks like HTML into valid HTML that has
the same effect.  But this cannot be guaranteed in general, of course:
different applications have different heuristics for dealing with broken
HTML.

Perhaps HTML::Parser's heuristics for dealing with broken tables are
different from those of the browser you're testing with.  In that case
it would be advisable to extend or modify the HTML::Parser heuristics so
they conform to what your browser does.  Another possibility is to put
in some custom-built preprocessing that makes HTML::Parser do the right
thing in your case.  Ultimately the fault is with the original HTML
pages, which should be fixed to at least be syntactically well-formed.

If the pages you're working on are well-formed HTML, you may be troubled by
a more severe problem: HTML::Parser and HTML::TreeBuilder are expected
to leave non-broken HTML exactly the way it is, but they don't always
do so.  There are problems with handling framesets; perhaps there are
other problems.  If you find any, they should really be fixed.

 I'm pretty stuck, as nearly a quarter of all (newspaper and magazine)
 sites tested have this problem, so that it renders the script virtually
 useless.

Can you post a *minimal* HTML fragment that exhibits the problem?

 Greetings
 
 Neven

-- 
Reinier  Post
TU Eindhoven


Re: Can't locate object method host via package URI::_foreign

2003-09-03 Thread Reinier Post
On Tue, Sep 02, 2003 at 12:29:16PM +0400, Siddhartha Jain(IT) wrote:
 Sorry, the input being given to the $uri->host method was erroneous.
 
 Again, sorry for the false alert!!

Comment: there are URIs that are valid but do not have a host part,
and you will run into this problem with them, so it is a good idea
to use eval { $uri->host } if you haven't checked in advance that
$uri contains one.

WWW::Robot has this problem for instance.
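
A minimal sketch of the defensive call:

  use URI;

  my $uri  = URI->new('mailto:user@example.com');   # a valid URI without a host
  my $host = eval { $uri->host };                   # traps the method-not-found die
  print defined $host ? "host: $host\n" : "no host part\n";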

-- 
Reinier



Re: TreeBuilder cgi memory problems

2003-08-14 Thread Reinier Post
On Fri, Aug 08, 2003 at 12:43:16AM +0100, John J Lee wrote:
 On Thu, 7 Aug 2003, [EMAIL PROTECTED] wrote:
 
 Having a potential TreeBuilder memory problem when using it to parse
 through a large HTML table (> 2K rows) where the memory allocation grows to
  about 20M on my server and never goes down even after finishing with the
  HTML and TreeBuilder structures. The Perl script runs as a CGI and Apache
  gives up after awhile with the following line in the error logs - Out of
  Memory !!
 [...]
 
 20 Mb does seem a lot, but why would one expect the process memory usage
 to fall after parsing is complete?  On most systems, memory used by a
 process and free'd isn't returned to the system until the process exits.
 
 Sorry, no actual help...

Look at the DESCRIPTION section of the HTML::TreeBuilder perldoc, item 4:

   4. and finally, when you're done with the tree, call
   $tree->delete() to erase the contents of the tree from
   memory.  This kind of thing usually isn't necessary with
   most Perl objects, but it's necessary for TreeBuilder
   objects.  See HTML::Element for a more verbose explanation
   of why this is the case.

It may explain the problem.
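
A sketch of the pattern ($html is assumed to hold the fetched document):

  use HTML::TreeBuilder;

  my $tree = HTML::TreeBuilder->new;
  $tree->parse($html);
  $tree->eof;
  # ... walk the tree, extract the table rows ...
  $tree->delete;   # breaks the circular parent/child links so the
                   # memory can be reused within this process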

-- 
Reinier



Re: Help needed

2003-07-15 Thread Reinier Post
On Mon, Jul 14, 2003 at 11:10:26PM +0200, Carsten Kruse wrote:
 Hi Teddy,
 
 if you know about the structure of the html page you should
 try the functions of the HTML::TokeParser package.

[very nice example omitted here for brevity]

I have used HTML::TreeBuilder in some of my scripts.

If the HTML is well-structured, an alternative is to use
XML packages; XML::LibXML can read and write HTML syntax
and allows you to manipulate the structure with DOM operations.

-- 
Reinier



Re: HTML parsing

2003-07-09 Thread Reinier Post
On Tue, Jul 08, 2003 at 03:34:12PM +0100, Richard Lamb wrote:
 Hi folks,
 I'm Richard Lamb, and I'm a Perl virgin. Just getting to know the
 language. I'm in the midst of an MSc in Computing in Manchester (UK),
 working out a means of stripping HTML tags (via the DOM interface, which
 I'm trying to get to grips with) and reformatting text, so as to improve
 a Web site's accessibility (particularly the visually impaired). Are
 there any PPMs you'd recommend I check-out?

I have only tried HTML::TreeBuilder (not DOM, but the same principle;
uses heuristic HTML parsing and patching that does some unwanted things)
and XML::LibXML (which has the advantage of using the libxml2 library
that other languages also bind to; supports a lot of DOM plus some
extensions).

There are many more XML libraries, some with DOM in their names,
but they always fail to install on my Solaris system.

Here is my first XML::LibXML script, a HTML reformatter:

#!/usr/bin/env perl

use XML::LibXML;

my $parser = new XML::LibXML;
$parser->validation(1);
$parser->expand_entities(0);
$parser->keep_blanks(0);
$parser->pedantic_parser(1);
$parser->expand_xinclude(0);

foreach my $srcdoc (@ARGV ? @ARGV : ('-'))
{
  my $doc = $parser->parse_html_file($srcdoc);
  print $doc->toStringHTML();
}

# end of script

 Cheers,
  
 Richard.

  Enjoy,

-- 
Reinier



Re: Help! how is this called?

2002-12-02 Thread Reinier Post
On Thu, Nov 28, 2002 at 12:16:19PM -0700, Keary Suska wrote:
 on 11/27/02 7:54, [EMAIL PROTECTED] purportedly said:
 
  RE: Help! how is this called?Thank you but this won't help me I guess.
  
  I could find that info only from within the script, right?
  
  Well, I want to create a program like that Teleport Pro from Windows that
  spiders a web site and download all the pages from the site.
  To download the pages is very easy, but the biggest problem is to create the
  local file names, and to replace all the links from the downloaded pages to
  make them work locally.
  
  Until now, the only problem I found, is that I can't reliably find the file
  name from the path in all the cases.

I have written a couple of programs that do this.
You don't really need to know a file name, but you do need to weed out
duplicates, e.g. [...]/foo/ is often identical to [...]/foo/index.html.
 
 Well, yes and no. The example URL provided:
 
  http://www.site.com/script.cfm/dir1/dir2/http://www.site.com/file.html
 
 is technically a malformed URI.

According to RFC 2396, it isn't.  A : is allowed anywhere in the path,
and a // is allowed to appear multiple times as well, as far as I can see.
(The / characters separate segments, and segments can be empty.)

Some browsers (at least IE 6 and links) misparse such URLs,
but they have no excuse, as far as I can see.

 It should be:
 
 http://www.site.com/script.cfm/dir1/dir2/http:%2F%2Fwww.site.com%2Ffile.html
 
 or minimally:
 
 http://www.site.com/script.cfm/dir1/dir2/http:%2F%2Fwww.site.com/file.html

That would mean you'd have to rewrite the URLs that point to them
from other documents so they won't break.

 You will always find that sites do stupid things, and will have to find ways
 around them. However, the case of extra PATH_INFO or query strings, it
 doesn't hurt to treat them as they are, and you will be successful most of
 the time.
 
 Other than issues with the URI above, you should have minimal problems.

Right now I have the problem that Apache 2 won't feed URLs to
script.php (in my case it's a PHP script) if they have an extra path.
But this is just one of my regular quarrels with the Apache
configuration file mess; I expect it can be done somehow.

-- 
Reinier Post
TU Eindhoven



Re: can't load host method in URI package

2002-09-01 Thread Reinier Post

On Wed, Aug 28, 2002 at 01:08:41PM -0400, Thurn, Martin (Intranet) wrote:
   LWP::RobotUA=HASH(0x83d7994) GET 1 ...Can't locate object method host
   via package
   URI::_generic (perhaps you forgot to load URI::_generic?) at
 
 That's the message you get when your URL does not have something like
 'http:/' in front of it.
 
 WHY DON'T YOU PRINT OUT THE VALUE OF THE VARIABLES IN YOUR PROGRAM?
 THAT'S THE FIRST STEP IN DEBUGGING.  GO BACK TO 8TH GRADE PROGRAMMING CLASS.
 YOU WOULD SEE IMMEDIATELY THAT WHAT YOU'RE TREATING AS A URL IS NOT A URL.

LWP already saw it, but it kept its knowledge to itself.
A better error message would help here ...
 
-- 
Reinier



Re: can't load host method in URI package

2002-08-28 Thread Reinier Post

On Fri, Aug 09, 2002 at 03:01:54PM +0500, Ken Munro wrote:
 Hi.
 
 I am trying to write a simple robot that reads urls from a text file.
 The source is listed below. I am getting an error that says:
 
 LWP::RobotUA=HASH(0x83d7994) GET 1 ...Can't locate object method host
 via package
 URI::_generic (perhaps you forgot to load URI::_generic?) at
 /usr/lib/perl5/site_perl/5.6.1/WWW/RobotRules.pm line 187, IN line 2.
 
 I have search Google far and wide, and have found other people with this
 problem, but no solution.

This is from memory and refers to old code, but ...

I think I saw that problem when I was using WWW::Robot over a year ago.
I remember fixing a problem with LWP::RobotUA crashing on unusual URLs.
I also removed some hardcoded limits on the kinds of URLs WWW::Robot traverses.
If you're interested in trying the modified modules, they are available at

  http://www.win.tue.nl/~rp/perl/lib/LWP/RobotUA.pm
  http://www.win.tue.nl/~rp/perl/lib/WWW/Robot.pm

-- 
Reinier



Re: Fw: Can't navigate to URL after login

2002-08-22 Thread Reinier Post

On Tue, Aug 06, 2002 at 10:26:01AM -0500, Kenny G. Dubuisson, Jr. wrote:
 Tried that (referer = ...) with no luck.  I did find that I can navigate
 several pages in using sequential $browser-request calls but the page that
 finally fails has the hyperlinks to the next page calling javascript
 functions.  Maybe that has something to do with it.  Just a guess at this
 point.  Thanks,
 Kenny

Most definitely.  LWP does not include Javascript support.

-- 
Reinier



Re: libwww only as root

2002-08-02 Thread Reinier Post

On Wed, Jul 17, 2002 at 12:44:39PM -0600, Keary Suska wrote:
 on 7/17/02 12:13 PM, [EMAIL PROTECTED] purportedly said:
 
  If you're having a problem with running your scripts from cron, the answer
  is usually in your PATH environment variable or working directory - cron
  tends to run with different paths, and your script probably can't find
  libraries or other things it needs.
 
 Except that Perl does not rely on PATH and related variables to determine
 module or loadable (.so) locations.

$PERLLIB

 The cron problem could be permissions or
 a CWD problem which would effect finding custom modules, if that is even an
 issue, or several other reasons. If there was some information about what is
 going wrong, perhaps a more sensible solution could be presented.

Run a cron job with the command 'env > /tmp/env'.
Then set your environment to exactly that.  (If you're using bash
or some other sh-derivative, just '. /tmp/env' should do the trick.)
Then debug the errors you get.

-- 
Reinier



Re: Fetching big files

2002-06-03 Thread Reinier Post

On Wed, May 29, 2002 at 06:20:58PM +0300, evgeny tsurkin wrote:
 
  Hi!
  The problem I have:
  I am fetchng big files from the web that are created on the fly.
  Befor actually fetching them I would like to know what is the size
  it is going to be.
  I am not sure that is possible ,but if it is - please be very clear
  i am new in using lwp.
  Thanks.

Use your HEAD.

(It comes with LWP!)

The 'head' method only fetches the document headers;
one of them gives the document size.
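
A minimal sketch with LWP::Simple (the URL is a placeholder; note that
pages generated on the fly often omit the Content-Length header, so be
prepared for an undefined size):

  use LWP::Simple qw(head);

  my ($type, $length, $mtime) = head('http://example.com/big-file');
  if (defined $length) {
      print "Expect $length bytes of $type\n";
  } else {
      print "The server did not announce a size\n";
  }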

-- 
Reinier



Re: Authentication

2002-03-29 Thread Reinier Post

On Fri, Mar 22, 2002 at 05:05:49PM -0600, Damian Kohlfeld wrote:
 I have a situation where I have a webpage that has a list of links to
 other web sites.  They do the following:
 
 Login to my website.
 My website sends assigns them a cookie using libwww perl.
 They see the list of links and click one.
 The web page pointed to can check the cookie and see if it is valid,
 thus, granting them access.
 
 The whole point is that I want to authenticate poeple so that they can
 visit the links on my page, but, I want to keep them from visiting the
 links directly.
 Is this possible?

Yes, why not?  If they already have a cookie, let them in;
if not, redirect them to the login page.

-- 
Reinier



Re: hi

2002-03-06 Thread Reinier Post

On Tue, Mar 05, 2002 at 06:54:41AM -0800, Randal L. Schwartz wrote:
  Reinier == Reinier Post [EMAIL PROTECTED] writes:
 
 Reinier On Mon, Mar 04, 2002 at 04:33:37PM +0530, kavitha malar wrote:
  I want to search a text in a website how to do that through perl.
 
 Reinier perl -MLWP::Simple -e \
 Reinier  'getprint "http://www.google.com/search?q=$word+site:$site"'
 
 Reinier I'm serious.  (This is what I use to find my own pages.)
 
 Except now, Google has gotten fairly upset about automated page
 fetches.  There's a thread on use.perl.org about it.

Thanks for the pointer.

  http://www.google.com/terms_of_service.html

is pretty vague about it.  As someone in the thread remarked, we are
talking about a single query here, for personal use, without even any
reformatting of the results.
 
 And last time I checked, Google *specifically* blocks the default
 agent type that LWP uses, so you'll get no response.  You have
 to change the agent type to something with Mozilla in it. :)

Mmm, I should have checked that.  I actually feed the Google query URL
to lynx or links.
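
A sketch of that agent-string change (the string and query URL are
placeholders):

  use LWP::UserAgent;
  use HTTP::Request;

  my $ua = LWP::UserAgent->new;
  $ua->agent('Mozilla/4.0 (compatible; my-fetcher)');   # placeholder string
  my $res = $ua->request(HTTP::Request->new(GET =>
      'http://www.google.com/search?q=foo+site:example.com'));
  print $res->content if $res->is_success;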

 Gisle - would it be unfair to have a special useragent string
 when LWP detects that it is visiting Google? :)

Nice idea :)  But hidden magic in code is always bad.

-- 
Reinier



Re: hi

2002-03-05 Thread Reinier Post

On Mon, Mar 04, 2002 at 02:57:30PM +0530, kavitha malar wrote:
 perl -MLWP::Simple -e 'getprint "http://www.yahoo.com"'
 400 Bad Request URL:http://www.yahoo.com
 
 anybody knows why this error is happening?

It isn't here.  Try setting $http_proxy or something.

 --jude

-- 
Reinier



Re: installing libwww on solaris

2002-02-15 Thread Reinier Post

On Thu, Feb 14, 2002 at 02:47:19PM +0200, Afgin Shlomit wrote:
 
 I try to install libwww on solaris and first the 'make test' dont past
 okay - I get :
 robot/ua........Perl lib version (v5.6.1) doesn't match executable
 version (5.00503) at /usr/local/lib/perl5/5.6.1/sun4-solaris/Config.pm line 18.

This is not an LWP-specific problem.

It means you're mixing references to the old and new Perls.
It is possible to use your own set of Perl modules with an existing
Perl installation - I have done this on Solaris.  It is also possible
to do the reverse: install your own new Perl and use libraries of the
previous Perl installation with it; I have done this on Solaris, too.
But in the long run, the cleanest approach is to install a completely
separate version of Perl and Perl modules that do not refer to any
preexisting Perl installation.  There are many places where these
references can be set (Perl Configure, CPAN config, $PERLLIB variable,
etc.) so you have to be careful.  Documentation is in

   perldoc ExtUtils::MakeMaker

and other places.

I like to use the CPAN shell, configure it to use its own location for
everything Perl, then reinstall the CPAN module, restart it, and reinstall
Perl itself with it.  But there are many different methods.
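
For example, a sketch of the relevant settings (the PREFIX is a placeholder):

  cpan> o conf makepl_arg "PREFIX=/home/me/perl"
  cpan> o conf commit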

Then when you want to move it to /usr/local, redo the installation
from scratch.  My installation notes are here:

  http://wwwis.win.tue.nl/~rp/perl-install/

-- 
Reinier



Re: Double slash in a URI: legal or not?

2002-01-06 Thread Reinier Post

On Sun, Jan 06, 2002 at 08:41:54AM -0800, Randal L. Schwartz wrote:
  Hans == Hans De Graaff [EMAIL PROTECTED] writes:
 
 Hans RFC 2396 seems to indicate that in path segments only a single slash
 Hans is legal,
 
 I'm not sure where you get that.  My reading of the BNF:
 
   abs_path      = "/"  path_segments
   path_segments = segment *( "/" segment )
   segment       = *pchar *( ";" param )
 
 implies that a segment can be null, so "abc//def" is "abc", "", "def" in
 terms of path steps, and is thus *not* equivalent to "abc/def".
 
 Sure, you can't have two slashes next to each other and have it mean
 *nothing*, but it does in fact form a legal URI and a server can
 possibly ignore it or do something different with it or report that
 this particular resource is not found.

Not only that, adjacent /es aren't even removed in relative URL processing,
according to RFC 1808; the path transformation rules, see e.g.

  http://deesse.univ-lemans.fr:8003/Connected/RFC/1808/18.html

do not match // within a path ('a complete path' must not be empty).
This is contrary to Unix file path semantics, which do treat any
sequence of /s as equivalent to a single one.
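
The URI module illustrates this: path_segments keeps the empty segment.

  use URI;

  my $u = URI->new('http://example.com/abc//def');
  print join('|', $u->path_segments), "\n";   # prints |abc||def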

-- 
Reinier



Re: Minor bug in request()

2001-10-22 Thread Reinier Post

 Why?  I know this has been argued extensively elsewhere; see e.g.
 
http://pppwww.ph.gla.ac.uk/~flavell/www/post-redirect.html
 
 Possibility 1 mentioned there is common enough to add support for it.
 
 --
 Reinier
 
 
 This link goes nowhere -- is the site down?

The correct link is

  http://ppewww.ph.gla.ac.uk/~flavell/www/post-redirect.html

I tried to verify it, but couldn't at the moment of posting.
Lesson: don't post in such cases.

-- 
Reinier



Re: ODBC to MS SQL 7/2000

2001-09-24 Thread Reinier Post

On Thu, Sep 20, 2001 at 05:50:39PM -0400, Hawk wrote:
 Hi,
 
 I have been assigned the task of writing perl scripts from a Linux box to 
 connected to a MS SQL 7/2000 server.  Are there routines and modules 
 already built for this?

Yes.

% perl -eshell -MCPAN

cpan shell -- CPAN exploration and modules installation (v1.59_54)
ReadLine support enabled

cpan> i /SQL/
CPAN: Storable loaded ok
Going to read /home/rp/.cpan/Metadata
  Database was generated on Sat, 22 Sep 2001 00:01:30 GMT
[...]
Module  MSSQL::DBlib(S/SO/SOMMAR/mssql-1.008.zip)
[...]
cpan> i /ODBC/
DistributionJ/JM/JMAHAN/iodbc_ext_0_1.tar.gz
DistributionJ/JU/JURL/DBD-ODBC-0.28.tar.gz
Module  DBD::ODBC   (J/JU/JURL/DBD-ODBC-0.28.tar.gz)
Module  RDBAL::Layer::ODBC (B/BR/BRIAN/RDBAL-1.2.tar.gz)
Module  Win32::ODBC (Contact Author Dave Roth [EMAIL PROTECTED])
Module  iodbc   (J/JM/JMAHAN/iodbc_ext_0_1.tar.gz)
6 items found

cpan> 

I haven't used any of it, but it's there.

-- 
Reinier



Re: problems installing the modules

2001-08-24 Thread Reinier Post

On Thu, Aug 23, 2001 at 01:46:47PM +0300, Yair Lapin wrote:
 Hi,
  
 I'm trying to install the libwww modules in a sparc server with solaris 2.8
 and the most of them I can't compile I get the following
 Error message:
  
 cc -c   -xO3 -xdepend -DVERSION=\"3.25\" -DXS_VERSION=\"3.25\" -KPIC
 -I/usr/perl5/5.00503/sun4-solaris/CORE -DMARKED_SECTION Parser.c
 cc: unrecognized option `-KPIC'
 cc: language depend not recognized

-xdepend and -KPIC are options for Sun's cc.  Are you sure your cc is Sun's?

If not, you have to regenerate the Makefile for the C compiler
you're using (probably gcc).
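
A sketch of that step (CC, LD, OPTIMIZE and CCFLAGS are standard
ExtUtils::MakeMaker overrides; the gcc flags are guesses to replace the
Sun-specific -xO3/-xdepend/-KPIC recorded in Config.pm):

  % perl Makefile.PL CC=gcc LD=gcc OPTIMIZE='-O2' CCFLAGS='-fPIC'
  % make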

-- 
Reinier



Re: Question

2001-07-09 Thread Reinier Post

On Sun, Jul 08, 2001 at 04:03:46PM -0700, Jason Whitlow wrote:
 I am trying to get one of my apps to display only 5 records at a time. With
 Perl attaching to a mysql database. Does anyone have any good Ideas of how
 to do this. 

Yes, Perl can do this (check the DBI documentation), and the SQL SELECT
statement can do this too (check the MySQL documentation, www.mysql.com).
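
A sketch (table, columns and connection details are placeholders; MySQL's
LIMIT takes an offset and a row count):

  use DBI;

  my $dbh = DBI->connect('dbi:mysql:mydb', 'user', 'password');
  my $offset = 0;     # 0, 5, 10, ... for successive pages
  my $sth = $dbh->prepare(
      "SELECT id, name FROM records ORDER BY id LIMIT $offset, 5");
  $sth->execute;
  while (my ($id, $name) = $sth->fetchrow_array) {
      print "$id: $name\n";
  }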

Your question is off-topic for this mailing list,
which is about LWP, the WWW library for Perl.

-- 
Reinier



Re: LWP::RobotUA->recurse?

2001-07-09 Thread Reinier Post

On Fri, Jun 29, 2001 at 10:44:20PM +0200, Simon Dang wrote:
 Hi, I am a newbie with LWP. 
 
 Does LWP::RobotUA run recursively by default? 
 
 If not, is there method that I can call within UA to
 set this to run recursively?
 
 I have searched the docs within LWP::RobotUA, but
 there is nothing mentioned about recursive searches. 

Try WWW::Robot, which is an interface on top of LWP::RobotUA that does just that.

-- 
Reinier



Re: redirects and javascript

2001-07-09 Thread Reinier Post

On Mon, Jul 02, 2001 at 08:51:42AM -0400, fred whitridge wrote:
 I have inelegantly solved my problem by loading the page with the
 javascript reference into Excel and then snagging the executed result.
 There has to be a better way to do this, altho' this one works.

LWP doesn't support Javascript, but you can

  + do an ad-hoc 'parse' of the Javascript code in question,
    if you know what it looks like, using regexps (see the sketch below)
  + check the mailing list archives (the topic has been discussed before)
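
A crude sketch of the regexp approach (it only catches the simplest
window.location assignments; $html and $base are assumed to hold the
fetched page and its URL):

  use URI;

  if ($html =~ /window\.location(?:\.href)?\s*=\s*["']([^"']+)["']/) {
      my $target = URI->new_abs($1, $base);
      # fetch $target next
  }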

-- 
Reinier



Re: HTML::Parser - Extracting out the text from body

2001-07-09 Thread Reinier Post

On Mon, Jul 02, 2001 at 11:17:00AM -0700, Bill Moseley wrote:
 Hello,
 
 I need to extract text out of html docs to do search word highlighting in
 context.  (You know, like google's output.)
 
 So, is there a fastest method to do this -- better than just using
 HTML::Parser, setting a flag when I catch <body> and then storing the text?

If 'fastest' means 'most convenient', try

  perl -MLWP::Simple -MHTML::TreeBuilder -e \
    'print HTML::TreeBuilder->new->parse(LWP::Simple::get("http://www/"))->as_text'

-- 
Reinier



how to disable automatic redirect (was: Newbie Question)

2001-05-16 Thread Reinier Post

On Tue, May 15, 2001 at 11:48:54AM -0400, Jean Zoch wrote:
 Hello all,
 
 I am developing a utility that needs to grab the HTML code from web pages.
 To do this I am using:
 
 my $url = 'http://www.theURLiWant.com';
 
 use LWP::Simple;
 $content = get($url);
 
 This works great, but I also need the *actual* URL of the content that is
 returned. Often, the content does not come from $url, but from a redirect. I
 need this so that I can add a BASE HREF=$url to the code so that the web
 page will work even though it is being displayed on my server.

You can disable automatic following of redirects.  The perverse
(author's wording) and quite popular way of doing this is
described at

   http:[EMAIL PROTECTED]/msg0.html

See also

  perldoc LWP::UserAgent
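
A sketch of the non-redirecting variant ($url is a placeholder;
simple_request() issues a single request and does not follow anything):

  use LWP::UserAgent;
  use HTTP::Request;

  my $ua  = LWP::UserAgent->new;
  my $res = $ua->simple_request(HTTP::Request->new(GET => $url));
  if ($res->is_redirect) {
      print "Redirects to: ", $res->header('Location'), "\n";
  }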

 I have tried using LWP::UserAgent, and getting the headers returned from the
 web page, but all that gives is something like:
 HTTP::Headers=HASH(0x10205064).
 
 Any suggestions?

That's the printable representation of a Perl object holding the headers.

You want to extract the object's fields, presumably by calling
HTTP::Headers methods.  See

  perldoc HTTP::Headers
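
To get at the *actual* URL, a sketch ($ua and $url as above; request()
follows redirects, and the response remembers the final request it
answers):

  my $res = $ua->request(HTTP::Request->new(GET => $url));
  print "Fetched from: ", $res->request->uri, "\n";
  print "Base for relative links: ", $res->base, "\n";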


BTW please use the subject of your message to indicate the subject of
your message.  Thanks.

-- 
Reinier



Re: Automated FORM posting

2001-05-16 Thread Reinier Post

On Wed, May 16, 2001 at 12:52:04PM +0100, D.D.Casperson wrote:
 Hi
 
 I am new to perl, so I would appreciate a verbose response to this, any
 references would be great.
 
 I am playing around with HTML FORM's and it had been suggested to me that the
 libwww might be the answer to my problem.
 
 I want to generate a client that automaticly fills out a form and posts the
 details to the server.

For a given form, or for a large range of forms unknown in advance?
It's possible to do the latter, to some extent.
 
 How would I go about writing a perl script that accomplished the same as the
 HTML below?

You mean, the same as a user filling out a form with a browser
and clicking submit?

 For example if I filled out my message as hello, this is a test,
 and my name as Dominic, and then clicked the Go button. Is it possible to
 write a perl script so that the server couldn't tell the difference between a
 POST from that script and a person filling out the web page.

Yes.  The POST utility distributed with LWP supports
this; you can read its code to see how it's done.
The LWP mailing list archives have many postings on this issue.

You can write a parser for forms (using HTML::Parser or modules
that depend on it) that parses a form to find the form fields
and then submit the form using values you pick somehow.

LWP doesn't support Javascript, so the forms have to be without
Javascript.
 
 <FORM NAME="testForm" METHOD="POST"
 target="_top" ACTION="http://someserver.com/SendForm.htm">
 
 <TEXTAREA NAME="Message" wrap="no" rows="5" cols="40"></TEXTAREA>
 
 <INPUT type="text" NAME="Name" size="20">
 
 <INPUT TYPE="button" VALUE="Go" onClick="submit()">
 
 </FORM>
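
A sketch of the equivalent request for the form quoted above (field names
taken from the form; to the server this looks like a browser submission):

  use LWP::UserAgent;
  use HTTP::Request::Common qw(POST);

  my $ua  = LWP::UserAgent->new;
  my $res = $ua->request(POST 'http://someserver.com/SendForm.htm',
                         [ Message => 'hello, this is a test',
                           Name    => 'Dominic' ]);
  print $res->as_string;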

-- 
Reinier Post



Re: LWP::RobotUA problem

2001-04-24 Thread Reinier Post

On Tue, Apr 24, 2001 at 09:47:21AM -0700, Gisle Aas wrote:

  234c234,235
  <     my $netloc = $request->url->host_port;
  ---
  >     my $ru = $request->url;
  >     my $netloc = $ru->can('host_port') ? $ru->host_port : $ru->host;
 
 Not all URIs have a 'host' method either.  I think simply making it:
 
 $netloc = eval { $ru->host_port };
 
 should do.

If eval{}ing arbitrary URIs is safe ... what happens on the 'URI'

  http://$usersuppliedvalue/

?  I'd have to check this particular case ...  Does LWP promise in general
to avoid exploits of this nature?

 But then we have the $SIG{__DIE__} stupidity which makes it:
 
 $netloc = eval { local $SIG{__DIE__}; $ru->host_port };

That's nice enough, if eval{} really doesn't lead to exploitable URIs.

-- 
Reinier



Re: considering HTML::Element's $tree->extract_links

2001-02-27 Thread Reinier Post

On Sat, Feb 24, 2001 at 05:11:02PM -0700, Sean M. Burke wrote:
 Some clever person wrote me earlier this month and suggested adding a
 feature to HTML::Element's extract_links method; and I want to
 run it past people who actually use the current method's behavior.

Count me in.

 What the person who wrote to me suggested was this:  make each item
 in the returned array contain not two subitems (attribute_value,
 $element), but THREE: (attribute_value, $element, attribute_name).
 
 I think this is a wonderful idea.

Yes, definitely!  In fact, I'd be happy to get just the element.

 For anyone who uses extract_links, I'm asking:  would any of your code
 break if I added a third value to each sublist returned?

Mine won't.
 
-- 
Reinier



Re: / and DirectoryIndex

2001-02-21 Thread Reinier Post

On Wed, Feb 21, 2001 at 04:42:20PM +0700, John Indra wrote:
 Hi all...
 
 How do I tell my user-agent (an LWP::UserAgent object) to NOT download both
 / and index.html or whatever remote sites DirectoryIndex set to?
 Example, my user-agent sees 2 link:
 - http:://www.domain.com/

This :: notation is contagious :-)

 - http:://www.domain.com/index.html

 IF in this situation both link to the same document, my user-agent will be a
 fool if it tries to download both file. How do I make a "smarter" user-agent
 that will know that those 2 links are the same and only perform one GET
 method, either to http:://www.domain.com/ OR
 http:://www.domain.com/index.html?

The server won't tell you whether or not they're the same document.
You have the same problem with server aliases or symlinks: the whole
tree

   http://www.domain.com/a/butreally/b/*

may be identical to 

  http://www.domain.com/b/*

Depending on what you find on the server it may be possible to hypothesize
some heuristics, for instance, '*/index.html always has the same content
as */', but exceptions are always possible.  The only way to be really sure
is to check the document content, or at least the header.
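
A sketch of the content check (hashing each body to spot duplicates; in
practice you would compare headers first to save bandwidth):

  use LWP::Simple qw(get);
  use Digest::MD5 qw(md5_hex);

  my %seen;
  foreach my $url (@urls) {     # @urls holds the candidate links
      my $content = get($url);
      next unless defined $content;
      next if $seen{md5_hex($content)}++;   # same bytes fetched before
      # process $url as new content here
  }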

-- 
Reinier



Re: Off topic question

2001-01-23 Thread Reinier Post

On Mon, Jan 22, 2001 at 08:57:20AM -0800, [EMAIL PROTECTED] wrote:
 I know this is off topic, but can some perhaps point me to a 
 resource online that shows how you can load a perl module into 
 your local cgi-bin and use it locally. I'm running into a case of a host 
 admin that refuses to install some modules for some of our software. 
 It would be a lot easier if I could provide instructions for people that 
 want to install our software if the module is missing and the admin is 
 uncooperative.

Well, the basic idea is: set $PERLLIB to the installation location,
both at installation time and at use time.  At use time you can also use

  use lib 'libdir';
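
A sketch of such a private installation (all paths are placeholders):

  % perl Makefile.PL PREFIX=/home/me/perl LIB=/home/me/perl/lib
  % make && make test && make install

and then, at the top of the CGI script:

  use lib '/home/me/perl/lib';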


-- 
Reinier



Re: Install, Again

2001-01-11 Thread Reinier Post

On Tue, Jan 09, 2001 at 09:00:58AM -0500, Alliance Support wrote:
 
 # perl -e "use LWP::Proxy"
 Can't locate LWP/Proxy.pm in @INC (@INC contains:
 /usr/local/lib/perl5/5.00502/sun4-solaris
 /usr/local/lib/perl5/5.00502
 /usr/local/lib/perl5/site_perl/5.005/sun4-solaris
 /usr/local/lib/perl5/site_perl/5.005
  .) at -e line 1.
 BEGIN failed--compilation aborted at -e line 1
 
 I really don't understand what the test is doing other than looking for a
 file LIB/Proxy.pm. There is a directory name LWP but no Proxy.pm file, lots
 of others.

So install the LWP::Proxy module.
The standard way of doing this is by typing

# perl -eshell -MCPAN
cpan> install LWP::Proxy
[...]
cpan> quit
#

If you can't touch the set of libraries installed as root, it is
possible to install your own set, or even your own version of Perl,
from the same interface.  This is not very well documented though; I
just lost a few hours because I didn't remember all the details from
last time.  Basically, you need to set $PERLLIB to where you want your
own libraries installed, then read the ExtUtils::MakeMaker and CPAN
manpages for the proper values of the CPAN configuration variables,
then copy CPAN/Config.pm to $HOME/.cpan/CPAN/MyConfig.pm, edit it to
contain the correct values, and 'perl -eshell -MCPAN' will work.

-- 
Reinier



Re: problems with LWP::UserAgent

2000-12-07 Thread Reinier Post

On Wed, Dec 06, 2000 at 04:38:48PM -0800, Gisle Aas wrote:

  <meta http-equiv="Refresh" content="0; URL=/2000/11/02/">
  
  Contrary to what you seem to believe, this is not a HTTP redirect.
  It isn't handled by the redirect_ok setting.
  
  I don't think LWP offers support for automatic refreshes.
 
 LWP will let HTML::HeadParser look at the HTML it receives, so these
 meta elements actually end up as HTTP headers.  We might try to deal
 with:
 
Refresh: 0; ...
 
 as if it was a normal 3xx-redirect.

This would be nice, if documented e.g. for redirect_ok.

 If the number is something else
 than 0 then the page should simply be returned as now.

It would be nice to also have the option to have it refreshed anyway.
It would even be possible to refresh after the specified # of seconds,
with sleep().

  refresh_ok ?
  refresh_immediately_if_faster_than(10) ?
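
In the meantime, a client-side sketch (assumes the meta element has been
promoted to a Refresh header as described above; $ua and $req as usual):

  use URI;

  my $res = $ua->request($req);
  if (my $refresh = $res->header('Refresh')) {
      my ($delay, $url) = $refresh =~ /^\s*(\d+)\s*;\s*URL=(\S+)/i;
      if (defined $url) {
          sleep $delay if $delay;
          $res = $ua->request(
              HTTP::Request->new(GET => URI->new_abs($url, $res->base)));
      }
  }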

 --Gisle
 
-- 
Reinier Post
TU Eindhoven



Re: problems with LWP::UserAgent

2000-12-06 Thread Reinier Post

On Thu, Dec 07, 2000 at 12:54:00AM +0200, [EMAIL PROTECTED] wrote:
 Hi,
 
 can you help me to desolve this problem? is where I have mistake in follow script?
 
 exist A.html and B.html pages. how can I get the source of B.html page if page 
A.html redirect to B.html like in follow HTML example?
 
 <!doctype html public "-//w3c//dtd html 4.01 transitional//en">
 <html>
 <head>
 <meta http-equiv="Content-Type" content="text/html; charset=windows-1251">
 <meta http-equiv="Refresh" content="0; URL=/2000/11/02/">

Contrary to what you seem to believe, this is not an HTTP redirect.
It isn't handled by the redirect_ok setting.

I don't think LWP offers support for automatic refreshes.

-- 
Reinier Post



Re: URI::Heuristic

2000-11-27 Thread Reinier Post

On Fri, Nov 24, 2000 at 07:32:03PM +0200, Doru Petrescu wrote:
 tryed to email "[EMAIL PROTECTED]" but I got an user not found" SMTP error :(
 hope this is the right email address ...
 
 -- Original message --
 Subject: URI::Heuristic
 
 
 Hi,
 
 I was playing with the URI::Heuristic module, and
 I have a suggestion ... when guessing the host part of a URI, befor trying
 things like 
 
 www.ACME.MY_COUNTRY
 www.ACME.com
 www.ACME.org
 ...
 
 isn't normal to first try to resolve that ACME string ? maybe it is a host
 in my OWN LOCAL DOMAIN ...

I haven't looked at the code, but I can guess the reason: it's desirable
to limit URI::* to doing pure string manipulation, without any
dependence on DNS lookups or actual document lookups over HTTP, etc.
This limits the amount of guessing it can do, but it won't rely on
the availability of a network connection.  Depending on the circumstances
of use, DNS lookups may be slow or completely inoperative.

Perhaps you can implement a URI::Heuristic::Gethostname to do what you want?
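
A sketch of such a module's core (the function name is hypothetical: try
a plain DNS lookup first, then fall back to URI::Heuristic's guessing):

  use Socket qw(inet_aton);
  use URI::Heuristic qw(uf_urlstr);

  sub guess_with_dns {    # hypothetical helper
      my $str = shift;
      # a bare name that resolves in the local domain wins over www.$str.com
      return "http://$str/" if $str =~ /^[\w-]+$/ && inet_aton($str);
      return uf_urlstr($str);
  }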

 lynx on the other hand does another stupid thing ...
 it tries:
 1. rdsnet.ro ... fails
 2. www.rdsnet.ro.com ... fails
 3. www.rdsnet.ro.org/net/mil ... etc ... all of them fail ...
 4. host not found ...
 (but OFC, www.rdsnet.ro exists and is up and alive, too bad no one ask for
 his name ...)
 
 what do you say ... ?

Send a bug report to lynx-dev :)

-- 
Reinier Post
TU Eindhoven



Re: libwww-perl install

2000-11-15 Thread Reinier Post

On Wed, Nov 15, 2000 at 09:30:12AM +0100, Bence Fejervari wrote:
 
 Hi!
 
 Yesterday I tried to install the libwww-perl package from .tar.gz file,
 but when I made make test, it gave me 16 errors out of 22.
 I attached all the output information.

LWP depends on some other libraries that you need to install first.
My advice is to use

  % perl -eshell -MCPAN

and let the CPAN shell do this for you automatically.

  cpan> install Bundle::LWP

-- 
Reinier



Re: One Doubt !!!

2000-11-06 Thread Reinier Post

On Mon, Nov 06, 2000 at 11:43:47PM +0530, Vasu Balla wrote:
 Can Any body clarify my doubt ... Is it possible to save all images to
 disk which are requested by a client i.e., any browser . so that they can
 be manipulate them then send them to browser ... Are there any modules
 ... useing which , we can do this job ...

Definitely; I use LWP and Image::Magick to dynamically manipulate
images at the time they are requested by the browser.  You'll need a
detailed plan on how your software is going to work though, and that
is something you'd better design by yourself.
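
Still, a sketch of the manipulate-on-request step (the URL and the
transformation are placeholders; error handling omitted):

  use LWP::Simple qw(get);
  use Image::Magick;

  my $data  = get('http://example.com/photo.jpg');   # placeholder URL
  my $image = Image::Magick->new;
  $image->BlobToImage($data);                 # load the fetched bytes
  $image->Resize(geometry => '50%');          # example manipulation
  binmode STDOUT;
  print "Content-Type: image/png\n\n";        # serve it back from a CGI
  print $image->ImageToBlob(magick => 'png');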

-- 
Reinier Post



Re: Perl script

2000-11-06 Thread Reinier Post

On Mon, Nov 06, 2000 at 01:58:53PM -0500, Dahlman, Don P. wrote:
 Not sure if this pertains to this mail list or not, but I will try.

It doesn't.  LWP is a software library for client-side use of the Web
in Perl.
 
 However in the last six months the script was spammed
 by mass amounts of off domain calls. The script evidently
 was able to be called with the proper parameters attached on to the
 script call.

You can probably shut out certain clients (by IP number or other
characteristics) in such a way that that specific caller is denied
access and not too many others are denied as well.  See your web server
software's documentation for details.

-- 
Reinier Post



Re: Logging on to a Website using libwww

2000-10-26 Thread Reinier Post

On Thu, Oct 26, 2000 at 10:39:57AM +0200, Dirk Treusch wrote:
 Dear list members,
 
 I would like to log in to a web site from Perl and return the URL and 
 content which the server returns after the login.
 
 With the code below I have been successful in filling simple forms on 
 some websites. However I have not been able to log in to
 http://www.finanztreff.de 

Your Perl code looks fine, but when I look at the source of this page, I see:

  <form NAME="LOGIN" action="/ftr/steuer/user_steuer.htm" method=post
    onSubmit="return LoginwithCookie();">

I haven't studied the details, but you probably need to add
cookie handling on the LWP end.  This is supported.
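
A sketch of the cookie handling (the jar file name is a placeholder):

  use LWP::UserAgent;
  use HTTP::Cookies;

  my $ua = LWP::UserAgent->new;
  $ua->cookie_jar(HTTP::Cookies->new(file => 'cookies.txt', autosave => 1));
  # POST the login form with this $ua first; the cookie is then stored
  # and sent back automatically on the following requests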

-- 
Reinier Post



Re: filtering uploaded files

2000-10-19 Thread Reinier Post

On Mon, Aug 28, 2000 at 10:58:52PM +0200, Gisle Aas wrote:
 Ian Duplisse [EMAIL PROTECTED] writes:
 
  I am uploading files via HTTP::Request::Common::POST, but would like 
  to modify the data that is actually uploaded to the webserver "on the fly", 
  such as with a search and replace.  How can that be done, short of making a 
  copy of my original file that has the desired changes?
 
 Something like this should work:
 
 #!/usr/bin/perl
 
 use HTTP::Request::Common qw(POST);
 
 my $file = `cat stuff.txt`;  # slurp a file
 $file =~ s/foo/bar/g;        # modify it
 
 my $req = POST('http://foo.com/',
                Content_Type => 'form-data',
                Content      => [ foo  => $bar,
                                  file => [ undef, "stuff.txt",
                                            Content_type => "text/plain",
                                            Content      => $file,
                                          ],
                                ],
               );
 
 print $req->as_string;
 
 use LWP::UserAgent;
 $ua = LWP::UserAgent->new;
 my $res = $ua->request($req);
 __END__

This is incompatible with the file upload performed by my Netscape 4.7
browser on Win98, as understood by HTTP::File::upload.  I.e. the above
script will result in empty upload results with HTTP::File::upload at
the receiving end, because the file content is transmitted in a
different way.

Does LWP also support that other method (multipart/form-data, with
every form attribute being a separate part)?

 Regards,
 Gisle
 
-- 
Reinier Post [EMAIL PROTECTED]



Re: Extending HTML-Parser

2000-10-10 Thread Reinier Post

On Tue, Oct 10, 2000 at 12:35:48PM -0500, [EMAIL PROTECTED] wrote:
 Gisle Aas suggested I send the following patch to this libwww mailing list to
 see if any of you have any comments.  The feature we've been talking about adds
 functionality to HTML::Parser to allow it to parse ASP- and/or JSP- style tags.  
 Specifically, we can now configure patterns to specify regions of the input
 tags which should be handed off to special handlers to handle, e.g. the 
 following:
 
  The conditional is 
  <% if ($conditional) { %>
    <blink>true</blink>
  <% } else { %>
    <blink>false</blink>
  <% } %>

I use HTML::Mason to do this; would it be possible to unite forces?
Having one parser for this mechanism could benefit both development
(you) and users (me).

-- 
Reinier Post



Re: Redirects with javascript

2000-10-03 Thread Reinier Post

On Thu, Sep 28, 2000 at 06:07:45PM -0300, Anderson Marcelo wrote:
 Please, 
 
 How make for the "result.html" content the 
 redirect of the "index.html" ??
 
 The page (index.html) content this:
 <script>window.location="/test.shtml"</script>

LWP doesn't include a Javascript engine; you'll need one to get this to work.

-- 
Reinier



Re: HTTP redirects

2000-09-07 Thread Reinier Post

On Thu, Sep 07, 2000 at 04:07:31PM +, Jarrett Carver wrote:
 Is there a way to tell if your request has been redirected? i.e is_redirect? 

  perldoc HTTP::Response

Look at the previous() method.  HTTP return codes are defined in

  http://www.w3.org/Protocols/rfc2068/rfc2068
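
A sketch ($ua and $req as usual; each response in a redirect chain keeps
a link to the response that preceded it):

  my $res = $ua->request($req);
  if ($res->previous) {
      print "Was redirected; first hop gave ", $res->previous->code, "\n";
      print "Final URL: ", $res->request->uri, "\n";
  }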

-- 
Reinier



Re: how to convert

2000-08-11 Thread Reinier Post

 i am using,
  
 perl -e s/\\r/\\$/g filename, it's not working. please suggest the me.

You're not asking for this, but please be aware what you're doing:

 - your question is totally unrelated to LWP

 - programming is not a trick, but a profession. complain to your boss
   if you're doing this for work

 - learn Perl! there are some good books and sites for beginners

 - you could have used the Web to solve this, e.g. it took me 10 seconds
   to find

 http://userpage.chemie.fu-berlin.de/~winkelma/Misc/Perl/Scripts/Dos-Unix/

 - try

  perl -0pe 's/\r\n/\n/g' filename > converted

-- 
Reinier Post



WWW::Robot crashing upon encountering file:// URLs

2000-08-06 Thread Reinier Post

  Hello list,

WWW::Robot spiders crash on file:// URLs, due to the following problem
in LWP::RobotUA:

% perl -MLWP::RobotUA -e 'my $ag = new LWP::RobotUA("bugexposer/0.1",
  "reinpost\@win.tue.nl"); $ag->delay(0); printf "%s\n",
  $ag->request(new HTTP::Request(GET,"file://localhost/home/rp/.cshrc/"))->content'
Can't locate object method "host_port" via package "URI::file" at
/usr/local/lib/perl5/site_perl/5.005/URI/WithBase.pm line 48.

I've just taken my first steps with this software (I'm still trying to
figure out why WWW::Robot will only produce HTML URLs despite what the
documentation suggests), but this is a clear problem that seems to call
for a patch.  A stopgap patch is attached.

-- 
Reinier Post [EMAIL PROTECTED]


--- RobotUA.pm.orig Sun Aug  6 18:08:56 2000
+++ RobotUA.pm  Sun Aug  6 15:56:47 2000
@@ -231,7 +231,8 @@
  HTTP::Status::RC_FORBIDDEN, 'Forbidden by robots.txt';
 }
 
-    my $netloc = $request->url->host_port;
+    my $ru = $request->url;
+    my $netloc = $ru->can('host_port') ? $ru->host_port : $ru->host;
     my $wait = $self->host_wait($netloc);
 
 if ($wait) {



Re: last_modified problem, help needed

2000-08-03 Thread Reinier Post

 but, if it's a ".php" file, $res->last_modified get 
 nothing, what's wrong?

A .php URL typically points to a PHP script that generates its output on
the fly; therefore, a Last-Modified header would be quite useless.

-- 
Reinier Post



Re: Newbie to the list

2000-08-01 Thread Reinier Post

On Mon, Jul 31, 2000 at 11:57:00PM -0700, [EMAIL PROTECTED] wrote:
 [...]   I have written one spider using 
 the sockets library, but its not as robust as I would like. So now I'm 
 exploring the LWP and HTTP modules as a way of bringing our 
 spider upto date.

I happen to be looking at the WWW::Robot CPAN module right now, and my
questions are similar.  Can you say how your spider compares to
WWW::Robot?  Is it more advanced?

-- 
Reinier Post



Re: HTML::Entities module

2000-06-22 Thread Reinier Post

On Thu, Jun 22, 2000 at 09:52:18AM +, marc-andre sauve wrote:
 Hi,
 
 Looking for HTML::Entities perl module

% perl -e shell -MCPAN
cpan> install HTML::Entities
cpan> quit
% perldoc HTML::Entities

-- 
Reinier



Re: Problem deleting nodes with HTML::Element

2000-05-18 Thread Reinier Post

You are deleting nodes from the tree while traverse-ing it.

 From running this code on this sample I still have the Fifth and
 Seventh <p>s in there.

The documentation specifically warns against this.  Last month I posted
to this list a patch to make it possible (all it took was a one-line
change in traverse) but the maintainer didn't accept it, and also
completely reimplemented traverse; I haven't studied whether it would be
as easy to change in the new implementation.

So mark or collect the nodes for deletion, and delete them in a second pass.
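
A sketch of the two-pass approach (the deletion criterion is a
placeholder, and plain-string text nodes are skipped by the ref check):

  my @doomed;
  $tree->traverse(sub {
      my ($node, $start) = @_;
      push @doomed, $node
          if $start && ref $node && $node->tag eq 'p';  # placeholder criterion
      return 1;    # keep traversing
  });
  $_->delete foreach @doomed;   # second pass: safe now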

-- 
Reinier Post



Re: Help!

2000-05-04 Thread Reinier Post

 Question:
 Is there any way I can set the absolute url of the response to
 http://165.21.42.93/7000OneNumber so that the redirection is successful.

Yes, use the abs() method, described in

  perldoc URI
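
A sketch with the address from the question:

  use URI;

  my $abs = URI->new('/7000OneNumber')->abs('http://165.21.42.93/');
  # $abs now stringifies to http://165.21.42.93/7000OneNumber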

-- 
Reinier



Re: FRAME support in HTML::TreeBuilder

2000-04-30 Thread Reinier Post

 How easy would it be to change HTML::TreeBuilder to preserve
 the structure of framed pages?
 
 Not terribly, but I'll give it a try.  In the meantime, yes, try moving
 things around to repair the tree.  Presumably it's just a matter of finding
 the body, finding the frameset under that, moving it up to be body's
 sister, and then demoting body to... be inside the noframe element inside
 the frameset (or make one if none there?).

OK thanks, I'm trying that and it seems to work, but I haven't tested
very well.

-- 
Reinier



Re: patch for HTML::Parser 3.06 to fix declaration commenthandling

2000-03-09 Thread Reinier Post

On Wed, Mar 08, 2000 at 10:21:17PM -0500, la mouton wrote:
 this is what I experienced also.  Comments like "<! row1 -->" get treated
 like comments by browsers and HTML::Parser should behave the same way.

In other words, HTML::Parser should parse not HTML, but what some browsers
think HTML is.

-- 
Reinier



Re: MULTI FORM submission

2000-01-24 Thread Reinier Post

On Mon, Jan 24, 2000 at 01:11:46PM +1100, Shao Zhang wrote:
 Hi,
   I have sent to [EMAIL PROTECTED] I thought this list is for
   discussions of using perl modules to interact with the web.
   Am I wrong?

Yes and no.  This list is for discussions of a specific Perl module: LWP.

-- 
Reinier Post