Re: reading HTML input-files (WITH ATTACHMNT!)

2002-03-08 Thread Ian Abbott

On 8 Mar 2002 at 10:50, Mathias Kratzer wrote:

> I admit that the lines in my original file contain a really stupid
> syntax error. As an absolute beginner with the Markup Languages I have
> just tried to learn from some hyperlink examples but obviously
> misunderstood their formal structure. Nevertheless, Wget 1.5.2 did
> recognize my URLs!

Well, as you noted, the HTML parser was rewritten for Wget 1.7, so
it is not too surprising that it would behave differently for
erroneous input!

> So does Wget 1.7 after I've changed the lines to SGML format. However,
> I feel obliged to inform you that XML format didn't solve the problem.

Ah yes, the XML (XHTML) form was not supported until Wget 1.8 or
1.8.1 (I can't remember which, and can't be arsed to find out at
the moment!).



reading HTML input-files

2002-03-07 Thread Mathias Kratzer


Dear Wget-team,

the NEWS file coming with Wget 1.7 says:

 ** The HTML parser has been rewritten.  The new one works more
 reliably, allows finer-grained control over which tags and attributes
 are detected, and has better support for some features like correctly
 skipping comments and declarations, decoding entities, etc.  It is 
 also more general.

Calling Wget 1.5.2 with

  wget -F -O 69_4_522_Ref.res -i 69_4_522_Ref.mrq

on the attached file 69_4_522_Ref.mrq worked very well, but I am left
with the error message

  No URLs found in 69_4_522_Ref.mrq

whenever I try the same command using Wget 1.7. Even embedding the
content of 69_4_522_Ref.mrq in an HTML 4 skeleton (i.e. DOCTYPE header
plus html, head and body tags) did not help.

Can you tell me what I am doing wrong?

Thanks in advance,
Mathias

--
Dr. Mathias Kratzer   | I_nstitute for 
E-Mail: [EMAIL PROTECTED] | E_xperimental  
Phone : +49-201-183-7680  | M_athematics Ellernstr. 29 
Visit : IEM, Room 205 |  D-45326 ESSEN




reading HTML input-files (WITH ATTACHMNT!)

2002-03-07 Thread Mathias Kratzer


Dear Wget-team,

sorry for forgetting about the attachment to my first mail:


-- original message --- original message --
Date: Thu, 7 Mar 2002 17:41:53 +0100 (MEZ)
From: Mathias Kratzer [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: reading HTML input-files





a href=http://www.ams.org/batchmrlookup?api=xrefqdata=|London Math. Soc. Lecture 
Note Ser.|ELIASHBERG, Y.|151||45|1991Filling by holomorphic discs and its 
applications/a
a href=http://www.ams.org/batchmrlookup?api=xrefqdata=|Ann. Inst. Fourier 
(Grenoble)|ELIASHBERG, Y.|42||165|1992Contact 3-manifolds twenty years since J. 
Martinet's work/a
a href=http://www.ams.org/batchmrlookup?api=xrefqdata=||GIROUX, E.|B|||/a
a href=http://www.ams.org/batchmrlookup?api=xrefqdata=|Invent. Math.|GROMOV, 
M.|82||307|1985Pseudo -holomorphic curves in symplectic manifolds/a
a href=http://www.ams.org/batchmrlookup?api=xrefqdata=||GROMOV, M.|B|||/a
a href=http://www.ams.org/batchmrlookup?api=xrefqdata=|Ann. of Math. (2)|HIRSCH, 
M.|73||566|1961On imbedding differential manifolds into Euclidean space/a
a href=http://www.ams.org/batchmrlookup?api=xrefqdata=||LUTTINGER, K.|B|||/a
a href=http://www.ams.org/batchmrlookup?api=xrefqdata=|Astérisque|LAUDENBACH, 
F.|12|||1974Topologie de la dimension trois: homotopie et isotopie/a
a href=http://www.ams.org/batchmrlookup?api=xrefqdata=|J. Amer. Math. Soc.|McDuFF, 
D.|3||679|1990The structure of rational and ruled symplectic 4-manifolds/a
a href=http://www.ams.org/batchmrlookup?api=xrefqdata=||POLTEROVICH, 
L.|B|||/a
a href=http://www.ams.org/batchmrlookup?api=xrefqdata=|Math. Notes|POLTEROVICH, 
L.|45||152|1989Strongly optical Lagrange manifolds/a
a href=http://www.ams.org/batchmrlookup?api=xrefqdata=||SIKORAV, J.|B|||/a



Re: reading HTML input-files (WITH ATTACHMNT!)

2002-03-07 Thread Ian Abbott

On 7 Mar 2002 at 17:50, Mathias Kratzer wrote:

> Calling Wget 1.5.2 with
>
>   wget -F -O 69_4_522_Ref.res -i 69_4_522_Ref.mrq
>
> on the attached file 69_4_522_Ref.mrq worked very well, but I am left
> with the error message
>
>   No URLs found in 69_4_522_Ref.mrq
>
> whenever I try the same command using Wget 1.7. Even embedding the
> content of 69_4_522_Ref.mrq in an HTML 4 skeleton (i.e. DOCTYPE header
> plus html, head and body tags) did not help.
>
> Can you tell me what I am doing wrong?

The file 69_4_522_Ref.mrq contains several lines of the form:

  <a href="url"/a>

which looks pretty invalid to me. Perhaps you need to change them
to:

  <a href="url"/> (XML format)

or:

  <a href="url"></a> (SGML format)
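
As a rough illustration of the SGML-style fix, here is a minimal shell sketch. The file name links.html and the example.com URLs are made up for this example, and the grep pattern is only a cheap sanity check that I am assuming is good enough for one-anchor-per-line files; it is not an HTML validator.

```shell
#!/bin/sh
# Hypothetical input file and URLs, for illustration only.
input=links.html

# Two anchors in the SGML (HTML 4) form, plus one deliberately
# malformed line so the check below has something to reject.
cat > "$input" <<'EOF'
<a href="http://example.com/one"></a>
<a href="http://example.com/two"></a>
<a href="http://example.com/three"/a>
EOF

# Count lines that consist of exactly one anchor with a quoted href
# and a separate closing tag.
wellformed=$(grep -c '^<a href="[^"]*"></a>$' "$input")
total=$(wc -l < "$input" | tr -d ' ')

echo "$wellformed of $total lines are well-formed anchors"
# prints: 2 of 3 lines are well-formed anchors

# Once every line passes, the file can be given to Wget as HTML input,
# using the same options as in the original command:
#   wget -F -O out.res -i "$input"
```

The wget call is left as a comment so the sketch runs without network access; out.res is likewise just a placeholder name.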