reading HTML input-files

2002-03-07 Thread Mathias Kratzer


Dear Wget-team,

the NEWS file coming with Wget 1.7 says:

 ** The HTML parser has been rewritten.  The new one works more
 reliably, allows finer-grained control over which tags and attributes
 are detected, and has better support for some features like correctly
 skipping comments and declarations, decoding entities, etc.  It is 
 also more general.

With Wget 1.5.2, calling

  wget -F -O 69_4_522_Ref.res -i 69_4_522_Ref.mrq

on the attached file 69_4_522_Ref.mrq worked very well. With Wget 1.7,
however, the same command always fails with the error message

  No URLs found in 69_4_522_Ref.mrq

Even wrapping the content of 69_4_522_Ref.mrq in an HTML 4 skeleton
(i.e. DOCTYPE header, html, head and body tags) did not help.

Can you tell me what I am doing wrong?

Thanks in advance,
Mathias

--
Dr. Mathias Kratzer   | I_nstitute for 
E-Mail: [EMAIL PROTECTED] | E_xperimental  
Phone : +49-201-183-7680  | M_athematics Ellernstr. 29 
Visit : IEM, Room 205 |  D-45326 ESSEN




reading HTML input-files (WITH ATTACHMNT!)

2002-03-07 Thread Mathias Kratzer


Dear Wget-team,

sorry for forgetting about the attachment to my first mail:


-- original message --- original message --
Date: Thu, 7 Mar 2002 17:41:53 +0100 (MEZ)
From: Mathias Kratzer [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: reading HTML input-files




a href=http://www.ams.org/batchmrlookup?api=xrefqdata=|London Math. Soc. Lecture 
Note Ser.|ELIASHBERG, Y.|151||45|1991Filling by holomorphic discs and its 
applications/a
a href=http://www.ams.org/batchmrlookup?api=xrefqdata=|Ann. Inst. Fourier 
(Grenoble)|ELIASHBERG, Y.|42||165|1992Contact 3-manifolds twenty years since J. 
Martinet's work/a
a href=http://www.ams.org/batchmrlookup?api=xrefqdata=||GIROUX, E.|B|||/a
a href=http://www.ams.org/batchmrlookup?api=xrefqdata=|Invent. Math.|GROMOV, 
M.|82||307|1985Pseudo -holomorphic curves in symplectic manifolds/a
a href=http://www.ams.org/batchmrlookup?api=xrefqdata=||GROMOV, M.|B|||/a
a href=http://www.ams.org/batchmrlookup?api=xrefqdata=|Ann. of Math. (2)|HIRSCH, 
M.|73||566|1961On imbedding differential manifolds into Euclidean space/a
a href=http://www.ams.org/batchmrlookup?api=xrefqdata=||LUTTINGER, K.|B|||/a
a href=http://www.ams.org/batchmrlookup?api=xrefqdata=|Astérisque|LAUDENBACH, 
F.|12|||1974Topologie de la dimension trois: homotopie et isotopie/a
a href=http://www.ams.org/batchmrlookup?api=xrefqdata=|J. Amer. Math. Soc.|McDuFF, 
D.|3||679|1990The structure of rational and ruled symplectic 4-manifolds/a
a href=http://www.ams.org/batchmrlookup?api=xrefqdata=||POLTEROVICH, 
L.|B|||/a
a href=http://www.ams.org/batchmrlookup?api=xrefqdata=|Math. Notes|POLTEROVICH, 
L.|45||152|1989Strongly optical Lagrange manifolds/a
a href=http://www.ams.org/batchmrlookup?api=xrefqdata=||SIKORAV, J.|B|||/a



Re: reading HTML input-files (WITH ATTACHMNT!)

2002-03-07 Thread Ian Abbott

On 7 Mar 2002 at 17:50, Mathias Kratzer wrote:

> With Wget 1.5.2, calling
>
>   wget -F -O 69_4_522_Ref.res -i 69_4_522_Ref.mrq
>
> on the attached file 69_4_522_Ref.mrq worked very well. With Wget 1.7,
> however, the same command always fails with the error message
>
>   No URLs found in 69_4_522_Ref.mrq
>
> Even wrapping the content of 69_4_522_Ref.mrq in an HTML 4 skeleton
> (i.e. DOCTYPE header, html, head and body tags) did not help.
>
> Can you tell me what I am doing wrong?

The file 69_4_522_Ref.mrq contains several lines of the form:

  <a href="url"/a>

which looks pretty invalid to me. Perhaps you need to change them
to:

  <a href="url"/> (XML format)

or:

  <a href="url"></a>  (SGML format)
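
One way to check an input file before handing it to wget -F -i is to
run it through any tolerant HTML parser and see which href values come
out. The following is only a rough sketch using Python's standard
html.parser module; it is not Wget's parser and may accept or reject
different input. The names HrefCollector and extract_hrefs and the
example.com URLs are made up for illustration.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect the href attribute of every anchor tag that the parser sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing tags such as <a href="..."/>,
        # because the default handle_startendtag() calls this method.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_hrefs(text):
    """Return the list of anchor hrefs found in an HTML fragment."""
    collector = HrefCollector()
    collector.feed(text)
    collector.close()
    return collector.hrefs


if __name__ == "__main__":
    # Hypothetical sample input using the two well-formed anchor styles.
    sample = (
        '<a href="http://example.com/first"></a>\n'
        '<a href="http://example.com/second"/>\n'
    )
    for url in extract_hrefs(sample):
        print(url)
```

If a line of the real input file yields no URL here, it is worth
checking how its anchor tag is opened and closed before blaming Wget.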




bug with .html?

2002-03-07 Thread Picot Chappell

I'm using --html-extension to append the .html extension to downloaded
files. The debug log is below. I'm not getting the expected result, and
I'm hoping someone can determine the problem. For testing purposes, I've
got a CGI script that generates the HTML for a page. The server that the
CGI runs on has the MIME type set to text/html.

As an aside, the page requisites aren't loaded either.  Could this be
because the file doesn't initially have the html extension, so wget
doesn't go back and grab all images, etc?

Thanks,
Picot

*
The wget call looks like this:

./wget  -da extlog --html-extension --convert-links --page-requisites
--tries=3 --timeout=60 --ignore-length  --no-http-keep-alive
--cookies=off -i ID_11

**
The output from my debug log looks like this:

DEBUG output created by Wget 1.8.1 on solaris2.8.

Loaded ID_11 (size 28).
Enqueuing http://host:port/PET.cgi at depth 0
Queue count 1, maxcount 1.
Dequeuing http://host:port/PET.cgi at depth 0
Queue count 0, maxcount 1.
Caching host = ip.address
Created socket 18.
Releasing 121ee8 (new refcount 1).
---request begin---
GET /PET.cgi HTTP/1.0
User-Agent: Wget/1.8.1
Host: host:port
Accept: */*

---request end---
HTTP/1.1 200 OK
Server: Netscape-Enterprise/3.5.1
Date: Thu, 07 Mar 2002 11:38:10 GMT
 content-type: text/html; charset=ISO-8859-1
Connection: close

Closing fd 18
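
For comparison, here is a rough sketch of the renaming rule as the Wget
manual describes it for --html-extension: when the response is of type
text/html and the name does not already end in .htm or .html, ".html"
is appended. This is an illustration in Python only, not Wget's code,
and the helper names media_type and local_filename are hypothetical.

```python
import re

# Suffix test quoted in the Wget manual for --html-extension.
HTML_SUFFIX = re.compile(r"\.[Hh][Tt][Mm][Ll]?$")


def media_type(content_type_header):
    """Return the bare media type, without parameters such as charset."""
    return content_type_header.split(";", 1)[0].strip().lower()


def local_filename(name, content_type_header):
    """Append .html when the document is HTML but the name does not say so."""
    is_html = media_type(content_type_header) == "text/html"
    if is_html and not HTML_SUFFIX.search(name):
        return name + ".html"
    return name


if __name__ == "__main__":
    # Header value taken from the debug log above.
    print(local_filename("PET.cgi", "text/html; charset=ISO-8859-1"))
```

Under that reading, a response like the one logged above would be
expected to end up in a file named PET.cgi.html.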

***
The page that's generated looks like this:

<html>
<head>
 <title>PET</title>
</head>

<!-- This is the linked style sheet -->
<link rel=stylesheet href="http://host:port/style.css"
TYPE="text/css">

<!-- Begin HTML Body -->
<body class=White>
:
:
Stuff
:
:
</body>
</html>


