Bob Dohse wrote:
> Glenn,
> Why not parse for the elements contained within the brackets?
> Search for the leading bracket ...
> - if followed by a legal element (e.g., "<img src"), then ...
> - replace everything within the quotes (i.e., the folder and file name)
>   with the Arachne assigned name
> - the result would be an acceptable Arachne reference valid for offline
>   use
> Bob ~
Bob,
There is no need to parse the page source, because the pairings are all
included in CACHE.IDX. I finally got a text version of CACHE.IDX by using the
-c option of WWWMAN to create CACHEIDX.HTM and then running HTMSTRIP on it to
get a pure text version, so that I would not have to copy full source page
reference paths into an e-mail message.
The CACHE.IDX shown here is the one for my home page, MY.YAHOO. Its first few
lines are below. (My comments are in all upper case to distinguish them.)
Cache index
Index of Arachne WWW cache
------------------------------------------------------------------------------
-- Cache index filename: cache\cache.idx
------------------------------------------------------------------------------
-- http://my.yahoo.com/
THIS PAGE SOURCE IS CACHED AS P:\CACHE\55075314.HTM:
Sun Jun 08 08:29:03 2003 | 40630 bytes | text/html | P:\CACHE\55075314.HTM
DO A SEARCH IN CACHED PAGE SOURCE FOR:
http://us.i1.yimg.com/us.yimg.com/i/my/top7.gif
AND REPLACE IT WITH P:\CACHE\55075351.GIF:
Sun Jun 08 08:29:07 2003 | 1965 bytes | image/gif | P:\CACHE\55075351.GIF
DO ANOTHER SEARCH IN CACHED PAGE SOURCE FOR:
http://us.i1.yimg.com/us.yimg.com/i/spacer.gif
AND REPLACE IT WITH P:\CACHE\55075362.GIF:
Sun Jun 08 08:29:07 2003 | 43 bytes | image/gif | P:\CACHE\55075362.GIF
ETC., ETC., ETC.
I hope this is much clearer. It makes no difference whether the searched-for
item begins with "http://...." or not. This way the page can be read offline
with its images shown.
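The search-and-replace described above could be scripted roughly as follows.
This is only a sketch in Python, not anything Arachne, WWWMAN, or HTMSTRIP
provides. It assumes the stripped text listing has the shape of the excerpt
quoted above: a reference line, then a details line whose last "|" field is
the cached file. The names parse_index, localize, and bare_names are
hypothetical, and the option to drop the P:\CACHE\ prefix is my own addition
for the case where the page and its images stay in the same directory.

```python
import re

# Sample in the shape of the stripped listing quoted above (format assumed).
SAMPLE_INDEX = r"""
-- http://my.yahoo.com/
Sun Jun 08 08:29:03 2003 | 40630 bytes | text/html | P:\CACHE\55075314.HTM
http://us.i1.yimg.com/us.yimg.com/i/my/top7.gif
Sun Jun 08 08:29:07 2003 | 1965 bytes | image/gif | P:\CACHE\55075351.GIF
http://us.i1.yimg.com/us.yimg.com/i/spacer.gif
Sun Jun 08 08:29:07 2003 | 43 bytes | image/gif | P:\CACHE\55075362.GIF
"""


def parse_index(text):
    """Pair each reference line with the cache file on the line after it."""
    pairs = {}
    previous = None
    for raw in text.splitlines():
        line = raw.strip().lstrip("-").strip()  # drop rule dashes and padding
        if not line:
            continue
        fields = [f.strip() for f in line.split("|")]
        if len(fields) == 4 and fields[1].endswith("bytes") and previous:
            pairs[previous] = fields[3]  # last field is the cached file
            previous = None
        else:
            previous = line  # candidate reference for the next details line
    return pairs


def localize(page_source, pairs, bare_names=False):
    """Replace each indexed reference in the page source with its cache file."""
    # Longest references first, so a reference that is part of a longer one
    # is handled after the longer one.
    for ref in sorted(pairs, key=len, reverse=True):
        target = pairs[ref]
        if bare_names:
            target = target.rsplit("\\", 1)[-1]  # keep only NNNNNNNN.EXT
        # Match the reference only when it ends at a quote, space, or '>'.
        pattern = re.escape(ref) + r"""(?=["'\s>]|$)"""
        page_source = re.sub(pattern, lambda m, t=target: t, page_source)
    return page_source
```

Calling localize(source, parse_index(listing)) on the cached page source
should then give a copy whose image references point at the cache files. The
end-of-reference check is there so that http://my.yahoo.com/ is not replaced
inside a longer link that merely starts with it.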
It may even be prudent to replace the cached page source name (55075314.HTM)
with a meaningful 8.3 filename so that it is recognizable for offline
reading; in this example, 55075314.HTM would become myyahoo1.htm.
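A small sketch of that renaming step, again in Python and again only an
illustration: the names is_8_3 and copy_with_friendly_name are hypothetical,
and the 8.3 check is deliberately conservative (letters, digits, underscore,
and hyphen only). It copies rather than renames, so the original cache file
stays untouched.

```python
import re
import shutil
from pathlib import Path

# Conservative 8.3 check: 1-8 name characters, optional 1-3 character extension.
EIGHT_DOT_THREE = re.compile(r"[A-Za-z0-9_-]{1,8}(\.[A-Za-z0-9_-]{1,3})?")


def is_8_3(name):
    """Return True if name fits the conservative 8.3 pattern above."""
    return EIGHT_DOT_THREE.fullmatch(name) is not None


def copy_with_friendly_name(cached_page, friendly_name):
    """Copy the cached page next to the original under an 8.3 name."""
    if not is_8_3(friendly_name):
        raise ValueError(f"not an 8.3 filename: {friendly_name!r}")
    source = Path(cached_page)
    destination = source.with_name(friendly_name)  # same directory, new name
    shutil.copyfile(source, destination)
    return destination
```

For the example above, the call would be something like
copy_with_friendly_name(r"P:\CACHE\55075314.HTM", "myyahoo1.htm").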
Roger Turk
Tucson, Arizona