# Here is the rest of the code (this all happens before the snippet
# above). "cornell-staff.list" is a list of filenames like the one
# above.   If I fill the array manually, eg: $pages[0]= "..";
# $pages[1]="..." ... it works fine.

Thanks for posting the code.  What exactly are you trying to do with
this script?  In particular, I wonder about the s|\^|/|g and
s|http\;//|| substitutions.  What kind of input are we dealing with,
and what do you plan to do with the processed results you are placing
in @pages and @links?

# open( INPUTLIST, "cornell-staff.list") || die "Can't open cornell-staff.list: $!";
# 
# foreach(<INPUTLIST>){ 
#     chomp;
#     my $filename = $_;
#     s|\^|/|g;         # convert ^'s to /'s
#     s|http\;//||;     # remove the http;^^
#     push(@pages, $filename);
#     push(@links, $_);
# }
# close( INPUTLIST );

I would recommend rewriting the s/// expressions as

  s|http\;\^\^||; # remove the http;^^
  s|\^|/|g; # convert ^'s to /'s

In your version, the second expression removes "http;^^" by matching
"http;//", which is a little unclear at first glance: the reader must
remember that by that point the circumflexes in the string have
already been converted to slashes.  For your own benefit, you might
change it so that in six months you can still tell what you were
trying to express.  Just an idea.

The impression I've gotten so far is that you've got a file with lines
like

  http;^^dri.cornell.edu^pub^People^davis.html

and you want to end up with

  @pages = ("http;^^dri.cornell.edu^pub^People^davis.html", ...);
  @links = ("dri.cornell.edu/pub/People/davis.html", ...);
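
To check that I am reading the loop correctly, here is a minimal
sketch of it with the two substitutions in the order suggested above.
It is only a guess at your intent: the sample line and the in-memory
string standing in for cornell-staff.list are my assumptions, and the
variable names $list, $in and $link are made up for illustration.  As
in the quoted loop, the untouched name goes into @pages and the
cleaned-up form goes into @links.

```perl
use strict;
use warnings;

# Assumed stand-in for the contents of cornell-staff.list (one name per line).
my $list = "http;^^dri.cornell.edu^pub^People^davis.html\n";

my (@pages, @links);

# Read from the in-memory string; replace \$list with the real file name.
open(my $in, '<', \$list) or die "Can't open list: $!";
while (my $filename = <$in>) {
    chomp $filename;
    my $link = $filename;      # work on a copy, keep the original name
    $link =~ s|http\;\^\^||;   # remove the http;^^
    $link =~ s|\^|/|g;         # convert ^'s to /'s
    push @pages, $filename;
    push @links, $link;
}
close($in);

print "$pages[0]\n$links[0]\n";
```

Working on a copy in $link rather than on $_ keeps the original name
and the converted form clearly separate, which may make the loop
easier to follow later.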

The lack of context here is making this hard to figure out.  Are you
trying to map these URLs to files in a filesystem?  Or are you
planning on automating the downloading and parsing of the HTML located
at said URLs?

-- 

  Jonathan Daugherty
  http://www.cprogrammer.org

  "It's a book about a Spanish guy called Manual, you should read it."
                                                            -- Dilbert
_______________________________________________
PDXLUG mailing list
[EMAIL PROTECTED]
http://pdxlug.org/mailman/listinfo/pdxlug
