Re: [PDXLUG] Perl question (open oddities)

2004-03-03 Thread Jonathan Daugherty
# @pages is an array holding the filenames, read in from another file.
# Here's the suspect code: (with line numbers).
# 
# 42: print "|$pages[0]|\n";
# 43: open(PAGE, "<", "http;^^dri.cornell.edu^pub^People^davis.html")
#         || die("can't open $pages[0]: $!");
# 44: close(PAGE);
# 45: open( PAGE, "<", $pages[0]) || die("can't open $pages[0]: $!");
# 46: close(PAGE);

My initial gut reaction to this (despite the fact that it seems to be
correct code) is that any filename with semicolons and circumflexes is
probably a risky filename to begin with. :)

# This seems to be somehow linked to the way I am generating the list,
# but I'm not sure how. (I added the |'s above to check for whitespace).

How are you generating the list, and what leads you to believe that
such generation would have this kind of effect?  I'd recommend
something like this:

  open FILE, "/path/to/file" or die "Can't open file.";
  my @pages = <FILE>;
  chomp @pages;
  close FILE;
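
Here is a sketch of the same read with a lexical filehandle and
three-argument open, so that odd characters in a filename are never
taken as part of the open mode. The helper name read_list and the
trailing-whitespace strip are assumptions for illustration, not
something from your script. The strip is there only because a stray
space or carriage return at the end of a line is one way a name read
from a file can differ from one typed by hand.

```perl
use strict;
use warnings;

# Hypothetical helper: read one filename per line from $path.
sub read_list {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Can't open $path: $!";
    my @pages = <$fh>;
    close($fh);
    chomp @pages;
    s/\s+\z// for @pages;   # assumption: drop trailing spaces, tabs, or \r
    return @pages;
}
```

Usage would be something like: my @pages = read_list('/path/to/file');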

-- 

  Jonathan Daugherty
  http://www.cprogrammer.org

  It's a book about a Spanish guy called Manual, you should read it.
-- Dilbert
___
PDXLUG mailing list
[EMAIL PROTECTED]
http://pdxlug.org/mailman/listinfo/pdxlug


Re: [PDXLUG] Perl question (open oddities)

2004-03-03 Thread E. Rogan Creswick
Jonathan Daugherty writes:
  # 42: print "|$pages[0]|\n";
  # 43: open(PAGE, "<", "http;^^dri.cornell.edu^pub^People^davis.html")
  #         || die("can't open $pages[0]: $!");
  # 44: close(PAGE);
  # 45: open( PAGE, "<", $pages[0]) || die("can't open $pages[0]: $!");
  # 46: close(PAGE);
  
  I put this in a perl script, touched the file
  http;^^dri.cornell.edu^pub^People^davis.html, and ran the script.  I
  did not get the error you got, so I suspect that the issue lies in
  another part of your code that we are not seeing.

Here is the rest of the code (this all happens before the snippet
above). cornell-staff.list is a list of filenames like the one
above. If I fill the array manually, e.g. $pages[0]= ..;
$pages[1]=... ... it works fine.

Thanks,
Rogan

my @pages;
my @links;

open( INPUTLIST, "cornell-staff.list") || die "Can't open cornell-staff.list: $!";

foreach(<INPUTLIST>){
    chomp;
    my $filename = $_;
    s|\^|/|g;   # convert ^'s to /'s
    s|http\;//||;   # remove the http;^^
    push(@pages, $filename);
    push(@links, $_);
}
close( INPUTLIST );

  (Just a follow-up to say that I can't reproduce the error with the
  supplied code.)
  


Re: [PDXLUG] Perl question (open oddities)

2004-03-03 Thread Jonathan Daugherty
# Here is the rest of the code (this all happens before the snippet
# above). cornell-staff.list is a list of filenames like the one
# above. If I fill the array manually, e.g. $pages[0]= ..;
# $pages[1]=... ... it works fine.

Thanks for posting the code.  Precisely what are you trying to do with
this script?  That is, I wonder about the s|\^|/|g and s|http\;//||;,
and so on.  What kind of input are we dealing with, and what is your
plan for the processed result which you are placing in @pages and
@links?

# open( INPUTLIST, "cornell-staff.list") || die "Can't open cornell-staff.list: $!";
# 
# foreach(<INPUTLIST>){
#     chomp;
#     my $filename = $_;
#     s|\^|/|g; # convert ^'s to /'s
#     s|http\;//||; # remove the http;^^
#     push(@pages, $filename);
#     push(@links, $_);
# }
# close( INPUTLIST );

I would recommend rewriting the s/// expressions as

  s|http\;\^\^||; # remove the http;^^
  s|\^|/|g; # convert ^'s to /'s

The fact that the second expression removes http;^^ by removing
http;// is a little unclear at first glance, since you the reader
must remember that the string in question is no longer filled with
circumflexes.  For your own benefit, you might change it so in 6
months you can figure out what you were trying to express.  Just an
idea.
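
To make the order of operations concrete, here is a minimal sketch
that applies the two substitutions to one sample line. The function
name split_entry is made up for illustration, and the input format is
assumed to be what your loop reads from cornell-staff.list.

```perl
use strict;
use warnings;

# Hypothetical helper: turn one list entry into (filename, link).
sub split_entry {
    my ($line) = @_;
    chomp $line;
    my $filename = $line;      # the on-disk name keeps its ^'s
    my $link = $line;
    $link =~ s|http\;\^\^||;   # remove the http;^^
    $link =~ s|\^|/|g;         # convert ^'s to /'s
    return ($filename, $link);
}

my ($file, $link) = split_entry("http;^^dri.cornell.edu^pub^People^davis.html\n");
print "$file\n$link\n";
```

With the sample line, $file stays as the original name and $link
becomes dri.cornell.edu/pub/People/davis.html.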

The impression I've gotten so far is that you've got a file with lines
like

  http;^^dri.cornell.edu^pub^People^davis.html

and you want to end up with

  @pages = ("dri.cornell.edu/pub/People/davis.html", ...);

The lack of context here is making this hard to figure out.  Are you
trying to map these URLs to files in a filesystem?  Or are you
planning on automating the downloading and parsing of the HTML located
at said URLs?

-- 

  Jonathan Daugherty
  http://www.cprogrammer.org

  It's a book about a Spanish guy called Manual, you should read it.
-- Dilbert


Re: [PDXLUG] Perl question (open oddities)

2004-03-03 Thread Jonathan Daugherty
# http;// is a little unclear at first glance, since you the reader

s/you//;

:)

-- 

  Jonathan Daugherty
  http://www.cprogrammer.org

  It's a book about a Spanish guy called Manual, you should read it.
-- Dilbert