Just for fun, a functional but BAD example....

  use FileHandle;
  @_ = map { new FileHandle $_ } @ARGV;
  print while $_ = join "\t", map(scalar<$_>||'',@_), "\n"
          and s/\n(.)/\t$1/g;

Okay, that's ugly and almost completely unmaintainable for LOTS of
reasons, but it works. It will even take an arbitrary number of files,
though I'm not addressing the system's limit on open files here.

First, FileHandle makes it easy to store anonymous filehandles, so you
can easily make an array of them, as I did in @_.
FileHandle is good -- it's your friend. Use it. That line isn't evil.
:)
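
A minimal sketch of the same array-of-filehandles idea with each open checked for failure. The helper name open_all() is made up for illustration and is not part of FileHandle; the explicit 'r' mode is also an addition here.

```perl
use strict;
use warnings;
use FileHandle;

# open_all() is a hypothetical helper, not something FileHandle provides.
sub open_all {
    my @names = @_;
    my @fh;
    for my $name (@names) {
        # FileHandle->new returns undef when the open fails.
        my $fh = FileHandle->new($name, 'r')
            or die "Can't open '$name': $!\n";
        push @fh, $fh;
    }
    return @fh;
}

my @fh = open_all(@ARGV);
printf "opened %d file(s)\n", scalar @fh;
```

Without a check like this, a mistyped filename quietly puts undef into the array and the trouble only shows up later, at the first read.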

Next, @_ is always there, as are $_ and even %_. That doesn't mean it's
a good idea to use it carelessly. It makes this code harder to read --
try rewriting it with an explicitly declared array, like this:

  my @fh = map { new FileHandle $_ } @ARGV;

Now, the *last* line is nutz. If you can read this at first pass, good
for you. I wrote it, and I have trouble. :)

  print while $_ = join "\t", map(scalar<$_>||'',@fh), "\n"
          and s/\n(.)/\t$1/g;

What's it doing? Let's break it down.

  print while .....

Okay, that's an implicit print of $_ in a loop. So long as the rest
evaluates to true, it will print the default variable. Obviously (I
hope), the loop must be setting $_ -- and sure enough, that's what the
next bit says!

  print while $_ = join "\t",

Okay! $_ is being set with a join, which takes a list and stacks the
values into a scalar. Here, it's tab-delimited. So what's in the list?

  map(scalar<$_>||'',@fh), "\n"

Okay, the "\n" on the end makes sense, if it's okay to have the final
newline always preceded with a tab. I actually like that in most of my
output. But what's that map() doing?

It's iterating over the filehandles we put into @fh, and for each,
reading one line with <$_> and, if that fails, returning an empty string
to pacify join. (This will run without complaint under the strict pragma
with warnings on, so don't think those two things guarantee good code!
Still, they darned well help. Use 'em both, as a habit!)

The explicit scalar puts <$_> into scalar context, so it only reads one
line (the || would do the same on its own). But what about the poor
shmuck who comes behind you in six months trying to read your code? Is
he going to know that? Are you? :)
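
To make the context point concrete, here is a small sketch. It reads from an in-memory filehandle (an assumption made so that it runs without any files on disk); the variable names are illustrative only.

```perl
use strict;
use warnings;

my $data = "one\ntwo\nthree\n";

# In-memory filehandle, so the sketch needs no files on disk.
open my $fh, '<', \$data or die "Can't open in-memory file: $!";

my $first = <$fh>;          # scalar context: exactly one line
my @rest  = <$fh>;          # list context: every remaining line
my $past  = <$fh> || '';    # at end of file <$fh> is undef, so '' is used
close $fh;

print "first: $first";
print "rest has ", scalar(@rest), " lines\n";
print "past end is empty\n" if $past eq '';
```

The same <$fh> does three different things depending on where it sits, which is exactly why the one-liner is hard to read.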

So now we have 

  print while $_ = join "\t", map(scalar<$_>||'',@fh), "\n"

which stacks the contents of each file's read into one line....doesn't
it? Well, no, because they still have newlines on them. We didn't chomp
them. We couldn't use chomp() in the map() because it returns the
number of characters removed. The next guy is likely to add it if he
doesn't remember that, and the whole thing breaks.
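
A short sketch of that chomp() trap, using a couple of made-up input lines. The second map shows one possible workaround (chomp a copy, then return the copy); it is not the approach the one-liner takes.

```perl
use strict;
use warnings;

my @lines = ("alpha\n", "beta\n");

# Tempting but wrong: the block's value is chomp's return value,
# the count of characters removed, not the chomped line.
my @counts = map { my $l = $_; chomp $l } @lines;

# One way around it: chomp a copy, then return the copy.
my @clean = map { my $l = $_; chomp $l; $l } @lines;

print "counts: @counts\n";
print "clean:  @clean\n";
```

Working on a copy also matters because map aliases $_ to the original elements, so a bare chomp inside the block would modify @lines itself.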

What we did instead was a less-than-terrifically-efficient
substitution:

  print while $_ = join "\t", map(scalar<$_>||'',@fh), "\n"
          and s/\n(.)/\t$1/g;

So, now that the join has loaded $_ with the concatenated lines from
the input files, the s/// searches for newlines (remembering the
following character) and replaces them with tabs (and the following
character). That's done before the while finishes evaluating, because
of the "and" operator. Once every file is exhausted, no newline is
followed by another character, so the s/// returns false and the loop
ends. Pretty slick, eh?
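
Here is a sketch of what the join and the s/// do to a single row, using two made-up reads in place of real filehandles. Note a side effect it exposes: the character captured after each newline is the tab that join inserted, so the fields come out separated by two tabs.

```perl
use strict;
use warnings;

# Two lines as they would come back from two filehandles.
my @reads = ("a1\n", "b1\n");

my $row  = join "\t", @reads, "\n";      # "a1\n\tb1\n\t\n"
my $hits = ($row =~ s/\n(.)/\t$1/g);     # number of replacements made

# While data remains, $hits is true and $row is "a1\t\tb1\t\t\n".
(my $shown = $row) =~ s/\t/<TAB>/g;
print "hits=$hits row=$shown";

# Once every read fails, the map supplies '' for each file.
my $done = join "\t", '', '', "\n";      # "\t\t\n"
my $more = ($done =~ s/\n(.)/\t$1/g);    # no match, so false
print "loop ends\n" unless $more;
```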

Maybe, but you're making assumptions about your data. More importantly,
"slick" is almost always a euphemism for "hard to read". You'll regret
it when you come back to modify this.

Mine works, but it took me a while to work the kinks out. I could have
written the whole thing faster with more easily maintainable code.

Let's see:

  use strict;
  use warnings;
  use FileHandle;
  my @fh  = map { new FileHandle $_ } @ARGV;
  # assume files have equal numbers of records
  my $driver = pop @fh;              # last entry drives
  while ( my $d_rec = <$driver> ) {  
     for my $file (@fh) {
         my $rec = <$file>;
         chomp $rec;
         print "$rec\t";
     }
     print $d_rec;                   # includes newline!
  }

Now, we're still using map(), but in a reasonably clean and readable
way. We pop off one of the filehandles (the last) to use as a driver,
making the (documented) assumption that the files have equal numbers of
records. Then we go into a basic loop reading from that driver file,
and while it has lines, we loop over all the other filehandles. In that
loop, we read records into a working space, remove the newlines, and
then print them with trailing tabs. After we've done all of those, we
print the still un-chomp()'d line from the driver file, which provides
a newline.

Is there still room for improvement? Oh, you betcha. LOTS, in many
areas. But you can read it now, practically at first glance. There are
still some gotchas -- would you have noticed that the driver still had
the newline? Maybe not, without the comment.... maybe it would be
better to chomp it and manually add it back on, but then that's
unnecessary work. Comments are your friend.
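
As one possible improvement, here is a sketch that drops the equal-length assumption. It swaps the driver-file technique for a loop that keeps going until every file is exhausted, padding short files with empty fields. The name merge_rows() is hypothetical, and plain open() with lexical filehandles is used instead of FileHandle.

```perl
use strict;
use warnings;

# merge_rows() is a hypothetical name: it takes open filehandles and
# returns one tab-joined row per input line number.
sub merge_rows {
    my @fh = @_;
    my @rows;
    while (1) {
        my $got_data = 0;
        my @fields;
        for my $fh (@fh) {
            my $rec = <$fh>;
            if (defined $rec) {
                chomp $rec;
                $got_data = 1;
            }
            else {
                $rec = '';      # this file ran out early: pad the column
            }
            push @fields, $rec;
        }
        last unless $got_data;  # every file is exhausted
        push @rows, join("\t", @fields) . "\n";
    }
    return @rows;
}

# Open each file named on the command line, dying on failure.
my @fh;
for my $name (@ARGV) {
    open my $fh, '<', $name or die "Can't open '$name': $!\n";
    push @fh, $fh;
}

print merge_rows(@fh);
```

Collecting the rows before printing keeps the sketch easy to test, at the cost of holding all the output in memory; printing inside the loop would suit large files better.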

Ok, enough babbling -- I want to get this out there so better coders
can chop it up and comment and make real, useful suggestions. I only
post it because I got an itch to write some tight code, and thought
someone might be able to learn something from it. ;op

Paul
