> I'm completely baffled by this and not entirely sure where to start.
>
> I have a plain text  file, testfile.txt, which contains a single line:
>
> Very truly yours,
>
> It is written exactly how you see it above, with a newline at the end.
>
> I'm trying to write a script that will determine the number of words
> in the file.  A snippet of what I have thus far is the following:
>
> my $fh = new IO::File("$lvl2path/$filestng", "r") ||
>       die ("Can't open .txt file named at $lvl2path.  Exiting
> program.\n\n");
> while (my $line = $fh->getline())
> {
>       my @words = split /\s+/, $line;
>       my %count = ();
>       $count{$line} += @words;
>       print "$line";
>       print "The line above has " . scalar @words . " occurrences of
> something.\n";
> }
> $fh->close();
>
> That outputs the following:
>
> V e r y  t r u l y  y o u r s ,
> The line above has 3 occurrences of something.
>
> I understand that split /\s+/ is matching whitespace characters, and
> I'm pleased that it comes back with 3 (two spaces and the newline).
> What I don't understand is why the output has spaces between all the
> letters.  I've looked at this and other .txt files in different
> editors on different OS's; I can't find any hidden characters,
> whitespace or other, anywhere they don't belong.  What's really
> concerning is when I change the above such that:
>
> my @words = split /\w+/, $line;
>
> I get this:
>
> V e r y  t r u l y  y o u r s ,
> The line above has 15 occurrences of something.
>
> Where is this whitespace coming from between the letters??  Is it
> really whitespace (/\s+/ doesn't catch it, but /\w+/ is catching each
> character as if there's whitespace between)??  A good part of my
> dissertation hinges on being able to read thousands of .txt files
> without the extraneous spaces that are being introduced somewhere.
>
> By the way, only some files appear affected, but there's no obvious
> pattern.
>
> Any hints would be wildly appreciated.




Hi,

I think you are making this too complicated. All that is needed is the
script below, which reads the line from its __DATA__ section:




#!/usr/bin/perl

use strict;

while (<DATA>) {
    chomp;
    my @words    = split / /;
    my $nr_words = @words;
    print "Number of words is $nr_words and the words are\n @words\n";
}

__DATA__
Very truly yours,


# perl beg1.pl
Number of words is 3 and the words are
 Very truly yours,

=========================================================
Using a file open statement you would do something like this (untested):


#!/usr/bin/perl

use strict;

# Assumes $lvl2path and $filestng are already declared and set, as in your script.
open (my $FILETEST, "<", "$lvl2path/$filestng")
    or die "can't open $lvl2path/$filestng for reading: $!\n";

while (<$FILETEST>) {
    chomp;
    my @words    = split / /;
    my $nr_words = @words;
    print "Number of words is $nr_words and the words are\n @words\n";
}

But you might want to split on white space to cope with the occasions
when there is more than one space between words.
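
As a rough sketch of that suggestion (the sub name count_words and the
sample strings are made up for illustration, not taken from your script),
split ' ' with a quoted single space is special-cased by Perl: it skips
leading whitespace and splits on runs of whitespace, whereas split / /
splits on every single space:

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Hypothetical helper: count whitespace-separated words in one line.
# split ' ' ignores leading whitespace and treats any run of spaces,
# tabs or newlines as one separator.
sub count_words {
    my ($line) = @_;
    my @words = split ' ', $line;
    return scalar @words;
}

# Leading spaces, repeated spaces, a tab and a trailing newline.
print count_words("  Very   truly\tyours,\n"), "\n";    # prints 3

# For comparison, splitting on a single literal space yields empty
# fields between consecutive spaces.
my @naive = split / /, "Very   truly yours,";
print scalar(@naive), "\n";                             # prints 5
```

With split ' ' the chomp is not strictly needed for counting, since the
trailing newline is just more whitespace.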

You have done something that puts a space between letters when
printing out.
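
To see what is really sitting between the letters, it may help to dump
the raw bytes of a line. The sketch below is only a diagnostic, and the
sub name hex_bytes is invented for illustration. One possibility, which
is a guess on my part and not something your post confirms, is that the
affected files are UTF-16 encoded, in which case each letter would be
followed by a 00 byte that many terminals display as a blank:

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Hypothetical helper: return the bytes of a string as two-digit hex
# values separated by spaces.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02x', ord($_) } split //, $bytes;
}

# "Very" as plain ASCII.
print hex_bytes("Very"), "\n";            # prints 56 65 72 79

# "Very" as it would look in UTF-16LE: a NUL byte after each letter.
print hex_bytes("V\0e\0r\0y\0"), "\n";    # prints 56 00 65 00 72 00 79 00
```

To try it on a real file, open the file with the "<:raw" layer, read a
line and pass it to hex_bytes. If the dump does show 00 bytes, opening
the file with "<:encoding(UTF-16)" instead is one thing to try.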
-- 



Owen



