Thanks, Jim. This is awesome!
T.

Sent from my iPhone

On 2013-02-11, at 11:49 AM, Jim Gibson <jimsgib...@gmail.com> wrote:

>
> On Feb 10, 2013, at 5:57 PM, Tiago Hori wrote:
>
>> Hi All,
>>
>> I am trying to force myself to not use one of Perl's modules to parse tab
>> delimited files (like Text::CSV), so please be patient and don't tell me
>> just to go and use them. I am trying to reinvent the wheel, so to speak,
>> because as we do with science, we repeat experiments to learn about the
>> process even though we know the outcome.
>
> Coding your own solutions rather than using a module already built for the
> same purpose is perfectly all right, especially if you are learning Perl. If
> you are confident your data has a simple format and will not change, then you
> can parse it yourself. Keep in mind, however, that the Text::CSV module can
> handle more complicated cases. For example, what if your data fields can
> contain the separator character? In that case, your data fields may be
> enclosed in quotes or the embedded separator characters will have to be
> escaped (e.g., preceded by a '\' character or some other means.) The
> Text::CSV module can handle these cases, plus it can read from a file or a
> scalar and deal with broken lines and other complexities. There is also the
> Text::CSV::XS module which includes C code for speed.
>
>>
>> So I started by reading in the files and going one line at a time,
>> putting those lines in arrays and matching a specific line of interest. With
>> join I could then turn the array of interest into a scalar and print that
>> out. That is almost what I wanted (see code below):
>>
>> #! /usr/bin/perl
>> use strict;
>> use warnings;
>>
>> my $filename_data = $ARGV[0];
>> my $filename_target = $ARGV[1];
>> my $line_number = 1;
>> my @targets;
>>
>> open FILE, "<", $filename_data or die $!;
>> open TARGET, "<", $filename_target or die $!;
>
> Lexical file handles are generally better, and it helps to include the file
> name in the error message:
>
>   open(my $file, '<', $filename_data) or
>     die( "Can't open $filename_data for reading: $!");
>
>>
>> while (<TARGET>){
>>     push (@targets, $_);
>> }
>
> You can replace the above with:
>
>   my @targets = <TARGET>;
>
> You can also do this to remove the line ending characters from @targets:
>
>   chomp(@targets);
>
>>
>> close (TARGET);
>>
>> while (<FILE>){
>>     chomp;
>>     my $line = $_;
>
> You can read directly into a scalar, so no need for the $_ variable here:
>
>   while( my $line = <FILE> ) {
>     chomp($line);
>
>>     my @elements = split ("\t", $line);
>>     my $row_name = $elements[0];
>>     if ($line_number == 1){
>>         my $header = join("\t", @elements);
>
> You are splitting $line, then joining it back up in $header. Why not just
>
>   $header = $line;
>
>>         print $header, "\n";
>>         $line_number = 2;}
>>     elsif($line_number = 2){
>
> That should be
>
>   elsif( $line_number == 2 ) {
>
>>         foreach (@targets){
>>             chomp;
>>             my $target = $_;
>>             if ($row_name eq $target){
>>                 my $data = join("\t", @elements);
>>                 print $data,"\n";
>
> Once again, just use $line.
>
>>             }
>>         }
>>     }
>> }
>>
>> close (FILE);
>>
>> Realistically, I don't want the whole row. So I started thinking about how to
>> get specific columns. I started reading on the internet and the idea seems
>> to be placing the arrays containing the lines in a hash indexed by the row
>> names. So I did this:
>
> There are several ways to extract individual columns from a CSV line.
>
> 1. You can split the line into an array and make copies of specific elements:
>
>   my @fields = split("\t",$line);
>   my $name = $fields[0];
>   my $address = $fields[3];
>   my $zip = $fields[7];
>
> 2. You can use an array slice on the array:
>
>   my( $name, $address, $zip ) = @fields[0,3,7];
>
> 3. You can use an array slice on the return list from split:
>
>   my( $name, $address, $zip ) = (split("\t",$line))[0,3,7];
>
> 4. You can split the line into individual variables:
>
>   my( $name, $position, $salary, $address, $street, $city, $country, $zip ) =
>     split("\t",$line);
>
> 5. You can use undefs to ignore columns you don't want:
>
>   my( $name, undef, undef, $address, undef, undef, undef, $zip ) =
>     split("\t",$line);
>
>
>
> --
> To unsubscribe, e-mail: beginners-unsubscr...@perl.org
> For additional commands, e-mail: beginners-h...@perl.org
> http://learn.perl.org/
>
>
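
For reference, here is a minimal sketch of how the suggestions in this thread might fit together: keep the header line, keep only the rows whose first field is a wanted name, and print only selected columns with an array slice. The sub name select_rows, the sample rows, and the column indices are made up for illustration and appear nowhere in the original script. A hash lookup stands in for the nested foreach over @targets, which is one reading of the "hash indexed by the row names" idea and not necessarily what was originally tried.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): given tab-delimited lines,
# a list of wanted row names, and a list of column indices, return the
# header plus every matching row, reduced to the chosen columns.
sub select_rows {
    my ($lines, $targets, $columns) = @_;

    # One hash lookup per row replaces the inner loop over the targets.
    my %wanted = map { $_ => 1 } @$targets;

    my @out;
    my $line_number = 0;
    for my $line (@$lines) {
        chomp(my $copy = $line);
        $line_number++;

        # A limit of -1 keeps trailing empty fields.
        my @fields = split /\t/, $copy, -1;
        next unless @fields;    # skip blank lines

        # Line 1 is treated as the header and always kept.
        next unless $line_number == 1 || $wanted{ $fields[0] };

        # Array slice picks the columns; missing ones become empty strings.
        push @out, join "\t", map { defined $_ ? $_ : '' } @fields[@$columns];
    }
    return @out;
}

# Made-up sample data standing in for the two input files.
my @data = (
    "name\tposition\tsalary\tcity\n",
    "alice\tengineer\t100\tOslo\n",
    "bob\tanalyst\t90\tLima\n",
    "carol\tmanager\t120\tRome\n",
);
my @targets = ("alice", "carol");

# Print the name and city columns for the wanted rows.
print "$_\n" for select_rows(\@data, \@targets, [0, 3]);
```

With real files, the two arrays could be filled by reading from lexical file handles opened as suggested above. This only holds for simple data with no quoted or escaped tabs, which is the case Text::CSV is meant to handle.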