On Fri, Jul 31, 2009 at 10:58, <98447...@student.ucc.ie> wrote:
snip
> [HMU09450], [1084175], [1085500], [ c], [putative thiophene
snip

This shows that your data is not quite what you think it is.  This
means one (or more) of the following things:

1. the program that is generating the file has a bug
2. the leading space in [ c] means something
3. you don't understand the file format

If the problem is 1, then you need to contact the author(s) of that
program and get a bug fix, but you also need to decide how to handle
this sort of bug in the future.  One solution is to check the contents
of the field for expected values and die or warn when the value is not
what you expect:

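while (<$fh>) {
       chomp;  # drop the newline so $Product does not carry it
       my ($Gene, $Start, $Stop, $Strand, $Product) = split /\t/;
       if ($Strand eq "c") {
               # "c" marks the complement strand
               print "$Gene\t-\t$Start\t$Stop\t$Product\n";
       } elsif ($Strand eq "") {
               # an empty field marks the forward strand
               print "$Gene\t+\t$Start\t$Stop\t$Product\n";
       } else {
               # no trailing newline, so die reports the input line number
               die "Strand value [$Strand] is not valid";
       }
}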

If the problem is 2, then you need to compare against " c" or possibly
use a regex.
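
A minimal sketch of the regex approach, assuming that "c" padded with
whitespace should still count as the complement strand and that an
empty or all-whitespace field means the forward strand.  The name
strand_symbol is made up for this example:

```perl
use strict;
use warnings;

# Hypothetical helper: map a raw Strand field to "+" or "-".
# Assumption: optional whitespace around "c" is not significant.
sub strand_symbol {
    my ($strand) = @_;
    return "-" if $strand =~ /^\s*c\s*$/;   # "c", " c", "c ", ...
    return "+" if $strand =~ /^\s*$/;       # empty or only whitespace
    die "Strand value [$strand] is not valid\n";
}

print strand_symbol(" c"), "\n";   # padded "c"
print strand_symbol(""), "\n";     # empty field
```

If the space really does carry meaning, a plain comparison such as
$Strand eq " c" is the safer choice, since the regex deliberately
ignores it.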

If the problem is 3, then you may need to change how you are parsing
the record.  For instance, if the field separator is a run of spaces and
tabs containing at least one tab (i.e. spaces on either side of the tab
are part of the field separator, but spaces inside a field are not to
be touched), then you can say

my ($Gene, $Start, $Stop, $Strand, $Product) = split /[\t ]*\t[\t ]*/;
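
Here is a small sketch of how that split behaves.  The sample record is
made up (only the first four values come from the quoted output), and
split_record is a hypothetical helper name:

```perl
use strict;
use warnings;

# Hypothetical helper: split one record, treating any spaces and tabs
# around a tab as part of the field separator.
sub split_record {
    my ($line) = @_;
    chomp $line;
    return split /[\t ]*\t[\t ]*/, $line;
}

# Made-up record: a space before "c" and a space inside the last field.
my @fields = split_record("HMU09450\t1084175\t1085500\t c\tsome product\n");
print join("|", map { "[$_]" } @fields), "\n";
# prints [HMU09450]|[1084175]|[1085500]|[c]|[some product]
```

One thing to watch: because the pattern also swallows neighbouring tabs,
two tabs in a row are treated as a single separator and the empty field
between them disappears.  If an empty Strand field is how the file marks
the forward strand, a pattern such as / *\t */ strips the padding but
keeps empty fields.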

-- 
Chas. Owens
wonkden.net
The most important skill a programmer can have is the ability to read.
