Wang, Li wrote:
Dear list members

Hello,


I am a complete beginner at Perl programming.

Welcome to the Perl beginners mailing list.


I am trying to write a script to search for all scalars of one array
(geneIDFile) in another file (annotationFile). If one is found and
matched, output the whole line of the annotation file.
My script is as follows. It turns out not to be working, and I cannot
spot the error. Could anyone help me?

#!/usr/bin/perl -w

# This script assigns gene function from specific poplar Gene IDs using Populus trichocarpa annotation
# USAGE:
# unix command line:
# ./assignGOpoplar.pl candidateGenes.name annotationFile.name
# e.g. ./assignGOpoplar.pl top4018tags.xls Ptrichocarpa_156_annotation_info.txt

You say that you want two file names on the command line but your code only uses one of those file names.
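A minimal sketch of taking both file names from the command line, as the usage comment describes. The helper name parse_args is hypothetical, and the file names passed to it here are just the ones from the usage comment; in the real script you would pass @ARGV instead.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the two file names, or dies with a usage message.
sub parse_args {
    my @args = @_;
    die "usage: $0 candidateGenes.name annotationFile.name\n"
        unless @args == 2;
    return @args;
}

# In the real script: my ( $geneIDfile, $annotationFile ) = parse_args(@ARGV);
my ( $geneIDfile, $annotationFile ) =
    parse_args( 'top4018tags.xls', 'Ptrichocarpa_156_annotation_info.txt' );

print "gene IDs from:    $geneIDfile\n";
print "annotations from: $annotationFile\n";
```

Checking the argument count up front means a missing file name is reported as a usage error rather than as a confusing open failure later.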


# the script takes the genes number from the first file and finds the annotation in the second file
# then outputs a third file with the geneID and annotation

You also don't specify an output file in your code.


use strict;
use warnings;

my $geneIDfile = shift @ARGV;
my @geneID=();
my @logFC=();
my @logCPM=();
my @LR=();
my @Pvalue=();
my @FDR=();

my $i=-1;
open (GENEIDFILE, "$geneIDfile") || die "GENEID File not found\n";

You shouldn't quote scalar variables; Perl is not the shell.

perldoc -q quoting

You should probably also include the $! variable in your error message so you know why open failed.

open GENEIDFILE, '<', $geneIDfile or die "Cannot open '$geneIDfile' because: $!";
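To see what including $! buys you, here is a small sketch that deliberately tries to open a file that should not exist. It uses a lexical filehandle (my $fh) rather than the bareword GENEIDFILE, which is a stylistic choice on my part, and both the helper name open_for_reading and the path are made up for the illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: three-argument open that reports why it failed.
sub open_for_reading {
    my ($path) = @_;
    open my $fh, '<', $path
        or die "Cannot open '$path' because: $!\n";
    return $fh;
}

# The path below is invented and is assumed not to exist on your system.
my $fh  = eval { open_for_reading('/no/such/dir/genes.txt') };
my $err = $@;

print "open failed: $err" unless $fh;
```

The exact wording of the system error after "because:" depends on your platform, but it tells you whether the file is missing, unreadable, and so on, which "File not found" cannot.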


      while (<GENEIDFILE>) {
        chomp;
        $i++;
        next if ($i==0);
        ($geneID[$i], $logFC[$i], $logCPM[$i], $LR[$i], $Pvalue[$i], $FDR[$i]) = split(/\t/, $_);

You never use the arrays @logFC, @logCPM, @LR, @Pvalue and @FDR so you don't really need them. Your loop would probably be better as:

while ( <GENEIDFILE> ) {
    next if $. == 1;
    push @geneID, ( split /\t/ )[ 0 ];


      }
close(GENEIDFILE);
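Here is a self-contained sketch of the shorter loop that you can run without your data. The sample lines are invented (I am only assuming a header line followed by tab-separated columns with the gene ID first), and it reads from an in-memory string instead of a real file.

```perl
use strict;
use warnings;

# Made-up sample standing in for the tab-separated gene ID file.
my $sample = join '',
    "geneID\tlogFC\tlogCPM\tLR\tPValue\tFDR\n",
    "POPTR_0001s00200\t2.1\t5.3\t40.2\t1e-9\t1e-6\n",
    "POPTR_0002s00300\t-1.7\t4.8\t30.9\t2e-8\t1e-5\n";

# Opening a reference to a string gives a filehandle that reads from memory.
open my $in, '<', \$sample or die "Cannot open in-memory sample: $!";

my @geneID;
while ( my $line = <$in> ) {
    next if $. == 1;                          # $. is the current line number; skip the header
    chomp $line;
    push @geneID, ( split /\t/, $line )[0];   # keep only the first column
}
close $in;

print "$_\n" for @geneID;
```

Because the IDs are pushed onto the array, there is no empty element at index 0 as there was with the $i counter.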


my $j= 1;
my $annotationFile = "/Users/olsonmatthew/Desktop/Perl/Ptrichocarpa_156_annotation_info.txt";

Aren't you supposed to get this file name from the command line (@ARGV)?


open (ANNOTFILE, "<$annotationFile") || die "ANNOTFILE File not found\n";

open ANNOTFILE, '<', $annotationFile or die "Cannot open '$annotationFile' because: $!";


      while (<ANNOTFILE>) {
        chomp;

        if ($_=~/\n/){

The readline (<ANNOTFILE>) reads one line from the file, where a line is defined as zero or more characters ending in newline, and then chomp removes that newline, so there is no newline for your regular expression to match.
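A tiny sketch that shows this, using a made-up line in place of one read from the file:

```perl
use strict;
use warnings;

# Made-up line, ending in a newline just as readline would return it.
my $line = "POPTR_0001s00200\tsome annotation\n";

chomp $line;    # removes the trailing newline

my $has_newline = ( $line =~ /\n/ ) ? 1 : 0;
print "newline still present? ", ( $has_newline ? "yes" : "no" ), "\n";
```

So the outer if in your loop can never be true, and the inner test is never reached.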


                        if ($_=~/$geneID[$j]/){

You are only comparing one element from @geneID to the line instead of all elements which you stated at the beginning is what you wanted to do.
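One way to test every ID against each line is to build a single pattern from the whole array. This is only a sketch under assumptions: the IDs and annotation lines are invented, the annotation is read from an in-memory string, and I am assuming an ID may appear anywhere in the line as a whole word.

```perl
use strict;
use warnings;

my @geneID = ( 'POPTR_0001s00200', 'POPTR_0002s00300' );    # made-up IDs

# Escape each ID and join them into one alternation: ID1|ID2|...
# (With an empty @geneID this pattern would match almost any line, so check for that first.)
my $alternatives = join '|', map { quotemeta $_ } @geneID;
my $any_id       = qr/\b(?:$alternatives)\b/;

# Made-up annotation data.
my $annotation = join '',
    "POPTR_0001s00200\tkinase (made-up annotation)\n",
    "POPTR_0009s09900\ttransporter (made-up annotation)\n",
    "POPTR_0002s00300\ttranscription factor (made-up annotation)\n";

open my $annot, '<', \$annotation or die "Cannot open in-memory sample: $!";

my @matched;
while ( my $line = <$annot> ) {
    chomp $line;
    push @matched, $line if $line =~ $any_id;    # true if any ID occurs in the line
}
close $annot;

print "$_\n" for @matched;
```

If the ID always sits in one known column of the annotation file, splitting the line and looking that field up in a hash keyed on the IDs would probably be faster and stricter, but that depends on the real file layout.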


                                print "$_\n";
                        }
                        ++$j;
                }
                }
close(ANNOTFILE);
exit;

If you could provide some sample data from your two files it would be easier to come up with a solution.



John
--
Any intelligent fool can make things bigger and
more complex... It takes a touch of genius -
and a lot of courage to move in the opposite
direction.                   -- Albert Einstein
