Hi all,
I'm having problems reading from files.

I have a text file "files.txt" that contains the names
of the files to be opened, i.e. the contents of
files.txt are

Homo_sapiens.fa
Rattus_norvegicus.fa

(They are FA files that can be opened in any text
editor.)

Each of the FA files contains a number on the first
line, followed by a string of characters (A, T, G or C).
For example, the Homo_sapiens.fa file would contain

16571
GATCACAGGTCTATCACCCTATTAACCACTCACGGGAGCTCTCCATGCATTTGGTATTTT
CGTCTGGGGGGTGTGCACGCGATAGCATTGCGAGACGCTGGAGCCGGAGCACCCTATGTC
GCAGTATCTGTCTTTGATTCCTGCCTCATTCTATTATTTATCGCACCTACGTTCAATATT
ACAGGCGAACATACCTACTAAAGTGTGTTAATTAATTAATGCTTGTAGGACATAATAATA

and so on, for a total of 16571 characters (A, T, G or C).

Below is my code:

#include <stdio.h>
#include <stdlib.h>

#define MAX_FILE 100    // maximum length of file name
#define MAX_SEQ 20000    // maximum length of sequence
#define N 2             // total number of sequences

int main(void)
{
   FILE *fin, *fin1, *fout;
   char input[MAX_FILE+1], seq[N][MAX_SEQ+1], c;
   int size[N], i = 0, j = 0;
 
   fin = fopen("files.txt", "r");
   fout = fopen("output.txt", "w");
   while (fscanf(fin, "%s", input) != EOF)
   {
      fin1 = fopen(input, "r");
      printf("%s\n", input);
      fscanf(fin1, "%d ", &size[i]);
      printf("%d\n", size[i]);
      while ((c = fgetc(fin1)) != EOF)
      {
        fprintf(fout, "%c", c);
        if (c != '\n')
                seq[i][j] = c;
         j++;
         if (j % 100 == 0)
                printf("%c", seq[i][j]);
      }
      fprintf(fout, "\n\n");
      j = 0;
      i++;
   }

   fclose(fin);
   fclose(fin1);
   fclose(fout);
   return 0;
}

The printf statements are there for me to check my code.

When I try to open 2 files, the first file is read in
fine, but the second file is incomplete. Over 600
characters are not read, and the program hangs.

I get this output (from the checking printf
statements):

Homo_sapiens.fa
16571
Rattus_norvegicus.fa
16300
<program hangs>

Notice that the statements
if (j % 100 == 0)
    printf("%c", seq[i][j]);
are not executed, but if I just print the character
seq[0][100], it comes out correctly.

If I try to open 3 files, the same problem happens,
i.e. the first file is read correctly, but the second
file is incomplete and the third file is not read at
all. I get the output

Homo_sapiens.fa
16571
Rattus_norvegicus.fa
16300
Homo_sapiens.fa
16571
Segmentation fault

I tried my program with 2 much smaller files (one has
13 characters and the other 14), and the program
works. Are the 2 files too big, so that the program runs
out of memory? How do I get around this problem, as I have
to read files even bigger than these 2 later?
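
For reference, here is a minimal sketch of one way to read a file in
this format without a fixed-size array. It is not a diagnosis of the
hang above. It assumes the number on the first line is the sequence
length, so the buffer can be allocated from it. The helper name
read_sequence and its signature are hypothetical and are not part of
the program above. It stores the result of fgetc in an int so that EOF
can be told apart from a real character, and it advances the index only
when a base is actually stored.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not from the program above): reads one FA file
 * whose first line is the sequence length and whose remaining lines
 * hold the bases. Returns a malloc'd, NUL-terminated string that the
 * caller must free, or NULL on error. On success *len_out holds the
 * number of bases actually stored. */
char *read_sequence(const char *path, size_t *len_out)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return NULL;                 /* file could not be opened */

    long declared;
    if (fscanf(fp, "%ld", &declared) != 1 || declared < 0) {
        fclose(fp);
        return NULL;                 /* missing or invalid length line */
    }

    /* Buffer sized from the file itself (assumption: the first-line
     * number is the sequence length), plus one byte for the NUL. */
    char *seq = malloc((size_t)declared + 1);
    if (seq == NULL) {
        fclose(fp);
        return NULL;
    }

    size_t n = 0;
    int c;                           /* int, so EOF differs from every char value */
    while ((c = fgetc(fp)) != EOF) {
        if (c == '\n' || c == '\r' || c == ' ')
            continue;                /* skip line breaks without advancing n */
        if (n == (size_t)declared)
            break;                   /* extra data is ignored; never write past the buffer */
        seq[n++] = (char)c;
    }
    seq[n] = '\0';

    fclose(fp);
    *len_out = n;
    return seq;
}
```

With something like this, the loop over files.txt would call
read_sequence(input, &len) for each name, check the result for NULL,
and free() each sequence when it is no longer needed. The number of
files would then no longer be tied to the size of a fixed array.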

Thank you.

Regards,
Rayne
