Re: A Small Program

Glynn Clements Sat, 27 Jun 1998 19:10:21 -0400

Moshe Zadka wrote:

> Here is a small program I wrote because I needed to use
> it.
> 
> It takes a file, and splits it into pieces, whose size
> is specified on the command line.
> 
> I named it hack because:
> 1. It is. (A hack!).
> 2. It hacks the file into pieces brutally.
> 3. All the good names (cut, split) were taken.
> 
> Share, read and enjoy. I'm putting it in the public domain.

Any particular reason why you couldn't use `split'?

You didn't ask for comments, but I'm going to give them to you anyway.

+ #include <stdio.h>            /* for fopen, fprintf, sprintf etc */
+ #include <stdlib.h>           /* for atoi, exit */

> int main(int argc, char **argv)
> {

-       char name[1024];
+       char *name;     /* I'll deal with this later. */

-       int size;
+       size_t size;

>       int count;
>       int c;
>       FILE *fin;
>       FILE *fout;
> 

-       if(argc<3) {
-               fprintf(stderr, "Error usage\n");
+       if(argc != 3) {
+               fprintf(stderr, "Usage: %s <size> <filename>\n", argv[0]);

>               exit(1);
>       }
> 

-       size=atoi(argv[1]);
+       size=atol(argv[1]);

-       fprintf(stderr, "size is %d\n", size);
+ #if _DEBUG_
+       fprintf(stderr, "size is %lu\n", (unsigned long) size);
+ #endif

Likewise, all of the other debugging fprintf() calls should be
conditional.
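
One way to keep the debugging output switchable is sketched below. The macro name HACK_DEBUG and the helper debug_size() are made up for this illustration; they are not part of the original program. I've avoided a name like _DEBUG_ here because identifiers beginning with an underscore followed by a capital letter are reserved for the implementation.

```c
#include <assert.h>
#include <stdio.h>

/* HACK_DEBUG is a hypothetical switch: build with -DHACK_DEBUG=1 to
 * turn the diagnostics on.  It defaults to off. */
#ifndef HACK_DEBUG
#define HACK_DEBUG 0
#endif

/* Hypothetical helper: prints the piece size to `log' when debugging is
 * enabled.  Returns the number of characters printed, or 0 when the
 * diagnostics are compiled out. */
int debug_size(FILE *log, size_t size)
{
    assert(log != NULL);
#if HACK_DEBUG
    return fprintf(log, "size is %lu\n", (unsigned long) size);
#else
    (void) size;    /* unused when diagnostics are disabled */
    return 0;
#endif
}
```

With the default setting, debug_size(stderr, size) prints nothing and returns 0.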

>       
>       if((fin=fopen(argv[2], "r"))==NULL) {

-               fprintf(stderr, "Error file\n");
+               fprintf(stderr, "Error opening file '%s'\n", argv[2]);

>               exit(1);
>       }
>       fprintf(stderr, "opened %s\n", argv[2]);
> 

-       if(feof(fin))
-               fprintf(stderr, "Empty file\n");

This won't work. feof() doesn't return non-zero until *after* you've
tried to read past EOF.
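
If you do want an empty-file check, one possible approach is to attempt a read and push the character back. This is only a sketch; stream_is_empty() is a name invented for the example, and it treats a read error the same as an empty file.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if nothing can be read from `fp',
 * 0 otherwise.  feof() alone cannot answer this, because the EOF flag
 * is only set once a read has actually failed. */
int stream_is_empty(FILE *fp)
{
    int c;

    assert(fp != NULL);
    c = fgetc(fp);          /* the read attempt that can set the EOF flag */
    if (c == EOF)
        return 1;           /* end of file (or a read error) */
    ungetc(c, fp);          /* put the byte back for the real copy loop */
    return 0;
}
```

Called right after fopen(), this reports an empty file correctly, whereas feof() at that point still returns zero.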

>       for(count=0;!feof(fin);count++) {

-               int i;
+               size_t i;

-               sprintf(name, "%s.%03d", argv[2], count);
+               sprintf(name, "%.1019s.%03d", argv[2], count);

This avoids a potential buffer overrun with filenames longer than 1019
characters. Note the `.': a precision truncates the string, whereas a
bare field width would only pad it. A better solution would be to
allocate `name' using

        char *name = alloca(strlen(argv[2]) + 5);

This will work regardless of the length of the filename.
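
Note that the `+ 5' covers the dot, three digits and the terminating NUL, so it only holds while count stays below 1000. If alloca() isn't available, or you want room for larger counts, a heap-based variant might look like the sketch below. make_piece_name() is a hypothetical helper, not something from the original program, and it relies on C99 snprintf().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: builds "<base>.<count>" (count zero-padded to at
 * least three digits) in freshly allocated memory.  Returns NULL on
 * failure; the caller must free() the result. */
char *make_piece_name(const char *base, int count)
{
    size_t cap;
    char *name;
    int len;

    assert(base != NULL);
    /* 32 extra bytes is an assumption: room for the dot, the digits of
     * any int up to 64 bits wide, and the terminating NUL. */
    cap = strlen(base) + 32;
    name = malloc(cap);
    if (name == NULL)
        return NULL;
    len = snprintf(name, cap, "%s.%03d", base, count);
    if (len < 0 || (size_t) len >= cap) {   /* error or truncation */
        free(name);
        return NULL;
    }
    return name;
}
```

For example, make_piece_name("data.bin", 7) yields "data.bin.007".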

>               fprintf(stderr, "opening %s\n", name);
>               if((fout=fopen(name, "w"))==NULL) {

-                       fprintf(stderr, "Error out file\n");
+                       fprintf(stderr, "Error opening file '%s'\n", name);

>                       exit(1);
>               }
>               fprintf(stderr, "opened %s\n", name);

-               for(i=0;i<size;i++) {
-                       if((c=fgetc(fin))!=EOF)
-                               fputc(c, fout);
-                       else
-                               break;
-               }

No way.

+               char buff[BUFSIZ];
+               for(i=size; i>0; ) {
+                       /* read at most one buffer, and never more than is left */
+                       size_t want = i < BUFSIZ ? i : BUFSIZ;
+                       size_t n = fread(buff, 1, want, fin);
+                       if (n == 0)
+                               break;
+                       if (fwrite(buff, 1, n, fout) != n)
+                       {
+                               fprintf(stderr, "Error writing file\n");
+                               exit(1);
+                       }
+                       i -= n;
+               }

This will copy the data a block at a time, which is substantially more 
efficient than copying it a byte at a time.
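
Pulled out into a function of its own, the block copy could be sketched roughly as follows. copy_piece() and its return convention are my own invention for the example, not part of the posted program.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: copies at most `limit' bytes from `in' to `out',
 * one buffer at a time.  Returns the number of bytes copied, or
 * (size_t) -1 if a write fails. */
size_t copy_piece(FILE *in, FILE *out, size_t limit)
{
    char buf[BUFSIZ];
    size_t copied = 0;

    assert(in != NULL && out != NULL);
    while (copied < limit) {
        size_t want = limit - copied;
        size_t got;

        if (want > sizeof buf)
            want = sizeof buf;
        got = fread(buf, 1, want, in);
        if (got == 0)
            break;                      /* end of file or read error */
        if (fwrite(buf, 1, got, out) != got)
            return (size_t) -1;         /* write error */
        copied += got;
    }
    return copied;
}
```

A return value smaller than `limit' means the input ran out, so the outer loop can use it to decide when to stop instead of relying on feof().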

However, there is still inefficiency caused by the buffering of the
stdio functions. Either use `setvbuf(fp, NULL, _IONBF, 0)' to disable
buffering, or better still, use the POSIX unbuffered I/O functions
(open, read, write, close) rather than the ANSI buffered I/O
functions. Another problem with the ANSI functions is that they aren't
guaranteed to set `errno' appropriately when an error occurs.
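
For comparison, here is a rough sketch of the same copy using the POSIX calls. copy_fd() is a hypothetical name, the 8192-byte buffer is an arbitrary choice, and the retry on EINTR is my own addition; on failure, read() and write() do set `errno', so the caller can report it with perror() or strerror().

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copies at most `limit' bytes from descriptor
 * `in' to descriptor `out'.  Returns the number of bytes copied, or -1
 * with errno set by the failing read() or write(). */
ssize_t copy_fd(int in, int out, size_t limit)
{
    char buf[8192];                     /* arbitrary block size */
    size_t copied = 0;

    assert(out >= 0);
    while (copied < limit) {
        size_t want = limit - copied;
        size_t off;
        ssize_t got, put;

        if (want > sizeof buf)
            want = sizeof buf;
        got = read(in, buf, want);
        if (got < 0) {
            if (errno == EINTR)
                continue;               /* interrupted: try again */
            return -1;
        }
        if (got == 0)
            break;                      /* end of file */
        /* write() may accept fewer bytes than asked, so loop until the
         * whole block has gone out. */
        for (off = 0; off < (size_t) got; off += (size_t) put) {
            put = write(out, buf + off, (size_t) got - off);
            if (put < 0) {
                if (errno != EINTR)
                    return -1;
                put = 0;                /* interrupted: retry same offset */
            }
        }
        copied += (size_t) got;
    }
    return (ssize_t) copied;
}
```

The descriptors would come from open() on the input file and on each output piece, and should be released with close() when done.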

>               fclose(fout);
>               fprintf(stderr, "closed %s\n", name);
>       }

+       return 0;

main() returns an `int'.

> }

-- 
Glynn Clements <[EMAIL PROTECTED]>
split.c