Re: [plug] Maximum number of lines wc can count

fooler mail Wed, 18 Feb 2009 07:54:23 -0800

On Wed, Feb 18, 2009 at 10:14 PM, Pablo Manalastas
<[email protected]> wrote:
> Ramil's original problem is not how to read all those tens
> of gigabytes of text data, but the more simple problem of keeping
> a count of the number of lines read, since if wc uses int (fortunately
> it does not), then wc can count only up to 2 billion lines. But he
> expects to read up to 100 billion lines. Note that he does not need
> to keep them in memory -- he only needs to count the number of lines.
> I believe that wc is even an overkill, since the following simple
> code will do the job:
>
> unsigned long long n = 0;
> while( (c = getchar()) != EOF) {
>  if(c == '\n') ++n;
> }
> return n;
>
> With this code, you do not even buffer the file, except for the
> buffering (usually 4k) that the C library implementation of the
> getchar() macro requires.


yup, that would do, doc, but the problem is that getchar() is slow
compared to fgets() or to read(), which is what wc uses..

here is a simple example using fgets()..

#include <stdio.h>
#include <stdint.h>

/* must be large enough to hold the longest line in the input */
#define BUFSIZE         1024

int main(void) {
char buffer[BUFSIZE];
uintmax_t lines;

        /* every successful fgets() call is counted as one line */
        for (lines = 0; fgets(buffer, BUFSIZE, stdin) != NULL; lines++)
                ;

        printf("%ju\n", lines);

        return(0);
}

save it as, let's say, wc2.c, then compile and run it:

gcc -std=c99 -Wall -O3 -o wc2 wc2.c

./wc2 < /path/to/file

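the code above only needs a buffer as large as the longest line
(plus the newline and the terminating NUL). if a line is longer than
that, fgets() returns it in pieces and it gets counted more than once.
this is faster than using getchar() but a little bit slower than
read()...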

fooler.
_________________________________________________
Philippine Linux Users' Group (PLUG) Mailing List
http://lists.linux.org.ph/mailman/listinfo/plug
Searchable Archives: http://archives.free.net.ph
