Re: [Zutils-bug] zgrep performance long line

2018-08-17 Thread Antonio Diaz Diaz

Walter Anema wrote:

I think that your messages_nnl.gz can be processed faster with
$ time zcat messages_nnl.gz | fold -s -w1 | grep -o connect  | wc


Folding makes it slightly slower.



I managed to create another file without private data using the script

i=0
j=0
while [ $i -lt 10 ]; do
    while [ $j -lt 800 ]; do
        printf "%s " $RANDOM
        ((j++))
    done
    printf "%s" "connect"
    ((i++))
done > largefile


I have tried the code above (which, BTW, puts the 10 "connect" together
at the end of the file because j is never reset to 0; not sure that is
what you intended) and I still can't see any difference between gzip and
zutils. Take into account that the times in my last mail were from my
slow machine (AMD K6-2 450 MHz). On my desktop machine both gzip and
zutils take about 1.2s to zgrep largefile.gz.



Best regards,
Antonio.

___
Zutils-bug mailing list
Zutils-bug@nongnu.org
https://lists.nongnu.org/mailman/listinfo/zutils-bug


Re: [Zutils-bug] zgrep performance long line

2018-08-17 Thread Antonio Diaz Diaz

Hi Walter,

Walter Anema wrote:

You made a nice package with z utilities.


Thanks!


I have a problem with the performance of a special file. It is a file with 
logging in json format, without a \n.
I need to append an `echo` before `wc` shows up with a count.

(zcat /logs/s3/2018/04/11/08/prod-kinesis-firehose-stream-1-2018-04-11-08-05-23-bcdf3841-52b5-47eb-bf85-c36dfa2d0d55; echo) | wc
   1 2145643 37786248


I have crafted a similar file, but I have found no differences between 
zutils' zgrep and gzip's zgrep. Also I did not need to append a newline 
for wc to show results:


$ zcat messages_nnl.gz | wc
  0 7371268 44149261
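
The line count of 0 is expected, since wc counts newline characters and
this file has none. A minimal sketch of the same behavior on a tiny
file, assuming GNU grep and wc (sample_nnl is a made-up file name, not
one of the files discussed in this thread):

```shell
# sample_nnl is a hypothetical stand-in for a log without any newline.
printf 'connect one connect two' > sample_nnl

# wc counts newline characters, so the line count is 0,
# but words and bytes are still reported.
wc sample_nnl

# grep -o prints each match on its own line, so matches can be counted
# even though the input has no newline.
grep -o connect sample_nnl | wc -l
```

Here wc should report 0 lines, 4 words and 23 bytes, and the last
command should print 2.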

$ time zgrep -o connect messages_nnl.gz | wc
104 104 832

real    0m24.714s
user    0m22.000s
sys     0m2.250s

$ time gzip-1.9/zgrep -o connect messages_nnl.gz | wc
104 104 832

real    0m24.896s
user    0m22.340s
sys     0m2.250s

What versions of grep and wc are you using? Mine are (I have tested on
two machines):


GNU grep 2.5
GNU grep 3.1
wc (GNU coreutils) 6.9
wc (GNU coreutils) 8.11


Best regards,
Antonio.
