Re: [PATCH] performance-tests: tests for renamed/copied files in notmuch new

2019-04-05 Thread David Bremner
Tomi Ollila  writes:

>> +done < <(find mail -type f ! -path 'mail/.notmuch/*' )
>
> // this comment was written last in this email, just for fun >;) //
>
> find mail -type f ! -path 'mail/.notmuch/*' | sed -n '1~4 p' > $manifest
> count=`wc -l < $manifest`
>
> (I'd be interested to know which of the above is faster -- my suggestion
> does quite a few more forks and execve's, but the read loop above does
> 200 000 read(2)s and [lf]seek(2)s (and then 50 000 opens).
> Well, probably no one would notice the difference...)

Your version is _much_ faster so I used it, along with the other
suggestions in the version I just pushed to master.

d
___
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH] performance-tests: tests for renamed/copied files in notmuch new

2019-04-03 Thread Tomi Ollila
On Tue, Apr 02 2019, David Bremner wrote:

> Several people have observed that this is surprisingly slow, and we
> have a proposal to add tagging into this code path, so we want to make
> sure it doesn't imply too much of a performance hit.
> ---
>  performance-test/T00-new.sh | 30 ++
>  1 file changed, 30 insertions(+)
>
> I added these tests to help evaluate Michael's proposed patch. I'll send the
> results in a separate email.
>
> diff --git a/performance-test/T00-new.sh b/performance-test/T00-new.sh
> index 68750129..cec28d58 100755
> --- a/performance-test/T00-new.sh
> +++ b/performance-test/T00-new.sh
> @@ -12,4 +12,34 @@ for i in $(seq 2 6); do
>  time_run "notmuch new #$i" 'notmuch new'
>  done
>  
> +manifest=$(mktemp manifestXX)
> +
> +count=0
> +total=0
> +while read -r name ; do
> +if [ $((total % 4 )) -eq 0 ]; then
> +echo $name >> $manifest
> +count=$((count + 1))
> +fi
> +total=$((total + 1))
> +done < <(find mail -type f ! -path 'mail/.notmuch/*' )

// this comment was written last in this email, just for fun >;) //

find mail -type f ! -path 'mail/.notmuch/*' | sed -n '1~4 p' > $manifest
count=`wc -l < $manifest`

(I'd be interested to know which of the above is faster -- my suggestion
does quite a few more forks and execve's, but the read loop above does
200 000 read(2)s and [lf]seek(2)s (and then 50 000 opens).
Well, probably no one would notice the difference...)
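
To make the comparison concrete, here is a minimal sketch that runs both
selection methods on a small stand-in listing and checks that they pick the
same lines. It assumes GNU sed (the first~step address is a GNU extension);
the temporary directory and the file names in it are made up for the
illustration and are not part of the patch.

```shell
#!/usr/bin/env bash
# Sketch: select every fourth line of a listing in two ways and compare.
# Assumes GNU sed; "listing" is a hypothetical stand-in for the find output.
set -eu

workdir=$(mktemp -d)
listing="$workdir/listing"
seq 1 20 > "$listing"

# Pipeline variant: sed prints lines 1, 5, 9, ...
sed -n '1~4p' "$listing" > "$workdir/manifest.sed"
count_sed=$(wc -l < "$workdir/manifest.sed")

# Loop variant, as in the patch: keep a line when its 0-based index is 0 mod 4.
count_loop=0
total=0
while read -r name; do
    if [ $((total % 4)) -eq 0 ]; then
        echo "$name" >> "$workdir/manifest.loop"
        count_loop=$((count_loop + 1))
    fi
    total=$((total + 1))
done < "$listing"

# cmp -s is silent and succeeds only when both files are byte-identical.
if cmp -s "$workdir/manifest.sed" "$workdir/manifest.loop"; then
    same=yes
else
    same=no
fi
echo "sed: $count_sed lines, loop: $count_loop lines, identical: $same"
rm -rf "$workdir"
```

Note that `wc -l < file` prints only the number, which is what the
`$count` in the test titles needs.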

> +
> +while read -r name ; do
> +mv $name ${name}.renamed
> +done <  $manifest

'12' -- 2 spaces above (and below...)

Luckily the bash read builtin does not read its input a byte at a time
(IIRC it reads 128 bytes, scans for a newline and then seeks back
 -- it can do that here because the input is redirected from a file, so
 the fd is seekable).

50 000 mv(1) executions definitely take time.

perl -nle 'rename $_, "$_.renamed"' $manifest  would be significantly faster 

> +
> +time_run "new ($count mv)" 'notmuch new'
> +
> +while read -r name ; do
> +mv ${name}.renamed $name
> +done <  $manifest
> +
> +time_run "new ($count mv back)" 'notmuch new'
> +
> +while read -r name ; do
> +cp ${name} $name.copy
> +done <  $manifest

perl -nle 'link $_, "$_.copy"' $manifest  ?
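
One thing to keep in mind with that suggestion: a hard link adds a second name
for the same inode rather than duplicating the data, so it is cheaper than cp
but not quite the same operation. A small sketch of the difference, using
ln(1) in place of perl's `link` (both end up in the link system call) and
made-up file names:

```shell
#!/usr/bin/env bash
# Sketch: contrast a real copy with a hard link. Paths are hypothetical.
set -eu

workdir=$(mktemp -d)
echo "Subject: test" > "$workdir/msg"

cp "$workdir/msg" "$workdir/msg.copy"   # new inode, data duplicated
ln "$workdir/msg" "$workdir/msg.link"   # second name for the same inode

# -ef is true when both paths refer to the same device and inode.
if [ "$workdir/msg" -ef "$workdir/msg.link" ]; then link_same=yes; else link_same=no; fi
if [ "$workdir/msg" -ef "$workdir/msg.copy" ]; then copy_same=yes; else copy_same=no; fi

echo "link shares inode: $link_same, copy shares inode: $copy_same"
rm -rf "$workdir"
```

Whether the linked and the copied case exercise the same code path in
notmuch new is not something this sketch answers.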

> +
> +time_run "new ($count cp)" 'notmuch new'
> +
>  time_done
> -- 
> 2.20.1
>


Re: [PATCH] performance-tests: tests for renamed/copied files in notmuch new

2019-04-02 Thread David Bremner
David Bremner  writes:

> Several people have observed that this is surprisingly slow, and we
> have a proposal to add tagging into this code path, so we want to make
> sure it doesn't imply too much of a performance hit.

On my SSD / 8th gen i7 / 32G RAM based Debian workstation it seems OK,
with a max slowdown of 1.4% (mean of 5 runs). I'd like to see similar
figures for spinning rust and older CPUs. This is with Xapian 1.4.11;
ideally other comparisons would use the same version.

|         | Before |  After | slowdown % |
| initial | 517.69 | 522.13 |        0.9 |
| mv      | 313.16 | 317.57 |        1.4 |
| mv back | 315.21 | 316.73 |        0.5 |
| cp      | 170.36 | 171.08 |        0.4 |
#+TBLFM: $4=100*($3/$2-1);f1
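
For anyone who wants to redo the last column outside org-mode, the formula
above is just 100 * (after / before - 1), rounded to one decimal. A minimal
sketch with awk, using the mean wall-clock times from the table; the function
name `slowdown` is invented for the example.

```shell
#!/usr/bin/env bash
# Sketch: recompute the "slowdown %" column from the Before/After means.
set -eu

# slowdown BEFORE AFTER -> percentage with one decimal place.
# LC_ALL=C keeps the decimal separator a dot regardless of locale.
slowdown() {
    LC_ALL=C awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", 100 * (after / before - 1) }'
}

initial=$(slowdown 517.69 522.13)
mv_pct=$(slowdown 313.16 317.57)
mv_back=$(slowdown 315.21 316.73)
cp_pct=$(slowdown 170.36 171.08)

echo "initial $initial, mv $mv_pct, mv back $mv_back, cp $cp_pct"
```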

Raw data is attached for the curious.

with patch:

T00-new.sh: Testing notmuch new [0.4 large]
                        Wall(s) Usr(s)  Sys(s)  Res(K)  In/Out(512B)
  Initial notmuch new   521.81  405.90  111.16  273992  0/15876448
  notmuch new #2        0.02    0.01    0.00    12632   0/144
  notmuch new #3        0.00    0.00    0.00    8872    0/8
  notmuch new #4        0.00    0.00    0.00    8900    0/8
  notmuch new #5        0.00    0.00    0.00    9004    0/8
  notmuch new #6        0.00    0.00    0.00    8728    0/8
  new (52374 mv)        316.30  219.89  95.42   146136  0/4611440
  new (52374 mv back)   312.67  217.23  94.55   147476  0/4986256
  new (52374 cp)        170.76  125.47  43.85   114360  0/4099808

T00-new.sh: Testing notmuch new [0.4 large]
                        Wall(s) Usr(s)  Sys(s)  Res(K)  In/Out(512B)
  Initial notmuch new   523.08  405.39  112.02  274152  0/15871216
  notmuch new #2        0.02    0.00    0.01    12372   0/144
  notmuch new #3        0.02    0.01    0.00    8984    0/8
  notmuch new #4        0.01    0.00    0.00    8820    0/8
  notmuch new #5        0.00    0.00    0.00    8776    0/8
  notmuch new #6        0.00    0.00    0.00    8776    0/8
  new (52374 mv)        317.27  221.69  94.55   146424  0/4864352
  new (52374 mv back)   318.88  222.51  95.31   146764  0/5063728
  new (52374 cp)        172.09  126.18  44.75   114548  0/4073552

T00-new.sh: Testing notmuch new [0.4 large]
                        Wall(s) Usr(s)  Sys(s)  Res(K)  In/Out(512B)
  Initial notmuch new   523.58  407.64  112.15  274100  0/16211056
  notmuch new #2        0.02    0.01    0.00    12644   0/144
  notmuch new #3        0.00    0.00    0.00    8828    0/8
  notmuch new #4        0.00    0.00    0.00    8884    0/8
  notmuch new #5        0.00    0.00    0.00    8916    0/8
  notmuch new #6        0.00    0.00    0.00    8896    0/8
  new (52374 mv)        318.53  222.16  95.60   146368  0/5056016
  new (52374 mv back)   315.98  219.54  95.64   146796  0/4766496
  new (52374 cp)        170.79  124.49  44.63   114560  0/4075616

T00-new.sh: Testing notmuch new [0.4 large]
                        Wall(s) Usr(s)  Sys(s)  Res(K)  In/Out(512B)
  Initial notmuch new   519.77  403.60  111.33  274212  0/15964240
  notmuch new #2        0.02    0.00    0.01    12456   0/144
  notmuch new #3        0.00    0.00    0.00    9080    0/8
  notmuch new #4        0.00    0.00    0.00    8884    0/8
  notmuch new #5        0.00    0.00    0.00    9004    0/8
  notmuch new #6        0.00    0.00    0.00    8840    0/8
  new (52374 mv)        318.98  222.26  95.94   146332  0/4854864
  new (52374 mv back)   319.39  221.87  96.65   146944  0/4969296
  new (52374 cp)        170.72  124.50  44.63   114224  0/3817216

T00-new.sh: Testing notmuch new [0.4 large]
                        Wall(s) Usr(s)  Sys(s)  Res(K)  In/Out(512B)
  Initial notmuch new   522.43  406.40  111.85  273936  0/16188560
  notmuch new #2        0.88    0.01    0.01    12512   0/144
  notmuch new #3        0.03    0.01    0.01    8756    0/8
  notmuch new #4        0.00    0.00    0.00    8852    0/8
  notmuch new #5        0.00    0.00    0.00    8832    0/8
  notmuch new #6        0.00    0.00    0.00    8864    0/8
  new (52374 mv)        316.75  219.96  95.79   146496  0/4866064
  new (52374 mv back)   316.73  220.40  95.33   146760  0/5034928
  new (52374 cp)        171.03  125.37  44.34   114384  0/4096496

--

without patch:

T00-new.sh: Testing notmuch new [0.4 large]
                        Wall(s) Usr(s)  Sys(s)  Res(K)  In/Out(512B)
  Initial notmuch new   517.73  403.06  110.61  284648  0/14723776
  notmuch new #2        0.01    0.00    0.00    8760    0/8
  notmuch new #3        0.01    0.00    0.01    8928    0/8
  notmuch new #4        0.01    0.01    0.00    8812    0/8
  notmuch new #5        0.01    0.00    0.01    8928    0/8
  notmuch new #6        0.00    0.00    0.00    8832    0/8
  new (52374 mv)        314.48  218.67  94.55   146560  0/4774288
  new (52374 mv back)   319.43  222.46  96.09

[PATCH] performance-tests: tests for renamed/copied files in notmuch new

2019-04-02 Thread David Bremner
Several people have observed that this is surprisingly slow, and we
have a proposal to add tagging into this code path, so we want to make
sure it doesn't imply too much of a performance hit.
---
 performance-test/T00-new.sh | 30 ++
 1 file changed, 30 insertions(+)

I added these tests to help evaluate Michael's proposed patch. I'll send the
results in a separate email.

diff --git a/performance-test/T00-new.sh b/performance-test/T00-new.sh
index 68750129..cec28d58 100755
--- a/performance-test/T00-new.sh
+++ b/performance-test/T00-new.sh
@@ -12,4 +12,34 @@ for i in $(seq 2 6); do
 time_run "notmuch new #$i" 'notmuch new'
 done
 
+manifest=$(mktemp manifestXX)
+
+count=0
+total=0
+while read -r name ; do
+if [ $((total % 4 )) -eq 0 ]; then
+echo $name >> $manifest
+count=$((count + 1))
+fi
+total=$((total + 1))
+done < <(find mail -type f ! -path 'mail/.notmuch/*' )
+
+while read -r name ; do
+mv $name ${name}.renamed
+done <  $manifest
+
+time_run "new ($count mv)" 'notmuch new'
+
+while read -r name ; do
+mv ${name}.renamed $name
+done <  $manifest
+
+time_run "new ($count mv back)" 'notmuch new'
+
+while read -r name ; do
+cp ${name} $name.copy
+done <  $manifest
+
+time_run "new ($count cp)" 'notmuch new'
+
 time_done
-- 
2.20.1
