[Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: CMake: remove USE_TSAN, use -DSANITIZE_THREAD instead

2018-03-13 Thread GerritHub
From Dominique Martinet:

Dominique Martinet has uploaded this change for review. ( 
https://review.gerrithub.io/403728


Change subject: CMake: remove USE_TSAN, use -DSANITIZE_THREAD instead
..

CMake: remove USE_TSAN, use -DSANITIZE_THREAD instead

We currently have two ways of enabling TSAN, and this one does not work.
Instead of trying to debug why, just remove it.

Change-Id: I7e925f319821162c1d0446ad1146e7fff693c973
Signed-off-by: Dominique Martinet 
---
M src/CMakeLists.txt
D src/cmake/tsan.cmake
2 files changed, 0 insertions(+), 48 deletions(-)



  git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha 
refs/changes/28/403728/1
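
For anyone who wants to keep using thread sanitizer after this, a minimal
sketch of the remaining path (the option name is taken from the change
subject; build-dir layout and the other flags are illustrative, not from
the patch):

  mkdir build && cd build
  cmake -DSANITIZE_THREAD=ON -DCMAKE_BUILD_TYPE=Debug ../src
  make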
--
To view, visit https://review.gerrithub.io/403728
To unsubscribe, visit https://review.gerrithub.io/settings

Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-MessageType: newchange
Gerrit-Change-Id: I7e925f319821162c1d0446ad1146e7fff693c973
Gerrit-Change-Number: 403728
Gerrit-PatchSet: 1
Gerrit-Owner: Dominique Martinet 


[Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: CMake sanitizers: s/saitizer/sanitizer/

2018-03-13 Thread GerritHub
From Dominique Martinet:

Dominique Martinet has uploaded this change for review. ( 
https://review.gerrithub.io/403729


Change subject: CMake sanitizers: s/saitizer/sanitizer/
..

CMake sanitizers: s/saitizer/sanitizer/

Annoying typos in function names are evil

Change-Id: Icc2ca77720960fec3b13b1473f14f6e8ac72a666
Signed-off-by: Dominique Martinet 
---
M src/cmake/modules/FindASan.cmake
M src/cmake/modules/FindMSan.cmake
M src/cmake/modules/FindTSan.cmake
M src/cmake/modules/FindUBSan.cmake
M src/cmake/modules/sanitize-helpers.cmake
5 files changed, 5 insertions(+), 5 deletions(-)



  git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha 
refs/changes/29/403729/1
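
The rename is mechanical; roughly equivalent to the following one-liner
(a sketch only, the actual patch edits the five files listed above
directly):

  git grep -l saitizer src/cmake | xargs sed -i 's/saitizer/sanitizer/g'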
--
To view, visit https://review.gerrithub.io/403729
To unsubscribe, visit https://review.gerrithub.io/settings

Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-MessageType: newchange
Gerrit-Change-Id: Icc2ca77720960fec3b13b1473f14f6e8ac72a666
Gerrit-Change-Number: 403729
Gerrit-PatchSet: 1
Gerrit-Owner: Dominique Martinet 


Re: [Nfs-ganesha-devel] rpcping

2018-03-13 Thread Daniel Gryniewicz

rpcping was not thread safe.  I have fixes for it incoming.

Daniel

On 03/13/2018 12:13 PM, William Allen Simpson wrote:

On 3/13/18 2:38 AM, William Allen Simpson wrote:

In my measurements, using the new CLNT_CALL_BACK(), the client thread
starts sending a stream of pings.  In every case, it peaks at a
relatively stable rate.


DanG suggested that timing was dominated by the system time calls.

The previous numbers were switched to a finer grained timer than
the original code.  JeffL says that clock_gettime() should have had
negligible overhead.

But just to make sure, I've eliminated the per thread timers and
substituted one before and one after.  Unlike previously, this
will include the overhead of setting up the client, in addition to
completing all the callback returns.

Same result.  More calls ::= slower times.

rpcping tcp localhost threads=1 count=1000 (port=2049 program=13 
version=3 procedure=0): average 36012.0254, total 36012.0254
rpcping tcp localhost threads=1 count=1500 (port=2049 program=13 
version=3 procedure=0): average 33720.9125, total 33720.9125
rpcping tcp localhost threads=1 count=2000 (port=2049 program=13 
version=3 procedure=0): average 25604.7542, total 25604.7542
rpcping tcp localhost threads=1 count=3000 (port=2049 program=13 
version=3 procedure=0): average 21170.0836, total 21170.0836
rpcping tcp localhost threads=1 count=5000 (port=2049 program=13 
version=3 procedure=0): average 18163.2451, total 18163.2451


Including the 3-way handshake time for setting up the clients does affect
the overall throughput numbers.

rpcping tcp localhost threads=2 count=1500 (port=2049 program=13 
version=3 procedure=0): average 10379.3976, total 20758.7951
rpcping tcp localhost threads=2 count=1500 (port=2049 program=13 
version=3 procedure=0): average 10746.9395, total 21493.8790


rpcping tcp localhost threads=3 count=1500 (port=2049 program=13 
version=3 procedure=0): average 5473.3780, total 16420.1339
rpcping tcp localhost threads=3 count=1500 (port=2049 program=13 
version=3 procedure=0): average 5886.5549, total 17659.6646


rpcping tcp localhost threads=5 count=1500 (port=2049 program=13 
version=3 procedure=0): average 3396.9438, total 16984.7190
rpcping tcp localhost threads=5 count=1500 (port=2049 program=13 
version=3 procedure=0): average 3455.3026, total 17276.5131





Re: [Nfs-ganesha-devel] rpcping

2018-03-13 Thread William Allen Simpson

On 3/13/18 2:38 AM, William Allen Simpson wrote:

In my measurements, using the new CLNT_CALL_BACK(), the client thread
starts sending a stream of pings.  In every case, it peaks at a
relatively stable rate.


DanG suggested that timing was dominated by the system time calls.

The previous numbers were switched to a finer grained timer than
the original code.  JeffL says that clock_gettime() should have had
negligible overhead.

But just to make sure, I've eliminated the per thread timers and
substituted one before and one after.  Unlike previously, this
will include the overhead of setting up the client, in addition to
completing all the callback returns.

Same result.  More calls ::= slower times.

rpcping tcp localhost threads=1 count=1000 (port=2049 program=13 version=3 
procedure=0): average 36012.0254, total 36012.0254
rpcping tcp localhost threads=1 count=1500 (port=2049 program=13 version=3 
procedure=0): average 33720.9125, total 33720.9125
rpcping tcp localhost threads=1 count=2000 (port=2049 program=13 version=3 
procedure=0): average 25604.7542, total 25604.7542
rpcping tcp localhost threads=1 count=3000 (port=2049 program=13 version=3 
procedure=0): average 21170.0836, total 21170.0836
rpcping tcp localhost threads=1 count=5000 (port=2049 program=13 version=3 
procedure=0): average 18163.2451, total 18163.2451

Including the 3-way handshake time for setting up the clients does affect
the overall throughput numbers.

rpcping tcp localhost threads=2 count=1500 (port=2049 program=13 version=3 
procedure=0): average 10379.3976, total 20758.7951
rpcping tcp localhost threads=2 count=1500 (port=2049 program=13 version=3 
procedure=0): average 10746.9395, total 21493.8790

rpcping tcp localhost threads=3 count=1500 (port=2049 program=13 version=3 
procedure=0): average 5473.3780, total 16420.1339
rpcping tcp localhost threads=3 count=1500 (port=2049 program=13 version=3 
procedure=0): average 5886.5549, total 17659.6646

rpcping tcp localhost threads=5 count=1500 (port=2049 program=13 version=3 
procedure=0): average 3396.9438, total 16984.7190
rpcping tcp localhost threads=5 count=1500 (port=2049 program=13 version=3 
procedure=0): average 3455.3026, total 17276.5131
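
A minimal sketch of the coarse timing described above, for reference: one
clock_gettime() before the run and one after, with the elapsed time divided
into the call count. This is not the rpcping source; run_pings() is a
hypothetical stand-in for client setup plus the CLNT_CALL_BACK() round
trips. With threads=N, the reported total is just the per-thread average
times N.

#include <stdio.h>
#include <time.h>

/* Hypothetical stand-in for rpcping's real work: client setup plus
 * "count" CLNT_CALL_BACK() round trips and their completions. */
static void run_pings(int count)
{
    volatile int i;

    for (i = 0; i < count; i++)
        ;
}

int main(void)
{
    struct timespec start, stop;
    int count = 1500;
    double elapsed;

    clock_gettime(CLOCK_MONOTONIC, &start);
    run_pings(count);           /* one timer before, one after */
    clock_gettime(CLOCK_MONOTONIC, &stop);

    elapsed = (stop.tv_sec - start.tv_sec)
            + (stop.tv_nsec - start.tv_nsec) / 1e9;
    printf("average %.4f calls/s\n", count / elapsed);
    return 0;
}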



[Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: Adding empty file const_structs.checkpatch

2018-03-13 Thread GerritHub
From supriti.si...@suse.com:

supriti.si...@suse.com has uploaded this change for review. ( 
https://review.gerrithub.io/403704


Change subject: Adding empty file const_structs.checkpatch
..

Adding empty file const_structs.checkpatch

In the absence of this file, checkpatch.pl shows an error:
"No structs that should be const will be found, file missing"

Change-Id: Iab141bf7bf5aa40a4c19f4994cbcbb7b896e469b
Signed-off-by: Supriti Singh 
---
A src/scripts/const_structs.checkpatch
1 file changed, 0 insertions(+), 0 deletions(-)



  git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha 
refs/changes/04/403704/1
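
As a sketch of how this is consumed (the checkpatch.pl location and flags
here are assumptions, not verified against this tree):

  touch src/scripts/const_structs.checkpatch
  ./src/scripts/checkpatch.pl --no-tree -f path/to/changed_file.c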
--
To view, visit https://review.gerrithub.io/403704
To unsubscribe, visit https://review.gerrithub.io/settings

Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iab141bf7bf5aa40a4c19f4994cbcbb7b896e469b
Gerrit-Change-Number: 403704
Gerrit-PatchSet: 1
Gerrit-Owner: supriti.si...@suse.com


Re: [Nfs-ganesha-devel] Better late than never - US Daylight Savings Time has started and that means weekly conference call is an hour earlier

2018-03-13 Thread Frank Filz
> Time has started and that means weekly conference call is an hour earlier
> 
> 
> An hour later...

No, an hour earlier. The time for the meeting is based on current Pacific
time, not UTC/GMT. So when the US enters Daylight Saving Time, the meeting
switches to an hour earlier, and when we leave Daylight Saving Time, it
switches to an hour later. In most of the US, the clock time of the meeting
stays the same. In much of Europe, the clock time is out of sync for a few
weeks because Europe enters and leaves Daylight Saving Time on different
dates than the US. In parts of the world that don't observe Daylight Saving
Time (most notably for this project, India), the clock time changes along
with the absolute time. For those south of the Equator that do observe
Daylight Saving Time, the clock time ultimately shifts two hours, though
probably in two steps, since they leave Daylight Saving Time in their fall
on a different date than the US enters it in the spring.
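
For a concrete (hypothetical) example: a call pinned at 7:30 Pacific is
15:30 UTC under PST (UTC-8) but 14:30 UTC under PDT (UTC-7); in India
(UTC+5:30, no DST) that is 21:00 IST in winter and 20:00 IST once the US
switches.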

Frank




Re: [Nfs-ganesha-devel] Better late than never - US Daylight Savings Time has started and that means weekly conference call is an hour earlier

2018-03-13 Thread Dominique Martinet
Daniel Gryniewicz wrote on Tue, Mar 13, 2018 at 10:07:33AM -0400:
> An hour later...

Nope, it is an hour earlier for us :)

-- 
Dominique



Re: [Nfs-ganesha-devel] Better late than never - US Daylight Savings Time has started and that means weekly conference call is an hour earlier

2018-03-13 Thread Daniel Gryniewicz

An hour later...

Daniel

On 03/13/2018 10:02 AM, Frank Filz wrote:









[Nfs-ganesha-devel] Better late than never - US Daylight Savings Time has started and that means weekly conference call is an hour earlier

2018-03-13 Thread Frank Filz
 



Re: [Nfs-ganesha-devel] intermittent malloc list corruption on shutdown in -dev.3

2018-03-13 Thread Dominique Martinet
Hi Jeff,

The CEA bot has hit this twice in the past two or so weeks, so you're
definitely not the only one seeing it -- unfortunately it has only ever
hit on the runs without ASAN, so the traces are pretty much the same
as what you get.

This kind of message means we're messing with internal glibc malloc
headers, and I'm very surprised ASAN/valgrind don't catch it.
If it's a race, maybe helgrind? But that reports quite a bit, and
checking would need more time than I have...


Anyway, you're not alone, but I don't have much of a clue either... Good
luck! :P
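
If someone has spare cycles, a sketch of a helgrind run (the binary and
config paths are assumptions; adjust for your install):

  valgrind --tool=helgrind --log-file=helgrind.%p.log \
      /usr/bin/ganesha.nfsd -F -f /etc/ganesha/ganesha.conf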

-- 
Dominique



Re: [Nfs-ganesha-devel] rpcping

2018-03-13 Thread Matt Benjamin
On Tue, Mar 13, 2018 at 2:38 AM, William Allen Simpson
 wrote:
> On 3/12/18 6:25 PM, Matt Benjamin wrote:
>>
>> If I understand correctly, we always insert records in xid order, and
>> xid is monotonically increasing by 1.  I guess pings might come back
>> in any order,
>
>
> No, they always come back in order.  This is TCP.  I've gone to some
> lengths to fix the problem that operations were being executed in
> arbitrary order.  (As was reported in the past.)

We're aware of the issues with former req queuing.  It was one of my
top priorities to fix in napalm, and we did it.

>
> For UDP, there is always the possibility of loss or re-ordering of
> datagrams, one of the reasons for switching to TCP in NFSv3 (and
> eliminating UDP in NFSv4).
>
> Threads can still block in apparently random order, because of
> timing variances inside FSAL calls.  Should not be an issue here.
>
>
>> but if we assume xids retire in xid order also,
>
>
> They do.  Should be no variance.  Eliminating the dupreq caching --
> also using the rbtree -- significantly improved the timing.

It's certainly correct not to cache, but it's also a special case that
arises from...benchmarking with rpcping, not NFS.
Same goes for retire order.  Who said, let's assume the rpcping
requests retire in order?  Oh yes, me above.  Do you think NFS
requests in general are required to retire in arrival order?  No, of
course not.  What workload is the general case for the DRC?  NFS.

>
> Apparently picked the worst tree choice for this data, according to
> computer science. If all you have is a hammer

What motivates you to write this stuff?

Here are two facts you may have overlooked:

1. The DRC has a constant insert-delete workload, and for this
application, IIRC, I put the last inserted entries directly into the
cache.  This both applies the standard art on trees (rbtree vs. avl
performance on insert/delete-heavy workloads) and ostensibly avoids
searching the tree in the common case; I measured the hit rate
informally, and it looked to be working.

2. The key in the DRC caches is hk, not xid.

>
>
>> and keep
>> a window of 1 records in-tree, that seems maybe like a reasonable
>> starting point for measuring this?
>> I've not tried 10,000 or 100,000 recently.  (The original code
>
> default sent 100,000.)
>
> I've not recorded how many remain in-tree during the run.
>
> In my measurements, using the new CLNT_CALL_BACK(), the client thread
> starts sending a stream of pings.  In every case, it peaks at a
> relatively stable rate.
>
> For 1,000, <4,000/s.  For 100, 40,000/s.  Fairly linear relationship.
>
> By running multiple threads, I showed that each individual thread ran
> roughly the same (on average).  But there is some variance per run.
>
> I only posted the 5 thread results, lowest and highest achieved.
>
> My original message had up to 200 threads and 4 results, but I decided
> such a long series was overkill, so removed them before sending.
>
> That 4,000 and 40,000 per client thread was stable across all runs.
>
>
>> I wrote a gtest program (gerrit) that I think does the above in a
>> single thread, no locks, for 1M cycles (search, remove, insert).  On
>> lemon, compiled at O2, the gtest profiling says the test finishes in
>> less than 150ms (I saw as low as 124).  That's over 6M cycles/s, I
>> think.
>>
> What have you compared it to?  Need a gtest of avl and tailq with the
> same data.  That's what the papers I looked at do

The point is, that is very low latency, a lot lower than I expected.
It's probably minimized by CPU caching and so forth, but it tries to
address the more basic question: is latency from searching the rb
tree, expected or unexpected, a likely contributor to overall latency?
If we get 2M retires per second (let alone 6-7M), is that a likely
supposition?
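
For reference, a rough stand-alone approximation of that single-threaded
search/remove/insert cycle, using glibc's tsearch()/tfind()/tdelete() (a
red-black tree in glibc) as a stand-in for the opr rbtree. This is not the
gerrit gtest itself, and the window size is an arbitrary assumption:

#include <search.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define CYCLES 1000000
#define WINDOW 1000         /* arbitrary number of records kept in-tree */

static int cmp(const void *a, const void *b)
{
    unsigned long x = *(const unsigned long *)a;
    unsigned long y = *(const unsigned long *)b;

    return (x > y) - (x < y);
}

int main(void)
{
    static unsigned long xid[CYCLES + WINDOW];
    void *root = NULL;
    struct timespec start, stop;
    unsigned long i;
    double sec;

    for (i = 0; i < CYCLES + WINDOW; i++)
        xid[i] = i;

    /* pre-fill the window, as if WINDOW calls were in flight */
    for (i = 0; i < WINDOW; i++)
        tsearch(&xid[i], &root, cmp);

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (i = 0; i < CYCLES; i++) {
        /* retire the oldest xid, admit the newest */
        if (tfind(&xid[i], &root, cmp) == NULL)
            abort();
        tdelete(&xid[i], &root, cmp);
        tsearch(&xid[i + WINDOW], &root, cmp);
    }
    clock_gettime(CLOCK_MONOTONIC, &stop);

    sec = (stop.tv_sec - start.tv_sec)
        + (stop.tv_nsec - start.tv_nsec) / 1e9;
    printf("%d cycles in %.3f s (%.0f cycles/s)\n", CYCLES, sec, CYCLES / sec);
    return 0;
}

The absolute numbers will differ from the gtest (tsearch allocates a node
per insert), but the shape of the experiment is the same.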

The rb tree either is, or isn't a major contributor to latency.  We'll
ditch it if it is.  Substituting a tailq (linear search) seems an
unlikely choice, but if you can prove your case with the numbers, no
one's going to object.

Matt

-- 

Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-821-5101
fax.  734-769-8938
cel.  734-216-5309



Re: [Nfs-ganesha-devel] rpcping

2018-03-13 Thread William Allen Simpson

On 3/12/18 6:25 PM, Matt Benjamin wrote:

If I understand correctly, we always insert records in xid order, and
xid is monotonically increasing by 1.  I guess pings might come back
in any order, 


No, they always come back in order.  This is TCP.  I've gone to some
lengths to fix the problem that operations were being executed in
arbitrary order.  (As was reported in the past.)

For UDP, there is always the possibility of loss or re-ordering of
datagrams, one of the reasons for switching to TCP in NFSv3 (and
eliminating UDP in NFSv4).

Threads can still block in apparently random order, because of
timing variances inside FSAL calls.  Should not be an issue here.


but if we assume xids retire in xid order also, 


They do.  Should be no variance.  Eliminating the dupreq caching --
also using the rbtree -- significantly improved the timing.

Apparently picked the worst tree choice for this data, according to
computer science.  If all you have is a hammer



and keep
a window of 1 records in-tree, that seems maybe like a reasonable
starting point for measuring this?
I've not tried 10,000 or 100,000 recently.  (The original code

default sent 100,000.)

I've not recorded how many remain in-tree during the run.

In my measurements, using the new CLNT_CALL_BACK(), the client thread
starts sending a stream of pings.  In every case, it peaks at a
relatively stable rate.

For 1,000, <4,000/s.  For 100, 40,000/s.  Fairly linear relationship.

By running multiple threads, I showed that each individual thread ran
roughly the same (on average).  But there is some variance per run.

I only posted the 5 thread results, lowest and highest achieved.

My original message had up to 200 threads and 4 results, but I decided
such a long series was overkill, so removed them before sending.

That 4,000 and 40,000 per client thread was stable across all runs.



I wrote a gtest program (gerrit) that I think does the above in a
single thread, no locks, for 1M cycles (search, remove, insert).  On
lemon, compiled at O2, the gtest profiling says the test finishes in
less than 150ms (I saw as low as 124).  That's over 6M cycles/s, I
think.


What have you compared it to?  Need a gtest of avl and tailq with the
same data.  That's what the papers I looked at do
