On Mon, Jun 17, 2019 at 06:26:20PM +0100, Will Deacon wrote:
> On Mon, Jun 17, 2019 at 01:33:19PM +0200, Ard Biesheuvel wrote:
> > On my single core TX2, the comparative performance is as follows
> >
> > Baseline: REFCOUNT_TIMING test using REFCOUNT_FULL (LSE cmpxchg)
> > 191057942484

On Mon, Jun 17, 2019 at 06:26:20PM +0100, Will Deacon wrote:
> On Mon, Jun 17, 2019 at 01:33:19PM +0200, Ard Biesheuvel wrote:
> > On Sun, 16 Jun 2019 at 23:31, Kees Cook wrote:
> > > On Sat, Jun 15, 2019 at 04:18:21PM +0200, Ard Biesheuvel wrote:
> > > > Yes, I am using the same saturation point

On Mon, Jun 17, 2019 at 01:33:19PM +0200, Ard Biesheuvel wrote:
> On Sun, 16 Jun 2019 at 23:31, Kees Cook wrote:
> > On Sat, Jun 15, 2019 at 04:18:21PM +0200, Ard Biesheuvel wrote:
> > > Yes, I am using the same saturation point as x86. In this example, I
> > > am not entirely sure I understand

On Sun, 16 Jun 2019 at 23:31, Kees Cook wrote:
>
> On Sat, Jun 15, 2019 at 04:18:21PM +0200, Ard Biesheuvel wrote:
> > Yes, I am using the same saturation point as x86. In this example, I
> > am not entirely sure I understand why it matters, though: the atomics
> > guarantee that the write by

On Sat, Jun 15, 2019 at 04:18:21PM +0200, Ard Biesheuvel wrote:
> Yes, I am using the same saturation point as x86. In this example, I
> am not entirely sure I understand why it matters, though: the atomics
> guarantee that the write by CPU2 fails if CPU1 changed the value in
> the mean time,

On Sat, 15 Jun 2019 at 15:59, Kees Cook wrote:
>
> On Sat, Jun 15, 2019 at 10:47:19AM +0200, Ard Biesheuvel wrote:
> > remaining question Will had was whether it makes sense to do the
> > condition checks before doing the actual store, to avoid having a time
> > window where the refcount assumes

On Sat, Jun 15, 2019 at 10:47:19AM +0200, Ard Biesheuvel wrote:
> remaining question Will had was whether it makes sense to do the
> condition checks before doing the actual store, to avoid having a time
> window where the refcount assumes its illegal value. Since arm64 does
> not have memory

On Sat, 15 Jun 2019 at 06:21, Kees Cook wrote:
>
> tl;dr: if arm/arm64 can catch overflow, untested dec-to-zero, and
> inc-from-zero, while performing better than existing REFCOUNT_FULL,
> it's a no-brainer to switch. Minimum parity to x86 would be to catch
> overflow and untested dec-to-zero.

tl;dr: if arm/arm64 can catch overflow, untested dec-to-zero, and
inc-from-zero, while performing better than existing REFCOUNT_FULL,
it's a no-brainer to switch. Minimum parity to x86 would be to catch
overflow and untested dec-to-zero. Minimum viable protection would be to
catch overflow. LKDTM

Hi Ard,
On Fri, Jun 14, 2019 at 12:24:54PM +0200, Ard Biesheuvel wrote:
> On Fri, 14 Jun 2019 at 11:58, Will Deacon wrote:
> > On Fri, Jun 14, 2019 at 07:09:26AM +, Jayachandran Chandrasekharan Nair
> > wrote:
> > > x86 added a arch-specific fast refcount implementation - and the commit
> >

On Fri, 14 Jun 2019 at 11:58, Will Deacon wrote:
>
> [+Kees]
>
> On Fri, Jun 14, 2019 at 07:09:26AM +0000, Jayachandran Chandrasekharan Nair
> wrote:
> > On Wed, Jun 12, 2019 at 10:31:53AM +0100, Will Deacon wrote:
> > > On Wed, Jun 12, 2019 at 04:10:20AM +0000, Jayachandran Chandrasekharan
> >

[+Kees]
On Fri, Jun 14, 2019 at 07:09:26AM +0000, Jayachandran Chandrasekharan Nair
wrote:
> On Wed, Jun 12, 2019 at 10:31:53AM +0100, Will Deacon wrote:
> > On Wed, Jun 12, 2019 at 04:10:20AM +0000, Jayachandran Chandrasekharan Nair
> > wrote:
> > > Now that the lockref change is mainline, I

On Wed, Jun 12, 2019 at 10:31:53AM +0100, Will Deacon wrote:
> Hi JC,
>
> On Wed, Jun 12, 2019 at 04:10:20AM +0000, Jayachandran Chandrasekharan Nair
> wrote:
> > On Wed, May 22, 2019 at 05:04:17PM +0100, Will Deacon wrote:
> > > On Sat, May 18, 2019 at 12:00:34PM +0200, Ard Biesheuvel wrote:
>

On 2019/6/12 12:10, Jayachandran Chandrasekharan Nair wrote:
> On Wed, May 22, 2019 at 05:04:17PM +0100, Will Deacon wrote:
>> On Sat, May 18, 2019 at 12:00:34PM +0200, Ard Biesheuvel wrote:
>>> On Sat, 18 May 2019 at 06:25, Jayachandran Chandrasekharan Nair
>>> wrote:
On Mon, May 06,

Hi JC,
On Wed, Jun 12, 2019 at 04:10:20AM +0000, Jayachandran Chandrasekharan Nair
wrote:
> On Wed, May 22, 2019 at 05:04:17PM +0100, Will Deacon wrote:
> > On Sat, May 18, 2019 at 12:00:34PM +0200, Ard Biesheuvel wrote:
> > > On Sat, 18 May 2019 at 06:25, Jayachandran Chandrasekharan Nair
> > >

On Wed, May 22, 2019 at 05:04:17PM +0100, Will Deacon wrote:
> On Sat, May 18, 2019 at 12:00:34PM +0200, Ard Biesheuvel wrote:
> > On Sat, 18 May 2019 at 06:25, Jayachandran Chandrasekharan Nair
> > wrote:
> > >
> > > On Mon, May 06, 2019 at 07:10:40PM +0100, Will Deacon wrote:
> > > > On Mon,

On Sat, May 18, 2019 at 12:00:34PM +0200, Ard Biesheuvel wrote:
> On Sat, 18 May 2019 at 06:25, Jayachandran Chandrasekharan Nair
> wrote:
> >
> > On Mon, May 06, 2019 at 07:10:40PM +0100, Will Deacon wrote:
> > > On Mon, May 06, 2019 at 06:13:12AM +0000, Jayachandran Chandrasekharan
> > > Nair

On Sat, 18 May 2019 at 06:25, Jayachandran Chandrasekharan Nair
wrote:
>
> On Mon, May 06, 2019 at 07:10:40PM +0100, Will Deacon wrote:
> > On Mon, May 06, 2019 at 06:13:12AM +0000, Jayachandran Chandrasekharan Nair
> > wrote:
> > > Perhaps someone from ARM can chime in here how the cas/yield

On Mon, May 06, 2019 at 07:10:40PM +0100, Will Deacon wrote:
> On Mon, May 06, 2019 at 06:13:12AM +0000, Jayachandran Chandrasekharan Nair
> wrote:
> > Perhaps someone from ARM can chime in here how the cas/yield combo
> > is expected to work when there is contention. ThunderX2 does not
> > do

On Mon, May 06, 2019 at 06:13:12AM +0000, Jayachandran Chandrasekharan Nair
wrote:
> Perhaps someone from ARM can chime in here how the cas/yield combo
> is expected to work when there is contention. ThunderX2 does not
> do much with the yield, but I don't expect any ARM implementation
> to treat

On Sun, May 5, 2019 at 11:13 PM Jayachandran Chandrasekharan Nair
wrote:
>
> > It's not normal, and it's not inevitable.
>
> If you look at the code, the CAS failure is followed by a yield
> before retrying the CAS. Yield on arm64 is expected to be a hint
> to release resources so that other

On Fri, May 03, 2019 at 12:40:34PM -0700, Linus Torvalds wrote:
> On Thu, May 2, 2019 at 4:19 PM Jayachandran Chandrasekharan Nair
> wrote:
> >>
> > I don't really see the point you are making about hardware. If you
> > look at the test case, you have about 64 cores doing CAS to the same
> >

On Thu, May 2, 2019 at 4:19 PM Jayachandran Chandrasekharan Nair
wrote:
>>
> I don't really see the point you are making about hardware. If you
> look at the test case, you have about 64 cores doing CAS to the same
> location. At any point one of them will succeed and the other 63 will
> fail -

On Thu, May 02, 2019 at 09:12:18AM -0700, Linus Torvalds wrote:
> On Thu, May 2, 2019 at 1:27 AM Jan Glauber wrote:
> >
> > I'll see how x86 runs the same testcase, I thought that playing
> > cacheline ping-pong is not the optimal use case for any CPU.
>
> Oh, ping-pong is always bad.
>
> But

On Thu, May 2, 2019 at 1:27 AM Jan Glauber wrote:
>
> I'll see how x86 runs the same testcase, I thought that playing
> cacheline ping-pong is not the optimal use case for any CPU.
Oh, ping-pong is always bad.
But from past experience, x86 tends to be able to always do a tight
cmpxchg loop

On Wed, May 01, 2019 at 05:01:40PM +0100, Will Deacon wrote:
> Hi Jan,
>
> [+Peter and Linus, since they enjoy this stuff]
>
> On Mon, Apr 29, 2019 at 02:52:11PM +0000, Jan Glauber wrote:
> > I've been looking into performance issues that were reported for several
> > test-cases, for instance an

On Wed, May 01, 2019 at 09:41:08AM -0700, Linus Torvalds wrote:
> On Mon, Apr 29, 2019 at 7:52 AM Jan Glauber wrote:
> >
> > It turned out the issue we have on ThunderX2 is the file open-close sequence
> > with small read sizes. If the used files are opened read-only the
> > lockref code (enabled

On Mon, Apr 29, 2019 at 7:52 AM Jan Glauber wrote:
>
> It turned out the issue we have on ThunderX2 is the file open-close sequence
> with small read sizes. If the used files are opened read-only the
> lockref code (enabled by ARCH_USE_CMPXCHG_LOCKREF) is used.
>
> The lockref CMPXCHG_LOOP uses

Hi Jan,
[+Peter and Linus, since they enjoy this stuff]
On Mon, Apr 29, 2019 at 02:52:11PM +0000, Jan Glauber wrote:
> I've been looking into performance issues that were reported for several
> test-cases, for instance an nginx benchmark.
Could you share enough specifics here so that we can

Hi Catalin & Will,
I've been looking into performance issues that were reported for several
test-cases, for instance an nginx benchmark.
It turned out the issue we have on ThunderX2 is the file open-close sequence
with small read sizes. If the used files are opened read-only the
lockref code