Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-10-02 Thread Jörg Saßmannshausen
Hi Ade,

thanks for this. I will give it a spin.

So far I have only done a simple ping-pong test, but I have never run an RDMA test.

All the best and thanks!

Jörg

On Tuesday, 2 October 2018, 21:33:09 BST, Ade Fewings wrote:
> Hello from Wales
> 
> Red Hat quoted just a simple ib_write_bw test as indicating the broken state
> of IB RDMA (https://access.redhat.com/solutions/3568891):
> 
> Run a RDMA write bandwidth test. ib_write_bw is provided by the package
> perftest.
> 
> On target node run :
> # ib_write_bw
> 
> On client side run :
> # ib_write_bw 
> 
> The test should fail.
> 
> Hope that helps
> Ade
> 
> 
> 
> 
> 

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-10-02 Thread Ade Fewings
Hello from Wales

Red Hat quoted just a simple ib_write_bw test as indicating the broken state of 
IB RDMA (https://access.redhat.com/solutions/3568891):

Run a RDMA write bandwidth test. ib_write_bw is provided by the package 
perftest.

On target node run :
# ib_write_bw

On client side run :
# ib_write_bw 

The test should fail.

Hope that helps
Ade






Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-10-02 Thread Jörg Saßmannshausen
Dear all,

Is there some kind of quick test to demonstrate whether or not the patch 
causes a problem with RDMA? I have been asked to look into that, but I don't 
really want to use a large cp2k calculation which, I believe, makes use of 
RDMA.

All the best from London

Jörg



Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-10-01 Thread Ryan Novosielski

On 09/10/2018 08:41 PM, Kilian Cavalotti wrote:
> On Mon, Sep 10, 2018 at 4:18 PM Ryan Novosielski wrote:
>> So we’ve learned what, here, that RedHat doesn’t test the RDMA 
>> stack at all?
> 
> Looks like Spectre-like vulns take all precedence, these days, 
> indeed.
> 
> Last I heard, the fix will be in 862.14.1 to be released on the 
> 25th

Confirmed fixed in 862.14.4:

https://access.redhat.com/solutions/3568891

-- 
 
 || \\UTGERS, |--*O*
 ||_// the State  |Ryan Novosielski - novos...@rutgers.edu
 || \\ University | Sr. Technologist - 973/972.0922 ~*~ RBHS Campus
 ||  \\of NJ  | Office of Advanced Res. Comp. - MSB C630, Newark
  `'


Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-09-12 Thread John Hearns via Beowulf
Regarding CentOS, Karanbir Singh is the leader of the project and has
a job at Red Hat:
https://www.linuxfoundation.org/blog/2014/01/centos-project-leader-karanbir-singh-opens-up-on-red-hat-deal/

On Tue, 11 Sep 2018 at 18:03, Peter St. John wrote:
>
> I mean the RH QA that tests RH products isn't the same team as tests (or not) 
> CentOS, but I only know from the wiki that RH has an expanding agreement with 
> CentOS so may be this is all merging. As I said, my buddy doesn't work in 
> this area, and I sure don't. Probably all you guys are more up to date on the 
> merging than either of us.
>
> On Tue, Sep 11, 2018 at 12:50 PM, Peter Kjellström  wrote:
>>
>> On Tue, 11 Sep 2018 12:37:18 -0400
>> "Peter St. John"  wrote:
>>
>> > A friend at RH (who works in a different area) tells me RH does not
>> > themselves test the downstream CentOS.
>> > Peter
>>
>> That isn't surprising is it? But in this case we're talking about them
>> not testing their own product.. :-D
>>
>> /Peter K
>>
>>
>> --
>> Sent from my Android device with K-9 Mail. Please excuse my brevity.
>
>


Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-09-11 Thread Peter St. John
I mean the RH QA team that tests RH products isn't the same team that tests
(or doesn't test) CentOS, but I only know from the wiki that RH has an
expanding agreement with CentOS, so maybe this is all merging. As I said, my
buddy doesn't work in this area, and I sure don't. Probably all you guys are
more up to date on the merging than either of us.

Peter

On Tue, Sep 11, 2018 at 12:50 PM, Peter Kjellström  wrote:

> On Tue, 11 Sep 2018 12:37:18 -0400
> "Peter St. John"  wrote:
>
> > A friend at RH (who works in a different area) tells me RH does not
> > themselves test the downstream CentOS.
> > Peter
>
> That isn't surprising is it? But in this case we're talking about them
> not testing their own product.. :-D
>
> /Peter K
>
>


Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-09-11 Thread Peter Kjellström
On Tue, 11 Sep 2018 12:37:18 -0400
"Peter St. John"  wrote:

> A friend at RH (who works in a different area) tells me RH does not
> themselves test the downstream CentOS.
> Peter

That isn't surprising is it? But in this case we're talking about them
not testing their own product.. :-D

/Peter K

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-09-11 Thread Peter St. John
A friend at RH (who works in a different area) tells me RH does not
themselves test the downstream CentOS.
Peter

On Tue, Sep 11, 2018 at 8:32 AM, Peter Kjellström  wrote:

> On Mon, 10 Sep 2018 23:17:21 +
> Ryan Novosielski  wrote:
> ...
> > So we’ve learned what, here, that RedHat doesn’t test the RDMA stack
> > at all?
>
> This we knew already since last time they completely destroyed it with
> an update (and took >month to fix).
>
> /Peter
>


Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-09-11 Thread Peter Kjellström
On Mon, 10 Sep 2018 23:17:21 +
Ryan Novosielski  wrote:
...
> So we’ve learned what, here, that RedHat doesn’t test the RDMA stack
> at all?

This we knew already, since the last time they completely destroyed it with
an update (and took more than a month to fix).

/Peter

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-09-10 Thread Chris Samuel
On Tuesday, 11 September 2018 10:41:24 AM AEST Kilian Cavalotti wrote:

> Last I heard, the fix will be in 862.14.1 to be released on the 25th

Ah interesting, I wonder if that fix is already in the 3.10.0-933 kernel 
that's meant to be in the RHEL 7.6 beta?

-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC





Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-09-10 Thread Chris Samuel
On Tuesday, 11 September 2018 9:17:21 AM AEST Ryan Novosielski wrote:

> So we’ve learned what, here, that RedHat doesn’t test the RDMA stack at all?

It certainly does seem to be the case.  Other issues I've hit in the past, 
with bugs introduced in the IB stack in 6.x -> 6.y transitions, needed more 
hardware than you could reasonably expect them to have in order to spot the 
bug. This one, by contrast, is a pretty fundamental failure.

-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC





Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-09-10 Thread Kilian Cavalotti
On Mon, Sep 10, 2018 at 4:18 PM Ryan Novosielski  wrote:
> So we’ve learned what, here, that RedHat doesn’t test the RDMA stack at all?

Looks like Spectre-like vulns take all precedence, these days, indeed.

Last I heard, the fix will be in 862.14.1 to be released on the 25th

Cheers,
-- 
Kilian


Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-09-10 Thread Ryan Novosielski
> On Sep 10, 2018, at 18:15, Chris Samuel  wrote:
> 
>> On Tuesday, 11 September 2018 1:25:55 AM AEST Peter St. John wrote:
>> 
>> I had wanted to say that such a bug would be caught by compiling with some
>> reasonable warning level; but I think I was wrong.
> 
> Interesting - looks like it depends on your GCC version, 7.3.0 catches it 
> with -Wall here:
> 
> chris@quad:/tmp$ gcc -Wall test.c -o test
> test.c: In function ‘main’:
> test.c:6:2: warning: this ‘if’ clause does not guard... [-Wmisleading-indentation]
>   if ( test );
>   ^~
> test.c:7:3: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the ‘if’
>    printf ( "hello\n" );
>    ^~
> 
>> So I guess I have to forgive the software engineer who fat-fingered that
>> semicolon. Of course I've done worse.
> 
> Oh yes, same here too!   There but for... and all that. :-)

So we’ve learned what, here, that RedHat doesn’t test the RDMA stack at all?

--

|| \\UTGERS, |---*O*---
||_// the State  | Ryan Novosielski - novos...@rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
||  \\of NJ  | Office of Advanced Research Computing - MSB C630, Newark
 `'



Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-09-10 Thread Peter St. John
Yes, the gcc I used is 5.1; I guess that's how long I've had this laptop :-)
And I like that "does not guard" warning, that sounds useful.

On Mon, Sep 10, 2018 at 6:15 PM, Chris Samuel  wrote:

> On Tuesday, 11 September 2018 1:25:55 AM AEST Peter St. John wrote:
>
> > I had wanted to say that such a bug would be caught by compiling with some
> > reasonable warning level; but I think I was wrong.
>
> Interesting - looks like it depends on your GCC version, 7.3.0 catches it
> with -Wall here:
>
> chris@quad:/tmp$ gcc -Wall test.c -o test
> test.c: In function ‘main’:
> test.c:6:2: warning: this ‘if’ clause does not guard... [-Wmisleading-indentation]
>   if ( test );
>   ^~
> test.c:7:3: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the ‘if’
>    printf ( "hello\n" );
>    ^~
>
> > So I guess I have to forgive the software engineer who fat-fingered that
> > semicolon. Of course I've done worse.
>
> Oh yes, same here too!   There but for... and all that. :-)
>
> All the best,
> Chris
> --
>  Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
>
>
>


Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-09-10 Thread Chris Samuel
On Tuesday, 11 September 2018 1:25:55 AM AEST Peter St. John wrote:

> I had wanted to say that such a bug would be caught by compiling with some
> reasonable warning level; but I think I was wrong.

Interesting - looks like it depends on your GCC version, 7.3.0 catches it with 
-Wall here:

chris@quad:/tmp$ gcc -Wall test.c -o test
test.c: In function ‘main’:
test.c:6:2: warning: this ‘if’ clause does not guard... [-Wmisleading-indentation]
  if ( test );
  ^~
test.c:7:3: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the ‘if’
   printf ( "hello\n" );
   ^~

> So I guess I have to forgive the software engineer who fat-fingered that
> semicolon. Of course I've done worse.

Oh yes, same here too!   There but for... and all that. :-)

All the best,
Chris
-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC





Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-09-10 Thread Peter St. John
I had wanted to say that such a bug would be caught by compiling with some
reasonable warning level; but I think I was wrong.

I compiled
if(1==1);

with some wrapper and got nothing with whatever gcc I have on this laptop,
until
gcc -Wextra

which is more persnickety than -Wall, and just got
mynoop.c: In function 'main':
mynoop.c:4:10: warning: suggest braces around empty body in an 'if' statement [-Wempty-body]
  if(1==1);
  ^

So I guess I have to forgive the software engineer who fat-fingered that
semicolon. Of course I've done worse.

Peter




On Mon, Sep 10, 2018 at 4:22 AM, Chris Samuel  wrote:

> On Friday, 17 August 2018 2:47:37 PM AEST Chris Samuel wrote:
>
> > Just a heads up that the 3.10.0-862.11.6.el7.x86_64 kernel from RHEL/CentOS
> > that was released to address the most recent Intel CPU problem "L1TF" seems
> > to break RDMA (found by a colleague here at Swinburne).
>
> So this CentOS bug has a one line bug fix for this problem!
>
> https://bugs.centos.org/view.php?id=15193
>
> It's a corker - basically it looks like someone typo'd a ; into an if
> statement, the fix is:
>
> -   if (!rdma_is_port_valid_nospec(device, &ah_attr->port_num));
> +   if (!rdma_is_port_valid_nospec(device, &ah_attr->port_num))
>         return -EINVAL;
>
> So it always returns -EINVAL when checking the port as the if becomes a
> noop..
> :-(
>
> Patch attached...
>
> --
>  Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
>


Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-09-10 Thread John Hearns via Beowulf
Linux should have coded the kernel in Python then. Easily caught there.

(Yes. I am making a joke)
On Mon, 10 Sep 2018 at 09:23, Chris Samuel  wrote:
>
> On Friday, 17 August 2018 2:47:37 PM AEST Chris Samuel wrote:
>
> > Just a heads up that the 3.10.0-862.11.6.el7.x86_64 kernel from RHEL/CentOS
> > that was released to address the most recent Intel CPU problem "L1TF" seems
> > to break RDMA (found by a colleague here at Swinburne).
>
> So this CentOS bug has a one line bug fix for this problem!
>
> https://bugs.centos.org/view.php?id=15193
>
> It's a corker - basically it looks like someone typo'd a ; into an if
> statement, the fix is:
>
> -   if (!rdma_is_port_valid_nospec(device, &ah_attr->port_num));
> +   if (!rdma_is_port_valid_nospec(device, &ah_attr->port_num))
>         return -EINVAL;
>
> So it always returns -EINVAL when checking the port as the if becomes a noop..
> :-(
>
> Patch attached...
>
> --
>  Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC


Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-09-10 Thread Chris Samuel
On Friday, 17 August 2018 2:47:37 PM AEST Chris Samuel wrote:

> Just a heads up that the 3.10.0-862.11.6.el7.x86_64 kernel from RHEL/CentOS
> that was released to address the most recent Intel CPU problem "L1TF" seems
> to break RDMA (found by a colleague here at Swinburne).

So this CentOS bug has a one line bug fix for this problem!

https://bugs.centos.org/view.php?id=15193

It's a corker - basically it looks like someone typo'd a ; into an if 
statement, the fix is:

-   if (!rdma_is_port_valid_nospec(device, &ah_attr->port_num));
+   if (!rdma_is_port_valid_nospec(device, &ah_attr->port_num))
        return -EINVAL;

So with the stray semicolon it always returns -EINVAL when checking the port, 
as the if becomes a no-op.. :-(
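
For anyone who wants to see the effect without building a kernel, here is a
minimal user-space sketch of the same pattern. It is not the kernel code:
port_is_valid(), resolve_buggy(), resolve_fixed() and self_check() are
hypothetical stand-ins made up for illustration, and the "ports 1 and 2 are
valid" rule is just an assumption so that there is something to check.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's port check (assumption:
 * ports 1 and 2 exist). This is NOT rdma_is_port_valid_nospec(). */
static bool port_is_valid(unsigned int port)
{
    return port >= 1 && port <= 2;
}

/* Buggy shape: the ';' ends the if statement, so the indented
 * return below is not guarded and runs for every port. */
static int resolve_buggy(unsigned int port)
{
    if (!port_is_valid(port));
        return -EINVAL;

    return 0; /* never reached */
}

/* Fixed shape: without the ';' the return is guarded by the if. */
static int resolve_fixed(unsigned int port)
{
    if (!port_is_valid(port))
        return -EINVAL;

    return 0;
}

/* Small self-check; call it from main() to confirm the behaviour. */
static void self_check(void)
{
    assert(resolve_buggy(1) == -EINVAL); /* valid port still rejected */
    assert(resolve_fixed(1) == 0);       /* valid port accepted */
    assert(resolve_fixed(7) == -EINVAL); /* invalid port rejected */
}
```

In this sketch resolve_buggy() rejects every port, including the valid ones,
which matches the symptom of RDMA failing everywhere. A compiler with
-Wmisleading-indentation or -Wempty-body enabled should warn about the buggy
function, depending on the GCC version and flags.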

Patch attached...

-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC

From 6353587a7efa488a4064f3661cf64bd4d74eaa73 Mon Sep 17 00:00:00 2001
From: Pablo Greco 
Date: Mon, 20 Aug 2018 06:39:55 -0300
Subject: [PATCH] OMG

---
 drivers/infiniband/core/verbs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index debe718..c080eb2 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -1232,7 +1232,7 @@ int ib_resolve_eth_dmac(struct ib_device *device,
 	int   ret = 0;
 	struct ib_global_route *grh;
 
-	if (!rdma_is_port_valid_nospec(device, &ah_attr->port_num));
+	if (!rdma_is_port_valid_nospec(device, &ah_attr->port_num))
 		return -EINVAL;
 
 	if (ah_attr->type != RDMA_AH_ATTR_TYPE_ROCE)
-- 
1.8.3.1



Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-08-23 Thread John Hearns via Beowulf
My bad. The license has been updated now
https://www.theregister.co.uk/2018/08/23/intel_microcode_license/


On Thu, 23 Aug 2018 at 20:11, John Hearns  wrote:

> https://www.theregister.co.uk/2018/08/21/intel_cpu_patch_licence/
>
>
> https://perens.com/2018/08/22/new-intel-microcode-license-restriction-is-not-acceptable/
>
>
>
>>


Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-08-23 Thread John Hearns via Beowulf
https://www.theregister.co.uk/2018/08/21/intel_cpu_patch_licence/

https://perens.com/2018/08/22/new-intel-microcode-license-restriction-is-not-acceptable/



On Tue, 21 Aug 2018 at 16:18, Lux, Jim (337K) wrote:

> On 8/21/18, 1:37 AM, "Beowulf on behalf of Chris Samuel"
> <beowulf-boun...@beowulf.org on behalf of ch...@csamuel.org> wrote:
>
> On Tuesday, 21 August 2018 3:27:59 AM AEST Lux, Jim (337K) wrote:
>
> > I'd find it hard to believe that Intel's CPU designers sat around
> > implementing deliberate flaws ( the Bosch engine controller for VW model).
>
> Not to mention that Spectre variants affected AMD, ARM & IBM (at least).
>
> This publicly NSA funded research ("The Intel 80x86 processor architecture:
> pitfalls for secure systems") from 1995 has an interesting section:
>
> https://ieeexplore.ieee.org/document/398934/
>
> https://pdfs.semanticscholar.org/2209/42809262c17b6631c0f6536c91aaf7756857.pdf
>
> Section 3.10 - Cache and TLB timing channels
>
> which warns (in generalities) about the use of MSRs and the use of
> instruction timing as side channels.
>
> Such vulnerabilities have existed since the early days of computers.  As
> processors and use cases have gotten more complex they're harder to find.
>
> This is why back in "orange book" days there's the whole "system high"
> mode of operation - basically "air gap, you, or things you trust, are the
> only one on the machine"
>
>
> ___
> Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
>


Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-08-21 Thread Lux, Jim (337K)


On 8/21/18, 1:37 AM, "Beowulf on behalf of Chris Samuel" wrote:

> On Tuesday, 21 August 2018 3:27:59 AM AEST Lux, Jim (337K) wrote:
>
> > I'd find it hard to believe that Intel's CPU designers sat around
> > implementing deliberate flaws ( the Bosch engine controller for VW model).
>
> Not to mention that Spectre variants affected AMD, ARM & IBM (at least).
>
> This publicly NSA funded research ("The Intel 80x86 processor architecture:
> pitfalls for secure systems") from 1995 has an interesting section:
>
> https://ieeexplore.ieee.org/document/398934/
>
> https://pdfs.semanticscholar.org/2209/42809262c17b6631c0f6536c91aaf7756857.pdf
>
> Section 3.10 - Cache and TLB timing channels
>
> which warns (in generalities) about the use of MSRs and the use of
> instruction timing as side channels.

Such vulnerabilities have existed since the early days of computers. As
processors and use cases have gotten more complex, they're harder to find.

This is why, back in the "orange book" days, there was the whole "system high"
mode of operation: basically "air gap; you, or things you trust, are the only
ones on the machine".




Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-08-21 Thread Chris Samuel
On Tuesday, 21 August 2018 3:27:59 AM AEST Lux, Jim (337K) wrote:

> I'd find it hard to believe that Intel's CPU designers sat around
> implementing deliberate flaws ( the Bosch engine controller for VW model).

Not to mention that Spectre variants affected AMD, ARM & IBM (at least).

This publicly NSA funded research ("The Intel 80x86 processor architecture: 
pitfalls for secure systems") from 1995 has an interesting section:

https://ieeexplore.ieee.org/document/398934/
https://pdfs.semanticscholar.org/2209/42809262c17b6631c0f6536c91aaf7756857.pdf

Section 3.10 - Cache and TLB timing channels

which warns (in generalities) about the use of MSRs and the use of instruction 
timing as side channels.

All the best,
Chris
-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC





Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-08-20 Thread Lux, Jim (337K)
All complex systems have flaws. It's more a matter of deciding which flaws are 
acceptable and which aren't, which is driven by economic factors for the most 
part - the cost of fixing the flaw (and potentially introducing a new one) vs 
the cost of damage from the flaw.

I'd find it hard to believe that Intel's CPU designers sat around implementing 
deliberate flaws (the Bosch engine controller for VW model).

I'd not find it hard to believe that someone, somewhere raised a speculation 
about a potential flaw, among many others.  That one just didn't happen to get 
resources applied to it, others did.  Picking which ones to attack and spend 
resources on is a difficult question, and often gets answered based on totally 
irrelevant factors. 

That's not negligence - that's just "it is impossible to discover and fix all 
possible bugs"

This is not unusual even in MUCH simpler chips. I have some 8-bit-wide level 
shifters (from 2.5 to 3.3V logic) with an obscure behavior: depending on the 
rate at which the two power supplies come up, they sometimes fail to pass data 
(preventing the system in which they are installed from booting). This happens 
about 1 out of 500 times. The mfr's response is "yeah, we think we can 
duplicate that, but we've moved on to a newer version of that chip, why don't 
you replace the chips with the new ones". This isn't necessarily an issue of 
the chip not performing to the datasheet specs (essentially, the data sheet is 
silent on this).
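
To put the 1-in-500 figure in perspective, here is a minimal Python sketch of 
how quickly such a rare failure becomes likely as the number of power-ups 
grows. It assumes every power-up fails independently with probability 1/500, 
which is a simplification for illustration only; the function name is 
hypothetical.

```python
def prob_at_least_one_failure(power_ups: int, p_fail: float = 1 / 500) -> float:
    """Probability of at least one failed boot in `power_ups` independent tries.

    Assumes each power-up fails independently with probability `p_fail`
    (an illustrative assumption, not a measured property of the part).
    """
    if power_ups < 0:
        raise ValueError("power_ups must be non-negative")
    # Complement of "every single power-up succeeded".
    return 1.0 - (1.0 - p_fail) ** power_ups


if __name__ == "__main__":
    for n in (1, 100, 500, 2000):
        chance = prob_at_least_one_failure(n)
        print(f"{n:5d} power-ups: {chance:.1%} chance of at least one failure")
```

Under that assumption, a single board that is power-cycled 500 times has 
roughly a 63% chance of hitting the problem at least once, which is why a 
"rare" flaw still shows up in the field.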

The Errata and Notes lists for complex parts (like CPUs and large FPGAs) run 
to hundreds of pages, and continuously grow as people find more odd behaviors.


Therefore, you should assume your system has unknown flaws and design your 
software and operational procedures accordingly.


James Lux
Project Manager, SunRISE - Sun Radio Interferometer Space Experiment
Task Manager, DARPA High Frequency Research (DHFR) Space Testbed
Jet Propulsion Laboratory  (Mail Stop 161-213)
4800 Oak Grove Drive
Pasadena CA 91109
(818)354-2075 (office)
(818)395-2714 (cell)

Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-08-19 Thread Jonathan Engwall
Thank you



Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-08-19 Thread Chris Samuel
On Monday, 20 August 2018 6:32:26 AM AEST Jonathan Engwall wrote:

> I am not shocked that my previous message may have been removed.

To clarify: nothing has been removed to my knowledge.  Your email is in the 
list archives.

http://beowulf.org/pipermail/beowulf/2018-August/035219.html

All the best,
Chris (just woken up)
-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC





Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-08-19 Thread Jörg Saßmannshausen
Dear all,

While I accept that no system is 100% secure and bug-free, I am 
beginning to wonder whether the current problems we are having are actually 
design flaws and whether, and that is the more important bit, Intel and other 
vendors knew about them. I am thinking of the famous 'diesel-engine' scandal 
and, continuing this line of thought, of dragging the vendors into the 
limelight and getting them to pay for this. 
I mean, we have to sort out the mess the company made in the first place, and 
have to weigh applying a patch which might decrease the performance of 
our systems (I am doing HPC, hence my InfiniBand question) against security. 
Where will it stop?

Given that the current and previous 'bugs' are clearly design flaws IMHO, what 
are the chances of a lawsuit? Any compensation here should go to Open Source 
projects, in my opinion, which are making software more secure. 

Any comments here?

All the best

Jörg




Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-08-19 Thread Jonathan Engwall
As far as vulnerabilities go, here is a terrible idea:
Write a little login patch that grabs your own email address and uses it
to attempt to login to Facebook without a password 1000 times per 
second. Kill the script after two seconds. You want to read the Facebook 
head first so you can kick all the noise to /dev/null. It is brute force 
based on a query.



Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-08-18 Thread John Hearns via Beowulf
Rather more seriously, this is a topic which is well worth discussing.
What are best practices on patching HPC systems?
Perhaps we need a separate thread here.

I will throw in one thought, which I honestly do not want to see happening.
I recently took a trip to Bletchley Park in the UK. On display there was an
IBM punch card machine and sample punch cards. Back in the day one prepared
a 'job deck' which was collected by an operator in a metal hopper then
wheeled off to the mainframe. You did not ever touch the mainframe. So
effectively an air gapped system. A system like that would in these days
kill productivity.
However, should there be 'virus checking' of executables before they are
run on compute nodes?
One of the advantages lauded for Linux systems is of course that anti-virus
programs are not needed.

Also I should ask - in the jargon of anti-virus, is there a 'signature' for
any of these exploit codes? One would guess that bad actors copy the
example codes already published and use these almost in a cut and paste
fashion. So the signature would be tight loops repeatedly reading or
writing to the same memory locations. Can that be distinguished from
innocent code?
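
To make the question concrete, here is a toy Python sketch of the naive
"tight loop hammering one location" signature. Everything in it is
hypothetical: the function names, the threshold and the hand-made address
traces are assumptions for illustration, and this is not how any real
anti-virus product or published exploit works.

```python
from collections import Counter


def hot_address_ratio(trace):
    """Fraction of accesses in `trace` that hit the single most-used address."""
    if not trace:
        return 0.0
    counts = Counter(trace)
    return counts.most_common(1)[0][1] / len(trace)


def looks_suspicious(trace, threshold=0.5):
    """Naive signature: flag a trace dominated by one memory location."""
    return hot_address_ratio(trace) >= threshold


# Hand-made traces (hypothetical addresses, not captured from real code).
# A probe-style loop that keeps touching one location between other reads.
probe_loop = [0x1000] * 90 + [0x2000 + 64 * i for i in range(10)]
# An innocent reduction that keeps updating one accumulator held in memory.
accumulator_loop = [0x7000] * 90 + [0x9000 + 8 * i for i in range(10)]
# A streaming copy that touches every address exactly once.
streaming_copy = [0x4000 + 8 * i for i in range(100)]

if __name__ == "__main__":
    for name, trace in [("probe_loop", probe_loop),
                        ("accumulator_loop", accumulator_loop),
                        ("streaming_copy", streaming_copy)]:
        print(name, looks_suspicious(trace))
```

In this toy model the probe loop and the innocent accumulator are flagged
alike, while only the streaming copy passes, so a signature this simple
would not separate the two.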












Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-08-18 Thread John Hearns via Beowulf
*To patch, or not to patch, that is the question:*
Whether 'tis nobler in the mind to suffer
The loops and branches of speculative execution,
Or to take arms against a sea of exploits
And by opposing end them. To die—to sleep,
No more; and by a sleep to say we end
The heart-ache and the thousand natural shocks
That HPC is heir to: 'tis a consummation
Devoutly to be wish'd. To die, to sleep



Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-08-18 Thread Chris Samuel
On Sunday, 19 August 2018 5:19:07 AM AEST Jeff Johnson wrote:

> With the spate of security flaws over the past year and the impacts their
> fixes have on performance and functionality it might be worthwhile to just
> run airgapped.

For me none of the HPC systems I've been involved with here in Australia would 
have had that option.  Virtually all have external users and/or reliance on 
external data for some of the work they are used for (and the sysadmins don't 
usually have control over the projects & people who get to use them).

All the best,
Chris
-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC





Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-08-18 Thread Chris Samuel
On Saturday, 18 August 2018 11:55:22 PM AEST Jörg Saßmannshausen wrote:

> So I don't really understand about "Cannot make this public, as the patch
> that caused it was due to embargo'd security fix." issue.

I don't think any of us do, unless there's another fix there that is for an 
undisclosed CVE (which seems unlikely).

-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC





Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-08-18 Thread Joe Landman
FWIW: it looks like this is the CVE that keeps on giving. Yesterday some 
of the mitigations hit, and this morning a new rev of the kernel with a 
single CVE patch came out. Don't know when it might show up in distro 
kernels, but it's already in mine.


We are not done with Spectre/Meltdown vulns by any stretch (no insider 
info, just a hypothesis).




--
Joe Landman
e: joe.land...@gmail.com
t: @hpcjoe
w: https://scalability.org
g: https://github.com/joelandman
l: https://www.linkedin.com/in/joelandman



Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-08-18 Thread Jeff Johnson
With the spate of security flaws over the past year and the impacts their
fixes have on performance and functionality it might be worthwhile to just
run airgapped.


On Thu, Aug 16, 2018 at 22:48 Chris Samuel  wrote:

> Hi all,
>
> Just a heads up that the 3.10.0-862.11.6.el7.x86_64 kernel from RHEL/CentOS
> that was released to address the most recent Intel CPU problem "L1TF" seems
> to break RDMA (found by a colleague here at Swinburne). The discovery came
> about when testing the new kernel on a system running Lustre.
>
> https://jira.whamcloud.com/browse/LU-11257
>
> Stanford have reported it to Red Hat, but the BZ entry is locked due to its
> relationship with L1TF.
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1618452
>
> Hope this helps folks out there..
>
> All the best,
> Chris
> --
>  Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
>
>
>
-- 
--
Jeff Johnson
Co-Founder
Aeon Computing

jeff.john...@aeoncomputing.com
www.aeoncomputing.com
t: 858-412-3810 x1001   f: 858-412-3845
m: 619-204-9061

4170 Morena Boulevard, Suite C - San Diego, CA 92117

High-Performance Computing / Lustre Filesystems / Scale-out Storage


Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-08-18 Thread Jörg Saßmannshausen
Hi Chris,

Unless I am missing something, I read about this in 'Der Spiegel Online' 
on Wednesday:

http://www.spiegel.de/netzwelt/gadgets/foreshadow-neue-angriffsmethode-trifft-intel-chips-und-cloud-dienste-a-1223289.html

and the link was to this page here:

https://foreshadowattack.eu/

So I don't really understand the "Cannot make this public, as the patch that 
caused it was due to embargo'd security fix." issue. The problem is known, and 
I also noticed that Debian issued some Intel microcode patches, which raised 
my awareness of a potential problem again.

Sorry, maybe I am missing something here.

All the best

Jörg




Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-08-18 Thread Chris Samuel

On 18/8/18 8:47 pm, Jörg Saßmannshausen wrote:

> Hi Chris,

Hiya,

> these are bad news if InfiniBand will be affected here as well as
> that is what we need to use for parallel calculations. They make use
> of RDMA and if that has a problem. well, you get the idea I
> guess.

Oh yes, this is why I wanted to bring it to everyone's attention, this
isn't just about Lustre, it's much more widespread.

> Has anybody contacted the vendors like Mellanox or Intel regarding
> this?

As Kilian wrote in the Lustre bug quoting his RHEL bug:

    https://bugzilla.redhat.com/show_bug.cgi?id=1618452

    --- Comment #3 from Don Dutile ---
    Already reported and being actively fixed.

    Cannot make this public, as the patch that caused it was due to
    embargo'd security fix.

    This issue has highest priority for resolution.
    Revert to 3.10.0-862.11.5.el7 in the mean time.

    This bug has been marked as a duplicate of bug 1616346
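
For anyone wanting to check nodes quickly, a minimal Python sketch follows.
It only knows the two kernel versions named in this thread; the constant and
function names are hypothetical, and it says nothing about whether any later
kernel is fixed.

```python
import platform

# Kernel reported in this thread as breaking RDMA, and the one the quoted
# Bugzilla comment says to revert to. Other versions are not covered here.
KNOWN_BAD = "3.10.0-862.11.6.el7"
SUGGESTED_FALLBACK = "3.10.0-862.11.5.el7"


def is_known_bad(release: str) -> bool:
    """True if `release` (as printed by `uname -r`) is the kernel reported broken."""
    return release.startswith(KNOWN_BAD)


if __name__ == "__main__":
    running = platform.release()  # same string as `uname -r`
    if is_known_bad(running):
        print(f"{running}: reported to break RDMA; consider {SUGGESTED_FALLBACK}")
    else:
        print(f"{running}: not the kernel reported in this thread")
```

This is only a version-string comparison; it does not test RDMA itself.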

--
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC


Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-08-18 Thread Jörg Saßmannshausen
Hi Chris,

this is bad news if InfiniBand is affected as well, as that is what 
we need for parallel calculations. They make use of RDMA, and if that 
has a problem... well, you get the idea, I guess.

Has anybody contacted the vendors like Mellanox or Intel regarding this?

All the best

Jörg




Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-08-18 Thread Christopher Samuel

On 18/08/18 17:22, Jörg Saßmannshausen wrote:

> if the problem is RDMA, how about InfiniBand? Will that be broken as
> well?

For RDMA it appears yes, though IPoIB still works for us (ours is OPA
rather than IB, but Kilian reported the same).

All the best,
Chris
--
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC


Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-08-18 Thread Jörg Saßmannshausen
Dear all,

if the problem is RDMA, how about InfiniBand? Will that be broken as well?

All the best

Jörg

On Saturday, 18 August 2018, 13:33:55 BST, Chris Samuel wrote:
> On Saturday, 18 August 2018 12:54:03 AM AEST Kilian Cavalotti wrote:
> > That's true: RH mentioned an "embargo'd security fix" but didn't refer
> > to L1TF explicitly (which I think is not under embargo anymore).
> 
> Agreed, though I'm not sure any of the listed fixes are embargoed now.
> 
> > As the reporter of the issue on the Whamcloud JIRA, I also have to
> > apologize for initially pointing fingers at Lustre, it didn't cross my
> > mind that this kind of whole RDMA stack breakage would have slipped
> > past Red Hat's QA.
> 
> Oh I didn't read that as pointing any fingers at Lustre at all, just that
> the kernel update broke Lustre for you (and for us!).
> 
> All the best,
> Chris



Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-08-17 Thread Chris Samuel
On Saturday, 18 August 2018 12:54:03 AM AEST Kilian Cavalotti wrote:

> That's true: RH mentioned an "embargo'd security fix" but didn't refer
> to L1TF explicitly (which I think is not under embargo anymore).

Agreed, though I'm not sure any of the listed fixes are embargoed now.

> As the reporter of the issue on the Whamcloud JIRA, I also have to
> apologize for initially pointing fingers at Lustre, it didn't cross my
> mind that this kind of whole RDMA stack breakage would have slipped
> past Red Hat's QA.

Oh I didn't read that as pointing any fingers at Lustre at all, just that the 
kernel update broke Lustre for you (and for us!).

All the best,
Chris
-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC





Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-08-17 Thread Jörg Saßmannshausen
Hi all,

I came across the 'foreshadow' problem 2 days ago. 

This is what I got back from my colleagues:

https://access.redhat.com/security/vulnerabilities/L1TF-perf

This is more of a performance investigation, but I thought it might add a
bit more information on the whole problem.

All the best

Jörg

On Friday, 17 August 2018, 07:54:03 BST, Kilian Cavalotti wrote:
> Hi Chris,
> 
> On Thu, Aug 16, 2018 at 10:05 PM, Chris Samuel  wrote:
> > There are 6 CVEs addressed in that update from the look of it, so it might
> > not be the L1TF fix itself that has triggered it.
> > 
> > https://access.redhat.com/errata/RHSA-2018:2384
> 
> That's true: RH mentioned an "embargo'd security fix" but didn't refer
> to L1TF explicitly (which I think is not under embargo anymore).
> 
> As the reporter of the issue on the Whamcloud JIRA, I also have to
> apologize for initially pointing fingers at Lustre, it didn't cross my
> mind that this kind of whole RDMA stack breakage would have slipped
> past Red Hat's QA.
> 
> Cheers,



Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-08-17 Thread Kilian Cavalotti
Hi Chris,

On Thu, Aug 16, 2018 at 10:05 PM, Chris Samuel  wrote:
> There are 6 CVEs addressed in that update from the look of it, so it might not
> be the L1TF fix itself that has triggered it.
>
> https://access.redhat.com/errata/RHSA-2018:2384

That's true: RH mentioned an "embargo'd security fix" but didn't refer
to L1TF explicitly (which I think is not under embargo anymore).

As the reporter of the issue on the Whamcloud JIRA, I also have to
apologize for initially pointing fingers at Lustre, it didn't cross my
mind that this kind of whole RDMA stack breakage would have slipped
past Red Hat's QA.

Cheers,
-- 
Kilian


Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-08-16 Thread Chris Samuel
On Friday, 17 August 2018 2:47:37 PM AEST Chris Samuel wrote:

> Just a heads up that the 3.10.0-862.11.6.el7.x86_64 kernel from RHEL/CentOS
> that was released to address the most recent Intel CPU problem "L1TF" seems
> to break RDMA (found by a colleague here at Swinburne).

There are 6 CVEs addressed in that update from the look of it, so it might not
be the L1TF fix itself that has triggered it.

https://access.redhat.com/errata/RHSA-2018:2384

-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC


