Re: [kernel-hardening] Re: [PATCH v4] scripts: add leaking_addresses.pl

2017-11-12 Thread Kaiwan N Billimoria
On Mon, Nov 13, 2017 at 10:05 AM, Tobin C. Harding  wrote:
> On Mon, Nov 13, 2017 at 06:37:28AM +0300, Kirill A. Shutemov wrote:
>> On Mon, Nov 13, 2017 at 10:06:46AM +1100, Tobin C. Harding wrote:
>> > On Sun, Nov 12, 2017 at 02:10:07AM +0300, Kirill A. Shutemov wrote:
...
>> >
>> > Thanks for the link. So it looks like we need to refactor the kernel
>> > address regular expression into a function that takes into account the
>> > machine architecture and the number of page table levels. We will need
>> > to add this to the false positive checks also.
>> >
>> > > Not sure if we care. It won't work too for other 64-bit architectrues 
>> > > that
>> > > have more than 256TB of virtual address space.
>> >
>> > Is this because of the virtual memory map?
>>
>> On x86 direct mapping is the nearest thing we have to userspace.
>>
>> > Did you mean 512TB?
>>
>> No, I mean 256TB.
>>
>> You have all kernel memory in the range from 0x to
>> 0x if you have 256 TB of virtual address space. If you
>> hvae more, some thing might be ouside the range.
>
> Doesn't 4-level paging already limit a system to 64TB of memory? So any
> system better equipped than this will use 5-level paging right? If I am
> totally talking rubbish please ignore, I'm appreciative that you pointed
> out the limitation already. Perhaps we can add a comment to the script
>
> # Script may miss some addresses on machines with more than 256TB of
> # memory.

I think the 256TB is wrt *virtual* address space not physical RAM.

Also, IMHO, the script should 'transparently' take into account the # of paging
levels (instead of the user needing to pass a parameter).
IOW it should be able to detect the same (say, from the .config file) and act
accordingly - in the sense, the regex's and associated logic would accordingly
differ.


Re: [PATCH v4] scripts: add leaking_addresses.pl

2017-11-12 Thread Tobin C. Harding
On Mon, Nov 13, 2017 at 06:37:28AM +0300, Kirill A. Shutemov wrote:
> On Mon, Nov 13, 2017 at 10:06:46AM +1100, Tobin C. Harding wrote:
> > On Sun, Nov 12, 2017 at 02:10:07AM +0300, Kirill A. Shutemov wrote:
> > > On Tue, Nov 07, 2017 at 09:32:11PM +1100, Tobin C. Harding wrote:
> > > > Currently we are leaking addresses from the kernel to user space. This
> > > > script is an attempt to find some of those leakages. Script parses
> > > > `dmesg` output and /proc and /sys files for hex strings that look like
> > > > kernel addresses.
> > > > 
> > > > Only works for 64 bit kernels, the reason being that kernel addresses
> > > > on 64 bit kernels have '' as the leading bit pattern making greping
> > > > possible. On 32 kernels we don't have this luxury.
> > > 
> > > Well, it's not going to work as well as intented on x86 machine with
> > > 5-level paging. Kernel address space there starts at 0xff10.
> > > It will still catch pointers to kernel/modules text, but the rest is
> > > outside of 0x... space. See Documentation/x86/x86_64/mm.txt.
> > 
> > Thanks for the link. So it looks like we need to refactor the kernel
> > address regular expression into a function that takes into account the
> > machine architecture and the number of page table levels. We will need
> > to add this to the false positive checks also.
> > 
> > > Not sure if we care. It won't work too for other 64-bit architectrues that
> > > have more than 256TB of virtual address space.
> > 
> > Is this because of the virtual memory map?
> 
> On x86 direct mapping is the nearest thing we have to userspace.
> 
> > Did you mean 512TB?
> 
> No, I mean 256TB.
> 
> You have all kernel memory in the range from 0x to
> 0x if you have 256 TB of virtual address space. If you
> hvae more, some thing might be ouside the range.

Doesn't 4-level paging already limit a system to 64TB of memory? So any
system better equipped than this will use 5-level paging right? If I am
totally talking rubbish please ignore, I'm appreciative that you pointed
out the limitation already. Perhaps we can add a comment to the script

# Script may miss some addresses on machines with more than 256TB of
# memory.

thanks,
Tobin.


Re: [PATCH v4] scripts: add leaking_addresses.pl

2017-11-12 Thread Kirill A. Shutemov
On Mon, Nov 13, 2017 at 10:06:46AM +1100, Tobin C. Harding wrote:
> On Sun, Nov 12, 2017 at 02:10:07AM +0300, Kirill A. Shutemov wrote:
> > On Tue, Nov 07, 2017 at 09:32:11PM +1100, Tobin C. Harding wrote:
> > > Currently we are leaking addresses from the kernel to user space. This
> > > script is an attempt to find some of those leakages. Script parses
> > > `dmesg` output and /proc and /sys files for hex strings that look like
> > > kernel addresses.
> > > 
> > > Only works for 64 bit kernels, the reason being that kernel addresses
> > > on 64 bit kernels have '' as the leading bit pattern making greping
> > > possible. On 32 kernels we don't have this luxury.
> > 
> > Well, it's not going to work as well as intented on x86 machine with
> > 5-level paging. Kernel address space there starts at 0xff10.
> > It will still catch pointers to kernel/modules text, but the rest is
> > outside of 0x... space. See Documentation/x86/x86_64/mm.txt.
> 
> Thanks for the link. So it looks like we need to refactor the kernel
> address regular expression into a function that takes into account the
> machine architecture and the number of page table levels. We will need
> to add this to the false positive checks also.
> 
> > Not sure if we care. It won't work too for other 64-bit architectrues that
> > have more than 256TB of virtual address space.
> 
> Is this because of the virtual memory map?

On x86 direct mapping is the nearest thing we have to userspace.

> Did you mean 512TB?

No, I mean 256TB.

You have all kernel memory in the range from 0x to
0x if you have 256 TB of virtual address space. If you
hvae more, some thing might be ouside the range.

-- 
 Kirill A. Shutemov


Re: [PATCH v4] scripts: add leaking_addresses.pl

2017-11-12 Thread Tobin C. Harding
On Sun, Nov 12, 2017 at 02:10:07AM +0300, Kirill A. Shutemov wrote:
> On Tue, Nov 07, 2017 at 09:32:11PM +1100, Tobin C. Harding wrote:
> > Currently we are leaking addresses from the kernel to user space. This
> > script is an attempt to find some of those leakages. Script parses
> > `dmesg` output and /proc and /sys files for hex strings that look like
> > kernel addresses.
> > 
> > Only works for 64 bit kernels, the reason being that kernel addresses
> > on 64 bit kernels have '' as the leading bit pattern making greping
> > possible. On 32 kernels we don't have this luxury.
> 
> Well, it's not going to work as well as intented on x86 machine with
> 5-level paging. Kernel address space there starts at 0xff10.
> It will still catch pointers to kernel/modules text, but the rest is
> outside of 0x... space. See Documentation/x86/x86_64/mm.txt.

Thanks for the link. So it looks like we need to refactor the kernel
address regular expression into a function that takes into account the
machine architecture and the number of page table levels. We will need
to add this to the false positive checks also.

> Not sure if we care. It won't work too for other 64-bit architectrues that
> have more than 256TB of virtual address space.

Is this because of the virtual memory map? Did you mean 512TB?

from mm.txt:
ffd4 - ffd5 (=49 bits) virtual memory map (512TB)

Perhaps an option (--terse) that only checks the virtual memory map
range (above for 5-level paging) and

ea00 - eaff (=40 bits) virtual memory map (1TB)

for 4-level paging?

> Just wanted to point to the limitation.

Appreciate it, thanks.

Tobin.



Re: [PATCH v4] scripts: add leaking_addresses.pl

2017-11-11 Thread Kirill A. Shutemov
On Tue, Nov 07, 2017 at 09:32:11PM +1100, Tobin C. Harding wrote:
> Currently we are leaking addresses from the kernel to user space. This
> script is an attempt to find some of those leakages. Script parses
> `dmesg` output and /proc and /sys files for hex strings that look like
> kernel addresses.
> 
> Only works for 64 bit kernels, the reason being that kernel addresses
> on 64 bit kernels have '' as the leading bit pattern making greping
> possible. On 32 kernels we don't have this luxury.

Well, it's not going to work as well as intented on x86 machine with
5-level paging. Kernel address space there starts at 0xff10.
It will still catch pointers to kernel/modules text, but the rest is
outside of 0x... space. See Documentation/x86/x86_64/mm.txt.

Not sure if we care. It won't work too for other 64-bit architectrues that
have more than 256TB of virtual address space.

Just wanted to point to the limitation.

-- 
 Kirill A. Shutemov


Re: [PATCH v4] scripts: add leaking_addresses.pl

2017-11-07 Thread Linus Torvalds
On Tue, Nov 7, 2017 at 12:58 PM, Tobin C. Harding  wrote:
>
> Interesting idea. Isn't the same outcome already achieved with
> dmesg_restrict. I appreciate that this does beg the question 'why are we
> scanning dmesg then?'

dmesg_restrict is even more asinine than kptr_restrict.

It's a completely idiotic flag, only useful for distributions that
then also refuse to show system journals to regular users.

And such distributions are garbage, since that also effectively means
that users can't sanely make bug reports etc.

In other words, the whole 'dmesg_restrict' is the _classic_ case of
so-called "security" people who make bad decisions, and play security
theater.

This is exactly the kind of crap that the grsecurity people came up
with, and I'm sorry it was ever back-ported into the mainline kernel,
because it's f*cking retarded.

I often wish that security people used their brains more than they
actually seem to do.

Because a lot of them don't actually seem to ever look at the big
picture, and they do these kinds of security theater garbage patches
that don't actually help anything what-so-ever, but make people say
"that's good security".

And yes, the same would _very_ much be true of anything that just
hides the pointers from users when they read dmesg. It wouldn't be
sufficient to change the kernel, you also would have to change every
single program that implements system logging, and once you did that,
you'd have screwed up system debuggability.

So really, people - start thinking critically about security. That
VERY MUCH also means starting to thinking critically about things that
people _claim_ are a security feature.

   Linus


Re: [PATCH v4] scripts: add leaking_addresses.pl

2017-11-07 Thread Tobin C. Harding
On Tue, Nov 07, 2017 at 01:56:05PM +, David Laight wrote:
> From: Tobin C. Harding
> > Sent: 07 November 2017 10:32
> >
> > Currently we are leaking addresses from the kernel to user space. This
> > script is an attempt to find some of those leakages. Script parses
> > `dmesg` output and /proc and /sys files for hex strings that look like
> > kernel addresses.
> ...
> 
> Maybe the %p that end up in dmesg (via the kernel message buffer) should
> be converted to text in a form that allows the code that reads them to
> substitute alternate text for non-root users?
>
> Then the actual addresses will be available to root (who can probably
> get most by other means) but not to the casual observer.

Interesting idea. Isn't the same outcome already achieved with
dmesg_restrict. I appreciate that this does beg the question 'why are we
scanning dmesg then?'

There has not been much discussion on dmesg_restrict. Is dmesg_restrict
good enough that we needn't bother scanning it?

thanks for your input,
Tobin.


Re: [PATCH v4] scripts: add leaking_addresses.pl

2017-11-07 Thread Tobin C. Harding
On Tue, Nov 07, 2017 at 11:50:27AM +0100, Greg KH wrote:
> On Tue, Nov 07, 2017 at 09:32:11PM +1100, Tobin C. Harding wrote:
> > Currently we are leaking addresses from the kernel to user space. This
> > script is an attempt to find some of those leakages. Script parses
> > `dmesg` output and /proc and /sys files for hex strings that look like
> > kernel addresses.
> > 
> > Only works for 64 bit kernels, the reason being that kernel addresses
> > on 64 bit kernels have '' as the leading bit pattern making greping
> > possible. On 32 kernels we don't have this luxury.
> > 
> > Scripts is _slightly_ smarter than a straight grep, we check for false
> > positives (all 0's or all 1's, and vsyscall start/finish addresses).
> > 
> > Output is saved to file to expedite repeated formatting/viewing of
> > output.
> > 
> > Signed-off-by: Tobin C. Harding 
> > ---
> > 
> > This version outputs a report instead of the raw results by default. 
> > Designing
> > this proved to be non-trivial, the reason being that it is not immediately 
> > clear
> > what constitutes a duplicate entry (similar message, address range, same
> > file?). Also, the aim of the report is to assist users _not_ missing correct
> > results; limiting the output is inherently a trade off between noise and
> > correct, clear results.
> > 
> > Without testing on various real kernels its not clear that this reporting 
> > is any
> > good, my test cases were a bit contrived. Your usage may vary.
> > 
> > It would be super helpful to get some comments from people running this with
> > different set ups.
> > 
> > Please feel free to say 'try harder Tobin, this reporting is shit'.
> > 
> > Thanks, appreciate your time,
> > Tobin.
> > 
> > v4:
> >  - Add `scan` and `format` sub-commands.
> >  - Output report by default.
> >  - Add command line option to send scan results (to me).
> 
> As the script is already in Linus's tree, you might need to send a patch
> on top of that, instead of this one, as this one will not apply anymore.

Your awareness of what is going on never ceases to amaze me Greg, you're
the man.

thanks,
Tobin.



Re: [PATCH v4] scripts: add leaking_addresses.pl

2017-11-07 Thread Tobin C. Harding
On Tue, Nov 07, 2017 at 04:51:29PM +0100, Petr Mladek wrote:
> On Tue 2017-11-07 21:32:11, Tobin C. Harding wrote:
> > Currently we are leaking addresses from the kernel to user space. This
> > script is an attempt to find some of those leakages. Script parses
> > `dmesg` output and /proc and /sys files for hex strings that look like
> > kernel addresses.
> > 
> > Only works for 64 bit kernels, the reason being that kernel addresses
> > on 64 bit kernels have '' as the leading bit pattern making greping
> > possible. On 32 kernels we don't have this luxury.
> > 
> > Scripts is _slightly_ smarter than a straight grep, we check for false
> > positives (all 0's or all 1's, and vsyscall start/finish addresses).
> > 
> > Output is saved to file to expedite repeated formatting/viewing of
> > output.
> > 
> > diff --git a/scripts/leaking_addresses.pl b/scripts/leaking_addresses.pl
> > new file mode 100755
> > index ..282c0cc2bdea
> > --- /dev/null
> > +++ b/scripts/leaking_addresses.pl
> > +sub help
> > +{
> > +   my ($exitcode) = @_;
> > +
> > +   print << "EOM";
> > +Usage: $P COMMAND [OPTIONS]
> > +Version: $V
> > +
> > +Commands:
> > +
> > +   scanScan the kernel (savesg raw results to file and runs `format`).
> > +   format  Parse results file and format output.
> > +
> > +Options:
> > +   -o, --output=  Accepts absolute or relative filename or 
> > directory name.
> 
> IMHO, this is pretty non-standard. I would support only -o file. Then you do
> not need to solve problems with replacing an existing file. The user
> would know exactly what file will be generated.
> 
> 
> > +   --suppress-dmesg Don't show dmesg results.
> 
> The apostrophe breaks highlighting of the rest of the code ;-)
> 
> 
> > +   --squash-by-path Show one result per unique path.
> > +   --rawShow raw results.
> > +   --send-reportSubmit raw results for someone else to worry 
> > about.
> > +   -d, --debug  Display debugging output.
> > +   -h, --help, --versionDisplay this help and exit.
> > +
> > +Scans the running (64 bit) kernel for potential leaking addresses.
> > +}
> 
> This bracket should not be here. The help text is limited
> by "EOM" below.
> 
> 
> > +
> > +EOM
> > +   exit($exitcode);
> > +}
> 
> [...]
> 
> > +sub cache_path
> > +{
> > +my ($paths, $line) = @_;
> > +
> > +my $index = index($line, ':');
> 
> There are paths with the double dot, for example:
> /sys/devices/pci:00/:00:1d.0/usb2/2-1/2-1.6/2-1.6:1.0/input/input4/uevent
> Then the file name is wrongly detected, in my example as "pci"
> 
> It seems that searching for ": " sub-string works rather well.
> I mean using:
> 
>   my $index = index($line, ': ');
> 
> > +my $path = substr($line, 0, $index);
> > +
> > +if (!$paths->{$path}) {
> > +$paths->{$path} = ();
> > +}
> > +push @{$paths->{$path}}, $line;
> 
> It would make sense to use the same trick from cache_filename
> and remove path from the cached text. I mean:
> 
>   $index += 2;# skip ': '
>   push @{$paths->{$path}}, substr($line, $index);
> 
> > +}
> > +
> > +sub cache_filename
> > +{
> > +my ($files, $line) = @_;
> > +
> > +my $index = index($line, ':');
> 
> Same problem with the double dot in the path name.
> The following helped me:
> 
>   my $index = index($line, ': ');
> 
> > +my $path = substr($line, 0, $index);
> > +my $filename = basename($path);
> > +if (!$files->{$filename}) {
> > +$files->{$filename} = ();
> > +}
> > +$index += 2;# skip ': '
> > +push @{$files->{$filename}}, substr($line, $index);
> > +}
> 
> This is what caught my eye when trying the script.

Awesome. Thank you very much. All comments will be addressed for the
next spin.

thanks,
Tobin.


Re: [PATCH v4] scripts: add leaking_addresses.pl

2017-11-07 Thread Petr Mladek
On Tue 2017-11-07 21:32:11, Tobin C. Harding wrote:
> Currently we are leaking addresses from the kernel to user space. This
> script is an attempt to find some of those leakages. Script parses
> `dmesg` output and /proc and /sys files for hex strings that look like
> kernel addresses.
> 
> Only works for 64 bit kernels, the reason being that kernel addresses
> on 64 bit kernels have '' as the leading bit pattern making greping
> possible. On 32 kernels we don't have this luxury.
> 
> Scripts is _slightly_ smarter than a straight grep, we check for false
> positives (all 0's or all 1's, and vsyscall start/finish addresses).
> 
> Output is saved to file to expedite repeated formatting/viewing of
> output.
> 
> diff --git a/scripts/leaking_addresses.pl b/scripts/leaking_addresses.pl
> new file mode 100755
> index ..282c0cc2bdea
> --- /dev/null
> +++ b/scripts/leaking_addresses.pl
> +sub help
> +{
> + my ($exitcode) = @_;
> +
> + print << "EOM";
> +Usage: $P COMMAND [OPTIONS]
> +Version: $V
> +
> +Commands:
> +
> + scanScan the kernel (savesg raw results to file and runs `format`).
> + format  Parse results file and format output.
> +
> +Options:
> + -o, --output=  Accepts absolute or relative filename or 
> directory name.

IMHO, this is pretty non-standard. I would support only -o file. Then you do
not need to solve problems with replacing an existing file. The user
would know exactly what file will be generated.


> + --suppress-dmesg Don't show dmesg results.

The apostrophe breaks highlighting of the rest of the code ;-)


> + --squash-by-path Show one result per unique path.
> + --rawShow raw results.
> + --send-reportSubmit raw results for someone else to worry 
> about.
> + -d, --debug  Display debugging output.
> + -h, --help, --versionDisplay this help and exit.
> +
> +Scans the running (64 bit) kernel for potential leaking addresses.
> +}

This bracket should not be here. The help text is limited
by "EOM" below.


> +
> +EOM
> + exit($exitcode);
> +}

[...]

> +sub cache_path
> +{
> +my ($paths, $line) = @_;
> +
> +my $index = index($line, ':');

There are paths with the double dot, for example:
/sys/devices/pci:00/:00:1d.0/usb2/2-1/2-1.6/2-1.6:1.0/input/input4/uevent
Then the file name is wrongly detected, in my example as "pci"

It seems that searching for ": " sub-string works rather well.
I mean using:

my $index = index($line, ': ');

> +my $path = substr($line, 0, $index);
> +
> +if (!$paths->{$path}) {
> +$paths->{$path} = ();
> +}
> +push @{$paths->{$path}}, $line;

It would make sense to use the same trick from cache_filename
and remove path from the cached text. I mean:

$index += 2;# skip ': '
push @{$paths->{$path}}, substr($line, $index);

> +}
> +
> +sub cache_filename
> +{
> +my ($files, $line) = @_;
> +
> +my $index = index($line, ':');

Same problem with the double dot in the path name.
The following helped me:

my $index = index($line, ': ');

> +my $path = substr($line, 0, $index);
> +my $filename = basename($path);
> +if (!$files->{$filename}) {
> +$files->{$filename} = ();
> +}
> +$index += 2;# skip ': '
> +push @{$files->{$filename}}, substr($line, $index);
> +}

This is what caught my eye when trying the script.

Best Regards,
Petr


RE: [PATCH v4] scripts: add leaking_addresses.pl

2017-11-07 Thread David Laight
From: Tobin C. Harding
> Sent: 07 November 2017 10:32
>
> Currently we are leaking addresses from the kernel to user space. This
> script is an attempt to find some of those leakages. Script parses
> `dmesg` output and /proc and /sys files for hex strings that look like
> kernel addresses.
...

Maybe the %p that end up in dmesg (via the kernel message buffer) should
be converted to text in a form that allows the code that reads them to
substitute alternate text for non-root users?

Then the actual addresses will be available to root (who can probably
get most by other means) but not to the casual observer.

David



Re: [PATCH v4] scripts: add leaking_addresses.pl

2017-11-07 Thread Greg KH
On Tue, Nov 07, 2017 at 09:32:11PM +1100, Tobin C. Harding wrote:
> Currently we are leaking addresses from the kernel to user space. This
> script is an attempt to find some of those leakages. Script parses
> `dmesg` output and /proc and /sys files for hex strings that look like
> kernel addresses.
> 
> Only works for 64 bit kernels, the reason being that kernel addresses
> on 64 bit kernels have '' as the leading bit pattern making greping
> possible. On 32 kernels we don't have this luxury.
> 
> Scripts is _slightly_ smarter than a straight grep, we check for false
> positives (all 0's or all 1's, and vsyscall start/finish addresses).
> 
> Output is saved to file to expedite repeated formatting/viewing of
> output.
> 
> Signed-off-by: Tobin C. Harding 
> ---
> 
> This version outputs a report instead of the raw results by default. Designing
> this proved to be non-trivial, the reason being that it is not immediately 
> clear
> what constitutes a duplicate entry (similar message, address range, same
> file?). Also, the aim of the report is to assist users _not_ missing correct
> results; limiting the output is inherently a trade off between noise and
> correct, clear results.
> 
> Without testing on various real kernels its not clear that this reporting is 
> any
> good, my test cases were a bit contrived. Your usage may vary.
> 
> It would be super helpful to get some comments from people running this with
> different set ups.
> 
> Please feel free to say 'try harder Tobin, this reporting is shit'.
> 
> Thanks, appreciate your time,
> Tobin.
> 
> v4:
>  - Add `scan` and `format` sub-commands.
>  - Output report by default.
>  - Add command line option to send scan results (to me).

As the script is already in Linus's tree, you might need to send a patch
on top of that, instead of this one, as this one will not apply anymore.

thanks,

greg k-h