Re: [HACKERS] Re: PROPOSAL: make PostgreSQL sanitizers-friendly (and prevent information disclosure)

2016-09-02 Thread Aleksander Alekseev
> I'll mark this as "returned with feedback". I'd be happy to take a patch 
> that helps to reduce sanitizer complaints, but this seems to need some work.
> 
> Aleksander, how did you run the sanitizer? I tried to build with clang 
> 4.0, with the -fsanitize=memory option, and ran "make 
> installcheck-parallel", but I didn't get any sanitizer errors out of it. 
> I did get some errors, from failing to load "regress.so", though:
> 
> ERROR:  could not load library 
> "/home/heikki/git-sandbox-pgsql/master/src/test/regress/regress.so": 
> /home/heikki/git-sandbox-pgsql/master/src/test/regress/regress.so: 
> undefined symbol: __msan_va_arg_overflow_size_tls
> 
> How did you do it?

It's quite simple actually [1][2]. I've just re-checked on Ubuntu 16.04
and Clang 3.8:

```
sudo apt-get install clang git make flex bison libreadline-dev \
  zlib1g-dev jade
git clone http://git.postgresql.org/git/postgresql.git
cd postgresql
CC=/usr/bin/clang CFLAGS="-fsanitize=memory -fPIE -pie" ./configure
make -j4 -s
MSAN_SYMBOLIZER_PATH=/usr/bin/llvm-symbolizer-3.8 make check
```

Stacktraces are written to src/test/regress/log/initdb.log.

You can add `printf("%d\n", getpid())` and `sleep(1000)` calls somewhere
in main() procedure. It will give you some time to connect using debugger.
IIRC it's what I did.

[1] http://clang.llvm.org/docs/MemorySanitizer.html
[2] https://github.com/google/sanitizers/wiki/MemorySanitizer

-- 
Best regards,
Aleksander Alekseev


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: PROPOSAL: make PostgreSQL sanitizers-friendly (and prevent information disclosure)

2016-09-02 Thread Heikki Linnakangas

On 08/20/2016 10:46 PM, Tom Lane wrote:

Noah Misch  writes:

On Fri, Aug 19, 2016 at 07:22:02PM -0400, Tom Lane wrote:

So maybe we ought to work towards taking those out?



Maybe.  It's a policy question boiling down to our willingness to spend cycles
zeroing memory in order to limit trust in the confidentiality of storage
backing the data directory.  Think of "INSERT INTO t VALUES(my_encrypt('key',
'cleartext'))", subsequent to which bits of the key or cleartext leak to disk
by way of WAL padding bytes.  A reasonable person might expect that not to
happen; GNU Privacy Guard and a few like-minded programs prevent it.  I'm on
the fence regarding whether PostgreSQL should target this level of vigilance.
An RDBMS is mainly a tool for managing persistent data, and PostgreSQL will
never be a superior tool for data that _must not_ persist.  Having said that,
the runtime cost of zeroing memory and the development cost of reviewing the
patches to do so is fairly low.


[ after thinking some more about this... ]

FWIW, I put pretty much zero credence in the proposition that junk left in
padding bytes in WAL or data files represents a meaningful security issue.
An attacker who has access to those files will probably find much more
that is of interest in the non-pad data.  My only interest here is in
making the code sanitizer-clean, which seems like it is useful for
debugging purposes independently of any security arguments.


Yeah, that's how I view these, too.


So to me, it seems like the core of this complaint boils down to "my
sanitizer doesn't understand the valgrind exclusion patterns that have
been created for Postgres".  We can address that to some extent by trying
to reduce the number of valgrind exclusions we need, but it's unlikely to
be practical to get that to zero, and it's not very clear that adding
runtime cycles is a good tradeoff for it either.  So maybe we need to push
back on the assumption that people should expect their sanitizers to
produce zero warnings without having made some effort to adapt the
valgrind rules.


I'll mark this as "returned with feedback". I'd be happy to take a patch 
that helps to reduce sanitizer complaints, but this seems to need some work.


Aleksander, how did you run the sanitizer? I tried to build with clang 
4.0, with the -fsanitize=memory option, and ran "make 
installcheck-parallel", but I didn't get any sanitizer errors out of it. 
I did get some errors, from failing to load "regress.so", though:


ERROR:  could not load library 
"/home/heikki/git-sandbox-pgsql/master/src/test/regress/regress.so": 
/home/heikki/git-sandbox-pgsql/master/src/test/regress/regress.so: 
undefined symbol: __msan_va_arg_overflow_size_tls


How did you do it?

- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: PROPOSAL: make PostgreSQL sanitizers-friendly (and prevent information disclosure)

2016-08-22 Thread Andres Freund
On 2016-08-22 13:49:47 -0400, Robert Haas wrote:
> On Mon, Aug 22, 2016 at 1:46 PM, Andres Freund  wrote:
> > I don't think the runtime overhead is likely to be all that high - if
> > you look at valgrind.supp the peformancecritical parts basically are:
> > - pgstat_send - the context switching is going to drown out some zeroing
> > - xlog insertions - making the crc computation more predictable would
> >   actually be nice
> > - reorderbuffer serialization - zeroing won't be a material part of the
> >   cost
> >
> > The rest is mostly bootstrap or python related.
> >
> > There might be cases where we *don't* unconditionally do the zeroing -
> > e.g. I'm doubtful about the sinval stuff where we currently only
> > conditionally clear - but the stuff in valgrind.supp seems fine.
> 
> Naturally you'll be wanting to conclusively demonstrate this with
> benchmarks on multiple workloads, platforms, and concurrency levels.
> Right?  :-)

Pah ;)

I do think some micro-benchmarks aiming at the individual costs make
sense, we're only taking about ~three places in the code - don't think
concurrency plays a large role though ;)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: PROPOSAL: make PostgreSQL sanitizers-friendly (and prevent information disclosure)

2016-08-22 Thread Robert Haas
On Mon, Aug 22, 2016 at 1:46 PM, Andres Freund  wrote:
> I don't think the runtime overhead is likely to be all that high - if
> you look at valgrind.supp the peformancecritical parts basically are:
> - pgstat_send - the context switching is going to drown out some zeroing
> - xlog insertions - making the crc computation more predictable would
>   actually be nice
> - reorderbuffer serialization - zeroing won't be a material part of the
>   cost
>
> The rest is mostly bootstrap or python related.
>
> There might be cases where we *don't* unconditionally do the zeroing -
> e.g. I'm doubtful about the sinval stuff where we currently only
> conditionally clear - but the stuff in valgrind.supp seems fine.

Naturally you'll be wanting to conclusively demonstrate this with
benchmarks on multiple workloads, platforms, and concurrency levels.
Right?  :-)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: PROPOSAL: make PostgreSQL sanitizers-friendly (and prevent information disclosure)

2016-08-22 Thread Andres Freund
On 2016-08-22 13:16:34 -0400, Robert Haas wrote:
> On Sat, Aug 20, 2016 at 3:46 PM, Tom Lane  wrote:
> > So to me, it seems like the core of this complaint boils down to "my
> > sanitizer doesn't understand the valgrind exclusion patterns that have
> > been created for Postgres".  We can address that to some extent by trying
> > to reduce the number of valgrind exclusions we need, but it's unlikely to
> > be practical to get that to zero, and it's not very clear that adding
> > runtime cycles is a good tradeoff for it either.  So maybe we need to push
> > back on the assumption that people should expect their sanitizers to
> > produce zero warnings without having made some effort to adapt the
> > valgrind rules.

I don't think the runtime overhead is likely to be all that high - if
you look at valgrind.supp the peformancecritical parts basically are:
- pgstat_send - the context switching is going to drown out some zeroing
- xlog insertions - making the crc computation more predictable would
  actually be nice
- reorderbuffer serialization - zeroing won't be a material part of the
  cost

The rest is mostly bootstrap or python related.

There might be cases where we *don't* unconditionally do the zeroing -
e.g. I'm doubtful about the sinval stuff where we currently only
conditionally clear - but the stuff in valgrind.supp seems fine.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: PROPOSAL: make PostgreSQL sanitizers-friendly (and prevent information disclosure)

2016-08-22 Thread Robert Haas
On Sat, Aug 20, 2016 at 3:46 PM, Tom Lane  wrote:
> Noah Misch  writes:
>> On Fri, Aug 19, 2016 at 07:22:02PM -0400, Tom Lane wrote:
>>> So maybe we ought to work towards taking those out?
>
>> Maybe.  It's a policy question boiling down to our willingness to spend 
>> cycles
>> zeroing memory in order to limit trust in the confidentiality of storage
>> backing the data directory.  Think of "INSERT INTO t VALUES(my_encrypt('key',
>> 'cleartext'))", subsequent to which bits of the key or cleartext leak to disk
>> by way of WAL padding bytes.  A reasonable person might expect that not to
>> happen; GNU Privacy Guard and a few like-minded programs prevent it.  I'm on
>> the fence regarding whether PostgreSQL should target this level of vigilance.
>> An RDBMS is mainly a tool for managing persistent data, and PostgreSQL will
>> never be a superior tool for data that _must not_ persist.  Having said that,
>> the runtime cost of zeroing memory and the development cost of reviewing the
>> patches to do so is fairly low.
>
> [ after thinking some more about this... ]
>
> FWIW, I put pretty much zero credence in the proposition that junk left in
> padding bytes in WAL or data files represents a meaningful security issue.
> An attacker who has access to those files will probably find much more
> that is of interest in the non-pad data.  My only interest here is in
> making the code sanitizer-clean, which seems like it is useful for
> debugging purposes independently of any security arguments.
>
> So to me, it seems like the core of this complaint boils down to "my
> sanitizer doesn't understand the valgrind exclusion patterns that have
> been created for Postgres".  We can address that to some extent by trying
> to reduce the number of valgrind exclusions we need, but it's unlikely to
> be practical to get that to zero, and it's not very clear that adding
> runtime cycles is a good tradeoff for it either.  So maybe we need to push
> back on the assumption that people should expect their sanitizers to
> produce zero warnings without having made some effort to adapt the
> valgrind rules.

One idea is to add protect additional memory-clearing operations with
#ifdef SANITIZER_CLEAN or something of that sort.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: PROPOSAL: make PostgreSQL sanitizers-friendly (and prevent information disclosure)

2016-08-20 Thread Tom Lane
Noah Misch  writes:
> On Fri, Aug 19, 2016 at 07:22:02PM -0400, Tom Lane wrote:
>> So maybe we ought to work towards taking those out?

> Maybe.  It's a policy question boiling down to our willingness to spend cycles
> zeroing memory in order to limit trust in the confidentiality of storage
> backing the data directory.  Think of "INSERT INTO t VALUES(my_encrypt('key',
> 'cleartext'))", subsequent to which bits of the key or cleartext leak to disk
> by way of WAL padding bytes.  A reasonable person might expect that not to
> happen; GNU Privacy Guard and a few like-minded programs prevent it.  I'm on
> the fence regarding whether PostgreSQL should target this level of vigilance.
> An RDBMS is mainly a tool for managing persistent data, and PostgreSQL will
> never be a superior tool for data that _must not_ persist.  Having said that,
> the runtime cost of zeroing memory and the development cost of reviewing the
> patches to do so is fairly low.

[ after thinking some more about this... ]

FWIW, I put pretty much zero credence in the proposition that junk left in
padding bytes in WAL or data files represents a meaningful security issue.
An attacker who has access to those files will probably find much more
that is of interest in the non-pad data.  My only interest here is in
making the code sanitizer-clean, which seems like it is useful for
debugging purposes independently of any security arguments.

So to me, it seems like the core of this complaint boils down to "my
sanitizer doesn't understand the valgrind exclusion patterns that have
been created for Postgres".  We can address that to some extent by trying
to reduce the number of valgrind exclusions we need, but it's unlikely to
be practical to get that to zero, and it's not very clear that adding
runtime cycles is a good tradeoff for it either.  So maybe we need to push
back on the assumption that people should expect their sanitizers to
produce zero warnings without having made some effort to adapt the
valgrind rules.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: PROPOSAL: make PostgreSQL sanitizers-friendly (and prevent information disclosure)

2016-08-19 Thread Tom Lane
Andres Freund  writes:
> On 2016-08-19 17:55:25 -0400, Tom Lane wrote:
>> It'd be useful also to figure out why our existing valgrind testing has
>> not caught this already.  The example you give looks like it surely
>> ought to be replicated well enough in the standard regression tests.

> The valgrind suppression file explicitly hides a bunch of padding
> issues.

So maybe we ought to work towards taking those out?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: PROPOSAL: make PostgreSQL sanitizers-friendly (and prevent information disclosure)

2016-08-19 Thread Andres Freund
On 2016-08-19 17:55:25 -0400, Tom Lane wrote:
> It'd be useful also to figure out why our existing valgrind testing has
> not caught this already.  The example you give looks like it surely
> ought to be replicated well enough in the standard regression tests.

The valgrind suppression file explicitly hides a bunch of padding
issues.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: PROPOSAL: make PostgreSQL sanitizers-friendly (and prevent information disclosure)

2016-08-19 Thread Piotr Stefaniak
On 2016-08-19 23:55, Tom Lane wrote:
> I think you are failing to understand Heikki's point.  There is no way
> we are committing the change depicted above, because (1) it will mask more
> bugs than it fixes; (2) it's an enormously expensive way to fix anything;
> and (3) it will effectively disable valgrind testing for missed
> initializations.

I wasn't disagreeing with Heikki, I was trying to show that while 
Aleksander's patch may be useless and perhaps harmful if committed, it 
is not useless in a larger perspective as it has made people look into 
an issue.

And I did that simply because I have more changes of that kind that may 
end up being deemed as useless for committing, but I want to share them 
with -hackers anyway.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: PROPOSAL: make PostgreSQL sanitizers-friendly (and prevent information disclosure)

2016-08-19 Thread Tom Lane
Piotr Stefaniak  writes:
> On 2016-08-18 20:02, Heikki Linnakangas wrote:
>>> -block = (AllocBlock) malloc(blksize);
>>> +block = (AllocBlock) calloc(1, blksize);

>> I think this goes too far. You're zeroing all palloc'd memory, even if
>> it's going to be passed to palloc0(), and zeroed there. It might even
>> silence legitimate warnings, if there's code somewhere that does
>> palloc(), and accesses some of it before initializing. Plus it's a
>> performance hit.

> I just did a test where I [ this n that ]

I think you are failing to understand Heikki's point.  There is no way
we are committing the change depicted above, because (1) it will mask more
bugs than it fixes; (2) it's an enormously expensive way to fix anything;
and (3) it will effectively disable valgrind testing for missed
initializations.

The right thing to do is find out what downstream bit of code is missing
initializing a block of memory it's working on, and fix that localized
oversight.

It'd be useful also to figure out why our existing valgrind testing has
not caught this already.  The example you give looks like it surely
ought to be replicated well enough in the standard regression tests.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: PROPOSAL: make PostgreSQL sanitizers-friendly (and prevent information disclosure)

2016-08-19 Thread Piotr Stefaniak
On 2016-08-18 20:02, Heikki Linnakangas wrote:
> On 03/22/2016 03:27 PM, Aleksander Alekseev wrote:
>> diff --git a/src/backend/utils/mmgr/aset.c
>> b/src/backend/utils/mmgr/aset.c
>> index d26991e..46ab8a2 100644
>> --- a/src/backend/utils/mmgr/aset.c
>> +++ b/src/backend/utils/mmgr/aset.c
>> @@ -850,7 +850,7 @@ AllocSetAlloc(MemoryContext context, Size size)
>>  blksize <<= 1;
>>
>>  /* Try to allocate it */
>> -block = (AllocBlock) malloc(blksize);
>> +block = (AllocBlock) calloc(1, blksize);
>>
>>  /*
>>   * We could be asking for pretty big blocks here, so cope if
>> malloc

>
> I think this goes too far. You're zeroing all palloc'd memory, even if
> it's going to be passed to palloc0(), and zeroed there. It might even
> silence legitimate warnings, if there's code somewhere that does
> palloc(), and accesses some of it before initializing. Plus it's a
> performance hit.

I just did a test where I
1. memset() that block to 0xAC (aset.c:853)
2. compile and run bin/initdb, then bin/postgres
2. run an SQL file, shut down bin/postgres
3. make a copy of the transaction log file
4. change the memset() to 0x0C, repeat steps 2-3
5. compare the two transaction log files with a combination of 
hexdump(1), cut(1), and diff(1).

At the end of the output I can see:

-0f34 0010  f5ff ac02 000a  
+0f34 0010  f5ff 0c02 000a  

So it looks like the MSan complaint might be a true positive.

The SQL file is just this snippet from bit.sql:
CREATE TABLE varbit_table (a BIT VARYING(16), b BIT VARYING(16));
COPY varbit_table FROM stdin;
X0F X10
X1F X11
X2F X12
X3F X13
X8F X04
X000F   X0010
X0123   X
X2468   X2468
XFA50   X05AF
X1234   XFFF5
\.

I realize given information might be a little bit scarce, but I didn't 
know what else might be interesting to you that you wouldn't be able to 
reproduce.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: PROPOSAL: make PostgreSQL sanitizers-friendly (and prevent information disclosure)

2016-08-19 Thread Andres Freund


On August 19, 2016 2:50:30 AM PDT, Aleksander Alekseev 
 wrote:
>Heikki, Peter, thanks a lot for code review!
>
>> What's going on here? Surely pg_atomic_init_u64() should initialize
>> the value?
>
>It's because of how pg_atomic_exchange_u64_impl is implemented:
>
>```
>while (true)
>{   
>old = ptr->value; /* <-- reading of uninitialized value! */
>if (pg_atomic_compare_exchange_u64_impl(ptr, , xchg_))
>break;
>}
>```
>
>Currently pg_atomic_init_u64 works like this:
>
>pg_atomic_init_u64
>`- pg_atomic_init_u64_impl
>   `- pg_atomic_write_u64_impl
>  `- pg_atomic_exchange_u64_impl
>
>I suspect there is actually no need to make an atomic exchange during
>initialization of an atomic variable. Regular `mov` should be enough
>(IIRC there is no need to do `lock mov` since `mov` is already atomic).
>Anyway I don't feel brave enough right now to mess with atomic
>operations since it involves all sort of portability issues. So I
>removed this change for now.

There's platforms with atomic 8 byte compare exchange, without atomic 8 byte 
regular stores.
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: PROPOSAL: make PostgreSQL sanitizers-friendly (and prevent information disclosure)

2016-08-19 Thread Aleksander Alekseev
Heikki, Peter, thanks a lot for code review!

> What's going on here? Surely pg_atomic_init_u64() should initialize
> the value?

It's because of how pg_atomic_exchange_u64_impl is implemented:

```
while (true)
{   
old = ptr->value; /* <-- reading of uninitialized value! */
if (pg_atomic_compare_exchange_u64_impl(ptr, , xchg_))
break;
}
```

Currently pg_atomic_init_u64 works like this:

pg_atomic_init_u64
`- pg_atomic_init_u64_impl
   `- pg_atomic_write_u64_impl
  `- pg_atomic_exchange_u64_impl

I suspect there is actually no need to make an atomic exchange during
initialization of an atomic variable. Regular `mov` should be enough
(IIRC there is no need to do `lock mov` since `mov` is already atomic).
Anyway I don't feel brave enough right now to mess with atomic
operations since it involves all sort of portability issues. So I
removed this change for now.

> Why does MemorySanitizer complain about that? Calling stat(2) is 
> supposed to fill in all the fields we look at, right?

> In addition to what Heikki wrote, I think the above is not necessary.

It's been my observation that MemorySanitizer often complains on passing
structures with uninitialized padding bytes to system calls. I'm not
sure whether libc really reads/copies structures in these cases or
MemorySanitizer doesn't like the idea in general.

Although it's true that maybe MemorySanitizer it too pedantic in these
cases, in some respect it's right. Since operating system is a black box
from our perspective (not mentioning that there are _many_ OS'es that
change all the time) in general case passing a structure with
uninitialized padding bytes to a system call can in theory cause a
non-repeatable behavior. For instance if there are caching and hash
calculation involved.

Also, as Chapman noted previously [1], according to PostgreSQL
documentation using structures with uninitialized padding bytes should
be avoided anyway.

I strongly believe that benefits of passing all MemorySanitizer checks
(possibility of discovering many complicated bugs automatically in this
case) many times outweighs drawbacks of tiny memset's overhead in these
concrete cases.

> I think this goes too far. You're zeroing all palloc'd memory, even
> if  it's going to be passed to palloc0(), and zeroed there. It might
> even  silence legitimate warnings, if there's code somewhere that
> does  palloc(), and accesses some of it before initializing. Plus
> it's a performance hit.

Completely agree, my bad. I removed these changes.

> Right after this, we have:
> VALGRIND_MAKE_MEM_DEFINED(, sizeof(msg));  
> Do we need both?

Apparently we don't. I removed VALGRIND_MAKE_MEM_DEFINED here since now
there are no uninitialized padding bytes in 


Corrected patch is attached. If you have any other ideas how it could
be improved please let me know!

[1]
https://www.postgresql.org/message-id/56EFF347.20500%40anastigmatix.net

-- 
Best regards,
Aleksander Alekseev
diff --git a/src/backend/access/gist/gistxlog.c b/src/backend/access/gist/gistxlog.c
index 01c7ef7..fb5c026 100644
--- a/src/backend/access/gist/gistxlog.c
+++ b/src/backend/access/gist/gistxlog.c
@@ -337,6 +337,7 @@ gistXLogSplit(bool page_is_leaf,
 	for (ptr = dist; ptr; ptr = ptr->next)
 		npage++;
 
+	memset(, 0, sizeof(xlrec));
 	xlrec.origrlink = origrlink;
 	xlrec.orignsn = orignsn;
 	xlrec.origleaf = page_is_leaf;
diff --git a/src/backend/access/spgist/spgdoinsert.c b/src/backend/access/spgist/spgdoinsert.c
index f090ca5..1275b6c 100644
--- a/src/backend/access/spgist/spgdoinsert.c
+++ b/src/backend/access/spgist/spgdoinsert.c
@@ -204,6 +204,7 @@ addLeafTuple(Relation index, SpGistState *state, SpGistLeafTuple leafTuple,
 {
 	spgxlogAddLeaf xlrec;
 
+	memset(, 0, sizeof(xlrec));
 	xlrec.newPage = isNew;
 	xlrec.storesNulls = isNulls;
 
@@ -448,6 +449,8 @@ moveLeafs(Relation index, SpGistState *state,
 		i = it->nextOffset;
 	}
 
+	memset(, 0, sizeof(xlrec));
+
 	/* Find a leaf page that will hold them */
 	nbuf = SpGistGetBuffer(index, GBUF_LEAF | (isNulls ? GBUF_NULLS : 0),
 		   size, );
@@ -723,6 +726,7 @@ doPickSplit(Relation index, SpGistState *state,
 	newLeafs = (SpGistLeafTuple *) palloc(sizeof(SpGistLeafTuple) * n);
 	leafPageSelect = (uint8 *) palloc(sizeof(uint8) * n);
 
+	memset(, 0, sizeof(xlrec));
 	STORE_STATE(state, xlrec.stateSrc);
 
 	/*
@@ -1501,6 +1505,7 @@ spgAddNodeAction(Relation index, SpGistState *state,
 	newInnerTuple = addNode(state, innerTuple, nodeLabel, nodeN);
 
 	/* Prepare WAL record */
+	memset(, 0, sizeof(xlrec));
 	STORE_STATE(state, xlrec.stateSrc);
 	xlrec.offnum = current->offnum;
 
@@ -1741,6 +1746,7 @@ spgSplitNodeAction(Relation index, SpGistState *state,
 	postfixTuple->allTheSame = innerTuple->allTheSame;
 
 	/* prep data for WAL record */
+	memset(, 0, sizeof(xlrec));
 	xlrec.newPage = false;
 
 	/*
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index f13f9c1..360e8d2 100644
--- 

Re: [HACKERS] Re: PROPOSAL: make PostgreSQL sanitizers-friendly (and prevent information disclosure)

2016-08-18 Thread Peter Eisentraut
On 3/22/16 9:27 AM, Aleksander Alekseev wrote:
> diff --git a/src/backend/libpq/hba.c b/src/backend/libpq/hba.c
> index 28f9fb5..45aa802 100644
> --- a/src/backend/libpq/hba.c
> +++ b/src/backend/libpq/hba.c
> @@ -1008,14 +1008,9 @@ parse_hba_line(List *line, int line_num, char 
> *raw_line)
>   *cidr_slash = '\0';
>  
>   /* Get the IP address either way */
> + memset(, 0, sizeof(hints));
>   hints.ai_flags = AI_NUMERICHOST;
>   hints.ai_family = AF_UNSPEC;
> - hints.ai_socktype = 0;
> - hints.ai_protocol = 0;
> - hints.ai_addrlen = 0;
> - hints.ai_canonname = NULL;
> - hints.ai_addr = NULL;
> - hints.ai_next = NULL;
>  
>   ret = pg_getaddrinfo_all(str, NULL, , 
> _result);
>   if (ret == 0 && gai_result)

In addition to what Heikki wrote, I think the above is not necessary.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: PROPOSAL: make PostgreSQL sanitizers-friendly (and prevent information disclosure)

2016-03-22 Thread Aleksander Alekseev
> > It's possible that memset() would be more convincing.
> 
> +1

OK, here is corrected patch.


-- 
Best regards,
Aleksander Alekseev
http://eax.me/
diff --git a/src/backend/access/gist/gistxlog.c b/src/backend/access/gist/gistxlog.c
index b48e97c..273e0b0 100644
--- a/src/backend/access/gist/gistxlog.c
+++ b/src/backend/access/gist/gistxlog.c
@@ -337,6 +337,7 @@ gistXLogSplit(RelFileNode node, BlockNumber blkno, bool page_is_leaf,
 	for (ptr = dist; ptr; ptr = ptr->next)
 		npage++;
 
+	memset(, 0, sizeof(xlrec));
 	xlrec.origrlink = origrlink;
 	xlrec.orignsn = orignsn;
 	xlrec.origleaf = page_is_leaf;
diff --git a/src/backend/access/spgist/spgdoinsert.c b/src/backend/access/spgist/spgdoinsert.c
index f090ca5..1275b6c 100644
--- a/src/backend/access/spgist/spgdoinsert.c
+++ b/src/backend/access/spgist/spgdoinsert.c
@@ -204,6 +204,7 @@ addLeafTuple(Relation index, SpGistState *state, SpGistLeafTuple leafTuple,
 {
 	spgxlogAddLeaf xlrec;
 
+	memset(, 0, sizeof(xlrec));
 	xlrec.newPage = isNew;
 	xlrec.storesNulls = isNulls;
 
@@ -448,6 +449,8 @@ moveLeafs(Relation index, SpGistState *state,
 		i = it->nextOffset;
 	}
 
+	memset(, 0, sizeof(xlrec));
+
 	/* Find a leaf page that will hold them */
 	nbuf = SpGistGetBuffer(index, GBUF_LEAF | (isNulls ? GBUF_NULLS : 0),
 		   size, );
@@ -723,6 +726,7 @@ doPickSplit(Relation index, SpGistState *state,
 	newLeafs = (SpGistLeafTuple *) palloc(sizeof(SpGistLeafTuple) * n);
 	leafPageSelect = (uint8 *) palloc(sizeof(uint8) * n);
 
+	memset(, 0, sizeof(xlrec));
 	STORE_STATE(state, xlrec.stateSrc);
 
 	/*
@@ -1501,6 +1505,7 @@ spgAddNodeAction(Relation index, SpGistState *state,
 	newInnerTuple = addNode(state, innerTuple, nodeLabel, nodeN);
 
 	/* Prepare WAL record */
+	memset(, 0, sizeof(xlrec));
 	STORE_STATE(state, xlrec.stateSrc);
 	xlrec.offnum = current->offnum;
 
@@ -1741,6 +1746,7 @@ spgSplitNodeAction(Relation index, SpGistState *state,
 	postfixTuple->allTheSame = innerTuple->allTheSame;
 
 	/* prep data for WAL record */
+	memset(, 0, sizeof(xlrec));
 	xlrec.newPage = false;
 
 	/*
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index b119a47..a50a244 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4798,15 +4798,14 @@ BootStrapXLOG(void)
 	 * segment with logid=0 logseg=1. The very first WAL segment, 0/0, is not
 	 * used, so that we can use 0/0 to mean "before any valid WAL segment".
 	 */
+	memset(, 0, sizeof(checkPoint));
 	checkPoint.redo = XLogSegSize + SizeOfXLogLongPHD;
 	checkPoint.ThisTimeLineID = ThisTimeLineID;
 	checkPoint.PrevTimeLineID = ThisTimeLineID;
 	checkPoint.fullPageWrites = fullPageWrites;
-	checkPoint.nextXidEpoch = 0;
 	checkPoint.nextXid = FirstNormalTransactionId;
 	checkPoint.nextOid = FirstBootstrapObjectId;
 	checkPoint.nextMulti = FirstMultiXactId;
-	checkPoint.nextMultiOffset = 0;
 	checkPoint.oldestXid = FirstNormalTransactionId;
 	checkPoint.oldestXidDB = TemplateDbOid;
 	checkPoint.oldestMulti = FirstMultiXactId;
diff --git a/src/backend/commands/tablespace.c b/src/backend/commands/tablespace.c
index 1ff5728..a10c078 100644
--- a/src/backend/commands/tablespace.c
+++ b/src/backend/commands/tablespace.c
@@ -669,6 +669,11 @@ destroy_tablespace_directories(Oid tablespaceoid, bool redo)
 	char	   *subfile;
 	struct stat st;
 
+	/*
+	 * Prevents MemorySanitizer's "use-of-uninitialized-value" warning
+	 */
+	memset(, 0, sizeof(st));
+
 	linkloc_with_version_dir = psprintf("pg_tblspc/%u/%s", tablespaceoid,
 		TABLESPACE_VERSION_DIRECTORY);
 
diff --git a/src/backend/libpq/hba.c b/src/backend/libpq/hba.c
index 28f9fb5..45aa802 100644
--- a/src/backend/libpq/hba.c
+++ b/src/backend/libpq/hba.c
@@ -1008,14 +1008,9 @@ parse_hba_line(List *line, int line_num, char *raw_line)
 *cidr_slash = '\0';
 
 			/* Get the IP address either way */
+			memset(, 0, sizeof(hints));
 			hints.ai_flags = AI_NUMERICHOST;
 			hints.ai_family = AF_UNSPEC;
-			hints.ai_socktype = 0;
-			hints.ai_protocol = 0;
-			hints.ai_addrlen = 0;
-			hints.ai_canonname = NULL;
-			hints.ai_addr = NULL;
-			hints.ai_next = NULL;
 
 			ret = pg_getaddrinfo_all(str, NULL, , _result);
 			if (ret == 0 && gai_result)
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 1467355..ae7394c 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -341,14 +341,10 @@ pgstat_init(void)
 	/*
 	 * Create the UDP socket for sending and receiving statistic messages
 	 */
+	memset(, 0, sizeof(hints));
 	hints.ai_flags = AI_PASSIVE;
 	hints.ai_family = AF_UNSPEC;
 	hints.ai_socktype = SOCK_DGRAM;
-	hints.ai_protocol = 0;
-	hints.ai_addrlen = 0;
-	hints.ai_addr = NULL;
-	hints.ai_canonname = NULL;
-	hints.ai_next = NULL;
 	ret = pg_getaddrinfo_all("localhost", NULL, , );
 	if (ret || !addrs)
 	{
diff --git a/src/backend/utils/cache/inval.c b/src/backend/utils/cache/inval.c
index 924bebb..498e7bd 100644
--- 

Re: [HACKERS] Re: PROPOSAL: make PostgreSQL sanitizers-friendly (and prevent information disclosure)

2016-03-21 Thread Tom Lane
Chapman Flack  writes:
> On 03/21/2016 10:21 AM, Aleksander Alekseev wrote:
>> Well in this case here is a patch that fixes "use of uninitialized
>> value" reports by MemorySanitizer I managed to catch so far.

> I'm new here so someone more experienced would have to weigh in,
> but I would wonder a couple of things:

> a. whether a braced struct assignment is supported in every
>C compiler that PostgreSQL still intends to support

We rely on struct assignment to work already; although I'm not sure
we should expect it to be efficient, so we might not want to use it
in performance-critical places.

> b. whether such a struct assignment is guaranteed to initialize
>padding spaces as well as declared fields (in all supported
>C versions/compilers).

I think this is a valid concern; my recollection is that the C standard
defines struct assignment as "assign each member".

> It's possible that memset() would be more convincing.

+1

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: PROPOSAL: make PostgreSQL sanitizers-friendly (and prevent information disclosure)

2016-03-21 Thread Aleksander Alekseev
> I'm new here so someone more experienced would have to weigh in,
> but I would wonder a couple of things:
> 
> a. whether a braced struct assignment is supported in every
>C compiler that PostgreSQL still intends to support
> 
> b. whether such a struct assignment is guaranteed to initialize
>padding spaces as well as declared fields (in all supported
>C versions/compilers).
> 
> It's possible that memset() would be more convincing.

Frankly I'm not sure regarding all supported C versions/compilers. But
it seems to be a valid ANSI C. Here is a test program:

```
#include 

typedef struct {
  int i;
  char c;
  long l;
  short s;
} MyStruct;

int main()
{
  int i, sum = 0;
  char *c;
  MyStruct s = {0};

  s.i = 11;
  s.c = 22;
  s.l = 33;
  s.s = 44;

  c = (char*)
  for(i = 0; i < sizeof(s); i++) {
sum += *c;
c++;
  }

  printf("Sum: %d\n", sum);

  return 0;
}
```

I compiled it with various versions of GCC and CLang with different
optimization flags:

clang38 -O3 -ansi -g t.c -o t
gcc -O0 -ansi -g t.c -o t

In all cases running a program under debugger shows that structure is
properly initialized:

(gdb) b main
Breakpoint 1 at 0x4007ae: file t.c, line 12.
(gdb) r
Starting program: /usr/home/eax/temp/t 

Breakpoint 1, main () at t.c:12
12   int i, sum = 0;
(gdb) p memset(, 0xEA, sizeof(MyStruct))
$1 = -5376
(gdb) x/24xb 
0x7fffeb00: 0xea 0xea 0xea 0xea 0xea 0xea 0xea 0xea
0x7fffeb08: 0xea 0xea 0xea 0xea 0xea 0xea 0xea 0xea
0x7fffeb10: 0xea 0xea 0xea 0xea 0xea 0xea 0xea 0xea
(gdb) n
14   MyStruct s = {0};
(gdb) 
16   s.i = 11;
(gdb) x/24xb 
0x7fffeb00: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x7fffeb08: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x7fffeb10: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
(gdb) quit

Naturally we could use memset() as well. But I personally find it a bit
less readable. And in theory it doesn't prevent some _very_ "smart" C
compiler from not cleaning the whole structure anyway.

-- 
Best regards,
Aleksander Alekseev
http://eax.me/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: PROPOSAL: make PostgreSQL sanitizers-friendly (and prevent information disclosure)

2016-03-21 Thread Chapman Flack
On 03/21/2016 10:21 AM, Aleksander Alekseev wrote:

> Well in this case here is a patch that fixes "use of uninitialized
> value" reports by MemorySanitizer I managed to catch so far.

I'm new here so someone more experienced would have to weigh in,
but I would wonder a couple of things:

a. whether a braced struct assignment is supported in every
   C compiler that PostgreSQL still intends to support

b. whether such a struct assignment is guaranteed to initialize
   padding spaces as well as declared fields (in all supported
   C versions/compilers).

It's possible that memset() would be more convincing.

-Chap



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: PROPOSAL: make PostgreSQL sanitizers-friendly (and prevent information disclosure)

2016-03-21 Thread Aleksander Alekseev
> Well, the documentation already says to avoid it:
> 
> http://www.postgresql.org/docs/current/static/xfunc-c.html
> 
>Another important point is to avoid leaving any uninitialized
>bits within data type values; for example, take care to zero out
>any alignment padding bytes that might be present in structs.
> 
> so I don't think what you're suggesting would be controversial
> at all; it looks like what you've done is found a(t least one)
> bug where the documented practice wasn't followed, and it's good
> to find any such places.

Well in this case here is a patch that fixes "use of uninitialized
value" reports by MemorySanitizer I managed to catch so far.

-- 
Best regards,
Aleksander Alekseev
http://eax.me/
diff --git a/src/backend/access/gist/gistxlog.c b/src/backend/access/gist/gistxlog.c
index b48e97c..3e81865 100644
--- a/src/backend/access/gist/gistxlog.c
+++ b/src/backend/access/gist/gistxlog.c
@@ -328,7 +328,7 @@ gistXLogSplit(RelFileNode node, BlockNumber blkno, bool page_is_leaf,
 			  BlockNumber origrlink, GistNSN orignsn,
 			  Buffer leftchildbuf, bool markfollowright)
 {
-	gistxlogPageSplit xlrec;
+	gistxlogPageSplit xlrec = {0};
 	SplitedPageLayout *ptr;
 	int			npage = 0;
 	XLogRecPtr	recptr;
diff --git a/src/backend/access/spgist/spgdoinsert.c b/src/backend/access/spgist/spgdoinsert.c
index f090ca5..7bab290 100644
--- a/src/backend/access/spgist/spgdoinsert.c
+++ b/src/backend/access/spgist/spgdoinsert.c
@@ -202,7 +202,7 @@ static void
 addLeafTuple(Relation index, SpGistState *state, SpGistLeafTuple leafTuple,
 		   SPPageDesc *current, SPPageDesc *parent, bool isNulls, bool isNew)
 {
-	spgxlogAddLeaf xlrec;
+	spgxlogAddLeaf xlrec = {0};
 
 	xlrec.newPage = isNew;
 	xlrec.storesNulls = isNulls;
@@ -400,7 +400,7 @@ moveLeafs(Relation index, SpGistState *state,
 	OffsetNumber *toDelete;
 	OffsetNumber *toInsert;
 	BlockNumber nblkno;
-	spgxlogMoveLeafs xlrec;
+	spgxlogMoveLeafs xlrec = {0};
 	char	   *leafdata,
 			   *leafptr;
 
@@ -701,7 +701,7 @@ doPickSplit(Relation index, SpGistState *state,
 	int			currentFreeSpace;
 	int			totalLeafSizes;
 	bool		allTheSame;
-	spgxlogPickSplit xlrec;
+	spgxlogPickSplit xlrec = {0};
 	char	   *leafdata,
 			   *leafptr;
 	SPPageDesc	saveCurrent;
@@ -1492,7 +1492,7 @@ spgAddNodeAction(Relation index, SpGistState *state,
  int nodeN, Datum nodeLabel)
 {
 	SpGistInnerTuple newInnerTuple;
-	spgxlogAddNode xlrec;
+	spgxlogAddNode xlrec = {0};
 
 	/* Should not be applied to nulls */
 	Assert(!SpGistPageStoresNulls(current->page));
@@ -1699,7 +1699,7 @@ spgSplitNodeAction(Relation index, SpGistState *state,
 	BlockNumber postfixBlkno;
 	OffsetNumber postfixOffset;
 	int			i;
-	spgxlogSplitTuple xlrec;
+	spgxlogSplitTuple xlrec = {0};
 	Buffer		newBuffer = InvalidBuffer;
 
 	/* Should not be applied to nulls */
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index b119a47..50d3123 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4755,7 +4755,7 @@ XLOGShmemInit(void)
 void
 BootStrapXLOG(void)
 {
-	CheckPoint	checkPoint;
+	CheckPoint	checkPoint = {0};
 	char	   *buffer;
 	XLogPageHeader page;
 	XLogLongPageHeader longpage;
@@ -4802,11 +4802,9 @@ BootStrapXLOG(void)
 	checkPoint.ThisTimeLineID = ThisTimeLineID;
 	checkPoint.PrevTimeLineID = ThisTimeLineID;
 	checkPoint.fullPageWrites = fullPageWrites;
-	checkPoint.nextXidEpoch = 0;
 	checkPoint.nextXid = FirstNormalTransactionId;
 	checkPoint.nextOid = FirstBootstrapObjectId;
 	checkPoint.nextMulti = FirstMultiXactId;
-	checkPoint.nextMultiOffset = 0;
 	checkPoint.oldestXid = FirstNormalTransactionId;
 	checkPoint.oldestXidDB = TemplateDbOid;
 	checkPoint.oldestMulti = FirstMultiXactId;
diff --git a/src/backend/commands/tablespace.c b/src/backend/commands/tablespace.c
index 1ff5728..e27a18f 100644
--- a/src/backend/commands/tablespace.c
+++ b/src/backend/commands/tablespace.c
@@ -667,7 +667,7 @@ destroy_tablespace_directories(Oid tablespaceoid, bool redo)
 	DIR		   *dirdesc;
 	struct dirent *de;
 	char	   *subfile;
-	struct stat st;
+	struct stat st = {0};
 
 	linkloc_with_version_dir = psprintf("pg_tblspc/%u/%s", tablespaceoid,
 		TABLESPACE_VERSION_DIRECTORY);
diff --git a/src/backend/libpq/hba.c b/src/backend/libpq/hba.c
index 28f9fb5..123f068 100644
--- a/src/backend/libpq/hba.c
+++ b/src/backend/libpq/hba.c
@@ -823,7 +823,7 @@ parse_hba_line(List *line, int line_num, char *raw_line)
 {
 	char	   *str;
 	struct addrinfo *gai_result;
-	struct addrinfo hints;
+	struct addrinfo hints = {0};
 	int			ret;
 	char	   *cidr_slash;
 	char	   *unsupauth;
@@ -1010,12 +1010,6 @@ parse_hba_line(List *line, int line_num, char *raw_line)
 			/* Get the IP address either way */
 			hints.ai_flags = AI_NUMERICHOST;
 			hints.ai_family = AF_UNSPEC;
-			hints.ai_socktype = 0;
-			hints.ai_protocol = 0;
-			hints.ai_addrlen = 0;
-			hints.ai_canonname = NULL;
-			hints.ai_addr = NULL;
-			hints.ai_next = NULL;
 
 			ret =