Re: [HACKERS] idea: custom log_line_prefix components besides application_name

2017-07-12 Thread Chapman Flack
On 07/12/17 08:38, Robert Haas wrote:

> another protocol message.  I feel like the usefulness of this for
> connection pooling software is pretty obvious: it's a lot easier for
> the pooler to disallow a certain protocol message than a certain SQL
> command.

I assume you mean easier than disallowing an SQL command that has to be
disallowed (with all the complexity of of parsing and recognizing all the
forms it could take) or else the client could abuse it—in other words,
the current state of affairs, without a cookie'd SQL command.

Once the comparison is not to the current state, but to a proposed
cookie mechanism for the SQL command, I don't think I see either idea
as strikingly easier or more effective.

But the protocol extension becomes another thing (like SASL channel
binding) that you can add to the server, but you don't really have it
until all of the protocol driver projects catch up. A mechanism in SQL
is ready for everybody as soon as it's there.

> variable can't later be changed from SQL.  So now you don't even need
> the cookie, and the client can't try to guess the cookie.

Again, the trouble of needing a cookie or of supporting a special protocol
message don't seem that different to me, and with, say, a 64-bit cookie
built from a good source of randomness, the risk of a client guessing it
seems negligible.

One other reason I think I'm slow to warm to a protocol extension is
things done that way tend to make second-class citizens of code that
runs in the backend.

For an example, think of how warnings are handled. If client code uses
JDBC, it should be able to call getWarnings() on a ResultSet and find out
what warnings might have been raised. If you move the same code to the
backend, in PL/Java, it still uses JDBC and there's still a getWarnings()
method but it's (currently) useless. elog makes catching errors easy,
but warnings get shipped straight to libpq and out to the client. (For
this example, that's also an encapsulation breach and possible information
leak, the client getting warnings from internal SQL queries by backend
routines).

LISTEN/NOTIFY is another example of a mechanism that's not there for
backend code, because the notification mechanism is purely a message
over pq. A less interesting issue than the warnings, perhaps (once the
code is in the backend, why be notified by a trigger when it could simply
BE a trigger?) ... but it could be a bit surprising to someone accustomed
to having it available for client code.

So, even if I don't this instant have an example of why some backend
code or extension in an arbitrary PL might want to be able to lock down
a particular GUC, I can imagine it might happen, and if the mechanism is
just SQL with a cookie, all the PLs have it for free, but if it is
tied up with the fe/be protocol, it's hard.

-Chap


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idea: custom log_line_prefix components besides application_name

2017-07-12 Thread David Fetter
On Wed, Jul 12, 2017 at 07:38:56AM -0500, Robert Haas wrote:
> On Tue, May 9, 2017 at 9:43 PM, Chapman Flack  wrote:
> > That's where the appident.cookie() function comes in. You just
> > query it once at session establishment and remember the cookie.
> > That allows your code to say:
> >
> > SET SESSION ON BEHALF OF 'joe user' BECAUSE I HAVE :cookie AND I SAY SO;
> >
> > and Mallory can't inject that because he doesn't have :cookie and the
> > appident.cookie() function only succeeds the first time.
> 
> I have for a long time been interested in having a protocol-level
> method for setting the session identity.  That is, instead of sending
> "SET SESSION AUTHORIZATION 'rhaas'" wrapped in a Query message, you'd
> send some new protocol message that we invent that nails down the
> session authorization and doesn't allow it to be changed except by
> another protocol message.  I feel like the usefulness of this for
> connection pooling software is pretty obvious: it's a lot easier for
> the pooler to disallow a certain protocol message than a certain SQL
> command.

Neat idea.

> But maybe we could generalize it a bit, so that it can work for any
> GUC.  For example, let the client send some new SetVariableViaProtocol
> message with two arguments, a GUC name and a value.  The server treats
> this just like a SET command, except that once you've done this, the
> variable can't later be changed from SQL.  So now you don't even need
> the cookie, and the client can't try to guess the cookie.  You just
> tell the server - via this protocol message - to nail down
> session_authorization or application_name or appuser.thunk to some
> value of your choice, and it's invulnerable to SQL injection
> thereafter.

Likely to SQL injection.  I guess that would direct attackers to the
protocol, which I suppose is already a harder target, assuming SSL or
similar.

> Whaddaya think?

I thing I'm not alone in wanting some way to set parameters and hold
them steady.  I confess I'd been thinking more in terms of a DCL for
these, but that makes it a lot less flexible than what you're
proposing in the sense that it's set on connect rather than via GRANT
and REVOKE.

One thing I'm not seeing how to do via your proposal is to hold these
things for local (not localhost) users.  Is there some way to handle
them, too, or would that be over-engineering this, given what a local
user can already accomplish?

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idea: custom log_line_prefix components besides application_name

2017-07-12 Thread Robert Haas
On Tue, May 9, 2017 at 9:43 PM, Chapman Flack  wrote:
> That's where the appident.cookie() function comes in. You just
> query it once at session establishment and remember the cookie.
> That allows your code to say:
>
> SET SESSION ON BEHALF OF 'joe user' BECAUSE I HAVE :cookie AND I SAY SO;
>
> and Mallory can't inject that because he doesn't have :cookie and the
> appident.cookie() function only succeeds the first time.

I have for a long time been interested in having a protocol-level
method for setting the session identity.  That is, instead of sending
"SET SESSION AUTHORIZATION 'rhaas'" wrapped in a Query message, you'd
send some new protocol message that we invent that nails down the
session authorization and doesn't allow it to be changed except by
another protocol message.  I feel like the usefulness of this for
connection pooling software is pretty obvious: it's a lot easier for
the pooler to disallow a certain protocol message than a certain SQL
command.

But maybe we could generalize it a bit, so that it can work for any
GUC.  For example, let the client send some new SetVariableViaProtocol
message with two arguments, a GUC name and a value.  The server treats
this just like a SET command, except that once you've done this, the
variable can't later be changed from SQL.  So now you don't even need
the cookie, and the client can't try to guess the cookie.  You just
tell the server - via this protocol message - to nail down
session_authorization or application_name or appuser.thunk to some
value of your choice, and it's invulnerable to SQL injection
thereafter.

Whaddaya think?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idea: custom log_line_prefix components besides application_name

2017-05-10 Thread Chapman Flack
On 05/10/2017 03:56 AM, Craig Ringer wrote:
> On 10 May 2017 10:44 am, "Chapman Flack"  wrote:
>> On 05/09/17 18:48, Mark Dilger wrote: 
>>> SET SESSION ON BEHALF OF 'joe user'
> 
> No need to do anything they custom and specific. No need for new syntax
> either.
> SET myapp.appuser = 'joe'

We seem to be agreed here ... I was claiming that Mark's new syntax
and simply setting a custom variable are different ways of "spelling"
the same idea, but the custom variable approach is less invasive and
can probably be done all within an extension.

>> The other bit of my proposal was to prevent Mallory from spoofing
>> his appident info by managing to inject some SQL through your app
> 
> If your attacker gets that far you're kind of screwed anyway.

Sure, but part of defense-in-depth still applies then, either to
collect more info on who's screwing you or to throw tables and chairs
to make them climb over.

> But that's where something like 'secure variables' or package variables
> come in. See the mailing list discussion on that topic a couple of months
> ago.

My hasty search for a mailing list discussion didn't find one (got
a link?), but I did find a wiki page; is that what you have in mind?

https://wiki.postgresql.org/wiki/Variable_Design

>> That's where it might be handy to have a
>> way to associate the info with the current thread or current request
>> in a way that doesn't need any support in third party layers in the middle,
>> but can be retrieved by the driver (or a thin wrapper around it, down
>> at the bottom of the stack) and turned into the proper SET commands.
> 
> I don't see how postgres can do anything about this. PgJDBC maybe. ...

We're agreed here too; that's why I described it as a separable,
future-work idea. I was just sketching how it might be possible in
the client-side stack to build on whatever basic function (secure
variables or a GUCs-with-cookie extension) postgres could provide.

Any uptake of that idea would have to happen in PgJDBC *and* in
psycopg2 *and* in DBD::Pg *and* in you-name-it for programming
languages/environments x, y, and z. So it's clearly science fiction.
(Actually, a more practical way to do it might be to write thin
wrappers that implement the DB driver API in each language, that
could be used *in place of* PgJDBC or psycopg2 or DBD::Pg and just
delegate everything to the real driver, except the 'connect'
operation would delegate to the real driver but then issue one
query for the cookie, and there'd be some additional non-standard
API for higher layers to call and set values, to be passed along
via SET commands and the cookie ahead of later queries.)

The design for any such thing would want to be idiomatic and
sensitive to the typical usages in each language or software
stack, so I don't remotely propose to do or even design any
for now. Just wanted to put rough ideas into the collective
consciousness to soak for a while.

> The main part I would like is a generic mech[an]ism to inject the value of a
> GUC into the logs.
> 
> For csvlog, it'd be a list of GUC names, each  a to be emitted as a
> separate field if set, or empty field if unset.
> 
> For normal log, it'd be available in log_line_prefix as something like
> 
> %(myapp.user)g 
> 
> I can see this being plenty useful for all sorts of work, and nicely
> flexible.

Thank you for bringing that back in, since logging was my motive when
I started this thread, and then my last message was all about other
things.

Yes, I think I'd like to see logging extended in exactly those ways
(modulo whatever exact spelling ends up preferred for %(gucname)g).

Looked at that way, that's a logging extension of general utility
so it's also separable from the 'variables to store on-behalf-of
identities' extension proposed here.

... with the caveat that if the variables-to-store-identities idea
were to end up going a different direction than an-extension-with-GUCs
(secure variables or whatever), then I'd want the logging extension
to also provide an escape syntax for logging whatever those are.

-Chap


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idea: custom log_line_prefix components besides application_name

2017-05-10 Thread Craig Ringer
On 10 May 2017 10:44 am, "Chapman Flack"  wrote:

On 05/09/17 18:48, Mark Dilger wrote:

> I don't have any positive expectation that the postgres community will go
> along with any of this, but just from my point of view, the cleaner way to
> do what you are proposing is something like setting a session variable.
>
> In your middle tier java application or whatever, you'd run something like
>
> SET SESSION ON BEHALF OF 'joe user'


No need to do anything they custom and specific. No need for new syntax
either.


SET myapp.appuser = 'joe'

Or use SET LOCAL for xact scoped.


The other bit of my proposal was to prevent Mallory from spoofing
his appident info by managing to inject some SQL through your app


If your attacker gets that far you're kind of screwed anyway.

But that's where something like 'secure variables' or package variables
come in. See the mailing list discussion on that topic a couple of months
ago.



SET SESSION ON BEHALF OF 'joe user' BECAUSE I HAVE :cookie AND I SAY SO;


I do want something similar to this for SET SESSION AUTHORIZATION.

But for most things a secure variable model with a setter function should
work better.



Without any new syntax


Much, much more chance of this.

 with it.

It's those more complex architectures I was thinking of with the client-
side ideas, where your code may be at the top of such a tall stack of
persistence/ORM/whatever layers that you're not sure you can just emit
an arbitrary SET command and have it come out in front of the right query
generated by the lower layers.


Surely in that case you have the same problem with something based on new
syntax?


That's where it might be handy to have a
way to associate the info with the current thread or current request
in a way that doesn't need any support in third party layers in the middle,
but can be retrieved by the driver (or a thin wrapper around it, down
at the bottom of the stack) and turned into the proper SET commands. That's
really a separable, less immediate, future-work idea.


I don't see how postgres can do anything about this. PgJDBC maybe. But
probably not.

The main part I would like is a generic mechs ism to inject the value of a
GUC into the logs.

For csvlog, it'd be a list of GUC names, each  a to be emitted as a
separate field if set, or empty field if unset.

For normal log, it'd be available in log_line_prefix as something like

%(myapp.user)g

... or whatever.

I can see this being plenty useful for all sorts of work, and nicely
flexible.


Re: [HACKERS] idea: custom log_line_prefix components besides application_name

2017-05-09 Thread Chapman Flack
On 05/09/17 18:48, Mark Dilger wrote:

> I don't have any positive expectation that the postgres community will go
> along with any of this, but just from my point of view, the cleaner way to
> do what you are proposing is something like setting a session variable.
> 
> In your middle tier java application or whatever, you'd run something like
> 
> SET SESSION ON BEHALF OF 'joe user'
> 
> And then when you reuse that connection for the next web page you
> would do something like that again:
> 
> SET SESSION ON BEHALF OF "anonymous from 1.2.3.4"

I may have muddled my presentation by starting it with the more-speculative
ideas about building support into client-side drivers. At present, none of
that exists, and in the simple case where the application code has access
to the driver for sending arbitrary SQL, then yes, you just have it send

SET SESSION ON BEHALF OF 'joe user'
or
SET SESSION ON BEHALF OF "anonymous from 1.2.3.4"

or something equivalent but spelled differently, like

SET appident.user TO 'joe user'
or
SET appident.user TO 'anonymous'; SET appident.inet_addr TO '1.2.3.4'

Your spelling is nicer, but requires teaching the parser new syntax
and touching lots of places in PostgreSQL core, probably a non-starter.
In contrast, writing an 'appident' extension that defines some new GUCs
would be trivial and well-encapsulated. (What I'd think ideal would be
an extension that defines _whatever new GUCs make sense for your
application_ so if it would serve your purpose to have appident.locale
you just make it so. I wonder if an extension can define one GUC that
you set in the config file to a list of other GUCs it should also define
within its own namespace, or if that would run afoul of the order
initialization happens in.)

The other bit of my proposal was to prevent Mallory from spoofing
his appident info by managing to inject some SQL through your app
like "21' && sET_/**/cONfiG('appident.user', 'alice', fA/**/lsE)".

That's where the appident.cookie() function comes in. You just
query it once at session establishment and remember the cookie.
That allows your code to say:

SET SESSION ON BEHALF OF 'joe user' BECAUSE I HAVE :cookie AND I SAY SO;

and Mallory can't inject that because he doesn't have :cookie and the
appident.cookie() function only succeeds the first time.

Without any new syntax, that could be spelled either:

SELECT appident.set('user', 'joe user', is_local => false,
  cookie => :cookie)
(where the cookie is simply verified by the SQL-callable function), or
SET appident.user TO :cookie || 'joe user'
(where it's verified by the check hook on the GUC, then stripped off).

That's pretty much the total extent of what I would propose the
extension to PostgreSQL proper would do. Just let you define some
new GUCs with meaning to your application, and not interpret them
or use them for anything, but provide a bit of protection so your
code controls them and arbitrary SQL queries can't.

The more science-fiction, client-side ideas I proposed were just
ruminations on what might be useful to application code sitting on top
of a taller stack of third-party code that might get in the way of just
sending your arbitrary SET command ahead of your query.

> and so forth.  You wouldn't have to worry about threads since this is
> being handled much like SET ROLE only instead of changing the role
> under which your database handle is operating, it only changes an
> opaque value that you can then use for whatever purpose.  I would
> not expect the database permissions logic to use the value in any
> way, but just to preserve it for you to query in logging or from a stored
> procedure or similar.  As long as your jdbc driver or similar does not
> prevent you from running non-standard sql statements, you should
> be able to pass this down the line without any of the third party
> software in the middle messing with it.

It's those more complex architectures I was thinking of with the client-
side ideas, where your code may be at the top of such a tall stack of
persistence/ORM/whatever layers that you're not sure you can just emit
an arbitrary SET command and have it come out in front of the right query
generated by the lower layers. That's where it might be handy to have a
way to associate the info with the current thread or current request
in a way that doesn't need any support in third party layers in the middle,
but can be retrieved by the driver (or a thin wrapper around it, down
at the bottom of the stack) and turned into the proper SET commands. That's
really a separable, less immediate, future-work idea.

> If this feature were implemented, I'd probably use it.  I might also be
> willing to write this with you in the unlikely event that it gets community
> approval.

I might just go ahead and try writing the extension part. Isn't that
the beauty of extensions? Approval? We don't need no stinkin' approval. :)

But critiques and better ideas are never unwelcome. :)

-Chap


-- 
Sent via pgsql-hackers 

Re: [HACKERS] idea: custom log_line_prefix components besides application_name

2017-05-09 Thread Mark Dilger

> On May 9, 2017, at 3:14 PM, Chapman Flack  wrote:
> 
> On 05/09/2017 01:25 PM, Mark Dilger wrote:
> 
>> Consensus, no, but utility, yes.
>> 
>> In three tier architectures there is a general problem that the database
>> role used by the middle tier to connect to the database does not entail
>> information about the user who, such as a visitor to your website, made
>> the request of the middle tier.  Chapman wants this information so he
>> can include it in the logs, but stored procedures that work in support
>> of the middle tier might want it for locale information, etc.  As things
>> currently stand, there is no good way to get this passed all the way down
>> into the database stored procedure that needs it, given that you are
>> typically calling down through third party code that doesn't play along.
> 
> I like this framing.
> 
> Clearly a good part of the story is outside of PostgreSQL proper, and
> has to be written elsewhere. There could be a picture like this:
> 
> middle tier receiving request (webapp?) - knows user/origin info
>   |
>   V
> third-party code (rails? web2py? spring?) - doesn't play along
>   |
>   V
> PQ protocol driver (pg? psycopg2? pgjdbc?) - could offer support
>   .
>   .
>   V
> PostgreSQL (what to do here?)
> 
> 
> What to do on the client side of the . . > can only be suggested and
> would have to be independently implemented by several drivers, but
> I could imagine a driver offering some API to tuck a bit of
> application-specific data into some form of thread-local storage.
> In the picture above, the top layer, where the user/origin info
> is known, would need a small modification to call that driver API
> and provide that info. The request could then be processed on down
> through the third-party layer(s) that don't play along. When
> it reaches the driver, something magic will happen to forward
> the thread-local preserved information on to PostgreSQL along with
> the query.
> 
> That of course isn't enough if the intervening layers that don't
> play along use thread pools, and the request could ultimately
> reach the driver on a different thread. But for the simple case
> it gives an idea.
> 
> As to how the driver then propagates the info to PostgreSQL, seems
> to me it could generate a SET in front of the actual query. Most or
> all of what would be needed in PostgreSQL might be possible in an
> extension, which I could try my hand at writing. Here's the idea:
> 
> The extension would define one or more custom GUCs, with flags /
> check hooks to enforce strict limits on when and how they can be set.
> 
> If the client stack is using a simple connection-per-request
> approach, they could just be PGC_BACKEND, and the client part of
> the picture could just be that the top layer supplies them as
> options= in the conninfo string, which various drivers already
> support.
> 
> But if connections may be pooled and re-used for different identities
> and origins, that isn't enough. So the extension would provide
> a function that can be called once in the session, returning
> a random magic cookie. The driver itself would call this function
> upon connecting, and save the cookie in a per-connection
> private variable. Code above the driver in the stack would have
> no access to it, as the function can't be called a second time,
> and so could not spoof identities just by sending arbitrary SET
> commands. The extension would reject any attempts to set or reset
> those GUCs unless accompanied by the cookie.
> 
> Stored procedures could then look at those GUCs for locale / identity
> / origin information and trust that they haven't been spoofed by
> injected commands.
> 
> If there were such a thing as a log_line_prefix_hook, then such an
> extension could also support my original idea and add some new
> escapes to log the added information. But there doesn't seem to be
> such a hook at present. Or, if there were simply a %{name-of-GUC}
> escape supported in log_line_prefix, nothing more would even be
> needed.
> 
> Does this sound workable?

I don't have any positive expectation that the postgres community will go
along with any of this, but just from my point of view, the cleaner way to
do what you are proposing is something like setting a session variable.

In your middle tier java application or whatever, you'd run something like

SET SESSION ON BEHALF OF 'joe user'

And then when you reuse that connection for the next web page you
would do something like that again:

SET SESSION ON BEHALF OF "anonymous from 1.2.3.4"

and so forth.  You wouldn't have to worry about threads since this is
being handled much like SET ROLE only instead of changing the role
under which your database handle is operating, it only changes an
opaque value that you can then use for whatever purpose.  I would
not expect the database permissions logic to use the value in any
way, but just to preserve it for you to query in logging or from a 

Re: [HACKERS] idea: custom log_line_prefix components besides application_name

2017-05-09 Thread Chapman Flack
On 05/09/2017 01:25 PM, Mark Dilger wrote:

> Consensus, no, but utility, yes.
> 
> In three tier architectures there is a general problem that the database
> role used by the middle tier to connect to the database does not entail
> information about the user who, such as a visitor to your website, made
> the request of the middle tier.  Chapman wants this information so he
> can include it in the logs, but stored procedures that work in support
> of the middle tier might want it for locale information, etc.  As things
> currently stand, there is no good way to get this passed all the way down
> into the database stored procedure that needs it, given that you are
> typically calling down through third party code that doesn't play along.

I like this framing.

Clearly a good part of the story is outside of PostgreSQL proper, and
has to be written elsewhere. There could be a picture like this:

 middle tier receiving request (webapp?) - knows user/origin info
   |
   V
 third-party code (rails? web2py? spring?) - doesn't play along
   |
   V
 PQ protocol driver (pg? psycopg2? pgjdbc?) - could offer support
   .
   .
   V
 PostgreSQL (what to do here?)


What to do on the client side of the . . > can only be suggested and
would have to be independently implemented by several drivers, but
I could imagine a driver offering some API to tuck a bit of
application-specific data into some form of thread-local storage.
In the picture above, the top layer, where the user/origin info
is known, would need a small modification to call that driver API
and provide that info. The request could then be processed on down
through the third-party layer(s) that don't play along. When
it reaches the driver, something magic will happen to forward
the thread-local preserved information on to PostgreSQL along with
the query.

That of course isn't enough if the intervening layers that don't
play along use thread pools, and the request could ultimately
reach the driver on a different thread. But for the simple case
it gives an idea.

As to how the driver then propagates the info to PostgreSQL, seems
to me it could generate a SET in front of the actual query. Most or
all of what would be needed in PostgreSQL might be possible in an
extension, which I could try my hand at writing. Here's the idea:

The extension would define one or more custom GUCs, with flags /
check hooks to enforce strict limits on when and how they can be set.

If the client stack is using a simple connection-per-request
approach, they could just be PGC_BACKEND, and the client part of
the picture could just be that the top layer supplies them as
options= in the conninfo string, which various drivers already
support.

But if connections may be pooled and re-used for different identities
and origins, that isn't enough. So the extension would provide
a function that can be called once in the session, returning
a random magic cookie. The driver itself would call this function
upon connecting, and save the cookie in a per-connection
private variable. Code above the driver in the stack would have
no access to it, as the function can't be called a second time,
and so could not spoof identities just by sending arbitrary SET
commands. The extension would reject any attempts to set or reset
those GUCs unless accompanied by the cookie.

Stored procedures could then look at those GUCs for locale / identity
/ origin information and trust that they haven't been spoofed by
injected commands.

If there were such a thing as a log_line_prefix_hook, then such an
extension could also support my original idea and add some new
escapes to log the added information. But there doesn't seem to be
such a hook at present. Or, if there were simply a %{name-of-GUC}
escape supported in log_line_prefix, nothing more would even be
needed.

Does this sound workable?

-Chap


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idea: custom log_line_prefix components besides application_name

2017-05-09 Thread Tom Lane
David Fetter  writes:
> On Tue, May 09, 2017 at 12:48:01PM -0400, Tom Lane wrote:
>> I don't think that's a problem: while psql will remove "--" and everything
>> following it until newline, it won't remove the newline.  So there's still
>> a token boundary there.

> We may still need to be careful.

> davidfetter@davidfetter=# SELECT 'foo'-- stuff goes here
> 'bar';
>  ?column?
> --
>  foobar
> (1 row)

If you read the SQL spec, you'll find that particular behavior is
required by spec, and would still be required by spec with or
without the comment.  Perhaps it's a trap for unwary SQL programmers,
but psql isn't making it worse.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idea: custom log_line_prefix components besides application_name

2017-05-09 Thread David Fetter
On Tue, May 09, 2017 at 12:48:01PM -0400, Tom Lane wrote:
> David Fetter  writes:
> > On Fri, May 05, 2017 at 02:20:26PM -0400, Robert Haas wrote:
> >> On Thu, May 4, 2017 at 10:59 AM, Chapman Flack  
> >> wrote:
> >>> invalid input syntax for integer: "21' && 1=2)) Uni/**/ON
> >>> SEl/**/eCT 0x646665743166657274,0x646665743266657274,
> >>> 0x646665743366657274 -- "
> 
> >> Now that is choice.  I wonder what specific database system that's
> >> targeting...
> 
> > It could well be targeting some class of pipeline to the database,
> > too, for example one that removes comments and/or un-escapes.
> 
> Yeah.  It's a bit hard to see a database's parser treating "Uni/**/ON"
> as UNION, but if some stack someplace had a keyword check ahead of
> a comment-stripping step, maybe that could do something useful.

Right.

> > It occurs to me that psql's habit of stripping out everything on a
> > line that follows a double dash  might be vulnerable in this way, but
> > I wouldn't see such vulnerabilities as super easy to exploit, as psql
> > isn't usually exposed directly to input from the internet.
> 
> I don't think that's a problem: while psql will remove "--" and everything
> following it until newline, it won't remove the newline.  So there's still
> a token boundary there.  If we tried to strip /*...*/ comments we'd have
> to be more careful.

We may still need to be careful.

davidfetter@davidfetter=# SELECT 'foo'-- stuff goes here
'bar';
 ?column?
--
 foobar
(1 row)

> As far as the actual thread topic goes, I tend to agree with
> Robert's doubt that there's enough utility or consensus for this.

I'm pretty sure we're going to need a logger with more structure than
our default, especially as those logs get machine-parsed, and more
importantly, machine-acted-upon.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idea: custom log_line_prefix components besides application_name

2017-05-09 Thread Mark Dilger

> On May 9, 2017, at 9:48 AM, Tom Lane  wrote:
> 
> David Fetter  writes:
>> On Fri, May 05, 2017 at 02:20:26PM -0400, Robert Haas wrote:
>>> On Thu, May 4, 2017 at 10:59 AM, Chapman Flack  
>>> wrote:
 invalid input syntax for integer: "21' && 1=2)) Uni/**/ON
 SEl/**/eCT 0x646665743166657274,0x646665743266657274,
 0x646665743366657274 -- "
> 
>>> Now that is choice.  I wonder what specific database system that's
>>> targeting...
> 
>> It could well be targeting some class of pipeline to the database,
>> too, for example one that removes comments and/or un-escapes.
> 
> Yeah.  It's a bit hard to see a database's parser treating "Uni/**/ON"
> as UNION, but if some stack someplace had a keyword check ahead of
> a comment-stripping step, maybe that could do something useful.
> 
>> It occurs to me that psql's habit of stripping out everything on a
>> line that follows a double dash  might be vulnerable in this way, but
>> I wouldn't see such vulnerabilities as super easy to exploit, as psql
>> isn't usually exposed directly to input from the internet.
> 
> I don't think that's a problem: while psql will remove "--" and everything
> following it until newline, it won't remove the newline.  So there's still
> a token boundary there.  If we tried to strip /*...*/ comments we'd have
> to be more careful.
> 
> As far as the actual thread topic goes, I tend to agree with Robert's
> doubt that there's enough utility or consensus for this.

Consensus, no, but utility, yes.

In three tier architectures there is a general problem that the database
role used by the middle tier to connect to the database does not entail
information about the user who, such as a visitor to your website, made
the request of the middle tier.  Chapman wants this information so he
can include it in the logs, but stored procedures that work in support
of the middle tier might want it for locale information, etc.  As things
currently stand, there is no good way to get this passed all the way down
into the database stored procedure that needs it, given that you are
typically calling down through third party code that doesn't play along.

I expect proposals to address this have been made and rejected before.
Is there a word or phrase for this that I can use to search the archives
to read up on the issue?

Thanks,

Mark Dilger

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idea: custom log_line_prefix components besides application_name

2017-05-09 Thread Tom Lane
David Fetter  writes:
> On Fri, May 05, 2017 at 02:20:26PM -0400, Robert Haas wrote:
>> On Thu, May 4, 2017 at 10:59 AM, Chapman Flack  wrote:
>>> invalid input syntax for integer: "21' && 1=2)) Uni/**/ON
>>> SEl/**/eCT 0x646665743166657274,0x646665743266657274,
>>> 0x646665743366657274 -- "

>> Now that is choice.  I wonder what specific database system that's
>> targeting...

> It could well be targeting some class of pipeline to the database,
> too, for example one that removes comments and/or un-escapes.

Yeah.  It's a bit hard to see a database's parser treating "Uni/**/ON"
as UNION, but if some stack someplace had a keyword check ahead of
a comment-stripping step, maybe that could do something useful.

> It occurs to me that psql's habit of stripping out everything on a
> line that follows a double dash  might be vulnerable in this way, but
> I wouldn't see such vulnerabilities as super easy to exploit, as psql
> isn't usually exposed directly to input from the internet.

I don't think that's a problem: while psql will remove "--" and everything
following it until newline, it won't remove the newline.  So there's still
a token boundary there.  If we tried to strip /*...*/ comments we'd have
to be more careful.

As far as the actual thread topic goes, I tend to agree with Robert's
doubt that there's enough utility or consensus for this.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idea: custom log_line_prefix components besides application_name

2017-05-09 Thread David Fetter
On Fri, May 05, 2017 at 02:20:26PM -0400, Robert Haas wrote:
> On Thu, May 4, 2017 at 10:59 AM, Chapman Flack  wrote:
> > invalid input syntax for integer: "21' && 1=2)) Uni/**/ON
> > SEl/**/eCT 0x646665743166657274,0x646665743266657274,
> > 0x646665743366657274 -- "
> 
> Now that is choice.  I wonder what specific database system that's
> targeting...

It could well be targeting some class of pipeline to the database,
too, for example one that removes comments and/or un-escapes.

It occurs to me that psql's habit of stripping out everything on a
line that follows a double dash  might be vulnerable in this way, but
I wouldn't see such vulnerabilities as super easy to exploit, as psql
isn't usually exposed directly to input from the internet.

> > I just wonder if anybody thinks web apps, and therefore this
> > scenario, are common enough these days to maybe justify one or two
> > more GUCs with their own log_line_prefix escapes, such as
> > app_client_addr or app_user. Naturally they would only be as
> > reliable as the app setting them, and uninterpreted by PostgreSQL
> > itself, and so functionally no different from the uninterpreted
> > string already available as application_name.  The benefit is
> > perhaps to be clearer than just overloading application_name to
> > carry two or three pieces of information (and perhaps privacy, if
> > you care about app user identities and source IPs showing up in ps
> > titles).
> >
> > Worth considering, or is application_name Good Enough?
> 
> I mean, if there were a list of things that needed to propagated
> that was (1) lengthy and (2) universally agreed, then we'd probably
> want more than one field.  But your list is pretty short, so I guess
> I don't see why you can't just join them together with a punctuation
> mark of your choice and call it good.
> 
> I might be missing something, though.

That there isn't universal agreement probably points to wanting an
ability to place arbitrary fields in the logs, not just a
log_line_prefix.  This would be made a good bit simpler by structuring
logs, by default, in some serialization a little easier to reason
about (and among other things, parse correctly) than CSV.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idea: custom log_line_prefix components besides application_name

2017-05-05 Thread Robert Haas
On Thu, May 4, 2017 at 10:59 AM, Chapman Flack  wrote:
> invalid input syntax for integer: "21' && 1=2)) Uni/**/ON
> SEl/**/eCT 0x646665743166657274,0x646665743266657274,
> 0x646665743366657274 -- "

Now that is choice.  I wonder what specific database system that's targeting...

> I just wonder if anybody thinks web apps, and therefore this
> scenario, are common enough these days to maybe justify one
> or two more GUCs with their own log_line_prefix escapes, such
> as app_client_addr or app_user. Naturally they would only be
> as reliable as the app setting them, and uninterpreted by
> PostgreSQL itself, and so functionally no different from the
> uninterpreted string already available as application_name.
> The benefit is perhaps to be clearer than just overloading
> application_name to carry two or three pieces of information
> (and perhaps privacy, if you care about app user identities and
> source IPs showing up in ps titles).
>
> Worth considering, or is application_name Good Enough?

I mean, if there were a list of things that needed to propagated that
was (1) lengthy and (2) universally agreed, then we'd probably want
more than one field.  But your list is pretty short, so I guess I
don't see why you can't just join them together with a punctuation
mark of your choice and call it good.

I might be missing something, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] idea: custom log_line_prefix components besides application_name

2017-05-04 Thread Chapman Flack
Hi,

At $work I am often entertained by log entries like:

invalid input syntax for integer: "21' && 1=2)) Uni/**/ON
SEl/**/eCT 0x646665743166657274,0x646665743266657274,
0x646665743366657274 -- "

They're entertaining mostly because I know our web guy has heard
of SQL injection and doesn't write stuff where it would work. So
I'm mostly just entertained to see what wacky tactics the lowlifes
come up with.

But suppose I wanted to log more information about the origins
of the attempts. As is often the case with web apps, %r/%h/%u
won't give me anything useful, because %h will always be our
own server hosting the app, and %u is the hardcoded name the app
uses to connect. Obviously, each app itself would need to be tweaked
to pass along what I really care about, the actual remote host and
the user identity within the app.

That could be done today by having the app construct some 64-
character string out of the app name, remote IP, and identity,
and setting that as application_name, and I could log %a.

I just wonder if anybody thinks web apps, and therefore this
scenario, are common enough these days to maybe justify one
or two more GUCs with their own log_line_prefix escapes, such
as app_client_addr or app_user. Naturally they would only be
as reliable as the app setting them, and uninterpreted by
PostgreSQL itself, and so functionally no different from the
uninterpreted string already available as application_name.
The benefit is perhaps to be clearer than just overloading
application_name to carry two or three pieces of information
(and perhaps privacy, if you care about app user identities and
source IPs showing up in ps titles).

Worth considering, or is application_name Good Enough?

-Chap


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers