Re: [HACKERS] Column Redaction
On 16 October 2014 01:29, Claudio Freire wrote:

> But in any case, if the deterrence isn't enough, and you get attacked,
> anything involving redaction as fleshed out in the OP is good for
> nothing. The damage has been done already. The feature doesn't
> meaningfully slow down extraction of data, so anything you do can only
> punish the attacker, not prevent further data theft or damaged
> reputation/business.

Deterrence is exactly the goal.

"Only punishing the attacker" is exactly what this is for. This is not the same thing as preventative security.

Redaction is designed to prevent authorized users from accidental misuse. Your business already trusts these people. You know their names, their addresses, their bank account details and you'll have already run security scans on them.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Column Redaction
On Wed, Oct 15, 2014 at 8:59 PM, Simon Riggs wrote:
> On 15 October 2014 21:03, Claudio Freire wrote:
>
>>> So you're familiar then with this process? So you know that an auditor
>>> would trigger an investigation, resulting in deeper surveillance and
>>> gathering of evidence that ends with various remedial actions, such as
>>> court. How would that process start then, if not this way?
>>
>> I've seen lots of such investigations fail because the evidence wasn't
>> strong enough to link to a particular person, but rather a computer
>> terminal or something like that.
>
> So your solution to the evidence problem is to do nothing? Or you have
> a better suggestion?
>
> Nothing is certain, apart from doing nothing.

Is solving the evidence problem in scope of the postgresql project?

The solution is to not require evidence in order to be protected from data theft.

Having evidence is nice, you can punish effective attacks, which is a deterrent to any attacker as you pointed out, and may even include financial compensation. It requires physical security as well as software security, and I'm not qualified to solve that problem without help from a lawyer (but I do know you need help from a lawyer to make sure the evidence you gather is usable).

Not having usable evidence, however, could fail to deter knowledgeable attackers (remember, in this setting, it would be an inside job, so it would be a very knowledgeable attacker).

But in any case, if the deterrence isn't enough, and you get attacked, anything involving redaction as fleshed out in the OP is good for nothing. The damage has been done already. The feature doesn't meaningfully slow down extraction of data, so anything you do can only punish the attacker, not prevent further data theft or damaged reputation/business.

Something that requires superuser privilege (or specially granted privilege) in order to gain access to the unredacted value, on the other hand, would considerably slow down the attacker.

From my proposal, only the second form (unnormalized redacted tuples) would provide any meaningful data security in this sense, but even in the other, less limiting form, it would still prevent unauthorized users from extracting the value: you can no longer do binary search with unredacted data, only a full brute-force search would work. That's because the full value id (that I called prefix id, sorry, leftover from an earlier draft) doesn't relate to the unredacted value, so sorting comparisons (< <= > >=) don't provide usable information about value space.

So, if there is a chance to implement redaction in a way that truly protects redacted data... even if it costs a bit of performance sometimes. Is avoiding the performance hit worth the risk? I guess the potential users of such a feature are the only ones qualified to answer, and the answer has great weight on how the feature could be implemented.

Well, and of course, the quality of the implementation. If my proposal has weaknesses I did not realize yet, it may be worthless. But that's true of all proposals that aim for any meaningful level of security: it's worth a lengthy look.
Re: [HACKERS] Column Redaction
On 15 October 2014 21:03, Claudio Freire wrote:

>> So you're familiar then with this process? So you know that an auditor
>> would trigger an investigation, resulting in deeper surveillance and
>> gathering of evidence that ends with various remedial actions, such as
>> court. How would that process start then, if not this way?
>
> I've seen lots of such investigations fail because the evidence wasn't
> strong enough to link to a particular person, but rather a computer
> terminal or something like that.

So your solution to the evidence problem is to do nothing? Or you have a better suggestion?

Nothing is certain, apart from doing nothing.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Column Redaction
On Wed, Oct 15, 2014 at 4:59 PM, Simon Riggs wrote:
> On 15 October 2014 20:41, Claudio Freire wrote:
>> On Sat, Oct 11, 2014 at 4:40 AM, Simon Riggs wrote:
>>> On 10 October 2014 16:45, Rod Taylor wrote:
>>> Redaction prevents accidental information loss only, forcing any loss
>>> that occurs to be explicit. It ensures that loss of information can be
>>> tied clearly back to an individual, like an ink packet that stains the
>>> fingers of a thief.
>>
>> That is not true.
>>
>> It can only be tied to a session. That's very far from an individual
>> in court terms, if you ask a lawyer.
>>
>> You need a helluva lot more to tie that to an individual.
>
> So you're familiar then with this process? So you know that an auditor
> would trigger an investigation, resulting in deeper surveillance and
> gathering of evidence that ends with various remedial actions, such as
> court. How would that process start then, if not this way?

I've seen lots of such investigations fail because the evidence wasn't strong enough to link to a particular person, but rather a computer terminal or something like that.

Unless you also physically restrict access to such terminal to a single person through other means (which is quite uncommon practice except perhaps in banks), that evidence is barely circumstantial.

But you'd have to ask a lawyer in your country to be sure. I can only speak for my own experiences in my own country which is probably not yours nor has the same laws. Law is a complex beast.

So, you really want actual information security in addition to that deterrent you speak of. I don't say the deterrent is bad, I only say it's not good enough on its own.
Re: [HACKERS] Column Redaction
On 15 October 2014 20:41, Claudio Freire wrote:
> On Sat, Oct 11, 2014 at 4:40 AM, Simon Riggs wrote:
>> On 10 October 2014 16:45, Rod Taylor wrote:
>> Redaction prevents accidental information loss only, forcing any loss
>> that occurs to be explicit. It ensures that loss of information can be
>> tied clearly back to an individual, like an ink packet that stains the
>> fingers of a thief.
>
> That is not true.
>
> It can only be tied to a session. That's very far from an individual
> in court terms, if you ask a lawyer.
>
> You need a helluva lot more to tie that to an individual.

So you're familiar then with this process? So you know that an auditor would trigger an investigation, resulting in deeper surveillance and gathering of evidence that ends with various remedial actions, such as court. How would that process start then, if not this way?

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Column Redaction
On Sat, Oct 11, 2014 at 4:40 AM, Simon Riggs wrote:
> On 10 October 2014 16:45, Rod Taylor wrote:
> Redaction prevents accidental information loss only, forcing any loss
> that occurs to be explicit. It ensures that loss of information can be
> tied clearly back to an individual, like an ink packet that stains the
> fingers of a thief.

That is not true.

It can only be tied to a session. That's very far from an individual in court terms, if you ask a lawyer.

You need a helluva lot more to tie that to an individual.

> Redaction clearly relies completely on auditing before it can have any
> additional effect. And the effectiveness of redaction needs to be
> understood next to Rod's example.

It forces you to audit all of the queries issued by the otherwise trusted user.

That is, I believe, a far from optimal design. When you have to audit everything, you end up auditing nothing: a haystack of false positives can easily hide the needle that is the true positive.

What you want is something that allows selective auditing of leak-prone queries. But we've seen that joining is already a leak-prone query, so clearly you cannot allow simple joining if you want the above.

What I propose needs a schema change and some preparedness from the DBA. But how can you assume that to be asking too much and not say the same of thorough auditing?

So, what I propose is to require explicit separation of concepts at the schema level.

On Sat, Oct 11, 2014 at 10:43 AM, Bruce Momjian wrote:
> For example, for a credit card type, you would output the last four
> digits, but is there any value to storing the non-visible digits? You
> can check the checksum of the digits, but that can be done on input and
> doesn't require the storage of the digits. Is there some function we
> could provide that would make that data type useful? Could we provide
> comparison functions with delays or increasing delays?

Basically, as said above, the point is to provide a data type that is nigh-useless.

Imagine a redacted card number as a tuple (full_value_id, suffix). Suffix is in cleartext, and prefix_id is just an id pointing to a lookup table for the type.

Regular users can read any redacted_number column, but will only get the id (useless unless they already know what that prefix is), and suffix. Format for that type would be " suffix" and would serve the purpose of the OP: it can be joined (equal value = equal id).

Moreover, the type can be designed in one of two ways: equal values contain equal id, or salted values, where even equal values generated from different computations (ie: not copied) have different ids. This second mode would be the most secure, albeit a tad hard to use perhaps. But it would allow joining and everything.

Only users that have access to the lookup table would be allowed to resolve the full value, with a non-security-defining function like:

    extract_full_value(redacted_number)

Then you can audit all queries against the lookup table, and you have rather strong security IMHO.

This can all be done without any new features to postgres. Maybe you can add syntactic sugar, but you don't really need anything on the core to accomplish the above.

The syntactic sugar can take the form of a new data type family (like enum?) where you specify the redaction function, redacted data type, output format, and from there everything else works atomagically, with an extract_full(any) -> any function that somehow knows what to do.

On Wed, Oct 15, 2014 at 3:57 PM, Simon Riggs wrote:
> On 15 October 2014 19:46, Robert Haas wrote:
>
>>> In IT terms, we're looking at controlling and reducing improper access
>>> to data by an otherwise Trusted person. The only problem is that some
>>> actions on data items are allowed, others are not.
>>
>> Sure, I don't disagree with any of that as a general principle. I
>> just think we should look for some ways of shoring up your proposal
>> against some of the more obvious attacks, so as to have more good and
>> less bad.
>
> Suggestions welcome. I'm not in a rush to implement this, so we have
> time to mull it over.

Does the above work for your intended purposes? Hard to know from what you've posted until now, but I believe it does.
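A minimal sketch of the lookup-table layout described above, using only existing PostgreSQL features. The names `card_lookup`, `customer_redacted`, `analyst` and `card_admin` are assumptions made up for illustration; only `full_value_id`, `suffix` and `extract_full_value` come from the message, and the function here takes the bare id rather than a dedicated redacted type. It shows the first mode, where equal values share an id.

```sql
-- Hypothetical roles: analyst sees redacted data, card_admin may resolve it.
CREATE ROLE analyst;
CREATE ROLE card_admin;

-- Lookup table: the only place where the full value is stored.
CREATE TABLE card_lookup (
    full_value_id bigserial PRIMARY KEY,
    full_value    text NOT NULL UNIQUE   -- UNIQUE: equal values share one id
);

-- What regular users work with: the pair (full_value_id, suffix).
CREATE TABLE customer_redacted (
    id            serial PRIMARY KEY,
    full_value_id bigint NOT NULL REFERENCES card_lookup,
    suffix        char(4) NOT NULL
);

REVOKE ALL ON card_lookup FROM PUBLIC;
GRANT SELECT ON customer_redacted TO analyst;
GRANT SELECT ON card_lookup TO card_admin;

-- Deliberately NOT SECURITY DEFINER: it runs with the caller's privileges,
-- so it fails with "permission denied" for anyone who cannot read card_lookup.
CREATE FUNCTION extract_full_value(p_full_value_id bigint) RETURNS text
LANGUAGE sql STABLE AS $$
    SELECT full_value FROM card_lookup WHERE full_value_id = p_full_value_id
$$;
```

Ids are handed out in insertion order, so `<` and `>` on `full_value_id` carry no information about the ordering of the card numbers, while `=` still supports joins. The salted mode could presumably be approximated by dropping the `UNIQUE` constraint and inserting a fresh lookup row for each computation.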
Re: [HACKERS] Column Redaction
On 15 October 2014 19:46, Robert Haas wrote:

>> In IT terms, we're looking at controlling and reducing improper access
>> to data by an otherwise Trusted person. The only problem is that some
>> actions on data items are allowed, others are not.
>
> Sure, I don't disagree with any of that as a general principle. I
> just think we should look for some ways of shoring up your proposal
> against some of the more obvious attacks, so as to have more good and
> less bad.

Suggestions welcome. I'm not in a rush to implement this, so we have time to mull it over.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Column Redaction
On Wed, Oct 15, 2014 at 4:04 AM, Simon Riggs wrote:
> On 14 October 2014 17:43, Robert Haas wrote:
>> On Sat, Oct 11, 2014 at 3:40 AM, Simon Riggs wrote:
>>> As soon as you issue the above query, you have clearly indicated your
>>> intention to steal. Receiving information is no longer accidental, it
>>> is an explicit act that is logged in the auditing system against your
>>> name. This is sufficient to bury you in court and it is now a real
>>> deterrent. Redaction has worked.
>>
>> To me, this feels thin. It's true that this might be good enough for
>> some users, but I wouldn't bet on it being good enough for very many
>> users, and I really hope there's a better option. We have an existing
>> method of doing data redaction via security barrier views.
>
> I agree with "thin". There is a leak in the design, so let me coin the
> phrase "imprecise security". Of course, the leaks reduce the value of
> such a feature; they just don't reduce it all the way to zero.
>
> Security barrier views or views of any kind don't do the required job.
>
> We are not able to easily classify people as Trusted or Untrusted.
>
> We're seeking to differentiate between the right to use a column for
> queries and the right to see the value itself. Or put another way, you
> can read the book, you just can't photocopy it and take the copy home.
> Or, you can try on the new clothes to see if they fit, but you can't
> take them home for free. Both of those examples have imprecise
> security measures in place to control and reduce negative behaviours
> and in every other industry this is known as "security".
>
> In IT terms, we're looking at controlling and reducing improper access
> to data by an otherwise Trusted person. The only problem is that some
> actions on data items are allowed, others are not.

Sure, I don't disagree with any of that as a general principle. I just think we should look for some ways of shoring up your proposal against some of the more obvious attacks, so as to have more good and less bad.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Column Redaction
On 14 October 2014 17:43, Robert Haas wrote:
> On Sat, Oct 11, 2014 at 3:40 AM, Simon Riggs wrote:
>> As soon as you issue the above query, you have clearly indicated your
>> intention to steal. Receiving information is no longer accidental, it
>> is an explicit act that is logged in the auditing system against your
>> name. This is sufficient to bury you in court and it is now a real
>> deterrent. Redaction has worked.
>
> To me, this feels thin. It's true that this might be good enough for
> some users, but I wouldn't bet on it being good enough for very many
> users, and I really hope there's a better option. We have an existing
> method of doing data redaction via security barrier views.

I agree with "thin". There is a leak in the design, so let me coin the phrase "imprecise security". Of course, the leaks reduce the value of such a feature; they just don't reduce it all the way to zero.

Security barrier views or views of any kind don't do the required job.

We are not able to easily classify people as Trusted or Untrusted.

We're seeking to differentiate between the right to use a column for queries and the right to see the value itself. Or put another way, you can read the book, you just can't photocopy it and take the copy home. Or, you can try on the new clothes to see if they fit, but you can't take them home for free. Both of those examples have imprecise security measures in place to control and reduce negative behaviours and in every other industry this is known as "security".

In IT terms, we're looking at controlling and reducing improper access to data by an otherwise Trusted person. The only problem is that some actions on data items are allowed, others are not.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Column Redaction
On Sat, Oct 11, 2014 at 3:40 AM, Simon Riggs wrote:
> As soon as you issue the above query, you have clearly indicated your
> intention to steal. Receiving information is no longer accidental, it
> is an explicit act that is logged in the auditing system against your
> name. This is sufficient to bury you in court and it is now a real
> deterrent. Redaction has worked.

To me, this feels thin. It's true that this might be good enough for some users, but I wouldn't bet on it being good enough for very many users, and I really hope there's a better option. We have an existing method of doing data redaction via security barrier views. There are information leaks, to be sure, but they are far more subtle and low-bandwidth than Rod's query. The reason for that is that only trusted code (leakproof functions) is allowed to run against the trusted data; the redaction is applied before any potentially-untrustworthy stuff happens. Here, you're applying the redaction as the very last step before sending the data to the user, and that allows too much freedom to do bad stuff between the time the database first lays hands on the data and the time it gets redacted.

But maybe that can be fixed. I don't know exactly how. I think you need a design that allows you to restrict very tightly the operations that an untrusted user can perform on trusted data. Maybe you only want to allow "=" and nothing else, for example. Perhaps the set of allowable predicates could be defined via DDL. Then when the query is run, the system imposes a security fence. Only approved predicates can be pushed through the fence. And when the data crosses the fence from the trusted side to the untrusted side, redaction happens at that point, rather than just before sending the data to the user.

This is, of course, more complicated. But I think it's likely to be worth it. The problem with relying on auditing is that you need a human to look at the audit logs and judge intent. With a query as overt as Rod's, that's maybe not too hard. But with a lot of analysts running a lot of queries, it might not be that hard to bury an information-stealing query inside an innocent-looking query in such a way that the administrator doesn't notice. Granted, that's playing with fire, but I've encountered many security vulnerabilities in my career that can be exploited without doing anything obviously evil. If you retroactively put a packet-sniffer on every network I've ever been connected to, and carefully examined all my network traffic, you'd find me finding holes in all kinds of things, but in fact, nobody's ever noticed a problem in advance of me reporting it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
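For reference, a minimal sketch of the security barrier view approach mentioned above. The table layout, the `active` flag, the `customer_masked` view and the `report_user` role are all hypothetical; the point is only that the redaction is part of the view definition, so anything the querying user supplies runs against already-redacted output.

```sql
-- Hypothetical role that must never see full card numbers.
CREATE ROLE report_user;

CREATE TABLE customer (
    id                 serial PRIMARY KEY,
    active             boolean NOT NULL DEFAULT true,
    stored_card_number text NOT NULL
);

-- security_barrier: conditions supplied by the querying user that are not
-- marked LEAKPROOF are not evaluated until a row has passed the view's own
-- WHERE clause, so they cannot be used to peek at filtered-out rows.
CREATE VIEW customer_masked WITH (security_barrier) AS
SELECT id,
       '************' || right(stored_card_number, 4) AS card_number
FROM customer
WHERE active;

-- The role gets the view only; the base table stays unreadable to it.
GRANT SELECT ON customer_masked TO report_user;
```

With this layout `report_user` cannot filter or join on the full card number at all, because the view never exposes it. That is the capability the redaction proposal is trying to keep while still hiding the value.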
Re: [HACKERS] Column Redaction
On 10/10/14 21:57, Simon Riggs wrote:
> Postgres currently supports column level SELECT privileges.
>
> 1. If we want to confirm a credit card number, we can issue
>    SELECT 1 FROM customer WHERE stored_card_number = '1234 5678 5344 7733'
>
> 2. If we want to look for card fraud, we need to be able to use the
>    full card number to join to transaction data and look up blocked
>    card lists etc..
>
> 3. We want to block the direct retrieval of card numbers for
>    additional security. In some cases, we might want to return an
>    answer like ' * 7733'
>
> We can't do all of the above with current facilities inside the database.
>
> The ability to mask output for data in certain cases, for the purpose
> of security, is known lately as data redaction, or column-level data
> redaction.
>
> The best way to support this requirement would be to allow columns to
> have an additional "output formatting function". This would be
> executed only when data is about to be returned by a query. All other
> uses of that would not restrict the data.
>
> This would have other uses as well, such as default report formats, so
> we can store financial amounts as NUMERIC, but format them on
> retrieval as $12,345.78 etc..
>
> Suggested user interface would be...
>    FORMAT functionname(parameters, if any)
>
> e.g.
> CREATE TABLE customer
> ( id ...
> ...
> , stored_card_number NUMERIC FORMAT pci_card_number_redaction()
> ...
> );
>
> We'd need to implement something to allow pg_dump to ignore format
> functions. I suggest the best way to do that is by providing a BACKUP
> role that can be delegated to other users. We would then allow a
> parameter for SET output_formatting = on | off, which can only be set
> by superuser and BACKUP role, then have pg_dump issue SET
> output_formatting = off explicitly when it runs.
>
> Do we want redaction in PostgreSQL?
>
> Do we want it generalised into output format functions?

I think having a FORMAT option would be good, but I strongly feel that end users should NEVER EVER have direct access to any database with sensitive information! And if the full details are stored, then obviously, at some time people will have a legitimate need to access all the digits, so it does not make sense to prevent this.

Also I think it would be useful to store formats, especially complicated ones, so they can be defined once and reused as many times as required; that helps standardisation.

How about something like:

    CREATE FORMAT /format-name/ [WITH] /format-spec/ [DENY | ALLOW role-1, ...];

Where the /format-spec/ is either a function, or something similar to a COBOL picture spec.

I suspect that the implied security control with the ALLOW & DENY options might prove too weak for anyone determined, though it might be good enough in some common contexts.

    CREATE FORMAT card_format_redacted WITH ' ' ALLOW ALL;
    CREATE FORMAT card_format_full ' ' ALLOW admin_1;
    CREATE FORMAT card_format_special special_card_formatter() ALLOW admin_42, mariadba;

    -- specify default FORMAT
    CREATE TABLE customer
    (
        ...
        stored_card_number NUMERIC FORMAT card_format_redacted,
        ...
    )

    -- unformatted, fails if role is neither admin_1 nor a role that inherits from it
    SELECT stored_card_number WHERE ...;

    -- using card_format_redacted
    SELECT stored_card_number FORMAT DEFAULT WHERE ...;

    -- using card_format_full, fails if role is neither admin_1 nor a role that inherits from it
    SELECT stored_card_number FORMAT card_format_full WHERE ...;

Cheers,
Gavin
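Use cases 1 and 3 from the quoted proposal can be approximated with existing features: a column-level SELECT grant that leaves out the card column, plus a SECURITY DEFINER function that exposes equality and nothing else. This is a sketch under assumed names (`customer_account`, `card_matches`, `fraud_analyst`), not the proposed FORMAT syntax. It does not cover use case 2 (joins on the full number), and it still lets a caller test guesses one at a time, so it needs auditing just like the proposal.

```sql
-- Hypothetical role that may confirm a card number but not read it.
CREATE ROLE fraud_analyst;

CREATE TABLE customer_account (
    id                 serial PRIMARY KEY,
    name               text NOT NULL,
    stored_card_number text NOT NULL
);

-- Column-level privilege: the card column is simply not granted.
GRANT SELECT (id, name) ON customer_account TO fraud_analyst;

-- Equality check that runs with the function owner's privileges.
-- The pinned search_path assumes the table lives in schema public.
CREATE FUNCTION card_matches(p_customer_id int, p_candidate text)
RETURNS boolean
LANGUAGE sql STABLE SECURITY DEFINER
SET search_path = public, pg_temp
AS $$
    SELECT EXISTS (
        SELECT 1
        FROM customer_account
        WHERE id = p_customer_id
          AND stored_card_number = p_candidate
    )
$$;

-- Functions are executable by PUBLIC by default, so narrow that down.
REVOKE ALL ON FUNCTION card_matches(int, text) FROM PUBLIC;
GRANT EXECUTE ON FUNCTION card_matches(int, text) TO fraud_analyst;
```

A call such as `SELECT card_matches(1, '1234567853447733');` answers the "confirm a card number" case, while `SELECT stored_card_number FROM customer_account` fails with a permission error for `fraud_analyst`.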
Re: [HACKERS] Column Redaction
On Sat, Oct 11, 2014 at 09:51:28AM +0100, Simon Riggs wrote:
> > So it's not actually suitable for the example you gave. I don't think we
> > want this feature...
>
> The full quote I read is the following...
>
> "Even though Oracle Data Redaction is not intended to protect against
> attacks by database users who run ad hoc queries directly against the
> database, it can hide sensitive data for these ad hoc query scenarios
> when you couple it with other preventive and detective controls."
>
> That full context would have been useful.

OK, that certainly helps. I think the interesting question, though, is whether we can create a data type that doesn't have any casting or comparison functions, and has limited or no output function, and is useful. Are there cases where you would want to store data in a database that could not be fully viewed but would still be useful to store?

For example, for a credit card type, you would output the last four digits, but is there any value to storing the non-visible digits? You can check the checksum of the digits, but that can be done on input and doesn't require the storage of the digits. Is there some function we could provide that would make that data type useful? Could we provide comparison functions with delays or increasing delays?

I can think of a useful fully-redacted data type example, and that would be the credit card expire date. You could store that in a field that has no output or comparison functions, but you could provide a useful function that would tell whether the expire date had passed based on the system date. It would be useful to store such a date, and a user could know the data value only after it had expired.

--
Bruce Momjian http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +
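The expiry-date idea can be approximated today without a new data type, by keeping the date in a table the user cannot read and exposing only a yes/no function. A hedged sketch; `card_expiry`, `card_has_expired` and `support_user` are made-up names, and a real fully-redacted type as described above would need its own input and output functions in C.

```sql
-- Hypothetical role that may ask "has it expired?" but not read the date.
CREATE ROLE support_user;

-- No privileges on this table are granted to support_user.
CREATE TABLE card_expiry (
    customer_id int  PRIMARY KEY,
    expires_on  date NOT NULL
);

-- Returns true once the stored date is in the past, false before that,
-- and NULL when the customer has no row.
-- The pinned search_path assumes the table lives in schema public.
CREATE FUNCTION card_has_expired(p_customer_id int)
RETURNS boolean
LANGUAGE sql STABLE SECURITY DEFINER
SET search_path = public, pg_temp
AS $$
    SELECT expires_on < current_date
    FROM card_expiry
    WHERE customer_id = p_customer_id
$$;

REVOKE ALL ON FUNCTION card_has_expired(int) FROM PUBLIC;
GRANT EXECUTE ON FUNCTION card_has_expired(int) TO support_user;
```

As the message notes, a user who polls the function learns the date only once it has passed.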
Re: [HACKERS] Column Redaction
On 10/11/2014 02:40 AM, Simon Riggs wrote:
> As soon as you issue the above query, you have clearly indicated
> your intention to steal. Receiving information is no longer
> accidental, it is an explicit act that is logged in the auditing
> system against your name. This is sufficient to bury you in court
> and it is now a real deterrent. Redaction has worked.
>
> Redaction is similar to a 3m high razor wire fence. The fence
> reminds you of what is correct and dissuades you from going
> further. The fence does not prevent access by a determined and
> skillful agent (Rod), but the CCTV cameras that are set out will
> record the action. It will be almost impossible to claim you were
> just walking your dog, and the wire cutters were a gift for your
> brother in law.
>
> Redaction prevents accidental information loss only, forcing any
> loss that occurs to be explicit. It ensures that loss of
> information can be tied clearly back to an individual, like an ink
> packet that stains the fingers of a thief.
>
> I don't have a word or pithy phrase for this concept. Maybe
> something related to "forcing their hand", flushing game into the
> open, or simply preventing "tipping your hand" and inadvertently
> allowing data loss.
>
> Redaction clearly relies completely on auditing before it can have
> any additional effect. And the effectiveness of redaction needs to
> be understood next to Rod's example.
>
> Since it relies on auditing, we need to do that first.

This is a really good summary. I definitely know of folks who would be interested in this feature, but I also agree, as you have said, it relies on a good audit trail.

Joe

--
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support
Re: [HACKERS] Column Redaction
On 10 October 2014 11:27, Heikki Linnakangas wrote:

> I googled for Oracle Data redaction, and found "General Usage guidelines":
>
>> General Usage Guidelines
>>
>> * Oracle Data Redaction is not intended to protect against attacks by
>> privileged database users who run ad hoc queries directly against the
>> database.
>>
>> * Oracle Data Redaction is not intended to protect against users who
>> run exhaustive SQL queries that attempt to determine the actual
>> values by inference.
>
> So it's not actually suitable for the example you gave. I don't think we
> want this feature...

The full quote I read is the following...

"Even though Oracle Data Redaction is not intended to protect against attacks by database users who run ad hoc queries directly against the database, it can hide sensitive data for these ad hoc query scenarios when you couple it with other preventive and detective controls."

That full context would have been useful.
Re: [HACKERS] Column Redaction
On 10 October 2014 16:45, Rod Taylor wrote:

> On my laptop I can pull all 10,000 card numbers in less than 1 second.

Right. Like I said: covert channels exist. Great example of how to exploit them, thanks. Cool SQL.

What could be the use of "a security feature that does not prevent security"?

As soon as you issue the above query, you have clearly indicated your intention to steal. Receiving information is no longer accidental, it is an explicit act that is logged in the auditing system against your name. This is sufficient to bury you in court and it is now a real deterrent. Redaction has worked.

Redaction is similar to a 3m high razor wire fence. The fence reminds you of what is correct and dissuades you from going further. The fence does not prevent access by a determined and skillful agent (Rod), but the CCTV cameras that are set out will record the action. It will be almost impossible to claim you were just walking your dog, and the wire cutters were a gift for your brother in law.

Redaction prevents accidental information loss only, forcing any loss that occurs to be explicit. It ensures that loss of information can be tied clearly back to an individual, like an ink packet that stains the fingers of a thief.

I don't have a word or pithy phrase for this concept. Maybe something related to "forcing their hand", flushing game into the open, or simply preventing "tipping your hand" and inadvertently allowing data loss.

Redaction clearly relies completely on auditing before it can have any additional effect. And the effectiveness of redaction needs to be understood next to Rod's example.

Since it relies on auditing, we need to do that first.
Re: [HACKERS] Column Redaction
On 10 October 2014 19:26, Bruce Momjian wrote: > On Fri, Oct 10, 2014 at 02:05:05PM +0100, Thom Brown wrote: >> On 10 October 2014 13:43, Simon Riggs wrote: >> > On 10 October 2014 11:45, Thom Brown wrote: >> > >> >> To be honest, this all sounds rather flaky. > My other concern is you must have realized these issues in five seconds > too, so why didn't you mention them? Because the problems that you come up with in 5 seconds aren't necessarily problems. You just think they are, given 5 seconds' thought. I think my first impression of the concept was also poor, though it would be wonderful if I had remembered all of my initial objections. I didn't have any problem with Thom's first post, which was helpful in allowing me to explain the context and details. As I said in reply at that point, this is not in itself a barrier; other measures are necessary. The rest of the thread has descended into a massive misunderstanding of the purpose and role of redaction. When any of us moves too quickly to a value judgement about a new concept, we're probably missing the point. All of us will be asked at some time in the next few years why Postgres doesn't have redaction. When you get it, post back here please. Or if you win the argument on it not being useful in any circumstance, post that here also. I'm not in a rush. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Column Redaction
Rod, * Rod Taylor (rod.tay...@gmail.com) wrote: > For fun I gave the search a try. Neat! > On my laptop I can pull all 10,000 card numbers in less than 1 second. For > a text based item I don't imagine it would be much different. Numbers are > pretty easy to work with though. I had been planning to give something like this a shot once I got back from various meetings today, so thanks! Being able to use the CC # *as* the target for the binary search is definitely an issue, though looking back on the overall problem space, CC's are less than 54 bits, and it's actually a smaller space than that if you know how they're put together. My thought on an attack was more along these lines: select * from cards join (SELECT CAST(random() * AS bigint) a from generate_series(1,100)) as foo on (cards.cc = foo.a); Which could pretty quickly find ~500 CC #s in a second or so (with a 'cards' table of about 1M entries) based on my testing. That's clearly enough to make it a viable attack also. The next question I have is: do(es) the other vendor(s) provide a way to address this, or is it simply known that this doesn't offer any protection at all from ad hoc queries and it's strictly for formatting? I can certainly imagine it actually being a way to simply avoid *inadvertent* exposure rather than providing any security from the individual running the commands. I'm not sure that would make it genuinely different enough from simply maintaining a view which does that filtering to make it useful on its own as a feature though, but I'm not particularly strongly against it either. Thanks! Stephen
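The "view which does that filtering" that Stephen mentions could look roughly like the following. This is only an illustrative sketch, not something posted in the thread: the view name cards_masked, the mask format, and the role reporting_role are assumptions made up for the example, and the table layout is guessed from the queries quoted above.

```sql
-- Hypothetical table; the thread only shows a 'cards' table with a bigint 'cc'.
CREATE TABLE cards (id serial PRIMARY KEY, cc bigint NOT NULL);

-- Expose only the last four digits. Zero-pad first so that short values
-- still yield four characters.
CREATE VIEW cards_masked AS
SELECT id,
       'XXXX XXXX XXXX ' || right(lpad(cc::text, 16, '0'), 4) AS cc_masked
FROM cards;

-- Hypothetical role that may read the view but not the base table.
CREATE ROLE reporting_role;
REVOKE ALL ON cards FROM PUBLIC;
GRANT SELECT ON cards_masked TO reporting_role;
```

Since the view reads cards with its owner's privileges, reporting_role never touches the cc column directly, so comparison-based probing of the full number is not available through this view. The cost is that every permitted use of cc has to be spelled out as a view in advance.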
Re: [HACKERS] Column Redaction
On Fri, Oct 10, 2014 at 02:05:05PM +0100, Thom Brown wrote: > On 10 October 2014 13:43, Simon Riggs wrote: > > On 10 October 2014 11:45, Thom Brown wrote: > > > >> To be honest, this all sounds rather flaky. > > > > To be honest, suggesting anything at all is rather difficult and I > > recommend people try it. > > I have, and most ideas I've had have been justifiably shot down or > picked apart (scheduled background tasks, offloading stats collection > to standby, index maintenance in DML query plans, expression > statistics... to name but a few). > > > Everything sounds crap when you didn't think of it and you've given it > > an hour's thought. > > I'm not sure that means my concerns aren't valid. I don't think it > sounds crap, but I also can't see any use-case for it where we don't > already have things covered, or where it's going to offer any useful > level of security. Like with RLS, it may be that I'm just looking at > things from the wrong perspective. Agreed. The problem isn't giving it only an hour's thought --- it is that we can come up with serious problems in five _seconds_ of thought. Unless you can come up with a solution to those issues, I am not sure why we are even talking about it. My other concern is you must have realized these issues in five seconds too, so why didn't you mention them? -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: [HACKERS] Column Redaction
Thom Brown wrote: > On 10 October 2014 15:56, Stephen Frost wrote: >> Thom Brown (t...@linux.com) wrote: >>> Data such as plain credit card numbers stored in a >>> column, even with all its data masked, would be easy to determine. >> >> I'm not as convinced of that as you are.. Though I'll point out that in >> the use-cases which I've been talking to users about, it isn't credit >> cards under discussion. > > I think credit card numbers are a good example. I'm not so sure. Aren't credit card numbers generally required by law to be stored in an encrypted form? > If we're talking > about format functions here, there has to be something in addition to > that which determines permitted comparison operations. If not, and we > were going to remove all but = operations, we'd effectively cripple > the functionality of anything that's been formatted that wasn't > intended as a security measure. It almost sounds like an extension to > domains rather than column-level functionality. I have to say that my first thought was that format functions associated with types with domain override would be a very nice capability. But I don't see where that has much to do with security. I have seen many places where redaction is necessary (and in fact done), but I don't see how that could be addressed by what Simon is proposing. Perhaps I'm missing something; if so, a more concrete exposition of a use case might allow things to "click". -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Column Redaction
On Fri, Oct 10, 2014 at 10:56 AM, Stephen Frost wrote: > * Thom Brown (t...@linux.com) wrote: > > On 10 October 2014 12:45, Stephen Frost wrote: > > >> There's a difference between intending that there shouldn't be a way > > >> past security and just making access a matter of walking a longer > > >> route. > > > > > > Throwing random 16-digit numbers and associated information at a credit > > > card processor could be viewed as "walking a longer route" too. The > > > same goes for random key searches or password guesses. > > > > But those would need to be exhaustive, and in nearly all cases, > > impractical. > > That would be exactly the idea with this- we make it impractical to get > at the unredacted information. >

For fun I gave the search a try.

create table cards (id serial, cc bigint);
insert into cards (cc) SELECT CAST(random() * AS bigint) FROM generate_series(1,1);

\timing on

WITH RECURSIVE t(id, range_min, range_max) AS (
  SELECT id, 1::bigint, FROM cards
  UNION ALL
    SELECT id
         , CASE WHEN cc >= range_avg THEN range_avg ELSE range_min END
         , CASE WHEN cc <= range_avg THEN range_avg ELSE range_max END
      FROM (SELECT id, (range_min + range_max) / 2 AS range_avg, range_min, range_max
              FROM t
           ) AS t_avg
      JOIN cards USING (id)
     WHERE range_min != range_max
)
SELECT id, range_min AS cc FROM t WHERE range_min = range_max;

On my laptop I can pull all 10,000 card numbers in less than 1 second. For a text based item I don't imagine it would be much different. Numbers are pretty easy to work with though.
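Some numeric literals in the archived copy of the query above appear to have been lost, so here is a self-contained variant of the same binary-search idea for anyone who wants to try it. It is a sketch under stated assumptions, not Rod's exact query: the range 0 to 9999999999999999 (16 digits), the row count of 10000, and the column names lo and hi are all choices made for this example. This variant also moves the lower bound to mid + 1, so the range strictly shrinks on every step.

```sql
CREATE TEMP TABLE cards (id serial PRIMARY KEY, cc bigint NOT NULL);

-- Assumed test data: 10000 random values of at most 16 digits.
INSERT INTO cards (cc)
SELECT floor(random() * 1e16)::bigint
FROM generate_series(1, 10000);

-- The query never selects cc; it only asks "is cc greater than mid?".
WITH RECURSIVE t(id, lo, hi) AS (
    SELECT id, 0::bigint, 9999999999999999::bigint
    FROM cards
    UNION ALL
    SELECT c.id,
           CASE WHEN c.cc > (t.lo + t.hi) / 2 THEN (t.lo + t.hi) / 2 + 1 ELSE t.lo END,
           CASE WHEN c.cc > (t.lo + t.hi) / 2 THEN t.hi ELSE (t.lo + t.hi) / 2 END
    FROM t
    JOIN cards c ON c.id = t.id
    WHERE t.lo < t.hi
)
SELECT id, lo AS recovered_cc
FROM t
WHERE lo = hi;
```

Each row needs at most about 54 comparison steps, which matches the "less than 54 bits" observation elsewhere in the thread.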
Re: [HACKERS] Column Redaction
On Fri, Oct 10, 2014 at 11:58 AM, Stephen Frost wrote: >> You are obviously wearing your rose-colored glasses this morning. I >> predict a competent SQL programmer could write an SQL function, or >> client-side code, to pump the data out of the database using binary >> search in milliseconds per row. > > Clearly, if we're unable to prevent that, then this feature wouldn't be > useful. What would be helpful is to consider what we could provide > along these lines without allowing the data to be trivially recovered. Joins are way too powerful to let untrusted users run arbitrary ones. The only somewhat secure way is to let administrators define which joins are possible, and have untrusted users use those. You get that with views. I'm not sure you can allow more than that, and not have lots of leaks. Is there a use case where redaction is really the only solution? Nothing mentioned till now really is: * Transaction logs and blocked card lists can be joined against users and a view can be provided that includes the user, and not the credit card. So you can join freely between the views just fine, by user, and do all the analysis you need without exposing credit card numbers in any way, not even redacted. * If not users, you can join against a random but unique per card value generated at some point when the card is first inserted in the records, and you get a random token for the card. Still works, and can be done with triggers, and is far less leaky than the proposed redaction. * Credit card number verification is a leak on its own, but if you really want it, you can provide a function that does it. And I think it's perfectly reasonable that defining leaking functions has to be an admin thing. * Views can expose the redacted value just fine for direct use. A generically usable user-id or random-token to redacted number mapping view would provide all the freedom you could want. 
* Functions defined as leakproof (even when they're not, which is an admin decision to throw data safety out the window, but it's a possible decision), which allows fetching redacted columns that way from security barrier views. Is there anything not covered by the above that can be done by built-in redacting? If the answer is yes, then maybe the feature has value. If the feature's value is ease of use, I'd weigh that against the security loss. A false sense of security is a net security loss in most (if not all) cases. Having to flesh out the logic through security barrier views, leakproof redacting functions and triggers can have the good side-effect of making all the possible leaks obvious to the admin. On Fri, Oct 10, 2014 at 12:27 PM, Thom Brown wrote: > Also, joining to foreign tables could be an issue, copying data to > temporary tables could possibly remove any redaction, materialised > views would need to support it somehow. Although just because I can't > picture how that would work, it's no indication that it couldn't. Well, that's why encryption is usually required by regulation for credit card data. There are way too many ways to leak it, and it is way too valuable to expect a lack of knowledgeable and motivated people trying to get it. On Fri, Oct 10, 2014 at 12:27 PM, Thom Brown wrote: >>> Salted and hashed passwords, even with complete visibility of the >>> value, isn't vulnerable to scrutiny of particular character values. >>> If it were, no-one would use it. >> >> I wasn't suggesting otherwise, but I don't see it as particularly >> relevant to the discussion regardless. > > I guess I was trying to illustrate that the security in a hashed > password is acceptable because it requires exhaustive searching to > break. If comparison operators worked on it, it would be broken out > of the box. Lately, the security of password-based authentication is being put into question very often. 
So I wouldn't hold credit card numbers or any other sensitive information to the password standard. But let's use the password example: it's widely accepted that holding onto cleartext passwords, or even transmitting them or their plain hashes over any channel, is extremely bad practice. So redaction isn't good enough for passwords, nor is salted hashing. The only generally accepted way in the security community is a password proof in the context of a zero-knowledge password proof protocol[0]. You'd want something like that for any bit of info you need to "join" or "compare" but can't accept leaking. [0] http://en.wikipedia.org/wiki/Zero-knowledge_password_proof
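The random-token alternative from the list in the message above can be sketched as follows. Everything here is an assumption for illustration: the table and view names, the trigger function assign_card_token, and in particular the token generator, which uses md5 over random() only as a placeholder and is not cryptographically strong.

```sql
-- Hypothetical mapping from card number to an opaque per-card token.
CREATE TABLE card_tokens (
    cc    bigint PRIMARY KEY,
    token text   NOT NULL UNIQUE
);

CREATE FUNCTION assign_card_token() RETURNS trigger AS $$
BEGIN
    -- Placeholder generator; a real deployment would want a stronger source.
    NEW.token := md5(random()::text || clock_timestamp()::text);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- Assign the token when the card is first recorded.
CREATE TRIGGER card_tokens_assign
    BEFORE INSERT ON card_tokens
    FOR EACH ROW EXECUTE PROCEDURE assign_card_token();

-- Hypothetical transaction log keyed by card number.
CREATE TABLE transactions (
    id     serial  PRIMARY KEY,
    cc     bigint  NOT NULL,
    amount numeric NOT NULL
);

-- Analysts join and group by token; the card number is never exposed.
CREATE VIEW transactions_by_token AS
SELECT ct.token, tx.id AS transaction_id, tx.amount
FROM transactions tx
JOIN card_tokens ct ON ct.cc = tx.cc;
```

Because the token carries no ordering or digit relationship to the card number, range comparisons on it reveal nothing about cc, which is the sense in which it is "far less leaky" than a redacted column that still supports comparison operators.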
Re: [HACKERS] Column Redaction
On 10 October 2014 15:56, Stephen Frost wrote: > * Thom Brown (t...@linux.com) wrote: >> Data such as plain credit card numbers stored in a >> column, even with all its data masked, would be easy to determine. > > I'm not as convinced of that as you are.. Though I'll point out that in > the use-cases which I've been talking to users about, it isn't credit > cards under discussion. I think credit card numbers are a good example. If we're talking about format functions here, there has to be something in addition to that which determines permitted comparison operations. If not, and we were going to remove all but = operations, we'd effectively cripple the functionality of anything that's been formatted that wasn't intended as a security measure. It almost sounds like an extension to domains rather than column-level functionality. But then if operators such as <, > and ~~ aren't hindered, it sounds like no protection at all. Also, joining to foreign tables could be an issue, copying data to temporary tables could possibly remove any redaction, materialised views would need to support it somehow. Although just because I can't picture how that would work, it's no indication that it couldn't. >> Salted and hashed passwords, even with complete visibility of the >> value, isn't vulnerable to scrutiny of particular character values. >> If it were, no-one would use it. > > I wasn't suggesting otherwise, but I don't see it as particularly > relevant to the discussion regardless. I guess I was trying to illustrate that the security in a hashed password is acceptable because it requires exhaustive searching to break. If comparison operators worked on it, it would be broken out of the box. -- Thom
Re: [HACKERS] Column Redaction
Robert, * Robert Haas (robertmh...@gmail.com) wrote: > On Fri, Oct 10, 2014 at 7:00 AM, Stephen Frost wrote: > > The discussion about looking up specific card numbers in the original > > email from Simon was actually an allowed use-case, as I understood it, > > not a risk concern. Indeed, if you know a valid credit card number > > already, as in this example, then why are you bothering with the search? > > Perhaps it would provide confirmation, but it's not the database's > > responsibility to make you forget the number you already have. Doing a > > random walk through a keyspace of 10^16 and extracting a significant > > enough number of results to be useful should be difficult. I agree that > > if we're completely unable to make it difficult then this is less > > useful, but I feel it's a bit early to jump to that conclusion. Thanks much for the laugh. :) > You are obviously wearing your rose-colored glasses this morning. I > predict a competent SQL programmer could write an SQL function, or > client-side code, to pump the data out of the database using binary > search in milliseconds per row. Clearly, if we're unable to prevent that, then this feature wouldn't be useful. What would be helpful is to consider what we could provide along these lines without allowing the data to be trivially recovered. Thanks! Stephen
Re: [HACKERS] Column Redaction
* Thom Brown (t...@linux.com) wrote: > On 10 October 2014 12:45, Stephen Frost wrote: > >> There's a difference between intending that there shouldn't be a way > >> past security and just making access a matter of walking a longer > >> route. > > > > Throwing random 16-digit numbers and associated information at a credit > > card processor could be viewed as "walking a longer route" too. The > > same goes for random key searches or password guesses. > > But those would need to be exhaustive, and in nearly all cases, > impractical. That would be exactly the idea with this- we make it impractical to get at the unredacted information. > Data such as plain credit card numbers stored in a > column, even with all its data masked, would be easy to determine. I'm not as convinced of that as you are.. Though I'll point out that in the use-cases which I've been talking to users about, it isn't credit cards under discussion. > Salted and hashed passwords, even with complete visibility of the > value, isn't vulnerable to scrutiny of particular character values. > If it were, no-one would use it. I wasn't suggesting otherwise, but I don't see it as particularly relevant to the discussion regardless. Thanks, Stephen
Re: [HACKERS] Column Redaction
* Heikki Linnakangas (hlinnakan...@vmware.com) wrote: > You said above that it's OK to pass the card numbers to leakproof > functions. But if you allow that, you can write a function that > takes as argument a redacted card number, and unredacts it (using > the < and = operators in a binary search). And then you can just do > "SELECT unredact(card_number) from redacted_table". Not sure I'm following what you mean by 'redacted'. The original proposal provided ' 1234' as the 'redacted' number, and I'm not seeing how you can get the rest of the number trivially with just equality and binary search. If you start with a complete number then you can get the system to tell you if it exists or not with a binary search or even just doing an equality check. > You seem to have something stronger in mind: only allow the equality > operator on the redacted column, and nothing else. That wasn't my suggestion- I was merely pointing out that if you have a complete number (perhaps by pulling out a random number, with a filter against the last four digits, reducing the search space to 10^12) which you want to check for existence, you can do that directly. No need for a binary search at all. > That might be > better, although I'm not really convinced. There are just too many > ways you could still leak the datum. Just a random example, inspired > by the recent CRIME attack on SSL: build a row with the redacted > datum, and another "guess" datum, and store it along with 1k of > other data in a temporary table. The row gets toasted. Observe how > much it compressed; if the guess datum is close to the original > datum, it compresses well. Now, you can probably stop that > particular attack with more restrictions on what you can do with the > datum, but that just shows that pretty much any computation you > allow with the datum can be used to reveal its value. One concept I've been thinking about is a notion of 'trusted' data sources to allow comparison against. 
Perhaps individual values are allowed from the user also, but my thought is that you have: master_table trusted_table Such that you can't view the sensitive column in either the master or the trusted table, but you can join between the two on the sensitive column and view other, non-sensitive, attributes of the two tables. You might even allow other transformations on the sensitive column, provided it always results in a boolean comparison to another sensitive column. Not sure if that really solves Simon's use-case exactly, but it might tease out other thoughts. Thanks! Stephen
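With what PostgreSQL already offers, the closest approximation to the master_table / trusted_table idea is probably a predefined view, which is narrower than what Stephen describes because the join is fixed by the administrator rather than written ad hoc. A minimal sketch, assuming invented column names (card_number, customer_name, block_reason) and an invented view name:

```sql
-- Hypothetical tables; only the names master_table and trusted_table
-- come from the message above.
CREATE TABLE master_table (
    id            int  PRIMARY KEY,
    card_number   text NOT NULL,   -- sensitive
    customer_name text NOT NULL
);

CREATE TABLE trusted_table (
    id           int  PRIMARY KEY,
    card_number  text NOT NULL,    -- sensitive
    block_reason text NOT NULL
);

-- The join happens on the sensitive column, but only non-sensitive
-- attributes are returned. security_barrier keeps user-supplied
-- predicates from being evaluated before the view's own conditions.
CREATE VIEW blocked_customers WITH (security_barrier) AS
SELECT m.id AS customer_id, m.customer_name, t.block_reason
FROM master_table m
JOIN trusted_table t ON t.card_number = m.card_number;
```

Users would be granted SELECT on blocked_customers only. Allowing the same join in arbitrary user-written queries, while still hiding card_number, is the part that has no existing equivalent.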
Re: [HACKERS] Column Redaction
On Fri, Oct 10, 2014 at 5:57 AM, Simon Riggs wrote: > > 1. If we want to confirm a credit card number, we can issue SELECT 1 > FROM customer WHERE stored_card_number = '1234 5678 5344 7733' > ... > 3. We want to block the direct retrieval of card numbers for > additional security. > In some cases, we might want to return an answer like ' * 7733' I wouldn't want to allow that: select ref.ref, customer.name from (select generate_series as ref from generate_series(0, )) ref, customer where ref.ref = stored_card_number.ref May take a long while. Just disable everything except nestloop and suck up the data as it comes. Can be optimized. Not sure how you'd avoid this, not trivial at all. Not possible at all I'd venture. But if you really really want to allow this, encrypt the column, and provide a C function that can decrypt it. You can join encrypted columns, and you can even include the last 4 digits unencrypted if you want (I wouldn't want). Has to be a C function to be able to avoid leaking the key, btw. > 2. If we want to look for card fraud, we need to be able to use the > full card number to join to transaction data and look up blocked card > lists etc.. view works for this pretty well
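For readers wondering what "you can join encrypted columns" might look like in practice, here is a sketch using a keyed hash from the pgcrypto extension. Note that this swaps in a different technique from the one Claudio describes: an HMAC is one-way, whereas he suggests reversible encryption plus a C decryption function. The table layout, the key 'demo-key', and the column names are assumptions, and passing the key as a SQL literal exposes it in exactly the way the C-function suggestion is meant to avoid.

```sql
CREATE EXTENSION IF NOT EXISTS pgcrypto;

-- Hypothetical layout: store a keyed digest instead of the card number.
CREATE TABLE customer (
    name       text  NOT NULL,
    card_hmac  bytea NOT NULL,   -- HMAC-SHA256 of the card number
    card_last4 text  NOT NULL    -- optional, as discussed above
);

INSERT INTO customer (name, card_hmac, card_last4)
VALUES ('alice',
        hmac('1234567853447733'::text, 'demo-key'::text, 'sha256'),
        '7733');

-- Equality checks (and therefore joins) still work on the digest,
-- but ordering comparisons no longer say anything about the number.
SELECT name
FROM customer
WHERE card_hmac = hmac('1234567853447733'::text, 'demo-key'::text, 'sha256');
```

Anyone who holds the key can still test guesses one at a time, so this limits range-based extraction rather than confirmation of a known number.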
Re: [HACKERS] Column Redaction
On 10 October 2014 13:43, Simon Riggs wrote: > On 10 October 2014 11:45, Thom Brown wrote: > >> To be honest, this all sounds rather flaky. > > To be honest, suggesting anything at all is rather difficult and I > recommend people try it. I have, and most ideas I've had have been justifiably shot down or picked apart (scheduled background tasks, offloading stats collection to standby, index maintenance in DML query plans, expression statistics... to name but a few). > Everything sounds crap when you didn't think of it and you've given it > an hour's thought. I'm not sure that means my concerns aren't valid. I don't think it sounds crap, but I also can't see any use-case for it where we don't already have things covered, or where it's going to offer any useful level of security. Like with RLS, it may be that I'm just looking at things from the wrong perspective. -- Thom
Re: [HACKERS] Column Redaction
On Fri, Oct 10, 2014 at 7:00 AM, Stephen Frost wrote: > * Thom Brown (t...@linux.com) wrote: >> To be honest, this all sounds rather flaky. Even if you do rate-limit >> their queries, they can use methods that avoid rate-limiting, such as >> recursive queries. And if you're only after one credit card number >> (to use the original example), you'd get it in a relatively short >> amount of time, despite some rate-limiting system. > > The discussion about looking up specific card numbers in the original > email from Simon was actually an allowed use-case, as I understood it, > not a risk concern. Indeed, if you know a valid credit card number > already, as in this example, then why are you bothering with the search? > Perhaps it would provide confirmation, but it's not the database's > responsibility to make you forget the number you already have. Doing a > random walk through a keyspace of 10^16 and extracting a significant > enough number of results to be useful should be difficult. I agree that > if we're completely unable to make it difficult then this is less > useful, but I feel it's a bit early to jump to that conclusion. You are obviously wearing your rose-colored glasses this morning. I predict a competent SQL programmer could write an SQL function, or client-side code, to pump the data out of the database using binary search in milliseconds per row. And I think it's more likely than not that there are other techniques that are much faster. The idea that you're going to be able to let people query the data but not actually retrieve it should be viewed with great skepticism. This is the equivalent of telling a child that she can't open her Christmas presents until Christmas, but she can shake them, hold them up to a bright light, and/or X-ray the packages. If she doesn't know what's in there by the time she opens it, it's just for lack of effort. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Column Redaction
On 10 October 2014 12:01, Heikki Linnakangas wrote: > Really, I don't see how this can possibly be made to work. You can't allow > ad hoc processing of data, and still avoid revealing it to the user. Anyone with unmonitored access and sufficient time can break through security. I think that is true of any kind of security, and so it is true here also. Auditing and controls are required also, that's why I suggested those first. This proposal was looking beyond that to what we might need next. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Column Redaction
On 10 October 2014 11:45, Thom Brown wrote: > To be honest, this all sounds rather flaky. To be honest, suggesting anything at all is rather difficult and I recommend people try it. Everything sounds crap when you didn't think of it and you've given it an hour's thought. I'm not blind to the difficulties raised and I thank you for your input, but I think it's too early to make sweeping generalisations. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Column Redaction
On 10 October 2014 12:45, Stephen Frost wrote: >> >> This gives the vague impression of security, but it really seems just >> >> the placing of a few obstacles in the way. >> > >> > One might consider that all security is just placing obstacles in the >> > way. >> >> There's a difference between intending that there shouldn't be a way >> past security and just making access a matter of walking a longer >> route. > > Throwing random 16-digit numbers and associated information at a credit > card processor could be viewed as "walking a longer route" too. The > same goes for random key searches or password guesses. But those would need to be exhaustive, and in nearly all cases, impractical. Data such as plain credit card numbers stored in a column, even with all its data masked, would be easy to determine. Salted and hashed passwords, even with complete visibility of the value, isn't vulnerable to scrutiny of particular character values. If it were, no-one would use it. Thom
Re: [HACKERS] Column Redaction
On 10/10/2014 02:27 PM, Stephen Frost wrote: * Heikki Linnakangas (hlinnakan...@vmware.com) wrote: On 10/10/2014 02:05 PM, Stephen Frost wrote: * Heikki Linnakangas (hlinnakan...@vmware.com) wrote: On 10/10/2014 01:35 PM, Stephen Frost wrote: Regarding functions, 'leakproof' functions should be alright to allow, though Heikki brings up a good point regarding binary search being possible in a plpgsql function (or even directly by a client). Of course, that approach also requires that you have a specific item in mind. It doesn't require that you have a specific item in mind. Binary search is cheap, O(log n). It's easy to write a function to do a binary search on a single item, passed as argument, and then apply that to all rows: SELECT binary_search_reveal(cardnumber) FROM redacted_table; Note that your binary_search_reveal wouldn't be marked as leakproof and therefore this wouldn't be allowed. If this was allowed, you'd simply do "raise notice" inside the function and call it a day. *shrug*, just do the same with a more complicated query, then. Even if you can't create a function that does that, you can still execute the same logic without a function. Not sure I see what you're getting at here..? My point was that you'd need a target number and the system would only provide confirmation that the number exists, or does not. Your argument was that the table itself would provide the target number, which was flawed. I don't see how "just do the same with a more complicated query" removes the need to have a target number for the binary search. You said above that it's OK to pass the card numbers to leakproof functions. But if you allow that, you can write a function that takes as argument a redacted card number, and unredacts it (using the < and = operators in a binary search). And then you can just do "SELECT unredact(card_number) from redacted_table". You seem to have something stronger in mind: only allow the equality operator on the redacted column, and nothing else. 
That might be better, although I'm not really convinced. There are just too many ways you could still leak the datum. Just a random example, inspired by the recent CRIME attack on SSL: build a row with the redacted datum, and another "guess" datum, and store it along with 1k of other data in a temporary table. The row gets toasted. Observe how much it compressed; if the guess datum is close to the original datum, it compresses well. Now, you can probably stop that particular attack with more restrictions on what you can do with the datum, but that just shows that pretty much any computation you allow with the datum can be used to reveal its value. - Heikki
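The binary_search_reveal function named in the message above is not spelled out anywhere in the thread. A minimal sketch of what such a function could look like, assuming bigint card numbers in the range 0 to 9999999999999999 (the bounds and the body are this example's assumptions, not Heikki's code):

```sql
-- Recovers its argument using nothing but ">" comparisons against it,
-- to show that comparison access alone is enough to reconstruct a value.
CREATE FUNCTION binary_search_reveal(secret bigint) RETURNS bigint AS $$
DECLARE
    lo  bigint := 0;
    hi  bigint := 9999999999999999;  -- assumed upper bound (16 digits)
    mid bigint;
BEGIN
    WHILE lo < hi LOOP
        mid := (lo + hi) / 2;
        IF secret > mid THEN
            lo := mid + 1;   -- secret is in the upper half
        ELSE
            hi := mid;       -- secret is in the lower half, or equals mid
        END IF;
    END LOOP;
    RETURN lo;
END;
$$ LANGUAGE plpgsql;

-- For any value inside the assumed range the function returns that value.
SELECT binary_search_reveal(4111111111111111);
```

As Stephen points out, a function like this would not be marked leakproof, so the open question in the thread is whether the same comparisons can be issued as plain SQL instead, which needs no function at all.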
Re: [HACKERS] Column Redaction
* Thom Brown (t...@linux.com) wrote: > On 10 October 2014 12:00, Stephen Frost wrote: > > The discussion about looking up specific card numbers in the original > > email from Simon was actually an allowed use-case, as I understood it, > > not a risk concern. Indeed, if you know a valid credit card number > > already, as in this example, then why are you bothering with the search? > > The topic being "column redaction" rather than "column formatting" > leads me to believe that the main use-case of the feature would be to > prevent the user from discovering the full value of the column. I believe the idea is to limit the chances that a user with limited pre-existing knowledge would be able to determine the full value of items in the column, especially in bulk. > It's > not so much point 1 I was responding to, rather point 3, where you > don't know the card number, but you get information about it in the > results. We'd certainly want to prevent that to the limit possible. Do you have a specific thought about how they'd be able to find a full number beyond a random search..? > The purpose of this feature would be to prevent the user > from seeing all that data, which is a security feature, but at the > moment it just seems to be a way of making it a little less easy to > get at that data. I certainly appreciate the thought challenges and critique and I'm hopeful we could make it more than "a little less easy" to get at the information. If we aren't able to do that, then the feature isn't useful, certainly. > >> This gives the vague impression of security, but it really seems just > >> the placing of a few obstacles in the way. > > > > One might consider that all security is just placing obstacles in the > > way. > > There's a difference between intending that there shouldn't be a way > past security and just making access a matter of walking a longer > route. 
Throwing random 16-digit numbers and associated information at a credit card processor could be viewed as "walking a longer route" too. The same goes for random key searches or password guesses. Thanks, Stephen
Re: [HACKERS] Column Redaction
* Heikki Linnakangas (hlinnakan...@vmware.com) wrote: > On 10/10/2014 02:05 PM, Stephen Frost wrote: > >* Heikki Linnakangas (hlinnakan...@vmware.com) wrote: > >>On 10/10/2014 01:35 PM, Stephen Frost wrote: > >>>Regarding functions, 'leakproof' functions should be alright to allow, > >>>though Heikki brings up a good point regarding binary search being > >>>possible in a plpgsql function (or even directly by a client). Of > >>>course, that approach also requires that you have a specific item in > >>>mind. > >> > >>It doesn't require that you have a specific item in mind. Binary > >>search is cheap, O(log n). It's easy to write a function to do a > >>binary search on a single item, passed as argument, and then apply > >>that to all rows: > >> > >>SELECT binary_search_reveal(cardnumber) FROM redacted_table; > > > >Note that your binary_search_reveal wouldn't be marked as leakproof and > >therefore this wouldn't be allowed. If this was allowed, you'd simply > >do "raise notice" inside the function and call it a day. > > *shrug*, just do the same with a more complicated query, then. Even > if you can't create a function that does that, you can still execute > the same logic without a function. Not sure I see what you're getting at here..? My point was that you'd need a target number and the system would only provide confirmation that the number exists, or does not. Your argument was that the table itself would provide the target number, which was flawed. I don't see how "just do the same with a more complicated query" removes the need to have a target number for the binary search. A better argument would be the equality case than the binary search if you're simply looking for confirmation of existence. If the user can define a table of targets, or uses a VALUES construct, and then join to it then we might build a hash table and provide those results faster than a binary search, though this again means that the user is providing the list of keys to check. 
As mentioned elsewhere on the thread, I agree that this capability wouldn't be useful if a random search (which is providing the 'targets') through a 10^16 keyspace generated a significant number of results (I'd also throw in there "in a reasonable amount of time"- clearly it'd be possible to extract all keys given sufficient time, even with a random search). The sketch that Simon outlined won't obviously provide that guarantee, but I'm not prepared to say we couldn't provide it at all while meeting the goal he outlined. Thanks, Stephen
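To put rough numbers on the random-search argument, here is a back-of-the-envelope sketch in Python. The 10^16 keyspace comes from the message above; the number of stored cards and the probe rate are made-up assumptions, and guesses are treated as uniformly random. Real card numbers have structure (issuer prefixes, a Luhn check digit), so the effective keyspace is smaller than this.

```python
KEYSPACE = 10**16  # all 16-digit numbers, as discussed in the thread


def expected_guesses_per_hit(keyspace: int, stored: int) -> float:
    """Average number of uniform random guesses before one matches a stored key."""
    return keyspace / stored


def expected_days_per_hit(keyspace: int, stored: int, guesses_per_second: float) -> float:
    """Average wall-clock days per hit at a given probe rate."""
    seconds = expected_guesses_per_hit(keyspace, stored) / guesses_per_second
    return seconds / 86_400


if __name__ == "__main__":
    stored_cards = 10**7   # assumption: ten million cards in the table
    rate = 1_000           # assumption: one thousand existence probes per second
    print(expected_guesses_per_hit(KEYSPACE, stored_cards))
    print(expected_days_per_hit(KEYSPACE, stored_cards, rate))
```

Under these assumptions a single hit takes on the order of days, and bulk extraction by random probing takes far longer. That is the property a rate limit or audit threshold would be trying to preserve.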
Re: [HACKERS] Column Redaction
On 10 October 2014 12:00, Stephen Frost wrote:
> * Thom Brown (t...@linux.com) wrote:
>> To be honest, this all sounds rather flaky. Even if you do rate-limit
>> their queries, they can use methods that avoid rate-limiting, such as
>> recursive queries. And if you're only after one credit card number
>> (to use the original example), you'd get it in a relatively short
>> amount of time, despite some rate-limiting system.
>
> The discussion about looking up specific card numbers in the original
> email from Simon was actually an allowed use-case, as I understood it,
> not a risk concern. Indeed, if you know a valid credit card number
> already, as in this example, then why are you bothering with the search?

The topic being "column redaction" rather than "column formatting" leads me to believe that the main use-case of the feature would be to prevent the user from discovering the full value of the column. It's not so much point 1 I was responding to, rather point 3, where you don't know the card number, but you get information about it in the results. The purpose of this feature would be to prevent the user from seeing all that data, which is a security feature, but at the moment it just seems to be a way of making it a little less easy to get at that data.

>> This gives the vague impression of security, but it really seems just
>> the placing of a few obstacles in the way.
>
> One might consider that all security is just placing obstacles in the
> way.

There's a difference between intending that there shouldn't be a way past security and just making access a matter of walking a longer route.

I wouldn't be against formatting per se, but for the purposes of that, I would say that views can already serve that purpose.

--
Thom
Re: [HACKERS] Column Redaction
On 10/10/2014 02:05 PM, Stephen Frost wrote:
> * Heikki Linnakangas (hlinnakan...@vmware.com) wrote:
>> On 10/10/2014 01:35 PM, Stephen Frost wrote:
>>> Regarding functions, 'leakproof' functions should be alright to allow,
>>> though Heikki brings up a good point regarding binary search being
>>> possible in a plpgsql function (or even directly by a client). Of
>>> course, that approach also requires that you have a specific item in
>>> mind.
>>
>> It doesn't require that you have a specific item in mind. Binary
>> search is cheap, O(log n). It's easy to write a function to do a
>> binary search on a single item, passed as argument, and then apply
>> that to all rows:
>>
>> SELECT binary_search_reveal(cardnumber) FROM redacted_table;
>
> Note that your binary_search_reveal wouldn't be marked as leakproof and
> therefore this wouldn't be allowed. If this was allowed, you'd simply
> do "raise notice" inside the function and call it a day.

*shrug*, just do the same with a more complicated query, then. Even if you can't create a function that does that, you can still execute the same logic without a function.

- Heikki
Re: [HACKERS] Column Redaction
On 10/10/2014 11:38 AM, Simon Riggs wrote: > On 10 October 2014 10:29, Heikki Linnakangas wrote: >> On 10/10/2014 11:57 AM, Simon Riggs wrote: >>> Postgres currently supports column level SELECT privileges. >>> >>> 1. If we want to confirm a credit card number, we can issue SELECT 1 >>> FROM customer WHERE stored_card_number = '1234 5678 5344 7733' >>> >>> 2. If we want to look for card fraud, we need to be able to use the >>> full card number to join to transaction data and look up blocked card >>> lists etc.. >>> >>> 3. We want to block the direct retrieval of card numbers for >>> additional security. >>> In some cases, we might want to return an answer like ' * >>> 7733' >>> >>> We can't do all of the above with current facilities inside the database. >> >> Deny access to the underlying tables. Write SQL functions to do 1. and 2., >> and grant privileges to the functions, instead. For 3. create views that do >> the redaction. > If everything were easy to lock down the approach you suggest is of > course the best way. > > The problem there is that the SQL for (2) changes frequently, so we > want to give people SQL access. 1. Give people access to development system with "safe" data where they write their functions 2. once function is working, pass it to auditors 3. deploy and use the function. > Just not the ability to retrieve data in a usable form. For an attacker any access is "in a usable form", for honest people you can just provide a view or set-returning function. btw, one way to do the "redaction" you suggested above is to write a special type, which redacts data on output. You can even make the type output function dependent on backup role. Just make sure that users are aware that it is not really a security feature which protects against attackers. 
Cheers -- Hannu Krosing PostgreSQL Consultant Performance, Scalability and High Availability 2ndQuadrant Nordic OÜ
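The "special type which redacts data on output" idea can be sketched outside the database. The Python class below is a loose analogy, not a PostgreSQL type: the class name, the mask format, and the `"backup"` role string are all assumptions made for illustration.

```python
class RedactedCardNumber:
    """Card number that compares on its full value but prints redacted."""

    def __init__(self, number: str):
        self._number = number

    def __eq__(self, other):
        # Comparisons use the full value, so joins and lookups still work.
        if isinstance(other, RedactedCardNumber):
            return self._number == other._number
        if isinstance(other, str):
            return self._number == other
        return NotImplemented

    def __hash__(self):
        return hash(self._number)

    def output(self, role: str = "user") -> str:
        """Output function: full value for the (hypothetical) backup role only."""
        if role == "backup":
            return self._number
        head, tail = self._number[:-4], self._number[-4:]
        return "".join("*" if c.isdigit() else c for c in head) + tail

    def __str__(self):
        return self.output()

    __repr__ = __str__


if __name__ == "__main__":
    card = RedactedCardNumber("1234 5678 5344 7733")
    print(card)                            # redacted form
    print(card == "1234 5678 5344 7733")   # equality still answers truthfully
```

The last line shows why this is not protection against an attacker: anyone who can compare against the value can still confirm guesses, which matches the caveat in the message above.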
Re: [HACKERS] Column Redaction
* Heikki Linnakangas (hlinnakan...@vmware.com) wrote: > On 10/10/2014 01:35 PM, Stephen Frost wrote: > >Regarding functions, 'leakproof' functions should be alright to allow, > >though Heikki brings up a good point regarding binary search being > >possible in a plpgsql function (or even directly by a client). Of > >course, that approach also requires that you have a specific item in > >mind. > > It doesn't require that you have a specific item in mind. Binary > search is cheap, O(log n). It's easy to write a function to do a > binary search on a single item, passed as argument, and then apply > that to all rows: > > SELECT binary_search_reveal(cardnumber) FROM redacted_table; Note that your binary_search_reveal wouldn't be marked as leakproof and therefore this wouldn't be allowed. If this was allowed, you'd simply do "raise notice" inside the function and call it a day. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Column Redaction
On 10/10/2014 01:35 PM, Stephen Frost wrote:
> Regarding functions, 'leakproof' functions should be alright to allow,
> though Heikki brings up a good point regarding binary search being
> possible in a plpgsql function (or even directly by a client). Of
> course, that approach also requires that you have a specific item in
> mind.

It doesn't require that you have a specific item in mind. Binary search is cheap, O(log n). It's easy to write a function to do a binary search on a single item, passed as argument, and then apply that to all rows:

SELECT binary_search_reveal(cardnumber) FROM redacted_table;

Really, I don't see how this can possibly be made to work. You can't allow ad hoc processing of data, and still avoid revealing it to the user.

- Heikki
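A minimal Python sketch of the binary search argument follows. The body of `binary_search_reveal` is not given anywhere in the thread, so this is an assumed reconstruction of the idea: the `ComparisonOracle` class is a hypothetical stand-in for a column that can only be compared with `<`, never read.

```python
class ComparisonOracle:
    """Hides an integer and answers only 'is the hidden value < x?'."""

    def __init__(self, secret: int):
        self._secret = secret
        self.calls = 0

    def less_than(self, x: int) -> bool:
        self.calls += 1
        return self._secret < x


def binary_search_reveal(oracle: ComparisonOracle, lo: int = 0, hi: int = 10**16) -> int:
    """Recover the hidden value, assumed to lie in the half-open range [lo, hi)."""
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if oracle.less_than(mid):
            hi = mid  # hidden value is in [lo, mid)
        else:
            lo = mid  # hidden value is in [mid, hi)
    return lo


if __name__ == "__main__":
    oracle = ComparisonOracle(1234567853447733)
    print(binary_search_reveal(oracle), "recovered in", oracle.calls, "comparisons")
```

Because 10^16 is below 2^54, at most 54 comparisons are needed per value, and no target number has to be known in advance. That is the crux of the disagreement in the surrounding messages.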
Re: [HACKERS] Column Redaction
* Thom Brown (t...@linux.com) wrote: > To be honest, this all sounds rather flaky. Even if you do rate-limit > their queries, they can use methods that avoid rate-limiting, such as > recursive queries. And if you're only after one credit card number > (to use the original example), you'd get it in a relatively short > amount of time, despite some rate-limiting system. The discussion about looking up specific card numbers in the original email from Simon was actually an allowed use-case, as I understood it, not a risk concern. Indeed, if you know a valid credit card number already, as in this example, then why are you bothering with the search? Perhaps it would provide confirmation, but it's not the database's responsibility to make you forget the number you already have. Doing a random walk through a keyspace of 10^16 and extracting a significant enough number of results to be useful should be difficult. I agree that if we're completely unable to make it difficult then this is less useful, but I feel it's a bit early to jump to that conclusion. > This gives the vague impression of security, but it really seems just > the placing of a few obstacles in the way. One might consider that all security is just placing obstacles in the way. > And "auditing" sounds like a euphemism for "pass the problem of > security on elsewhere anyway". Auditing is a known requirement for good security.. There's certainly different levels of it, but if you aren't at least auditing your security configuration for the attack vectors you're concerned about, then you're unlikely to have any real security. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Column Redaction
* Heikki Linnakangas (hlinnakan...@vmware.com) wrote: > On 10/10/2014 01:21 PM, Simon Riggs wrote: > >Redaction is now a feature available in other databases. I guess its > >possible its all smoke and mirrors, but thats why we discuss stuff > >before we build it. > > I googled for Oracle Data redaction, and found "General Usage guidelines": > > >General Usage Guidelines > > > >* Oracle Data Redaction is not intended to protect against attacks by > >privileged database users who run ad hoc queries directly against the > >database. > > > >* Oracle Data Redaction is not intended to protect against users who > >run exhaustive SQL queries that attempt to determine the actual > >values by inference. > > So it's not actually suitable for the example you gave. I don't > think we want this feature... Or, we need to consider how Oracle addresses these risks and consider if we can provide a similar capability. Those capabilities may include specific configuration and could be a prerequisite for this feature, but I don't think it's sensible to say we don't want this feature simply because it can't stand alone as a perfect answer to these risks. As has been discussed before, we are likely in a better position to identify the concerns and problem areas, come up with recommendations for configuration and/or develop new capabilities to mitigate those risks, than the every-day user or DBA. If we provide it and address these issues in a central location which is generally available, then fixes and problems can be addressed and fixed rather than every database implementation faced with these concerns having to address them independently with, most likely, poorer quality solutions. While we don't want every feature of every database, this deserves more consideration. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Column Redaction
On 10 October 2014 11:35, Stephen Frost wrote: > Simon, > > * Simon Riggs (si...@2ndquadrant.com) wrote: >> The requirement for redaction cannot be provided by a view. >> >> A view provides a single value for each column, no matter whether it >> is used in SELECT or WHERE clause. >> >> Redaction requires output formatting only, but unchanged for other purposes. >> >> Redaction is now a feature available in other databases. I guess its >> possible its all smoke and mirrors, but thats why we discuss stuff >> before we build it. > > In general, I'm on-board with the idea and similar requests have come > from users I've talked with. > > Is there any additional information available on how these other > databases deal with the questions and concerns which have been raised? > > Regarding functions, 'leakproof' functions should be alright to allow, > though Heikki brings up a good point regarding binary search being > possible in a plpgsql function (or even directly by a client). Of > course, that approach also requires that you have a specific item in > mind. Methods to mitigate would include not allowing regular users to > create functions or run DO blocks and rate-limiting their queries, along > with appropriate auditing. To be honest, this all sounds rather flaky. Even if you do rate-limit their queries, they can use methods that avoid rate-limiting, such as recursive queries. And if you're only after one credit card number (to use the original example), you'd get it in a relatively short amount of time, despite some rate-limiting system. This gives the vague impression of security, but it really seems just the placing of a few obstacles in the way. And "auditing" sounds like a euphemism for "pass the problem of security on elsewhere anyway". Thom -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Column Redaction
Hi 2014-10-10 10:57 GMT+02:00 Simon Riggs : > Postgres currently supports column level SELECT privileges. > > 1. If we want to confirm a credit card number, we can issue SELECT 1 > FROM customer WHERE stored_card_number = '1234 5678 5344 7733' > > 2. If we want to look for card fraud, we need to be able to use the > full card number to join to transaction data and look up blocked card > lists etc.. > > 3. We want to block the direct retrieval of card numbers for > additional security. > In some cases, we might want to return an answer like ' * > 7733' > > We can't do all of the above with current facilities inside the database. > > The ability to mask output for data in certain cases, for the purpose > of security, is known lately as data redaction, or column-level data > redaction. > > The best way to support this requirement would be to allow columns to > have an additional "output formatting function". This would be > executed only when data is about to be returned by a query. All other > uses of that would not restrict the data. > > This would have other uses as well, such as default report formats, so > we can store financial amounts as NUMERIC, but format them on > retrieval as $12,345.78 etc.. > > Suggested user interface would be... >FORMAT functionname(parameters, if any) > > e.g. > CREATE TABLE customer > ( id ... > ... > , stored_card_number NUMERIC FORMAT pci_card_number_redaction() > ... > ); > > We'd need to implement something to allow pg_dump to ignore format > functions. I suggest the best way to do that is by providing a BACKUP > role that can be delegated to other users. We would then allow a > parameter for SET output_formatting = on | off, which can only be set > by superuser and BACKUP role, then have pg_dump issue SET > output_formatting = off explicitly when it runs. > > I see a benefit of this feature as alternative output function .. I remember a talk about output format of boolean function. But how this feature can help to security? 
For this to help security, you would have to either disallow any expression over a column marked this way, or enforce the alternative output function early. If you require an alternative output format function (which should be implemented in C), then it is not much less work than implementing a new type. So it would probably be much more practical to allow any expression, like

stored_card_number NUMERIC FORMAT (right(stored_card_number::text, 4))

Regards

Pavel

> Do we want redaction in PostgreSQL?
> Do we want it generalised into output format functions?
Re: [HACKERS] Column Redaction
Simon, * Simon Riggs (si...@2ndquadrant.com) wrote: > The requirement for redaction cannot be provided by a view. > > A view provides a single value for each column, no matter whether it > is used in SELECT or WHERE clause. > > Redaction requires output formatting only, but unchanged for other purposes. > > Redaction is now a feature available in other databases. I guess its > possible its all smoke and mirrors, but thats why we discuss stuff > before we build it. In general, I'm on-board with the idea and similar requests have come from users I've talked with. Is there any additional information available on how these other databases deal with the questions and concerns which have been raised? Regarding functions, 'leakproof' functions should be alright to allow, though Heikki brings up a good point regarding binary search being possible in a plpgsql function (or even directly by a client). Of course, that approach also requires that you have a specific item in mind. Methods to mitigate would include not allowing regular users to create functions or run DO blocks and rate-limiting their queries, along with appropriate auditing. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Column Redaction
On 10/10/2014 01:21 PM, Simon Riggs wrote:
> Redaction is now a feature available in other databases. I guess its
> possible its all smoke and mirrors, but thats why we discuss stuff
> before we build it.

I googled for Oracle Data redaction, and found "General Usage guidelines":

> General Usage Guidelines
>
> * Oracle Data Redaction is not intended to protect against attacks by
> privileged database users who run ad hoc queries directly against the
> database.
>
> * Oracle Data Redaction is not intended to protect against users who
> run exhaustive SQL queries that attempt to determine the actual
> values by inference.

So it's not actually suitable for the example you gave. I don't think we want this feature...

- Heikki
Re: [HACKERS] Column Redaction
On 10 October 2014 11:08, Damian Wolgast wrote: > >> The problem there is that the SQL for (2) changes frequently, so we >> want to give people SQL access. > > So you want to give people access to your SQL database and worry that they > could see specific information (credit card numbers) in plain and therefore > you want to format it, so that people cannot see the real data. Is that > correct? > > I'd either do that by only letting them access a view or be reconsidering if > it is really a good idea to give them SQL access to the server as they could > do other things which e.g. could slow down the server enormously. > Never trust the user. So I see what you want to achieve but I am not sure if > the reason to do that is good. Can you explain please? > Maybe you should provide them an interface (e.g. web app) that restricts > access to certain functions and cares about formatting. The requirement for redaction cannot be provided by a view. A view provides a single value for each column, no matter whether it is used in SELECT or WHERE clause. Redaction requires output formatting only, but unchanged for other purposes. Redaction is now a feature available in other databases. I guess its possible its all smoke and mirrors, but thats why we discuss stuff before we build it. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Column Redaction
> The problem there is that the SQL for (2) changes frequently, so we > want to give people SQL access. So you want to give people access to your SQL database and worry that they could see specific information (credit card numbers) in plain and therefore you want to format it, so that people cannot see the real data. Is that correct? I'd either do that by only letting them access a view or be reconsidering if it is really a good idea to give them SQL access to the server as they could do other things which e.g. could slow down the server enormously. Never trust the user. So I see what you want to achieve but I am not sure if the reason to do that is good. Can you explain please? Maybe you should provide them an interface (e.g. web app) that restricts access to certain functions and cares about formatting. Regards Damian Wolgast (irc:asymetrixs) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Column Redaction
On 10 October 2014 10:15, Thom Brown wrote: > On 10 October 2014 09:57, Simon Riggs wrote: >> Postgres currently supports column level SELECT privileges. >> >> 1. If we want to confirm a credit card number, we can issue SELECT 1 >> FROM customer WHERE stored_card_number = '1234 5678 5344 7733' >> >> 2. If we want to look for card fraud, we need to be able to use the >> full card number to join to transaction data and look up blocked card >> lists etc.. >> >> 3. We want to block the direct retrieval of card numbers for >> additional security. >> In some cases, we might want to return an answer like ' * 7733' > > One question that immediately springs to mind is: would the format > apply when passing columns to other functions? If not, wouldn't > something like > > SELECT upper(redacted_column::text) ... > > just bypass the formatting? Yes, it would. As would SELECT redacted_column || ' ' I'm not sure how to block such usage, other than to apply it prior to final calculation of functions. i.e. we apply it in the SELECT clause, but not in the other clauses FROM ON/WHERE/GROUP/ORDER/HAVING etc.. > Also, how would casting be handled? Would it be forbidden for such cases? > > > And couldn't the card number be worked out using: > > SELECT 1 FROM customer WHERE stored_card_number LIKE '%1 7733'; > ?column? > -- > (0 rows) > > SELECT 1 FROM customer WHERE stored_card_number LIKE '%2 7733'; > ?column? > -- > 1 > (1 row) > > SELECT 1 FROM customer WHERE stored_card_number LIKE '%12 7733'; > ?column? > -- > (0 rows) > > .. and so on, which could be scripted in a DO statement? > > > Not so much a challenge to the idea, but just wishing to understand > how it would work. Yes, covert channels would always exist. It would really be down to auditing to control such exploits. Redaction is aimed at minimising access in normal usage. 
-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Column Redaction
On 10 October 2014 10:29, Heikki Linnakangas wrote: > On 10/10/2014 11:57 AM, Simon Riggs wrote: >> >> Postgres currently supports column level SELECT privileges. >> >> 1. If we want to confirm a credit card number, we can issue SELECT 1 >> FROM customer WHERE stored_card_number = '1234 5678 5344 7733' >> >> 2. If we want to look for card fraud, we need to be able to use the >> full card number to join to transaction data and look up blocked card >> lists etc.. >> >> 3. We want to block the direct retrieval of card numbers for >> additional security. >> In some cases, we might want to return an answer like ' * >> 7733' >> >> We can't do all of the above with current facilities inside the database. > > > Deny access to the underlying tables. Write SQL functions to do 1. and 2., > and grant privileges to the functions, instead. For 3. create views that do > the redaction. If everything were easy to lock down the approach you suggest is of course the best way. The problem there is that the SQL for (2) changes frequently, so we want to give people SQL access. Just not the ability to retrieve data in a usable form. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Column Redaction
> This would have other uses as well, such as default report formats, so
> we can store financial amounts as NUMERIC, but format them on
> retrieval as $12,345.78 etc..

Nice idea, but what if you need to do further calculations? If you output the value of credit card transactions it works fine, but in case you want to SUM up the values, then you need to cast it back from text(?) to numeric, calculate it and cast it to text(?) again? And if you do - for any reason - need the credit card number in your application (for example sending it to the credit card company to deduct money) how can you retrieve its original value? Moreover, what happens if you SELECT from a sub-SELECT which already has the formatted information and not the plain data?

Maybe you should restrict access to tables for a certain user and only allow the user to use a view which formats the output.

Modern applications do have a presentation layer which should take care of data formatting. I am not sure if it is a good idea to mix data storage and data presentation in the database.

Regards,
Damian Wolgast (irc:asymetrixs)
Re: [HACKERS] Column Redaction
On 10/10/2014 11:57 AM, Simon Riggs wrote:
> Postgres currently supports column level SELECT privileges.
>
> 1. If we want to confirm a credit card number, we can issue SELECT 1
> FROM customer WHERE stored_card_number = '1234 5678 5344 7733'
>
> 2. If we want to look for card fraud, we need to be able to use the
> full card number to join to transaction data and look up blocked card
> lists etc..
>
> 3. We want to block the direct retrieval of card numbers for
> additional security.
> In some cases, we might want to return an answer like ' * 7733'
>
> We can't do all of the above with current facilities inside the database.

Deny access to the underlying tables. Write SQL functions to do 1. and 2., and grant privileges to the functions, instead. For 3. create views that do the redaction.

> The ability to mask output for data in certain cases, for the purpose
> of security, is known lately as data redaction, or column-level data
> redaction.
>
> The best way to support this requirement would be to allow columns to
> have an additional "output formatting function". This would be
> executed only when data is about to be returned by a query. All other
> uses of that would not restrict the data.

I don't see how that could work. Once you have access to the datum, you can find its value in many indirect ways, without invoking the output function. For example, write a PL/pgSQL function that takes the card number as argument. Use < and > to binary search its value. If you block < and >, I'm sure there are countless other ways.

And messing with output functions seems pretty, well, messy, in general.

I think the only solution that's going to work in practice is to implement the redaction at a higher level. Don't allow direct access to the tables with card numbers. Create functions that do whatever joins, etc. you need to do with them, and grant privileges to only the functions.

- Heikki
Re: [HACKERS] Column Redaction
On 10 October 2014 09:57, Simon Riggs wrote:
> Postgres currently supports column level SELECT privileges.
>
> 1. If we want to confirm a credit card number, we can issue SELECT 1
> FROM customer WHERE stored_card_number = '1234 5678 5344 7733'
>
> 2. If we want to look for card fraud, we need to be able to use the
> full card number to join to transaction data and look up blocked card
> lists etc..
>
> 3. We want to block the direct retrieval of card numbers for
> additional security.
> In some cases, we might want to return an answer like ' * 7733'

One question that immediately springs to mind is: would the format apply when passing columns to other functions? If not, wouldn't something like

SELECT upper(redacted_column::text) ...

just bypass the formatting?

Also, how would casting be handled? Would it be forbidden for such cases?

And couldn't the card number be worked out using:

SELECT 1 FROM customer WHERE stored_card_number LIKE '%1 7733';
 ?column?
----------
(0 rows)

SELECT 1 FROM customer WHERE stored_card_number LIKE '%2 7733';
 ?column?
----------
        1
(1 row)

SELECT 1 FROM customer WHERE stored_card_number LIKE '%12 7733';
 ?column?
----------
(0 rows)

.. and so on, which could be scripted in a DO statement?

Not so much a challenge to the idea, but just wishing to understand how it would work.

--
Thom
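The LIKE probing idea can be scripted along these lines. The sketch uses Python's built-in `sqlite3` module as a stand-in for the server, and the helper names are assumptions. The attacker starts from the visible last four digits and only ever sees whether a row matched.

```python
import sqlite3

ALPHABET = "0123456789 "  # characters a stored card number may contain


def exists_like(conn: sqlite3.Connection, pattern: str) -> bool:
    """Existence probe: reports only whether some row matches the pattern."""
    row = conn.execute(
        "SELECT 1 FROM customer WHERE stored_card_number LIKE ? LIMIT 1",
        (pattern,),
    ).fetchone()
    return row is not None


def extend_suffix(conn: sqlite3.Connection, known_suffix: str, length: int) -> str:
    """Grow a known suffix leftwards, one character per round of probes."""
    suffix = known_suffix
    while len(suffix) < length:
        for ch in ALPHABET:
            if exists_like(conn, "%" + ch + suffix):
                suffix = ch + suffix
                break
        else:
            raise LookupError("no stored value ends with " + repr(suffix))
    return suffix


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (stored_card_number TEXT)")
    conn.execute("INSERT INTO customer VALUES ('1234 5678 5344 7733')")
    print(extend_suffix(conn, "7733", 19))
```

Each missing character costs at most 11 probes here, so a 19-character value falls in well under 200 queries. A rate limit would have to be very tight to matter against this.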
Re: [HACKERS] Column Redaction
On Fri, Oct 10, 2014 at 9:57 AM, Simon Riggs wrote: > Postgres currently supports column level SELECT privileges. > > 1. If we want to confirm a credit card number, we can issue SELECT 1 > FROM customer WHERE stored_card_number = '1234 5678 5344 7733' > > 2. If we want to look for card fraud, we need to be able to use the > full card number to join to transaction data and look up blocked card > lists etc.. > > 3. We want to block the direct retrieval of card numbers for > additional security. > In some cases, we might want to return an answer like ' * 7733' > > We can't do all of the above with current facilities inside the database. > > The ability to mask output for data in certain cases, for the purpose > of security, is known lately as data redaction, or column-level data > redaction. > > The best way to support this requirement would be to allow columns to > have an additional "output formatting function". This would be > executed only when data is about to be returned by a query. All other > uses of that would not restrict the data. > > This would have other uses as well, such as default report formats, so > we can store financial amounts as NUMERIC, but format them on > retrieval as $12,345.78 etc.. > > Suggested user interface would be... >FORMAT functionname(parameters, if any) > > e.g. > CREATE TABLE customer > ( id ... > ... > , stored_card_number NUMERIC FORMAT pci_card_number_redaction() > ... > ); I like that idea a lot - could be very useful (it reminds me of my Pick days). > We'd need to implement something to allow pg_dump to ignore format > functions. I suggest the best way to do that is by providing a BACKUP > role that can be delegated to other users. We would then allow a > parameter for SET output_formatting = on | off, which can only be set > by superuser and BACKUP role, then have pg_dump issue SET > output_formatting = off explicitly when it runs. That seems like a reasonable approach. 
I can imagine other uses for a BACKUP role in the future.

> Do we want redaction in PostgreSQL?

+1

> Do we want it generalised into output format functions?

+1

-- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company
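On the "generalised output format" use quoted above (storing NUMERIC, showing $12,345.78), a small Python sketch shows why formatting is best applied last. The function names are hypothetical, and `Decimal` stands in for NUMERIC.

```python
from decimal import Decimal


def format_money(amount: Decimal) -> str:
    """Output-only formatting, e.g. Decimal('12345.78') -> '$12,345.78'."""
    return f"${amount:,.2f}"


def parse_money(text: str) -> Decimal:
    """Undo the formatting; needed if only the formatted text is available."""
    return Decimal(text.replace("$", "").replace(",", ""))


if __name__ == "__main__":
    stored = [Decimal("12345.78"), Decimal("0.22")]

    # Aggregating the stored values and formatting once at the end is simple.
    print(format_money(sum(stored)))

    # Aggregating already-formatted text forces a round trip through parsing.
    formatted = [format_money(a) for a in stored]
    print(format_money(sum(parse_money(t) for t in formatted)))
```

Both paths print the same total, but the second one only works because the format happens to be reversible. A redacting format is not reversible, which is why the proposal applies it only when a value is returned.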