Re: [protobuf] Proposal: a mechanism to deal with sensitive/redacted fields in string output

Zellyn Hunter Thu, 25 Apr 2019 07:02:01 -0700

Hey there, just checking in from Paternity Leave. Last I heard, the
Protobuf team was not opposed to the idea, but thought it would be
relatively invasive and thus unlikely to get accepted. I believe that it
will hardly be invasive at all, so I think the most-likely-to-succeed
course of action is to provide a related set of minimal patches to (a)
protoc, (b) the C protobuf implementation, (c) the Java implementation and
(d) the Go implementation, supporting sensitive fields.


Unfortunately I just haven't found the time to do all that.
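
For anyone skimming the archive, here is a rough sketch of the behaviour those patches would add. It is plain Python and does not use any real protobuf runtime or API: `Field`, `Message`, the `sensitive` flag and `to_dict` are all hypothetical stand-ins. It only shows the intended split, where debug stringification hides sensitive values and explicit serialization keeps them.

```python
from dataclasses import dataclass
from typing import Any, List

REDACTED = "[REDACTED]"  # placeholder text used by the fork described in this thread


@dataclass(repr=False)
class Field:
    """Hypothetical stand-in for a field descriptor plus its value."""
    name: str
    value: Any
    sensitive: bool = False  # models the proposed "sensitive" field option


@dataclass(repr=False)
class Message:
    """Hypothetical stand-in for a generated message type."""
    fields: List[Field]

    def __str__(self) -> str:
        # Debug stringification: sensitive values are never printed.
        parts = []
        for f in self.fields:
            if f.sensitive:
                shown = REDACTED
            elif isinstance(f.value, Message):
                shown = "{ " + str(f.value) + " }"
            else:
                shown = repr(f.value)
            parts.append(f"{f.name}: {shown}")
        return " ".join(parts)

    def to_dict(self) -> dict:
        # Explicit serialization: every value is preserved.
        return {
            f.name: f.value.to_dict() if isinstance(f.value, Message) else f.value
            for f in self.fields
        }


card = Message([Field("pan", "4111111111111111", sensitive=True),
                Field("brand", "VISA")])
payment = Message([Field("amount_cents", 1250), Field("card", card)])

print(payment)  # amount_cents: 1250 card: { pan: [REDACTED] brand: 'VISA' }
```

The flag lives on the field, so a nested message is redacted wherever it appears without the outer message knowing about it.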

Zellyn


On Fri, Apr 19, 2019 at 5:12 PM Josh Humphries <[email protected]>
wrote:

> I don't think there has been any movement on this, but I'd like to ping
> the thread again.
>
> I am still a strong proponent of standard metadata in the proto source and
> descriptor to indicate sensitive things. I also think it's a truly wise
> idea to use that information to trigger auto-redaction of sensitive data
> when serializing a message in certain contexts (such as "stringification",
> like when emitting the data into a debug log message).
>
> At FullStory, we've built some interesting tools for using protobufs and
> gRPC. Most recent is a web UI for gRPC:
> https://github.com/fullstorydev/grpcui. I would *love* to be able to add
> features to some of these tools that can be aware of sensitive data in
> messages that they handle and even give the data special treatment in some
> ways. For the web UI, there are a few interesting things that a client might
> choose to do in the face of sensitive data. For one, if that page includes
> an analytics/recording tool, it could make sure the elements in the DOM
> that may contain sensitive data are excluded from analysis/recording.
> Another: it could refuse to send an RPC over an insecure connection if any
> of the message's fields are marked as sensitive.
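
The insecure-connection check described above could look roughly like the sketch below. It assumes a simplified descriptor model: `FieldDesc`, `MessageDesc`, `has_sensitive_field` and `check_transport` are hypothetical names, not part of protobuf, gRPC or grpcui.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FieldDesc:
    """Hypothetical, simplified field descriptor."""
    name: str
    sensitive: bool = False                       # models the proposed field option
    message_type: Optional["MessageDesc"] = None  # set for nested message fields


@dataclass
class MessageDesc:
    """Hypothetical, simplified message descriptor."""
    name: str
    fields: List[FieldDesc] = field(default_factory=list)


def has_sensitive_field(desc: MessageDesc, _seen=None) -> bool:
    """True if desc, or any message type nested in it, declares a sensitive field."""
    seen = _seen if _seen is not None else set()
    if id(desc) in seen:  # guard against recursive message types
        return False
    seen.add(id(desc))
    for f in desc.fields:
        if f.sensitive:
            return True
        if f.message_type is not None and has_sensitive_field(f.message_type, seen):
            return True
    return False


def check_transport(desc: MessageDesc, secure: bool) -> None:
    """Raise instead of sending a sensitive message over an insecure connection."""
    if not secure and has_sensitive_field(desc):
        raise PermissionError(
            f"{desc.name} has sensitive fields; refusing to send over an insecure connection"
        )


card = MessageDesc("Card", [FieldDesc("pan", sensitive=True)])
charge = MessageDesc("ChargeRequest",
                     [FieldDesc("amount_cents"), FieldDesc("card", message_type=card)])
ping = MessageDesc("Ping", [FieldDesc("nonce")])

check_transport(ping, secure=False)   # fine: nothing sensitive
check_transport(charge, secure=True)  # fine: connection is secure
```

The check only needs descriptors, not message values, so a tool could run it once per method rather than once per call.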
>
> Outside of the web UI, especially in a world with GDPR, there are all
> kinds of tools that could be built that care deeply about whether data is
> sensitive or not. For example, static analysis/linters that make sure
> sensitive data is treated in a suitably sensitive manner. Without a
> standard way to denote that in the proto, open-source tools in this vein
> become more complicated to use. If open-source tools must provide their own
> custom option (as an example), and then someone wants to use multiple such
> tools on their codebase, they end up having to redundantly define multiple
> options on each sensitive field. (Admittedly, this is a manufactured
> hypothetical.)
>
> Anyhow, where do things stand with this enhancement request? Is there
> anything I can do to stoke the fire and get some attention on it?
>
> ----
> *Josh Humphries*
> [email protected]
>
>
> On Wed, Aug 22, 2018 at 10:01 AM Zellyn <[email protected]> wrote:
>
>> Apologies for the long delay, but I got radically reassigned at work, so
>> I haven't had much time to work on this. But it keeps niggling at me,
>> because I hate our internal protobuf forks so much.
>>
>> Here is the proposal: Proto Proposal: a “sensitive” field option
>> <https://docs.google.com/document/d/18WI8zN7rk6R0jXW1iC8LDYz7LJ0OrUOTKMGD7nyEnFs>.
>>
>> Zellyn
>>
>> On Wednesday, February 22, 2017 at 12:10:14 PM UTC-5, Adam Cozzette wrote:
>>>
>>> Hi Zellyn, this sounds like a reasonable idea. As the next step could
>>> you perhaps write up a short proposal with more details on what exactly it
>>> would mean for a field to be redacted? To me it seems like the important
>>> thing would be to make sure it's clear how redacted fields are supposed to
>>> behave in each situation (i.e. when they should be dropped or not), so
>>> that there's no uncertainty about when they're dropped and when they're
>>> preserved. (For example, we might say that they're never shown when a proto
>>> is implicitly stringified but maybe preserved in all other situations?) We
>>> might also need to be careful to get this right for all languages early;
>>> even if there's some language where we don't care about redaction for now,
>>> it will be hard to change later without making a breaking change.
>>>
>>> On Thu, Feb 16, 2017 at 1:45 PM, zellyn via Protocol Buffers <
>>> [email protected]> wrote:
>>>
>>>> There are many ways that protocol buffers might be stringified into
>>>> logs, accidentally or on purpose, printed in stack traces, etc. The
>>>> built-in behavior stringifies the entire protobuf recursively, including
>>>> all field data.
>>>>
>>>> At Square, we deal with payments, and often have data of varying
>>>> sensitivity in protobuf fields, which we'd like to be elided from
>>>> stringified output.
>>>>
>>>> We use an internal fork of protoc to handle a custom field option,
>>>> "redacted", and have also patched the stringification code to print
>>>> "[REDACTED]" for those fields. We do the same in Go, and in the C
>>>> implementation (for Ruby).
>>>>
>>>> Last year, we chatted with the protobuf team, and they were sympathetic
>>>> to our use case (in fact, they mentioned that the part of Google that deals
>>>> with payments has something similar internally: I think that's where the
>>>> "sensitive" name came from). I'd like to get that discussion rolling again.
>>>>
>>>> We'd like to see one of the following happen (in decreasing order of
>>>> awesomeness for us):
>>>>
>>>>    - upstreaming of the "redacted" field option, and modification of
>>>>    the runtimes to elide redacted fields when stringifying
>>>>    - introduction of generic interception points to selectively
>>>>    override default stringification behavior in Java, Go, and Ruby (at
>>>>    least).
>>>>    - addition of a "SerializeToString" or equivalent method, and
>>>>    removal of default full-stringification behavior of the toString (Java),
>>>>    String (Go), etc. - that way you only serialize on purpose
>>>>       - many tests rely on string comparison, even though nobody is
>>>>       supposed to rely on it being stable - perhaps the default behavior
>>>>       could compute a hash?
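
The second option in the list above, a generic interception point, might be shaped something like this sketch. It is illustrative only: `register_stringify_hook` and `stringify` are hypothetical names, and no protobuf runtime exposes this exact API.

```python
from typing import Callable, Dict, Optional

# A hook receives (field_name, value) and returns replacement text,
# or None to fall through to the default rendering.
Hook = Callable[[str, object], Optional[str]]

_hooks: Dict[str, Hook] = {}  # keyed by message type name


def register_stringify_hook(type_name: str, hook: Hook) -> None:
    """Install a per-message-type override for debug stringification."""
    _hooks[type_name] = hook


def stringify(type_name: str, fields: Dict[str, object]) -> str:
    """Render fields as text, consulting the registered hook (if any) per field."""
    hook = _hooks.get(type_name)
    parts = []
    for name, value in fields.items():
        text = hook(name, value) if hook else None
        parts.append(f"{name}: {text if text is not None else repr(value)}")
    return " ".join(parts)


# Application code, not the runtime, decides what counts as sensitive.
register_stringify_hook(
    "Card", lambda name, value: "[REDACTED]" if name == "pan" else None
)

print(stringify("Card", {"pan": "4111111111111111", "brand": "VISA"}))
# pan: [REDACTED] brand: 'VISA'
```

Compared with an upstream field option, a hook keeps the runtime change small, but the knowledge of which fields are sensitive moves out of the .proto file.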
>>>>
>>>> Josh Humphries (who now works at FullStory) created a proposal
>>>> <https://github.com/google/protobuf/issues/1160> a while back, but it
>>>> didn't go anywhere. I reached out to the protobuf team, and Damien Neil
>>>> suggested that this group was the appropriate place to propose such
>>>> changes.
>>>>
>>>> Bikeshed away!
>>>>
>>>> Zellyn
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "Protocol Buffers" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> To post to this group, send email to [email protected].
>>>> Visit this group at https://groups.google.com/group/protobuf.
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>
>>
>

