Re: git bundle format [OT]

2012-11-26 Thread Stephen Bash
- Original Message -
> From: "Jason J CTR Pyeron (US)" 
> Sent: Monday, November 26, 2012 4:06:59 PM
> Subject: RE: git bundle format [OT]
> 
> > First, a shot out of left field: how about a patch based workflow?
> > (similar to the mailing list, just replace email with sneakernet)
> > Patches are plain text and simple to review (preferable to an
> > "opaque" binary format?).
> 
> This is only to address accidental development on a high side.
> Using this or any process should come with shame or punishment for
> wasting resources/time by not developing on a low side to start
> with.

Ah, if only more of those I (previously) worked with thought as you do :)

> But, accepting reality, there will be times where code and its
> metadata (commit logs, etc.) will be created on a high side and
> should be brought back to the low side.

Using git format-patch and git am it's possible to retain the commit messages 
(and other associated metadata).  But again, I'm not the expert on this :)  
I've made it work a few times to test patches from this list, but so far I've 
avoided serious integration into the mailing list workflow.
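
To illustrate the point about metadata, here is a minimal sketch (not part of the original message) of how the commit message and author information travel as plain text in a format-patch file. The sample patch, its hash, author, and file name are all made up for illustration, and `patch_metadata` is a hypothetical helper. Note that `git am` creates a new commit when it applies a patch, with committer fields set at apply time, so the resulting commit hash generally differs from the one on the originating side.

```python
from email import message_from_string

# Hypothetical example of what `git format-patch` writes for one commit.
# The hash, author, date, and diff are invented for illustration.
SAMPLE_PATCH = """\
From 1111111111111111111111111111111111111111 Mon Sep 17 00:00:00 2001
From: A Developer <dev@example.com>
Date: Mon, 26 Nov 2012 16:00:00 -0500
Subject: [PATCH] Fix off-by-one in frame counter

The loop skipped the last frame.
---
 counter.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/counter.c b/counter.c
index 2222222..3333333 100644
--- a/counter.c
+++ b/counter.c
@@ -1 +1 @@
-for (i = 0; i < n - 1; i++)
+for (i = 0; i < n; i++)
"""


def patch_metadata(text):
    """Split a format-patch file into reviewable metadata and the diff."""
    msg = message_from_string(text)
    body = msg.get_payload()
    # A line containing only "---" separates the commit message from the diff.
    message, _, diff = body.partition("\n---\n")
    return {
        # The mbox "From <hash> <date>" line names the original commit.
        "original_commit": msg.get_unixfrom().split()[1],
        "author": msg["From"],
        "author_date": msg["Date"],
        "subject": msg["Subject"],
        "message_body": message.strip(),
        "diff": diff,
    }


if __name__ == "__main__":
    for key, value in patch_metadata(SAMPLE_PATCH).items():
        print(f"{key}: {value!r}")
```

Everything a reviewer would need to read is ordinary text here, which is the appeal of the patch route; the cost is that the original commit IDs are not reproduced on the receiving side.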

> >   2) Do the diffs applied to public repo contain any sensitive
> >   data?
> 
> That is a great question. Can a change to the code imply or
> demonstrate a secret even when neither the original nor the
> result is secret? I think the answer is yes.

In actual fact I was thinking about the simple case where the result included 
an "Eek! 3.1415926 cannot show up in this code!" (sometimes that's easier to 
see in a diff than a full text blob).  Obviously the first line of defense 
should catch such mistakes.  But yes, your point is also a good one.  I'd be 
hard pressed to argue that a particular series of commits leaks information on 
its own, but it can certainly corroborate other available information.

> > Question 2 is relatively straightforward and led me to the patch
> > idea.  I would:
> >   - Bundle the public repository
> >   - Init a new repo in the secure space from the public bundle
> >   - Fetch from the to-be-sanitized bundle into the new repo
> >   - Examine commits (diffs) introduced by branches in the to-be-
> >   sanitized bundle
> >   - Perhaps get a list of all the objects in the to-be-sanitized
> >   bundle and do a git-cat-file on each of them (if the bundle is
> >   assembled correctly it shouldn't have any unreachable objects...).
> >   This step may be extraneous after the previous.
> 
> Here we would be missing the metadata that goes along with the
> commit. Especially the SHA sums.

Ah sorry, I guess I wasn't complete.  Once that process has been done on the 
high side one has to go back to question 1 and see if it's safe to move the 
bundle out to repeat the process on the low side. 
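
The review steps quoted above could be sketched roughly as the following sequence of git invocations. This is an illustration under assumptions, not a vetted procedure: the function names, the `candidate` ref namespace, and the `review` directory are hypothetical, and the public bundle is assumed to have been made beforehand on the low side with `git bundle create public.bundle --all`.

```python
import subprocess


def review_commands(public_bundle, candidate_bundle, workdir="review"):
    """Return the git invocations for the review, in order, as argv lists.

    Bundle paths should be absolute, because `git -C workdir` resolves
    relative paths from inside workdir.
    """
    git = ["git", "-C", workdir]
    new_only = ["--remotes=candidate", "--not", "--remotes=origin"]
    return [
        # Init a new repo in the secure space from the public bundle.
        ["git", "clone", public_bundle, workdir],
        # Check that the candidate bundle's prerequisites are present.
        git + ["bundle", "verify", candidate_bundle],
        # Fetch the candidate's branches into their own namespace.
        git + ["fetch", candidate_bundle,
               "refs/heads/*:refs/remotes/candidate/*"],
        # Show commits (with diffs) that the public history does not have.
        git + ["log", "-p"] + new_only,
        # List objects reachable from those new commits, for cat-file review.
        git + ["rev-list", "--objects"] + new_only,
    ]


def run_review(public_bundle, candidate_bundle, workdir="review"):
    """Run each step, stopping at the first command that fails."""
    for argv in review_commands(public_bundle, candidate_bundle, workdir):
        subprocess.run(argv, check=True)
```

Each ID printed by the last step could then be passed to `git cat-file -p` for the per-object inspection mentioned in the list.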
 
Stephen


RE: git bundle format [OT]

2012-11-26 Thread Pyeron, Jason J CTR (US)
> -Original Message-
> From: Stephen Bash
> Sent: Monday, November 26, 2012 3:56 PM
> 
> - Original Message -
> > From: "Jason J CTR Pyeron (US)" 
> > Sent: Monday, November 26, 2012 2:24:54 PM
> > Subject: git bundle format
> >
> > I am facing a situation where I would like to use git bundle but at
> > the same time inspect the contents to prevent a spillage[1].
> 
> As someone who faced a similar situation in a previous life, I'll offer
> my $0.02, but I'm certainly not the technical expert here.

Kind of what I am looking for as a side effect.

> 
> > Given we have a public repository which was cloned on to a secret
> > development repository. Now the developers do some work which should
> > not be sensitive in any way and commit and push it to the secret
> > repository.
> >
> > Now they want to release it out to the public. The current process is
> > to review the text files to ensure that there is no "secret" sauce
> > in there and then approve its release. This current process ignores
> > the change tracking and all non-content is lost.
> >
> > In this situation we should assume that the bundle does not have any
> > content which is already in the public repository, that is it has
> > the minimum data to make it pass a git bundle verify from the public
> > repository's point of view. We would then take the bundle and pipe
> > it through the "git-bundle2text" program which would result in a
> > "human" inspectable format as opposed to the packed format[2]. The
> > security reviewer would then see all the information being released
> > and with the help of the public repository see how the data changes
> > the repository.
> >
> > Am I barking up the right tree?
> 
> First, a shot out of left field: how about a patch based workflow?
> (similar to the mailing list, just replace email with sneakernet)
> Patches are plain text and simple to review (preferable to an "opaque"
> binary format?).

This is only to address accidental development on a high side. Using this 
or any process should come with shame or punishment for wasting resources/time 
by not developing on a low side to start with. But, accepting reality, there will 
be times where code and its metadata (commit logs, etc.) will be created on a 
high side and should be brought back to the low side.


> Second, thinking about your proposed bundle-based workflow I have two
> questions I'd have to answer to be comfortable with the solution:
> 
>   1) Does the binary bundle contain any sensitive information?

Potentially, hence the review. If the reviewer cannot prove that the data he is 
looking at is clean, then the presumption is yes.

>   2) Do the diffs applied to public repo contain any sensitive data?

That is a great question. Can a change to the code imply or demonstrate a 
secret even when neither the original nor the result is secret? I think the 
answer is yes.

> 
> Question 1 seems tricky to someone who knows *nothing* about the bundle
> format (e.g. me).  Maybe some form of bundle2text can be vetted enough
> that everyone involved believes that there is no other information
> traveling with the bundle (if so, you're golden).  Here I have to trust
> other experts.  On the flip side, even if the bundle itself is polluted
> (or considered to be lacking proof to the contrary), if (2) is
> considered safe, the patching of the public repo could potentially be
> done on a sacrificial hard drive before pushing.

The logistics are well established, and this is not the place to go into 
them. But the above is the crux of what I am trying to get at.
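
As a starting point for a "bundle2text" style tool, the plain-text part of a bundle can be read directly. The sketch below (not from the original thread) assumes the version 2 bundle layout: a `# v2 git bundle` signature line, optional prerequisite lines beginning with `-`, one line per ref, a blank line, and then the pack data. `read_bundle_header` is a hypothetical helper name, and the pack that follows the header is compressed binary, so it would still need git's own tooling to list and decode.

```python
def read_bundle_header(stream):
    """Parse the text header of a v2 git bundle opened in binary mode.

    Returns (prerequisites, refs, pack_offset), where pack_offset is the
    byte position at which the pack data begins.
    """
    if stream.readline() != b"# v2 git bundle\n":
        raise ValueError("not a v2 git bundle")

    prerequisites, refs = [], []
    while True:
        line = stream.readline()
        if line in (b"\n", b""):  # blank line (or EOF) ends the header
            break
        text = line.rstrip(b"\n").decode("utf-8", "replace")
        if text.startswith("-"):
            # Commit the receiver must already have, plus a free-form comment.
            obj_id, _, comment = text[1:].partition(" ")
            prerequisites.append((obj_id, comment))
        else:
            # Ref tip carried by the bundle.
            obj_id, _, refname = text.partition(" ")
            refs.append((obj_id, refname))

    return prerequisites, refs, stream.tell()
```

With a real file this would be called as `read_bundle_header(open(path, "rb"))`. The header only says which refs and prerequisites the bundle claims; the reviewer would still have to account for every object inside the pack.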
 
> 
> Question 2 is relatively straightforward and led me to the patch
> idea.  I would:
>   - Bundle the public repository
>   - Init a new repo in the secure space from the public bundle
>   - Fetch from the to-be-sanitized bundle into the new repo
>   - Examine commits (diffs) introduced by branches in the to-be-
> sanitized bundle
>   - Perhaps get a list of all the objects in the to-be-sanitized bundle
> and do a git-cat-file on each of them (if the bundle is assembled
> correctly it shouldn't have any unreachable objects...).  This step may
> be extraneous after the previous.

Here we would be missing the metadata that goes along with the commit. 
Especially the SHA sums.

Thanks.

-Jason

