> -Original Message-
> From: Stephen Bash
> Sent: Monday, November 26, 2012 3:56 PM
>
> - Original Message -
> > From: "Jason J CTR Pyeron (US)"
> > Sent: Monday, November 26, 2012 2:24:54 PM
> > Subject: git bundle format
> >
> > I am facing a situation where I would like to use git bundle but at
> > the same time inspect the contents to prevent a spillage[1].
>
> As someone who faced a similar situation in a previous life, I'll offer
> my $0.02, but I'm certainly not the technical expert here.
Kind of what I am looking for as a side effect.
>
> > Given we have a public repository which was cloned on to a secret
> > development repository. Now the developers do some work which should
> > not be sensitive in any way and commit and push it to the secret
> > repository.
> >
> > Now they want to release it out to the public. The current process is
> > to review the text files to ensure that there is no "secret" sauce
> > in there and then approve its release. This current process ignores
> > the change tracking and all non-content is lost.
> >
> > In this situation we should assume that the bundle does not have any
> > content which is already in the public repository, that is it has
> > the minimum data to make it pass a git bundle verify from the public
> > repository's point of view. We would then take the bundle and pipe
> > it through the "git-bundle2text" program which would result in a
> > "human" inspectable format as opposed to the packed format[2]. The
> > security reviewer would then see all the information being released
> > and with the help of the public repository see how the data changes
> > the repository.
> >
> > Am I barking up the right tree?
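As a rough illustration of what the first stage of such a "git-bundle2text"
could look like, here is a minimal sketch that reads only the text header of
a bundle. It assumes the version 2 bundle layout (a "# v2 git bundle"
signature line, "-<sha> <comment>" prerequisite lines, "<sha> <refname>"
reference lines, then a blank line followed by the pack data). The function
name read_bundle_header is made up for this example. It does not decode the
pack itself, whose objects are compressed and possibly deltified, so this is
not a complete reviewer's tool.

```python
import io


def read_bundle_header(stream):
    """Parse the text header of a v2 git bundle from a binary stream.

    Returns (prerequisites, refs, pack_offset), where prerequisites and
    refs are lists of (sha, text) tuples and pack_offset is the byte
    offset at which the pack data starts.
    """
    signature = stream.readline()
    if signature != b"# v2 git bundle\n":
        raise ValueError("not a v2 git bundle: %r" % signature)

    prerequisites = []
    refs = []
    while True:
        line = stream.readline()
        if line in (b"\n", b""):
            # A blank line ends the header (b"" means unexpected EOF).
            break
        text = line.rstrip(b"\n").decode("utf-8", "replace")
        if text.startswith("-"):
            # Prerequisite: a commit the receiving repository must have.
            sha, _, comment = text[1:].partition(" ")
            prerequisites.append((sha, comment))
        else:
            # Reference: a tip carried by this bundle.
            sha, _, name = text.partition(" ")
            refs.append((sha, name))
    return prerequisites, refs, stream.tell()


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as fh:
        prereqs, refs, offset = read_bundle_header(fh)
    for sha, comment in prereqs:
        print("requires", sha, comment)
    for sha, name in refs:
        print("carries ", sha, name)
    print("pack data starts at byte", offset)
```

Everything after the reported offset is pack data, so a reviewer would still
need git itself (or an equally trusted unpacker) to turn the objects into
readable text.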
>
> First, a shot out of left field: how about a patch based workflow?
> (similar to the mailing list, just replace email with sneakernet)
> Patches are plain text and simple to review (preferable to an "opaque"
> binary format?).
This is only meant to address accidental development on a high side. Using
this or any similar process should come with shame or punishment for wasting
resources/time by not developing on a low side to start with. But accepting
reality, there will be times when code and its metadata (commit logs, etc.)
are created on a high side and need to be brought back to the low side.
> Second, thinking about your proposed bundle-based workflow I have two
> questions I'd have to answer to be comfortable with the solution:
>
> 1) Does the binary bundle contain any sensitive information?
Potentially, hence the review. If the reviewer cannot prove that the data he
is looking at is clean, then the presumption is yes.
> 2) Do the diffs applied to public repo contain any sensitive data?
That is a great question. Can a change imply or demonstrate the secret even
when neither the original code nor the result is secret? I think the answer
is yes.
>
> Question 1 seems tricky to someone who knows *nothing* about the bundle
> format (e.g. me). Maybe some form of bundle2text can be vetted enough
> that everyone involved believes that there is no other information
> traveling with the bundle (if so, you're golden). Here I have to trust
> other experts. On the flip side, even if the bundle itself is polluted
> (or considered to be lacking proof to the contrary), if (2) is
> considered safe, the patching of the public repo could potentially be
> done on a sacrificial hard drive before pushing.
The logistics are well established, and here and now is not the place to go
into that. But the above is the crux of what I am trying to get at.
>
> Question 2 is relatively straightforward and led me to the patch
> idea. I would:
> - Bundle the public repository
> - Init a new repo in the secure space from the public bundle
> - Fetch from the to-be-sanitized bundle into the new repo
> - Examine commits (diffs) introduced by branches in the
> to-be-sanitized bundle
> - Perhaps get a list of all the objects in the to-be-sanitized bundle
> and do a git-cat-file on each of them (if the bundle is assembled
> correctly it shouldn't have any unreachable objects...). This step may
> be extraneous after the previous.
Here we would be missing the metadata that goes along with the commit,
especially the SHA sums.
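
For reference, the steps quoted above could be sketched roughly as follows.
This is only an illustration under assumptions, not an established
procedure: the function names, the refs/candidate/ namespace and the file
names are made up for the example, and it assumes the public bundle was
cloned so that its branches appear under refs/remotes/origin/. The log step
uses --pretty=fuller so that author, committer, dates and commit ids are
shown next to each diff.

```python
import os
import subprocess


def review_plan(public_bundle, candidate_bundle, workdir):
    """Return the git commands for the review, as a list of argv lists."""
    public = os.path.abspath(public_bundle)
    candidate = os.path.abspath(candidate_bundle)
    git = ["git", "-C", workdir]
    # Select what the candidate bundle adds on top of the public history.
    new_only = ["--glob=refs/candidate", "--not", "--remotes=origin"]
    return [
        # Init a new repo in the secure space from the public bundle.
        ["git", "clone", public, workdir],
        # Check that the candidate bundle applies to this repository.
        git + ["bundle", "verify", candidate],
        # Fetch the candidate's refs into a separate namespace.
        git + ["fetch", "--no-tags", candidate, "refs/*:refs/candidate/*"],
        # Examine the commits (with diffs and metadata) it introduces.
        git + ["log", "-p", "--pretty=fuller"] + new_only,
        # List every new object id for a per-object inspection.
        git + ["rev-list", "--objects"] + new_only,
    ]


def run_review(public_bundle, candidate_bundle, workdir):
    """Run each step in order and stop at the first failure."""
    for argv in review_plan(public_bundle, candidate_bundle, workdir):
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    for argv in review_plan("public.bundle", "candidate.bundle", "review"):
        print(" ".join(argv))
```

Each object id printed by the last step can then be passed to
"git cat-file -p" for a plain-text view of that commit, tree or blob.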
Thanks.
-Jason