Re: [git-users] Git tags and information as file header

2014-05-16 Thread Andy Hardy
On 15/05/2014 23:39, Magnus Therning wrote:
 On Thu, May 15, 2014 at 11:21:25PM +0100, Andy Hardy wrote:
 On 15/05/2014 22:48, Magnus Therning wrote:
 
 - version: doesn't make sense in git, would it be the hash? 
 what does that tell me?
 
 I find an identifier useful when investigating problems and 
 wanting to confirm what files are involved.
 
 What kind of identifier?

Something that I can then use to identify the same file in the
repository and check its log history, etc.

 What kind of problems and where?

 Is the file in a clone or in a deployment, i.e. you can't rely on a
 VCS to tell you what you have?

It's in the deployment, generally of on-site fixes. Deployment
involves a largeish number of files. Files are not intrinsically
dependent upon one another but may interact.

For a 'release' we'd send the customer a complete set of files, but
often a fix only requires one or two files to be modified which we ask
the customer to install. When dealing with a later fault, we'd like to
have a method of knowing which files the customer did actually install!

-- 
Andy

-- 
You received this message because you are subscribed to the Google Groups Git 
for human beings group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to git-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [git-users] Git tags and information as file header

2014-05-16 Thread Thomas Ferris Nicolaisen
On Friday, May 16, 2014 12:52:44 AM UTC+2, Magnus Therning wrote:

 I must say though that your example might constitute another argument 
 *against* having keywords, that sort of deployment process quickly 
 leads to a mess ;) 


Yeah, the fallacy was that it was easier to deploy one file than to 
deploy the whole new application (an understanding backed by an elaborate 
deployment process involving a lot of manual work). But in the long run it 
was a right mess.



Re: [git-users] Git tags and information as file header

2014-05-16 Thread Magnus Therning
On Fri, May 16, 2014 at 8:22 AM, Andy Hardy a...@hardyfamily.org.uk wrote:
 On 15/05/2014 23:39, Magnus Therning wrote:
 On Thu, May 15, 2014 at 11:21:25PM +0100, Andy Hardy wrote:
 On 15/05/2014 22:48, Magnus Therning wrote:

 - version: doesn't make sense in git, would it be the hash?
 what does that tell me?

 I find an identifier useful when investigating problems and
 wanting to confirm what files are involved.

 What kind of identifier?

 Something that I can then use to identify the same file in the
 repository and check its log history, etc.

 What kind of problems and where?

 Is the file in a clone or in a deployment, i.e. you can't rely on a
 VCS to tell you what you have?

 It's in the deployment, generally of on-site fixes. Deployment
 involves a largeish number of files. Files are not intrinsically
 dependent upon one another but may interact.

 For a 'release' we'd send the customer a complete set of files, but
 often a fix only requires one or two files to be modified which we ask
 the customer to install. When dealing with a later fault, we'd like to
 have a method of knowing which files the customer did actually install!

So, this is just the same situation as Thomas was talking about.  It
boils down to, only slightly simplified, relying on VCS keyword
substitution in order to avoid having to create proper releases which
can be recorded upon install.  It arguably shouldn't be done in the
first place, but I can see that some situations call for it.  It can
however be done just about as easily in other ways, e.g. by comparing
a hash or a diff of the installed file against the repository.
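
As a rough illustration of the hash route: Git identifies file contents by a blob ID, which is the SHA-1 (in a default repository) of the header "blob <size>\0" followed by the raw bytes. The sketch below recomputes that ID for a deployed file; the helper names are made up for this example, and it assumes the deployed bytes were not altered by line-ending conversion or other filters.

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """Return the blob ID Git would assign to these file contents.

    Git hashes the header b"blob <size>\\0" followed by the raw bytes.
    """
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def git_blob_id_of_file(path: str) -> str:
    """Hypothetical helper: compute the blob ID of a deployed file on disk."""
    with open(path, "rb") as fh:
        return git_blob_id(fh.read())
```

The result can be compared with the IDs listed by `git ls-tree -r <commit>` in the repository, or with `git hash-object <file>` on a checked-out copy, to find which committed version the customer actually has.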

/M

-- 
Magnus Therning  OpenPGP: 0xAB4DFBA4
email: mag...@therning.org   jabber: mag...@therning.org
twitter: magthe   http://therning.org/magnus



Re: [git-users] Git tags and information as file header

2014-05-16 Thread Thomas Ferris Nicolaisen
On Friday, May 16, 2014 8:22:30 AM UTC+2, Andy wrote:

 For a 'release' we'd send the customer a complete set of files, but 
 often a fix only requires one or two files to be modified which we ask 
 the customer to install. When dealing with a later fault, we'd like to 
 have a method of knowing which files the customer did actually install! 


Why don't you just write a small script that fills in such a header on 
demand? If the header is missing, you can assume that the file is from the 
last proper release.
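
A minimal sketch of such a script, under assumptions: the header format ("# Deployed-Id: ...") and the function names are invented for illustration, the file is assumed to accept "#" comments on its first line, and the identifier could for instance be the output of `git describe` for the commit the fix was built from.

```python
HEADER_PREFIX = "# Deployed-Id: "  # hypothetical header format, not a Git feature


def stamp_header(text: str, identifier: str) -> str:
    """Return text with a single identifying header as its first line.

    An existing header is replaced, so stamping twice does not stack headers.
    """
    lines = text.splitlines(keepends=True)
    if lines and lines[0].startswith(HEADER_PREFIX):
        lines = lines[1:]
    return HEADER_PREFIX + identifier + "\n" + "".join(lines)


def read_header(text: str):
    """Return the stamped identifier, or None if the file has no header."""
    first_line = text.split("\n", 1)[0]
    if first_line.startswith(HEADER_PREFIX):
        return first_line[len(HEADER_PREFIX):].strip()
    return None
```

A file for which `read_header` returns None would then be taken to come from the last proper release, as suggested above. Files that need a fixed first line (a shebang, for example) would need the header placed elsewhere.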



[git-users] I don't know how to from rej file to adjust the source

2014-05-16 Thread lei yang
#cat recipes-devtools/python/python-heat_git.bb.rej
diff a/meta-openstack/recipes-devtools/python/python-heat_git.bb b/meta-openstack/recipes-devtools/python/python-heat_git.bb (rejected hunks)
@@ -10,6 +10,9 @@ SRCNAME = heat
 SRC_URI = git://github.com/openstack/${SRCNAME}.git;branch=stable/havana \
    file://heat.conf \
    file://heat.init \
+   file://autoscaling_example.template \
+   file://one_vm_example.template \
+   file://two_vms_example.template \
 

 SRCREV=ff6901141fbbc0a13604491aaba01a60487d6f6d



#cat recipes-devtools/python/python-heat_git.bb
..
PR = r0
SRCNAME = heat

SRC_URI = git://github.com/openstack/${SRCNAME}.git;branch=stable/havana \
   file://heat.conf \
   file://heat.init \


SRCREV=58de9e6415f5bdabde708c8584b21b59b7e96a88
PV=2013.2.3+git${SRCPV}
S = ${WORKDIR}/git
.



Re: [git-users] I don't know how to from rej file to adjust the source

2014-05-16 Thread Konstantin Khomoutov
On Fri, 16 May 2014 17:21:10 +0800
lei yang yanglei.f...@gmail.com wrote:

 #cat recipes-devtools/python/python-heat_git.bb.rej
 diff a/meta-openstack/recipes-devtools/python/python-heat_git.bb b/meta-openstack/recipes-devtools/python/python-heat_git.bb (rejected hunks)
 @@ -10,6 +10,9 @@ SRCNAME = heat
  SRC_URI = git://github.com/openstack/${SRCNAME}.git;branch=stable/havana \
     file://heat.conf \
     file://heat.init \
 +   file://autoscaling_example.template \
 +   file://one_vm_example.template \
 +   file://two_vms_example.template \
  
 
  SRCREV=ff6901141fbbc0a13604491aaba01a60487d6f6d

It's just a patch file in the so-called unified diff format [1].
To apply it, use the `patch` program.
If you need to apply it to the work tree of a Git repository,
use `git apply` (`git am` might also work).
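
If the hunk keeps being rejected (possibly because its context lines, such as the SRCREV value, no longer match the current file), one fallback is to merge the additions by hand. The sketch below only lists the lines a rejected hunk wants to add; the function name is made up for this example and it assumes the .rej file holds plain unified-diff hunks for a single file.

```python
def added_lines(rej_text: str) -> list:
    """Return the lines a unified-diff hunk wants to add, without the '+'."""
    added = []
    in_hunk = False
    for line in rej_text.splitlines():
        if line.startswith("@@"):
            in_hunk = True  # hunk header, e.g. "@@ -10,6 +10,9 @@"
        elif in_hunk and line.startswith("+"):
            added.append(line[1:])
    return added
```

For the file shown above this would return the three `file://..._example.template \` lines, which can then be inserted into SRC_URI in the current recipe manually.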

1. http://en.wikipedia.org/wiki/Diff#Unified_format



Re: [git-users] worlds slowest git repo- what to do?

2014-05-16 Thread Dale R. Worley
 From: John Fisher fishook2...@gmail.com

 FYI we are archiving compressed Linux disk images for VMs and
 hypervisors.

A core problem is that you've got the worst sort of data for something
like Git.  Your files are huge, and being compressed, any effort to
compress saved files or find duplicate strings between them is totally
wasted.  Your workload is anti-optimized for any source management
system.

Here's something that might work (ugh):  Use Subversion, which I seem
to recall will do delta encoding between versions of a single file
but not *between* files.  Have a directory (or directories) which
contain all the big files.  Whenever you change a big file, delete the
old version and create the new version *under a different name*, so
Subversion doesn't try to delta-encode the new version relative to the
old one.  (Don't use svn mv for this, just a filesystem move, so that
Subversion doesn't try to connect different versions of a binary.)
Now, for your real files, keep a directory tree like normal, but for
each of the big files, use a symbolic link (under the desired name)
that points to the actual file (off in the storage directory).  I
*think* that will prevent Subversion from trying to do anything clever
with big, low-redundancy binary files.

You could probably write a script that would go through the structure
and groom it into the proper shape to be committed:  move any big
files in the real tree into the storage directory, replace them with
links, delete any files in the storage directory that are no longer
linked to, etc.  The trick would be having a way to generate the name in the
storage directory in a way that is uniquely determined by the file
contents (and possibly modification date).  You don't want to hash the
whole file, that would be too slow...
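
One possible way to get such a name cheaply, sketched under assumptions: hash the file size, the modification time and only the first and last chunk of the file. The function name and the 1 MiB sample size are arbitrary choices for this example, and two files that differ only in the unsampled middle while sharing size and mtime would collide.

```python
import hashlib
import os


def storage_name(path: str, sample_size: int = 1 << 20) -> str:
    """Derive a storage name from file size, mtime and two sampled chunks.

    Only the first and last `sample_size` bytes are read, so the cost does
    not grow with the size of the file.
    """
    st = os.stat(path)
    digest = hashlib.sha256()
    digest.update(f"{st.st_size}:{st.st_mtime_ns}:".encode("ascii"))
    with open(path, "rb") as fh:
        digest.update(fh.read(sample_size))
        if st.st_size > sample_size:
            # Start the tail sample no earlier than the end of the head sample.
            fh.seek(max(st.st_size - sample_size, sample_size))
            digest.update(fh.read(sample_size))
    return digest.hexdigest()
```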

Dale



Re: [git-users] Bare repository

2014-05-16 Thread Dale R. Worley
 From: Alain raf.n...@gmail.com
 
 But what I do not understand is this: if only one person manages the bare
 repository (setting access rights to this repository) on a local disk, how
 can he prevent other people from modifying, staging and pushing changes to it?
 Moreover, the bare repository is not for only 1 file... :(
 
 I would like to understand the mechanism. How can people (having a
 non-bare clone) pull from and push to it, and how does the person managing
 the bare repository control which pushes and pulls are accepted?
 I'm lost with it.

Fundamentally, access to a repository is controlled by the file-access
mechanisms of the system that contains it.  Another user on the same
system can push changes to it if the user has permission to write the
files in the repository structure.  A user on another system can
remotely push changes if the remote-access program (usually sshd)
allows the remote user to run a program on the system that has
permission to write the files in the repository structure.
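
To make the file-permission point concrete, here is a rough sketch that asks whether the current user could write into the directories of a bare repository on the local disk. The function name is invented, the layout (`objects` and `refs` directly under the repository path) is an assumption about a standard bare repository, and this only mirrors the permission check; it is not how Git itself decides.

```python
import os


def can_push_locally(repo_path: str) -> bool:
    """Rough check: can the current user write where a push stores data?

    Assumes a bare repository with 'objects' and 'refs' directories
    directly under repo_path.
    """
    for sub in ("objects", "refs"):
        directory = os.path.join(repo_path, sub)
        if not os.path.isdir(directory):
            return False
        if not os.access(directory, os.W_OK | os.X_OK):
            return False
    return True
```

A user for whom this returns False on the shared repository can still clone and pull if they have read access, but a push from them would fail at the filesystem level.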

Dale



Re: [git-users] worlds slowest git repo- what to do?

2014-05-16 Thread Konstantin Khomoutov
On Fri, 16 May 2014 10:43:20 -0400
wor...@alum.mit.edu (Dale R. Worley) wrote:

Sorry for replying to your message, not the OP's.

  FYI we are archiving compressed Linux disk images for VMs and
  hypervisors.
 
 A core problem is that you've got the worst sort of data for something
 like Git.  Your files are huge, and being compressed, any effort to
 compress saved files or find duplicate strings between them is totally
 wasted.  Your workload is anti-optimized for any source management
 system.
 
 Here's something that might work (ugh):  Use Subversion, which I seem
 to recall will do delta encoding between versions of a single file
 but not *between* files.
[...]

Mercurial does this as well.  On the other hand, IIRC, after N
revisions it does something like a full checkpoint to make
reconstructing past revisions faster.

I think the OP is better off using something like rsnapshot [1] or
rdiff-backup [2] for his task, or `rsync -H --no-inc-recursive` +
`cp -alR` and a bit of shell scripting.  These tools provide file-level
(in fact, inode-level) deduplication by hardlinking unchanged files.
Dirvish and unison come to mind as well (I'm too lazy to google the links
to their sites, sorry).
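
The hardlinking idea can be sketched as follows. This is an illustration only, not how any of the tools above are implemented: the function names are made up, and "unchanged" is assumed to mean that size and modification time match the copy in the previous snapshot.

```python
import os
import shutil


def _unchanged(a: str, b: str) -> bool:
    """Treat two files as identical when size and mtime both match."""
    sa, sb = os.stat(a), os.stat(b)
    return sa.st_size == sb.st_size and sa.st_mtime_ns == sb.st_mtime_ns


def snapshot(source: str, target: str, previous: str = None) -> None:
    """Copy `source` to `target`, hardlinking files unchanged since `previous`."""
    for root, _dirs, files in os.walk(source):
        rel = os.path.relpath(root, source)
        dest_dir = os.path.normpath(os.path.join(target, rel))
        os.makedirs(dest_dir, exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.join(dest_dir, name)
            old = os.path.join(previous, rel, name) if previous else None
            if old and os.path.isfile(old) and _unchanged(src, old):
                os.link(old, dst)       # same inode as before: no extra space
            else:
                shutil.copy2(src, dst)  # copy2 keeps the mtime for the next run
```

Each snapshot directory then looks like a full copy, while an 8G image that did not change between releases occupies disk space only once. Hard links require all snapshots to live on the same filesystem.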

Another approach is to use a backup tool which performs block-level
deduplication.  For this, I can name obnam [3] and ZFS (snapshotting
with block-level dedup turned on).
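
As a toy illustration of block-level deduplication: data is cut into blocks, each block is stored under the hash of its contents, and a block that was already stored is not stored again. The class below is a made-up example with fixed-size blocks held in memory; the tools named above choose block boundaries and storage formats in their own ways.

```python
import hashlib


class BlockStore:
    """Toy content-addressed store: identical blocks are kept only once."""

    def __init__(self, block_size: int = 4096):
        self.block_size = block_size
        self.blocks = {}  # SHA-256 hex digest -> block bytes

    def put(self, data: bytes) -> list:
        """Store data and return the list of block digests describing it."""
        recipe = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            digest = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(digest, block)
            recipe.append(digest)
        return recipe

    def get(self, recipe: list) -> bytes:
        """Rebuild the original data from its block digests."""
        return b"".join(self.blocks[d] for d in recipe)
```

Two versions of a file that share most of their blocks then cost little more than one version. Whether compressed disk images share many blocks between versions is a separate question that would have to be measured.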

Also not sure if this has been mentioned by other folks but there
exist bup [4] and boar [5] which build on paradigms of VCS but are
tailored to the needs of working with big binary files.  This [6] is
particularly insightful.

1. http://www.rsnapshot.org/
2. http://www.nongnu.org/rdiff-backup/
3. http://obnam.org/
4. https://github.com/bup/bup
5. https://code.google.com/p/boar
6. https://github.com/bup/bup/blob/master/DESIGN



Re: [git-users] worlds slowest git repo- what to do?

2014-05-16 Thread Konstantin Khomoutov
On Fri, 16 May 2014 19:53:35 +0400
Konstantin Khomoutov flatw...@users.sourceforge.net wrote:

[...]
 Another approach is to use a backup tool which performs block-level
 deduplication.  For this, I can name obnam [3] and ZFS (snapshotting
 with block-level dedup turned on).

Attic [1] does block-level dedup as well.

1. https://attic-backup.org/



Re: [git-users] Git tags and information as file header

2014-05-16 Thread Alain


On Friday, May 16, 2014 10:45:10 AM UTC+2, Thomas Ferris Nicolaisen wrote:

 On Friday, May 16, 2014 8:22:30 AM UTC+2, Andy wrote:

 For a 'release' we'd send the customer a complete set of files, but 
 often a fix only requires one or two files to be modified which we ask 
 the customer to install. When dealing with a later fault, we'd like to 
 have a method of knowing which files the customer did actually install! 


 Why don't you just write a small script that fills in such a header on 
 demand? If the header is missing, you can assume that the file is from the 
 last proper release.


I was thinking about the support phase. If you sell some web components,
during the support phase your customers could forget which version of your
component they have installed.
Checking the headers of some files could help to find out which version of
the component they have installed.

Moreover, some components could use different plugins that have their own
versions (not necessarily the same as the component's). But everything is
bundled together (component, plugins, ...) in a zip file.
I was thinking about this typical example, in fact.



Re: [git-users] worlds slowest git repo- what to do?

2014-05-16 Thread John Fisher

On 05/16/2014 03:13 AM, Duy Nguyen wrote:
 On Fri, May 16, 2014 at 2:06 AM, Philip Oakley philipoak...@iee.org wrote:
 From: John Fisher fishook2...@gmail.com
 I assert based on one piece of evidence ( a post from a facebook dev) that
 I now have the worlds biggest and slowest git
 repository, and I am not a happy guy. I used to have the worlds biggest
 CVS repository, but CVS can't handle multi-G
 sized files. So I moved the repo to git, because we are using that for our
 new projects.

 goal:
 keep 150 G of files (mostly binary) from tiny sized to over 8G in a
 version-control system.
 I think your best bet so far is git-annex 

Good, I am looking at that.

 (or maybe bup) for dealing
 with huge files. I plan on resurrecting Junio's split-blob series to
 make core git handle huge files better, but there's no eta on that.
 The problem here is about file size, not the number of files, or
 history depth, right?

When things here calm down, I could easily test the repo without the giant
files, leaving 99% of the files in the repo.  There is hardly any history
depth because these are releases, version-controlled by directory name.  As
has been suggested, I could be forced to abandon version control, even to
the point of just using rsync.  But I've been doing this with CVS for 10
years now and I hate to change or in any way move away from KISS.  Moving
it to Git may not have been one of my better ideas...


 Probably known issues. But some elaboration would be nice (e.g. what 
 operation is slow, how slow, some more detail
 characteristics of the repo..) in case new problems pop up. 

So far I have done add, commit, status and clone: commit and status are
slow, add seems to depend on the files involved, and clone seems to run at
network speed.  I can provide metrics later (see above).  Email me offline
with what you want.

John
