Re: git status core dump with bad sector!

2016-05-04 Thread Eric Chamberland

Hi,

sorry for the delay...

On 22/04/16 01:11 AM, Jeff King wrote:

On Thu, Apr 14, 2016 at 10:59:57AM -0400, Eric Chamberland wrote:


just cloned a repo and it checked out without any error (with git 2.2.0) but
got some corrupted files (because I got some sdd failures).

Then, I get a git core dump when trying to "git status" in the repo, which
has a "bad sector" on the sdd drive (encrypted partition).

I tried with git 2.2.0 AND git version 2.8.1.185.gdc0db2c.dirty (just
modified the Makefile to remove STRIP part)

In both cases, I have a  Bus error (core dumped)


Interesting. There was a known issue with reading corrupted pack .idx
files, but it was fixed in v2.8.0. So this could be a new thing.

SIGBUS is somewhat rare, though (usually just accessing unmapped memory
should get us a SIGSEGV). What platform are you on? I seem to recall
that hardware like ARM that cares about memory alignment is more likely
to get a SIGBUS.

Linux ... 3.7.10-1.45-desktop #1 SMP PREEMPT Tue Dec 16 20:27:58 UTC 2014 (4c885a1) x86_64 x86_64 x86_64 GNU/Linux

df .
Filesystem                                     1K-blocks      Used Available Use% Mounted on
/dev/mapper/cr_ata-ST31000524AS_6VPCXHSW-part1 961430856 699476812 213116108  77% /pmi


model name  : Intel(R) Xeon(R) CPU   X5690  @ 3.47GHz


Program received signal SIGBUS, Bus error.
0x77866d58 in ?? () from /lib64/libcrypto.so.1.0.0
(gdb) bt
#0  0x77866d58 in ?? () from /lib64/libcrypto.so.1.0.0
#1  0x3334d90d8c20f3f0 in ?? ()
#2  0xe59b5a6cd844a601 in ?? ()
#3  0xc587a53f67985ae7 in ?? ()
#4  0x3ce81893e5541777 in ?? ()
#5  0xdeb18349a4b042ea in ?? ()
#6  0x8254de489067ec4b in ?? ()
#7  0x6fbef2439704c81b in ?? ()
#8  0xe0eee2bb385a96da in ?? ()
#9  0x76e19ab3 in ?? ()
#10 0x7fffc4d0 in ?? ()
#11 0x001d in ?? ()
#12 0x77863f80 in SHA1_Update () from /lib64/libcrypto.so.1.0.0
#13 0x005102c0 in write_sha1_file_prepare
(buf=buf@entry=0x76c81000, len=1673936, type=,
sha1=sha1@entry=0x7fffc750 "\340_~", hdr=hdr@entry=0x7fffc570 "blob
1673936",


So I'd assume here that the problem is in accessing the memory in "buf"
to actually compute the sha1. That is mmap'd data, but the process is
fairly bland (mmap however many bytes stat() tells us the file has, and
then compute the sha1). You mentioned a bad sector; could it be that the
filesystem is corrupted, and the OS is giving us SIGBUS when trying to
read unavailable bytes from an mmap'd file?
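
To make the mmap point concrete, here is a minimal sketch (in Python, not git's actual C code) of hashing a working-tree file with plain read() calls instead of mmap. With read(), an unreadable sector should come back as an ordinary I/O error (typically EIO, the same "Input/output error" that cat reports) rather than a SIGBUS. The function names `git_blob_sha1` and `try_hash` are made up for illustration; the "blob <size>" header matches the `hdr` argument visible in the backtrace.

```python
import hashlib
import os


def git_blob_sha1(path, chunk_size=1 << 20):
    """Hash a file the way git names a blob: sha1("blob <size>\\0" + contents).

    The file is read with plain read() calls instead of mmap, so an
    unreadable sector is reported as OSError (typically errno EIO)
    rather than killing the process with SIGBUS.
    """
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        digest = hashlib.sha1(b"blob %d\0" % size)
        remaining = size
        while remaining > 0:
            chunk = f.read(min(chunk_size, remaining))
            if not chunk:
                # stat() promised more bytes than the file now holds.
                raise OSError("file shrank while hashing: %s" % path)
            digest.update(chunk)
            remaining -= len(chunk)
    return digest.hexdigest()


def try_hash(path):
    """Return (hex_digest, None) on success or (None, error) on I/O failure."""
    try:
        return git_blob_sha1(path), None
    except OSError as err:
        return None, err
```

Running `try_hash()` over the files in the damaged working tree would be one way to list every file that sits on a bad sector without crashing on the first one.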


Yes it could be that...



That would explain the SIGBUS versus SIGSEGV.

What happens if you "cat" the file in question:


Hmm, it shows the beginning of the file, then ends with:

cat: Avion.Quadratique.cont.vtu.etalon: Input/output error

Also, this appears in /var/log/messages:

2016-05-04T16:33:19.243595-04:00 melkor kernel: [1096660.854161] ata4.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
2016-05-04T16:33:19.243609-04:00 melkor kernel: [1096660.854165] ata4.00: irq_stat 0x4008
2016-05-04T16:33:19.243610-04:00 melkor kernel: [1096660.854168] ata4.00: failed command: READ FPDMA QUEUED
2016-05-04T16:33:19.243611-04:00 melkor kernel: [1096660.854175] ata4.00: cmd 60/08:00:70:30:c6/00:00:53:00:00/40 tag 0 ncq 4096 in
2016-05-04T16:33:19.243612-04:00 melkor kernel: [1096660.854175]   res 41/40:08:71:30:c6/00:00:53:00:00/00 Emask 0x409 (media error)
2016-05-04T16:33:19.243613-04:00 melkor kernel: [1096660.854178] ata4.00: status: { DRDY ERR }
2016-05-04T16:33:19.243614-04:00 melkor kernel: [1096660.854180] ata4.00: error: { UNC }
2016-05-04T16:33:19.340479-04:00 melkor kernel: [1096660.950794] ata4.00: configured for UDMA/133
2016-05-04T16:33:19.340484-04:00 melkor kernel: [1096660.950806] sd 3:0:0:0: [sdb] Unhandled sense code
2016-05-04T16:33:19.340485-04:00 melkor kernel: [1096660.950809] sd 3:0:0:0: [sdb]
2016-05-04T16:33:19.340485-04:00 melkor kernel: [1096660.950811] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
2016-05-04T16:33:19.340486-04:00 melkor kernel: [1096660.950814] sd 3:0:0:0: [sdb]
2016-05-04T16:33:19.340486-04:00 melkor kernel: [1096660.950815] Sense Key : Medium Error [current] [descriptor]
2016-05-04T16:33:19.340486-04:00 melkor kernel: [1096660.950819] Descriptor sense data with sense descriptors (in hex):
2016-05-04T16:33:19.340487-04:00 melkor kernel: [1096660.950820]  72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
2016-05-04T16:33:19.340487-04:00 melkor kernel: [1096660.950829]  53 c6 30 71
2016-05-04T16:33:19.340488-04:00 melkor kernel: [1096660.950834] sd 3:0:0:0: [sdb]
2016-05-04T16:33:19.340488-04:00 melkor kernel: [1096660.950836] Add. Sense: Unrecovered read error - auto reallocate failed
2016-05-04T16:33:19.340489-04:00 melkor kernel: [1096660.950839] sd 3:0:0:0: [sdb] CDB:
2016-05-04T16:33:19.340489-04:00 melkor kernel: [1096660.950840] Read(10): 28 00 53 c6 30 70 00 00 08 00
2016-05-04T16:33:19.340489-04:00 melkor kerne

git status core dump with bad sector!

2016-04-14 Thread Eric Chamberland

Hi,

just cloned a repo and it checked out without any error (with git 2.2.0)
but got some corrupted files (because I got some sdd failures).


Then, I get a git core dump when trying to "git status" in the repo,
which has a "bad sector" on the sdd drive (encrypted partition).


I tried with git 2.2.0 AND git version 2.8.1.185.gdc0db2c.dirty (just 
modified the Makefile to remove STRIP part)


In both cases, I have a  Bus error (core dumped)

Tried to make it more verbose:

GIT_TRACE=2 GIT_CURL_VERBOSE=2 GIT_TRACE_PERFORMANCE=2 \
GIT_TRACE_PACK_ACCESS=2 GIT_TRACE_PACKET=2 GIT_TRACE_PACKFILE=2 \
GIT_TRACE_SETUP=2 GIT_TRACE_SHALLOW=2 /opt/gitgit/bin/git status

10:54:30.644999 trace.c:318 setup: git_dir: .git
10:54:30.645094 trace.c:319 setup: git_common_dir: .git
10:54:30.645102 trace.c:320 setup: worktree: 
/pmi/cmpbib/compilation_BIB_gcc-4.5.1_64bit/TestValidation_avec_erreur_disque_git_core_dump_dans_dev_Test.ExportationVTK_Avion
10:54:30.645112 trace.c:321 setup: cwd: 
/pmi/cmpbib/compilation_BIB_gcc-4.5.1_64bit/TestValidation_avec_erreur_disque_git_core_dump_dans_dev_Test.ExportationVTK_Avion
10:54:30.645151 trace.c:322 setup: prefix: 
Ressources/dev/Test.ExportationVTK/

10:54:30.645181 git.c:350   trace: built-in: git 'status'
Bus error (core dumped)

started in gdb:

Program received signal SIGBUS, Bus error.
0x77866d58 in ?? () from /lib64/libcrypto.so.1.0.0
(gdb) bt
#0  0x77866d58 in ?? () from /lib64/libcrypto.so.1.0.0
#1  0x3334d90d8c20f3f0 in ?? ()
#2  0xe59b5a6cd844a601 in ?? ()
#3  0xc587a53f67985ae7 in ?? ()
#4  0x3ce81893e5541777 in ?? ()
#5  0xdeb18349a4b042ea in ?? ()
#6  0x8254de489067ec4b in ?? ()
#7  0x6fbef2439704c81b in ?? ()
#8  0xe0eee2bb385a96da in ?? ()
#9  0x76e19ab3 in ?? ()
#10 0x7fffc4d0 in ?? ()
#11 0x001d in ?? ()
#12 0x77863f80 in SHA1_Update () from /lib64/libcrypto.so.1.0.0
#13 0x005102c0 in write_sha1_file_prepare 
(buf=buf@entry=0x76c81000, len=1673936, type=, 
sha1=sha1@entry=0x7fffc750 "\340_~", hdr=hdr@entry=0x7fffc570 
"blob 1673936",

hdrlen=hdrlen@entry=0x7fffc56c) at sha1_file.c:2951
#14 0x0051567b in hash_sha1_file (buf=buf@entry=0x76c81000, 
len=, type=, 
sha1=sha1@entry=0x7fffc750 "\340_~") at sha1_file.c:3010
#15 0x005159f8 in index_mem (sha1=sha1@entry=0x7fffc750 
"\340_~", buf=buf@entry=0x76c81000, size=1673936, 
type=type@entry=OBJ_BLOB,
path=path@entry=0x80a818 
"Ressources/dev/Test.ExportationVTK/Ressources.Avion/Avion.Quadratique.cont.vtu.etalon", 
flags=flags@entry=0) at sha1_file.c:3305
#16 0x005160ee in index_core (flags=0, path=0x80a818 
"Ressources/dev/Test.ExportationVTK/Ressources.Avion/Avion.Quadratique.cont.vtu.etalon", 
type=OBJ_BLOB, size=, fd=7,

sha1=0x7fffc750 "\340_~") at sha1_file.c:3367
#17 index_fd (sha1=sha1@entry=0x7fffc750 "\340_~", fd=7, 
st=st@entry=0x7fffc7c0, type=type@entry=OBJ_BLOB,
path=path@entry=0x80a818 
"Ressources/dev/Test.ExportationVTK/Ressources.Avion/Avion.Quadratique.cont.vtu.etalon", 
flags=flags@entry=0) at sha1_file.c:3410
#18 0x004eac66 in ce_compare_data (st=0x7fffc7c0, 
ce=0x80a7c0) at read-cache.c:166
#19 ce_modified_check_fs (ce=0x80a7c0, st=0x7fffc7c0) at 
read-cache.c:215
#20 0x004ebb6d in ie_modified (istate=istate@entry=0x7e5fe0 
, ce=ce@entry=0x80a7c0, st=st@entry=0x7fffc7c0, 
options=options@entry=16) at read-cache.c:395
#21 0x004ebcfe in refresh_cache_ent 
(istate=istate@entry=0x7e5fe0 , ce=ce@entry=0x80a7c0, 
options=options@entry=16, err=err@entry=0x7fffc908,

changed_ret=changed_ret@entry=0x7fffc90c) at read-cache.c:1130
#22 0x004ed59c in refresh_index (istate=0x7e5fe0 , 
flags=flags@entry=6, pathspec=pathspec@entry=0x7bb738 , 
seen=seen@entry=0x0, header_msg=header_msg@entry=0x0)

at read-cache.c:1221
#23 0x00429e3b in cmd_status (argc=, 
argv=0x7fffcca0, prefix=0x7e950f 
"Ressources/dev/Test.ExportationVTK/") at builtin/commit.c:1376
#24 0x004063b3 in run_builtin (argv=0x7fffcca0, argc=1, 
p=0x7b4030 ) at git.c:352

#25 handle_builtin (argc=1, argv=0x7fffcca0) at git.c:539
#26 0x004054a1 in run_argv (argv=0x7fffca80, 
argcp=0x7fffca6c) at git.c:593

#27 main (argc=1, av=) at git.c:698

I would have expected git to give me an error first, when checking out
the files! Here is the log:


Checking out files:  99% (28645/28934)
Checking out files: 100% (28934/28934)
Checking out files: 100% (28934/28934), done.
Already on 'master'
Your branch is up-to-date with 'origin/master'.
On valide le dépôt TestValidation avec la référence: 
9b4a485202b2b52922377842c15bfd605d240667

HEAD is now at 9b4a485 BUG: On spécifie bash comme shell...

But at least 1 file is corrupted!

I am carefully keeping this faulty repo for further investigation with
anyone who can help dig.

Re: [feature request] 2) Remove many tags at once and 1) Prune tags on old-branch-before-rebase

2013-03-08 Thread Eric Chamberland

Hi Junio,

On 03/07/2013 06:33 PM, Junio C Hamano wrote:

Eric Chamberland eric.chamberl...@giref.ulaval.ca writes:

What you want is a way to compute, given a set of tags (or refs in
general) and a set of branches (or another set of refs in general),
find the ones in the former that none of the latter can reach.  With
that, you can drive git tag -d $(that way).
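
As a toy illustration of that computation (not a git feature), here is a small Python sketch over an in-memory commit graph. The dictionaries `parents`, `tags` and `branches` and the function names are hypothetical; in a real repository the same data would have to come from git itself.

```python
def reachable(parents, tips):
    """Return every commit reachable from `tips` by following parent links."""
    seen = set()
    stack = list(tips)
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, ()))
    return seen


def unreachable_tags(parents, tags, branches):
    """Names of tags whose commit no branch tip can reach."""
    covered = reachable(parents, branches.values())
    return sorted(name for name, commit in tags.items() if commit not in covered)


# Toy history: A <- B <- C is master; D is an old pre-rebase commit on top of B.
parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"]}
tags = {"v1": "B", "old-wip": "D"}
branches = {"master": "C"}
print(unreachable_tags(parents, tags, branches))  # prints ['old-wip']
```

The resulting list is what would then be handed to git tag -d.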



Yes, this is *exactly* what I want...


In other words, the feature does not belong to git tag command.


2) git tag -d TOKEN*


Again, not interesting.  You already have:

 git for-each-ref --format='%(refname:short)' refs/tags/TOKEN\* |
 xargs -r git tag -d



I don't agree here for one reason:

git tag -l TOKEN*

already exists and works very well...

So why is it not interesting to have:

git tag -d TOKEN*

?

We can also write:

git tag -d `git tag -l TOKEN*`

but a simple addition to the -d feature seems like acceptable behavior
here, no?


Thanks,

Eric

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: GIT get corrupted on lustre

2013-02-04 Thread Eric Chamberland

Hi,

On 01/23/2013 01:34 PM, Sébastien Boisvert wrote:

Hello,

Here is a patch (with git format-patch) that removes any timer if
NO_SETITIMER is set.



Even with the patch, I finally got an error... :-/

Here are the logs (strace -f) of a clean execution and of one with the error:

http://www.giref.ulaval.ca/~ericc/NO_SETITIMER-patch_bin_git_noerror.txt.gz

http://www.giref.ulaval.ca/~ericc/NO_SETITIMER-patch_bin_git_with_error.txt.gz

Eric



Re: GIT get corrupted on lustre

2013-01-22 Thread Eric Chamberland

So, hum, do we have some sort of conclusion?

Should there be a fix in git to work around that lustre behavior?

If something can be done in git it would be great: it is a *lot* easier
to change git than the lustre filesystem software on a cluster
running in production mode... (words from the cluster team) :-/


I hope this subject will not die on the list... :-/

Thanks,

Eric



On 01/21/2013 02:29 PM, Thomas Rast wrote:

Please don't drop the Cc list!

Brian J. Murrell br...@interlinx.bc.ca writes:


What's odd is that while I cannot reproduce the original problem, there
seems to be another issue/bug with utime():


I wonder if this is related to http://jira.whamcloud.com/browse/LU-305.
  That was reported as fixed in Lustre 2.0.0 and 2.1.0 but I thought I
saw it on 2.1.1 and added a comment to the above ticket about that.


Aha, that's a very interesting bug report.  My observations support
yours: I managed to get EINTR during utime().


In the absence of it, wouldn't we in theory have to write a simple
loop-on-EINTR wrapper for *all* syscalls?


IIUC, that's what SA_RESTART is all about.


Yes, but there's precious little clear language on when SA_RESTART is
supposed to act.  In all cases?

The wording on

   http://www.delorie.com/gnu/docs/glibc/libc_485.html
   http://www.delorie.com/gnu/docs/glibc/libc_498.html

leads me to believe that SA_RESTART is actually used on the glibc side
of things, so that any glibc syscall wrapper not specifically equipped
with the restarting behavior would return EINTR unmodified.  This might
explain why utime() doesn't restart like it should (assuming we work on
the theory that POSIX doesn't allow an EINTR from utime() to begin
with).
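
For readers unfamiliar with the idea, a loop-on-EINTR wrapper can be sketched as follows. This is only an illustration in Python, with a made-up name `retry_on_eintr` and an arbitrary retry cap; git itself is C, and Python 3.5 and later already retry most system calls on EINTR on their own (PEP 475), so the wrapper is shown for the shape of the logic rather than as something Python code needs.

```python
import errno


def retry_on_eintr(call, *args, max_retries=100):
    """Call `call(*args)`, repeating it while it fails with EINTR.

    Any other error is passed through unchanged. The retry cap is an
    arbitrary safety limit chosen for this sketch.
    """
    for _ in range(max_retries):
        try:
            return call(*args)
        except OSError as err:
            if err.errno != errno.EINTR:
                raise
    raise OSError(errno.EINTR, "still interrupted after %d retries" % max_retries)
```

A call such as `retry_on_eintr(os.utime, path, None)` would then hide a spurious EINTR from the caller, which is what a per-syscall wrapper would have to do if SA_RESTART cannot be relied on.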





Re: GIT get corrupted on lustre

2013-01-22 Thread Eric Chamberland

On 01/22/2013 05:14 PM, Thomas Rast wrote:

Eric Chamberland eric.chamberl...@giref.ulaval.ca writes:


So, hum, do we have some sort of conclusion?

Should there be a fix in git to work around that lustre behavior?

If something can be done in git it would be great: it is a *lot*
easier to change git than the lustre filesystem software on a cluster
running in production mode... (words from the cluster team) :-/


I thought you already established that simply disabling the progress
display is a sufficient workaround?  If that doesn't help, you can try
patching out all use of SIGALRM within git.



I tried that solution after Brian told me to try it, but it didn't
solve the problem for me! :-(



Other than that I agree with Junio, from what we've seen so far, Lustre
returns EINTR on all sorts of calls that simply aren't allowed to do so.



Ok, so now the right move would be to report all this to the lustre
development team?  Brian, have you seen anything new beyond what you have
already reported?  I have to admit that I am not a fs expert...


And I also agree with Junio's point of view: the problem may impact
mission-critical applications.


Eric



Re: GIT get corrupted on lustre

2013-01-21 Thread Eric Chamberland

Hi,

It just happened again.  Now I have the strace -f output gzipped here:

http://www.giref.ulaval.ca/~ericc/strace-f_git_error.txt.gz

thanks,

Eric

On 01/21/2013 08:29 AM, Erik Faye-Lund wrote:

On Fri, Jan 18, 2013 at 6:50 PM, Eric Chamberland
eric.chamberl...@giref.ulaval.ca wrote:

Good idea!

I did a strace and here is the output with the error:

http://www.giref.ulaval.ca/~ericc/strace_git_error.txt

Hope it will be insightful!


This trace doesn't seem to contain child-processes, but instead having
their stderr inlined into the log. Try using strace -f instead...





Re: GIT get corrupted on lustre

2013-01-21 Thread Eric Chamberland

On 01/21/2013 12:07 PM, Eric Chamberland wrote:

Hi,

It just happened again.  Now I have the strace -f output gzipped here:

http://www.giref.ulaval.ca/~ericc/strace-f_git_error.txt.gz



I added the strace -f output from a run where no error occurs...

http://www.giref.ulaval.ca/~ericc/strace-f_git_no_error.txt.gz

A kdiff3 of the two files shows the differences just before the error...

Eric



Re: GIT get corrupted on lustre

2013-01-18 Thread Eric Chamberland

Good idea!

I did a strace and here is the output with the error:

http://www.giref.ulaval.ca/~ericc/strace_git_error.txt

Hope it will be insightful!

Eric


On 01/17/2013 12:17 PM, Pyeron, Jason J CTR (US) wrote:

Sorry, I am in cygwin mode, and I had crossed wires in my head. 
s/ProcessMon/strace/


-Original Message-
From: git-ow...@vger.kernel.org [mailto:git-ow...@vger.kernel.org] On
Behalf Of Maxime Boissonneault
Sent: Thursday, January 17, 2013 11:41 AM
To: Pyeron, Jason J CTR (US)
Cc: Eric Chamberland; Philippe Vaucher; git@vger.kernel.org; Sébastien
Boisvert
Subject: Re: GIT get corrupted on lustre

I don't know of any lustre filesystem that is used on Windows. Barely
anybody uses Windows in the HPC industry.
This is a Linux cluster.

Maxime Boissonneault

On 2013-01-17 11:40, Pyeron, Jason J CTR (US) wrote:

-Original Message-
From: Eric Chamberland
Sent: Thursday, January 17, 2013 11:31 AM

On 01/17/2013 09:23 AM, Philippe Vaucher wrote:

Anyone has a new idea?

Did you try Jeff King's code to confirm his idea?

Philippe


Yes I did, but it was running without any problem.

I find that my test case is simple (fresh git clone then git gc in a
crontab), I bet anyone who has access to a Lustre filesystem can
reproduce the problem...  The problem is to have such a filesystem to do
the tests.

Stabbing in the dark, but can you log the details with ProcessMon?

http://technet.microsoft.com/en-us/sysinternals/bb896645


But I am available to do it...

-Jason



--
-
Maxime Boissonneault
Computing analyst - Calcul Québec, Université Laval
Ph.D. in physics



Re: GIT get corrupted on lustre

2013-01-17 Thread Eric Chamberland

Hi!

I still have the corruption problems.

We just compiled a git without threads to try... (by the way,
--without-pthreads doesn't work, you have to use --disable-pthreads
instead).


And to remove the warnings about threads at git gc execution, I did a:

git config --local pack.threads 1

and cloned a repository and started to do:

git gc

once every hour.

Then this night (at 05:35:02 exactly), the same error as usual occurred:

error: index file 
.git/objects/pack/pack-bf0748cee64a1964be0a1061c82aca51c993b825.idx is 
too small

error: refs/heads/master does not point to a valid object!
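
The "too small" message suggests git saw an index file shorter than any valid pack index could be. Below is a rough Python sketch of that kind of size check, which could be run right after a failure to see whether the .idx file is really short on disk or was only short at the moment git looked at it. The 1064-byte bound is an assumption based on the version-1 index layout (256 four-byte fan-out entries plus two 20-byte SHA-1 trailers); git's own check may differ, and the names `MIN_IDX_SIZE` and `find_short_indexes` are invented for this example.

```python
import os

# Assumed lower bound for a version-1 pack index: a 256-entry fan-out
# table of 4-byte counts, plus the pack SHA-1 and the index SHA-1.
MIN_IDX_SIZE = 4 * 256 + 20 + 20


def idx_looks_truncated(path):
    """True if the file is too short to be a complete pack index."""
    return os.stat(path).st_size < MIN_IDX_SIZE


def find_short_indexes(pack_dir):
    """List the .idx files in `pack_dir` that are below the minimum size."""
    short = []
    for name in sorted(os.listdir(pack_dir)):
        if name.endswith(".idx") and idx_looks_truncated(os.path.join(pack_dir, name)):
            short.append(name)
    return short
```

If `find_short_indexes(".git/objects/pack")` comes back empty just after the error, the file was probably complete on disk and the short size was something the filesystem reported transiently.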

So now I am convinced that it is not a thread problem.

I am kind of discouraged: we like to use git, but in this case we have
this error which seems unsolvable!


Does anyone have a new idea?

Thanks,

Eric


On 01/09/2013 04:20 PM, Eric Chamberland wrote:

Hi Brian,

On 01/08/2013 11:11 AM, Eric Chamberland wrote:

On 12/24/2012 10:11 AM, Brian J. Murrell wrote:

Have you tried adding a -q to the git command line to quiet down git's
feedback messages?





I moved to git 1.8.1 and added the -q to the git gc command, but it
still returned an error, so the -q option does not avoid the
problem here... :-/

command in crontab:

cd /rap/jsf-051-aa/ericc/tests_git_clones/GIREF && for i in $(seq 10); do
/software/apps/git/1.8.1/bin/git gc -q || true; done

results:
error: index file
.git/objects/pack/pack-1f09879c88cd71a15dcc891713cf038d249830ad.idx is
too small
error: refs/remotes/origin/BIB_Branche_1_4_x does not point to a valid
object!

and this was a clean clone on which only git gc -q has been
run.

I still suspect threads.

Eric



Re: GIT get corrupted on lustre

2013-01-17 Thread Eric Chamberland

On 01/17/2013 09:23 AM, Philippe Vaucher wrote:

Anyone has a new idea?


Did you try Jeff King's code to confirm his idea?

Philippe



Yes I did, but it was running without any problem.

I find that my test case is simple (fresh git clone then git gc in a
crontab), I bet anyone who has access to a Lustre filesystem can
reproduce the problem...  The problem is to have such a filesystem to do
the tests.


But I am available to do it...

Thanks,

Eric





Re: GIT get corrupted on lustre

2013-01-09 Thread Eric Chamberland

Hi Brian,

On 01/08/2013 11:11 AM, Eric Chamberland wrote:

On 12/24/2012 10:11 AM, Brian J. Murrell wrote:

Have you tried adding a -q to the git command line to quiet down git's
feedback messages?





I moved to git 1.8.1 and added the -q to the git gc command, but it
still returned an error, so the -q option does not avoid the
problem here... :-/


command in crontab:

cd /rap/jsf-051-aa/ericc/tests_git_clones/GIREF && for i in $(seq 10); do
/software/apps/git/1.8.1/bin/git gc -q || true; done


results:
error: index file 
.git/objects/pack/pack-1f09879c88cd71a15dcc891713cf038d249830ad.idx is 
too small
error: refs/remotes/origin/BIB_Branche_1_4_x does not point to a valid 
object!


and this was a clean clone on which only git gc -q has been
run.


I still suspect threads.

Eric



Re: GIT get corrupted on lustre

2013-01-08 Thread Eric Chamberland

On 12/24/2012 10:11 AM, Brian J. Murrell wrote:

Have you tried adding a -q to the git command line to quiet down git's
feedback messages?



Ok, I have modified my crontab to use -q and I will wait to see if the
problem still occurs.



I discovered other oddities with using git on Lustre which I described
in this thread:

http://thread.gmane.org/gmane.comp.version-control.git/208886

I found that by simply disabling the feedback (which disables the
copious SIGALRM processing) I could alleviate the issue.

I wonder if your issues are more of the same.

I filed Lustre bug LU-2276 about it at:

http://jira.whamcloud.com/browse/LU-2276


Thank you for this information.  I see the bug is still unresolved!...

Eric




GIT get corrupted on lustre

2012-12-24 Thread Eric Chamberland

Hi,

We have been using git since May and everything is working fine for all
of us (almost 20 people) on our workstations.  However, when we clone our
repositories to the cluster, and only there,

we are having many problems similar to this post:

http://thread.gmane.org/gmane.comp.file-systems.lustre.user/12093

Doing a git clone always works fine, but when we git pull or git gc
or git fsck, the local repository often (1 time in 5) gets corrupted.

For example, I got this error two days ago while doing git gc:

error: index file 
.git/objects/pack/pack-7b43b1c613a851392aaf4f66916dff2577931576.idx is too small
error: refs/heads/mail_seekable does not point to a valid object!

also, I got this error 5 days ago:

error: index file 
.git/objects/pack/pack-ef9b5bbff1ebc1af63ef4262ade3e18b439c58af.idx is too small
error: refs/heads/mail_seekable does not point to a valid object!
Removing stale temporary file .git/objects/pack/tmp_pack_lO7aw2

and this one some time ago:

Removing stale temporary file .git/objects/pack/tmp_pack_5CHb2F
Removing stale temporary file .git/objects/pack/tmp_pack_GY159g
Removing stale temporary file .git/objects/pack/tmp_pack_aKkXTS

We are using git 1.8.0.1 on CentOS release 5.8 (Final).

We think it could be related to the fact that we are on a *Lustre* 
filesystem, which I think doesn't fully support file locking.


Questions:

#1) However, how can we *test* the filesystem (lustre) compatibility 
with git? (Is there a unit test we can run?)


#2) Is there a way to compile GIT to be compatible with lustre? (ex: no 
threads?)


#3) If you *know* your filesystem doesn't allow file locking, how would 
you configure/compile GIT to work on it?


#4) Does anyone have another idea on how to solve this?
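
On question #1, one very partial probe is sketched below in Python. It assumes, as far as I know, that git's lock files rely on exclusive file creation (O_CREAT|O_EXCL) followed by a rename into place, rather than on fcntl/flock locking; the function names and the probe file names are invented for this example. It runs in a single process, so it cannot show races between cluster nodes, and passing it does not prove the filesystem is safe for git.

```python
import os


def exclusive_create_works(directory):
    """Check that a second O_CREAT|O_EXCL open of the same name is refused."""
    path = os.path.join(directory, "probe.lock")
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_EXCL, 0o666)
    try:
        try:
            second = os.open(path, os.O_RDWR | os.O_CREAT | os.O_EXCL, 0o666)
        except FileExistsError:
            return True
        # The filesystem let two "exclusive" creations succeed.
        os.close(second)
        return False
    finally:
        os.close(fd)
        os.unlink(path)


def rename_replaces_target(directory):
    """Check that renaming a finished file over an old one leaves the new content."""
    old = os.path.join(directory, "probe.idx")
    new = os.path.join(directory, "probe.idx.tmp")
    with open(old, "wb") as f:
        f.write(b"old")
    with open(new, "wb") as f:
        f.write(b"new content")
    os.rename(new, old)
    try:
        with open(old, "rb") as f:
            data = f.read()
        # Both the content and the size reported by stat() must match.
        return data == b"new content" and os.stat(old).st_size == len(data)
    finally:
        os.unlink(old)
```

Both functions take a directory on the filesystem under test (for instance a scratch directory on the Lustre mount) and clean up their probe files afterwards.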

Thanks,

Eric
