Re: error: git-fast-import died of signal 11

2012-10-17 Thread Uri Moszkowicz
Hi Michael,
Looks like the changes to the limits solved the problem. I didn't verify
whether it was the stack size or the descriptors, but it was one of those.
The final repository size was 14GB from a 328GB dump file.

Thanks,
Uri

On Tue, Oct 16, 2012 at 2:18 AM, Michael Haggerty mhag...@alum.mit.edu wrote:
 On 10/15/2012 05:53 PM, Uri Moszkowicz wrote:
 I'm trying to convert a CVS repository to Git using cvs2git. I was able to
 generate the dump file without problem but am unable to get Git to
 fast-import it. The dump file is 328GB and I ran git fast-import on a
 machine with 512GB of RAM.

 fatal: Out of memory? mmap failed: Cannot allocate memory
 fast-import: dumping crash report to fast_import_crash_18192
 error: git-fast-import died of signal 11

 How can I import the repository?

 What versions of git and of cvs2git are you using?  If not the current
 versions, please try with the current versions.

 What is the nature of your repository (i.e., why is it so big)?  Does it
 consist of extremely large files?  A very deep history?  Extremely many
 branches/tags?  Extremely many files?

 Did you check whether the RAM usage of git-fast-import process was
 growing gradually to fill RAM while it was running vs. whether the usage
 seemed reasonable until it suddenly crashed?

 There are a few obvious possibilities:

 0. There is some reason that too little of your computer's RAM is
 available to git-fast-import (e.g., ulimit, other processes running at
 the same time, much RAM being used as a ramdisk, etc).

 1. Your import is simply too big for git-fast-import to hold in memory
 the accumulated things that it has to remember.  I'm not familiar with
 the internals of git-fast-import, but I believe that the main thing that
 it has to keep in RAM is the list of marks (references to git objects
 that can be referred to later in the import).  From your crash file, it
 looks like there were about 350k marks loaded at the time of the crash.
  Supposing each mark is about 100 bytes, this would only amount to 35
 MB, which should not be a problem (*if* my assumptions are correct).

 2. Your import contains a gigantic object which individually is so big
 that it overflows some component of the import.  (I don't know whether
 large objects are handled streamily; they might be read into memory at
 some point.)  But since your computer had so much RAM this is hardly
 imaginable.

 3. git-fast-import has a memory leak and the accumulated memory leakage
 is exhausting your RAM.

 4. git-fast-import has some other kind of a bug.

 5. The contents of the dumpfile are corrupt in a way that is triggering
 the problem.  This could either be invalid input (e.g., an object that
 is reported to be quaggabytes large), or some invalid input that
 triggers a bug in git-fast-import.

 If (1), then you either need a bigger machine or git-fast-import needs
 architectural changes.

 If (2), then you either need a bigger machine or git-fast-import and/or
 git needs architectural changes.

 If (3), then it would be good to get more information about the problem
 so that the leak can be fixed.  If this is the case, it might be
 possible to work around the problem by splitting the dumpfile into
 several parts and loading them one after the other (outputting the marks
 from one run and loading them into the next).

 If (4) or (5), then it would be helpful to narrow down the problem.  It
 might be possible to do so by following the instructions in the cvs2svn
 FAQ [1] for systematically shrinking a test case to smaller size using
 destroy_repository.py and shrink_test_case.py.  If you can create a
 small repository that triggers the same problem, then there is a good
 chance that it is easy to fix.

 Michael
 (the cvs2git maintainer)

 [1] http://cvs2svn.tigris.org/faq.html#testcase

 --
 Michael Haggerty
 mhag...@alum.mit.edu
 http://softwareswirl.blogspot.com/


Re: error: git-fast-import died of signal 11

2012-10-16 Thread Michael Haggerty
On 10/15/2012 05:53 PM, Uri Moszkowicz wrote:
 I'm trying to convert a CVS repository to Git using cvs2git. I was able to
 generate the dump file without problem but am unable to get Git to
 fast-import it. The dump file is 328GB and I ran git fast-import on a
 machine with 512GB of RAM.
 
 fatal: Out of memory? mmap failed: Cannot allocate memory
 fast-import: dumping crash report to fast_import_crash_18192
 error: git-fast-import died of signal 11
 
 How can I import the repository?

What versions of git and of cvs2git are you using?  If not the current
versions, please try with the current versions.

What is the nature of your repository (i.e., why is it so big)?  Does it
consist of extremely large files?  A very deep history?  Extremely many
branches/tags?  Extremely many files?

Did you check whether the RAM usage of git-fast-import process was
growing gradually to fill RAM while it was running vs. whether the usage
seemed reasonable until it suddenly crashed?
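If you rerun it, one way to keep an eye on that might be something along
these lines (a sketch, assuming a Linux box and that the process shows up
under the name git-fast-import; adjust as needed):

watch 'ps -o pid,rss,vsz,cmd -C git-fast-import'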

There are a few obvious possibilities:

0. There is some reason that too little of your computer's RAM is
available to git-fast-import (e.g., ulimit, other processes running at
the same time, much RAM being used as a ramdisk, etc).

1. Your import is simply too big for git-fast-import to hold in memory
the accumulated things that it has to remember.  I'm not familiar with
the internals of git-fast-import, but I believe that the main thing that
it has to keep in RAM is the list of marks (references to git objects
that can be referred to later in the import).  From your crash file, it
looks like there were about 350k marks loaded at the time of the crash.
 Supposing each mark is about 100 bytes, this would only amount to 35
MB, which should not be a problem (*if* my assumptions are correct).

2. Your import contains a gigantic object which individually is so big
that it overflows some component of the import.  (I don't know whether
large objects are handled streamily; they might be read into memory at
some point.)  But since your computer had so much RAM this is hardly
imaginable.

3. git-fast-import has a memory leak and the accumulated memory leakage
is exhausting your RAM.

4. git-fast-import has some other kind of a bug.

5. The contents of the dumpfile are corrupt in a way that is triggering
the problem.  This could either be invalid input (e.g., an object that
is reported to be quaggabytes large), or some invalid input that
triggers a bug in git-fast-import.

If (1), then you either need a bigger machine or git-fast-import needs
architectural changes.

If (2), then you either need a bigger machine or git-fast-import and/or
git needs architectural changes.

If (3), then it would be good to get more information about the problem
so that the leak can be fixed.  If this is the case, it might be
possible to work around the problem by splitting the dumpfile into
several parts and loading them one after the other (outputting the marks
from one run and loading them into the next).
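Roughly, such a chained run might look something like this (just a
sketch; part1.dat and part2.dat are hypothetical pieces of the dump,
split at a commit boundary, with --export-marks/--import-marks carrying
the marks from one run into the next):

git fast-import --export-marks=marks.1 < part1.dat
git fast-import --import-marks=marks.1 --export-marks=marks.2 < part2.dat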

If (4) or (5), then it would be helpful to narrow down the problem.  It
might be possible to do so by following the instructions in the cvs2svn
FAQ [1] for systematically shrinking a test case to smaller size using
destroy_repository.py and shrink_test_case.py.  If you can create a
small repository that triggers the same problem, then there is a good
chance that it is easy to fix.

Michael
(the cvs2git maintainer)

[1] http://cvs2svn.tigris.org/faq.html#testcase

-- 
Michael Haggerty
mhag...@alum.mit.edu
http://softwareswirl.blogspot.com/


Re: error: git-fast-import died of signal 11

2012-10-16 Thread Uri Moszkowicz
I'm using Git 1.8.0-rc2 and cvs2git version 2.5.0-dev (trunk). The
repository is almost 20 years old and should consist mostly of
smallish plain-text files. We've been tagging every commit, in
addition to tagging releases and development branches, so there are a
lot of tags and branches. I didn't see the memory usage of the process
before it exited, but after ~3.5 hours in a subsequent run it seems to
be using about 8.5GB of virtual memory with a resident size of only
0.5GB, which should have easily fit on the 512GB machine I was using.
I'm trying on a 1TB machine now, but it doesn't look like it'll make a
difference. There is no RAM disk, and I have exclusive access to the
machine, so the only other RAM usage is from the OS, which is trivial.
The only significant limit from my environment would be on the stack:

[umoszkow@mawhp5 ~] limit
cputime      unlimited
filesize     unlimited
datasize     unlimited
stacksize    8000 kbytes
coredumpsize 0 kbytes
memoryuse    unlimited
vmemoryuse   unlimited
descriptors  1024
memorylocked 32 kbytes
maxproc      8388608

Would that result in an mmap error though? I'll try with unlimited
stacksize and descriptors anyway.
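For reference, since that output is from csh/tcsh, raising those two
limits before rerunning would presumably look roughly like this (unlimit
may still be capped by the hard limits set by the admin; your_cvs_dump
is just a placeholder for my dump file):

unlimit stacksize
unlimit descriptors
git fast-import < your_cvs_dump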

I don't think modifying the original repository or a clone of it is
possible at this point, but breaking up the import into a few steps may
be; I'll try that next if this fails.

On Tue, Oct 16, 2012 at 2:18 AM, Michael Haggerty mhag...@alum.mit.edu wrote:
 On 10/15/2012 05:53 PM, Uri Moszkowicz wrote:
 I'm trying to convert a CVS repository to Git using cvs2git. I was able to
 generate the dump file without problem but am unable to get Git to
 fast-import it. The dump file is 328GB and I ran git fast-import on a
 machine with 512GB of RAM.

 fatal: Out of memory? mmap failed: Cannot allocate memory
 fast-import: dumping crash report to fast_import_crash_18192
 error: git-fast-import died of signal 11

 How can I import the repository?

 What versions of git and of cvs2git are you using?  If not the current
 versions, please try with the current versions.

 What is the nature of your repository (i.e., why is it so big)?  Does it
 consist of extremely large files?  A very deep history?  Extremely many
 branches/tags?  Extremely many files?

 Did you check whether the RAM usage of git-fast-import process was
 growing gradually to fill RAM while it was running vs. whether the usage
 seemed reasonable until it suddenly crashed?

 There are a few obvious possibilities:

 0. There is some reason that too little of your computer's RAM is
 available to git-fast-import (e.g., ulimit, other processes running at
 the same time, much RAM being used as a ramdisk, etc).

 1. Your import is simply too big for git-fast-import to hold in memory
 the accumulated things that it has to remember.  I'm not familiar with
 the internals of git-fast-import, but I believe that the main thing that
 it has to keep in RAM is the list of marks (references to git objects
 that can be referred to later in the import).  From your crash file, it
 looks like there were about 350k marks loaded at the time of the crash.
  Supposing each mark is about 100 bytes, this would only amount to 35
 MB, which should not be a problem (*if* my assumptions are correct).

 2. Your import contains a gigantic object which individually is so big
 that it overflows some component of the import.  (I don't know whether
 large objects are handled streamily; they might be read into memory at
 some point.)  But since your computer had so much RAM this is hardly
 imaginable.

 3. git-fast-import has a memory leak and the accumulated memory leakage
 is exhausting your RAM.

 4. git-fast-import has some other kind of a bug.

 5. The contents of the dumpfile are corrupt in a way that is triggering
 the problem.  This could either be invalid input (e.g., an object that
 is reported to be quaggabytes large), or some invalid input that
 triggers a bug in git-fast-import.

 If (1), then you either need a bigger machine or git-fast-import needs
 architectural changes.

 If (2), then you either need a bigger machine or git-fast-import and/or
 git needs architectural changes.

 If (3), then it would be good to get more information about the problem
 so that the leak can be fixed.  If this is the case, it might be
 possible to work around the problem by splitting the dumpfile into
 several parts and loading them one after the other (outputting the marks
 from one run and loading them into the next).

 If (4) or (5), then it would be helpful to narrow down the problem.  It
 might be possible to do so by following the instructions in the cvs2svn
 FAQ [1] for systematically shrinking a test case to smaller size using
 destroy_repository.py and shrink_test_case.py.  If you can create a
 small repository that triggers the same problem, then there is a good
 chance that it is easy to fix.

 Michael
 (the cvs2git maintainer)

 [1] http://cvs2svn.tigris.org/faq.html#testcase

 --
 Michael Haggerty
 mhag...@alum.mit.edu
 http://softwareswirl.blogspot.com/

Re: error: git-fast-import died of signal 11

2012-10-16 Thread Andrew Wong
On Tue, Oct 16, 2012 at 3:41 PM, Uri Moszkowicz u...@4refs.com wrote:
 I can do that if it still fails tomorrow. How do I build a debug version of 
 git?

 On Tue, Oct 16, 2012 at 2:35 PM, Andrew Wong andrew.k...@gmail.com wrote:
 Yea, it's a difficult problem to diagnose. It'd be really helpful if
 you can run a debug version of git and run the import process under a
 debugger.

After getting git's source, you can simply run make, and it'll
compile with debug info by default. When compiling is done, you will
see all the binaries in the source's root folder. Then, from the
source folder, you can start gdb by the command:
gdb ./git-fast-import

When you're inside gdb, put a breakpoint on die_nicely by entering:
b die_nicely

Then, you can finally run your import process by entering:
r < your_cvs_dump

When fast-import crashes/dies, you can find the stacktrace by entering:
bt

And that should tell us where it crashed, and, hopefully, where the
memory error happened.
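If it's easier, the same steps could probably be scripted in one shot
with gdb's batch mode (same placeholder dump name as above):

gdb -batch -ex 'break die_nicely' -ex 'run < your_cvs_dump' -ex 'bt' ./git-fast-import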


Re: error: git-fast-import died of signal 11

2012-10-15 Thread Andrew Wong

On 10/15/2012 11:53 AM, Uri Moszkowicz wrote:

I'm trying to convert a CVS repository to Git using cvs2git. I was able to
generate the dump file without problem but am unable to get Git to
fast-import it. The dump file is 328GB and I ran git fast-import on a
machine with 512GB of RAM.

Just taking a wild guess here. Are you using a 64-bit build of git? If
not, maybe it'd help to try a 64-bit build?
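One quick way to check, assuming a Linux machine, is to look at the
binary itself:

file "$(which git)"

A 64-bit build should show up as an ELF 64-bit executable.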



fatal: Out of memory? mmap failed: Cannot allocate memory
fast-import: dumping crash report to fast_import_crash_18192
error: git-fast-import died of signal 11

fast-import also produced a crash report. It might help to diagnose the
issue if you could post that report. The report shouldn't be too big,
and you might want to strip any sensitive information before posting.



Re: error: git-fast-import died of signal 11

2012-10-15 Thread Andrew Wong

On 10/15/2012 05:28 PM, Uri Moszkowicz wrote:

Thanks for the reply. Yes I am using a 64-bit build of Git. The report
is too large to attach to email so I've uploaded it here (~6MB tar.xz
file):

http://www.tempfiles.net/download/201210/267447/fast_import_crash18192.html

Hm, there are some blanks in the recent commands section:

  D someFile
  D someFile
  D someFile
  D someFile
  (blank here)
  reset refs/tags/someTag
  from :145763
  reset refs/heads/TAG.FIXUP
* (blank here)

There should have been some commands there. Maybe that has something to 
do with the crash? Would you be able to locate where this is in the cvs 
dump? You might be able to use those tag names around those lines to 
help locate them.
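For example, something like this might help pin down the spot (someTag
is just the sanitized name from the excerpt above, and your_cvs_dump is
a placeholder for the dump file):

grep -n 'refs/tags/someTag' your_cvs_dump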

What are those blanks supposed to be? Probably commits?