Hello Emilian,
I work at Oracle on NetBeans development, and we would like to start changing the build scripts to use Git instead of Hg. Because this will take time, we could start early against your Git repository, if you agree; it does not need to wait for the final official donation of the sources.
Can you please send me the URL,...
Thank you,
Martin Balin


On 24.11.2016 20:07, Emilian Bold wrote:
At under 1GB the repository size is not an issue anymore.

It's sad to see we will still have migration problems due to legal
considerations.

Could you provide an estimate of how long it would take to verify and
whitelist the entire codebase Oracle plans on donating?

It's unclear to me how history would be preserved with an incremental
approach.

I would prefer we migrate the whole thing in one piece with history and all.


--emi

On Thu, Nov 24, 2016 at 5:22 PM, Jaroslav Tulach <[email protected]> wrote:
Emilian, Jan, Mark, great work.

Smooth migration from Hg to Git is essential for successful migration to
Apache. Thanks a lot for investigating how to do that.

My plan (as described in another email) is to prepare the code donation in
Hg and update it incrementally with code integrated into Hg.

Are your conversion methods ready for incremental updates or do they only
work as a one-time batch conversion?

-jt

On Thursday, 24 November 2016 10:41:50 CET Jan Lahoda wrote:
Interesting. I tried "git gc --aggressive" on Mark's converted
repository, and the result is:
netbeans-import/.git$ du -hs .
792M    .

The original was:
netbeans-import.git $ du -hs .
3,5G    .

(IIRC Mark was converting http://hg.netbeans.org/main, not releases, so
the repository is a little bit smaller than the releases one.)

I tried:
$ git log -p | sha1sum

on both repositories, and the hashes appear to be the same. I also tried
to clone the gc-ed repository using git clone --bare --no-local, and the
resulting repository is still about the same size. So, this seems good to
me, unless there is some downside I don't know about.
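
The comparison described above can be wrapped in two small helpers so it is easy to repeat on other conversions. This is only a sketch: the function names history_hash and same_history are made up for illustration, and it assumes git and sha1sum are on the PATH. Like the original command, it only covers commits reachable from HEAD.

```shell
#!/bin/sh
# history_hash DIR: print a SHA-1 over the full patch log of the
# repository in DIR (works for bare and non-bare repositories).
# Hypothetical helper name, not part of git.
history_hash() {
    git -C "$1" log -p | sha1sum | cut -d' ' -f1
}

# same_history A B: succeed when both repositories yield the same hash,
# i.e. "git log -p" prints byte-identical output for both.
same_history() {
    [ "$(history_hash "$1")" = "$(history_hash "$2")" ]
}
```

For example, "same_history netbeans-import netbeans-import.git && echo identical" would print "identical" only when both logs match.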

Jan


On Wed, Nov 23, 2016 at 8:26 PM, Emilian Bold <[email protected]> wrote:
Actually I don't believe the data loss is that large. (There may also be
mercurial commits that are intentionally ignored by the conversion script,
like commits that only add tags?)

hg log | grep '^changeset:' | wc -l

   313209

git log | grep '^commit ' | wc -l

   301478

So there is a difference of 11731 commits (about 4%) but those couldn't
have such a large impact on repository size.
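
As a quick check of the arithmetic above, the difference and the percentage can be recomputed from the two counts. This is a small sketch using only the numbers quoted in this message; the variable names are made up for illustration.

```shell
#!/bin/sh
# Changeset and commit counts quoted in the message above.
hg_count=313209
git_count=301478

# Absolute difference between the two histories.
diff=$((hg_count - git_count))

# Share of Mercurial changesets without a counted Git commit, in percent.
# LC_ALL=C keeps the decimal separator a dot regardless of locale.
pct=$(LC_ALL=C awk -v d="$diff" -v t="$hg_count" \
    'BEGIN { printf "%.2f", 100 * d / t }')

echo "missing: $diff ($pct%)"
```

Note that "git log" without --all only walks commits reachable from HEAD, so part of the gap may simply be commits that live on other branches; "git rev-list --all --count" would count every commit reachable from any ref.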

I hope somebody else is willing to work with me on this so we document
everything and do a reproducible repository conversion.



--emi

On Wed, Nov 23, 2016 at 9:10 PM, Emilian Bold <[email protected]> wrote:
Well, I dunno what black magic `gc --aggressive` does but the repository
is 0.85GB now!

I also ran `git reflog expire` first but it didn't change the size at all.

One thing to keep in mind is that I used --force although I had 6 commits
with the warning "repository has at least one unnamed head", which were
probably all close-branch commits (hg commit --close-branch).

So I might have data loss(!), since I believe I read that
hg-fast-export.sh picks only one unnamed head as the migration winner. I
wonder whether the gc command just purged a lot of valid commits from such
an unnamed head, and that is why the repository became so small.

Could somebody else try a test repository conversion and validate my
numbers?

git gc --aggressive --prune=now
Counting objects: 4085031, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (2909203/2909203), done.
Writing objects: 100% (4085031/4085031), done.
Total 4085031 (delta 2150468), reused 1585934 (delta 0)
Checking connectivity: 4085031, done.



--emi

On Wed, Nov 23, 2016 at 7:59 PM, Paul Merlin <[email protected]> wrote:
Hi Emilian,

I see hg-fast-export.sh finished at some point.

As expected though, git does not have any of the disk space gains. The
converted git releases/ repository is 3.6GB.
Just a thought.
Did you try some git cleanups after the conversion?

git reflog expire --expire=now --all
git gc --aggressive --prune=now

Cheers
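
To see what the two cleanup commands above actually save, the size of the git directory can be measured before and after running them. This is only a sketch: repo_kib and shrink_repo are hypothetical helper names, and the argument is assumed to be the path of the git directory itself (the .git directory, or a bare repository).

```shell
#!/bin/sh
# repo_kib DIR: disk usage of DIR in KiB (du -k avoids locale-dependent
# human-readable output such as "3,5G").
repo_kib() {
    du -sk "$1" | cut -f1
}

# shrink_repo GIT_DIR: run the cleanup commands from the message above
# on the given git directory and print its size before and after.
shrink_repo() {
    before=$(repo_kib "$1")
    git --git-dir="$1" reflog expire --expire=now --all
    git --git-dir="$1" gc --aggressive --prune=now --quiet
    after=$(repo_kib "$1")
    echo "$before KiB -> $after KiB"
}
```

On a repository of this size, "shrink_repo netbeans-import/.git" can be expected to run for a long time, since --aggressive recomputes deltas from scratch.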

In case these statistics mean something:

git-fast-import statistics:
---------------------------------------------------------------------
Alloc'd objects:    4090000
Total objects:      4085509 (  40220100 duplicates                  )
      blobs  :      1036365 (  28386238 duplicates     858087 deltas of     969684 attempts)
      trees  :      2735935 (  11833862 duplicates    1370606 deltas of    2613480 attempts)
      commits:       313209 (         0 duplicates          0 deltas of          0 attempts)
      tags   :            0 (         0 duplicates          0 deltas of          0 attempts)
Total branches:        1283 (       346 loads     )
      marks:        1048576 (    313209 unique    )
      atoms:         124011
Memory total:        218429 KiB
       pools:         26711 KiB
     objects:        191718 KiB
---------------------------------------------------------------------
pack_report: getpagesize()            =       4096
pack_report: core.packedGitWindowSize = 1073741824
pack_report: core.packedGitLimit      = 8589934592
pack_report: pack_used_ctr            =   39000045
pack_report: pack_mmap_calls          =     733040
pack_report: pack_open_windows        =          4 /          7
pack_report: pack_mapped              = 4280730006 / 6950823920
---------------------------------------------------------------------

--emi

On Fri, Nov 18, 2016 at 1:32 PM, Emilian Bold <[email protected]> wrote:
A releases/ clone which on my system takes 3.8GB is reduced to 1.6GB with
the generaldelta and aggressivemergedeltas flags (took about 14 hours).

Pretty impressive!

Converting to git with hg-fast-export.sh complains that "repository has at
least one unnamed head" for about 6 revisions. With --force I'm able to
start the conversion but it hasn't finished yet.

The git conversion is about 35% done and already using 1.3GB.

So... I assume it's going to need about 3.8GB, just like the original
repository.

I wonder if git has similar space-saving tricks?



--emi

On Thu, Nov 17, 2016 at 8:46 AM, Emilian Bold <[email protected]> wrote:
Forgot about this. I've just started the Mercurial repository conversion
which will take a few hours.

Will report tomorrow or when it's done.


--emi

On Wed, Nov 16, 2016 at 11:18 PM, cowwoc <[email protected]> wrote:
Hi Emilian,

Any update on this?

Thanks,
Gili

On 2016-11-11 01:33 (-0500), Emilian Bold <[email protected]>
wrote:
Thank you for following through with this after we talked on IRC.

I will check the size reduction for the releases/ repo later.


