[ 
https://issues.apache.org/jira/browse/LUCENE-6933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15064609#comment-15064609
 ] 

Paul Elschot commented on LUCENE-6933:
--------------------------------------

I cloned from https://github.com/dweiss/lucene-solr-svn2git.git, and it works 
as advertised.
After a git gc, the total file size is:

find . -type f -print0 | xargs -0 cat | wc
2942604 13472825 347467457

This is just under 350MB (347,467,457 bytes), which is not consistent with the 
214MB that was mentioned above. Did I do something wrong?

To me the actual size is not a problem at all.

For reference, the total number of files in the local git repo is 9322:
find . -type f | wc
9322    9324  694864
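
One possible way to narrow down the difference is to measure the .git
directory separately from the checked-out files, since find . counts both.
The sketch below wraps the same byte count in a small helper; the name
total_bytes is hypothetical, and whether the 214MB figure referred to the
packed repository alone is only an assumption, not something confirmed in
this thread.

```shell
# Print the total size in bytes of all regular files under directory $1.
# This is the same measurement as the find/cat/wc pipeline above, limited
# to the byte count.
total_bytes() {
    find "$1" -type f -exec cat {} + | wc -c | tr -d ' '
}

# Hypothetical usage, run from the top of the clone:
#   total_bytes .            # checked-out files plus .git
#   total_bytes .git         # repository data only
#   git count-objects -vH    # git's own report; see the size-pack line
```

If the .git figure alone comes out near 214MB, the remainder would simply be
the working tree.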

And thanks for showing how and when to graft.


> Create a (cleaned up) SVN history in git
> ----------------------------------------
>
>                 Key: LUCENE-6933
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6933
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Dawid Weiss
>            Assignee: Dawid Weiss
>         Attachments: migration.txt, multibranch-commits.log, tools.zip
>
>
> Goals:
> * selectively drop projects and core-irrelevant stuff:
>   ** {{lucene/site}}
>   ** {{lucene/nutch}}
>   ** {{lucene/lucy}}
>   ** {{lucene/tika}}
>   ** {{lucene/hadoop}}
>   ** {{lucene/mahout}}
>   ** {{lucene/pylucene}}
>   ** {{lucene/lucene.net}}
>   ** {{lucene/old_versioned_docs}}
>   ** {{lucene/openrelevance}}
>   ** {{lucene/board-reports}}
>   ** {{lucene/java/site}}
>   ** {{lucene/java/nightly}}
>   ** {{lucene/dev/nightly}}
>   ** {{lucene/dev/lucene2878}}
>   ** {{lucene/sandbox/luke}}
>   ** {{lucene/solr/nightly}}
> * preserve the history of all changes to core sources (Solr and Lucene).
>   ** {{lucene/java}}
>   ** {{lucene/solr}}
>   ** {{lucene/dev/trunk}}
>   ** {{lucene/dev/branches/branch_3x}}
>   ** {{lucene/dev/branches/branch_4x}}
>   ** {{lucene/dev/branches/branch_5x}}
> * provide a way to link git commits and history with svn revisions (amend the 
> log message).
> * annotate release tags
> * deal with large binary blobs (JARs): keep empty files instead for their 
> historical reference only.
> Non goals:
> * no need to preserve "exact" merge history from SVN (see "impossible" below).
> * Ability to build ancient versions is not an issue.
> Impossible:
> * It is not possible to preserve SVN "merge history" because of the following 
> reasons:
>   ** Each commit in SVN operates on individual files. So one commit can 
> "copy" (and record a merge) files from anywhere in the object tree, even 
> modifying them along the way. There simply is no equivalent for this in git. 
>   ** There are historical commits in SVN that apply changes to multiple 
> branches in one commit ({{r1569975}}) and merges *from* multiple branches in 
> one commit ({{r940806}}).
> * Because exact merge tracking is impossible, it follows that an exact 
> "linearized" history of a given file is also impossible to record. Let's say 
> changes X, Y and Z have been applied to a branch of a file A and then merged 
> back. In git, this would be reflected as a single commit flattening X, Y and 
> Z (on the target branch) and three independent commits on the branch. The 
> "copy-from" link from one branch to another cannot be represented because, as 
> mentioned, merges are done on entire branches in git, not on individual 
> files. Yes, there are commits in SVN history that have selective file merges 
> (not entire branches).



