Re: Moving to git?

2015-06-01 Thread Ramkumar R. Aiyengar
 There is only one good rule though - no merge commits in the history :)
Ever. Do whatever you want beyond that. A clean, simple history for each
branch is the only sensible use of Git I've seen.

+1


 - Mark


 On Sat, May 30, 2015 at 9:00 AM Adrien Grand jpou...@gmail.com wrote:

 The main benefit I see is that external contributors would get their
 name in the commit log.

 However on the other hand, I'm a bit annoyed that people easily
 disagree on the workflow: some people merge into the maintenance
 branch first and then to master, other people merge into master first
 and then cherry-pick, other people prefer rebasing instead of merging,
 etc. I personally don't really care but if we agree on moving to Git,
 I hope we can agree on the workflow at the same time. At least today
 with svn we have something simple that everybody agrees on.

 -0: I'm not against it but Subversion works well for me today. If
 everybody else agrees on switching to Git I would like us to agree on
 the workflow as well.

 --
 Adrien

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org

 --
 - Mark
 about.me/markrmiller


Re: Moving to git?

2015-06-01 Thread Mark Miller
bq. That is my workflow, get over it.

Done.

bq. It's not something we vote about. It's just like the editor I choose to
use.

I care 0 about your vote or opinion on anything. Bad community member.

- Mark


On Mon, Jun 1, 2015 at 4:57 AM Robert Muir rcm...@gmail.com wrote:

 I use merge actually. It's just fine. That is my workflow, get over it.

 It's not something we vote about. It's just like the editor I choose to use.

 On Mon, Jun 1, 2015 at 2:37 AM, Ramkumar R. Aiyengar
 andyetitmo...@gmail.com wrote:
  There is only one good rule though - no merge commits in the history :)
  Ever. Do whatever you want beyond that. A clean, simple history for each
  branch is the only sensible use of Git I've seen.
 
  +1
 
 
  - Mark
 
 
  On Sat, May 30, 2015 at 9:00 AM Adrien Grand jpou...@gmail.com wrote:
 
  The main benefit I see is that external contributors would get their
  name in the commit log.
 
  However on the other hand, I'm a bit annoyed that people easily
  disagree on the workflow: some people merge into the maintenance
  branch first and then to master, other people merge into master first
  and then cherry-pick, other people prefer rebasing instead of merging,
  etc. I personally don't really care but if we agree on moving to Git,
  I hope we can agree on the workflow at the same time. At least today
  with svn we have something simple that everybody agrees on.
 
  -0: I'm not against it but Subversion works well for me today. If
  everybody else agrees on switching to Git I would like us to agree on
  the workflow as well.
 
  --
  Adrien
 
 
  --
  - Mark
  about.me/markrmiller


 --
- Mark
about.me/markrmiller


Re: Moving to git?

2015-06-01 Thread Robert Muir
I use merge actually. It's just fine. That is my workflow, get over it.

It's not something we vote about. It's just like the editor I choose to use.

On Mon, Jun 1, 2015 at 2:37 AM, Ramkumar R. Aiyengar
andyetitmo...@gmail.com wrote:
 There is only one good rule though - no merge commits in the history :)
 Ever. Do whatever you want beyond that. A clean, simple history for each
 branch is the only sensible use of Git I've seen.

 +1


 - Mark


 On Sat, May 30, 2015 at 9:00 AM Adrien Grand jpou...@gmail.com wrote:

 The main benefit I see is that external contributors would get their
 name in the commit log.

 However on the other hand, I'm a bit annoyed that people easily
 disagree on the workflow: some people merge into the maintenance
 branch first and then to master, other people merge into master first
 and then cherry-pick, other people prefer rebasing instead of merging,
 etc. I personally don't really care but if we agree on moving to Git,
 I hope we can agree on the workflow at the same time. At least today
 with svn we have something simple that everybody agrees on.

 -0: I'm not against it but Subversion works well for me today. If
 everybody else agrees on switching to Git I would like us to agree on
 the workflow as well.

 --
 Adrien


 --
 - Mark
 about.me/markrmiller




Re: Moving to git?

2015-06-01 Thread Gus Heck
I've honestly never understood the perspective of eliminating merge commits
(though I've had to work with it, and the rebasing required got me into
some of the worst git snafus I've ever been in). Merges are history too.
Why would anyone want to lose the information that code was merged from a
branch? For example, if a problem was introduced when code lines were
merged, that's useful info about when/how it happened and where more
attention needs to be focused.
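
One way to see what a merge commit records is to try it in a throwaway
repository. This is a minimal sketch, not anything the project uses: the
scratch directory, the `feature` branch, `file.txt` and the example identity
are all made up for illustration. `--no-ff` forces a merge commit even where
a fast-forward would have been possible.

```shell
set -eu
# Scratch repository (hypothetical names throughout).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "dev@example.invalid"
git config user.name "Example Dev"

echo base > file.txt
git add file.txt
git commit -q -m "Base commit"
main_branch=$(git symbolic-ref --short HEAD)

# Do some work on a side branch.
git checkout -q -b feature
echo feature >> file.txt
git commit -q -am "Feature work"

# Merge it back, keeping an explicit merge commit in the history.
git checkout -q "$main_branch"
git merge -q --no-ff -m "Merge branch 'feature'" feature

# Merge commits can be queried later to see when code lines came together.
merge_count=$(git rev-list --merges --count HEAD)
merge_subject=$(git log --merges --format=%s -1)
echo "merge commits: $merge_count ($merge_subject)"
```

In a rebase-only history the same `git rev-list --merges --count HEAD` query
reports 0, because no commit has more than one parent.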

Not saying it's wrong, just saying I don't understand it...

I like git and use it where I can, but as was noted earlier it will
probably be necessary for the project to establish the way it wishes to use
it, or it will likely create significant chaos as one person tries to eliminate
merges in the history and another person preserves them; one person forks
and makes pull requests while another commits directly... who reviews the
pull request... Do committers use pull requests, or only non-committers?

Food for thought:
https://www.atlassian.com/git/tutorials/comparing-workflows/

I don't use the GitHub repo for Solr when I build it from source right
now, because it seems to be a secondary add-on and I always favor the
canonical source: the last thing I want is to deal with an extra
layer and to figure out where the pitfalls in the translation between layers
might be.

My $0.02,
Gus


On Mon, Jun 1, 2015 at 2:37 AM, Ramkumar R. Aiyengar 
andyetitmo...@gmail.com wrote:

  There is only one good rule though - no merge commits in the history :)
 Ever. Do whatever you want beyond that. A clean, simple history for each
 branch is the only sensible use of Git I've seen.

 +1

 
  - Mark
 
 
  On Sat, May 30, 2015 at 9:00 AM Adrien Grand jpou...@gmail.com wrote:
 
  The main benefit I see is that external contributors would get their
  name in the commit log.
 
  However on the other hand, I'm a bit annoyed that people easily
  disagree on the workflow: some people merge into the maintenance
  branch first and then to master, other people merge into master first
  and then cherry-pick, other people prefer rebasing instead of merging,
  etc. I personally don't really care but if we agree on moving to Git,
  I hope we can agree on the workflow at the same time. At least today
  with svn we have something simple that everybody agrees on.
 
  -0: I'm not against it but Subversion works well for me today. If
  everybody else agrees on switching to Git I would like us to agree on
  the workflow as well.
 
  --
  Adrien
 
 
  --
  - Mark
  about.me/markrmiller




-- 
http://www.the111shift.com


Re: Moving to git?

2015-05-31 Thread david.w.smi...@gmail.com
Nice!

On Sun, May 31, 2015 at 1:31 PM Steve Davids sdav...@gmail.com wrote:

 bq. Something needs to be done about all those jars in the source
 history, I will not let this go.

 I went ahead and used the BFG Repo Cleaner
 https://rtyley.github.io/bfg-repo-cleaner/ tool to drop all of the old
 jars in the git history, here are the findings:

 $ git clone --mirror https://github.com/apache/lucene-solr.git \
     lucene-solr-mirror

  489M lucene-solr-mirror

 $ java -jar ~/Downloads/bfg-1.12.3.jar --delete-files *.jar \
     --protect-blobs-from trunk,branch_5x,branch_4x lucene-solr-mirror
 $ cd lucene-solr-mirror
 $ git reflog expire --expire=now --all && git gc --prune=now --aggressive

  182M lucene-solr-mirror

 $ cat lucene-solr-mirror.bfg-report/2015-05-31/10-16-36/deleted-files.txt
 af4eed0506b53f17a4d22e4f1630ee03cb7991e5 177868 Tidy.jar
 53f82a1c4c492dc810c27317857bbb02afd6fa58 62983 activation-1.1.jar
 3beb3b802ffd7502ac4b4d47e0b2a75d08e30cc3 1034049 ant-1.6.5.jar
 704717779f6d0d7eb026dc7af78a35e51adeec8b 1323005 ant-1.7.1.jar
 7f5be4a4e05939429353a90e882846aeac72b976 1933743 ant-1.8.2.jar
 063cce4f940033fa6e33d3e590cf6f5051129295 93518 ant-junit-1.7.1.jar
 704717779f6d0d7eb026dc7af78a35e51adeec8b 1323005 apache-ant-1.7.1.jar
 063cce4f940033fa6e33d3e590cf6f5051129295 93518 apache-ant-junit-1.7.1.jar
 e3c62523fb93b5e2f73365e6cee0d0bc68e48556 95511 apache-mime4j-core-0.7.jar
 1f7bf1ea13697ca0243d399ca6e5d864dd8bec0b 300168 apache-mime4j-dom-0.7.jar
 bab8b31fb99256e13fc6010701db560243c47fa7 26027 apache-solr-commons-csv-1.0-SNAPSHOT-r966014.jar
 5c4007c7e74af85d823243153d308f80e084eff0 22478 apache-solr-noggit-r1099557.jar
 f59a39b011591edafc7955e97ae0d195fdf8b42e 22376 apache-solr-noggit-r1209632.jar
 2a07c61d9ecb9683a135b7847682e7c36f19bbfe 22770 apache-solr-noggit-r1211150.jar
 30be80e0b838a9c1445936b6966ccfc7ff165ae5 36776 apache-solr-noggit-r730138.jar
 97d779912d38d2524a0e20efa849a4b6f01a4b46 21229 apache-solr-noggit-r730138.jar
 a798b805d0ce92606697cc1b2aac42bf416076e3 37259 apache-solr-noggit-r944541.jar
 9b434f5760dd0d78350bdf8237273c0d5db0174e 21240 apache-solr-noggit-r944541.jar
 8217cae0a1bc977b241e0c8517cc2e3e7cede276 43033 asm-3.1.jar
 4133d823d96bf3fc26d3a9754375dcc30d8da416 342664 asm-debug-all-4.1.jar
 f66e9a8b9868226121961c13e6a32a55d0b2f78a 229116 bcmail-jdk15-1.45.jar
 409070b0370a95c14ed4357261afb96b91d10e86 1663318 bcprov-jdk15-1.45.jar
 b64b033af70609338c07e2a88a5f7efcd1a84ddb 92027 boilerpipe-1.1.0.jar
 96c3bdbdaacd5289b0e654842e435689fbcf22e2 679423 carrot2-core-3.4.0.jar
 043c0cb889aea066f7d4126af029d00a0bcd9e81 655412 carrot2-core-3.4.0.jar
 f872cbc8eec94f7d5b29a73f99cd13089848a3cd 933657 carrot2-core-3.4.2.jar
 ce2d3bf9c28a4ff696d66a82334d15fd0161e890 995243 carrot2-core-3.4.2.jar
 be94db93d41bd4ba53b650d421cfa5fb0519b9af 958799 carrot2-core-3.5.0.1.jar
 adc127c48137d03e252f526de84a07c8d6bda521 979186 carrot2-core-3.5.0.jar
 ab44cf9314b1efff393e05f9c938446887d3570e 981085 carrot2-core-3.5.0.jar
 5ca86c5e72b2953feb0b58fbd87f76d0301cbbf6 517641 carrot2-mini-3.1.0.jar
 b1b89c9c921f16af22a88db3ff28975a8e40d886 188671 commons-beanutils-1.7.0.jar
 e633afbe6842aa92b1a8f0ff3f5b8c0e3283961b 36174 commons-cli-1.1.jar
 957b6752af9a60c1bb2a4f65db0e90e5ce00f521 46725 commons-codec-1.3.jar
 458d432da88b0efeab640c229903fb5aad274044 58160 commons-codec-1.4.jar
 e9013fed78f333c928ff7f828948b91fcb5a92b4 73098 commons-codec-1.5.jar
 ee1bc49acae11cc79eceec51f7be785590e99fd8 232771 commons-codec-1.6.jar
 41e230feeaa53618b6ac5f8d11792c2eecf4d4fd 559366 commons-collections-3.1.jar
 c35fa1fee145cba638884e41b80a401cbe4924ef 575389 commons-collections-3.2.1.jar
 78d832c11c42023d4bc12077a1d9b7b5025217bc 143847 commons-compress-1.0.jar
 51baf91a2df10184a8cca5cb43f11418576743a1 161361 commons-compress-1.1.jar
 61753909c3f32306bf60d09e5345d47058ba2122 168596 commons-compress-1.2.jar
 6c826c528b60bb1b25e9053b7f4c920292f6c343 224548 commons-compress-1.3.jar
 f80348dfa0b59f0840c25d1b8c25d1490d1eaf51 22017 commons-csv-1.0-SNAPSHOT-r609327.jar
 8439e6f1a8b1d82943f84688b8086869255eda86 27361 commons-csv-1.0-SNAPSHOT-r966014.jar
 1783dbea232ced6db122268f8faa5ce773c7ea42 139966 commons-digester-1.7.jar
 9c8bd13a2002a9ff5b35b873b9f111d5281ad201 148783 commons-digester-2.0.jar
 aa209b3887c90933cdc58c8c8572e90435e8e48d 57779 commons-fileupload-1.2.1.jar
 7c59774aed4f5dd08778489aaad565690ff7c132 305001 commons-httpclient-3.1.jar
 133dc6cb35f5ca2c5920fd0933a557c2def88680 109043 commons-io-1.4.jar
 b5c7d692fe5616af4332c1a1db6efd23e3ff881b 163151 commons-io-2.1.jar
 ce0ca22c8d29a9be736d775fe50bfdc6ce770186 257923 commons-lang-2.4.jar
 532939ecab6b77ccb77af3635c55ff9752b70ab7 261809 commons-lang-2.4.jar
 98467d3a653ebad776ffa3542efeb9732fe0b482 284220 commons-lang-2.6.jar
 b73a80fab641131e6fbe3ae833549efb3c540d17 38015 commons-logging-1.0.4.jar
 1deef144cb17ed2c11c6cdcdcb2d9530fa8d0b47 60686 commons-logging-1.1.1.jar
 ae0b63586701efdc7bf03ffb0a840d50950d211c 3566844 core-3.1.1.jar
 

Re: Moving to git?

2015-05-31 Thread Dawid Weiss
 You guys totally miss the point on clone.

No, I think you miss our point.

 The thing is that svn checkout gives you enough, to do what you need
 to do.

git clone --depth 1 does as well -- you work on your stuff, then you
diff against the baseline, submit a patch. Like I said -- we differ in
opinions on what's easier to do. For me diffing against trunk with svn
is *terribly* slow and backporting to any other branch is an annoying
manual and tedious process.
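
The shallow-clone workflow described above can be sketched roughly as
follows. To keep the sketch runnable offline it first builds a small local
stand-in repository; the directory names, file contents and identity are
assumptions made for illustration, and a real checkout would point
`git clone --depth 1` at the project's repository URL instead.

```shell
set -eu
# Hypothetical stand-in for the upstream repository.
work=$(mktemp -d)
git init -q "$work/upstream"
cd "$work/upstream"
git config user.email "dev@example.invalid"
git config user.name "Example Dev"
echo "first" > README.txt
git add README.txt
git commit -q -m "Initial commit"
echo "second" >> README.txt
git commit -q -am "Second commit"

# Shallow clone: only the most recent commit is fetched. The file:// URL is
# needed for --depth to take effect when cloning from a local path.
git clone -q --depth 1 "file://$work/upstream" "$work/checkout"
cd "$work/checkout"

# Work on your stuff, then diff against the baseline to produce a patch.
echo "my change" >> README.txt
git diff > "$work/my-change.patch"

commit_count=$(git rev-list --count HEAD)
patch_lines=$(grep -c '^+my change' "$work/my-change.patch")
echo "commits in shallow clone: $commit_count, added lines in patch: $patch_lines"
```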

D.




Re: Moving to git?

2015-05-31 Thread Robert Muir
On Sun, May 31, 2015 at 2:32 PM, Dawid Weiss
dawid.we...@cs.put.poznan.pl wrote:
 Yeah, but it misses the point -- history is history, if there were
 jars in it, you shouldn't just strip them, it'd be confusing.

 How was it back when Lucene was merging with Solr? Didn't it just
 initiate with a new clean repo? Maybe not all of the history is really
 needed -- if we limited ourselves to, say, all of the history that
 includes ivy then the size of the repo would drop significantly... but
 again, to me size doesn't really matter at all; one initial clone is
 no-cost. Go make yourself a cup of tea, come back and you're set.

It seems like we can do something reasonable here either way.  We are
talking about a lot of jars.

But I would love to see this kinda stuff (what history will be
imported/preserved elsewhere) as part of the proposal, that is all.
Making the slowest operation of git (which is turtle slow) more
reasonable can go a long way to win over people, like me that are more
on the -0 side. I re-clone from time to time, maybe someone else will
just keep their old workflow and use 5 checkouts or whatever they
want.

So yeah, I think the size of the jars is very relevant. All these
silly jars are maybe even the root cause of the huge repository /
git mirroring issue that spawned this thread.

And while it might not be relevant to your workflow, try to imagine
that other people have different workflows of their own, and are just
fine with that.




Re: Moving to git?

2015-05-31 Thread Robert Muir
I totally agree Doug. Losing the jars would have a cost: those old
branches wouldn't work out of the box if you wanted to run tests on
them.

But I am not sure how bad that cost really is. It might be zero. I
haven't tried to run e.g. Lucene 2.x tests with a modern Java 7 or Java
8, but I bet they probably do not work due to things like hashmap
failures. And I think Solr before 4.0 will not even compile, because
of things like wildcard import + base64 clashes.

So if I had my preference, we'd import all history as much as we can,
and nuke the silly jars. And I'd like that sourceforge history there
too if we can get it, but I don't know if it is really legal.

The sourceforge CVS works, see IndexWriter:
http://lucene.cvs.sourceforge.net/viewvc/lucene/lucene/com/lucene/index/IndexWriter.java?view=log


On Sun, May 31, 2015 at 3:10 PM, Doug Turnbull
dturnb...@opensourceconnections.com wrote:
 I have no dog in the svn vs git debate honestly.

 I want to say how important it is to keep healthy history. I recently went
 on a bit of a code archeology dig to figure out why something in
 Lucene was done the way it was. It was handy that the history went as far
 back as it did, but I had to switch around to different places to continue
 the history. For example, the abrupt shift that seems to be around when
 Solr/Lucene were put together had me digging for the last pure Lucene tag.
 It's over at lucene/java/branches NOT lucene/dev/tags with the other tags.

 Then when you get to the branch for lucene-101, the first commit is:
 2001: New repository initialized by cvs2svn.

 Unable to find a cvs repo, my hunt stopped (love to hear if anyone has a CVS
 repo -- maybe from Jakarta?)

 So removing some jars isn't a big deal. But cutting off history and
 restarting at some arbitrary point can be annoying and make it harder to dig
 up more about why things are the way they are.

 /steps down from soapbox
 -Doug



 On Sunday, May 31, 2015, Dawid Weiss dawid.we...@cs.put.poznan.pl wrote:

 Yeah, but it misses the point -- history is history, if there were
 jars in it, you shouldn't just strip them, it'd be confusing.

 How was it back when Lucene was merging with Solr? Didn't it just
 initiate with a new clean repo? Maybe not all of the history is really
 needed -- if we limited ourselves to, say, all of the history that
 includes ivy then the size of the repo would drop significantly... but
 again, to me size doesn't really matter at all; one initial clone is
 no-cost. Go make yourself a cup of tea, come back and you're set.

 Dawid







Re: Moving to git?

2015-05-31 Thread Dawid Weiss
 I'd like to have full consolidated history, as much as possible,
 connect-the-dots across whatever CVS/SVN/etc repos to the extent
 maximally permitted by law, as Doug hints at. Just nuke the jars.

I've done this (CVS-SVN-GIT) before. It wasn't that difficult.
Eventually (for git) you script it and it gets version after version
from CVS or SVN and appends it to git. I admit I didn't care much
about svn merge info though. Any files can be removed/pruned by
rewriting git trees before they're published.
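
As a rough illustration of pruning files by rewriting history before
publishing, the sketch below uses `git filter-branch` with an index filter.
That choice of tool is an assumption (it is not necessarily what was used
for the conversions mentioned above), and the scratch repository,
`lib/example.jar` and the identity are made up.

```shell
set -eu
# Newer git versions warn about filter-branch and pause; silence that here.
export FILTER_BRANCH_SQUELCH_WARNING=1

# Scratch repository with a bundled jar in its history (hypothetical names).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "dev@example.invalid"
git config user.name "Example Dev"
mkdir lib src
echo "binary stand-in" > lib/example.jar
echo "class A {}" > src/A.java
git add lib src
git commit -q -m "Add source and a bundled jar"
echo "class B {}" > src/B.java
git add src/B.java
git commit -q -m "Add more source"

# Rewrite every commit, dropping *.jar from each tree along the way.
git filter-branch -f --prune-empty \
  --index-filter 'git rm -q --cached --ignore-unmatch -- "*.jar"' \
  -- --all >/dev/null 2>&1

# Count jar paths still reachable from the rewritten branch.
jar_paths=$(git rev-list --objects HEAD | grep -c '\.jar$' || true)
commit_count=$(git rev-list --count HEAD)
echo "jar paths left in history: $jar_paths, commits: $commit_count"
```

Note that `filter-branch` keeps the pre-rewrite commits under
`refs/original/`, so the old objects stay in the local repository until
those refs are removed and the repository is garbage-collected.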

Dawid




Re: Moving to git?

2015-05-31 Thread Steve Davids
There are also some rather large '.dat' files in the history. I
found this by running a job to delete all blobs > 5MB from the history
via:

$ java -jar ~/Downloads/bfg-1.12.3.jar --strip-blobs-bigger-than 5M \
    --protect-blobs-from trunk,branch_5x,branch_4x lucene-solr-mirror

 Deleted files
 -------------
 Filename                       | Git id
 -------------------------------+----------------------------------------------
 DoubleArrayTrie.dat            | 8babf9fa (16.8 MB), f3bfe15b (16.8 MB), ...
 TokenInfoDictionary$buffer.dat | 25938b37 (7.0 MB), 7f02420f (7.1 MB), ...
 TokenInfoDictionary$trie.dat   | 69e76d64 (16.8 MB)
 dat.dat                        | 7445d1c8 (16.0 MB), 79bd7c8b (16.8 MB), 37a215e5 (16.8 MB)
 europarl.lines.txt.gz          | e0366f10 (5.5 MB)
 tid.dat                        | 5a1e6199 (24.9 MB), 996d3fc5 (28.1 MB), ...
 tid_map.dat                    | 690fbea5 (6.3 MB), c1c01405 (6.3 MB), 7a8c1420 (6.4 MB)
 wiki_results.txt               | db9e9294 (19.8 MB), 52ff9357 (19.8 MB), ...
 wiki_sentence.txt              | 3a38f62e (19.0 MB)

Dropping just those files reduced the repo by 50M; the overall size is now 131MB.
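
For anyone who wants to look for large blobs without rewriting anything,
plain git can list them. The following is a minimal sketch on a throwaway
repository; the file names, the 10000-byte threshold and the identity are
made up for illustration, and the `awk` step assumes paths without spaces.

```shell
set -eu
# Scratch repository containing one large file that is later deleted.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "dev@example.invalid"
git config user.name "Example Dev"
printf 'small\n' > notes.txt
head -c 20000 /dev/zero > big.dat
git add notes.txt big.dat
git commit -q -m "Add files"
git rm -q big.dat
git commit -q -m "Remove big file from the tip"

# List every blob in history above the threshold, largest first.
# The deleted file still shows up because old commits reference it.
threshold=10000
large_blobs=$(
  git rev-list --objects --all |
  git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
  awk -v min="$threshold" '$1 == "blob" && $2 > min { print $2, $3 }' |
  sort -rn
)
echo "$large_blobs"
```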

Note: there is one large file > 5MB still in the trunk:

 * commit df1e3b32 (protected by 'trunk') - contains 1 dirty file :
   - lucene/test-framework/src/resources/org/apache/lucene/util/europarl.lines.txt.gz (5.5 MB)


Also, I failed to provide the numbers for what `git reflog expire
--expire=now --all && git gc --prune=now --aggressive` does on a fresh mirror
checkout: it results in a repo size of 320M. So, dropping the old jars
saves 120MB.

-Steve

On Sun, May 31, 2015 at 4:39 PM, david.w.smi...@gmail.com 
david.w.smi...@gmail.com wrote:

 I like where this is going!

 I also think history of source code is very important, but not history of
 ‘.jar’ files that shouldn’t have been in source control in the first
 place.  I’m fiercely negative about large binaries or ‘jar’ files that can
 be downloaded by the build system (e.g. ivy) in source control.  And it was
 already mentioned a full history (.jar’s & all) could be kept somewhere
 more for archival purposes — which is a good compromise, I think, since
 “build-ability” of history should be retained (assuming it’s even still
 possible, given Rob’s comments) but doesn’t have to be convenient (e.g. by
 it being in a separate repo).   +1 to that!

 If we were to come up with a new git repo that doesn’t have the ‘.jar’s,
 it’d be good to also streamline the history prior to the big Lucene + Solr
 merge due to the paths in source control as to where the trunk, branches,
 and tags lived.  It appears the current repo may have been a blind git
 import from subversion.  A hand-done process that is mindful of these
 things would result in a nice history.  I’ve done this sorta thing once (a
 project at my last job) and volunteer to do it here if we can get consensus
 on a move to git.

 ~ David

 On Sun, May 31, 2015 at 4:21 PM Dawid Weiss dawid.we...@cs.put.poznan.pl
 wrote:

  I'd like to have full consolidated history, as much as possible,
  connect-the-dots across whatever CVS/SVN/etc repos to the extent
  maximally permitted by law, as Doug hints at. Just nuke the jars.

 I've done this (CVS-SVN-GIT) before. It wasn't that difficult.
 Eventually (for git) you script it and it gets version after version
 from CVS or SVN and appends it to git. I admit I didn't care much
 about svn merge info though. Any files can be removed/pruned by
 rewriting git trees before they're published.

 Dawid





Re: Moving to git?

2015-05-31 Thread Doug Turnbull
You just made my day with that CVS repo! :)

Though I don't really get a vote -- +1 to your plan Robert.

/polishes history degree
-Doug

On Sun, May 31, 2015 at 3:16 PM, Robert Muir rcm...@gmail.com wrote:

 I totally agree Doug. Losing the jars would have a cost: those old
 branches wouldn't work out of the box if you wanted to run tests on
 them.

 But I am not sure how bad that cost really is. It might be zero. I
 haven't tried to run e.g. Lucene 2.x tests with a modern Java 7 or Java
 8, but I bet they probably do not work due to things like hashmap
 failures. And I think Solr before 4.0 will not even compile, because
 of things like wildcard import + base64 clashes.

 So if I had my preference, we'd import all history as much as we can,
 and nuke the silly jars. And I'd like that sourceforge history there
 too if we can get it, but I don't know if it is really legal.

 The sourceforge CVS works, see IndexWriter:

 http://lucene.cvs.sourceforge.net/viewvc/lucene/lucene/com/lucene/index/IndexWriter.java?view=log


 On Sun, May 31, 2015 at 3:10 PM, Doug Turnbull
 dturnb...@opensourceconnections.com wrote:
  I have no dog in the svn vs git debate honestly.
 
  I want to say how important it is to keep healthy history. I recently went
  on a bit of a code archeology dig to figure out why something in
  Lucene was done the way it was. It was handy that the history went as far
  back as it did, but I had to switch around to different places to continue
  the history. For example, the abrupt shift that seems to be around when
  Solr/Lucene were put together had me digging for the last pure Lucene tag.
  It's over at lucene/java/branches NOT lucene/dev/tags with the other tags.
 
  Then when you get to the branch for lucene-101, the first commit is:
  2001: New repository initialized by cvs2svn.
 
  Unable to find a cvs repo, my hunt stopped (love to hear if anyone has a
  CVS repo -- maybe from Jakarta?)
 
  So removing some jars isn't a big deal. But cutting off history and
  restarting at some arbitrary point can be annoying and make it harder to
  dig up more about why things are the way they are.
 
  /steps down from soapbox
  -Doug
 
 
 
  On Sunday, May 31, 2015, Dawid Weiss dawid.we...@cs.put.poznan.pl
 wrote:
 
  Yeah, but it misses the point -- history is history, if there were
  jars in it, you shouldn't just strip them, it'd be confusing.
 
  How was it back when Lucene was merging with Solr? Didn't it just
  initiate with a new clean repo? Maybe not all of the history is really
  needed -- if we limited ourselves to, say, all of the history that
  includes ivy then the size of the repo would drop significantly... but
  again, to me size doesn't really matter at all; one initial clone is
  no-cost. Go make yourself a cup of tea, come back and you're set.
 
  Dawid
 
 
 





-- 
*Doug Turnbull **| *Search Relevance Consultant | OpenSource Connections,
LLC | 240.476.9983 | http://www.opensourceconnections.com
Author: Relevant Search http://manning.com/turnbull from Manning
Publications
This e-mail and all contents, including attachments, is considered to be
Company Confidential unless explicitly stated otherwise, regardless
of whether attachments are marked as such.


Re: Moving to git?

2015-05-31 Thread Dawid Weiss
 Losing the jars would have a cost: those old
 branches wouldn't work out of box if you wanted to run tests on

Yeah, I'd rather not have them at all than have them filtered and
crippled. It'll be confusing.

There's nothing wrong in preserving the SVN history (or even a full
git import from SVN, but in a separate repo) for archival reasons and
just starting a new repo from some point in history (where it makes
sense, for example the still maintained branches).

Dawid




Re: Moving to git?

2015-05-31 Thread Robert Muir
On Sun, May 31, 2015 at 3:53 PM, Dawid Weiss
dawid.we...@cs.put.poznan.pl wrote:
 Losing the jars would have a cost: those old
 branches wouldn't work out of box if you wanted to run tests on

 Yeah, I'd rather not have them at all than have them filtered and
 crippled. It'll be confusing.


But my argument is that they are already crippled. So what is the
purpose of keeping the jars?

I'd like to have full consolidated history, as much as possible,
connect-the-dots across whatever CVS/SVN/etc repos to the extent
maximally permitted by law, as Doug hints at. Just nuke the jars.

Propose this and I will use any versioning system you would like on top of it!




Re: Moving to git?

2015-05-31 Thread Steve Davids
bq. Something needs to be done about all those jars in the source history, I
will not let this go.

I went ahead and used the BFG Repo Cleaner
https://rtyley.github.io/bfg-repo-cleaner/ tool to drop all of the old
jars in the git history, here are the findings:

$ git clone --mirror https://github.com/apache/lucene-solr.git \
    lucene-solr-mirror

 489M lucene-solr-mirror

$ java -jar ~/Downloads/bfg-1.12.3.jar --delete-files *.jar \
    --protect-blobs-from trunk,branch_5x,branch_4x lucene-solr-mirror
$ cd lucene-solr-mirror
$ git reflog expire --expire=now --all && git gc --prune=now --aggressive

 182M lucene-solr-mirror

$ cat lucene-solr-mirror.bfg-report/2015-05-31/10-16-36/deleted-files.txt
af4eed0506b53f17a4d22e4f1630ee03cb7991e5 177868 Tidy.jar
53f82a1c4c492dc810c27317857bbb02afd6fa58 62983 activation-1.1.jar
3beb3b802ffd7502ac4b4d47e0b2a75d08e30cc3 1034049 ant-1.6.5.jar
704717779f6d0d7eb026dc7af78a35e51adeec8b 1323005 ant-1.7.1.jar
7f5be4a4e05939429353a90e882846aeac72b976 1933743 ant-1.8.2.jar
063cce4f940033fa6e33d3e590cf6f5051129295 93518 ant-junit-1.7.1.jar
704717779f6d0d7eb026dc7af78a35e51adeec8b 1323005 apache-ant-1.7.1.jar
063cce4f940033fa6e33d3e590cf6f5051129295 93518 apache-ant-junit-1.7.1.jar
e3c62523fb93b5e2f73365e6cee0d0bc68e48556 95511 apache-mime4j-core-0.7.jar
1f7bf1ea13697ca0243d399ca6e5d864dd8bec0b 300168 apache-mime4j-dom-0.7.jar
bab8b31fb99256e13fc6010701db560243c47fa7 26027 apache-solr-commons-csv-1.0-SNAPSHOT-r966014.jar
5c4007c7e74af85d823243153d308f80e084eff0 22478 apache-solr-noggit-r1099557.jar
f59a39b011591edafc7955e97ae0d195fdf8b42e 22376 apache-solr-noggit-r1209632.jar
2a07c61d9ecb9683a135b7847682e7c36f19bbfe 22770 apache-solr-noggit-r1211150.jar
30be80e0b838a9c1445936b6966ccfc7ff165ae5 36776 apache-solr-noggit-r730138.jar
97d779912d38d2524a0e20efa849a4b6f01a4b46 21229 apache-solr-noggit-r730138.jar
a798b805d0ce92606697cc1b2aac42bf416076e3 37259 apache-solr-noggit-r944541.jar
9b434f5760dd0d78350bdf8237273c0d5db0174e 21240 apache-solr-noggit-r944541.jar
8217cae0a1bc977b241e0c8517cc2e3e7cede276 43033 asm-3.1.jar
4133d823d96bf3fc26d3a9754375dcc30d8da416 342664 asm-debug-all-4.1.jar
f66e9a8b9868226121961c13e6a32a55d0b2f78a 229116 bcmail-jdk15-1.45.jar
409070b0370a95c14ed4357261afb96b91d10e86 1663318 bcprov-jdk15-1.45.jar
b64b033af70609338c07e2a88a5f7efcd1a84ddb 92027 boilerpipe-1.1.0.jar
96c3bdbdaacd5289b0e654842e435689fbcf22e2 679423 carrot2-core-3.4.0.jar
043c0cb889aea066f7d4126af029d00a0bcd9e81 655412 carrot2-core-3.4.0.jar
f872cbc8eec94f7d5b29a73f99cd13089848a3cd 933657 carrot2-core-3.4.2.jar
ce2d3bf9c28a4ff696d66a82334d15fd0161e890 995243 carrot2-core-3.4.2.jar
be94db93d41bd4ba53b650d421cfa5fb0519b9af 958799 carrot2-core-3.5.0.1.jar
adc127c48137d03e252f526de84a07c8d6bda521 979186 carrot2-core-3.5.0.jar
ab44cf9314b1efff393e05f9c938446887d3570e 981085 carrot2-core-3.5.0.jar
5ca86c5e72b2953feb0b58fbd87f76d0301cbbf6 517641 carrot2-mini-3.1.0.jar
b1b89c9c921f16af22a88db3ff28975a8e40d886 188671 commons-beanutils-1.7.0.jar
e633afbe6842aa92b1a8f0ff3f5b8c0e3283961b 36174 commons-cli-1.1.jar
957b6752af9a60c1bb2a4f65db0e90e5ce00f521 46725 commons-codec-1.3.jar
458d432da88b0efeab640c229903fb5aad274044 58160 commons-codec-1.4.jar
e9013fed78f333c928ff7f828948b91fcb5a92b4 73098 commons-codec-1.5.jar
ee1bc49acae11cc79eceec51f7be785590e99fd8 232771 commons-codec-1.6.jar
41e230feeaa53618b6ac5f8d11792c2eecf4d4fd 559366 commons-collections-3.1.jar
c35fa1fee145cba638884e41b80a401cbe4924ef 575389 commons-collections-3.2.1.jar
78d832c11c42023d4bc12077a1d9b7b5025217bc 143847 commons-compress-1.0.jar
51baf91a2df10184a8cca5cb43f11418576743a1 161361 commons-compress-1.1.jar
61753909c3f32306bf60d09e5345d47058ba2122 168596 commons-compress-1.2.jar
6c826c528b60bb1b25e9053b7f4c920292f6c343 224548 commons-compress-1.3.jar
f80348dfa0b59f0840c25d1b8c25d1490d1eaf51 22017 commons-csv-1.0-SNAPSHOT-r609327.jar
8439e6f1a8b1d82943f84688b8086869255eda86 27361 commons-csv-1.0-SNAPSHOT-r966014.jar
1783dbea232ced6db122268f8faa5ce773c7ea42 139966 commons-digester-1.7.jar
9c8bd13a2002a9ff5b35b873b9f111d5281ad201 148783 commons-digester-2.0.jar
aa209b3887c90933cdc58c8c8572e90435e8e48d 57779 commons-fileupload-1.2.1.jar
7c59774aed4f5dd08778489aaad565690ff7c132 305001 commons-httpclient-3.1.jar
133dc6cb35f5ca2c5920fd0933a557c2def88680 109043 commons-io-1.4.jar
b5c7d692fe5616af4332c1a1db6efd23e3ff881b 163151 commons-io-2.1.jar
ce0ca22c8d29a9be736d775fe50bfdc6ce770186 257923 commons-lang-2.4.jar
532939ecab6b77ccb77af3635c55ff9752b70ab7 261809 commons-lang-2.4.jar
98467d3a653ebad776ffa3542efeb9732fe0b482 284220 commons-lang-2.6.jar
b73a80fab641131e6fbe3ae833549efb3c540d17 38015 commons-logging-1.0.4.jar
1deef144cb17ed2c11c6cdcdcb2d9530fa8d0b47 60686 commons-logging-1.1.1.jar
ae0b63586701efdc7bf03ffb0a840d50950d211c 3566844 core-3.1.1.jar
b9c8c8a170881dfe9c33adc87c26348904510954 364003 cpptasks-1.0b5.jar
99baf20bacd712cae91dd6e4e1f46224cafa1a37 500676 db-4.7.25.jar
c8c4dbb92d6c23a7fbb2813eb721eb4cce91750c 313898 

Re: Moving to git?

2015-05-31 Thread Dawid Weiss
Yeah, but it misses the point -- history is history, if there were
jars in it, you shouldn't just strip them, it'd be confusing.

How was it back when Lucene was merging with Solr? Didn't it just
initiate with a new clean repo? Maybe not all of the history is really
needed -- if we limited ourselves to, say, all of the history that
includes ivy then the size of the repo would drop significantly... but
again, to me size doesn't really matter at all; one initial clone is
no-cost. Go make yourself a cup of tea, come back and you're set.

Dawid

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Moving to git?

2015-05-31 Thread Mark Miller
In any case, decisions on these types of things are majority of PMC rules.
No one has wanted to call a vote yet. Eventually we will and eventually we
will move to git. I'm still in no hurry.

- mark
On Sun, May 31, 2015 at 9:59 AM Uwe Schindler u...@thetaphi.de wrote:

 I also clone my SVN working copy locally. After that I just switch branch.

 Uwe

 -
 Uwe Schindler
 H.-H.-Meier-Allee 63, D-28213 Bremen
 http://www.thetaphi.de
 eMail: u...@thetaphi.de

  -Original Message-
  From: Yonik Seeley [mailto:ysee...@gmail.com]
  Sent: Sunday, May 31, 2015 3:56 PM
  To: Solr/Lucene Dev
  Subject: Re: Moving to git?
 
  On Sun, May 31, 2015 at 9:31 AM, Ramkumar R. Aiyengar
  andyetitmo...@gmail.com wrote:
   Personally, clone for me is 'rare', I did it once years back, and have
   never done it since. log, diff and others I do on a daily basis.
 
  Yep, I find I need fewer different working directories with git, but
 when I do
  want an additional copy, I just make a local copy of an existing repo
 since it
  has everything.
 
  -Yonik
 
  -
  To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional
  commands, e-mail: dev-h...@lucene.apache.org


 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org

 --
- Mark
about.me/markrmiller


Re: Moving to git?

2015-05-31 Thread david.w.smi...@gmail.com
I like where this is going!

I also think history of source code is very important, but not history of
‘.jar’ files that shouldn’t have been in source control in the first
place.  I’m fiercely negative about large binaries or ‘jar’ files that can
be downloaded by the build system (e.g. ivy) in source control.  And it was
already mentioned that a full history (.jar’s & all) could be kept somewhere
more for archival purposes — which is a good compromise, I think, since
“build-ability” of history should be retained (assuming it’s even still
possible, given Rob’s comments) but doesn’t have to be convenient (e.g. by
it being in a separate repo).   +1 to that!

If we were to come up with a new git repo that doesn’t have the ‘.jar’s,
it’d be good to also streamline the history prior to the big Lucene + Solr
merge due to the paths in source control as to where the trunk, branches,
and tags lived.  It appears the current repo may have been a blind git
import from subversion.  A hand-done process that is mindful of these
things would result in a nice history.  I’ve done this sorta thing once (a
project at my last job) and volunteer to do it here if we can get consensus
on a move to git.

~ David

On Sun, May 31, 2015 at 4:21 PM Dawid Weiss dawid.we...@cs.put.poznan.pl
wrote:

  I'd like to have full consolidated history, as much as possible,
  connect-the-dots across whatever CVS/SVN/etc repos to the extent
  maximally permitted by law, as Doug hints at. Just nuke the jars.

 I've done this (CVS->SVN->GIT) before. It wasn't that difficult.
 Eventually (for git) you script it and it gets version after version
 from CVS or SVN and appends it to git. I admit I didn't care much
 about svn merging infos though. Any files can be removed/ pruned by
 rewriting git trees before they're published.

 Dawid

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




Re: Moving to git?

2015-05-31 Thread Robert Muir
On Sun, May 31, 2015 at 4:39 PM, david.w.smi...@gmail.com
david.w.smi...@gmail.com wrote:

 If we were to come up with a new git repo that doesn’t have the ‘.jar’s,
 it’d be good to also streamline the history prior to the big Lucene + Solr
 merge due to the paths in source control as to where the trunk, branches,
 and tags lived.  It appears the current repo may have been a blind git
 import from subversion.  And hand-done process that is mindful of these
 things would result in a nice history.  I’ve done this sorta thing once (a
 project at my last job) and volunteer to do it here if we can get consensus
 on a move to git.

The current Git history is totally broken. This is a complete
dealbreaker from my perspective, if it's indicative of what svn -> git
conversion will produce.

Look at CheckIndex.java history in git:

https://github.com/apache/lucene-solr/commits/trunk/lucene/core/src/java/org/apache/lucene/index/CheckIndex.java?page=5
It stops at Feb 7, 2012.

In subversion it goes back to 2007, to the original issue where Mike
added CheckIndex:
http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/index/CheckIndex.java?view=log




Re: Moving to git?

2015-05-31 Thread Mark Miller
Having to agree on mechanics certainly is a downside of Git.

There is only one good rule though - no merge commits in the history :)
Ever. Do whatever you want beyond that. A clean, simple history for each
branch is the only sensible use of Git I've seen.

- Mark

On Sat, May 30, 2015 at 9:00 AM Adrien Grand jpou...@gmail.com wrote:

 The main benefit I see is that external contributors would get their
 name in the commit log.

 However on the other hand, I'm a bit annoyed that people easily
 disagree on the workflow: some people merge into the maintenance
 branch first and then to master, other people merge into master first
 and then cherry-pick, other people prefer rebasing instead of merging,
 etc. I personally don't really care but if we agree on moving to Git,
 I hope we can agree on the workflow at the same time. At least today
 with svn we have something simple that everybody agrees on.

 -0: I'm not against it but Subversion works well for me today. If
 everybody else agrees on switching to Git I would like us to agree on
 the workflow as well.

 --
 Adrien

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org

 --
- Mark
about.me/markrmiller


Re: Moving to git?

2015-05-31 Thread Upayavira
Regarding history, if we switch to git, our history will remain in svn,
even if the branches are deleted, the history and old revisions are
still there.

Upayavira


On Sun 2015, at 10:48 PM, Mark Miller wrote:
 Having to agree on mechanics certainly is a downside of Git.

 There is only one good rule though - no merge commits in the history
 :) Ever. Do whatever you want beyond that. A clean, simple history for
 each branch is the only sensible use of Git I've seen.

 - Mark

 On Sat, May 30, 2015 at 9:00 AM Adrien Grand
 jpou...@gmail.com wrote:
 The main benefit I see is that external contributors would get their
 name in the commit log.

 However on the other hand, I'm a bit annoyed that people easily
 disagree on the workflow: some people merge into the maintenance
 branch first and then to master, other people merge into master first
 and then cherry-pick, other people prefer rebasing instead of merging,
 etc. I personally don't really care but if we agree on moving to Git,
 I hope we can agree on the workflow at the same time. At least today
 with svn we have something simple that everybody agrees on.

 -0: I'm not against it but Subversion works well for me today. If
 everybody else agrees on switching to Git I would like us to agree on
 the workflow as well.

 --
 Adrien

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org

 --
 - Mark about.me/markrmiller



Re: Moving to git?

2015-05-31 Thread david.w.smi...@gmail.com
Hmmm.  I pulled up this file in IntelliJ in my git checkout and viewed the
history.  It went back to March 17th 2010 (earlier than the 2012 you found)
with git hash 3ee0ace1ba6b9bff3ffaa278c0bba07e6064057d with a commit
message of:
git-svn-id:
https://svn.apache.org/repos/asf/lucene/solr/branches/newtrunk@924483
13f79535-47bb-0310-9956-ffa450edef68
All files were added in that commit; it's the earliest commit in this git
repo.  This is the kind of thing I should be able to fix if I build a repo
manually.

Side note: I used to be able to see the commands IntelliJ gave to git, but
I don’t see them in the latest EAP anyways.  I was wondering if it passed
an option to git log like --find-renames=40% to be more aggressive in its
rename detection.
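
For anyone who wants to try the same check from the command line rather than
through IntelliJ, here is a minimal sketch. It assumes a local clone;
`file_history` is a hypothetical helper name, and 40% is simply the threshold
mentioned above, not a tuned value. Rename detection can only follow moves
that are recorded inside the git repository, so this would not bring back
history that the svn import never carried over.

```shell
# Hypothetical helper: list the commits that touched one file, following renames.
# --follow keeps walking history past a rename (it works for a single path only);
# --find-renames=40% lowers the similarity threshold from git's default of 50%.
file_history() {
    git log --follow --find-renames=40% --format='%h %ad %s' --date=short -- "$1"
}
```

Run inside a checkout, `file_history lucene/core/src/java/org/apache/lucene/index/CheckIndex.java`
should print one line per commit, newest first.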

On Sun, May 31, 2015 at 6:57 PM Robert Muir rcm...@gmail.com wrote:

 And here is IndexWriter with initial revision in 2001, but again git
 still only stops at Feb 7, 2012.

 http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/index/IndexWriter.java?view=log

 Revision 149570 - (view) (download) (annotate) - [select for diffs]
 Added Tue Sep 18 16:29:48 2001 UTC (13 years, 8 months ago) by jvanzyl
 Original Path:
 lucene/java/trunk/src/java/org/apache/lucene/index/IndexWriter.java
 File length: 15076 byte(s)

 Initial revision

 So subversion history looks pretty complete. If we can add other
 history from sourceforge, fantastic, but there isn't so so much going
 on there.

 It is git that is totally broken here with respect to history.


 On Sun, May 31, 2015 at 6:47 PM, Robert Muir rcm...@gmail.com wrote:
  On Sun, May 31, 2015 at 4:39 PM, david.w.smi...@gmail.com
  david.w.smi...@gmail.com wrote:
 
  If we were to come up with a new git repo that doesn’t have the ‘.jar’s,
  it’d be good to also streamline the history prior to the big Lucene +
 Solr
  merge due to the paths in source control as to where the trunk,
 branches,
  and tags lived.  It appears the current repo may have been a blind git
  import from subversion.  And hand-done process that is mindful of these
  things would result in a nice history.  I’ve done this sorta thing once
 (a
  project at my last job) and volunteer to do it here if we can get
 consensus
  on a move to git.
 
  The current Git history is totally broken. This is a complete
  dealbreaker from my perspective, if it's indicative of what svn -> git
  conversion will produce.
 
  Look at CheckIndex.java history in git:
 
 
 https://github.com/apache/lucene-solr/commits/trunk/lucene/core/src/java/org/apache/lucene/index/CheckIndex.java?page=5
  It stops at Feb 7, 2012.
 
  In subversion it goes back to 2007, to the original issue where Mike
  added CheckIndex:
 
 http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/index/CheckIndex.java?view=log

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




Re: Moving to git?

2015-05-31 Thread Mark Miller
I'm all for a small download size in all things, but personally, I download
Git repos for a project about 1/20th as often as I download svn checkouts
(one of the things I prefer about my Git usage) and I have fast internet.
Not a sore spot here.

- Mark

On Sun, May 31, 2015 at 5:38 PM Steve Davids sdav...@gmail.com wrote:

 There are also some rather large '.dat' files in the history as well, I
 found this by running a job to delete all blobs > 5MB from the history
 via:

 $ java -jar ~/Downloads/bfg-1.12.3.jar --strip-blobs-bigger-than 5M
 --protect-blobs-from trunk,branch_5x,branch_4x lucene-solr-mirror

 Deleted files
 -------------
 Filename                         Git id
 --------------------------------------------------------------------------
 DoubleArrayTrie.dat            | 8babf9fa (16.8 MB), f3bfe15b (16.8 MB), ...
 TokenInfoDictionary$buffer.dat | 25938b37 (7.0 MB), 7f02420f (7.1 MB), ...
 TokenInfoDictionary$trie.dat   | 69e76d64 (16.8 MB)
 dat.dat                        | 7445d1c8 (16.0 MB), 79bd7c8b (16.8 MB), 37a215e5 (16.8 MB)
 europarl.lines.txt.gz          | e0366f10 (5.5 MB)
 tid.dat                        | 5a1e6199 (24.9 MB), 996d3fc5 (28.1 MB), ...
 tid_map.dat                    | 690fbea5 (6.3 MB), c1c01405 (6.3 MB), 7a8c1420 (6.4 MB)
 wiki_results.txt               | db9e9294 (19.8 MB), 52ff9357 (19.8 MB), ...
 wiki_sentence.txt              | 3a38f62e (19.0 MB)

 Dropping just those files reduced the repo by 50M, overall size is 131MB.

 Note: there is one large file still in the trunk > 5MB:

 * commit df1e3b32 (protected by 'trunk') - contains 1 dirty file :
 - lucene/test-framework/src/resources/org/apache/lucene/util/europarl.lines.txt.gz (5.5 MB)


 Also, I failed to provide the numbers on what `git reflog expire
 --expire=now --all && git gc --prune=now --aggressive` does on a fresh mirror
 checkout: it results in a repo size of 320M. So, dropping the old jars
 saves 120MB.
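
The BFG run above reports blobs over a size limit while rewriting history. A
read-only way to get a similar list with stock git plumbing might look like
the sketch below. `large_blobs` is a hypothetical helper name, the threshold
is given in bytes, and the function assumes it is run inside a clone; it does
not modify the repository.

```shell
# Hypothetical helper: print "<size-in-bytes> <path>" for every blob reachable
# from any ref whose size exceeds the given byte threshold, largest first.
large_blobs() {
    # rev-list prints "<sha> <path>"; cat-file exposes the path as %(rest).
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        # Keep blobs above the threshold and strip the leading "blob " prefix.
        awk -v min="$1" '$1 == "blob" && $2 + 0 > min + 0 { print substr($0, 6) }' |
        sort -rn
}
```

For example, `large_blobs 5000000` would list every blob in history over
roughly 5 MB, which is close to the limit used in the BFG command above.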

 -Steve

 On Sun, May 31, 2015 at 4:39 PM, david.w.smi...@gmail.com 
 david.w.smi...@gmail.com wrote:

 I like where this is going!

 I also think history of source code is very important, but not history of
 ‘.jar’ files that shouldn’t have been in source control in the first
 place.  I’m fiercely negative about large binaries or ‘jar’ files that can
 be downloaded by the build system (e.g. ivy) in source control.  And it was
 already mentioned a full history (.jar’s & all) could be kept somewhere
 more for archival purposes — which is a good compromise, I think, since
 “build-ability” of history should be retained (assuming it’s even still
 possible, given Rob’s comments) but doesn’t have to be convenient (e.g. by
 it being in a separate repo).   +1 to that!

 If we were to come up with a new git repo that doesn’t have the ‘.jar’s,
 it’d be good to also streamline the history prior to the big Lucene + Solr
 merge due to the paths in source control as to where the trunk, branches,
 and tags lived.  It appears the current repo may have been a blind git
 import from subversion.  And hand-done process that is mindful of these
 things would result in a nice history.  I’ve done this sorta thing once (a
 project at my last job) and volunteer to do it here if we can get consensus
 on a move to git.

 ~ David

 On Sun, May 31, 2015 at 4:21 PM Dawid Weiss dawid.we...@cs.put.poznan.pl
 wrote:

  I'd like to have full consolidated history, as much as possible,
  connect-the-dots across whatever CVS/SVN/etc repos to the extent
  maximally permitted by law, as Doug hints at. Just nuke the jars.

 I've done this (CVS->SVN->GIT) before. It wasn't that difficult.
 Eventually (for git) you script it and it gets version after version
 from CVS or SVN and appends it to git. I admit I didn't care much
 about svn merging infos though. Any files can be removed/ pruned by
 rewriting git trees before they're published.

 Dawid

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org


 --
- Mark
about.me/markrmiller


Re: Moving to git?

2015-05-31 Thread Robert Muir
And here is IndexWriter with initial revision in 2001, but again git
still only stops at Feb 7, 2012.
http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/index/IndexWriter.java?view=log

Revision 149570 - (view) (download) (annotate) - [select for diffs]
Added Tue Sep 18 16:29:48 2001 UTC (13 years, 8 months ago) by jvanzyl
Original Path: 
lucene/java/trunk/src/java/org/apache/lucene/index/IndexWriter.java
File length: 15076 byte(s)

Initial revision

So subversion history looks pretty complete. If we can add other
history from sourceforge, fantastic, but there isn't so much going
on there.

It is git that is totally broken here with respect to history.


On Sun, May 31, 2015 at 6:47 PM, Robert Muir rcm...@gmail.com wrote:
 On Sun, May 31, 2015 at 4:39 PM, david.w.smi...@gmail.com
 david.w.smi...@gmail.com wrote:

 If we were to come up with a new git repo that doesn’t have the ‘.jar’s,
 it’d be good to also streamline the history prior to the big Lucene + Solr
 merge due to the paths in source control as to where the trunk, branches,
 and tags lived.  It appears the current repo may have been a blind git
 import from subversion.  And hand-done process that is mindful of these
 things would result in a nice history.  I’ve done this sorta thing once (a
 project at my last job) and volunteer to do it here if we can get consensus
 on a move to git.

 The current Git history is totally broken. This is a complete
 dealbreaker from my perspective, if it's indicative of what svn -> git
 conversion will produce.

 Look at CheckIndex.java history in git:

 https://github.com/apache/lucene-solr/commits/trunk/lucene/core/src/java/org/apache/lucene/index/CheckIndex.java?page=5
 It stops at Feb 7, 2012.

 In subversion it goes back to 2007, to the original issue where Mike
 added CheckIndex:
 http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/index/CheckIndex.java?view=log




Re: Moving to git?

2015-05-31 Thread Doug Turnbull
I have no dog in the svn vs git debate honestly.

I want to say how important it is to keep healthy history. I recently went
on a bit of a code archaeology dig to figure out why something in
Lucene was done the way it was. It was handy that the history went as far
back as it did, but I had to switch around to different places to continue
the history. For example, the abrupt shift that seems to be around when
Solr/Lucene were put together had me digging for the last pure Lucene tag.
It's over at lucene/java/branches, NOT lucene/dev/tags with the other tags.

Then when you get to the branch for lucene-101, the first commit is:
 2001: New repository initialized by cvs2svn.

Unable to find a cvs repo, my hunt stopped (love to hear if anyone has a
CVS repo -- maybe from Jakarta?)

So removing some jars isn't a big deal. But cutting off history and
restarting at some arbitrary point can be annoying and make it harder to
dig up more about why things are the way they are.

/steps down from soapbox
-Doug



On Sunday, May 31, 2015, Dawid Weiss dawid.we...@cs.put.poznan.pl wrote:

 Yeah, but it misses the point -- history is history, if there were
 jars in it, you shouldn't just strip them, it'd be confusing.

 How was it back when Lucene was merging with Solr? Didn't it just
 initiate with a new clean repo? Maybe not all of the history is really
 needed -- if we limited ourselves to, say, all of the history that
 includes ivy then the size of the repo would drop significantly... but
 again, to me size doesn't really matter at all; one initial clone is
 no-cost. Go make yourself a cup of tea, come back and you're set.

 Dawid

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




Re: Moving to git?

2015-05-31 Thread Ramkumar R. Aiyengar
+1 for git, great for working on multiple things at once.

Side note: git-svn is also not great btw for the kind of merging we need to
do with every commit, it kind of works but with too many caveats.

On the note that git clone is slow, sure, because it fetches a fair amount
of history which svn doesn't. But to compare just them is unfair, since
checkout and clone are not identical. If you want to compare times, you
will also have to add up every log, diff, or annotate you do on the tree
during your development (of which I certainly do a lot and I am sure others
do as well), and git will certainly win if you include all those because it
does no network lookup. Clone and checkout are typically one time
operations, why should their speed be a concern in any case?

I know this has come up a few times in the past but I wanted to bring this
up again.

The lucene-solr ASF git mirror has been behind by about a day. I was
speaking with the infra people and they say that the size of the repo needs
more and more ram. Forcing a sync causes a fork-bomb:

Can't fork: Cannot allocate memory at /usr/share/perl5/Git.pm line 1517.

They tried a few things but it's almost certain that it needs even more
RAM, which still is a band-aid as they'd soon need even more RAM. Also,
adding RAM involves downtime for git.a.o which needs to be planned. As a
stop-gap arrangement, they attached a volume to the instance and are using it
as swap to work around the issue that adding RAM requires a restart.

FAQ: How would the memory requirement change if we moved to git instead of
mirroring?
Answer: svn -> git mirroring is a weird process and has quite the memory
leak. Using git directly is much cleaner.

I personally think git does make things easier to manage when you're
working on multiple overlapping things and so we should re-evaluate moving
to it. I would have been fine had the mirroring worked, as all I want is a
way to be able to work on multiple (local) branches without having to
create and maintain directories like: lucene-solr-trunk1,
lucene-solr-trunk2, or SOLR-, etc.

Opinions?


-- 
Anshum Gupta


Re: Moving to git?

2015-05-31 Thread Robert Muir
You guys totally miss the point on clone.

The thing is that svn checkout gives you enough, to do what you need
to do. And yes it does network lookup for more rare things like
history, but this works just fine in general.

On the other hand git downloads gigabytes, before you can even get started.

Something needs to be done about all those jars in the source history,
I will not let this go.

On Sun, May 31, 2015 at 9:16 AM, Ramkumar R. Aiyengar
andyetitmo...@gmail.com wrote:
 +1 for git, great for working on multiple things at once.

 Side note: git-svn is also not great btw for the kind of merging we need to
 do with every commit, it kind of works but with too many caveats.

 On the note that git clone is slow, sure, because it fetches a fair amount
 of history which svn doesn't. But to compare just them is unfair, since
 checkout and clone are not identical. If you want to compare times, you will
 also have to add up every log, diff, or annotate you do on the tree during
 your development (of which I certainly do a lot and I am sure others do as
 well), and git will certainly win if you include all those because it does
 no network lookup. Clone and checkout are typically one time operations, why
 should their speed be a concern in any case?

 I know this has come up a few times in the past but I wanted to bring this
 up again.

 The lucene-solr ASF git mirror has been behind by about a day. I was
 speaking with the infra people and they say that the size of the repo needs
 more and more ram. Forcing a sync causes a fork-bomb:

 Can't fork: Cannot allocate memory at /usr/share/perl5/Git.pm line 1517.

 They tried a few things but it's almost certain that it needs even more RAM,
 which still is a band-aid as they'd soon need even more RAM. Also, adding
 RAM involves downtime for git.a.o which needs to be planned. As a stop gap
 arrangement attached a volume to the instance and are using it as swap to
 work around the adding RAM requires restart issue.

 FAQ: How would the memory requirement change if we moved to git instead of
 mirroring?
 Answer: svn -> git mirroring is a weird process and has quite the memory
 leak. Using git directly is much cleaner.

 I personally think git does make things easier to manage when you're working
 on multiple overlapping things and so we should re-evaluate moving to it. I
 would have been fine had the mirroring worked, as all I want is a way to be
 able to work on multiple (local) branches without having to create and
 maintain directories like: lucene-solr-trunk1, lucene-solr-trunk2, or
 SOLR-, etc.

 Opinions?


 --
 Anshum Gupta




Re: Moving to git?

2015-05-31 Thread Ramkumar R. Aiyengar
Personally, clone for me is 'rare', I did it once years back, and have
never done it since. log, diff and others I do on a daily basis. Same with
svn as well actually, you checkout just once usually..

I think the previous discussion had the agreement that this issue should
focus on committers rather than contributors. And committers by definition
aren't getting started with Solr. If you want to make things more flexible
and faster for contributors, sure, github mirror provides an svn facade
which allows you to check out a subversion wc from its repos for
read/write (write's not that useful for us though since github is not the
primary repo).
On 31 May 2015 14:25, Robert Muir rcm...@gmail.com wrote:

 You guys totally miss the point on clone.

 The thing is that svn checkout gives you enough, to do what you need
 to do. And yes it does network lookup for more rare things like
 history, but this works just fine in general.

 On the other hand git downloads gigabytes, before you can even get started.

 Something needs to be done about all those jars in the source history,
 I will not let this go.

 On Sun, May 31, 2015 at 9:16 AM, Ramkumar R. Aiyengar
 andyetitmo...@gmail.com wrote:
  +1 for git, great for working on multiple things at once.
 
  Side note: git-svn is also not great btw for the kind of merging we need
 to
  do with every commit, it kind of works but with too many caveats.
 
  On the note that git clone is slow, sure, because it fetches a fair
 amount
  of history which svn doesn't. But to compare just them is unfair, since
  checkout and clone are not identical. If you want to compare times, you
 will
  also have to add up every log, diff, or annotate you do on the tree
 during
  your development (of which I certainly do a lot and I am sure others do
 as
  well), and git will certainly win if you include all those because it
 does
  no network lookup. Clone and checkout are typically one time operations,
 why
  should their speed be a concern in any case?
 
  I know this has come up a few times in the past but I wanted to bring
 this
  up again.
 
  The lucene-solr ASF git mirror has been behind by about a day. I was
  speaking with the infra people and they say that the size of the repo
 needs
  more and more ram. Forcing a sync causes a fork-bomb:
 
  Can't fork: Cannot allocate memory at /usr/share/perl5/Git.pm line 1517.
 
  They tried a few things but it's almost certain that it needs even more
 RAM,
  which still is a band-aid as they'd soon need even more RAM. Also, adding
  RAM involves downtime for git.a.o which needs to be planned. As a stop
 gap
  arrangement attached a volume to the instance and are using it as swap to
  work around the adding RAM requires restart issue.
 
  FAQ: How would the memory requirement change if we moved to git instead
 of
  mirroring?
  Answer: svn -> git mirroring is a weird process and has quite the memory
  leak. Using git directly is much cleaner.
 
  I personally think git does make things easier to manage when you're
 working
  on multiple overlapping things and so we should re-evaluate moving to
 it. I
  would have been fine had the mirroring worked, as all I want is a way to
 be
  able to work on multiple (local) branches without having to create and
  maintain directories like: lucene-solr-trunk1, lucene-solr-trunk2, or
  SOLR-, etc.
 
  Opinions?
 
 
  --
  Anshum Gupta

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




Re: Moving to git?

2015-05-31 Thread Robert Muir
On Sat, May 30, 2015 at 4:20 PM, Dawid Weiss
dawid.we...@cs.put.poznan.pl wrote:
 # time git clone --depth 1 https://github.com/apache/lucene-solr.git

This breaks rule #1 of using git, don't pass any options to any of the
commands, or it shits itself.

Git clone is slow, I think the reason is all the old jar files in the
repository. It needs to be fixed.
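
For reference, the shallow clone quoted above, and the way back from it to a
full clone, can be sketched as follows. `shallow_clone` and `deepen_clone` are
hypothetical helper names, and the sketch assumes a git version that supports
`git fetch --unshallow`. It only limits what is downloaded up front; it does
nothing about the jars that are already in the history.

```shell
# Hypothetical helper: clone only the most recent commit of a repository.
# $1 is the repository URL, $2 the target directory.
shallow_clone() {
    git clone --quiet --depth 1 "$1" "$2"
}

# Hypothetical helper: later fetch the rest of the history into that clone,
# turning the shallow clone in directory $1 into a complete one.
deepen_clone() {
    git -C "$1" fetch --quiet --unshallow
}
```

With the URL from the quoted command, this would be
`shallow_clone https://github.com/apache/lucene-solr.git lucene-solr`,
followed by `deepen_clone lucene-solr` once the full history is wanted.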




Re: Moving to git?

2015-05-31 Thread Robert Muir
Don't assume your workflow is everyone else's workflow. And don't try
to enforce your workflow on me.

I don't use svn OR git in the way you describe.

On Sun, May 31, 2015 at 9:31 AM, Ramkumar R. Aiyengar
andyetitmo...@gmail.com wrote:
 Personally, clone for me is 'rare', I did it once years back, and have never
 done it since. log, diff and others I do on a daily basis. Same with svn as
 well actually, you checkout just once usually..

 I think the previous discussion had the agreement that this issue should
 focus on committers rather than contributors. And committers by definition
 aren't getting started with Solr. If you want to make things more flexible
 and faster for contributors, sure, github mirror provides an svn facade
 which allows you to check out a subversion wc from it's repos for read/write
 (write's not that useful for us though since github is not the primary
 repo).

 On 31 May 2015 14:25, Robert Muir rcm...@gmail.com wrote:

 You guys totally miss the point on clone.

 The thing is that svn checkout gives you enough, to do what you need
 to do. And yes it does network lookup for more rare things like
 history, but this works just fine in general.

 On the other hand git downloads gigabytes, before you can even get
 started.

 Something needs to be done about all those jars in the source history,
 I will not let this go.

 On Sun, May 31, 2015 at 9:16 AM, Ramkumar R. Aiyengar
 andyetitmo...@gmail.com wrote:
  +1 for git, great for working on multiple things at once.
 
  Side note: git-svn is also not great btw for the kind of merging we need
  to
  do with every commit, it kind of works but with too many caveats.
 
  On the note that git clone is slow, sure, because it fetches a fair
  amount
  of history which svn doesn't. But to compare just them is unfair, since
  checkout and clone are not identical. If you want to compare times, you
  will
  also have to add up every log, diff, or annotate you do on the tree
  during
  your development (of which I certainly do a lot and I am sure others do
  as
  well), and git will certainly win if you include all those because it
  does
  no network lookup. Clone and checkout are typically one time operations,
  why
  should their speed be a concern in any case?
 
  I know this has come up a few times in the past but I wanted to bring
  this
  up again.
 
  The lucene-solr ASF git mirror has been behind by about a day. I was
  speaking with the infra people and they say that the size of the repo
  needs
  more and more ram. Forcing a sync causes a fork-bomb:
 
  Can't fork: Cannot allocate memory at /usr/share/perl5/Git.pm line 1517.
 
  They tried a few things but it's almost certain that it needs even more
  RAM,
  which still is a band-aid as they'd soon need even more RAM. Also,
  adding
  RAM involves downtime for git.a.o which needs to be planned. As a stop
  gap
  arrangement attached a volume to the instance and are using it as swap
  to
  work around the adding RAM requires restart issue.
 
  FAQ: How would the memory requirement change if we moved to git instead
  of
  mirroring?
  Answer: svn -> git mirroring is a weird process and has quite the memory
  leak. Using git directly is much cleaner.
 
  I personally think git does make things easier to manage when you're
  working
  on multiple overlapping things and so we should re-evaluate moving to
  it. I
  would have been fine had the mirroring worked, as all I want is a way to
  be
  able to work on multiple (local) branches without having to create and
  maintain directories like: lucene-solr-trunk1, lucene-solr-trunk2, or
  SOLR-, etc.
 
  Opinions?
 
 
  --
  Anshum Gupta

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org






RE: Moving to git?

2015-05-31 Thread Uwe Schindler
I also clone my SVN working copy locally. After that I just switch branch.

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

 -Original Message-
 From: Yonik Seeley [mailto:ysee...@gmail.com]
 Sent: Sunday, May 31, 2015 3:56 PM
 To: Solr/Lucene Dev
 Subject: Re: Moving to git?
 
 On Sun, May 31, 2015 at 9:31 AM, Ramkumar R. Aiyengar
 andyetitmo...@gmail.com wrote:
  Personally, clone for me is 'rare', I did it once years back, and have
  never done it since. log, diff and others I do on a daily basis.
 
 Yep, I find I need fewer different working directories with git, but when I do
 want an additional copy, I just make a local copy of an existing repo since it
 has everything.
 
 -Yonik
 





Re: Moving to git?

2015-05-31 Thread Yonik Seeley
On Sun, May 31, 2015 at 9:31 AM, Ramkumar R. Aiyengar
andyetitmo...@gmail.com wrote:
 Personally, clone for me is 'rare', I did it once years back, and have never
 done it since. log, diff and others I do on a daily basis.

Yep, I find I need fewer different working directories with git, but
when I do want an additional copy, I just make a local copy of an
existing repo since it has everything.

-Yonik




Re: Moving to git?

2015-05-30 Thread Dawid Weiss
+1 to moving to git. I am not going to attempt to convince those stubborn types
that want to stick to SVN. I use git and svn and git simply works better for me.

I just want to explain something, because there seems to be a misunderstanding.

 time git clone git://git.apache.org/lucene-solr.git test.git

These are all apples-to-oranges comparisons. You're fetching the
entire history of all commits. SVN just fetches one branch. I
typically fetch more than one branch when I work and it takes (N * svn
checkout) times to do so (and no, switch is not much faster). For your
comparison to be fairer, you should be comparing git clone to:

time svn co https://svn.apache.org/repos/asf/lucene/dev

I don't know what the actual time of this is or how much disk space it
takes. Yes, it is an insane command but it's really an equivalent of
having a git clone locally...

Also, even if the initial clone takes 9 minutes (which I think is
Apache's git server being dog slow), you can always fetch from any
other mirror. It's still the same repository, with all the commits,
tags, hashes, etc. Or you can fetch just the latest revision with
--depth 1. Many options out there [shrug].

 If I ever reach a point where I am working on multiple code trees, I
 expect that I will have them in separate directories because that will
 help me keep them straight.

Once you start working with git you probably won't bother having
separate folders. Simply because this thing is super helpful (and
fast) -- it aggregates all your branch commits into a single patch:

git diff my-branch..origin/master

Again, I do this with SVN too (against a remote URL) and it takes a
looong time, every time I do it. A single initial checkout time is not
enough to assess overall tool productivity...
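
For readers trying the diff above, here is a minimal sketch of how the two range forms differ. It assumes a throwaway local repository with no remote, so `master` stands in for `origin/master`; the file names and commit messages are invented for illustration. The three-dot form diffs from the common ancestor, which is usually what "all my branch commits as one patch" means once master has moved on.

```shell
#!/bin/sh
set -eu
# Throwaway repository; every name below is made up for this sketch.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of local defaults
git config user.name "Example"
git config user.email "example@example.invalid"

echo base > file.txt
git add file.txt
git commit -q -m "base"

# Work on a feature branch.
git checkout -q -b my-branch
echo feature > feature.txt
git add feature.txt
git commit -q -m "feature work"

# Meanwhile master moves on.
git checkout -q master
echo upstream > upstream.txt
git add upstream.txt
git commit -q -m "upstream work"

# Three dots: only what my-branch changed since it forked from master.
branch_only=$(git diff --name-only master...my-branch)
# Two dots: every difference between the two branch tips.
tip_to_tip=$(git diff --name-only master..my-branch | sort | xargs)

echo "branch only: $branch_only"
echo "tip to tip:  $tip_to_tip"
```

Both commands run entirely against the local object store, which is the point being made about speed; nothing here touches the network.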

Finally, even if your taste is to have separate folders, remember you
only need one clone of the repo, ever (and you only pull new commits later on).
Managing a different branch then becomes:

cd ..
cp -R lucene-master lucene-my-branch
cd lucene-my-branch
git checkout -b my-branch master
git push origin HEAD -u

done, you're set. Takes 3 seconds. Longer than a remote svn copy, at
least from Europe...

Dawid




Re: Moving to git?

2015-05-30 Thread Dawid Weiss
Did this, out of curiosity (from a server in the U.S.):

# time git clone https://github.com/apache/lucene-solr.git
...
Receiving objects: 100% (563630/563630), 472.01 MiB | 10.46 MiB/s, done.

real    1m13.049s
user    0m46.000s
sys     0m10.060s

# time git clone --depth 1 https://github.com/apache/lucene-solr.git
...
Receiving objects: 100% (9507/9507), 37.40 MiB | 9.84 MiB/s, done.

real    0m7.814s
user    0m2.550s
sys     0m1.110s

# time svn co https://svn.apache.org/repos/asf/lucene/dev/trunk test.svn
...
Checked out revision 1682650.

real    0m34.526s
user    0m12.460s
sys     0m10.320s

As you can see everything is relative...
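
To put the three wall-clock figures side by side, a small sketch that only does arithmetic on the "real" times reported above (the variable names are mine, and the numbers describe this one run from one server, nothing more):

```shell
#!/bin/sh
# Wall-clock ("real") times copied from the three runs above, in seconds.
full_clone=73.049      # git clone, full history (1m13.049s)
shallow_clone=7.814    # git clone --depth 1
svn_checkout=34.526    # svn co .../dev/trunk

# awk does the floating-point division that plain sh arithmetic lacks.
full_vs_svn=$(LC_ALL=C awk -v a="$full_clone" -v b="$svn_checkout" 'BEGIN { printf "%.2f", a / b }')
svn_vs_shallow=$(LC_ALL=C awk -v a="$svn_checkout" -v b="$shallow_clone" 'BEGIN { printf "%.2f", a / b }')

echo "full clone / svn checkout:    $full_vs_svn"
echo "svn checkout / shallow clone: $svn_vs_shallow"
```

On these numbers the full clone costs roughly twice the trunk checkout, while the shallow clone is several times faster than either.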

Dawid




Re: Moving to git?

2015-05-30 Thread Anshum Gupta
As I've mentioned multiple times, I think git is super useful when working
on multiple unrelated things that affect the same files.
What I'd been doing so far with svn is, creating multiple physical
directories (checkouts) and working on them, and tracking them, cleaning
them up, and deleting them when done. With git, I wouldn't have to do any
of that, letting me spend more time on building/fixing things than just
managing my changes.

On Fri, May 29, 2015 at 9:45 PM, Walter Underwood wun...@wunderwood.org
wrote:

 I’m not a committer, but I’ve built production code with a lot of source
 control systems and git is by far the most cumbersome. It does one
 thing well, handling untrusted contributors. With trusted committers,
 Subversion is very nice, thank you.

 Here are the systems I’ve used.

 * SCCS
 * RCS
 * HP history manager
 * ClearCase
 * CVS
 * Perforce
 * Subversion
 * git

 wunder
 Walter Underwood
 wun...@wunderwood.org
 http://observer.wunderwood.org/  (my blog)


 On May 29, 2015, at 8:58 PM, Ishan Chattopadhyaya 
 ichattopadhy...@gmail.com wrote:

 Life is so much easier on long train/plane journeys with Git. +1.

 On Sat, May 30, 2015 at 9:21 AM, Shai Erera ser...@gmail.com wrote:

 +1 to moving to git.

 Shai
 On May 30, 2015 6:24 AM, Anshum Gupta ans...@anshumgupta.net wrote:

 * There may be other good reasons for using git, but this is not one.*
 I just added one more to the list. I think most other reasons have
 already been spoken about in previous discussions. I'm not trying to debate
 on what is better (I think it's a lot to do with *opinion*).

 I think it's a reasonable thing to move to a system that allows for
 distributed version control and makes working on multiple things at the
 same time easy. But again, that's my thought. The last time the discussion
 came up, I was +1 to moving and wasn't already using it a lot. Right now,
 I'm just trying to work on multiple things and find git easier for that
 purpose.

 I just wanted to bring this back up and see if the opinion of active
 contributors has changed since the last time by means of a polite and
 friendly discussion. In the end, we can agree to disagree but it'd be
 better than not discussing at all. :-)


 On Fri, May 29, 2015 at 7:24 PM, Walter Underwood wun...@wunderwood.org
  wrote:

 There may be other good reasons for using git, but this is not one.

 wunder
 Walter Underwood
 wun...@wunderwood.org
 http://observer.wunderwood.org/  (my blog)

 On May 29, 2015, at 6:57 PM, Yonik Seeley ysee...@gmail.com wrote:

 On Fri, May 29, 2015 at 9:40 PM, Walter Underwood 
 wun...@wunderwood.org wrote:

 “git breaks when it tries to mirror” is not a convincing argument for
 moving
 to git.


 I'd be +1 without that annoyance as well.  As Anshum mentioned, this
 has come up a number of times in the past.

 -Yonik






 --
 Anshum Gupta






-- 
Anshum Gupta


Re: Moving to git?

2015-05-30 Thread Toke Eskildsen
Walter Underwood wun...@wunderwood.org wrote:
 I’m not a committer, but I’ve built production code with a lot of source 
 control
 systems and git is by far the most cumbersome.

I am not a committer and I have built production code with very few source
control systems: CVS, SVN & GIT. GIT is the one I dislike the least;
occasionally I even find it quite neat.

 It does one thing well, handling untrusted contributors. With trusted 
 committers, [...]

So is this a question of optimizing towards committers or contributors? Ease of 
use for the core people or more openness towards outside contributions?

- Toke Eskildsen




Re: Moving to git?

2015-05-30 Thread Shai Erera
The commit then push workflow *is* what allows you to maintain multiple
local branches without the nightmare of having multiple checkouts, and
managing them.
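
A minimal sketch of that workflow, assuming a throwaway local repository; the branch names `task-a` and `task-b` and the file `core.txt` are hypothetical. Two unrelated changes to the same file live on separate local branches, committed locally, inside a single working directory:

```shell
#!/bin/sh
set -eu
# Throwaway repository; all names are invented for this sketch.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of local defaults
git config user.name "Example"
git config user.email "example@example.invalid"

echo "v1" > core.txt
git add core.txt
git commit -q -m "initial"

# Two unrelated tasks touching the same file, each on its own local branch.
git checkout -q -b task-a master
echo "change for task A" >> core.txt
git commit -q -a -m "task A"

git checkout -q -b task-b master
echo "change for task B" >> core.txt
git commit -q -a -m "task B"

# Switching branches swaps the working tree contents in place.
git checkout -q task-a
on_a=$(tail -n 1 core.txt)
git checkout -q task-b
on_b=$(tail -n 1 core.txt)
branch_count=$(git branch | wc -l | tr -d ' ')

echo "task-a sees: $on_a"
echo "task-b sees: $on_b"
echo "local branches: $branch_count"
```

Nothing is pushed anywhere in this sketch; publishing either branch would be a separate `git push` once the work is ready.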

I don't think that moving to GIT has anything to do with Github. With SVN,
we also don't have any nice user interface and/or pull requests, so the
lack of one in GIT is irrelevant in my opinion.

If we consider moving to GIT, we might also want to look at using Gerrit
for our code reviews and patch submissions. It makes the code review aspect
much easier, nicer and helpful.

Shai

On Sat, May 30, 2015 at 12:36 PM, Uwe Schindler u...@thetaphi.de wrote:

 Hi,



 I think most people say that GIT is easier or better to use because they
 combine in their mind using „GIT“ with „the Github user interface“.



 This is indeed very nice to have - I (for myself) am also very happy with
 using Github, as long as it keeps simple (you only have users from Github
 sending you pull requests, if you only need to push to one location,…). On
 the other hand, I always get annoyed that you cannot do the same like “svn
 update” or “svn commit” in one turn. You always have to pull first and then
 update or vice versa first commit then push. If you are online there is no
 reason to have this separated, especially if you are a “committer”. If you
 are contributor, that fine – because you cannot push, but for committers
 this is what subversion people like me hate. And all these additional steps
 are not useful to a “centralized” infrastructure like ASF.



 As said before, at the ASF, we don’t get the Github interface as “main
 primary user interface”, because the “committers” have to use the official
 ASF git installation to push. So we are still not able to easily handle
 pull requests on github and so on. So there is no useful thing (except the
 mentioned: no longer need to have multiple checkouts locally) with GIT. The
 big downside is: without the Github Web interface, GIT is unusable to me,
 sorry. The command line is a disaster, technical concepts behind GIT are a
 disaster; everything is a disaster :-)



 I would plus one to use Git, if we would solely use “GitHub” as central
 repository (like Elasticsearch), but with having the “central
 infrastructure” at the ASF:

 Clear -1 !!!



 Uwe



 -

 Uwe Schindler

 H.-H.-Meier-Allee 63, D-28213 Bremen

 http://www.thetaphi.de

 eMail: u...@thetaphi.de



 *From:* Anshum Gupta [mailto:ans...@anshumgupta.net]
 *Sent:* Friday, May 29, 2015 11:08 PM
 *To:* dev@lucene.apache.org
 *Subject:* Moving to git?



 I know this has come up a few times in the past but I wanted to bring this
 up again.



 The lucene-solr ASF git mirror has been behind by about a day. I was
 speaking with the infra people and they say that the size of the repo needs
 more and more ram. Forcing a sync causes a fork-bomb:



 Can't fork: Cannot allocate memory at /usr/share/perl5/Git.pm line 1517.



 They tried a few things but it's almost certain that it needs even more
 RAM, which still is a band-aid as they'd soon need even more RAM. Also,
 adding RAM involves downtime for git.a.o which needs to be planned. As a
 stop gap arrangement attached a volume to the instance and are using it as
 swap to work around the adding RAM requires restart issue.



 FAQ: How would the memory requirement change if we moved to git instead of
 mirroring?

 Answer: svn -> git mirroring is a weird process and has quite the memory
 leak. Using git directly is much cleaner.



 I personally think git does make things easier to manage when you're
 working on multiple overlapping things and so we should re-evaluate moving
 to it. I would have been fine had the mirroring worked, as all I want is a
 way to be able to work on multiple (local) branches without having to
 create and maintain directories like: lucene-solr-trunk1,
 lucene-solr-trunk2, or SOLR-, etc.



 Opinions?





 --

 Anshum Gupta



Re: Moving to git?

2015-05-30 Thread Mark Miller
bq. I don't think that moving to GIT has anything to do with Github.

I think that's a common misconception by people that don't 'get' Git. Most
people could care less about Git as it pertains to GitHub except for one
thing - it provides a nice central master repo that is hosted that you can
push to.

That other stuff on GitHub is whatever. Candy, fluff, whatever. The power
of and ease of Git vs svn is Git, not GitHub.

svn feels like 1980, Git feels like 1990.

- Mark

On Sat, May 30, 2015 at 7:22 AM Shai Erera ser...@gmail.com wrote:

 The commit then push workflow *is* what allows you to maintain multiple
 local branches without the nightmare of having multiple checkouts, and
 managing them.

 I don't think that moving to GIT has anything to do with Github. With SVN,
 we also don't have any nice user interface and/or pull requests, so the
 lack of one in GIT is irrelevant in my opinion.

 If we consider moving to GIT, we might also want to look at using Gerrit
 for our code reviews and patch submissions. It makes the code review aspect
 much easier, nicer and helpful.

 Shai

 On Sat, May 30, 2015 at 12:36 PM, Uwe Schindler u...@thetaphi.de wrote:

 Hi,



 I think most people say that GIT is easier or better to use because they
 combine in their mind using „GIT“ with „the Github user interface“.



 This is indeed very nice to have - I (for myself) am also very happy with
 using Github, as long as it keeps simple (you only have users from Github
 sending you pull requests, if you only need to push to one location,…). On
 the other hand, I always get annoyed that you cannot do the same like “svn
 update” or “svn commit” in one turn. You always have to pull first and then
 update or vice versa first commit then push. If you are online there is no
 reason to have this separated, especially if you are a “committer”. If you
 are contributor, that fine – because you cannot push, but for committers
 this is what subversion people like me hate. And all these additional steps
 are not useful to a “centralized” infrastructure like ASF.



 As said before, at the ASF, we don’t get the Github interface as “main
 primary user interface”, because the “committers” have to use the official
 ASF git installation to push. So we are still not able to easily handle
 pull requests on github and so on. So there is no useful thing (except the
 mentioned: no longer need to have multiple checkouts locally) with GIT. The
 big downside is: without the Github Web interface, GIT is unusable to me,
 sorry. The command line is a disaster, technical concepts behind GIT are a
 disaster; everything is a disaster :-)



 I would plus one to use Git, if we would solely use “GitHub” as central
 repository (like Elasticsearch), but with having the “central
 infrastructure” at the ASF:

 Clear -1 !!!



 Uwe



 -

 Uwe Schindler

 H.-H.-Meier-Allee 63, D-28213 Bremen

 http://www.thetaphi.de

 eMail: u...@thetaphi.de



 *From:* Anshum Gupta [mailto:ans...@anshumgupta.net]
 *Sent:* Friday, May 29, 2015 11:08 PM
 *To:* dev@lucene.apache.org
 *Subject:* Moving to git?



 I know this has come up a few times in the past but I wanted to bring
 this up again.



 The lucene-solr ASF git mirror has been behind by about a day. I was
 speaking with the infra people and they say that the size of the repo needs
 more and more ram. Forcing a sync causes a fork-bomb:



 Can't fork: Cannot allocate memory at /usr/share/perl5/Git.pm line 1517.



 They tried a few things but it's almost certain that it needs even more
 RAM, which still is a band-aid as they'd soon need even more RAM. Also,
 adding RAM involves downtime for git.a.o which needs to be planned. As a
 stop gap arrangement attached a volume to the instance and are using it as
 swap to work around the adding RAM requires restart issue.



 FAQ: How would the memory requirement change if we moved to git instead
 of mirroring?

 Answer: svn -> git mirroring is a weird process and has quite the memory
 leak. Using git directly is much cleaner.



 I personally think git does make things easier to manage when you're
 working on multiple overlapping things and so we should re-evaluate moving
 to it. I would have been fine had the mirroring worked, as all I want is a
 way to be able to work on multiple (local) branches without having to
 create and maintain directories like: lucene-solr-trunk1,
 lucene-solr-trunk2, or SOLR-, etc.



 Opinions?





 --

 Anshum Gupta


 --
- Mark
about.me/markrmiller


Re: Moving to git?

2015-05-30 Thread Robert Muir
A git clone is just too slow right now the way it's set up. So what will
be done to fix that? Currently, svn is way faster in the worst case.
In the time it takes to git clone, I can do 10 svn checkouts.

I sometimes use git, but usually when working on software, i don't
work on trivial things. I do nontrivial stuff and sometimes shit
breaks, including git workspaces and including git itself. On average
do I get  10 branches per clone() ? Not sure.

I dont know if its because there used to be JAR files in the source
trees back in the day or what, but fixing this is really mandatory.
Remember for a new developer its also something they must always do,
so its not just when 'git fucks up for me', it is a slow operation
that occasionally people must deal with. And its just too slow.

git clone:
Cloning into 'lucene-solr'...
remote: Counting objects: 563630, done.
remote: Compressing objects: 100% (136240/136240), done.
remote: Total 563630 (delta 329023), reused 539045 (delta 304604)
Receiving objects: 100% (563630/563630), 356.90 MiB | 1.79 MiB/s, done.
Resolving deltas: 100% (329023/329023), done.
Checking connectivity... done.

real    3m43.157s
user    0m38.243s
sys     0m10.480s

svn co:
Checked out revision 1682598.
real    0m34.713s
user    0m5.504s
sys     0m4.170s

PS don't tell me i dont know how to use git correctly, i dont care to
hear your religious arguments. A clone is still the worst case
operation, so it must be supported. Its also the first thing someone
must do, if they want to work with the codebase. And with git right
now, its broken (too slow) for lucene-solr.

On Fri, May 29, 2015 at 5:07 PM, Anshum Gupta ans...@anshumgupta.net wrote:
 I know this has come up a few times in the past but I wanted to bring this
 up again.

 The lucene-solr ASF git mirror has been behind by about a day. I was
 speaking with the infra people and they say that the size of the repo needs
 more and more ram. Forcing a sync causes a fork-bomb:

 Can't fork: Cannot allocate memory at /usr/share/perl5/Git.pm line 1517.

 They tried a few things but it's almost certain that it needs even more RAM,
 which still is a band-aid as they'd soon need even more RAM. Also, adding
 RAM involves downtime for git.a.o which needs to be planned. As a stop gap
 arrangement attached a volume to the instance and are using it as swap to
 work around the adding RAM requires restart issue.

 FAQ: How would the memory requirement change if we moved to git instead of
 mirroring?
 Answer: svn -> git mirroring is a weird process and has quite the memory
 leak. Using git directly is much cleaner.

 I personally think git does make things easier to manage when you're working
 on multiple overlapping things and so we should re-evaluate moving to it. I
 would have been fine had the mirroring worked, as all I want is a way to be
 able to work on multiple (local) branches without having to create and
 maintain directories like: lucene-solr-trunk1, lucene-solr-trunk2, or
 SOLR-, etc.

 Opinions?


 --
 Anshum Gupta




Re: Moving to git?

2015-05-30 Thread Adrien Grand
The main benefit I see is that external contributors would get their
name in the commit log.

However on the other hand, I'm a bit annoyed that people easily
disagree on the workflow: some people merge into the maintenance
branch first and then to master, other people merge into master first
and then cherry-pick, other people prefer rebasing instead of merging,
etc. I personally don't really care but if we agree on moving to Git,
I hope we can agree on the workflow at the same time. At least today
with svn we have something simple that everybody agrees on.
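
To make one of those options concrete, here is a sketch of the "master first, then cherry-pick" variant, assuming a throwaway local repository. The maintenance branch name `branch_5x` and the file names are hypothetical, and this is an illustration of the mechanics, not a proposal for the project's workflow:

```shell
#!/bin/sh
set -eu
# Throwaway repository; all names are invented for this sketch.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of local defaults
git config user.name "Example"
git config user.email "example@example.invalid"

echo "base" > code.txt
git add code.txt
git commit -q -m "base"

# Hypothetical maintenance branch cut from the initial commit.
git branch branch_5x

# An unrelated feature lands on master first, then the bug fix.
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "new feature (master only)"
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "bug fix"
fix_commit=$(git rev-parse HEAD)

# Backport only the fix: cherry-pick copies that single commit onto branch_5x.
git checkout -q branch_5x
git cherry-pick "$fix_commit" >/dev/null

merge_commits=$(git rev-list --merges --count branch_5x)
has_fix=$(test -f fix.txt && echo yes || echo no)
has_feature=$(test -f feature.txt && echo yes || echo no)

echo "merge commits on branch_5x: $merge_commits"
echo "fix present: $has_fix, feature present: $has_feature"
```

The maintenance branch ends up with the fix but not the feature, and its history stays linear. A merge-based workflow would instead record a merge commit on the receiving branch, which is exactly the kind of difference worth agreeing on up front.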

-0: I'm not against it but Subversion works well for me today. If
everybody else agrees on switching to Git I would like us to agree on
the workflow as well.

-- 
Adrien




RE: Moving to git?

2015-05-30 Thread Uwe Schindler
Hi,

 

I think most people say that GIT is easier or better to use because they 
combine in their mind using „GIT“ with „the Github user interface“.

 

This is indeed very nice to have - I (for myself) am also very happy with using
Github, as long as it stays simple (you only have users from Github sending you
pull requests, you only need to push to one location, …). On the other hand,
I always get annoyed that you cannot do the same as “svn update” or “svn
commit” in one step. You always have to pull first and then update, or vice
versa first commit and then push. If you are online there is no reason to have
this separated, especially if you are a “committer”. If you are a contributor,
that's fine – because you cannot push, but for committers this is what
subversion people like me hate. And all these additional steps are not useful
to a “centralized” infrastructure like the ASF.

 

As said before, at the ASF, we don’t get the Github interface as the “main
primary user interface”, because the “committers” have to use the official ASF
git installation to push. So we are still not able to easily handle pull
requests on Github and so on. So there is nothing useful (except the
aforementioned: no longer needing multiple checkouts locally) about GIT. The
big downside is: without the Github web interface, GIT is unusable to me,
sorry. The command line is a disaster, the technical concepts behind GIT are a
disaster; everything is a disaster :-)

 

I would plus one to use Git, if we would solely use “GitHub” as central 
repository (like Elasticsearch), but with having the “central infrastructure” 
at the ASF:

Clear -1 !!!

 

Uwe

 

-

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen

http://www.thetaphi.de

eMail: u...@thetaphi.de

 

From: Anshum Gupta [mailto:ans...@anshumgupta.net] 
Sent: Friday, May 29, 2015 11:08 PM
To: dev@lucene.apache.org
Subject: Moving to git?

 

I know this has come up a few times in the past but I wanted to bring this up 
again.

 

The lucene-solr ASF git mirror has been behind by about a day. I was speaking 
with the infra people and they say that the size of the repo needs more and 
more ram. Forcing a sync causes a fork-bomb:

 

Can't fork: Cannot allocate memory at /usr/share/perl5/Git.pm line 1517.

 

They tried a few things but it's almost certain that it needs even more RAM, 
which still is a band-aid as they'd soon need even more RAM. Also, adding RAM 
involves downtime for git.a.o which needs to be planned. As a stop gap 
arrangement attached a volume to the instance and are using it as swap to work 
around the adding RAM requires restart issue.

 

FAQ: How would the memory requirement change if we moved to git instead of 
mirroring?

Answer: svn -> git mirroring is a weird process and has quite the memory leak.
Using git directly is much cleaner.

 

I personally think git does make things easier to manage when you're working on 
multiple overlapping things and so we should re-evaluate moving to it. I would 
have been fine had the mirroring worked, as all I want is a way to be able to 
work on multiple (local) branches without having to create and maintain 
directories like: lucene-solr-trunk1, lucene-solr-trunk2, or SOLR-, etc.

 

Opinions?

 

 

-- 

Anshum Gupta



Re: Moving to git?

2015-05-30 Thread Shawn Heisey
On 5/30/2015 6:59 AM, Adrien Grand wrote:
 The main benefit I see is that external contributors would get their
 name in the commit log.
 
 However on the other hand, I'm a bit annoyed that people easily
 disagree on the workflow: some people merge into the maintenance
 branch first and then to master, other people merge into master first
 and then cherry-pick, other people prefer rebasing instead of merging,
 etc. I personally don't really care but if we agree on moving to Git,
 I hope we can agree on the workflow at the same time. At least today
 with svn we have something simple that everybody agrees on.
 
 -0: I'm not against it but Subversion works well for me today. If
 everybody else agrees on switching to Git I would like us to agree on
 the workflow as well.

-0 is my vote as well, as long as we take Adrien's advice about the
workflow.  The normal workflow must be thoroughly documented if we are
going to change our version control tool.  There will likely be
deviations from that workflow required, and if specific deviations are
required for common-but-not-entirely-normal situations, those should be
documented as well.

Additional TL;DR detail about my thoughts and findings below:

The impression that I have gotten from watching these religious wars
over version control is that subversion is superior at absolute
correctness and faithful maintenance of the version history in the face
of problems, while git excels when you have a very large number of
people who contribute code but only a few who have write access, or for
people who work on a lot of different things in different branches.  My
impression may not be correct, and I'm absolutely sure that reality is a
lot more complex.

I was not aware of the speed differences Robert noted.  I conducted my
own timing tests on my 7 megabit DSL connection:

time git clone git://git.apache.org/lucene-solr.git test.git
real    9m20.437s
user    0m43.288s
sys     0m12.192s

time svn co https://svn.apache.org/repos/asf/lucene/dev/trunk test.svn
real    2m16.505s
user    0m6.794s
sys     0m5.267s

If I ever reach a point where I am working on multiple code trees, I
expect that I will have them in separate directories because that will
help me keep them straight.  I gather from comments on this thread that
git will let you keep them all in one repo, but I think I would get
myself into trouble doing that, working on the wrong one accidentally.

There is a significant (nearly two to one) size difference between the
cloned git repo and an equivalent svn checkout, although to be honest,
with today's typical storage sizes, this doesn't matter all that much,
unless you are maintaining separate directories for multiple git code
trees, as I probably would.

elyograg@sauron:~/asf$ du -sm test.*
498 test.git
252 test.svn

I ran git gc --aggressive which reduced the repo by 55MB.
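
For what it's worth, the size figures above work out as follows. This is a small arithmetic sketch using only the numbers reported here; the variable names are mine:

```shell
#!/bin/sh
git_mb=498         # du -sm test.git
svn_mb=252         # du -sm test.svn
gc_saving_mb=55    # reduction reported after git gc --aggressive

git_after_gc=$((git_mb - gc_saving_mb))
# awk does the floating-point division that plain sh arithmetic lacks.
ratio_before=$(LC_ALL=C awk -v g="$git_mb" -v s="$svn_mb" 'BEGIN { printf "%.2f", g / s }')
ratio_after=$(LC_ALL=C awk -v g="$git_after_gc" -v s="$svn_mb" 'BEGIN { printf "%.2f", g / s }')

echo "git/svn before gc: $ratio_before"
echo "git after gc:      ${git_after_gc} MB"
echo "git/svn after gc:  $ratio_after"
```

So the aggressive gc narrows the gap somewhat, but the clone remains noticeably larger than a single trunk checkout.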

I do use git at work for the code that I write there, and for version
control on configuration files for various server installations.  Those
repos are *nowhere* near as big as lucene-solr, though.

Thanks,
Shawn





Re: Moving to git?

2015-05-29 Thread Yonik Seeley
+1 to move to git!

-Yonik


On Fri, May 29, 2015 at 5:07 PM, Anshum Gupta ans...@anshumgupta.net wrote:
 I know this has come up a few times in the past but I wanted to bring this
 up again.

 The lucene-solr ASF git mirror has been behind by about a day. I was
 speaking with the infra people and they say that the size of the repo needs
 more and more ram. Forcing a sync causes a fork-bomb:

 Can't fork: Cannot allocate memory at /usr/share/perl5/Git.pm line 1517.

 They tried a few things but it's almost certain that it needs even more RAM,
 which still is a band-aid as they'd soon need even more RAM. Also, adding
 RAM involves downtime for git.a.o which needs to be planned. As a stop gap
 arrangement attached a volume to the instance and are using it as swap to
 work around the adding RAM requires restart issue.

 FAQ: How would the memory requirement change if we moved to git instead of
 mirroring?
 Answer: svn -> git mirroring is a weird process and has quite the memory
 leak. Using git directly is much cleaner.

 I personally think git does make things easier to manage when you're working
 on multiple overlapping things and so we should re-evaluate moving to it. I
 would have been fine had the mirroring worked, as all I want is a way to be
 able to work on multiple (local) branches without having to create and
 maintain directories like: lucene-solr-trunk1, lucene-solr-trunk2, or
 SOLR-, etc.

 Opinions?


 --
 Anshum Gupta




Re: Moving to git?

2015-05-29 Thread Yonik Seeley
On Fri, May 29, 2015 at 9:40 PM, Walter Underwood wun...@wunderwood.org wrote:
 “git breaks when it tries to mirror” is not a convincing argument for moving
 to git.

I'd be +1 without that annoyance as well.  As Anshum mentioned, this
has come up a number of times in the past.

-Yonik




Re: Moving to git?

2015-05-29 Thread Walter Underwood
There may be other good reasons for using git, but this is not one.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

On May 29, 2015, at 6:57 PM, Yonik Seeley ysee...@gmail.com wrote:

 On Fri, May 29, 2015 at 9:40 PM, Walter Underwood wun...@wunderwood.org 
 wrote:
 “git breaks when it tries to mirror” is not a convincing argument for moving
 to git.
 
 I'd be +1 without that annoyance as well.  As Anshum mentioned, this
 has come up a number of times in the past.
 
 -Yonik
 
 



Re: Moving to git?

2015-05-29 Thread Ishan Chattopadhyaya
Life is so much easier on long train/plane journeys with Git. +1.

On Sat, May 30, 2015 at 9:21 AM, Shai Erera ser...@gmail.com wrote:

 +1 to moving to git.

 Shai
 On May 30, 2015 6:24 AM, Anshum Gupta ans...@anshumgupta.net wrote:

 * There may be other good reasons for using git, but this is not one.*
 I just added one more to the list. I think most other reasons have
 already been spoken about in previous discussions. I'm not trying to debate
 on what is better (I think it's a lot to do with *opinion*).

 I think it's a reasonable thing to move to a system that allows for
 distributed version control and makes working on multiple things at the
 same time easy. But again, that's my thought. The last time the discussion
 came up, I was +1 to moving and wasn't already using it a lot. Right now,
 I'm just trying to work on multiple things and find git easier for that
 purpose.

 I just wanted to bring this back up and see if the opinion of active
 contributors has changed since the last time by means of a polite and
 friendly discussion. In the end, we can agree to disagree but it'd be
 better than not discussing at all. :-)


 On Fri, May 29, 2015 at 7:24 PM, Walter Underwood wun...@wunderwood.org
 wrote:

 There may be other good reasons for using git, but this is not one.

 wunder
 Walter Underwood
 wun...@wunderwood.org
 http://observer.wunderwood.org/  (my blog)

 On May 29, 2015, at 6:57 PM, Yonik Seeley ysee...@gmail.com wrote:

 On Fri, May 29, 2015 at 9:40 PM, Walter Underwood wun...@wunderwood.org
 wrote:

 “git breaks when it tries to mirror” is not a convincing argument for
 moving
 to git.


 I'd be +1 without that annoyance as well.  As Anshum mentioned, this
 has come up a number of times in the past.

 -Yonik






 --
 Anshum Gupta




Re: Moving to git?

2015-05-29 Thread Walter Underwood
I’m not a committer, but I’ve built production code with a lot of source 
control systems and git is by far the most cumbersome. It does one thing 
well, handling untrusted contributors. With trusted committers, Subversion is 
very nice, thank you.

Here are the systems I’ve used.

* SCCS
* RCS
* HP history manager
* ClearCase
* CVS
* Perforce
* Subversion
* git

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


On May 29, 2015, at 8:58 PM, Ishan Chattopadhyaya ichattopadhy...@gmail.com 
wrote:

 Life is so much easier on long train/plane journeys with Git. +1.
 
 On Sat, May 30, 2015 at 9:21 AM, Shai Erera ser...@gmail.com wrote:
 +1 to moving to git.
 
 Shai
 
 On May 30, 2015 6:24 AM, Anshum Gupta ans...@anshumgupta.net wrote:
  There may be other good reasons for using git, but this is not one.
 I just added one more to the list. I think most other reasons have already 
 been spoken about in previous discussions. I'm not trying to debate on what 
 is better (I think it's a lot to do with *opinion*).
 
 I think it's a reasonable thing to move to a system that allows for 
 distributed version control and makes working on multiple things at the same 
 time easy. But again, that's my thought. The last time the discussion came 
 up, I was +1 to moving and wasn't already using it a lot. Right now, I'm just 
 trying to work on multiple things and find git easier for that purpose.
 
 I just wanted to bring this back up and see if the opinion of active 
 contributors has changed since the last time by means of a polite and 
 friendly discussion. In the end, we can agree to disagree but it'd be better 
 than not discussing at all. :-)
 
 
 On Fri, May 29, 2015 at 7:24 PM, Walter Underwood wun...@wunderwood.org 
 wrote:
 There may be other good reasons for using git, but this is not one.
 
 wunder
 Walter Underwood
 wun...@wunderwood.org
 http://observer.wunderwood.org/  (my blog)
 
 On May 29, 2015, at 6:57 PM, Yonik Seeley ysee...@gmail.com wrote:
 
 On Fri, May 29, 2015 at 9:40 PM, Walter Underwood wun...@wunderwood.org 
 wrote:
 “git breaks when it tries to mirror” is not a convincing argument for moving
 to git.
 
 I'd be +1 without that annoyance as well.  As Anshum mentioned, this
 has come up a number of times in the past.
 
 -Yonik
 
 
 
 
 
 
 -- 
 Anshum Gupta
 



Re: Moving to git?

2015-05-29 Thread Anshum Gupta
* There may be other good reasons for using git, but this is not one.*
I just added one more to the list. I think most other reasons have already
been spoken about in previous discussions. I'm not trying to debate on what
is better (I think it's a lot to do with *opinion*).

I think it's a reasonable thing to move to a system that allows for
distributed version control and makes working on multiple things at the
same time easy. But again, that's my thought. The last time the discussion
came up, I was +1 to moving and wasn't already using it a lot. Right now,
I'm just trying to work on multiple things and find git easier for that
purpose.

I just wanted to bring this back up and see if the opinion of active
contributors has changed since the last time by means of a polite and
friendly discussion. In the end, we can agree to disagree but it'd be
better than not discussing at all. :-)


On Fri, May 29, 2015 at 7:24 PM, Walter Underwood wun...@wunderwood.org
wrote:

 There may be other good reasons for using git, but this is not one.

 wunder
 Walter Underwood
 wun...@wunderwood.org
 http://observer.wunderwood.org/  (my blog)

 On May 29, 2015, at 6:57 PM, Yonik Seeley ysee...@gmail.com wrote:

 On Fri, May 29, 2015 at 9:40 PM, Walter Underwood wun...@wunderwood.org
 wrote:

 “git breaks when it tries to mirror” is not a convincing argument for
 moving
 to git.


 I'd be +1 without that annoyance as well.  As Anshum mentioned, this
 has come up a number of times in the past.

 -Yonik






-- 
Anshum Gupta


Re: Moving to git?

2015-05-29 Thread Walter Underwood
“git breaks when it tries to mirror” is not a convincing argument for moving to 
git. It might be an argument for fixing the mirroring in git.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

On May 29, 2015, at 6:03 PM, Yonik Seeley ysee...@gmail.com wrote:

 +1 to move to git!
 
 -Yonik
 
 On Fri, May 29, 2015 at 5:07 PM, Anshum Gupta ans...@anshumgupta.net wrote:
 I know this has come up a few times in the past but I wanted to bring this
 up again.
 
 The lucene-solr ASF git mirror has been behind by about a day. I was
 speaking with the infra people and they say that the size of the repo needs
 more and more RAM. Forcing a sync causes a fork-bomb:
 
 Can't fork: Cannot allocate memory at /usr/share/perl5/Git.pm line 1517.
 
 They tried a few things but it's almost certain that it needs even more RAM,
 which still is a band-aid as they'd soon need even more RAM. Also, adding
 RAM involves downtime for git.a.o which needs to be planned. As a stopgap
 arrangement, they attached a volume to the instance and are using it as swap
 to work around the fact that adding RAM requires a restart.
 
 FAQ: How would the memory requirement change if we moved to git instead of
 mirroring?
 Answer: svn -> git mirroring is a weird process and has quite the memory
 leak. Using git directly is much cleaner.
 
 I personally think git does make things easier to manage when you're working
 on multiple overlapping things and so we should re-evaluate moving to it. I
 would have been fine had the mirroring worked, as all I want is a way to be
 able to work on multiple (local) branches without having to create and
 maintain directories like: lucene-solr-trunk1, lucene-solr-trunk2, or
 SOLR-, etc.
 
 Opinions?
 
 
 --
 Anshum Gupta
 
 



Re: Moving to git?

2015-05-29 Thread Shai Erera
+1 to moving to git.

Shai
On May 30, 2015 6:24 AM, Anshum Gupta ans...@anshumgupta.net wrote:

 * There may be other good reasons for using git, but this is not one.*
 I just added one more to the list. I think most other reasons have already
 been spoken about in previous discussions. I'm not trying to debate on what
 is better (I think it's a lot to do with *opinion*).

 I think it's a reasonable thing to move to a system that allows for
 distributed version control and makes working on multiple things at the
 same time easy. But again, that's my thought. The last time the discussion
 came up, I was +1 to moving and wasn't already using it a lot. Right now,
 I'm just trying to work on multiple things and find git easier for that
 purpose.

 I just wanted to bring this back up and see if the opinion of active
 contributors has changed since the last time by means of a polite and
 friendly discussion. In the end, we can agree to disagree but it'd be
 better than not discussing at all. :-)


 On Fri, May 29, 2015 at 7:24 PM, Walter Underwood wun...@wunderwood.org
 wrote:

 There may be other good reasons for using git, but this is not one.

 wunder
 Walter Underwood
 wun...@wunderwood.org
 http://observer.wunderwood.org/  (my blog)

 On May 29, 2015, at 6:57 PM, Yonik Seeley ysee...@gmail.com wrote:

 On Fri, May 29, 2015 at 9:40 PM, Walter Underwood wun...@wunderwood.org
 wrote:

 “git breaks when it tries to mirror” is not a convincing argument for
 moving
 to git.


 I'd be +1 without that annoyance as well.  As Anshum mentioned, this
 has come up a number of times in the past.

 -Yonik






 --
 Anshum Gupta



Re: Moving to git?

2015-05-29 Thread david.w.smi...@gmail.com
+1 to git.  Git may not be perfect, but SVN isn’t either.

On Fri, May 29, 2015 at 11:59 PM Ishan Chattopadhyaya 
ichattopadhy...@gmail.com wrote:

 Life is so much easier on long train/plane journeys with Git. +1.

 On Sat, May 30, 2015 at 9:21 AM, Shai Erera ser...@gmail.com wrote:

 +1 to moving to git.

 Shai
 On May 30, 2015 6:24 AM, Anshum Gupta ans...@anshumgupta.net wrote:

 * There may be other good reasons for using git, but this is not one.*
 I just added one more to the list. I think most other reasons have
 already been spoken about in previous discussions. I'm not trying to debate
 on what is better (I think it's a lot to do with *opinion*).

 I think it's a reasonable thing to move to a system that allows for
 distributed version control and makes working on multiple things at the
 same time easy. But again, that's my thought. The last time the discussion
 came up, I was +1 to moving and wasn't already using it a lot. Right now,
 I'm just trying to work on multiple things and find git easier for that
 purpose.

 I just wanted to bring this back up and see if the opinion of active
 contributors has changed since the last time by means of a polite and
 friendly discussion. In the end, we can agree to disagree but it'd be
 better than not discussing at all. :-)


 On Fri, May 29, 2015 at 7:24 PM, Walter Underwood wun...@wunderwood.org
  wrote:

 There may be other good reasons for using git, but this is not one.

 wunder
 Walter Underwood
 wun...@wunderwood.org
 http://observer.wunderwood.org/  (my blog)

 On May 29, 2015, at 6:57 PM, Yonik Seeley ysee...@gmail.com wrote:

 On Fri, May 29, 2015 at 9:40 PM, Walter Underwood 
 wun...@wunderwood.org wrote:

 “git breaks when it tries to mirror” is not a convincing argument for
 moving
 to git.


 I'd be +1 without that annoyance as well.  As Anshum mentioned, this
 has come up a number of times in the past.

 -Yonik






 --
 Anshum Gupta





Moving to git?

2015-05-29 Thread Anshum Gupta
I know this has come up a few times in the past but I wanted to bring this
up again.

The lucene-solr ASF git mirror has been behind by about a day. I was
speaking with the infra people and they say that the size of the repo needs
more and more RAM. Forcing a sync causes a fork-bomb:

Can't fork: Cannot allocate memory at /usr/share/perl5/Git.pm line 1517.

They tried a few things but it's almost certain that it needs even more
RAM, which still is a band-aid as they'd soon need even more RAM. Also,
adding RAM involves downtime for git.a.o which needs to be planned. As a
stopgap arrangement, they attached a volume to the instance and are using it
as swap to work around the fact that adding RAM requires a restart.

FAQ: How would the memory requirement change if we moved to git instead of
mirroring?
Answer: svn -> git mirroring is a weird process and has quite the memory
leak. Using git directly is much cleaner.

I personally think git does make things easier to manage when you're
working on multiple overlapping things and so we should re-evaluate moving
to it. I would have been fine had the mirroring worked, as all I want is a
way to be able to work on multiple (local) branches without having to
create and maintain directories like: lucene-solr-trunk1,
lucene-solr-trunk2, or SOLR-, etc.

Opinions?


-- 
Anshum Gupta