I'm planning to import https://github.com/myui/incubator-hivemall
into the ASF repository tomorrow.
Let me know if that's NOT okay.

The GitHub tag/release issue is my concern, though:
https://lists.apache.org/thread.html/db78e1f8fc121d8e6b016d2f61d06ccafebf9fd30b4ec00883c78557@%3Clegal-discuss.apache.org%3E

I would like to retain the past git tags to keep track of changes.
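
To illustrate how tags can survive a history rewrite, here is a minimal
sketch on a throwaway repository. The directory layout, file names, and
tag names (v0.1, v0.2) are hypothetical and are not taken from Hivemall;
only the filter-branch options mirror the command quoted below.

```shell
#!/bin/sh
# Throwaway demo: every path, file, and tag name here is hypothetical.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # silence the deprecation notice in newer git

demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.invalid"
git config user.name "Demo"

# Two commits, each tagged; the first one adds a "binary" under lib/.
mkdir lib src
echo "binary" > lib/dep.jar
echo "code" > src/Main.java
git add .
git commit -q -m "initial import"
git tag v0.1

echo "more code" >> src/Main.java
git commit -q -am "second commit"
git tag v0.2

before=$(git rev-parse v0.1)

# Drop lib/ from every commit; "--tag-name-filter cat" re-creates each tag
# under the same name, pointing at the rewritten commit.
git filter-branch --index-filter \
    'git rm -r --cached --ignore-unmatch lib/' \
    --tag-name-filter cat --prune-empty -- --all >/dev/null 2>&1

after=$(git rev-parse v0.1)
tags=$(git tag -l | tr '\n' ' ')
files_in_v01=$(git ls-tree -r --name-only v0.1)

echo "tags:          $tags"
echo "v0.1 before:   $before"
echo "v0.1 after:    $after"
echo "files in v0.1: $files_in_v01"
```

The tag names are kept, but each tag now resolves to a new commit hash,
so anything that recorded the old hashes will no longer match.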

Thanks,
Makoto

2016-11-30 20:35 GMT+09:00 Makoto Yui <[email protected]>:
> I'm considering rewriting the history in the following way, because
> git push does not work from a shallow clone (maybe due to the ASF git
> server version/configuration).
>
> You can find the tested repository on 
> https://github.com/myui/incubator-hivemall
>
> $ git clone https://github.com/myui/hivemall.git incubator-hivemall
> $ cd incubator-hivemall
> $ git filter-branch --index-filter \
>     'git rm -r --cached --ignore-unmatch lib/ target/*.jar' \
>     --tag-name-filter cat --prune-empty -- --all
> $ rm -rf .git/refs/original/
> $ git reflog expire --expire=now --all
> $ git gc --aggressive --prune=now
> $ git remote set-url origin https://github.com/myui/incubator-hivemall.git
> $ git push -f -u origin master
> $ git push origin --tags --force
>
> $ git clone https://github.com/myui/incubator-hivemall.git
> $ cd incubator-hivemall
> $ git_find_big.sh | head -10
>
> All sizes are in kB's. The pack column is the size of the object,
> compressed, inside the pack file.
> size  pack  SHA                                       location
> 1391  1383  b8d432e6a3c0074951abd35caf0a777caf47afbf  xgboost/lib/xgboost4j_0.60-0.10.jar
> 765   303   11c617713ee2ad3f847aee7627ee8639c5a79667  core/src/test/resources/hivemall/mf/ml1k.train
> 639   613   de4e32983604238bc72fe3f6cb6beea76fde0e8d  src/site/resources/images/hivemall_overview_bg.png
> 382   117   8b66187fe067c3aa389ce8c98108f349ceae159c  src/site/resources/fonts/fontawesome-webfont.svg
> 220   192   04d8605fd8daaafa72a2b6dfa2a2d48c75c57a10  src/site/resources/images/asf_bg.png
> 194   186   fb29a3d2ee04b7981463de89a77ccc7436f4ad9a  docs/gitbook/resources/images/techstack.png
> 191   76    e00b1127f6fb4fdcc1606a20b05e16b5456acacc  core/src/test/resources/hivemall/mf/ml1k.test
> 149   88    f221e50a2ef60738ba30932d834530cdfe55cb3e  src/site/resources/fonts/fontawesome-webfont.ttf
>
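
The git_find_big.sh script itself is not shown in this thread. As a rough
stand-in, the sketch below lists the largest blobs using only standard git
plumbing. The function name list_big_blobs is hypothetical, and unlike the
listing above it reports uncompressed sizes in bytes rather than kB and
pack sizes.

```shell
# List the N largest blobs reachable from any ref, largest first.
# Output columns (tab-separated): size in bytes, object SHA, path.
# Hypothetical helper; not the git_find_big.sh used in the thread.
list_big_blobs() {
    count=${1:-10}
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(objectname) %(rest)' |
        awk '$1 == "blob" {
                 size = $2; sha = $3
                 sub(/^[^ ]+ [^ ]+ [^ ]+ /, "")   # keep only the path, spaces included
                 print size "\t" sha "\t" $0
             }' |
        sort -rn |
        head -n "$count"
}
```

Run inside a clone, `list_big_blobs 10` gives a quick check of whether any
large binaries are still reachable after the rewrite.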
> 2016-11-30 14:31 GMT+09:00 Makoto Yui <[email protected]>:
>> Hi Takeshi,
>>
>> I was about to perform the initial code dump, but stopped.
>>
>> Be aware that almost all commit hashes will change when the Git
>> history is rewritten with [1].
>> [1] git filter-branch --index-filter \
>>       'git rm -r --cached --ignore-unmatch lib/ target/*.jar' \
>>       --prune-empty -- --all
>>
>> So, I'm considering making a shallow copy limited to the last 100-300
>> commits or so (a range that does not include large binaries).
>>
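
A minimal sketch of what such a shallow copy looks like, using a throwaway
local repository. The paths and the depth of 2 are hypothetical and chosen
only to keep the demo small; the thread talks about a depth of roughly
100-300 commits.

```shell
#!/bin/sh
# Throwaway demo: repository names and the depth value are hypothetical.
set -e
work=$(mktemp -d)

# Build a source repository with five commits.
git init -q "$work/full"
cd "$work/full"
git config user.email "demo@example.invalid"
git config user.name "Demo"
for i in 1 2 3 4 5; do
    echo "$i" > file.txt
    git add file.txt
    git commit -q -m "commit $i"
done

# --depth only takes effect over a real transport, so use a file:// URL
# rather than a plain local path.
cd "$work"
git clone -q --depth 2 "file://$work/full" shallow
cd shallow

depth=$(git rev-list --count HEAD)
is_shallow=$(git rev-parse --is-shallow-repository)

echo "commits visible in shallow clone: $depth"
echo "is shallow repository: $is_shallow"
```

Only the most recent commits are present in the clone, which is why older
large binaries would not be carried over. Whether a push from such a clone
is accepted depends on the receiving server.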
>> Thanks,
>> Makoto
>>
>> 2016-11-30 2:44 GMT+09:00 Takeshi Yamamuro <[email protected]>:
>>> Hi, all
>>>
>>> I have no strong opinion either, but it seems better to keep as
>>> much activity (that is, commit logs) as possible there.
>>> I'm afraid that having few activity logs might lead newcomers to
>>> think that hivemall is inactive.
>>>
>>> As for rebasing, #285 is not hard to rebase (it is my own PR).
>>> So, rewriting the logs sounds good to me.
>>>
>>> // maropu
>>>
>>> On Tue, Nov 29, 2016 at 11:24 PM, Makoto Yui <[email protected]> wrote:
>>>
>>>> Kai,
>>>>
>>>> 2016-11-29 22:35 GMT+09:00 Kai Sasaki <[email protected]>:
>>>> > Currently we have 6 PRs and some of them (especially #285, #336 and #385)
>>>> > are relatively large.
>>>> > This might make rebasing somewhat troublesome.
>>>>
>>>> Yes, that's my concern.
>>>>
>>>> But such large PRs would be better contributed through the Apache
>>>> incubation process.
>>>> I'm considering inviting some of their authors to become Hivemall
>>>> committers.
>>>>
>>>> Another concern is moving the GitHub stars/watchers, as seen in [1].
>>>> [1] https://issues.apache.org/jira/browse/INFRA-12995
>>>>
>>>> > Do you think some of them are not ready to be merged? I think
>>>> > merging some of them before reflogging history can make the
>>>> > migration work easier. But if they are not ready, it's okay. We
>>>> > can work on rebasing after this work.
>>>>
>>>> I'm currently reviewing #385, but it needs to be revised in several
>>>> parts. Also, #336 requires a large refactoring.
>>>>
>>>> So, it is better to do the initial code dump first.
>>>>
>>>> A shallow-cloned repository can be pushed with git v1.9 and later
>>>> (I'm not sure about the ASF git version, though).
>>>> http://blogs.atlassian.com/2014/05/handle-big-repositories-git/
>>>>
>>>> Thanks,
>>>> Makoto
>>>>
>>>
>>>
>>>
>>> --
>>> ---
>>> Takeshi Yamamuro
