Re: Timeout with git clone
Hi,

Thanks for the answers and comments.

> Yes, I agree that it probably would be much better to go back to using
> dulwich both for protocol serving and for providing data for the web
> frontend, instead of forking out to git. Disclaimer: I don't know how
> fast dulwich is these days. It could perhaps also be relevant to
> research what other python git hosting solutions do.

Are there other Python git hosting solutions? The very reason I'm here is that I didn't really find anything else...

> If interested in contributing in this area, a first step could be to
> create a proof of concept of switching back to Dulwich and doing some
> benchmarks - both for local cloning with infinite network bandwidth
> (where I doubt dulwich can match pure git) and for more realistic remote
> internet bandwidth (where I guess it doesn't matter much).

Sounds like a good plan. I don't know if I'll find the time, but I'll try.

> But also note that subprocessio is no longer used only by pygrack. It is
> also used for run_git_command in
> kallithea/lib/vcs/backends/git/repository.py (introduced in
> 1f4d4b8d72f5), mainly for cloning and listing changesets. A full
> solution would require somehow replacing run_git_command with dulwich.
> But that can be done one at a time.

Yes, I'm aware of that.

Kind regards,
Quentin

___
kallithea-general mailing list
kallithea-general@sfconservancy.org
https://lists.sfconservancy.org/mailman/listinfo/kallithea-general
Re: Timeout with git clone
Hi

The code in this area has worked surprisingly well since Kallithea inherited it, even though it has regularly popped up needing tricky maintenance.

I agree it would be nice to refactor / reimplement this area. It is just that nobody has invested time or sponsorship in doing it. I guess it hasn't caused enough pain for anybody to justify it ;-)

Yes, I agree that it probably would be much better to go back to using dulwich both for protocol serving and for providing data for the web frontend, instead of forking out to git. Disclaimer: I don't know how fast dulwich is these days. It could perhaps also be relevant to research what other python git hosting solutions do.

If interested in contributing in this area, a first step could be to create a proof of concept of switching back to Dulwich and doing some benchmarks - both for local cloning with infinite network bandwidth (where I doubt dulwich can match pure git) and for more realistic remote internet bandwidth (where I guess it doesn't matter much).

But also note that subprocessio is no longer used only by pygrack. It is also used for run_git_command in kallithea/lib/vcs/backends/git/repository.py (introduced in 1f4d4b8d72f5), mainly for cloning and listing changesets. A full solution would require somehow replacing run_git_command with dulwich. But that can be done one at a time.

/Mads

On 18/04/2023 16:55, Quentin Wenger wrote:
> Digging a bit deeper:
>
> - The changeset that you linked (https://kallithea-scm.org/repos/kallithea/changeset/034e4fe1ebb2#rhodecodelibsubprocessiopy_n127) actually shows that historically it went the other way round, that is at first dulwich's server was used but then considered "buggy", therefore the implementation was replaced by some custom code.
> - That custom code looks like coming from https://github.com/dvdotsenko/git_http_backend.py. That repo hasn't been updated since 2012, neither do its forks show any sign of recent activity.
> - In contrast, dulwich, while officially still in beta, is actively developed.
>
> IMHO the proper move would be to go back to dulwich. Chances are that those buggy things have been fixed in the last ten years. And if they haven't, better report them upstream than reinvent the wheel.
>
> By the way, do we have any more precise idea of what was considered buggy at the time?
>
> What do you think?
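A minimal sketch of the benchmark suggested above, assuming the current server and a Dulwich-based proof of concept can each be reached at their own clone URL. The helper names `git_clone` and `time_clone` are hypothetical and not part of Kallithea; the only real tool involved is the `git clone` command.

```python
import shutil
import subprocess
import tempfile
import time
from pathlib import Path


def git_clone(url, target):
    """Clone `url` into `target` with the native git client."""
    subprocess.run(["git", "clone", "--quiet", url, target], check=True)


def time_clone(url, repeats=3, clone=git_clone):
    """Return the best wall-clock time (seconds) over `repeats` fresh clones.

    `clone` is injectable so that another client, or a stub, can be timed
    the same way.
    """
    timings = []
    for _ in range(repeats):
        # Every run gets its own scratch directory so no clone reuses data.
        workdir = tempfile.mkdtemp(prefix="clone-bench-")
        try:
            start = time.perf_counter()
            clone(url, str(Path(workdir) / "repo"))
            timings.append(time.perf_counter() - start)
        finally:
            shutil.rmtree(workdir, ignore_errors=True)
    return min(timings)
```

Calling `time_clone` once per server variant, first on the server machine itself and then over a typical internet link, would cover the two scenarios mentioned. Taking the minimum over several runs is one common way to reduce noise; a mean or median would also be defensible.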
Re: Timeout with git clone
Digging a bit deeper:

- The changeset that you linked (https://kallithea-scm.org/repos/kallithea/changeset/034e4fe1ebb2#rhodecodelibsubprocessiopy_n127) actually shows that historically it went the other way round: at first dulwich's server was used, but it was then considered "buggy", so the implementation was replaced by some custom code.
- That custom code appears to come from https://github.com/dvdotsenko/git_http_backend.py. That repo hasn't been updated since 2012, and its forks show no sign of recent activity either.
- In contrast, dulwich, while officially still in beta, is actively developed.

IMHO the proper move would be to go back to dulwich. Chances are that those buggy things have been fixed in the last ten years. And if they haven't, better to report them upstream than to reinvent the wheel.

By the way, do we have any more precise idea of what was considered buggy at the time?

What do you think?
Re: Timeout with git clone
Hi Mads,

I can try that, but I'm a bit worried that it is monkey-patching and half-solving at best. And the fact that this area is considered "obscure code" is even worse.

Trying to get a broader picture: There are comments like `TODO: This function now uses os underlying 'git' command which is generally not good.` all over the place. Maybe there should be a larger refactoring of the git backend taking place, where all uses of native Git are replaced by dulwich? That way the cryptic code in lib/vcs/subprocessio.py will also go away.

Is there any specific reason that those TODOs haven't been handled so far, apart from limited dev resources?

Thanks,
Quentin
Re: Timeout with git clone
Hi

I haven't seen that problem and can't reproduce it.

The wait for 10 seconds in some pretty obscure code came from a comment in https://kallithea-scm.org/repos/kallithea/changeset/034e4fe1ebb2#rhodecodelibsubprocessiopy_n127 before The Big Fork. The comment became reality in https://kallithea-scm.org/repos/kallithea/changeset/01aca0a4f876#kallithealibvcssubprocessiopy_n125 when moving to Python 3.

It doesn't seem to have caused problems so far, but I might have been too naive and trusted the comment too much.

Does it work better for you if you change it back:

         kr.wait(2)
    -    if not kr.wait(10):
    +    if len(t) > ccm + 3:
             raise IOError(
                 "Timed out while waiting for input from subprocess.")

I don't see why that should be a good change, but perhaps it fixes your issue. Please let me know if you think I should push https://kallithea-scm.org/repos/kallithea-incoming/changeset/35e5c3dcec22 .

/Mads

On 15/04/2023 01:21, Quentin Wenger wrote:
> Hi,
>
> When cloning a medium-sized repo (not extremely large but with a couple of heavy media files), I consistently get a timeout preventing the cloning from completing.
>
> Client:
>
> $ git clone https://user@domain/main_website
> Cloning into 'main_website'...
> Password for 'https://user@domain':
> remote: Enumerating objects: 10798, done.
> remote: Counting objects: 100% (10798/10798), done.
> remote: Compressing objects: 100% (5199/5199), done.
> fetch-pack: unexpected disconnect while reading sideband packet
> fatal: early EOF
> fatal: fetch-pack: invalid index-pack output
>
> The error occurs during the "Receiving objects:" phase, around 60%.
>
> Server log with DEBUG:
>
> 2023-04-14 19:05:26.748 INFO [kallithea.controllers.base] pull action on git repo "main_website" by "user" from IP
> 2023-04-14 19:05:26.748 DEBUG [kallithea.config.middleware.pygrack] handling cmd ['git', 'upload-pack', '--stateless-rpc', '/home/domain/hosting_kallithea/repos/main_website']
> Exception in thread Thread-6:
> Traceback (most recent call last):
>   File "/opt/alt/python310/lib64/python3.10/threading.py", line 1016, in _bootstrap_inner
>     self.run()
>   File "/home/domain/hosting_kallithea/source/kallithea/lib/vcs/subprocessio.py", line 129, in run
>     raise IOError(
> OSError: Timed out while waiting for input from subprocess.
> [UID:1552][1444643] Child process with pid: 1444662 was killed by signal: 15, core dumped: no
>
> Cloning via git+ssh directly instead of the https protocol works fine.
>
> Has this been experienced before? Is this just a matter of using a longer timeout value on line 128 of kallithea/lib/vcs/subprocessio.py? How was the value 10 seconds chosen in the first place? What about making it configurable if it is arbitrary?
>
> Thanks,
> Quentin
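To illustrate the question about a configurable timeout, here is a simplified sketch of a reader thread that buffers subprocess output and gives up when the buffer stays full for too long. It is not the code in kallithea/lib/vcs/subprocessio.py: the class name `BoundedPipeReader` and the `drain_timeout` parameter are hypothetical, the sketch uses `queue.Queue` instead of the `threading.Event` objects seen in the diff, and it assumes the 10-second wait guards the time the consumer takes to drain a full buffer.

```python
import os
import queue
import threading


class BoundedPipeReader(threading.Thread):
    """Read chunks from a file descriptor into a bounded queue.

    Hypothetical stand-in for illustration only. `drain_timeout` is the
    number of seconds the thread waits for room in a full queue; in a real
    deployment it could be read from a configuration setting.
    """

    def __init__(self, fd, chunk_size=4096, max_chunks=8, drain_timeout=10.0):
        super().__init__(daemon=True)
        self.fd = fd
        self.chunk_size = chunk_size
        self.drain_timeout = drain_timeout
        self.chunks = queue.Queue(maxsize=max_chunks)
        self.error = None

    def run(self):
        try:
            while True:
                chunk = os.read(self.fd, self.chunk_size)
                # An empty read means EOF; it is queued as the end marker.
                self.chunks.put(chunk, timeout=self.drain_timeout)
                if not chunk:
                    return
        except queue.Full:
            # The consumer did not make room within drain_timeout seconds.
            self.error = IOError(
                "Timed out while waiting for the consumer to drain the buffer.")

    def __iter__(self):
        while True:
            try:
                chunk = self.chunks.get(timeout=0.1)
            except queue.Empty:
                # Nothing buffered: surface a reader error if there was one.
                if self.error is not None:
                    raise self.error
                continue
            if not chunk:
                return
            yield chunk
```

With such a parameter, a slow client on a large repository could be accommodated by raising `drain_timeout` rather than editing a hard-coded value. Whether a longer timeout is the right fix, or only hides a stalled connection, is the open question in this thread.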