Reviving this very old thread, since this is still very much a problem in
Jenkins core a decade later. As I commented here
<https://stackoverflow.com/questions/21268327/rsync-alternative-to-jenkins-copy-artifacts-plugin#comment117378767_25530456>,
I'm seeing massive (~13x) performance gains by replacing copyArtifact with
a shell call to curl or wget in my pipelines.

As I understand it, copyArtifact uses a single Jenkins "control channel",
which has severely limited I/O and/or CPU resources, and this has been the
case as far back as I can see. This not only makes copying artifacts from
controller to agent sluggish, it is also a major factor in the similarly
abysmal performance of archiving artifacts in the other direction
(artifact compression being the other factor).

I am experimenting with workarounds. Until I can install a proper
artifact management system and replace all archive/copyArtifact calls with
calls to its REST API (which I'll be doing later this year), I'm hoping to
find a quick alternative. The two I'm considering at the moment are:
1. HTTP GET each artifact URL in question via curl, wget, etc.
   1. This is nice because it can use the same semantics I was already
      using with copyArtifact, that is, jobName, branchName, and the
      lastSuccessfulBuild symlinks.
   2. This is great for known individual artifacts, but *HTTP requires
      significant extra complexity to fetch whole artifact folders or
      artifacts matching a wildcard/regex, as copyArtifact supports*. HTTP
      doesn't have a notion of a directory, so you have to pre-process by
      fetching an artifact index page, parsing it, and looping over the
      results.
      1. This answer <https://stackoverflow.com/a/31434010/532621> says
         that Jenkins supports fetching a zip of any folder over HTTP, but
         that's not working for me on Jenkins 2.249.2.
   3. Another problem here is that you have to deal with Jenkins
      authentication / API tokens.
2. scp/rsync supports rich file/directory pattern matching.
   1. However, it requires knowledge of the location of the artifacts on
      the controller's disk. This is non-trivial for multibranch pipeline
      projects (which I use liberally). *scp would be an obvious choice if
      I could figure out how to deterministically construct the path to a
      multibranch pipeline branch job on the controller's disk.*
   2. Authentication is trivial since all users/config in my Jenkins
      infra are managed by Ansible, so my jenkins user can automatically
      ssh to any other node in the infra without a password.
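
To make option 1 concrete, here is a minimal Python sketch of the "fetch an index, filter, loop" approach. It assumes the build's JSON API at `<build>/api/json` lists artifacts with a `relativePath` field and that each artifact is served under `<build>/artifact/<relativePath>`. The function names (`build_url`, `select_artifacts`, `fetch_artifacts`), the example host, and the single percent-encoding of job and branch names are illustrative assumptions, not something verified against Jenkins 2.249.2; branch names containing "/" in particular may need different encoding.

```python
import base64
import fnmatch
import json
import os
import shutil
import urllib.parse
import urllib.request


def build_url(base_url, job_name, branch_name=None, build="lastSuccessfulBuild"):
    """Return a build URL such as .../job/<job>/job/<branch>/lastSuccessfulBuild."""
    parts = [job_name] + ([branch_name] if branch_name else [])
    # Assumption: each name is percent-encoded exactly once.
    path = "/".join("job/" + urllib.parse.quote(p, safe="") for p in parts)
    return f"{base_url.rstrip('/')}/{path}/{build}"


def select_artifacts(build_json, pattern):
    """Filter artifact paths with a shell-style wildcard.

    Note: fnmatch's "*" also matches "/", unlike Ant-style globs.
    """
    return [a["relativePath"] for a in build_json.get("artifacts", [])
            if fnmatch.fnmatchcase(a["relativePath"], pattern)]


def _open(url, user, api_token):
    # HTTP basic auth with a Jenkins user name and API token.
    cred = base64.b64encode(f"{user}:{api_token}".encode()).decode()
    req = urllib.request.Request(url, headers={"Authorization": "Basic " + cred})
    return urllib.request.urlopen(req)


def fetch_artifacts(base_url, job_name, branch_name, pattern,
                    user, api_token, dest="."):
    """Download every artifact of the last successful build matching pattern."""
    build = build_url(base_url, job_name, branch_name)
    query = urllib.parse.urlencode({"tree": "artifacts[relativePath]"})
    with _open(f"{build}/api/json?{query}", user, api_token) as resp:
        wanted = select_artifacts(json.load(resp), pattern)
    for rel in wanted:
        target = os.path.join(dest, rel)
        os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
        url = f"{build}/artifact/{urllib.parse.quote(rel)}"
        with _open(url, user, api_token) as resp, open(target, "wb") as out:
            shutil.copyfileobj(resp, out)  # stream to disk in chunks
    return wanted
```

A call might look like `fetch_artifacts("https://jenkins.example.com", "my-project", "main", "dist/*", user, token)`, where the host, job, and branch are placeholders. Each file is a plain HTTP download, so nothing goes through the control channel, but this does not remove the need to manage an API token.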

Any insight into ways of replacing copyArtifact with curl/scp would be
greatly appreciated. Thanks for your time.

On Friday, October 21, 2011 at 7:44:50 AM UTC-7 David Karlsen wrote:
> No idea.
> Not even if the pull request was handled and put onto master.
>
> 2011/10/21 Marcelo Brunken <[email protected]>:
>
>
> > Any Ideas when that release comes out ?
> >
> > 2011/10/19 David Karlsen <[email protected]>
> >>
> >> It is also slow over ssh. I saw a fix and pull request for it here the
> >> other day - by using TCP nodelay. It has not been applied yet AFAIK.
> >>
> >> On 19 Oct 2011 at 11:36, "Marcelo Brunken" <[email protected]>
> >> wrote:
> >>>
> >>> Hello,
> >>> There are already a few tickets about this problem ... our bottleneck is
> >>> the copy process between slave and master. Is there a solution on the
> >>> way? Is someone working on it?
> >>> I am trying to figure out how it could be faster, perhaps by changing
> >>> the transfer protocol or something. HTTP sucks. (I am almost sure it is
> >>> sent via HTTP.)
> >>> Thanks
> >
>
> --
> David J. M. Karlsen - http://www.linkedin.com/in/davidkarlsen
>
--
You received this message because you are subscribed to the Google Groups
"Jenkins Developers" group.
To view this discussion on the web visit
https://groups.google.com/d/msgid/jenkinsci-dev/115f6473-9bff-4a95-b89f-d29579a51082n%40googlegroups.com.