[JIRA] (JENKINS-48258) git client plugin occasionally fails with "text file busy" error

2019-01-28 Thread cun...@drivescale.com (JIRA)

Christopher Unkel commented on JENKINS-48258

FYI, I've been running a snapshot build of beta2 plus this fix for nearly 8 months. Before the fix I hit the bug several times a week. It has not happened a single time since I installed the snapshot version, so the fix has completely resolved the issue for me.

This message was sent by Atlassian Jira (v7.11.2#711002-sha1:fdc329d)

-- 
You received this message because you are subscribed to the Google Groups "Jenkins Issues" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-issues+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


[JIRA] (JENKINS-48258) git client plugin occasionally fails with "text file busy" error

2018-04-23 Thread cun...@drivescale.com (JIRA)

Christopher Unkel commented on JENKINS-48258

Yup, that's it.

Other discussion aside, I like the fix in PR313. It's a bit ugly to have to execute 'cp', but I think it will solve the problem completely, and with a fix local to the git client.

As for the Stack Overflow article, I'm not sure either, but I have at least two theories:

1. There's more than one JVM in the world: maybe OpenJDK doesn't set FD_CLOEXEC but other JVMs do.
2. Even with FD_CLOEXEC there's still a short race window between the fork() in the parent and the exec() in the child. On Linux this can be solved by using vfork() instead of fork(), but POSIX semantics don't guarantee that.
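
For readers who have not looked at PR313, here is a rough sketch of the copy-based idea mentioned above. It is an illustration under assumptions, not the PR's actual code: the class name `CopiedScript`, the method `createExecutableScript`, and the scratch-file naming are all hypothetical, and it assumes a Unix host with `cp` on the PATH. The point is only that the JVM never opens the final script for writing, so no child forked by another thread can inherit a write descriptor to it.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; not part of the git client plugin.
final class CopiedScript {
    /**
     * Writes the script to a scratch file, then lets an external cp create
     * the file that will actually be executed. Only the cp process ever
     * holds a write descriptor to the final script.
     */
    static Path createExecutableScript(Path dir, String contents)
            throws IOException, InterruptedException {
        Path scratch = Files.createTempFile(dir, "ssh", ".tmp");
        Path script = dir.resolve(
                scratch.getFileName().toString().replace(".tmp", ".sh"));
        try {
            // The scratch file may leak into other children; it is never executed.
            Files.write(scratch, contents.getBytes(StandardCharsets.UTF_8));
            Process cp = new ProcessBuilder("cp", scratch.toString(), script.toString())
                    .inheritIO()
                    .start();
            int status = cp.waitFor();
            if (status != 0) {
                throw new IOException("cp exited with status " + status);
            }
        } finally {
            Files.deleteIfExists(scratch);
        }
        // Changing the mode does not open the file for writing.
        if (!script.toFile().setExecutable(true, true)) {
            throw new IOException("could not mark " + script + " executable");
        }
        return script;
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("gitclient-demo");
        Path script = createExecutableScript(dir, "#!/bin/sh\nexit 0\n");
        System.out.println("created " + script);
    }
}
```

The cost is one extra process launch per script, which is the "a bit ugly" part; the benefit is that nothing outside the git client has to change.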

[JIRA] (JENKINS-48258) git client plugin occasionally fails with "text file busy" error

2018-04-23 Thread cun...@drivescale.com (JIRA)

Christopher Unkel commented on JENKINS-48258

Mark Waite: this bug cannot be understood by thinking about the behavior of one git client thread in isolation. The straight-line code in the git client is correct, and it always closes the unique temporary file before the call to command-line git. The problem is that other threads in the JVM may also run commands and create subprocesses, and creating a subprocess duplicates the open file descriptors. It is one of those duplicates that is still open, not the descriptor opened by the git client code.

To be more explicit, imagine that I have two build jobs running: job 1 needs to do a git checkout, and job 2 needs to run make. Say that Jenkins is running as process ID 1000, thread 1 is running the git checkout, and thread 2 is running the make. Here's a thread/process execution interleaving in which the bug manifests:

1. Process 1000, thread 1: open ssh123456.sh for writing, file descriptor 4.
2. Process 1000, thread 2: fork() in preparation to run make, creating process 1001. Process 1001 inherits file descriptor 4, open for writing to ssh123456.sh.
3. Process 1001: exec() make.
4. Process 1000, thread 1: write contents of ssh123456.sh.
5. Process 1000, thread 1: close ssh123456.sh. Process 1000 no longer has ssh123456.sh open for writing. However, this does not close file descriptor 4 in process 1001 (running make), hence ssh123456.sh is still open for writing somewhere on the system.
6. Process 1000, thread 1: fork() in preparation to run git, creating process 1002.
7. Process 1002: exec() git.
8. Process 1002: fork() in preparation to run SSH_AGENT script, creating process 1003.
9. Process 1003: exec() ssh123456.sh --> ETXTBSY. ssh123456.sh is open for writing as file descriptor 4 in process 1001 (make).

So the script file is not open in the Jenkins process, but it is nonetheless open somewhere on the system, hence ETXTBSY. The fact that some totally unrelated code can make a copy of the file descriptor and mess things up is why it's a Java runtime bug. A combination of vfork() and the close-on-exec flag would ensure that file descriptor 4 in process 1001 is closed at the exec() in step 3, thus closing the copy. That's what's being contemplated as the fix in the JVM.

One workaround is what's in PR313: copy the script into place using cp. cp doesn't create children, so it can't strand an open file descriptor to its destination. Another is what I proposed, which is to use a lock to ensure that steps 2 and 3 above cannot happen between steps 1 and 5.
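
The condition in the last step can be seen in isolation with a small sketch. Java has no direct fork(), so this does not reproduce the race itself; instead it simulates the stranded descriptor by keeping the writer open in the same JVM while trying to execute the script. The class name `TextFileBusyDemo` and the helper `tryExec` are hypothetical, and the behavior is assumed to be that of a typical Linux system with /bin/sh and an executable temp directory. Other platforms may behave differently.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; not part of Jenkins or the git client plugin.
final class TextFileBusyDemo {
    /** Tries to run the script and returns a short description of the outcome. */
    static String tryExec(Path script) throws InterruptedException {
        try {
            Process p = new ProcessBuilder(script.toString()).start();
            return "ran, exit status " + p.waitFor();
        } catch (IOException e) {
            // On Linux the message typically mentions "Text file busy" (ETXTBSY).
            return "failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        Path script = Files.createTempFile("ssh", ".sh");
        script.toFile().setExecutable(true, true);

        FileOutputStream out = new FileOutputStream(script.toFile());
        out.write("#!/bin/sh\nexit 0\n".getBytes(StandardCharsets.UTF_8));
        out.flush();

        // A write descriptor still exists: here in this JVM, in the bug in process 1001.
        System.out.println("while open for writing: " + tryExec(script));

        out.close();
        // No write descriptor remains, so the kernel has no reason to refuse the exec.
        System.out.println("after close:            " + tryExec(script));
        Files.delete(script);
    }
}
```

In the real bug the git client has already closed its own descriptor, which is why the failure looks impossible from the plugin's point of view: the equivalent of `out` lives in an unrelated child process.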

[JIRA] (JENKINS-48258) git client plugin occasionally fails with "text file busy" error

2018-04-23 Thread cun...@drivescale.com (JIRA)

Christopher Unkel commented on JENKINS-48258

Mark Waite: "thread 2" in the above could be any other thread in the Java virtual machine.

From a Java language point of view, the PrintWriter object and the underlying FileOutputStream would seem to be local to the thread that executes createUnixGitSSH() and subsequently asks for the git CLI process to be launched. However, on a Unix JVM implementation, the FileOutputStream is backed by an open file descriptor to the script. File descriptors always have process scope; they are not thread-local.

So when I see the bug, I'm guessing that thread 2 is a different thread launching a shell build step from some unrelated job. On Unix, Runtime.exec() is implemented with a fork(), producing a child process, followed by an exec() in the child process. The child process inherits all open file descriptors. If the fork is badly timed, these include the open file descriptor for the SSH script, and that's how the script file is still open: it's open in a child process totally unrelated to what the git client plugin code is doing. By the time the git client code runs git, the script is closed in the Jenkins process itself.


[JIRA] (JENKINS-48258) git client plugin occasionally fails with "text file busy" error

2018-04-23 Thread cun...@drivescale.com (JIRA)

Christopher Unkel commented on JENKINS-48258

Mark Waite: as I understand the issue described in JDK-8068370, the problem isn't that the file descriptor for the file isn't closed, but rather that the open file descriptor is inherited by a child process forked by another thread, so there's a race between writing the GIT_SSH file contents and launching processes:

1. In thread 1: open GIT_SSH script file for writing.
2. In thread 2: fork process; child inherits open file descriptor of GIT_SSH file.
3. In thread 1: close GIT_SSH file.
4. In thread 1: fork process to run git. GIT_SSH file is still open in the previous child, so ETXTBSY results.

So if a lock can preclude step 2 from happening between step 1 and step 3, the race would be fixed.

That said, PR313 has a more local fix and would seem preferable if it works. I see this issue only intermittently, so it will probably take a month of testing to be confident the PR is a solution.


[JIRA] (JENKINS-48258) git client plugin occasionally fails with "text file busy" error

2018-04-23 Thread cun...@drivescale.com (JIRA)

Christopher Unkel commented on JENKINS-48258

It seems like it should be possible to eliminate the race by using a lock to prevent processes from being forked while the GIT_SSH target script is open for writing. Roughly:

1. In hudson.Launcher, add a static ReadWriteLock field using a ReentrantReadWriteLock.
2. Hold the read lock in hudson.Launcher.ProcStarter.start() when calling launch().
3. Also hold the read lock in the various deprecated final hudson.Launcher.launch() overloads.
4. Expose the (write) lock publicly for use in preventing launching.
5. When writing files that will be executed, hold the write lock from when the file is created to when it is closed. Specifically, hold the lock for the lifetime of the PrintWriter in each of org.jenkinsci.plugins.gitclient.CliGitApiImpl.createUnixSshAskpass(), .createUnixStandardAskpass(), and .createUnixGitSSH().

Does this seem like a viable approach? Worth developing a patch?
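
A minimal, self-contained sketch of the locking scheme outlined above, to make the read/write roles concrete. It does not touch Jenkins: `LaunchGate`, `launch`, and `writeScript` are hypothetical stand-ins for the static field in hudson.Launcher, the launch paths, and the script-writing methods in CliGitApiImpl. Whether the real launch paths can all be covered this way is exactly the open question in the comment.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the lock proposed for hudson.Launcher.
final class LaunchGate {
    static final ReadWriteLock LOCK = new ReentrantReadWriteLock();

    /** Launchers share the read lock, so they do not block one another. */
    static Process launch(List<String> command) throws IOException {
        LOCK.readLock().lock();
        try {
            // The fork happens inside start(), while the read lock is held.
            return new ProcessBuilder(command).inheritIO().start();
        } finally {
            LOCK.readLock().unlock();
        }
    }

    /** Writers of executable files hold the write lock from open to close. */
    static void writeScript(Path script, String contents) throws IOException {
        LOCK.writeLock().lock();
        // try-with-resources closes the writer before the finally block
        // releases the lock, so no launch can overlap the open descriptor.
        try (PrintWriter w = new PrintWriter(
                Files.newBufferedWriter(script, StandardCharsets.UTF_8))) {
            w.print(contents);
        } finally {
            LOCK.writeLock().unlock();
        }
        script.toFile().setExecutable(true, true);
    }

    public static void main(String[] args) throws Exception {
        Path script = Files.createTempFile("ssh", ".sh");
        writeScript(script, "#!/bin/sh\nexit 0\n");
        int status = launch(List.of(script.toString())).waitFor();
        System.out.println("exit status " + status);
        Files.delete(script);
    }
}
```

A read/write lock fits here because launches vastly outnumber script writes: concurrent launches proceed in parallel under the read lock, and only the brief window in which an executable file is open for writing excludes them.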