Having worked with Julia on this issue, I can share some findings:

We tracked this down to git processes intermittently hanging when fetching 
material updates on the GoCD server:

   - in the go/api/support endpoint, we saw MaterialUpdateListener threads 
   hanging (the issue was affecting more than one pipeline)
   - ps aux | grep git on the server showed that a few git processes had 
   been running for hours
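
To make the second check repeatable, something like the following could be used. It is only a rough sketch: it assumes a POSIX ps that supports the etime output column ([[dd-]hh:]mm:ss), and the function names and the one-hour threshold are made up for illustration.

```python
import os
import subprocess


def parse_etime(etime):
    """Convert ps's [[dd-]hh:]mm:ss elapsed-time format to seconds."""
    days, _, clock = etime.rpartition("-")
    parts = [int(p) for p in clock.split(":")]
    while len(parts) < 3:
        parts.insert(0, 0)  # pad missing hours (mm:ss form)
    hours, minutes, seconds = parts
    return (int(days) if days else 0) * 86400 + hours * 3600 + minutes * 60 + seconds


def long_running_git(ps_output, min_seconds=3600):
    """Return (pid, elapsed_seconds, command) for git processes older than min_seconds."""
    hits = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        if len(fields) < 3 or not fields[0].isdigit():
            continue  # skip the header and malformed lines
        pid, etime, args = fields
        elapsed = parse_etime(etime)
        program = os.path.basename(args.split()[0])
        if program.startswith("git") and elapsed >= min_seconds:
            hits.append((int(pid), elapsed, args))
    return hits


def list_processes():
    """Run ps and return its raw output (pid, elapsed time, full command line)."""
    result = subprocess.run(
        ["ps", "-eo", "pid,etime,args"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling long_running_git(list_processes()) on the server would then list the candidates to inspect.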

A restart fixes the issue temporarily because there's an in-memory data 
structure tracking material updates that is only cleared (for a particular 
material) when its corresponding MaterialUpdateListener thread has 
completed its run. In our case, the thread kept waiting for the hung 
processes, so the data was never cleared. Restarting the GoCD server 
clears this data.
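
The behaviour described above can be modelled roughly as follows. This is a simplified illustration of the idea, not GoCD's actual code: the class, method names, and the "skip if already in progress" rule are assumptions made for the sketch.

```python
import threading


class MaterialUpdateTracker:
    """Hypothetical model of an in-memory 'update in progress' registry."""

    def __init__(self):
        self._in_progress = set()
        self._lock = threading.Lock()

    def try_update(self, material, fetch):
        """Run fetch(material) unless an update for it is already in progress."""
        with self._lock:
            if material in self._in_progress:
                return False  # previous update never finished, so this one is skipped
            self._in_progress.add(material)
        try:
            fetch(material)  # blocks for as long as the underlying git process runs
        finally:
            # Only reached when fetch returns or raises; a hung fetch never gets here.
            with self._lock:
                self._in_progress.discard(material)
        return True
```

In this model, a fetch that never returns leaves the material marked as in progress until the process holding the set is restarted, which matches what we observed.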

We suspected that the hanging git processes were related to some network 
instability. Per Aravind's suggestion, we wrapped the git binary with strace, 
and indeed we found that the process was stuck waiting on a socket read.

Another possible factor is the git version used on the server, which is 
1.8.3.1 (the latest version available in the internal repository, but 
really old). Our agents use the same old version, and we have seen the same 
hanging issue there as well.

A temporary workaround that has worked for us is to wrap the git binary 
used by GoCD with timeout. We are also looking into upgrading git and 
understanding the observed network instability.
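
For anyone who wants to try the same idea, here is a minimal sketch of such a wrapper. It is not the exact wrapper we deployed: REAL_GIT and TIMEOUT_SECONDS are hypothetical values, and it is POSIX-only. It runs the command in its own session so that helper processes spawned by git are killed too, and returns 124 on timeout, the same exit status GNU timeout uses.

```python
import os
import signal
import subprocess
import sys

REAL_GIT = "/usr/bin/git.real"  # hypothetical: where the original binary was moved
TIMEOUT_SECONDS = 600           # hypothetical limit; tune to your largest fetch


def run_with_timeout(cmd, timeout):
    """Run cmd; kill its whole process group if it exceeds timeout seconds."""
    # start_new_session makes the child a process-group leader (pgid == pid).
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        try:
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the group exited on its own in the meantime
        proc.wait()  # reap the child so it does not linger as a zombie
        return 124


# A wrapper script installed in place of git would end with something like:
#   sys.exit(run_with_timeout([REAL_GIT] + sys.argv[1:], TIMEOUT_SECONDS))
```

Note that a fetch killed this way fails that one material update, but the next poll gets a chance to run instead of being blocked behind the hung process.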


On Monday, May 23, 2016 at 5:57:29 PM UTC-4, [email protected] wrote:
>
> We are using GoCD version 15.2 with Github Pull Request Builder Plugin 
> version 1.2.4.
>
>
> An issue we are seeing is that at least once a day, the pipeline that uses 
> the PR plugin cannot be triggered even though PRs are being created. The 
> only temporary solution we've found is to restart the Go server.
>
> In "Pipeline" view, the play buttons are grayed out, while in 
> "Environment" view, the pipeline is "preparing to schedule".
>
> Pipeline tab:
>
> <https://lh3.googleusercontent.com/-_wDL4bMtUg0/V0N5M9e_I_I/AAAAAAAAAfo/7Hmj4y79RRUhwCxNE5o3eaRZYLdAAHtAQCLcB/s1600/Screen%2BShot%2B2016-05-23%2Bat%2B17.41.09.png>
>
> Environment tab: 
> <https://lh3.googleusercontent.com/-qEzXHWTy8BQ/V0N5Q3-OH3I/AAAAAAAAAfs/ixIx1SPv_IAquaVlCgzz3BUMtCIBDvfgACLcB/s1600/Screen%2BShot%2B2016-05-23%2Bat%2B17.41.04.png>
>
> Here is what we see in the plugin logs:
> 2016-05-23 11:03:14,752  WARN [92@MessageListener for MaterialUpdateListener] GitHubPRBuildPlugin:67 - get latest revisions since:
> java.lang.RuntimeException: Exception Occurred: [git, fetch, origin, +refs/pull/*/head:refs/remotes/origin/pull-request/*] - /home/go-server/pipelines/flyweight/2d857165-df0a-4f33-b466-0b62b4d54b68
>         at com.tw.go.plugin.cmd.Console.runOrBomb(Console.java:33)
>         at com.tw.go.plugin.git.GitCmdHelper.runAndGetOutput(GitCmdHelper.java:397)
>         at com.tw.go.plugin.git.GitCmdHelper.runOrBomb(GitCmdHelper.java:385)
>         at com.tw.go.plugin.git.GitCmdHelper.fetch(GitCmdHelper.java:195)
>         at com.tw.go.plugin.GitHelper.fetchAndReset(GitHelper.java:109)
>         at com.tw.go.plugin.GitHelper.fetchAndResetToHead(GitHelper.java:99)
>         at com.tw.go.plugin.GitHelper.cloneOrFetch(GitHelper.java:41)
>         at in.ashwanthkumar.gocd.github.GitHubPRBuildPlugin.handleLatestRevisionSince(GitHubPRBuildPlugin.java:190)
>         at in.ashwanthkumar.gocd.github.GitHubPRBuildPlugin.handle(GitHubPRBuildPlugin.java:88)
>         at com.thoughtworks.go.plugin.infra.DefaultPluginManager$1.execute(DefaultPluginManager.java:172)
>         at com.thoughtworks.go.plugin.infra.DefaultPluginManager$1.execute(DefaultPluginManager.java:167)
>         at com.thoughtworks.go.plugin.infra.FelixGoPluginOSGiFramework.executeActionOnTheService(FelixGoPluginOSGiFramework.java:315)
>         at com.thoughtworks.go.plugin.infra.FelixGoPluginOSGiFramework.doOn(FelixGoPluginOSGiFramework.java:245)
>         at com.thoughtworks.go.plugin.infra.DefaultPluginManager.submitTo(DefaultPluginManager.java:167)
>         at com.thoughtworks.go.plugin.access.PluginRequestHelper.submitRequest(PluginRequestHelper.java:32)
>         at com.thoughtworks.go.plugin.access.scm.SCMExtension.latestModificationSince(SCMExtension.java:156)
>         at com.thoughtworks.go.server.service.materials.PluggableSCMMaterialPoller.modificationsSince(PluggableSCMMaterialPoller.java:77)
>         at com.thoughtworks.go.server.service.materials.PluggableSCMMaterialPoller.modificationsSince(PluggableSCMMaterialPoller.java:46)
>         at com.thoughtworks.go.server.service.MaterialService.modificationsSince(MaterialService.java:110)
>         at com.thoughtworks.go.server.materials.ScmMaterialUpdater.insertLatestOrNewModifications(ScmMaterialUpdater.java:52)
>         at com.thoughtworks.go.server.materials.PluggableSCMMaterialUpdater.insertLatestOrNewModifications(PluggableSCMMaterialUpdater.java:61)
>         at com.thoughtworks.go.server.materials.MaterialDatabaseUpdater.insertLatestOrNewModifications(MaterialDatabaseUpdater.java:155)
>         at com.thoughtworks.go.server.materials.MaterialDatabaseUpdater.updateMaterialWithNewRevisions(MaterialDatabaseUpdater.java:147)
>         at com.thoughtworks.go.server.materials.MaterialDatabaseUpdater$2.doInTransaction(MaterialDatabaseUpdater.java:110)
>         at com.thoughtworks.go.server.transaction.TransactionCallback.doWithExceptionHandling(TransactionCallback.java:24)
>         at com.thoughtworks.go.server.transaction.TransactionTemplate$3.doInTransaction(TransactionTemplate.java:53)
>         at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:130)
>         at com.thoughtworks.go.server.transaction.TransactionTemplate.executeWithExceptionHandling(TransactionTemplate.java:49)
>         at com.thoughtworks.go.server.materials.MaterialDatabaseUpdater.updateMaterial(MaterialDatabaseUpdater.java:107)
>         at com.thoughtworks.go.server.materials.MaterialUpdateListener.onMessage(MaterialUpdateListener.java:48)
>         at com.thoughtworks.go.server.materials.MaterialUpdateListener.onMessage(MaterialUpdateListener.java:29)
>         at com.thoughtworks.go.server.messaging.activemq.JMSMessageListenerAdapter.runImpl(JMSMessageListenerAdapter.java:65)
>         at com.thoughtworks.go.server.messaging.activemq.JMSMessageListenerAdapter.run(JMSMessageListenerAdapter.java:50)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.commons.exec.ExecuteException: Process exited with an error: 128 (Exit value: 128)
>         at org.apache.commons.exec.DefaultExecutor.executeInternal(DefaultExecutor.java:404)
>         at org.apache.commons.exec.DefaultExecutor.execute(DefaultExecutor.java:166)
>         at org.apache.commons.exec.DefaultExecutor.execute(DefaultExecutor.java:153)
>         at com.tw.go.plugin.cmd.Console.runOrBomb(Console.java:25)
>         ... 33 more
>
> Has anyone come across this issue? 
>
> We're also curious about why the restart fixes the issue (although only 
> temporarily), given that the error was happening when a git command was 
> executed on the flyweight directory. Any pointers on this perhaps?
>
