Also, if you are having trouble shutting down the agents process, it would be great if you could get a thread dump and post it before you kill -9 it.
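
One way to do that, as a minimal sketch: it assumes a JDK with `jstack` on the PATH and that you already know the agents process ID. The function name `dump_then_kill` and the output file name are made up for illustration and are not part of ManifoldCF.

```shell
#!/bin/sh
# Sketch: capture a thread dump from a hung JVM, then force-kill it.
# Usage: dump_then_kill PID [OUTPUT_FILE]
# PID and OUTPUT_FILE come from the caller; nothing here is ManifoldCF-specific.
dump_then_kill() {
    pid="$1"
    out="${2:-threaddump-$pid.txt}"

    # Nothing to do if the process has already exited.
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "process $pid is not running" >&2
        return 1
    fi

    if command -v jstack >/dev/null 2>&1; then
        # jstack ships with the JDK; -l adds lock and synchronizer details.
        jstack -l "$pid" > "$out" 2>&1 || echo "jstack failed, see $out" >&2
    else
        # Fallback: SIGQUIT asks a HotSpot JVM to print a thread dump to its
        # own stdout, so the dump lands wherever that output is redirected.
        kill -3 "$pid"
        echo "jstack not found; sent SIGQUIT to $pid instead" >&2
    fi

    # Only now force the process down.
    kill -9 "$pid" 2>/dev/null
    return 0
}
```

Something like `dump_then_kill 12345 agents-threads.txt` would then leave the dump in `agents-threads.txt` (12345 being a placeholder PID). Run it as the same user that owns the agents process, otherwise jstack will generally not be able to attach.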

Karl

On Sun, Jan 11, 2015 at 7:25 PM, Karl Wright <[email protected]> wrote:

> Hi Adrian,
>
> If you noted the comment stream in CONNECTORS-590, I was able to
> demonstrate conclusively that the problem was in Postgres. I have not seen
> the problem in 9.3, but that does not mean it's gone. What version of
> Postgresql are you using?
>
> In any case, while this problem definitely terminates your job, it will
> not happen very often. I suspect the frequency of occurrence may depend on
> how loaded the database is.
>
> Karl
>
>
> On Sun, Jan 11, 2015 at 7:14 PM, Adrian Conlon <[email protected]>
> wrote:
>
>> Hi All,
>>
>> I’m getting an occurrence of what looks very similar to CONNECTORS-590.
>>
>> The circumstances are:
>>
>> 1) MCF Jobs proceeding very slowly (looks like a Postgresql vacuum
>>    is needed)
>> 2) Stop tomcat
>> 3) Attempt to stop the agents normally
>> 4) Wait a minute or two
>> 5) Decide to “kill -9” the agents process
>> 6) Vacuum the database
>> 7) Restart tomcat
>> 8) Restart the agents
>>
>> When I checked the job status page, I found that two of the jobs (out
>> of around 4000 or so) had the following status (or very similar):
>>
>> Error: Unexpected jobqueue status - record id 1417115392831, expecting
>> active status, saw 4
>>
>> Setup-wise, I’m running a release candidate of v1.8 (I think RC2),
>> using Postgresql as the crawl database and running on Ubuntu Linux. I’m
>> using zookeeper style synchronisation.
>>
>> Let me know if more information etc. is needed or if you think it’s a
>> new/real issue.
>>
>> Adrian
