Hi Nathan,

Matterhorn has one service, the service registry, that keeps track of the workflows being executed. It knows not only which state a workflow is in but also which host it is running on. So what you need to do is tell the service registry to restart all the jobs that are currently marked as "running" on the affected machines.
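To make the job-to-host relationship concrete, here is a minimal sketch that can be tried on a throwaway in-memory SQLite database before touching the real one. The table and column names are the ones used in the queries in this thread; the column types, the sample rows, and the helper names make_demo_db and requeue_running_jobs are assumptions for illustration only, not Matterhorn's actual schema or API.

```python
import sqlite3

# Status codes as used in this thread: 0 = queued, 2 = running.
QUEUED, RUNNING = 0, 2


def make_demo_db():
    """Build a throwaway database; column types and rows are assumptions."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE host_registration (id INTEGER PRIMARY KEY, host TEXT);
        CREATE TABLE service_registration (id INTEGER PRIMARY KEY, host_reg INTEGER);
        CREATE TABLE job (id INTEGER PRIMARY KEY, status INTEGER,
                          operation TEXT, processor_svc INTEGER);
        INSERT INTO host_registration VALUES (1, 'http://x.y.z'), (2, 'http://other');
        INSERT INTO service_registration VALUES (10, 1), (20, 2);
        INSERT INTO job VALUES
            (100, 2, 'START_WORKFLOW', 10),  -- running on the affected host
            (101, 2, 'START_WORKFLOW', 20),  -- running on a different host
            (102, 3, 'START_WORKFLOW', 10),  -- affected host, other status (arbitrary value)
            (103, 2, 'Encode', 10);          -- running, but a made-up other operation
    """)
    return conn


# Jobs that started a workflow and are still marked as running on one host.
AFFECTED = """
    SELECT j.id
    FROM job j
    JOIN service_registration s ON j.processor_svc = s.id
    JOIN host_registration h ON s.host_reg = h.id
    WHERE h.host = ? AND j.status = ? AND j.operation = 'START_WORKFLOW'
"""


def requeue_running_jobs(conn, host):
    """Set the affected jobs back to queued and return their ids."""
    ids = [row[0] for row in conn.execute(AFFECTED + " ORDER BY j.id", (host, RUNNING))]
    with conn:  # commits on success, rolls back if the UPDATE raises
        conn.execute(
            f"UPDATE job SET status = ? WHERE id IN ({AFFECTED})",
            (QUEUED, host, RUNNING),
        )
    return ids


if __name__ == "__main__":
    conn = make_demo_db()
    print(requeue_running_jobs(conn, "http://x.y.z"))
    print(dict(conn.execute("SELECT id, status FROM job")))
```

The update is restricted through a subquery on the job ids, so only the jobs returned by the lookup are touched. Jobs on other hosts, in other states, or with other operations keep their status.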
Unfortunately, there has not been time so far to add this to the UI, so you will need to do it manually by updating the workflow's running status in the database.

1) You can find the affected workflows by issuing

    SELECT j.id
    FROM job j, service_registration s, host_registration h
    WHERE host = 'http://x.y.z'
      AND j.status = 2
      AND j.operation = 'START_WORKFLOW'
      AND j.processor_svc = s.id
      AND s.host_reg = h.id;

which basically translates to "find me every job that started a workflow which is still marked as running on host x.y.z".

2) After that, it should be as easy as making sure that job is restarted by setting the status to "queued":

    UPDATE job SET status = 0
    FROM service_registration s, host_registration h
    WHERE host = 'http://x.y.z'
      AND job.status = 2
      AND job.operation = 'START_WORKFLOW'
      AND job.processor_svc = s.id
      AND s.host_reg = h.id;

Tobias

On 30.08.2011, at 12:59, Nathan Cameron wrote:

> Hello all,
> Yesterday the core computer in our system that handles video processing and
> distribution got overloaded and the matterhorn service stopped altogether. I
> knew of no alternative but to restart the service. It was processing several
> recordings when this happened. Upon restarting the web UI many of the
> recordings initially showed they were in the same place they were before, and
> a few failed completely. It's been approximately 7 hours since I did the
> restart, and none of the recordings' states have changed.
>
> My question then: Is there some way to force the core to resume processing
> on half processed files?
>
> I'm also wondering if there is a way to take one of the raw capture folders
> on a given capture agent and upload it to the media module. For example, the
> recordings that failed have full audio and video. I know they would have
> successfully processed apart from the system error. How do I take one of
> those folders from the capture agent (like a 2677) and get it to retry? Is
> there a specific file from the folder I must upload?
>
> Any help here is appreciated. Until I can upgrade some more of our hardware
> I'm going to be running into these issues.
>
> Thank You,
> Nathan
> _______________________________________________
> Community mailing list
> [email protected]
> http://lists.opencastproject.org/mailman/listinfo/community
>
>
> To unsubscribe please email
> [email protected]
> _______________________________________________
