Hi David,

> On our recently upgraded system (1.3.x) I'm seeing repeated failures to
> distribute, the workflow just stalls on the "Distributing media to
> progressive downloads"

Where is Matterhorn trying to copy the files to? Is it a network share or a
local disk? Does it start copying the files, and if so, do you see the file
size increase?

> I've so far tried:
> 1) Not having the workers talk to the admin via mod_proxy
> 2) Doubled the memory on the admin.

Why do you think it could be a memory issue? The distribution service first
copies the files to the local workspace and then copies them to the final
destination, using either file channels (which rely on the native OS for
copying) or chunks of 1 MB.

> Before I did 1. I did see messages that "distributing file x timed out.
> retrying in yms" - but never saw any evidence of a retry. The last job after
> I made change 1. just failed after putting all 3 nodes in the cluster under
> heavy load.

That is *really* strange. I tried to find that log message (or similar ones)
in the codebase without any luck. Could you get the exact message from the
logs?

> The media packages in question are in the region of 1.2Gb and contain just
> over an hour of recordings. Smaller media packages have no issues, this leads
> me to suspect that there is a regression in this service that is doing
> something like reading the files into memory.

A timeout seems likely. It is possible that we are hitting a bug with
downloading large files from the working file repository. Do you have the
working file repository root configured to the same share as the workspace
root (i.e. is hard linking enabled)?

Thanks,
Tobias
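
P.S. To illustrate why I doubt that memory is the culprit, here is a minimal
sketch of the two copy strategies mentioned above. It is not the actual
Matterhorn distribution code: the class name CopySketch and both method names
are made up for illustration, and only standard java.nio calls are used. In
the chunked variant the heap holds a single 1 MB buffer at a time, and in the
channel variant the copy loop allocates no buffer of its own.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only, not taken from the Matterhorn codebase.
class CopySketch {

  static final int CHUNK_SIZE = 1024 * 1024; // 1 MB

  /** Copies via file channels, letting the OS move the bytes where it can. */
  public static long copyWithChannels(Path source, Path target) throws IOException {
    try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
         FileChannel out = FileChannel.open(target, StandardOpenOption.CREATE,
             StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
      long size = in.size();
      long position = 0;
      // transferTo may move fewer bytes than requested, so loop until done.
      while (position < size) {
        long moved = in.transferTo(position, size - position, out);
        if (moved <= 0) {
          break; // nothing more could be transferred
        }
        position += moved;
      }
      return position;
    }
  }

  /** Copies through a single reusable 1 MB buffer. */
  public static long copyInChunks(Path source, Path target) throws IOException {
    byte[] buffer = new byte[CHUNK_SIZE];
    long total = 0;
    try (InputStream in = Files.newInputStream(source);
         OutputStream out = Files.newOutputStream(target)) {
      int read;
      while ((read = in.read(buffer)) != -1) {
        out.write(buffer, 0, read);
        total += read;
      }
    }
    return total;
  }

  public static void main(String[] args) throws IOException {
    // Small stand-in for a large media file: a bit over 3 MB of zeros.
    Path source = Files.createTempFile("copy-sketch", ".src");
    Files.write(source, new byte[3 * CHUNK_SIZE + 123]);

    Path viaChannels = Files.createTempFile("copy-sketch", ".channels");
    Path viaChunks = Files.createTempFile("copy-sketch", ".chunks");

    long a = copyWithChannels(source, viaChannels);
    long b = copyInChunks(source, viaChunks);
    System.out.println("channels copied " + a + " bytes, chunks copied " + b + " bytes");

    Files.delete(source);
    Files.delete(viaChannels);
    Files.delete(viaChunks);
  }
}
```

Neither variant needs heap proportional to the file size, which is why a
1.2 GB media package should not behave differently from a small one as far as
memory goes, assuming the real service works along these lines.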
