On 03/21/2012 09:46 AM, Marco van Wieringen wrote:
> Stephen Thompson <stephen<at>seismo.berkeley.edu> writes:
>> This seems similar, but in my case I'm not waiting for a tape to be
>> loaded: the proper tape is already in the drive, but it is being
>> written to by another job. I can see how the code might treat both
>> cases the same, namely that the drive is unavailable (for whatever
>> reason) and the starting job must therefore wait.
>>
>> But there's a logical distinction: if my starting job had started
>> while the job that is now despooling (and blocking the drive) was
>> still spooling, my job would have started and been happily spooling
>> along while the other job despools. There is no contention for an
>> appropriate volume in this case, so it seems (logically at least)
>> that my job should start.
>>
> Taking a quick peek at the code:
>
> src/stored/append.c:do_append_data
>
> calls acquire_device_for_append, which as far as I know wants a tape
> drive it can write to. Are your jobs in the running state, or only
> scheduled (i.e. the ones that are waiting to spool)? Because after
> acquire_device_for_append succeeds, the state of the job is set to
> RUNNING.
>
Interesting. Yes, my "jobs that are not spooling" are in a "running"
state, which means they have moved past the drive acquire, which is
where I thought they might have been blocking. Apparently not.

> The whole spooling is set up after that, then it writes the session
> header, and after that the fd is told it can send its data.
>
> I guess you are stuck there. It may not be obvious, but maybe we
> could only reserve the drive and start the spooling; then we would
> need to write the session header the moment we start despooling,
> after we have actually acquired the storage we reserved earlier.
> It's not obvious whether this might work, and as the SD is kind of
> interesting I'm not sure this is going to work.
>

Do you mean "spooling" when you say "despooling" in that last
paragraph? I'm not following you, though I don't have an understanding
of each step in the process. Are you saying that something (the
session header) is written to tape when a job starts, but before it
starts spooling? That would certainly explain the job being blocked.
If that something is written to the database, I don't see why it would
block.

When my system is free today, I'm going to try to collect more
information under controlled circumstances.

thanks,
Stephen

>> Stephen
>>
>>> And no, nothing has been done in that respect as far as I know,
>>> and your best bet is the Enterprise version to get the priority
>>> raised for this feature request.
>>>
>>
>> We're an academic/research lab and don't have the budget for that,
>> but thanks for the suggestion. Mostly, I'm trying to confirm whether
>> this is expected behavior, as a first step.
>>
>
> If your jobs are stuck in what is probably a non-running state, we
> can almost be sure it's hanging on the acquire call. As for budget,
> you have to ask Bacula Systems; maybe it's not as bad as you make it
> sound.
>
> Marco
>
> ------------------------------------------------------------------------------
> This SF email is sponsored by:
> Try Windows Azure free for 90 days   Click Here
> http://p.sf.net/sfu/sfd2d-msazure
> _______________________________________________
> Bacula-devel mailing list
> Bacula-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/bacula-devel

-- 
Stephen Thompson              Berkeley Seismological Laboratory
step...@seismo.berkeley.edu   215 McCone Hall # 4760
404.538.7077 (phone)          University of California, Berkeley
510.643.5811 (fax)            Berkeley, CA 94720-4760