On 03/21/2012 09:46 AM, Marco van Wieringen wrote:
> Stephen Thompson <stephen<at>seismo.berkeley.edu> writes:
>> This seems similar, but in my case I'm not waiting for a tape to be
>> loaded; the proper tape is already in the drive, but it is being written
>> to by another job.  I can see how the code might be treating both cases
>> the same, that the drive is unavailable (for whatever reason), and
>> therefore the starting job must wait.
>>
>> But there's a logical distinction: if my starting job had arrived while
>> the job that's now despooling (and blocking the drive) was still
>> spooling, my job would have started and would be happily spooling along
>> while the other job despools.  There's no contention for an appropriate
>> volume in this case, so it seems like (logically at least) my job
>> should start.
>>
> Taking a quick peek at the code:
>
> src/stored/append.c:do_append_data
>
> calls acquire_device_for_append, which as far as I know wants
> a tape drive it can write to.  Are your jobs in the running state
> or only scheduled, e.g. the ones that are waiting to spool?
> After acquire_device_for_append succeeds, the
> state of the job is set to RUNNING.
>

Interesting.  Yes, my "jobs that are not spooling" are in a "running" 
state, which means they have moved past the drive acquire, where 
I thought they might be blocking.  Apparently not.
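
To make sure I follow the path you're describing, here's a toy model of
that ordering.  This is not the real SD code; aside from do_append_data
and acquire_device_for_append, the names are simplified stand-ins I
made up:

    /* Toy model of the append path as described above -- illustrative
     * only, compiled against nothing in the Bacula tree. */
    #include <stdio.h>

    enum job_state { JS_SCHEDULED, JS_RUNNING };

    /* stand-in for the real call: this is where a new job waits while
     * another job holds the drive, e.g. for despooling */
    static int acquire_device_for_append(void)
    {
       printf("wait for a writable drive\n");
       return 1;                           /* pretend it came free */
    }

    static void do_append_data(enum job_state *state)
    {
       if (!acquire_device_for_append())   /* 1. can block on a busy drive */
          return;
       *state = JS_RUNNING;                /* 2. only now marked RUNNING   */
       printf("set up spooling\n");        /* 3. spooling is set up        */
       printf("write session header\n");   /* 4. header goes to the volume */
       printf("tell FD to send data\n");   /* 5. FD starts sending         */
    }

    int main(void)
    {
       enum job_state st = JS_SCHEDULED;
       do_append_data(&st);
       return st == JS_RUNNING ? 0 : 1;
    }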


> The whole spooling setup happens after that; then it
> writes the session header, and after that the
> FD is told it can send its data.
>
> I guess you are stuck there.  It may not be obvious, but maybe
> we could only reserve the drive and start the spooling,
> but then we would need to write the session header the moment we
> start despooling, after we have actually acquired the storage
> we reserved earlier.  It's not obvious whether this would work,
> and as the SD is kind of interesting, I'm not sure it
> is going to work.
>


Do you mean "spooling" when you say "despooling" in this last paragraph?

I'm not entirely following you, though that may be because I don't 
understand each step in the process.  Are you saying that something 
(the session header) is written to tape when a job starts, but before 
it starts spooling?  That would certainly explain the job being blocked.  
If that something is only written to the database, I don't see why it 
would block.
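
For what it's worth, here's how I picture the reordering you float above,
purely as illustration (reserve_device and the deferred header write are
hypothetical, not anything in the tree):

    /* Illustrative only: spool first, defer the real acquire and the
     * session header until despool time. */
    #include <stdio.h>

    static void proposed_append(void)
    {
       printf("reserve drive (no exclusive hold)\n"); /* cheap reservation   */
       printf("spool job data to disk\n");            /* no drive needed yet */
       printf("acquire the reserved drive\n");        /* may wait here now   */
       printf("write session header\n");              /* deferred to despool */
       printf("despool data to tape\n");
    }

    int main(void)
    {
       proposed_append();
       return 0;
    }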

When my system is free today, I'm going to try to collect more 
information under controlled circumstances.

thanks,
Stephen



>> Stephen
>>
>>> And no, nothing has been done in that respect as far as I know,
>>> and your best bet is the Enterprise version to get the priority
>>> raised for this feature request.
>>>
>>
>> We're an academic/research lab and don't have the budget for that, but
>> thanks for the suggestion.  Mostly I am trying to confirm whether
>> this is expected behavior, as a first step.
>>
>
> If your jobs are stuck in a non-running state, we can almost
> be sure it's hanging on the acquire call.  As to budget, you would
> have to ask Bacula Systems; maybe it's not as bad as you make it sound.
>
> Marco
>


-- 
Stephen Thompson               Berkeley Seismological Laboratory
step...@seismo.berkeley.edu    215 McCone Hall # 4760
404.538.7077 (phone)           University of California, Berkeley
510.643.5811 (fax)             Berkeley, CA 94720-4760
