Muhammad,

The issue is not the staging but the de-duplication of staging when
multiple jobs reference the same input data.
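To make the de-duplication concrete, here is a minimal sketch of the bookkeeping involved: collect the input files across a batch of jobs, stage each unique file once, and let every job reference the staged copy. This is Python for illustration only (the tooling in question is Perl), and all job names and URLs are hypothetical.

```python
# Hypothetical sketch: de-duplicate stage-in transfers across a batch of jobs.
# Instead of emitting one stage-in directive per job, stage each unique
# input file once and have the individual jobs reference the staged copy.

def plan_staging(jobs):
    """jobs: list of (job_id, [input_file_urls]).
    Returns (files to stage exactly once, per-job file references)."""
    to_stage = []
    seen = set()
    for _, inputs in jobs:
        for f in inputs:
            if f not in seen:
                seen.add(f)
                to_stage.append(f)
    # Each job still records which inputs it needs, but no job triggers
    # its own transfer of a file that is already staged.
    refs = {job_id: list(inputs) for job_id, inputs in jobs}
    return to_stage, refs

jobs = [
    ("job1", ["gsiftp://host/data/common.dat", "gsiftp://host/data/a.dat"]),
    ("job2", ["gsiftp://host/data/common.dat", "gsiftp://host/data/b.dat"]),
    ("job3", ["gsiftp://host/data/common.dat"]),
]
stage_once, per_job = plan_staging(jobs)
# common.dat is transferred once instead of three times
```

The open question in the thread is where this logic should live: in the job-generation tool (which changes the users' workflow) or somewhere in the staging layer itself.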

-Dougal


On Tue, Jun 1, 2010 at 4:15 PM, Muhammad Junaid <[email protected]> wrote:
> Hi,
> I am not an expert on this, but the first idea that came to mind was to
> place the input data files in a location that supports gsiftp,
> and provide the gsiftp:// URL as a stage-in parameter, so that GRAM can
> pull the input file from there.
> At the very least, this would not require modifying your application's workflow.
>
> kind regards,
>
> Muhammad Junaid
>
> Dougal Ballantyne wrote:
>>
>> Dear GT,
>>
>> I have been working on a project for several months now researching
>> and developing a grid solution based on Globus Toolkit 4. Many thanks
>> to people who have helped me with previous issues.
>>
>> I have a slightly off-topic question about how others handle a
>> particular scenario.
>>
>> We have a job generation and control application, to which we have
>> added Globus support through some Perl modules that call globusrun-ws.
>> When a job is generated, the program pulls from the job database the
>> associated input files and creates an XML file which lists the input
>> files in StageIn and the requested results file in StageOut. This
>> works great for a single job and jobs that all use different input
>> data. However, we often have a scenario in which we generate several
>> hundred jobs that all use the same input data. In our current setup we
>> would StageIn the same input file several hundred times.
>>
>> I was wondering whether there is a method or known best practice within
>> the Globus Toolkit for handling this sort of scenario. I am aware that we
>> could modify the tool to stage the data first, run the jobs, and then
>> remove the input file, BUT that would also be a change of workflow for
>> the users.
>>
>> Your thoughts or comments would be greatly appreciated.
>>
>> Kind regards,
>>
>> Dougal Ballantyne
>>
>
>
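For readers unfamiliar with the stage-in mechanism under discussion: in a GT4 WS-GRAM job description, staging is expressed as transfer pairs of source and destination URLs, which is where a gsiftp:// source as suggested above would go. A rough fragment, with all hosts and paths hypothetical:

```xml
<!-- Sketch of a GT4 (WS-GRAM) job description stage-in section.
     Hosts and paths are placeholders, not from the original thread. -->
<job>
  <executable>/usr/local/bin/myapp</executable>
  <fileStageIn>
    <transfer>
      <sourceUrl>gsiftp://datahost.example.org/data/common.dat</sourceUrl>
      <destinationUrl>file:///scratch/jobs/common.dat</destinationUrl>
    </transfer>
  </fileStageIn>
  <fileStageOut>
    <transfer>
      <sourceUrl>file:///scratch/jobs/results.dat</sourceUrl>
      <destinationUrl>gsiftp://datahost.example.org/results/results.dat</destinationUrl>
    </transfer>
  </fileStageOut>
</job>
```

With several hundred jobs each carrying an identical fileStageIn transfer, the same source file is fetched once per job, which is exactly the duplication the thread is about.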
