Muhammad, The issue is not the staging but the de-duplication of staging when multiple jobs reference the same input data.
-Dougal

On Tue, Jun 1, 2010 at 4:15 PM, Muhammad Junaid <[email protected]> wrote:
> Hi,
> I am not an expert on it, but the first idea that came to mind was to
> place the input data files in a location that supports gsiftp, and to
> provide the gsiftp:// URL as a stage-in parameter, which GRAM should be
> able to pull the input file from. At the least, this would not modify
> your application's workflow.
>
> Kind regards,
>
> Muhammad Junaid
>
> Dougal Ballantyne wrote:
>>
>> Dear GT,
>>
>> I have been working on a project for several months now, researching
>> and developing a grid solution based on Globus Toolkit 4. Many thanks
>> to the people who have helped me with previous issues.
>>
>> I have a slightly off-topic question about how others handle a
>> particular scenario.
>>
>> We have a job generation and control application to which we have
>> added Globus support through some Perl modules that call globusrun-ws.
>> When a job is generated, the program pulls the associated input files
>> from the job database and creates an XML file that lists the input
>> files in StageIn and the requested results file in StageOut. This
>> works great for a single job, and for jobs that all use different
>> input data. However, we often have a scenario where we generate
>> several hundred jobs that all use the same input data. In our current
>> setup we would StageIn the same input file several hundred times.
>>
>> I was wondering whether there is a method or known best practice
>> within the Globus Toolkit for handling this sort of scenario. I am
>> aware that we could modify the tool to stage the data first, run the
>> jobs, and then remove the input file, BUT that would also be a change
>> of workflow for the users.
>>
>> Your thoughts or comments are greatly appreciated.
>>
>> Kind regards,
>>
>> Dougal Ballantyne
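
For anyone following the thread, the gsiftp-based stage-in Muhammad describes would look roughly like the fragment below in a GT4 WS-GRAM job description (the kind of XML file passed to globusrun-ws with -f). This is a minimal sketch, not a tested description: the host names, paths, and executable are placeholders, and each job would still list the same sourceUrl, so it illustrates the mechanism rather than solving the de-duplication problem.

```xml
<job>
  <executable>/usr/local/bin/myapp</executable>
  <!-- Stage the shared input in from a gsiftp-accessible location.
       Hypothetical hosts/paths; adjust to your site. -->
  <fileStageIn>
    <transfer>
      <sourceUrl>gsiftp://storage.example.org/data/shared-input.dat</sourceUrl>
      <destinationUrl>file:///scratch/jobs/shared-input.dat</destinationUrl>
    </transfer>
  </fileStageIn>
  <!-- Stage the per-job result back out. -->
  <fileStageOut>
    <transfer>
      <sourceUrl>file:///scratch/jobs/result.dat</sourceUrl>
      <destinationUrl>gsiftp://storage.example.org/results/result.dat</destinationUrl>
    </transfer>
  </fileStageOut>
</job>
```

One way to avoid the repeated transfer without changing the users' workflow much would be for the job-generation tool to detect that several queued jobs share an input and emit the fileStageIn element only for the first of them, pointing the rest at the already-staged local copy; whether that is practical depends on the jobs landing on a shared filesystem.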
