> > Your idea requires the user to make sure the output is \t separated.
>
> Yes... I've been doing that for years and life has been better ever
> since. But sure, the separator should be a parameter.
> Maybe we could have an option that would indicate the splitting char.
> The default would be none = don't split:
>
> > load_parallel_results(file, split="\t")
>   myvar1 myvar2          V1 V2
> 1      1      A       Hello  1
> 2      1      A         Bye  2
> 3      1      A         Wow  3
> 4      2      A Interesting  9
> 5      1      B     NewYork  3
>
> > load_parallel_results(file)
>   myvar1 myvar2                       stdout stderr
> 1      1      A "Hello\t1\nBye\t2\nWow\t3\n"     ""
> 2      2      A           "Interesting\t9\n"     ""
> 3      1      B               "NewYork\t3\n"     ""

That seems reasonable.

> I am also somewhat concerned that the current function loads all
> stdout/stderr files - even if they are never used. It would be better
> if that could be done lazily - see
> http://stackoverflow.com/questions/20923089/r-store-functions-in-a-data-frame

I'm not sure there's a 'right' answer here. I think it depends on how
you'll use the results.

> I believe I would prefer returning a data structure that you could
> select the relevant records from based on the arguments. And when you
> have the records you want, you can ask to have the stdout/stderr read
> in and possibly expanded as rows. This would be able to scale to much
> bigger stdout/stderr and many more jobs.

Seems reasonable.

> Maybe the trivial solution is to simply return a table of the args + the
> filenames of stdout/stderr, and then have a function that turns that
> table into the read-in files, which you can run either immediately or
> after you have selected the relevant rows.

Yes -- I often do this: first go to the file system to collect all the
file paths I might be interested in and the relevant metadata (for me,
it's typically creation date). Then I figure out which paths I want to
load, and then load them all in.

David

/Ole
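For concreteness, here is a minimal R sketch of the two-step approach
discussed above. It is not GNU Parallel's shipped code: the directory
layout is an assumption for illustration (one directory per job whose
path encodes the argument values, e.g. res/myvar1/1/myvar2/A/, holding
"stdout" and "stderr" files), and the names list_parallel_results and
read_parallel_results are hypothetical.

    # Step 1: walk the file system and return only the args plus the
    # file paths. Nothing is read yet, so this stays cheap even for
    # many jobs with large stdout/stderr.
    list_parallel_results <- function(dir) {
      stdout_files <- list.files(dir, pattern = "^stdout$",
                                 recursive = TRUE, full.names = TRUE)
      rows <- lapply(stdout_files, function(f) {
        # Recover variable names/values from the path components,
        # e.g. "myvar1/1/myvar2/A/stdout" -> myvar1 = 1, myvar2 = A.
        rel   <- substring(f, nchar(dir) + 2)  # assumes no trailing "/" on dir
        parts <- strsplit(rel, "/")[[1]]
        parts <- parts[-length(parts)]         # drop the trailing "stdout"
        vals  <- parts[seq(2, length(parts), 2)]
        names(vals) <- parts[seq(1, length(parts), 2)]
        as.data.frame(as.list(vals), stringsAsFactors = FALSE)
      })
      tab <- do.call(rbind, rows)
      tab$stdout_path <- stdout_files
      tab$stderr_path <- file.path(dirname(stdout_files), "stderr")
      tab
    }

    # Step 2: read the files for the (possibly subsetted) rows. With
    # split = NULL the raw stdout/stderr strings are attached; with
    # e.g. split = "\t" each stdout line becomes a row and each field
    # a column (the *_path columns are kept; drop them if unwanted).
    read_parallel_results <- function(tab, split = NULL) {
      slurp <- function(f) paste(readLines(f), collapse = "\n")
      if (is.null(split)) {
        tab$stdout <- vapply(tab$stdout_path, slurp, character(1))
        tab$stderr <- vapply(tab$stderr_path, slurp, character(1))
        return(tab)
      }
      do.call(rbind, lapply(seq_len(nrow(tab)), function(i) {
        lines  <- readLines(tab$stdout_path[i])
        fields <- do.call(rbind, strsplit(lines, split, fixed = TRUE))
        cbind(tab[rep(i, nrow(fields)), , drop = FALSE],
              as.data.frame(fields, stringsAsFactors = FALSE))
      }))
    }

Hypothetical usage, matching the workflow described above (collect
paths first, select rows cheaply, then read only those files):

    tab    <- list_parallel_results("res")
    wanted <- tab[tab$myvar2 == "A", ]
    read_parallel_results(wanted, split = "\t")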