Thanks, Romain, for the explanations. I had seen the relativizer used in the FileSource, but since I assumed a source would be executed on the submission machine, it was a bit confusing… Besides that, it's much cleaner this way; my approach so far, using a ScalaTask running in the local environment, probably had the same effect.
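In case it helps anyone on the list, this is roughly how the relativizer slots into such a source. It is only a sketch modelled on my DataSetSource quoted below: ConfigSource and the config prototype are made-up names from my plugin, not OpenMOLE API, and I am assuming that relativise hands back something scala.io.Source.fromFile can open, as in Romain's snippet:

  object ConfigSource {
    def apply(path: ExpandedString, config: Prototype[String]) =
      new SourceBuilder {
        addOutput(config)
        def toSource = new ConfigSource(path, config) with Built
      }
  }

  // Hypothetical source that reads a local config file on the submission
  // machine; the relativizer lets an execution manager relocate the path.
  abstract class ConfigSource(path: ExpandedString, config: Prototype[String]) extends Source {
    override def process(context: Context, executionContext: ExecutionContext)
                        (implicit rng: RandomProvider) = {
      // Expand the user-supplied path, then let OpenMOLE relativize it
      // (a no-op by default, as Romain explains below).
      val expandedPath = executionContext.relativise(path.from(context))
      // Read eagerly: sources run on the submission machine, so no lazy I/O.
      Variable(config, scala.io.Source.fromFile(expandedPath).mkString)
    }
  }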
I also think it might be a bit more "efficient" with pre-compiled sources, because the (many) ScalaTask scripts need to be (re-)compiled every time before the workflow executes. Every time I extend/refactor my plugin, I learn something new about how to properly use OpenMOLE's features… :D

Cheers,
Andreas

> On 9 Jun 2015, at 03:05, Romain Reuillon <[email protected]> wrote:
>
> Hi Andreas,
>
> Great that you got through. The sources are by construction executed on the
> submission machine, so make sure that all the operations which depend on
> locally accessible resources (file reading, DB queries, ...) are executed
> in the source (no lazy operations).
>
> The executionContext is used to pilot the side effects. It provides an
> output stream, in case you need to display something, and a file path
> relativizer, which is used in case some OpenMOLE execution manager wants to
> relocate all file paths. By default the relativizer does nothing and the
> output stream points to System.out. In the GUI the output stream will be
> redirected to the display area associated with the workflow execution. We
> have no use for the relativizer anymore, but I kept it in case it might be
> of some use in the future. In the future we could also imagine some
> mechanism in the execution for rewriting the database connection query, and
> other mechanisms to be able to control the context of the side effects...
>
> To enable the file relativizer for your source:
>
>   val expandedPath = executionContext.relativise(path.from(context))
>
> cheers,
> Romain
>
> Le 09/06/2015 08:25, Andreas Schuh a écrit :
>> In the meantime I've implemented some Sources for this.
>>
>> The one question that remains unclear to me is whether I should use the
>> ExecutionContext, and what for.
>>
>> My DataSetSource as outlined in the previous email is currently
>> implemented as follows, which works fine in the LocalEnvironment. I have
>> yet to test it in an actual distributed environment:
>>
>>   object DataSetSource {
>>     def apply(setId: Prototype[String], dataSet: Prototype[DataSet]) =
>>       new SourceBuilder {
>>         addInput(setId)
>>         addOutput(dataSet)
>>         def toSource = new DataSetSource(setId, dataSet) with Built
>>       }
>>   }
>>
>>   abstract class DataSetSource(setId: Prototype[String],
>>                                dataSet: Prototype[DataSet]) extends Source {
>>     override def process(context: Context, executionContext: ExecutionContext)
>>                         (implicit rng: RandomProvider) = {
>>       val name = context.option(setId).get
>>       Variable(dataSet, DataSet(name))
>>     }
>>   }
>>
>>> On 8 Jun 2015, at 20:06, Andreas Schuh <[email protected]> wrote:
>>>
>>> Hi,
>>>
>>> I had a look at how data sources are implemented in OpenMOLE, but I would
>>> like some more background information before I attempt to write my own
>>> for my REPEAT workflow plugin. Basically, what I currently have is a
>>> workflow with some IDs identifying, for instance, the image dataset, the
>>> registration method, and the set of parameters for this registration
>>> method, which corresponds to a row in a CSV table. Most of the
>>> information needed by the ScalaTasks is made available by the user via a
>>> HOCON configuration file. My plugin contains classes and objects for easy
>>> access to the parsed configuration values. For example, for a specific
>>> image dataset to be used, I have a class such as:
>>>
>>>   object Dataset {
>>>     val names: Set[String]
>>>   }
>>>
>>>   class Dataset(val id: String) {
>>>     val dir = …
>>>     def imgCsv = ...
>>>     def imgPath(imgId: String) = …
>>>     // ...
>>>   }
>>>
>>> A typical workflow then starts with an ExplorationTask which samples all
>>> the dataset IDs:
>>>
>>>   val setId = Val[String]
>>>   val exploreDataSets = ExplorationTask(setId in Dataset.names)
>>>
>>> After the exploration transition, I want to inject a Val[Dataset] into
>>> the workflow that is then used as input to a task, which therefore has
>>> access to all the information about the dataset via the respective
>>> Dataset class instance:
>>>
>>>   val dataSet = Val[Dataset]
>>>   val getDataSet = Capsule(ScalaTask("val dataSet = Dataset(setId)") set
>>>     (inputs += setId, outputs += dataSet), strainer = true)
>>>
>>> To avoid requiring all the data to be strained through this simple
>>> ScalaTask, I tried to use the new "Strain" pattern instead, but realised
>>> that this doesn't work because the newly injected dataSet variable is
>>> then only available in one branch of the "Strain" puzzle. Maybe this is
>>> still an issue with this pattern… On the other hand, it looks like a
>>> Source would be more suitable for what I want to do?
>>>
>>>   val imgId = Val[String]
>>>   val exploreImages = ExplorationTask(CSVSampling("${dataSet.imgCsv}") set
>>>     (columns += ("ID", imgId)))
>>>   val processImage = ScalaTask("… dataSet.imgPath(imgId) …") set
>>>     (inputs += (dataSet, imgId))
>>>
>>>   val ex = exploreDataSets -< getDataSet -- exploreImages -< processImage start
>>>
>>> Note: I am using my modified (hacked) CSVSampling here, which takes an
>>> ExpandedString as argument instead of a File.
>>>
>>> With a custom DatasetSource, I would instead have something like:
>>>
>>>   val getDataSetUsingSource = Capsule(EmptyTask() set (inputs += setId,
>>>     outputs += dataSet)) source DatasetSource(setId, dataSet)
>>>
>>> Any suggestions on how to best inject the "Dataset" variable into the
>>> workflow? Using a ScalaTask or a Source? Note that instantiating this
>>> class requires information from a local HOCON configuration file, whose
>>> content I currently insert as a string literal into the getDataSet
>>> ScalaTask script. The DatasetSource instance could have access to the
>>> com.typesafe.config.Config object of my loaded plugin with the already
>>> parsed information.
>>>
>>> Thanks,
>>>
>>> Andreas
>
> _______________________________________________
> OpenMOLE-users mailing list
> [email protected]
> http://fedex.iscpif.fr/mailman/listinfo/openmole-users
