Hi all,
I have a proposal for how we name pipelines and their executions.
The goal is to clarify the difference between the two, reach
consensus across runners, and unify the implementation.
Current state:
* PipelineOptions.appName defaults to mainClass name
* DataflowPipelineOptions.jobName defaults to appName+user+datetime
* FlinkPipelineOptions.jobName defaults to appName+user+datetime
Proposal:
1. Replace PipelineOptions.appName with PipelineOptions.pipelineName.
* It is the user-visible name for a specific graph.
* Defaults to the mainClass name.
* Use case: finding all executions of a pipeline.
2. Add jobName to the top-level PipelineOptions.
* It is the unique name for a single execution.
* Defaults to pipelineName + user + datetime + random integer.
* Use cases:
-- Finding all executions by USER_A between TIME_X and TIME_Y.
-- Naming resources created by the execution, for example:
writing temp files to the folder TMP_DIR/jobName/, writing to the default
output file jobName.output, or creating a temp /subscriptions/jobName.
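To make the proposed defaults concrete, here is a rough sketch in plain Java.
It is not SDK code. The class NamingSketch, its method names, the "-"
separator, and the yyyyMMddHHmmss timestamp format are all assumptions made
for illustration; the proposal only says which parts go into each name.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper illustrating the proposed defaults; not part of any SDK. */
final class NamingSketch {
  // Assumed timestamp layout; the proposal only says "datetime".
  private static final DateTimeFormatter STAMP =
      DateTimeFormatter.ofPattern("yyyyMMddHHmmss");

  private NamingSketch() {}

  /** pipelineName default: the name of the main class. */
  static String defaultPipelineName(Class<?> mainClass) {
    return mainClass.getSimpleName();
  }

  /** jobName default: pipelineName + user + datetime + random integer. */
  static String defaultJobName(
      String pipelineName, String user, LocalDateTime now, int randomSuffix) {
    // The "-" separator is an assumption.
    return String.join(
        "-", pipelineName, user, now.format(STAMP), Integer.toString(randomSuffix));
  }

  /** Convenience overload using the current user, the current time, and a fresh random int. */
  static String defaultJobName(String pipelineName) {
    return defaultJobName(
        pipelineName,
        System.getProperty("user.name", "unknown"),
        LocalDateTime.now(),
        ThreadLocalRandom.current().nextInt(Integer.MAX_VALUE)); // non-negative
  }

  // Resource names derived from jobName, following the examples in the use cases.

  static String tempDir(String tmpDir, String jobName) {
    return tmpDir + "/" + jobName + "/";
  }

  static String defaultOutputFile(String jobName) {
    return jobName + ".output";
  }

  static String tempSubscription(String jobName) {
    return "/subscriptions/" + jobName;
  }
}
```

Because jobName starts with pipelineName, every execution of a pipeline shares
a prefix, while the user and datetime parts are what a "by USER_A between
TIME_X and TIME_Y" query would filter on. The random integer only serves to
keep two executions started in the same second from colliding.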
Please let me know what you think.
Thanks
--
Pei