I'm trying to create JSON files that define Amazon Data Pipeline jobs. We 
have 40 or so different jobs that all need processing on EMR. Most of the 
config is shared between them, with only the input paths and EMR step 
definition changing per job, so I need to be able to reuse the same EMR 
config for multiple tasks while keeping the flexibility to override it per 
task. 

So in the example I first gave, there are 2 jobs using the same default EMR 
config, and one that will use a custom config. As I say, it's basically a 
factory pattern, with each job knowing which EMR config it needs, and the 
whole thing being templated generically.
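
For illustration, the reuse-with-override part can be sketched with YAML 
anchors and merge keys. This is only a sketch under assumptions: the anchor 
name default_emr and the step_name key are hypothetical, and because anchors 
only resolve within a single YAML file, it assumes the shared defaults sit in 
the same vars file as the definitions rather than in an included one.

```yaml
# Hypothetical vars file; "default_emr" and "step_name" are illustrative names.
default_emr: &default_emr
  step_name: default_step

definitions:
  - <<: *default_emr         # merge in the shared keys
    product: web_v2
    suite: websuite
  - <<: *default_emr
    product: db2
    suite: dbsuite
  - <<: *default_emr
    product: custom
    suite: customsuite
    step_name: custom_step   # explicit key overrides the merged default
```

A key written explicitly in an entry takes precedence over the merged one, 
which is what would let the third job swap in its custom step.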

Here's a sample of the JSON template I'm trying to populate:

{% for definition in definitions %}
    {
      "id": "EmrActivityId_{{ loop.index }}",
      "name": "EmrActivity_{{ definition.suite }}",
      "precondition": {
        "ref": "PreconditionId_{{ loop.index }}"
      },
      "runsOn": {
        "ref": "EmrClusterId"
      },
      "type": "EmrActivity",
      "myDate": "{{ date | default('#{format(minusDays(@scheduledStartTime, 1), \'YYYY-MM-dd\')}') }}",
      "step": "{{ {{ definition.step_name }}.step }}"
    },
    {
      "id": "PreconditionId_{{ loop.index }}",
      "name": "InputExistsPrecondition",
      "s3Prefix": "s3://example-{{ env }}-data{{ definition.s3_precondition }}",
      "type": "S3PrefixNotEmpty"
    },
{% endfor %}


The rest of the template is identical for all jobs. So I'm really looking 
for a way to be able to populate the step definition per job so I can 
maximise reuse. As I say, there are around 40 of these particular tasks I 
need to migrate.
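
A minimal sketch of the by-name lookup, assuming the steps are collected 
under one hypothetical top-level dict called steps and each definition 
carries a step_name key. Jinja2 won't parse {{ }} nested inside {{ }}, but a 
variable can be used as a subscript inside a single expression, so the 
template line could become "step": "{{ steps[definition.step_name].step }}". 
The vars side might then look like this:

```yaml
# Hypothetical vars file holding every step, keyed by name.
steps:
  default_step:
    step:
      - "s3://my-bucket/artifacts/emr-jar-2.1.1.jar"
      - "com.example.SampleEMR"
  custom_step:
    step: []   # placeholder: the custom job's own arguments would go here
```

This only covers selecting a step by name. It does not by itself settle how 
{{ product }} and {{ suite }} inside a step get their per-job values, and the 
rendered list may still need joining into whatever string format the pipeline 
expects.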

On Monday, January 5, 2015 4:13:59 PM UTC, Brian Coca wrote:
>
> sounds like a very complicated process, what are you trying to do in 
> the end? it is normally simpler with ansible, it is rare to need 
> nested variable includes. 
>
> On Mon, Jan 5, 2015 at 10:09 AM, Mark <[email protected]> wrote: 
> > Thanks for your reply Michael. Can you think of any other way I can achieve
> > what I'm trying to? I have tried:
> >
> > - templating the vars file before loading it with include_vars, but jinja
> > complained about undefined variables
> > - doing a string replace into the vars file, but then the variables in the
> > default_step.yml file weren't interpolated when it was later loaded by
> > ansible
> > - trying to load the step by name in the json template I'm trying to create.
> > I.e. for each product/suite combination I added 'step: default_step', then
> > had another vars file containing all of the steps (e.g. steps.default_step,
> > steps.custom_step) and then in my json template trying "{{ {{
> > definition.step }}.step }}", but jinja wouldn't parse that.
> > 
> > I'm racking my brains but can't think how I can do this. 
> > 
> > Thanks 
> > 
> > On Monday, January 5, 2015 2:14:45 PM UTC, Michael DeHaan wrote: 
> >> 
> >> There is no facility for a variable file including another. 
> >> 
> >> Jinja2 is also not invoked when reading variable files at that time, so 
> >> include won't help. 
> >> 
> >> 
> >> 
> >> On Mon, Jan 5, 2015 at 6:17 AM, Mark <[email protected]> wrote: 
> >>> 
> >>> Hi, 
> >>> 
> >>> I'm trying to create some sort of factory pattern I guess in my vars
> >>> files for creating Amazon data pipeline jobs. What I want is to have a list
> >>> of dicts, and each one will contain a parameter ("step") that I want to be
> >>> different per item. However, since this value is quite large & complicated,
> >>> and will be used for 90% of the items, I don't want to have to duplicate
> >>> this value 40 or so times.
> >>>
> >>> So, is it possible to include a vars file in another one? I tried using
> >>> jinja2's "include" function, but it complained that certain variables
> >>> weren't defined because it was trying to resolve the variables in the
> >>> included file. In fact, those variables should be parsed later in the main
> >>> play, not when including one vars file into the main one. I can't use roles
> >>> because this is part of a larger pattern in which the main vars file is
> >>> loaded dynamically depending on another variable.
> >>> 
> >>> Some examples might make clear what I mean. 
> >>> 
> >>> Here's my playbook, "create-job.yml": 
> >>> 
> >>> - name: "Create a data pipeline and definition for {{ product }} {{ job }}"
> >>>   hosts: localhost 
> >>>   gather_facts: True 
> >>>   vars_files: 
> >>>     - "vars/pipelines/{{ group }}/env/{{ env }}.yml" 
> >>>     - "vars/pipelines/{{ group }}/{{ job }}.yml" 
> >>> 
> >>> 
> >>> Here's my vars file, "job1.yml", for the "job1" job: 
> >>> 
> >>> template: multiple-emr 
> >>> startTime: 03:00:00 
> >>> definitions: 
> >>> - product: web_v2 
> >>>   suite: websuite 
> >>>   {% include default_step.yml %}         # how to include "default_step.yml"?
> >>> - product: db2
> >>>   suite: dbsuite
> >>>   {% include default_step.yml %}
> >>> - product: custom 
> >>>   suite: customsuite 
> >>>   {% include custom_step.yml %} 
> >>> ... x 40 
> >>> 
> >>> 
> >>> And here's the contents of "default_step.yml": 
> >>> 
> >>> s3_precondition: "/raw/data/#{node.myDate}/{{ product }}/" 
> >>> step: 
> >>> - "s3://my-bucket/artifacts/emr-jar-2.1.1.jar" 
> >>> - "com.example.SampleEMR" 
> >>> - "-Dinput=s3n://example-{{ env }}-data{{ s3_precondition | replace('node.', '') }}*{{ suite }}_#{myDate}.*.gz"
> >>> - "-Doutput=s3n://example-{{ env }}-data/intermediate/data/#{myDate}/{{ product }}/{{ suite }}/"
> >>> - "-DoutputFormat=json"
> >>> ... 
> >>> 
> >>> 
> >>> How can I achieve this with ansible? 
> >>> 
> >>> Mark 
> >>> 
> >>> -- 
> >>> You received this message because you are subscribed to the Google 
> Groups 
> >>> "Ansible Project" group. 
> >>> To unsubscribe from this group and stop receiving emails from it, send 
> an 
> >>> email to [email protected]. 
> >>> To post to this group, send email to [email protected]. 
> >>> To view this discussion on the web visit 
> >>> 
> https://groups.google.com/d/msgid/ansible-project/cbe7f116-aadf-4d57-94ea-712834cb498a%40googlegroups.com.
>  
>
> >>> For more options, visit https://groups.google.com/d/optout. 
> >> 
> >> 
>
>
>
> -- 
> Brian Coca 
>
