Re: [ansible-devel] Ansilbe read all variable even they do not need for task (perfomance issue with lookup_plugins)

2016-09-21 Thread Ben Roubicek
For others that may come across this, the bug was fixed.  Comments here 
https://github.com/ansible/ansible/issues/17024

On Monday, September 12, 2016 at 1:31:32 PM UTC-7, Ben Roubicek wrote:
>
> Bug report here: https://github.com/ansible/ansible/issues/17024
>
> On Monday, September 12, 2016 at 8:55:20 AM UTC-7, Ben Roubicek wrote:
>>
>> Tobias, I must have jumped the gun on Friday, apologies. Looks like its 
>> actually a different commit causing the issue.  Sorry :-)
>>
>> Culprit commit:  
>> https://github.com/ansible/ansible/commit/6b286ee0c88f3252a27ff94f39d794566c3f7d6f#diff-76ffc0551f8cf3d6255500316568e60bR412
>>
>> My scenario is that I have a inventory/foo/group_vars/all/hashivault.yml 
>> file that specifies 100+ secrets that are to be retrieved via a custom 
>> lookup plugin.  Before this commit, evaluation of the variables would only 
>> occur when they are used.  With this commit applied, each task is 
>> re-evaluating the variables causing thousands of calls during the play.
>>
>> My lookup calls look like:
>> db_password: "{{ lookup('vault', 'secret/db_password', 'value') }}"
>>
>>
>> Anyway, tinkering with the code a bit.
>> in lib/ansible/executor/task_executor.py
>> By simply changing 
>> self._connection.set_host_overrides(host=self._host, hostvars=variables.
>> get('hostvars', {}).get(self._host.name, {}))
>>
>> back to: 
>> self._connection.set_host_overrides(host=self._host)
>>
>> Reduces the calls back to only those required by the task.
>>
>> I dont believe I understand the code well enough to make a PR, so I would 
>> appreciate some advice on how best to proceed with a fix (assuming this was 
>> an unintended side-effect of this commit).
>>
>> Thanks
>> -Ben
>>
>> On Saturday, September 10, 2016 at 1:51:23 AM UTC-7, Tobias Wolf wrote:
>>>
>>> Can you point out how you think that particular commit affects how often 
>>> lookups are evaluated? I don't see it.
>>>
>>> All that is doing is to make a os.listdir (essentially an ls) in the 
>>> host_vars and group_vars directories and then checking group names and host 
>>> names against this listing instead of checking on the file system causing a 
>>> storm of lstats.
>>>
>>> Maybe you were mistaken by the word "lookup"? Lookup refers to `stat` 
>>> here.
>>>
>>> regards,
>>> Tobias (author of that commit)
>>>
>>> On Saturday, September 10, 2016 at 2:31:56 AM UTC+2, Ben Roubicek wrote:
>>>>
>>>> Also, I found the commit that changed this behavior.
>>>>
>>>> https://github.com/ansible/ansible/commit/7287effb5ce241ce645d61e55e981edc73fa382a
>>>>
>>>>
>>>>
>>>> On Friday, September 9, 2016 at 5:01:23 PM UTC-7, Ben Roubicek wrote:
>>>>>
>>>>> Following up here. The behavior of lookup plugin evaluation has 
>>>>> definitely changed post 2.1.0.0.  In 2.1.0.0, lookup calls seem to be 
>>>>> lazy 
>>>>> evaluated so if you perform an HTTP lookup to Hashicorp Vault, for 
>>>>> example, 
>>>>> that call is only made when the variable is used in the play.  Post 
>>>>> 2.1.0.0, all variables seem to be evaluated for each task!  Thats a huge 
>>>>> change of behavior causing considerable performance problems for us since 
>>>>> we relay on gathering secrets from an external source specified at the 
>>>>> inventory level.
>>>>>
>>>>> Is there an explanation for why that change was made?  Can it be 
>>>>> changed back to the previous behavior?
>>>>>
>>>>> On Thursday, August 25, 2016 at 12:05:18 AM UTC-7, jhawkesworth wrote:
>>>>>>
>>>>>> I wonder if you can use uri delegated to your localhost to collect 
>>>>>> your facts, and use set_fact to make them available to the play.
>>>>>>
>>>>>> Also I wonder if a custom setup module might be able to do what you 
>>>>>> need? If you can get the vars you need into facts you can use one of the 
>>>>>> existing fact caching mechanisms 
>>>>>> http://docs.ansible.com/ansible/playbooks_variables.html#fact-caching
>>>>>>
>>>>>> Jon
>>>>>>
>>>>>> On Tuesday, August 23, 2016 at 2:54:36 PM UTC+1, Artyom Aleksandrov 
>>>>>> wrote:
>>>>>>>

Re: [ansible-devel] Ansilbe read all variable even they do not need for task (perfomance issue with lookup_plugins)

2016-09-12 Thread Ben Roubicek
Bug report here: https://github.com/ansible/ansible/issues/17024

On Monday, September 12, 2016 at 8:55:20 AM UTC-7, Ben Roubicek wrote:
>
> Tobias, I must have jumped the gun on Friday, apologies. Looks like its 
> actually a different commit causing the issue.  Sorry :-)
>
> Culprit commit:  
> https://github.com/ansible/ansible/commit/6b286ee0c88f3252a27ff94f39d794566c3f7d6f#diff-76ffc0551f8cf3d6255500316568e60bR412
>
> My scenario is that I have a inventory/foo/group_vars/all/hashivault.yml 
> file that specifies 100+ secrets that are to be retrieved via a custom 
> lookup plugin.  Before this commit, evaluation of the variables would only 
> occur when they are used.  With this commit applied, each task is 
> re-evaluating the variables causing thousands of calls during the play.
>
> My lookup calls look like:
> db_password: "{{ lookup('vault', 'secret/db_password', 'value') }}"
>
>
> Anyway, tinkering with the code a bit.
> in lib/ansible/executor/task_executor.py
> By simply changing 
> self._connection.set_host_overrides(host=self._host, hostvars=variables.
> get('hostvars', {}).get(self._host.name, {}))
>
> back to: 
> self._connection.set_host_overrides(host=self._host)
>
> Reduces the calls back to only those required by the task.
>
> I dont believe I understand the code well enough to make a PR, so I would 
> appreciate some advice on how best to proceed with a fix (assuming this was 
> an unintended side-effect of this commit).
>
> Thanks
> -Ben
>
> On Saturday, September 10, 2016 at 1:51:23 AM UTC-7, Tobias Wolf wrote:
>>
>> Can you point out how you think that particular commit affects how often 
>> lookups are evaluated? I don't see it.
>>
>> All that is doing is to make a os.listdir (essentially an ls) in the 
>> host_vars and group_vars directories and then checking group names and host 
>> names against this listing instead of checking on the file system causing a 
>> storm of lstats.
>>
>> Maybe you were mistaken by the word "lookup"? Lookup refers to `stat` 
>> here.
>>
>> regards,
>> Tobias (author of that commit)
>>
>> On Saturday, September 10, 2016 at 2:31:56 AM UTC+2, Ben Roubicek wrote:
>>>
>>> Also, I found the commit that changed this behavior.
>>>
>>> https://github.com/ansible/ansible/commit/7287effb5ce241ce645d61e55e981edc73fa382a
>>>
>>>
>>>
>>> On Friday, September 9, 2016 at 5:01:23 PM UTC-7, Ben Roubicek wrote:
>>>>
>>>> Following up here. The behavior of lookup plugin evaluation has 
>>>> definitely changed post 2.1.0.0.  In 2.1.0.0, lookup calls seem to be lazy 
>>>> evaluated so if you perform an HTTP lookup to Hashicorp Vault, for 
>>>> example, 
>>>> that call is only made when the variable is used in the play.  Post 
>>>> 2.1.0.0, all variables seem to be evaluated for each task!  Thats a huge 
>>>> change of behavior causing considerable performance problems for us since 
>>>> we relay on gathering secrets from an external source specified at the 
>>>> inventory level.
>>>>
>>>> Is there an explanation for why that change was made?  Can it be 
>>>> changed back to the previous behavior?
>>>>
>>>> On Thursday, August 25, 2016 at 12:05:18 AM UTC-7, jhawkesworth wrote:
>>>>>
>>>>> I wonder if you can use uri delegated to your localhost to collect 
>>>>> your facts, and use set_fact to make them available to the play.
>>>>>
>>>>> Also I wonder if a custom setup module might be able to do what you 
>>>>> need? If you can get the vars you need into facts you can use one of the 
>>>>> existing fact caching mechanisms 
>>>>> http://docs.ansible.com/ansible/playbooks_variables.html#fact-caching
>>>>>
>>>>> Jon
>>>>>
>>>>> On Tuesday, August 23, 2016 at 2:54:36 PM UTC+1, Artyom Aleksandrov 
>>>>> wrote:
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>>
>>>>>> Thank you for answer. It's bad news because all variables which *can 
>>>>>> be* used by host  reads  on each task. 
>>>>>> Variables in *all* group of course provide most reads but problem 
>>>>>> also applies to other groups and host_vars. =(
>>>>>>  
>>>>>> > Can you give a different example of that being in a group the 
>>>>>> current host is not a member of and ye

Re: [ansible-devel] Ansilbe read all variable even they do not need for task (perfomance issue with lookup_plugins)

2016-09-12 Thread Ben Roubicek
Tobias, I must have jumped the gun on Friday, apologies. Looks like its 
actually a different commit causing the issue.  Sorry :-)

Culprit commit: 
 
https://github.com/ansible/ansible/commit/6b286ee0c88f3252a27ff94f39d794566c3f7d6f#diff-76ffc0551f8cf3d6255500316568e60bR412

My scenario is that I have a inventory/foo/group_vars/all/hashivault.yml 
file that specifies 100+ secrets that are to be retrieved via a custom 
lookup plugin.  Before this commit, evaluation of the variables would only 
occur when they are used.  With this commit applied, each task is 
re-evaluating the variables causing thousands of calls during the play.

My lookup calls look like:
db_password: "{{ lookup('vault', 'secret/db_password', 'value') }}"


Anyway, tinkering with the code a bit.
in lib/ansible/executor/task_executor.py
By simply changing 
self._connection.set_host_overrides(host=self._host, hostvars=variables.get(
'hostvars', {}).get(self._host.name, {}))

back to: 
self._connection.set_host_overrides(host=self._host)

Reduces the calls back to only those required by the task.

I dont believe I understand the code well enough to make a PR, so I would 
appreciate some advice on how best to proceed with a fix (assuming this was 
an unintended side-effect of this commit).

Thanks
-Ben

On Saturday, September 10, 2016 at 1:51:23 AM UTC-7, Tobias Wolf wrote:
>
> Can you point out how you think that particular commit affects how often 
> lookups are evaluated? I don't see it.
>
> All that is doing is to make a os.listdir (essentially an ls) in the 
> host_vars and group_vars directories and then checking group names and host 
> names against this listing instead of checking on the file system causing a 
> storm of lstats.
>
> Maybe you were mistaken by the word "lookup"? Lookup refers to `stat` here.
>
> regards,
> Tobias (author of that commit)
>
> On Saturday, September 10, 2016 at 2:31:56 AM UTC+2, Ben Roubicek wrote:
>>
>> Also, I found the commit that changed this behavior.
>>
>> https://github.com/ansible/ansible/commit/7287effb5ce241ce645d61e55e981edc73fa382a
>>
>>
>>
>> On Friday, September 9, 2016 at 5:01:23 PM UTC-7, Ben Roubicek wrote:
>>>
>>> Following up here. The behavior of lookup plugin evaluation has 
>>> definitely changed post 2.1.0.0.  In 2.1.0.0, lookup calls seem to be lazy 
>>> evaluated so if you perform an HTTP lookup to Hashicorp Vault, for example, 
>>> that call is only made when the variable is used in the play.  Post 
>>> 2.1.0.0, all variables seem to be evaluated for each task!  Thats a huge 
>>> change of behavior causing considerable performance problems for us since 
>>> we relay on gathering secrets from an external source specified at the 
>>> inventory level.
>>>
>>> Is there an explanation for why that change was made?  Can it be changed 
>>> back to the previous behavior?
>>>
>>> On Thursday, August 25, 2016 at 12:05:18 AM UTC-7, jhawkesworth wrote:
>>>>
>>>> I wonder if you can use uri delegated to your localhost to collect your 
>>>> facts, and use set_fact to make them available to the play.
>>>>
>>>> Also I wonder if a custom setup module might be able to do what you 
>>>> need? If you can get the vars you need into facts you can use one of the 
>>>> existing fact caching mechanisms 
>>>> http://docs.ansible.com/ansible/playbooks_variables.html#fact-caching
>>>>
>>>> Jon
>>>>
>>>> On Tuesday, August 23, 2016 at 2:54:36 PM UTC+1, Artyom Aleksandrov 
>>>> wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>>
>>>>> Thank you for answer. It's bad news because all variables which *can 
>>>>> be* used by host  reads  on each task. 
>>>>> Variables in *all* group of course provide most reads but problem 
>>>>> also applies to other groups and host_vars. =(
>>>>>  
>>>>> > Can you give a different example of that being in a group the 
>>>>> current host is not a member of and yet still being referenced?
>>>>>
>>>>> No. There is no problem with this case.
>>>>>
>>>>> Have you thought about changing this behavior or implementing runtime 
>>>>> cache for variables?
>>>>>
>>>>>
>>>>> On Wednesday, August 17, 2016 at 6:05:29 PM UTC+3, James Cammarata 
>>>>> wrote:
>>>>>>
>>>>>> Sorry for the delayed response, but in the case of having it in the 
>>>>>> `all` group, that