[jira] [Updated] (AMBARI-17198) Failure in mahout package installation upon retry is not correctly reported causing EU to fail

2016-06-15 Thread Dmitry Lysnichenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-17198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Lysnichenko updated AMBARI-17198:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed
To https://git-wip-us.apache.org/repos/asf/ambari.git
   8de7bbd..05cff45  branch-2.4 -> branch-2.4
   4edadd7..2de48e4  trunk -> trunk


> Failure in mahout package installation upon retry is not correctly reported 
> causing EU to fail
> --
>
> Key: AMBARI-17198
> URL: https://issues.apache.org/jira/browse/AMBARI-17198
> Project: Ambari
> Issue Type: Bug
> Components: ambari-server
> Affects Versions: 2.4.0
> Reporter: Dmytro Grinenko
> Assignee: Dmitry Lysnichenko
> Priority: Critical
> Fix For: 2.4.0
>
> Attachments: AMBARI-17198.patch, AMBARI-17198.patch.1
>
>
> *Steps*
> 1. With an Ambari 2.2.2 build, deploy an HDP 2.4.0.0 cluster
> 2. Register bits for HDP-2.4.2.0-195 and start installation of packages
> 3. Observe an error in the first attempt of package install on one of the hosts
> {code}
> stderr:   /var/lib/ambari-agent/data/errors-560.txt
> No handlers could be found for logger "root"
> 2016-04-14 01:22:09,756 - Caught signal 15, will handle it gracefully. Compute the actual version if possible before exiting.
> 2016-04-14 01:22:09,785 - Package Manager failed to install packages. Error: (4, 'Interrupted system call')
> Traceback (most recent call last):
>   File "/var/lib/ambari-agent/cache/custom_actions/scripts/install_packages.py", line 386, in install_packages
>     retry_count=agent_stack_retry_count)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 154, in __init__
>     self.env.run()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
>     self.run_action(resource, action)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
>     provider_action()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/__init__.py", line 54, in action_install
>     self.install_package(package_name, self.resource.use_repos, self.resource.skip_repos)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/zypper.py", line 45, in install_package
>     active_base_repos = self.get_active_base_repos()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/zypper.py", line 73, in get_active_base_repos
>     (code, output) = self.call_with_retries(LIST_ACTIVE_REPOS_CMD)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/__init__.py", line 80, in call_with_retries
>     return self._call_with_retries(cmd, is_checked=False, **kwargs)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/__init__.py", line 91, in _call_with_retries
>     code, out = func(cmd, **kwargs)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
>     result = function(command, **kwargs)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 105, in call
>     tries=tries, try_sleep=try_sleep)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
>     result = _call(command, **kwargs_copy)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 240, in _call
>     ready, _, _ = select.select(read_set, [], [], 1)
> error: (4, 'Interrupted system call')
>  Python script has been killed due to timeout after waiting 1800 secs
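
The traceback above ends in `select.select` raising `(4, 'Interrupted system call')`, i.e. EINTR, after the process received signal 15. As an illustration only, and not the patch attached to this issue, the following is a minimal sketch of one common way to tolerate EINTR around a `select` call on older Python versions. The helper name `select_retrying_on_eintr` and its `select_fn` and `max_retries` parameters are hypothetical and do not come from Ambari. Python 3.5 and later retry interrupted system calls automatically (PEP 475) unless the signal handler raises, so a wrapper like this mainly matters on Python 2. Whether retrying is appropriate after a termination signal depends on the caller.

```python
import errno
import select


def select_retrying_on_eintr(read_set, timeout, select_fn=select.select, max_retries=5):
    """Call select_fn and retry if it is interrupted by a signal (EINTR).

    select_fn and max_retries are hypothetical parameters used for
    illustration; they are not part of any Ambari API.
    """
    attempts = 0
    while True:
        try:
            return select_fn(read_set, [], [], timeout)
        except (OSError, select.error) as exc:
            # OSError exposes .errno; Python 2's select.error keeps the
            # error number in args[0] instead.
            code = getattr(exc, "errno", None)
            if code is None and exc.args:
                code = exc.args[0]
            # Re-raise anything that is not EINTR, or give up after
            # max_retries interrupted attempts.
            if code != errno.EINTR or attempts >= max_retries:
                raise
            attempts += 1
```

Any other error, or more than `max_retries` consecutive interruptions, is re-raised to the caller, so a genuine failure is still reported rather than swallowed.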



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AMBARI-17198) Failure in mahout package installation upon retry is not correctly reported causing EU to fail

2016-06-15 Thread Dmytro Grinenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-17198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmytro Grinenko updated AMBARI-17198:
-
Attachment: AMBARI-17198.patch.1



[jira] [Updated] (AMBARI-17198) Failure in mahout package installation upon retry is not correctly reported causing EU to fail

2016-06-15 Thread Dmytro Grinenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-17198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmytro Grinenko updated AMBARI-17198:
-
Attachment: (was: AMBARI-17198.patch.1)



[jira] [Updated] (AMBARI-17198) Failure in mahout package installation upon retry is not correctly reported causing EU to fail

2016-06-15 Thread Dmytro Grinenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-17198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmytro Grinenko updated AMBARI-17198:
-
Attachment: (was: AMBARI-17198.patch.1)



[jira] [Updated] (AMBARI-17198) Failure in mahout package installation upon retry is not correctly reported causing EU to fail

2016-06-15 Thread Dmytro Grinenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-17198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmytro Grinenko updated AMBARI-17198:
-
Attachment: AMBARI-17198.patch.1



[jira] [Updated] (AMBARI-17198) Failure in mahout package installation upon retry is not correctly reported causing EU to fail

2016-06-14 Thread Dmytro Grinenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-17198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmytro Grinenko updated AMBARI-17198:
-
Attachment: AMBARI-17198.patch.1



[jira] [Updated] (AMBARI-17198) Failure in mahout package installation upon retry is not correctly reported causing EU to fail

2016-06-14 Thread Dmytro Grinenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-17198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmytro Grinenko updated AMBARI-17198:
-
Attachment: (was: AMBARI-17198.patch.1)



[jira] [Updated] (AMBARI-17198) Failure in mahout package installation upon retry is not correctly reported causing EU to fail

2016-06-14 Thread Dmytro Grinenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-17198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmytro Grinenko updated AMBARI-17198:
-
Attachment: AMBARI-17198.patch.1



[jira] [Updated] (AMBARI-17198) Failure in mahout package installation upon retry is not correctly reported causing EU to fail

2016-06-13 Thread Dmytro Grinenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-17198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmytro Grinenko updated AMBARI-17198:
-
Status: Patch Available  (was: Open)

> Failure in mahout package installation upon retry is not correctly reported 
> causing EU to fail
> --
>
> Key: AMBARI-17198
> URL: https://issues.apache.org/jira/browse/AMBARI-17198
> Project: Ambari
>  Issue Type: Bug
>  Components: ambari-server
>Affects Versions: 2.4.0
>Reporter: Dmytro Grinenko
>Priority: Critical
> Fix For: 2.4.0
>
> Attachments: AMBARI-17198.patch
>
>
> *Steps*
> 1. With Ambari 2.2.2 build, deploy HDP 2.4.0.0 cluster
> 2. Register bits for HDP-2.4.2.0-195 and start Installation of packages
> 3. Observed an error in first attempt of package install on one of the host
> {code}
> stderr:   /var/lib/ambari-agent/data/errors-560.txt
> No handlers could be found for logger "root"
> 2016-04-14 01:22:09,756 - Caught signal 15, will handle it gracefully. Compute the actual version if possible before exiting.
> 2016-04-14 01:22:09,785 - Package Manager failed to install packages. Error: (4, 'Interrupted system call')
> Traceback (most recent call last):
>   File "/var/lib/ambari-agent/cache/custom_actions/scripts/install_packages.py", line 386, in install_packages
>     retry_count=agent_stack_retry_count)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 154, in __init__
>     self.env.run()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
>     self.run_action(resource, action)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
>     provider_action()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/__init__.py", line 54, in action_install
>     self.install_package(package_name, self.resource.use_repos, self.resource.skip_repos)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/zypper.py", line 45, in install_package
>     active_base_repos = self.get_active_base_repos()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/zypper.py", line 73, in get_active_base_repos
>     (code, output) = self.call_with_retries(LIST_ACTIVE_REPOS_CMD)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/__init__.py", line 80, in call_with_retries
>     return self._call_with_retries(cmd, is_checked=False, **kwargs)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/__init__.py", line 91, in _call_with_retries
>     code, out = func(cmd, **kwargs)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
>     result = function(command, **kwargs)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 105, in call
>     tries=tries, try_sleep=try_sleep)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
>     result = _call(command, **kwargs_copy)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 240, in _call
>     ready, _, _ = select.select(read_set, [], [], 1)
> error: (4, 'Interrupted system call')
>  Python script has been killed due to timeout after waiting 1800 secs
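
For readers unfamiliar with the final frame of the traceback: errno 4 is EINTR, which a blocking call such as select.select could raise on Python 2.6 when a signal (here signal 15) arrived while it was waiting. The sketch below shows one generic way to retry a call on EINTR. It is an illustration only, not the change made in AMBARI-17198.patch; the helper name call_retrying_eintr and its max_retries parameter are hypothetical.

```python
import errno
import select


def call_retrying_eintr(func, *args, **kwargs):
    """Call func(*args, **kwargs), retrying while it fails with EINTR.

    Hypothetical helper for illustration; not part of Ambari.
    max_retries (keyword only, default 5) bounds the number of retries.
    """
    max_retries = kwargs.pop("max_retries", 5)
    attempt = 0
    while True:
        try:
            return func(*args, **kwargs)
        except (EnvironmentError, select.error) as exc:
            # On Python 2, select.error carries the errno in args[0]
            # and has no .errno attribute; on Python 3 it is OSError.
            code = getattr(exc, "errno", None)
            if code is None and exc.args:
                code = exc.args[0]
            if code != errno.EINTR or attempt >= max_retries:
                raise
            attempt += 1
```

Under these assumptions, the interrupted call from the traceback would be wrapped as call_retrying_eintr(select.select, read_set, [], [], 1). Whether retrying is appropriate when the signal is a deliberate termination request, as in this report, is a separate question. On Python 3.5 and later the interpreter already retries such calls itself unless the signal handler raises.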



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AMBARI-17198) Failure in mahout package installation upon retry is not correctly reported causing EU to fail

2016-06-13 Thread Dmytro Grinenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-17198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmytro Grinenko updated AMBARI-17198:
-
Summary: Failure in mahout package installation upon retry is not correctly 
reported causing EU to fail  (was: BUG-56045 Failure in mahout package 
installation upon retry is not correctly reported causing EU to fail)

> Failure in mahout package installation upon retry is not correctly reported 
> causing EU to fail
> --
>
> Key: AMBARI-17198
> URL: https://issues.apache.org/jira/browse/AMBARI-17198
> Project: Ambari
> Issue Type: Bug
> Components: ambari-server
> Affects Versions: 2.4.0
> Reporter: Dmytro Grinenko
> Priority: Critical
> Fix For: 2.4.0
>
> Attachments: AMBARI-17198.patch
>
>
> *Steps*
> 1. With Ambari 2.2.2 build, deploy HDP 2.4.0.0 cluster
> 2. Register bits for HDP-2.4.2.0-195 and start Installation of packages
> 3. Observed an error in the first attempt of package install on one of the hosts
> {code}
> stderr:   /var/lib/ambari-agent/data/errors-560.txt
> No handlers could be found for logger "root"
> 2016-04-14 01:22:09,756 - Caught signal 15, will handle it gracefully. Compute the actual version if possible before exiting.
> 2016-04-14 01:22:09,785 - Package Manager failed to install packages. Error: (4, 'Interrupted system call')
> Traceback (most recent call last):
>   File "/var/lib/ambari-agent/cache/custom_actions/scripts/install_packages.py", line 386, in install_packages
>     retry_count=agent_stack_retry_count)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 154, in __init__
>     self.env.run()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
>     self.run_action(resource, action)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
>     provider_action()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/__init__.py", line 54, in action_install
>     self.install_package(package_name, self.resource.use_repos, self.resource.skip_repos)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/zypper.py", line 45, in install_package
>     active_base_repos = self.get_active_base_repos()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/zypper.py", line 73, in get_active_base_repos
>     (code, output) = self.call_with_retries(LIST_ACTIVE_REPOS_CMD)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/__init__.py", line 80, in call_with_retries
>     return self._call_with_retries(cmd, is_checked=False, **kwargs)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/__init__.py", line 91, in _call_with_retries
>     code, out = func(cmd, **kwargs)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
>     result = function(command, **kwargs)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 105, in call
>     tries=tries, try_sleep=try_sleep)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
>     result = _call(command, **kwargs_copy)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 240, in _call
>     ready, _, _ = select.select(read_set, [], [], 1)
> error: (4, 'Interrupted system call')
>  Python script has been killed due to timeout after waiting 1800 secs


