> On June 16, 2016, 10:52 p.m., Andrew Onischuk wrote: > > ambari-server/src/main/resources/common-services/HIVE/0.12.0.2.0/package/scripts/mysql_service.py, > > line 63 > > <https://reviews.apache.org/r/48766/diff/2/?file=1421794#file1421794line63> > > > > Else should go into for, not if (in case break didn't execute). > > > > Also (if code == 0) line is not needed since checked_call checks for > > non-zero return code itself and throws a reasonable exception in that case. > > Andrew Onischuk wrote: > Just to be easier to understand here is what I mean: > > def get_mysql_pid_file(): > cmd = 'mysqladmin variables' > code, output = shell.checked_call(cmd) > > for line in output.splitlines(): > if (line.find("pid_file") != -1): > mysql_pid_file = line.split()[3].strip() > break > else: > raise Fail(format("Output of '{cmd}' didn't contain pid_file. > Output: {output}")) > return mysql_pid_file > > Andrew Onischuk wrote: > Juanjo sorry for bothering you so much with this issues. I just feel like > this is something which we should fix to be able debug problems in this place > more simply in future. > > Juanjo Marron wrote: > No problem Andrew! I totally understand the process and it does not > bother me at all. These reviews the best way to find a good solution. > > Correct me if I'm wrong, but I dont think this last solution is not > working since status action will fail to get the mysqld file. > Iterating line by line the mysqladmin variables output, which looks like > following: > > > +-----------------------------------------+-------------------------------------------------------------------------------------------+ > | Variable_name | Value > | > > +-----------------------------------------+-------------------------------------------------------------------------------------------+ > | auto_increment_increment | 1 > | > | auto_increment_offset | 1 > | > | autocommit | ON > | > | automatic_sp_privileges | ON > | > | back_log | 50 > | > | basedir | /usr/ > | > | big_tables | OFF > | > | binlog_cache_size | 32768 > | > > ..... > > | old_passwords | OFF > | > | open_files_limit | 1024 > | > | optimizer_prune_level | 1 > | > | optimizer_search_depth | 62 > | > | optimizer_switch | > index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on > | > | pid_file | /var/run/mysqld/mysqld.pid > | > | plugin_dir | /usr/lib64/mysql/plugin > | > | port | 3306 > | > | preload_buffer_size | 32768 > | > | profiling | OFF > | > > ..... > > if the first line does not include pid_file, else will run and hence > raise the Fail exception. > > Additionally, even if we solved this last issue with this proposal: > > def get_mysql_pid_file(): > mysql_pid_file = None > cmd = 'mysqladmin variables' > code, output =shell.checked_call(cmd) > for line in output.splitlines(): > if (line.find("pid_file") != -1): > mysql_pid_file = line.split()[3].strip() > break > if mysql_pid_file is None: > raise Fail(format("Output of '{cmd}' didn't contain pid_file. Output: > {output}")) > return mysql_pid_file > > we will be breaking the restart logic (I tested it). > > When mysql is stopped, mysqladmin variables will return error code (!=0) > and the following output: > > Output of 'mysqladmin variables' didn't contain pid_file. Output: > mysqladmin: connect to server at 'localhost' failed > error: 'Can't connect to local MySQL server through socket > '/var/lib/mysql/mysql.sock' (2)' > Check that mysqld is running and that the socket: > '/var/lib/mysql/mysql.sock' exists! > > And it shouldnt be considered as an error, just is a stopped status. > After that, status action should allow the start to run and complete the > restart action. > Now we are breaking it. Status when stopped is throwing exception on > shell.checked_call() and the start never runs, > > Start and Stop actions when run individually are still working fine they > are not aware of the pid file, just status checks it. > > Fot this reason, I still consider we shouldnt throw an exception if > mysqladmin variables returns error code. So using shell.call() should be > enough. > And just in the case when mysqladmin variables runs properly (code ==0), > search for pid_file and if not found that's when the Fail excpetion needs to > be raised. > > Also if mysqladmin variables fails mysql_pid_file = None still needs to > be retruned, because when checking the status and the service is stopped, the > check_process_status() will be able to read a None and return > ComponentNonRunning exception, which is the expected behavour. > > Said all that, this is my new proposal: > > def get_mysql_pid_file(): > mysql_pid_file = None > cmd = 'mysqladmin variables' > code, output = shell.call(cmd) > if code == 0: > for line in output.splitlines(): > if (line.find("pid_file") != -1): > mysql_pid_file = line.split()[3].strip() > break > if mysql_pid_file is None: > raise Fail("Output of '{0}' didn't contain pid_file. Output: > '{1}'").format(cmd, output) > return mysql_pid_file > > > I tested start, stop, and restart and seems to be working fine. > > Please, let me know what do you think and I will upload the patch if your > feedback is positive
Thanks for a detailed response Juanjo! 1)"Status when stopped is throwing exception on shell.checked_call() and the start never runs." I didn't realize that we run status check during start command. Can you point to that code? 2) Also another issue I see is that you removed only_if and not_if from start and stop. Which probably makes mysql start/stop indeponent. Fyi we have all the start stop commands idenpotent. Which means even if the component is stopped stop command should succeed the same for start. I think it would be much faster if we solve this in skype. Please add me: empower-666. - Andrew ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/48766/#review138115 ----------------------------------------------------------- On June 16, 2016, 9:20 p.m., Juanjo Marron wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/48766/ > ----------------------------------------------------------- > > (Updated June 16, 2016, 9:20 p.m.) > > > Review request for Ambari, Alejandro Fernandez, Andrew Onischuk, and Dmitro > Lisnichenko. > > > Bugs: AMBARI-13238 > https://issues.apache.org/jira/browse/AMBARI-13238 > > > Repository: ambari > > > Description > ------- > > The MySQL service status in mysql_service.py simply checks for a process with > name mysqld. In our environment, a different service ran another MySQL > instance on that node and as a result, the status of the MySQL service in > Hive showed green (because it could find a mysqld process) even though the > instance used by Hive wasn't started. That also made the "Start Service" > action for Hive fail, because the metastore service couldn't connect to the > MySQL database. > > The proposed fix makes the service check more robust by retrieving the > pid_file of the MySQL instance first by running "mysqladmin variables" and > parsing out the pid_file. Then it checks if the process exists. > > A new pacth is proposed based on the reviews added to the original one > (https://reviews.apache.org/r/39146/) > > The patch relies now on the pid (similar to other status check). mysql > pid_file is not known a priori (it is not a service property similar to other > ambari components) and it depends on the OS. > The pid file location is obatined with "mysqladmin variables" command (which > seems recognized by all the Ambari supported OS). > The Hive MySQL server instance is started without passing in any parameters > so It uses the default configuration. Any other MySQL instance that could be > running on the machine would need to be started with explicit config. > Hence just running mysqladmin will return the values for the Hive MySQL > instance. > The pid dile name is parsed by a new method which has been created to obatin > it from the mysqladmin variables return. > The status relies now on check_process_status() wich is already an ambari > function. > Unit Test have been modified accordingly to avoid the pgrep verification that > could create conflicts with others mysql instances working on the same node. > > > Diffs > ----- > > > ambari-server/src/main/resources/common-services/HIVE/0.12.0.2.0/package/scripts/mysql_service.py > 7862774 > ambari-server/src/test/python/stacks/2.0.6/HIVE/test_mysql_server.py > 827f6f7 > > Diff: https://reviews.apache.org/r/48766/diff/ > > > Testing > ------- > > +1 overall. Here are the results of testing the latest attachment > http://issues.apache.org/jira/secure/attachment/12810584/AMBARI-13238.patch > against trunk revision . > +1 @author. The patch does not contain any @author tags. > +1 tests included. The patch appears to include 1 new or modified test files. > +1 javac. The applied patch does not increase the total number of javac > compiler warnings. > +1 release audit. The applied patch does not increase the total number of > release audit warnings. > +1 core tests. The patch passed unit tests in ambari-server. > Test results: > https://builds.apache.org/job/Ambari-trunk-test-patch/7367//testReport/ > Console output: > https://builds.apache.org/job/Ambari-trunk-test-patch/7367//console > This message is automatically generated. > > > Thanks, > > Juanjo Marron > >