[jira] [Comment Edited] (YARN-10427) Duplicate Job IDs in SLS output
[ https://issues.apache.org/jira/browse/YARN-10427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17253589#comment-17253589 ] Szilard Nemeth edited comment on YARN-10427 at 12/22/20, 4:29 PM: -- Hi [~werd.up], Thanks for reporting this issue, and congratulations on your first reported Hadoop YARN jira.
{quote}In the process of attempting to verify and validate the SLS output, I've encountered a number of issues including runtime exceptions and bad output.{quote}
I read through your observations and spent some time playing around with SLS. If you encountered other issues, please file additional jiras when you have time. Since running SLS involves some repetitive tasks (uploading configs to the remote machine, launching SLS, saving the resulting logs, and so on), I created some scripts in my public [Github repo here|https://github.com/szilard-nemeth/linux-env/tree/ff84652b34bc23c1f88766f781f6648365becde5/workplace-specific/cloudera/investigations/YARN-10427]. Let me summarize what the scripts do:
1. [config dir|https://github.com/szilard-nemeth/linux-env/tree/ff84652b34bc23c1f88766f781f6648365becde5/workplace-specific/cloudera/investigations/YARN-10427/config]: This is the exact same configuration file set that you attached to this jira, with the exception of the log4j.properties file, which turns on DEBUG logging for SLS.
2. [upstream-patches dir|https://github.com/szilard-nemeth/linux-env/tree/ff84652b34bc23c1f88766f781f6648365becde5/workplace-specific/cloudera/investigations/YARN-10427/upstream-patches]: This directory contains the logging patch that helped me see the issues more clearly. My code changes are also pushed to my [Hadoop fork|https://github.com/szilard-nemeth/hadoop/tree/YARN-10427-investigation].
3. 
[scripts dir|https://github.com/szilard-nemeth/linux-env/tree/ff84652b34bc23c1f88766f781f6648365becde5/workplace-specific/cloudera/investigations/YARN-10427/scripts]: This directory contains all my scripts to build Hadoop, launch SLS, and save the produced logs to the local machine. As I have been working on a remote cluster, there's a script called [setup-vars-upstream.sh|https://github.com/szilard-nemeth/linux-env/blob/ff84652b34bc23c1f88766f781f6648365becde5/workplace-specific/cloudera/investigations/YARN-10427/scripts/setup-vars-upstream.sh] that contains configuration values for the remote cluster plus some local directory paths. If you want to use the scripts, all you need to do is replace the values in this file according to your environment.
3.1 [build-and-launch-sls.sh|https://github.com/szilard-nemeth/linux-env/blob/ff84652b34bc23c1f88766f781f6648365becde5/workplace-specific/cloudera/investigations/YARN-10427/scripts/build-and-launch-sls.sh]: This script builds Hadoop according to the environment variables and launches the SLS suite on the remote cluster.
3.2 [start-sls.sh|https://github.com/szilard-nemeth/linux-env/blob/ff84652b34bc23c1f88766f781f6648365becde5/workplace-specific/cloudera/investigations/YARN-10427/scripts/start-sls.sh]: This is the most important script, as it is the one executed on the remote machine.
I think the script itself is straightforward enough, but let me briefly list what it does:
- Assumes that the Hadoop dist package has been copied to the remote machine (this was done by [build-and-launch-sls.sh|https://github.com/szilard-nemeth/linux-env/blob/ff84652b34bc23c1f88766f781f6648365becde5/workplace-specific/cloudera/investigations/YARN-10427/scripts/build-and-launch-sls.sh])
- Cleans up all Hadoop-related directories and extracts the Hadoop dist tar.gz
- Copies the config to Hadoop's config dirs so SLS uses these particular configs
- Launches SLS by starting slsrun.sh with the appropriate CLI switches
- Greps for some useful data in the resulting SLS log file.
3.3 [launch-sls.sh|https://github.com/szilard-nemeth/linux-env/blob/ff84652b34bc23c1f88766f781f6648365becde5/workplace-specific/cloudera/investigations/YARN-10427/scripts/launch-sls.sh]: This script is executed by [build-and-launch-sls.sh|https://github.com/szilard-nemeth/linux-env/blob/ff84652b34bc23c1f88766f781f6648365becde5/workplace-specific/cloudera/investigations/YARN-10427/scripts/build-and-launch-sls.sh] as its last step. Once start-sls.sh has finished, the [save-latest-sls-logs.sh|https://github.com/szilard-nemeth/linux-env/blob/ff84652b34bc23c1f88766f781f6648365becde5/workplace-specific/cloudera/investigations/YARN-10427/scripts/save-latest-sls-logs.sh] script is started. As the name implies, it saves the latest SLS log dir and SCPs it to the local machine. The target directory on the local machine is determined by the config
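For illustration, the start-sls.sh steps listed above might look roughly like the following dry-run sketch. Every command is echoed rather than executed; HADOOP_HOME, the tarball name, the config source dir, and the trace file name are assumptions, not the script's real values (the slsrun.sh option names are taken from the Hadoop SLS documentation):

```shell
#!/usr/bin/env bash
# Dry-run sketch of the start-sls.sh workflow described above.
# All paths and file names below are illustrative assumptions.
set -euo pipefail

HADOOP_HOME="${HADOOP_HOME:-/opt/hadoop}"   # assumed install location
SLS_TRACE="sls-jobs.json"                   # assumed SLS trace file name

run() { echo "+ $*"; }  # print the command instead of running it

run rm -rf "$HADOOP_HOME"                                    # clean up old dirs
run tar -xzf hadoop-dist.tar.gz -C /opt                      # extract the dist
run cp config/* "$HADOOP_HOME/etc/hadoop/"                   # install the configs
run "$HADOOP_HOME/share/hadoop/tools/sls/bin/slsrun.sh" \
    --tracetype=SLS --tracelocation="$SLS_TRACE" --output-dir=sls-run-output
run grep -E 'application_[0-9]+_[0-9]+' sls-run-output/sls.log  # mine job IDs
```

Replacing the `run` wrapper with direct execution would turn the sketch into an actual launcher; keeping it as a dry run makes it easy to eyeball the command sequence first.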
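The save-and-copy step performed by save-latest-sls-logs.sh could be sketched like this (again a dry run; the host name and directory layout are placeholders, not the script's real values):

```shell
#!/usr/bin/env bash
# Dry-run sketch: pick the newest SLS log dir on the remote host and SCP it
# back to the local machine. REMOTE_HOST, REMOTE_LOG_ROOT, and
# LOCAL_TARGET_DIR are placeholders for the values set in the config file.
set -euo pipefail

REMOTE_HOST="${REMOTE_HOST:-sls-test-host}"
REMOTE_LOG_ROOT="${REMOTE_LOG_ROOT:-/tmp/sls-logs}"
LOCAL_TARGET_DIR="${LOCAL_TARGET_DIR:-$HOME/sls-logs}"

run() { echo "+ $*"; }  # print the command instead of running it

# 'ls -1td .../*/ | head -n1' selects the most recently modified log dir.
run ssh "$REMOTE_HOST" "ls -1td $REMOTE_LOG_ROOT/*/ | head -n1"
run scp -r "$REMOTE_HOST:$REMOTE_LOG_ROOT/<latest-dir>" "$LOCAL_TARGET_DIR/"
```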