[
https://issues.apache.org/jira/browse/TEZ-717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078435#comment-14078435
]
Jonathan Eagles commented on TEZ-717:
-------------------------------------
{quote}
Handling of the staging-directory
TezClient writes out several files (dag plan in non session mode, additional
tez configuration, local resources) to the staging-directory which are then
meant to be localized by YARN, and available to the DAGAppMaster on local-fs.
I'm assuming this is meant to run with the staging-directory configured to the
local FS - since there's no available HDFS cluster. The local-client itself
needs to ensure this.
{quote}
Is there a way to set the default file system programmatically? I've been
setting fs.defaultFS to file:/// in tez-site.xml for now, but would love to do
what you are suggesting.
{quote}
Handling localized resources
Files to be localized can be provided at application submission time, or during
DAG submission. It looks like the DAGSubmission case may be handled - since
that's handled by 're-localization' within Tez itself. The initial local
resources, however, are not handled. This could be addressed at a later point
though, with the assumption being that the client is started with relevant jars
in the classpath.
{quote}
Left this alone for now. If you don't think we need this in the first pass, I'm
tempted to leave it for later. Unless you have a good solution idea.
{quote}
Handling local-directories
local-directories are normally setup by the NodeManager. I believe the patch on
TEZ-707 expects this to be set. However, I don't see any local-dir setup in
this patch. Is that meant to be done here ? Ideally, this should be something
other than the staging-dir, or a dir within the staging directory.
{quote}
I've set the LOCAL_DIRS environment variable to address this. Something more
elaborate is needed. What are you thoughts on this?
{quote}
Handling credentials
The ApplicationSubmissionContext also has Credentials which are required to
access the app master. When running with YARN - these credentials will be
localized into a specific path and read back by UserGroupInformation itself.
Looks like this is something that will need to be passed in as explicit
parameters to the DAGAppmaster, since the YARN localization is not available.
This could be a separate DAGAppMaster constructor meant just for local mode. I
see some code to generate a sessionToken within DAGAppMaster - handling client
credentials correctly, should get rid of this.
{quote}
Read the amCredential byte stream out of the AMContainerSpec and populated the
current user's credentials. Remove the unnecessary hack to DAGAppMaster
bq. FrameworkClient creation - this could just accept a Configuration
instance or a boolean flag to indicate whether a local client is created or
not. I'd prefer avoiding System properties to determine this information.
Passed a configuration into the constructor and let the FrameworkClient decide
what type of framework is created. Removed the setting and reading of system
properties
bq. Similarly for TezClient local mode determination.
Likewise as above
bq. Don't see any usage of EnvironmentUpdateUtils - is that no longer
required. (local resource reading is now using getAbsolutePath - is that the
workaround? )
Good catch. Added the LOCAL_DIRS and LOG_DIRS back for now.
bq. getApplicationReport - a switch/case statement would be much easier to
read.
Yes. Much easier to read. Pulled it out into its own function to better isolate
the conversion
bq. LocalClient has everything as static - don't think any of the fields
need to be static.
At first, something like this wasn't necessary. Now that TezClient and
DAGClient both are creating Local FrameworkClients, they need to share the
dapAppMaster, clientHandler, and others. I'm ok with another approach such as a
global cache of objects. Thoughts?
bq. "System.setProperty("user.dir"" - not sure why this is required.
The works in tandem with absolute path since only copies of system properties
are changed. If the java resolves the path with absolute path it will return
the correct "working directory". Otherwise, if the underlying system resolves
the path, the working directory and application launch will be used. The maybe
a solution using FileSytem to do all reads, but that seems excessive.
bq. Determination of session mode can be done if a Configuration is made
available to the LocalClient via FrameworkClient, instead of parsing it out
from the launch options.
Much cleaner. Fixed.
bq. Mentioned earlier - but there's a bunch of new SessionToken creation
code in the DAGAppMaster which should not exist.
Cleaned up.
> Client changes to allow local mode DAG submission
> -------------------------------------------------
>
> Key: TEZ-717
> URL: https://issues.apache.org/jira/browse/TEZ-717
> Project: Apache Tez
> Issue Type: Sub-task
> Reporter: Siddharth Seth
> Assignee: Jonathan Eagles
> Priority: Blocker
> Attachments: TEZ-717-v10.patch, TEZ-717-v12.patch, TEZ-717-v6.patch,
> TEZ-717-v8.patch, TEZ-717-v9.patch, TEZ-717.patch, TEZ-717.patch
>
>
--
This message was sent by Atlassian JIRA
(v6.2#6252)