[ https://issues.apache.org/jira/browse/TEZ-1106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011852#comment-14011852 ]

Mohammad Kamrul Islam commented on TEZ-1106:
--------------------------------------------

{quote}
In this case, I think it might be useful as we should be trying to resolve the path only once.
{quote}
I think we can ensure it either by adding a new property to the Conf or by updating the existing Conf property for the staging dir, as you originally hinted.

{quote}
Understood. However, why does the user or any of the framework code need to access basePath()? Shouldn't all public access be to the system path dir, i.e. the one with .tez/appId/ ?
{quote}
Currently, client job submission checks whether the base directory exists. The only usage I can see is in TezClientUtils:createApplicationSubmissionContext():
{noformat}
FileSystem fs = TezClientUtils.ensureStagingDirExists(conf, amConfig.getStagingDir());
{noformat}

{quote}
My assumption here is that once the AM is running, there should never be any situation where the staging dir does not exist. If it has disappeared, it means something went wrong or someone else manually deleted it. Would I be correct in the above assumption? If yes, this implies that the code in the AM should never be silently creating the dir. Likewise, are there any points in the client where the expectation is that the dir should already exist?
{quote}
Yes, we should create it only once, the first time. How can we make sure that a given creation is the first one?

Option 1: Create yet another new property in the conf and store the path there after the first calculation.
Option 2: Write a special method to create and return the system stage dir. It should be called once, during job submission. In all other cases there is no default creation: if the dir does not exist, create it and LOG.warn a message.

Preference?

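To make Option 2 concrete, here is a rough sketch of how the two code paths could be split. It is only an illustration, not the patch: the class and method names are hypothetical, java.nio.file stands in for Hadoop's FileSystem so that the snippet is self-contained, and the .tez/<appId> layout is assumed from the discussion above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.logging.Logger;

/** Hypothetical helper, not part of Tez. java.nio.file stands in for Hadoop's FileSystem. */
final class StagingDirSketch {
    private static final Logger LOG = Logger.getLogger(StagingDirSketch.class.getName());

    private StagingDirSketch() {}

    /** Pure path computation (assumed layout: base/.tez/appId). Never touches the file system. */
    static Path systemStagingDir(Path baseStagingDir, String appId) {
        return baseStagingDir.resolve(".tez").resolve(appId);
    }

    /** Intended to be called once, by the client during job submission: creates the dir. */
    static Path createSystemStagingDir(Path baseStagingDir, String appId) throws IOException {
        Path dir = systemStagingDir(baseStagingDir, appId);
        Files.createDirectories(dir); // no error if the directory already exists
        return dir;
    }

    /** All other callers: the dir is expected to exist; if it is missing, warn and recreate it. */
    static Path getSystemStagingDir(Path baseStagingDir, String appId) throws IOException {
        Path dir = systemStagingDir(baseStagingDir, appId);
        if (!Files.isDirectory(dir)) {
            LOG.warning("Staging dir " + dir + " is missing; recreating it");
            Files.createDirectories(dir);
        }
        return dir;
    }
}
```

With this split, the AM and any later client code would only call the getter, so a missing dir shows up in the logs instead of being recreated silently. Whether the getter should recreate the dir or fail outright is the open question above.
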
> Tez framework should use a unique subdir when creating new files in staging
> ---------------------------------------------------------------------------
>
>                 Key: TEZ-1106
>                 URL: https://issues.apache.org/jira/browse/TEZ-1106
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Mohammad Kamrul Islam
>            Assignee: Mohammad Kamrul Islam
>         Attachments: TEZ-1106.1.patch, TEZ-1106.2.patch, TEZ-1106.3.patch, TEZ-1106.4.patch
>
>
> Currently the files are created in different sub-directories. It is hard to manage and cleanup at the end.
> The proposal is to create a new subdir : $STAGE_DIR/<APP_ID>/
> All recovery files will go under : $STAGE_DIR/<APP_ID>/recovery/<attemp_num>/
> All confs will go under: $STAGE_DIR/<APP_ID>/conf/
> All dagplans will go: $STAGE_DIR/<APP_ID>/dag_id/plan/

--
This message was sent by Atlassian JIRA
(v6.2#6252)