[ 
https://issues.apache.org/jira/browse/TEZ-1106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011852#comment-14011852
 ] 

Mohammad Kamrul Islam commented on TEZ-1106:
--------------------------------------------

{quote}
In this case, I think it might be useful as we should be trying to resolve the 
path only once.
{quote}

I think we can ensure this either by adding a new property to the Conf or by 
updating the existing Conf property for the staging dir, as you originally hinted.

{quote}
Understood. However, why does the user or any of the framework code need to 
access basePath()? Shouldn't all public access be to the system path dir i.e 
the one with .tez/appId/ ?
{quote}

Currently, client job submission checks whether the base directory exists. The 
only usage I can see is in:
TezClientUtils:createApplicationSubmissionContext()
{noformat}
 FileSystem fs = TezClientUtils.ensureStagingDirExists(conf,
        amConfig.getStagingDir());
{noformat}
{quote}
My assumption here is that that once the AM is running, there should never be 
any situation where the staging dir does not exist. If it has disappeared, it 
means something went wrong or someone else manually deleted it. Would I be 
correct in the above assumption? If yes, this implies that the code in the AM 
should never be silently creating the dir. Likewise, are there any points in 
the client where the expectation is that the dir should already exist?
{quote}

Yes. We should create it only once, the first time. How can we make sure that 
the creation is happening for the first time?

Option 1: Create yet another new property in the conf and store the path there 
after the first calculation.

Option 2: Write a special method to create and return the system staging dir. It 
should be called once, during job submission. In all other cases there is no 
default creation; if the dir doesn't exist, create it and LOG.warn a message.
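
A rough sketch of what Option 2 could look like. This is only an illustration: it uses java.nio.file instead of Hadoop's FileSystem so that it is self-contained, and the class and method names (StagingDirSketch, createAtSubmission, getExisting) are hypothetical, not existing Tez APIs. The .tez/<appId> layout is taken from the quoted discussion above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.logging.Logger;

// Hypothetical helper; names are illustrative and not part of the Tez API.
final class StagingDirSketch {
  private static final Logger LOG =
      Logger.getLogger(StagingDirSketch.class.getName());

  private StagingDirSketch() {}

  // Pure path computation: <baseDir>/.tez/<appId>. Never touches the file system.
  static Path systemStagingPath(Path baseDir, String appId) {
    return baseDir.resolve(".tez").resolve(appId);
  }

  // Meant to be called exactly once, by the client during job submission.
  static Path createAtSubmission(Path baseDir, String appId) throws IOException {
    return Files.createDirectories(systemStagingPath(baseDir, appId));
  }

  // Used everywhere else: the dir is expected to exist already. If it is
  // missing, recreate it but log a warning so the anomaly is visible.
  static Path getExisting(Path baseDir, String appId) throws IOException {
    Path dir = systemStagingPath(baseDir, appId);
    if (!Files.isDirectory(dir)) {
      LOG.warning("System staging dir " + dir + " is missing; recreating it");
      Files.createDirectories(dir);
    }
    return dir;
  }
}
```

With this split, only the submission path creates the directory silently; any other caller that finds it missing leaves a warning in the log.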

Preference?
 





> Tez framework should use a unique subdir when creating new files in staging  
> -----------------------------------------------------------------------------
>
>                 Key: TEZ-1106
>                 URL: https://issues.apache.org/jira/browse/TEZ-1106
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Mohammad Kamrul Islam
>            Assignee: Mohammad Kamrul Islam
>         Attachments: TEZ-1106.1.patch, TEZ-1106.2.patch, TEZ-1106.3.patch, 
> TEZ-1106.4.patch
>
>
> Currently the files are created in different sub-directories. This is hard to 
> manage and clean up at the end.
> The proposal is to create a new subdir: $STAGE_DIR/<APP_ID>/
> All recovery files will go under: $STAGE_DIR/<APP_ID>/recovery/<attempt_num>/
> All confs will go under: $STAGE_DIR/<APP_ID>/conf/
> All dagplans will go under: $STAGE_DIR/<APP_ID>/dag_id/plan/



--
This message was sent by Atlassian JIRA
(v6.2#6252)
