[jira] [Commented] (YARN-1036) Distributed Cache gives inconsistent result if cache files get deleted from task tracker

2013-08-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739540#comment-13739540
 ] 

Hudson commented on YARN-1036:
--

FAILURE: Integrated in Hadoop-Hdfs-0.23-Build #699 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/699/])
YARN-1036. Distributed Cache gives inconsistent result if cache files get 
deleted from task tracker. Contributed by Mayank Bansal and Ravi Prakash 
(jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1513636)
* /hadoop/common/branches/branch-0.23/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/LocalResourcesTrackerImpl.java
* /hadoop/common/branches/branch-0.23/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestLocalResourcesTrackerImpl.java


 Distributed Cache gives inconsistent result if cache files get deleted from 
 task tracker 
 -

 Key: YARN-1036
 URL: https://issues.apache.org/jira/browse/YARN-1036
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 0.23.9
Reporter: Ravi Prakash
Assignee: Ravi Prakash
 Fix For: 0.23.10

 Attachments: YARN-1036.branch-0.23.patch, 
 YARN-1036.branch-0.23.patch, YARN-1036.branch-0.23.patch


 This is a JIRA to backport MAPREDUCE-4342. I had to open a new JIRA because 
 that one had been closed. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1036) Distributed Cache gives inconsistent result if cache files get deleted from task tracker

2013-08-13 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13738337#comment-13738337
 ] 

Jason Lowe commented on YARN-1036:
--

Agree with Ravi that we should focus on porting the change to 0.23 and fix any 
issues that also apply to trunk/branch-2 in a separate JIRA.  Therefore I agree 
with Omkar that we should simply break or omit the LOCALIZED case from the 
switch statement since 0.23 doesn't have localCacheDirectoryManager to match 
the trunk behavior.  Otherwise patch looks good to me.
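
To make the "break on LOCALIZED, check on REQUEST" shape concrete, here is a minimal sketch. It is not the code from the patch: TrackerSwitchSketch, Resource, handle and runScenario are hypothetical names, and the real tracker forwards events to the resource rather than updating state inline as this toy does.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Simplified, hypothetical stand-in for the tracker's event handling.
// These are not the real NodeManager classes.
final class TrackerSwitchSketch {
  enum EventType { REQUEST, LOCALIZED }
  enum State { DOWNLOADING, LOCALIZED }

  static final class Resource {
    State state = State.DOWNLOADING;
    File localPath; // set when the download finishes
  }

  private final Map<String, Resource> localrsrc = new HashMap<>();

  // Only a LOCALIZED resource is checked against the disk; one that is
  // still downloading is treated as present.
  static boolean isResourcePresent(Resource rsrc) {
    return rsrc.state != State.LOCALIZED || rsrc.localPath.exists();
  }

  /** Handles one event; returns true if it started a new download. */
  boolean handle(EventType type, String req, File downloadedTo) {
    Resource rsrc = localrsrc.get(req);
    switch (type) {
      case LOCALIZED:
        // Download just finished: record it, no existence check.
        if (rsrc != null) {
          rsrc.state = State.LOCALIZED;
          rsrc.localPath = downloadedTo;
        }
        break;
      case REQUEST:
        if (rsrc != null && !isResourcePresent(rsrc)) {
          localrsrc.remove(req); // stale entry, localize again
          rsrc = null;
        }
        if (rsrc == null) {
          localrsrc.put(req, new Resource());
          return true;
        }
        break;
    }
    return false;
  }

  /** Walks R1 through request, download, deletion and re-request. */
  static String runScenario() {
    File file;
    try {
      file = File.createTempFile("r1", ".tmp");
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    TrackerSwitchSketch t = new TrackerSwitchSketch();
    StringBuilder out = new StringBuilder();
    out.append(t.handle(EventType.REQUEST, "r1", null)).append(' ');   // new download
    out.append(t.handle(EventType.LOCALIZED, "r1", file)).append(' '); // finished
    out.append(t.handle(EventType.REQUEST, "r1", null)).append(' ');   // cache hit
    file.delete();                                                     // file lost
    out.append(t.handle(EventType.REQUEST, "r1", null));               // re-download
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.println(runScenario());
  }
}
```

In this sketch the existence check runs on every REQUEST, including requests that arrive after the resource is LOCALIZED, so a file lost to a bad disk is still noticed the next time a container asks for it.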




[jira] [Commented] (YARN-1036) Distributed Cache gives inconsistent result if cache files get deleted from task tracker

2013-08-13 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13738541#comment-13738541
 ] 

Omkar Vinit Joshi commented on YARN-1036:
-

+1. Thanks for updating the patch. LGTM.




[jira] [Commented] (YARN-1036) Distributed Cache gives inconsistent result if cache files get deleted from task tracker

2013-08-13 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13738754#comment-13738754
 ] 

Jason Lowe commented on YARN-1036:
--

+1 lgtm as well.  Committing this.




[jira] [Commented] (YARN-1036) Distributed Cache gives inconsistent result if cache files get deleted from task tracker

2013-08-08 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13733639#comment-13733639
 ] 

Ravi Prakash commented on YARN-1036:


Hi Omkar!
Thanks a lot for pointing out the problem in the earlier patch. 

Regarding the changes you are proposing, I meant for this JIRA to simply be a 
backport of MAPREDUCE-4342. I wasn't able to re-open that JIRA because it has 
already been closed (hence I had to file this new JIRA).

If you have spotted a problem with the current patch, I would welcome your 
suggested changes. However, if you have an issue with the approach, I would 
ask you to please pursue it in a separate JIRA, as it lies outside the 
scope of a simple backport. Most of this code is already in trunk as is.

Please let me know if this is acceptable to you.




[jira] [Commented] (YARN-1036) Distributed Cache gives inconsistent result if cache files get deleted from task tracker

2013-08-07 Thread Koji Noguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731945#comment-13731945
 ] 

Koji Noguchi commented on YARN-1036:


When I first read MAPREDUCE-4342, I didn't understand why the files would be 
deleted in the first place. That changed recently, when we started seeing an 
inconsistent distributed cache after we enabled 
Disk-Fail-In-Place (MAPREDUCE-2413).

The NodeManager/TaskTracker keeps running even after finding bad disks, but it 
does not throw away the distributed cache entries that came from those disks. 
The result is an incomplete set of distributed cache files available to our users.




[jira] [Commented] (YARN-1036) Distributed Cache gives inconsistent result if cache files get deleted from task tracker

2013-08-07 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13732125#comment-13732125
 ] 

Ravi Prakash commented on YARN-1036:


Thanks for the review, Omkar and Koji! Are you satisfied with Koji's 
explanation, Omkar?




[jira] [Commented] (YARN-1036) Distributed Cache gives inconsistent result if cache files get deleted from task tracker

2013-08-07 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13732214#comment-13732214
 ] 

Omkar Vinit Joshi commented on YARN-1036:
-

[~knoguchi] We also get LOCALIZED when a resource is localized by the 
public/private localizer (while the resource is in the DOWNLOADING state), so 
this event is fired only when the file has been successfully localized. We 
probably need to delete the file only if it is already localized. Thoughts? 
isResourcePresent checks for an already LOCALIZED file, right?




[jira] [Commented] (YARN-1036) Distributed Cache gives inconsistent result if cache files get deleted from task tracker

2013-08-07 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13732637#comment-13732637
 ] 

Ravi Prakash commented on YARN-1036:


Omkar! We don't want to delete a file. We want it to be localized again if it 
is detected to no longer exist, which is what this patch does. This is also 
what Koji is saying.

I have already tested that the patch does the right thing on my single-node 
cluster. My test methodology was this:
1. Configure 2 directories for the NM: one on the node's hard disk, and another 
on a pen drive.
2. Run a sleep job with the -files option specifying a file.
3. Make sure the file is localized on the pen drive. (If it isn't, run another 
sleep job with a different file to be put in the distcache.)
4. Unplug the pen drive (to simulate a bad disk).

Before the patch, running a sleep job requesting the same old file in the 
distcache didn't localize the file again. So if the job had required that file, 
it would have failed.
After the patch, it detects that the file which was already localized is 
missing, and so it localizes it again. This is the behavior we want. Do you 
agree?
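
The manual test above can be approximated without a cluster. The following is only a toy model, assuming a cache that maps a file name to its localized path: BadDiskSimulation, request and fileUsableAfterDiskLoss are made-up names, two temp directories stand in for the NM local dirs, and deleting one of them stands in for unplugging the pen drive.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

// Toy model of the manual test: a cache that remembers where each file was
// localized, with and without an existence check on lookup.
final class BadDiskSimulation {
  private final Map<String, Path> cache = new HashMap<>();
  private final boolean verifyOnRequest; // false models the pre-patch behaviour

  BadDiskSimulation(boolean verifyOnRequest) {
    this.verifyOnRequest = verifyOnRequest;
  }

  /** Returns the local copy for name, localizing it into dir if needed. */
  Path request(String name, Path dir) {
    Path cached = cache.get(name);
    if (cached != null && verifyOnRequest && !Files.exists(cached)) {
      cache.remove(name); // stale entry: the file is gone from disk
      cached = null;
    }
    if (cached == null) {
      try {
        cached = Files.write(dir.resolve(name), new byte[] {1});
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
      cache.put(name, cached);
    }
    return cached;
  }

  /** Localizes on the "pen drive", removes it, then requests the file again. */
  static boolean fileUsableAfterDiskLoss(boolean verifyOnRequest) {
    try {
      Path penDrive = Files.createTempDirectory("pendrive");
      Path hardDisk = Files.createTempDirectory("harddisk");
      BadDiskSimulation nm = new BadDiskSimulation(verifyOnRequest);
      Path first = nm.request("job.files", penDrive);
      Files.delete(first);
      Files.delete(penDrive); // "unplug" the pen drive
      Path second = nm.request("job.files", hardDisk);
      boolean usable = Files.exists(second);
      Files.deleteIfExists(second);
      Files.deleteIfExists(hardDisk);
      return usable;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("without check: usable=" + fileUsableAfterDiskLoss(false));
    System.out.println("with check:    usable=" + fileUsableAfterDiskLoss(true));
  }
}
```

Without the check the second request hands back the stale path on the removed directory; with the check the entry is dropped and the file is written again to the healthy directory.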




[jira] [Commented] (YARN-1036) Distributed Cache gives inconsistent result if cache files get deleted from task tracker

2013-08-07 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13732653#comment-13732653
 ] 

Omkar Vinit Joshi commented on YARN-1036:
-

[~raviprak] Let me try to explain again.
* Resource R1 is not in the cache, so on (REQUEST) a new resource is created in 
the (DOWNLOADING) state and the download starts.
* Once the download is finished, we get (LOCALIZED). (Here we don't want to 
check whether the file is present on disc or not, do we? If we do, then we may 
have to update trunk as well, since I separated the logic in one of the JIRAs.) 
We change the resource state to (LOCALIZED) and notify all waiting 
containers. [The file is present on disc.]
* Now the file gets deleted.
* Another container requests the file (REQUEST). Now we should definitely 
check whether the file is still present on disc. If not, remove it from the 
local cache (the data structure; by "delete the file" I meant from the local 
cache, not from disc, as it is already gone from there).
Let me know your thoughts.




[jira] [Commented] (YARN-1036) Distributed Cache gives inconsistent result if cache files get deleted from task tracker

2013-08-07 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13732731#comment-13732731
 ] 

Ravi Prakash commented on YARN-1036:


[~ojoshi] 
bq. (here we don't want to check whether file is present on disc or not.. do we 
have to? 
I agree that the very first time the file is localized, checking for the 
existence of the file all over again is redundant. The file could go missing 
right after checking, and we would be none the wiser. However in 0.23, I am 
willing to check again. That's why I am not proposing the change to trunk. 
There are a lot of changes between trunk/branch-2 and branch-0.23, and we are 
only cherry-picking those necessary for maintaining the line. That is also why 
we are not backporting all the cool stuff you have done, e.g. in YARN-539 and 
YARN-467.

bq. now we should definitely check if the file is already present on disc or 
not. If not remove it from local cache (DataStructure-- by delete the file I 
wanted to say from local cache not from disc as it is already gone from there).
Isn't that exactly what the new code in the patch is doing?
{code}
  if (rsrc != null && (!isResourcePresent(rsrc))) {
    LOG.info("Resource " + rsrc.getLocalPath()
        + " is missing, localizing it again");
    localrsrc.remove(req);
    rsrc = null;
  }
{code}




[jira] [Commented] (YARN-1036) Distributed Cache gives inconsistent result if cache files get deleted from task tracker

2013-08-07 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13732738#comment-13732738
 ] 

Ravi Prakash commented on YARN-1036:


In the way you suggest 
{quote}
{code}
case LOCALIZED: break;
case REQUEST:
+  if (rsrc != null && (!isResourcePresent(rsrc))) {
+    LOG.info("Resource " + rsrc.getLocalPath()
+        + " is missing, localizing it again");
+    localrsrc.remove(req);
+    rsrc = null;
+  }
.
{code}
{quote}

Once R1 had been localized, and the disk went bad, we would never again check 
if it still exists. That's entirely the problem we are trying to solve here.




[jira] [Commented] (YARN-1036) Distributed Cache gives inconsistent result if cache files get deleted from task tracker

2013-08-07 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13732743#comment-13732743
 ] 

Omkar Vinit Joshi commented on YARN-1036:
-

[~raviprak] No, it is actually not doing it the way it has to.
Consider ResourceState DOWNLOADING and a LOCALIZED event. 
Do you think there is any point in executing the code below? See 
*if (rsrc.getState() == ResourceState.LOCALIZED) {*
{code}
  if (rsrc != null && (!isResourcePresent(rsrc))) {
    LOG.info("Resource " + rsrc.getLocalPath()
        + " is missing, localizing it again");
    localrsrc.remove(req);
    decrementFileCountForLocalCacheDirectory(req, rsrc);
    rsrc = null;
  }
  if (null == rsrc) {
    rsrc = new LocalizedResource(req, dispatcher);
    localrsrc.put(req, rsrc);
  }

  public boolean isResourcePresent(LocalizedResource rsrc) {
    boolean ret = true;
    if (rsrc.getState() == ResourceState.LOCALIZED) {
      File file = new File(rsrc.getLocalPath().toUri().getRawPath().
          toString());
      if (!file.exists()) {
        ret = false;
      }
    }
    return ret;
  }
{code}
Anyway, let's see what others have to say, but clearly by mixing these we may 
end up seeing some random race condition issues.




[jira] [Commented] (YARN-1036) Distributed Cache gives inconsistent result if cache files get deleted from task tracker

2013-08-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13732824#comment-13732824
 ] 

Hadoop QA commented on YARN-1036:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12596727/YARN-1036.branch-0.23.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1671//console

This message is automatically generated.




[jira] [Commented] (YARN-1036) Distributed Cache gives inconsistent result if cache files get deleted from task tracker

2013-08-07 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13732887#comment-13732887
 ] 

Omkar Vinit Joshi commented on YARN-1036:
-

Thanks for updating it.

A few more comments.
bq. rsrc == null (I can't fathom why it would be), but still. 
I don't think it will ever be null in the LOCALIZED case. Let's isolate them.

bq. +  } //No break here. Falling through.
Let's put a break; here.

Regarding the test code:

bq. +  LocalizedResource lr1 = createLocalizedResource(req1, dispatcher);

Why are we creating a LocalizedResource? We probably don't need to create one 
and control it explicitly. We can monitor the dispatch queue. Thoughts?

bq. +  "file:///tmp/r1"), 1);
Let's not have hard-coded paths. This will fail on Windows (as you are also 
creating a file). Also, try to create the file in the current working 
directory instead.

You can probably take a look at 
TestLocalResourcesTrackerImpl#testHierarchicalLocalCacheDirectories (trunk).
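
As an illustration of the hard-coded path point, a test can build its file and URI from a base directory rather than from a literal. This is a sketch only: PortableTestPaths and its methods are hypothetical helpers, not code from TestLocalResourcesTrackerImpl, and the example passes the JVM temp directory as the base where the working directory could be passed instead.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical test helper; not taken from TestLocalResourcesTrackerImpl.
final class PortableTestPaths {
  /** Creates an empty file called name in a fresh directory under baseDir. */
  static Path createResourceFile(Path baseDir, String name) {
    try {
      Path dir = Files.createTempDirectory(baseDir, "localizer-test");
      return Files.createFile(dir.resolve(name));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Platform-correct file: URI, in place of a literal such as file:///tmp/r1. */
  static String toUriString(Path file) {
    return file.toUri().toString();
  }

  /** Removes the file and the directory that was created for it. */
  static void cleanup(Path file) {
    try {
      Files.deleteIfExists(file);
      Files.deleteIfExists(file.getParent());
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // Uses the JVM temp dir; pass Paths.get("").toAbsolutePath() for the cwd.
    Path base = Paths.get(System.getProperty("java.io.tmpdir"));
    Path r1 = createResourceFile(base, "r1");
    System.out.println(toUriString(r1));
    cleanup(r1);
  }
}
```

Because the path comes from the platform's own Path handling, the same test would produce a valid location on Windows as well, and the cleanup step keeps repeated runs from colliding.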




[jira] [Commented] (YARN-1036) Distributed Cache gives inconsistent result if cache files get deleted from task tracker

2013-08-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731470#comment-13731470
 ] 

Hadoop QA commented on YARN-1036:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12596459/YARN-1036.branch-0.23.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1664//console

This message is automatically generated.




[jira] [Commented] (YARN-1036) Distributed Cache gives inconsistent result if cache files get deleted from task tracker

2013-08-06 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731473#comment-13731473
 ] 

Omkar Vinit Joshi commented on YARN-1036:
-

Thanks [~raviprak].
We probably need to isolate the logic for the LOCALIZED and REQUEST scenarios. 
Thoughts?
{code}
+  if (rsrc != null && (!isResourcePresent(rsrc))) {
+    LOG.info("Resource " + rsrc.getLocalPath()
+        + " is missing, localizing it again");
+    localrsrc.remove(req);
+    rsrc = null;
+  }
{code}
This code does not need to be executed when a resource is getting LOCALIZED; 
in trunk we have isolated the two cases. Since branch 0.23 doesn't have 
anything like localCacheDirectoryManager, it probably makes sense to just keep 
a break and do nothing in the LOCALIZED case?
{code}
case LOCALIZED: break;
case REQUEST:
+  if (rsrc != null && (!isResourcePresent(rsrc))) {
+    LOG.info("Resource " + rsrc.getLocalPath()
+        + " is missing, localizing it again");
+    localrsrc.remove(req);
+    rsrc = null;
+  }
.
{code}
I didn't review the test code.

