date:20130127

[jira] [Updated] (MAPREDUCE-4049) plugin for generic shuffle service

2013-01-27 Thread Avner BenHanoch (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Avner BenHanoch updated MAPREDUCE-4049:
---

Release Note: 
Allow ReduceTask to load a third-party plugin for shuffle (and merge) instead 
of the default shuffle. A corresponding ShuffleProvider can already 
run in the NM as an AuxiliaryService.
Use the new config option mapreduce.job.reduce.shuffle.consumer.plugin.class: 
the name of the class whose instance will be used to send shuffle requests by 
the reduce tasks of this job. The class must implement 
org.apache.hadoop.mapred.ShuffleConsumerPlugin.


  was:Support Shuffle Consumer plugins from 3rd parties.
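
A minimal sketch of how a class-name option like the one in the release note can select a plugin at run time. It deliberately avoids the Hadoop API so that it runs on its own: ShuffleConsumerSketch, DefaultShuffleSketch, RdmaShuffleSketch and ShufflePluginLoader are hypothetical stand-ins, and java.util.Properties stands in for the job configuration. Only the property name is taken from the release note.

```java
import java.util.Properties;

// Hypothetical stand-in for org.apache.hadoop.mapred.ShuffleConsumerPlugin;
// the real interface has a different, richer contract.
interface ShuffleConsumerSketch {
    String name();
}

// Stand-in for the built-in shuffle.
class DefaultShuffleSketch implements ShuffleConsumerSketch {
    public String name() { return "default"; }
}

// Stand-in for a third-party shuffle implementation.
class RdmaShuffleSketch implements ShuffleConsumerSketch {
    public String name() { return "rdma"; }
}

final class ShufflePluginLoader {
    // Property name as given in the release note.
    static final String KEY = "mapreduce.job.reduce.shuffle.consumer.plugin.class";

    // Instantiates the configured class, or the default when the key is unset.
    static ShuffleConsumerSketch load(Properties conf) {
        String className = conf.getProperty(KEY, DefaultShuffleSketch.class.getName());
        try {
            // asSubclass rejects classes that do not implement the interface.
            Class<? extends ShuffleConsumerSketch> clazz =
                Class.forName(className).asSubclass(ShuffleConsumerSketch.class);
            return clazz.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalStateException("Cannot load shuffle plugin " + className, e);
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(KEY, RdmaShuffleSketch.class.getName());
        System.out.println(load(conf).name());
    }
}
```

In a real job the same key would be set in the job configuration, and the framework, not user code, would do the loading.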


 plugin for generic shuffle service
 --

 Key: MAPREDUCE-4049
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: performance, task, tasktracker
Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0
Reporter: Avner BenHanoch
Assignee: Avner BenHanoch
  Labels: merge, plugin, rdma, shuffle
 Fix For: 2.0.3-alpha

 Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Plugin Design.rtf, 
 MAPREDUCE-4049--branch-1.patch, mapreduce-4049.patch


 Support a generic shuffle service as a set of two plugins: ShuffleProvider & 
 ShuffleConsumer.
 This will satisfy the following needs:
 # Better shuffle and merge performance. For example, we are working on a 
 shuffle plugin that performs shuffle over RDMA in fast networks (10gE, 40gE, 
 or Infiniband) instead of using the current HTTP shuffle. Based on the fast 
 RDMA shuffle, the plugin can also use a suitable merge approach during 
 the intermediate merges, and hence get much better performance.
 # Satisfy MAPREDUCE-3060: a generic shuffle service that avoids the hidden 
 dependency of the NodeManager on a specific version of the mapreduce shuffle 
 (currently targeted to 0.24.0).
 References:
 # Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan Yu 
 from Auburn University with others, 
 [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf]
 # I am attaching 2 documents with a suggested top-level design for both plugins 
 (currently based on the 1.0 branch).
 # I am providing a link for downloading UDA, Mellanox's open source plugin 
 that implements a generic shuffle service using RDMA and levitated merge.  
 Note: At this phase, the code is in C++ through JNI and you should consider 
 it as beta only.  Still, it can serve anyone who wants to implement or 
 contribute to levitated merge. (Please be advised that levitated merge is 
 mostly suited to very fast networks.) - 
 [http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=144&menu_section=69]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service

2013-01-27 Thread Avner BenHanoch (JIRA)

[
https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563821#comment-13563821
]

Avner BenHanoch commented on MAPREDUCE-4049:

Alejandro, thanks for consulting me. I just added a note. Feel free to edit
if needed.



[jira] [Commented] (MAPREDUCE-4807) Allow MapOutputBuffer to be pluggable

2013-01-27 Thread Mariappan Asokan (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563823#comment-13563823
 ] 

Mariappan Asokan commented on MAPREDUCE-4807:
-

Hi Alejandro,
  I have another possible release note:

*NEW FEATURE*
*Allow external implementations of the sort phase in a Map task*

I will leave it to your choice.

Thanks.

-- Asokan


 Allow MapOutputBuffer to be pluggable
 -

 Key: MAPREDUCE-4807
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4807
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Affects Versions: 2.0.2-alpha
Reporter: Arun C Murthy
Assignee: Mariappan Asokan
 Fix For: 2.0.3-alpha

 Attachments: COMBO-mapreduce-4809-4807.patch, 
 COMBO-mapreduce-4809-4807.patch, COMBO-mapreduce-4809-4807.patch, 
 mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, 
 mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch


 Allow MapOutputBuffer to be pluggable


[jira] [Commented] (MAPREDUCE-1347) Missing synchronization in MultipleOutputFormat

2013-01-27 Thread Harsh J (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563853#comment-13563853
 ] 

Harsh J commented on MAPREDUCE-1347:


Since this patch makes a safe change with a properly working test (that fails 
without the fix, exposing the issue cleanly), and has been awaiting further 
review for a long time (it has already addressed previous comments), I'll go ahead and 
commit it by EOD tomorrow unless anyone has objections. Thanks! :)

 Missing synchronization in MultipleOutputFormat
 ---

 Key: MAPREDUCE-1347
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1347
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client
Affects Versions: 1.0.0
Reporter: Todd Lipcon
Assignee: Harsh J
  Labels: concurrency
 Attachments: MAPREDUCE-1347.r10.diff, mapreduce.1347.r1.diff, 
 MAPREDUCE-1347.r2.diff, MAPREDUCE-1347.r3.diff, MAPREDUCE-1347.r4.diff, 
 MAPREDUCE-1347.r5.diff, MAPREDUCE-1347.r6.diff, MAPREDUCE-1347.r7.diff, 
 MAPREDUCE-1347.r8.diff, MAPREDUCE-1347.r9.diff


 MultipleOutputFormat's RecordWriter implementation doesn't use 
 synchronization when accessing the recordWriters member. When using 
 multithreaded mappers or reducers, this can result in problems where two 
 threads will both try to create the same file, causing 
 AlreadyBeingCreatedException. Doing this more fine-grained than just 
 synchronizing the whole method is probably a good idea, so that multithreaded 
 mappers can actually achieve parallelism writing into separate output streams.
 From what I can tell, the new API's MultipleOutputs seems not to have this 
 issue.
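
 A minimal sketch of the fine-grained approach suggested above, assuming the writers are cached per output name. WriterCacheSketch is a hypothetical stand-in, not the MultipleOutputFormat code or the attached patch, and a StringBuilder stands in for a RecordWriter. ConcurrentHashMap.computeIfAbsent applies the factory at most once per key, so two threads asking for the same name cannot both create the file, while threads writing to different names need not wait on a single method-wide lock.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

// Caches one "writer" per output name; stand-in for the recordWriters member.
final class WriterCacheSketch {
    private final ConcurrentHashMap<String, StringBuilder> writers = new ConcurrentHashMap<>();
    private final AtomicInteger created = new AtomicInteger();

    // Returns the writer for the given name, creating it on first use.
    // The factory lambda runs at most once per distinct name.
    StringBuilder writerFor(String name) {
        return writers.computeIfAbsent(name, n -> {
            created.incrementAndGet();   // counts simulated file creations
            return new StringBuilder();
        });
    }

    int createdCount() {
        return created.get();
    }

    public static void main(String[] args) {
        WriterCacheSketch cache = new WriterCacheSketch();
        // Many concurrent lookups spread over four output names.
        IntStream.range(0, 1000).parallel()
                 .forEach(i -> cache.writerFor("part-" + (i % 4)));
        System.out.println(cache.createdCount());
    }
}
```

 The returned writer itself would still need its own synchronization if several threads write to the same output.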


[jira] [Commented] (MAPREDUCE-2293) Enhance MultipleOutputs to allow additional characters in the named output name

2013-01-27 Thread Harsh J (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563854#comment-13563854
 ] 

Harsh J commented on MAPREDUCE-2293:


No comments anyone? May I commit this in as-is?

 Enhance MultipleOutputs to allow additional characters in the named output 
 name
 ---

 Key: MAPREDUCE-2293
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2293
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 0.21.0
Reporter: David Rosenstrauch
Assignee: Harsh J
Priority: Minor
 Attachments: mapreduce.mo.removecheck.r1.diff, 
 mapreduce.mo.removecheck.r2.diff, mapreduce.mo.removecheck.r3.diff, 
 mapreduce.mo.removecheck.r4.diff, mapreduce.mo.removecheck.r5.diff


 Currently you are only allowed to use alpha-numeric characters in a named 
 output name in the MultipleOutputs class.  This is a bit of an onerous 
 restriction, as it would be extremely convenient to be able to use non 
 alpha-numerics in the name too.  (E.g., a '.' character would be very 
 helpful, so that you can use the named output name for holding a file 
 name/extension.  Perhaps '-' and a '_' characters as well.)
 The restriction seems to be somewhat arbitrary - it appears to be only 
 enforced in the checkTokenName method.  (Though I don't know if there's any 
 downstream impact by loosening this restriction.)
 Would be extremely helpful/useful to have this fixed though!
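
 A minimal sketch of what a loosened name check could look like, assuming the characters mentioned above ('.', '-' and '_') are simply added to the allowed set. NamedOutputCheckSketch and isValidName are hypothetical names; this is not the checkTokenName method, and the attached patches may take a different route, such as removing the check altogether.

```java
final class NamedOutputCheckSketch {
    // Accepts ASCII letters and digits plus '.', '-' and '_'.
    // Null and empty names are rejected.
    static boolean isValidName(String name) {
        if (name == null || name.isEmpty()) {
            return false;
        }
        for (char c : name.toCharArray()) {
            boolean alphaNumeric = (c >= 'a' && c <= 'z')
                || (c >= 'A' && c <= 'Z')
                || (c >= '0' && c <= '9');
            boolean extra = c == '.' || c == '-' || c == '_';
            if (!alphaNumeric && !extra) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValidName("report-2013_01.txt"));
        System.out.println(isValidName("a/b"));
    }
}
```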


[jira] [Updated] (MAPREDUCE-2454) Allow external sorter plugin for MR

2013-01-27 Thread Mariappan Asokan (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mariappan Asokan updated MAPREDUCE-2454:


Release Note: MAPREDUCE-4807 Allow external implementations of the sort 
phase in a Map task

 Allow external sorter plugin for MR
 ---

 Key: MAPREDUCE-2454
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2454
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Affects Versions: 2.0.0-alpha, 3.0.0, 2.0.2-alpha
Reporter: Mariappan Asokan
Assignee: Mariappan Asokan
Priority: Minor
  Labels: features, performance, plugin, sort
 Fix For: 2.0.3-alpha

 Attachments: HadoopSortPlugin.pdf, HadoopSortPlugin.pdf, 
 KeyValueIterator.java, MapOutputSorterAbstract.java, MapOutputSorter.java, 
 mapreduce-2454-modified-code.patch, mapreduce-2454-modified-test.patch, 
 mapreduce-2454-new-test.patch, mapreduce-2454.patch, mapreduce-2454.patch, 
 mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch, 
 mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch, 
 mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch, 
 mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch, 
 mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch, 
 mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch, 
 mapreduce-2454.patch, mapreduce-2454-protection-change.patch, 
 mr-2454-on-mr-279-build82.patch.gz, MR-2454-trunkPatchPreview.gz, 
 ReduceInputSorter.java


 Define interfaces and some abstract classes in the Hadoop framework to 
 facilitate external sorter plugins both on the Map and Reduce sides.


[jira] [Updated] (MAPREDUCE-4956) The Additional JH Info Should Be Exposed

2013-01-27 Thread Zhijie Shen (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated MAPREDUCE-4956:
---

Attachment: MAPREDUCE-4956_1.patch

I've created a patch that allows the newly added JH info, i.e., the workflow 
information of a job and the locality/avataar of a task attempt, to be exposed 
to the CLI and rumen. Test cases are also included.

One related issue I've found is that there seems to be no end-to-end test 
for rumen.

Thanks,
Zhijie

 The Additional JH Info Should Be Exposed
 

 Key: MAPREDUCE-4956
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4956
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Attachments: MAPREDUCE-4956_1.patch


 In MAPREDUCE-4838, additional info was added to JH. It would be useful to 
 expose this info, at least via the UI.


[jira] [Commented] (MAPREDUCE-4039) Sort Avoidance

2013-01-27 Thread anty.rao (JIRA)

[
https://issues.apache.org/jira/browse/MAPREDUCE-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563982#comment-13563982
]

anty.rao commented on MAPREDUCE-4039:
-

[~masokan]
You can get on with it!
Looking forward to seeing this feature incorporated into trunk.

Sort Avoidance
--

Key: MAPREDUCE-4039
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4039
Project: Hadoop Map/Reduce
Issue Type: New Feature
Components: mrv2
Affects Versions: 0.23.2
Reporter: anty.rao
Assignee: anty
Priority: Minor
Fix For: 0.23.2

Attachments: IndexedCountingSortable.java,
MAPREDUCE-4039-branch-0.23.2.patch, MAPREDUCE-4039-branch-0.23.2.patch,
MAPREDUCE-4039-branch-0.23.2.patch

Inspired by
[Tenzing|http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en//pubs/archive/37200.pdf],
section 5.1, MapReduce Enhancements:
{quote}*Sort Avoidance*. Certain operators such as hash join
and hash aggregation require shuffling, but not sorting. The
MapReduce API was enhanced to automatically turn off
sorting for these operations. When sorting is turned off, the
mapper feeds data to the reducer which directly passes the
data to the Reduce() function bypassing the intermediate
sorting step. This makes many SQL operators significantly
more efficient.{quote}
Many applications need aggregation only, not sorting. Using sorting to
achieve aggregation is costly and inefficient. Without sorting, an
application can use a hash table or hash map to aggregate efficiently. But
the application should bear in mind that reduce memory is limited: it is
itself responsible for managing reduce-side memory and guarding against
running out of memory. A map-side combiner is not supported; as a
workaround, you can do hash aggregation on the map side.
The main points of the sort avoidance implementation are:
# Add a boolean configuration parameter, ??mapreduce.sort.avoidance??, to
turn the sort avoidance workflow on or off. The two workflows coexist.
# Key/value pairs emitted by the map function are sorted by partition only,
using a more efficient sorting algorithm: counting sort.
# The map-side merge uses a kind of byte merge, which just concatenates bytes
from the generated spills (read in bytes, write out bytes) without the
overhead of key/value serialization/deserialization and comparison that the
current version incurs.
# Reduce can start as soon as any map output is available, in contrast to
the sort workflow, which must wait until all map outputs are fetched and
merged.
# Map output in memory can be consumed directly by reduce. When reduce can't
keep up with the incoming map outputs, the in-memory merge thread kicks in
and merges in-memory map outputs onto disk.
# On-disk files are read sequentially to feed reduce, in contrast to the
current implementation, which reads multiple files concurrently and causes
many disk seeks. Map output in memory takes precedence over on-disk files in
feeding the reduce function.
I have already implemented this feature based on Hadoop CDH3U3 and done some
performance evaluation; see [https://github.com/hanborq/hadoop] for details.
Now I'm willing to port it to YARN. Comments are welcome.
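
A minimal sketch of sorting map output by partition only with a counting sort, as the second point of the list describes. It assumes each record is identified by its index and already carries a partition number; PartitionCountingSortSketch and sortByPartition are hypothetical names, and this is not the IndexedCountingSortable code attached to the issue.

```java
import java.util.Arrays;

final class PartitionCountingSortSketch {
    // Returns record indices grouped by partition number. Records within a
    // partition keep their original relative order; keys are never compared.
    static int[] sortByPartition(int[] partitions, int numPartitions) {
        int[] start = new int[numPartitions + 1];
        // Count how many records fall into each partition.
        for (int p : partitions) {
            start[p + 1]++;
        }
        // Prefix sums: start[p] becomes the first output slot of partition p.
        for (int p = 0; p < numPartitions; p++) {
            start[p + 1] += start[p];
        }
        // Place each record index into the next free slot of its partition.
        int[] order = new int[partitions.length];
        for (int i = 0; i < partitions.length; i++) {
            order[start[partitions[i]]++] = i;
        }
        return order;
    }

    public static void main(String[] args) {
        int[] partitions = {2, 0, 1, 0, 2};
        System.out.println(Arrays.toString(sortByPartition(partitions, 3)));
    }
}
```

The pass is linear in the number of records plus the number of partitions, which is where the saving over a comparison sort on keys comes from.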

[jira] [Commented] (MAPREDUCE-4961) Map reduce running local should also go through ShuffleConsumerPlugin for enabling different MergeManager implementations

2013-01-27 Thread Jerry Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563988#comment-13563988
 ] 

Jerry Chen commented on MAPREDUCE-4961:
---

[~asokan]
Thank you very much for your review and suggestion. I have thought about your 
suggestion. It can be done that way.

The problem is that if the MergeManager doesn't provide an interface 
method for the local merge case, other implementations of the merge manager will 
still find it hard to benefit from the MergeManager abstraction. For example, when I 
consider the HashMergeManager, I need to call into HashMergeManager for 
the local merge because the merge process is different from the static 
Merger.merge. So the HashShuffle would still need to handle this specially in 
runLocal(). Although this is not a big issue, the purpose of the MergeManager 
interface is to provide an abstraction layer for Shuffle to use.

I don't insist on changing MergeManager, though. If you think the above 
reasoning makes sense, let's keep the change in MergeManager; otherwise, let's take 
your approach. Please let me know what you think.

Jerry



 Map reduce running local should also go through ShuffleConsumerPlugin for 
 enabling different MergeManager implementations
 -

 Key: MAPREDUCE-4961
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4961
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: trunk
Reporter: Jerry Chen
Assignee: Jerry Chen
 Attachments: MAPREDUCE-4961.patch

   Original Estimate: 72h
  Remaining Estimate: 72h

 MAPREDUCE-4049 provides the ability for pluggable Shuffle and MAPREDUCE-4080 
 extends Shuffle to be able to provide different MergeManager implementations. 
 While using these pluggable features, I found that when a map reduce job is 
 running locally, a RawKeyValueIterator is returned directly from a static 
 call to Merger.merge, which breaks the assumption that the Shuffle may provide 
 different merge methods, although there is no copy phase in this situation.
 The use case: I am implementing a hash-based MergeManager, which doesn't 
 need a sort on the map side. When running the map reduce job locally, the 
 hash-based MergeManager has no chance to be used because the code goes directly to 
 Merger.merge. This makes the pluggable Shuffle and MergeManager incomplete.
 So we need to move the code calling Merger.merge from ReduceTask to the 
 ShuffleConsumerPlugin implementation, so that the Shuffle implementation can 
 decide how to do the merge and return the corresponding iterator.


[jira] [Commented] (MAPREDUCE-4883) Reducer's Maximum Shuffle Buffer Size should be enlarged for 64bit JVM

2013-01-27 Thread Jerry Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563998#comment-13563998
 ] 

Jerry Chen commented on MAPREDUCE-4883:
---

[~jerrylead]
The key mapred.job.reduce.input.buffer.percent has already been deprecated and 
replaced by the new name mapreduce.reduce.input.buffer.percent.

 Reducer's Maximum Shuffle Buffer Size should be enlarged for 64bit JVM
 --

 Key: MAPREDUCE-4883
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4883
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 0.20.2, 1.0.3
 Environment: Especially for 64bit JVM
Reporter: Lijie Xu
Assignee: Jerry Chen
  Labels: patch
 Attachments: MAPREDUCE-4883.patch

   Original Estimate: 12h
  Remaining Estimate: 12h

 In hadoop-0.20.2, hadoop-1.0.3 and other versions, the reducer's shuffle buffer 
 size cannot exceed 2048MB (i.e., Integer.MAX_VALUE bytes). This is reasonable for a 
 32bit JVM.
 But for a 64bit JVM, although the reducer's JVM size can be set to more than 2048MB 
 (e.g., mapred.child.java.opts=-Xmx4000m), the heap size used for the shuffle 
 buffer is at most 2048MB * maxInMemCopyUse (default 0.7), not 4000MB * 
 maxInMemCopyUse. 
 So the marked piece of code in ReduceTask.java needs modification for a 64bit 
 JVM.
 ---
   private final long maxSize;
   private final long maxSingleShuffleLimit;
  
   private long size = 0;
  
   private Object dataAvailable = new Object();
   private long fullSize = 0;
   private int numPendingRequests = 0;
   private int numRequiredMapOutputs = 0;
   private int numClosed = 0;
   private boolean closed = false;
  
   public ShuffleRamManager(Configuration conf) throws IOException {
     final float maxInMemCopyUse =
       conf.getFloat("mapred.job.shuffle.input.buffer.percent", 0.70f);
     if (maxInMemCopyUse > 1.0 || maxInMemCopyUse < 0.0) {
       throw new IOException("mapred.job.shuffle.input.buffer.percent" +
                             maxInMemCopyUse);
     }
     // Allow unit tests to fix Runtime memory
 -->  maxSize = (int)(conf.getInt("mapred.job.reduce.total.mem.bytes",
 -->      (int)Math.min(Runtime.getRuntime().maxMemory(), Integer.MAX_VALUE))
 -->    * maxInMemCopyUse);
     maxSingleShuffleLimit = (long)(maxSize * 
       MAX_SINGLE_SHUFFLE_SEGMENT_FRACTION);
     LOG.info("ShuffleRamManager: MemoryLimit=" + maxSize +
              ", MaxSingleShuffleLimit=" + maxSingleShuffleLimit);
   }
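
 A minimal sketch of the arithmetic behind the report, assuming a heap of 4000MB as in the -Xmx4000m example. ShuffleLimitSketch, intLimit and longLimit are hypothetical names; intLimit mirrors the int-based computation quoted above, and longLimit shows one possible long-based alternative, not necessarily what the attached patch does.

```java
final class ShuffleLimitSketch {
    // Mirrors the quoted code: the heap size is clamped to Integer.MAX_VALUE
    // before the fraction is applied, so anything above 2048MB is ignored.
    static long intLimit(long maxMemory, float fraction) {
        return (int) ((int) Math.min(maxMemory, Integer.MAX_VALUE) * fraction);
    }

    // Keeps the computation in 64 bits so that the limit scales with the heap.
    static long longLimit(long maxMemory, float fraction) {
        return (long) (maxMemory * (double) fraction);
    }

    public static void main(String[] args) {
        long heap = 4000L * 1024 * 1024;   // corresponds to -Xmx4000m
        System.out.println("int-based limit:  " + intLimit(heap, 0.70f));
        System.out.println("long-based limit: " + longLimit(heap, 0.70f));
    }
}
```

 With the int-based version, a 4000MB heap and an 8000MB heap yield the same limit, which is the behaviour the issue describes.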


[jira] [Commented] (MAPREDUCE-4838) Add extra info to JH files

2013-01-27 Thread Matt Foley (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564008#comment-13564008
 ] 

Matt Foley commented on MAPREDUCE-4838:
---

When done, please also commit to hadoop-1 branch.  Thank you.

 Add extra info to JH files
 --

 Key: MAPREDUCE-4838
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4838
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Arun C Murthy
Assignee: Zhijie Shen
 Attachments: MAPREDUCE-4838_1.patch, MAPREDUCE-4838_2.patch, 
 MAPREDUCE-4838_3.patch, MAPREDUCE-4838_4.patch, MAPREDUCE-4838_5.patch, 
 MAPREDUCE-4838.patch


 It will be useful to add more task-info to JH for analytics.


[jira] [Updated] (MAPREDUCE-4838) Add extra info to JH files

2013-01-27 Thread Matt Foley (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Foley updated MAPREDUCE-4838:
--

Target Version/s: 1.2.0, 2.0.3-alpha  (was: 2.0.3-alpha)



[jira] [Commented] (MAPREDUCE-2264) Job status exceeds 100% in some cases

2013-01-27 Thread Chris Douglas (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564012#comment-13564012
 ] 

Chris Douglas commented on MAPREDUCE-2264:
--

The patch to trunk changes {{onDiskMapOutputs}} to be a {{TreeSet}} of 
{{CompressAwarePath}} instead of {{Path}}, but the latter doesn't implement 
{{Comparable}}, neither is the {{TreeSet}} instantiated with a {{Comparator}}. 
So there's no defined ordering for the sorted set.
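
A minimal sketch of one way to give such a sorted set a defined ordering, assuming the elements should be ordered by size. SizedPathSketch is a hypothetical stand-in for CompressAwarePath, and the choice of size-then-path ordering is an assumption for illustration, not what the patch does. A TreeSet whose element type is not Comparable and that has no Comparator typically fails with a ClassCastException when elements are added, so the Comparator is passed to the constructor.

```java
import java.util.Comparator;
import java.util.TreeSet;

// Hypothetical stand-in for CompressAwarePath: a path plus its raw size.
final class SizedPathSketch {
    final String path;
    final long rawSize;

    SizedPathSketch(String path, long rawSize) {
        this.path = path;
        this.rawSize = rawSize;
    }
}

final class OnDiskOutputsSketch {
    // Orders by size, then by path, so that two distinct files of equal
    // size are both kept in the set.
    static TreeSet<SizedPathSketch> newSortedSet() {
        Comparator<SizedPathSketch> bySizeThenPath = Comparator
            .comparingLong((SizedPathSketch p) -> p.rawSize)
            .thenComparing(p -> p.path);
        return new TreeSet<>(bySizeThenPath);
    }

    public static void main(String[] args) {
        TreeSet<SizedPathSketch> outputs = newSortedSet();
        outputs.add(new SizedPathSketch("map_2.out", 300L));
        outputs.add(new SizedPathSketch("map_0.out", 100L));
        outputs.add(new SizedPathSketch("map_1.out", 100L));
        for (SizedPathSketch p : outputs) {
            System.out.println(p.path + " " + p.rawSize);
        }
    }
}
```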

 Job status exceeds 100% in some cases 
 --

 Key: MAPREDUCE-2264
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2264
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.2, 0.20.205.0
Reporter: Adam Kramer
Assignee: Devaraj K
  Labels: critical-0.22.0
 Fix For: 1.2.0, 2.0.3-alpha

 Attachments: MAPREDUCE-2264-0.20.205-1.patch, 
 MAPREDUCE-2264-0.20.205.patch, MAPREDUCE-2264-0.20.3.patch, 
 MAPREDUCE-2264-branch-1-1.patch, MAPREDUCE-2264-branch-1-2.patch, 
 MAPREDUCE-2264-branch-1.patch, MAPREDUCE-2264-trunk-1.patch, 
 MAPREDUCE-2264-trunk-1.patch, MAPREDUCE-2264-trunk-2.patch, 
 MAPREDUCE-2264-trunk-3.patch, MAPREDUCE-2264-trunk.patch, more than 100%.bmp


 I'm looking now at my jobtracker's list of running reduce tasks. One of them 
 is 120.05% complete, the other is 107.28% complete.
 I understand that these numbers are estimates, but there is no case in which 
 an estimate of 100% for a non-complete task is better than an estimate of 
 99.99%, nor is there any case in which an estimate greater than 100% is valid.
 I suggest that whatever logic is computing these set 99.99% as a hard maximum.


[jira] [Updated] (MAPREDUCE-4837) Add MR-AM web-services to branch-1

2013-01-27 Thread Matt Foley (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Foley updated MAPREDUCE-4837:
--

Target Version/s: 1.2.0

 Add MR-AM web-services to branch-1
 --

 Key: MAPREDUCE-4837
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4837
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Arun C Murthy
Assignee: Arun C Murthy
 Attachments: MAPREDUCE-4837.patch


 Add MR-AM web-services to branch-1


[jira] [Commented] (MAPREDUCE-4837) Add MR-AM web-services to branch-1

2013-01-27 Thread Matt Foley (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564026#comment-13564026
 ] 

Matt Foley commented on MAPREDUCE-4837:
---

How are we doing getting this reviewed and in to branch-1 for 1.2.0?



[jira] [Commented] (MAPREDUCE-4956) The Additional JH Info Should Be Exposed

2013-01-27 Thread Zhijie Shen (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564027#comment-13564027
 ] 

Zhijie Shen commented on MAPREDUCE-4956:


Forgot to mention that MAPREDUCE-4956_1.patch is an incremental patch based on 
MAPREDUCE-4838_5.patch.



[jira] [Updated] (MAPREDUCE-4963) StatisticsCollector improperly keeps track of Last Day and Last Hour statistics for new TaskTrackers

2013-01-27 Thread Matt Foley (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Foley updated MAPREDUCE-4963:
--

Target Version/s: 1.2.0  (was: 1.1.2)

 StatisticsCollector improperly keeps track of Last Day and Last Hour 
 statistics for new TaskTrackers
 

 Key: MAPREDUCE-4963
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4963
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 1.1.1
Reporter: Robert Kanter
Assignee: Robert Kanter
 Fix For: 1.2.0

 Attachments: MAPREDUCE-4963.patch


 The StatisticsCollector keeps track of updates to the "Total Tasks Last Day", 
 "Succeeded Tasks Last Day", "Total Tasks Last Hour", and "Succeeded Tasks Last 
 Hour" counts per TaskTracker, which are displayed on the JobTracker web UI.  It uses 
 buckets to manage when to shift task counts from "Last Hour" to "Last Day" 
 and out of "Last Day".  After the JT has been running for a while, the 
 connected TTs will have the max number of buckets and will keep shifting them 
 at each update.  If a new TT connects (or an old one rejoins), it won't have 
 the max number of buckets, but the code that drops the buckets uses the same 
 counter for all sets of buckets.  This means that new TTs will prematurely 
 drop their buckets and the stats will be incorrect.  
 Example:
 # Max buckets is 5
 # TaskTracker A has these values in its buckets [4, 2, 0, 3, 10] (i.e. 19)
 # A new TaskTracker, B, connects; it has nothing in its buckets: [ ] (i.e. 0)
 # TaskTracker B runs 3 tasks and TaskTracker A runs 5
 # An update occurs
 # TaskTracker A has [2, 0, 3, 10, 5] (i.e. 20)
 # TaskTracker B should have [3] but it will drop that bucket after adding it 
 during the update and instead have [ ] again (i.e. 0)
 # TaskTracker B will keep doing that forever and always show 0 in the web UI
 We can fix this by not using the same counter for all sets of buckets.
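
 A minimal sketch of the suggested fix, assuming each TaskTracker owns its window of buckets and the eviction decision depends only on that window's own size. BucketWindowSketch is a hypothetical class for illustration; it is not the StatisticsCollector code or the attached patch.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// One window of buckets per TaskTracker; no counter is shared between windows.
final class BucketWindowSketch {
    private final int maxBuckets;
    private final Deque<Integer> buckets = new ArrayDeque<>();

    BucketWindowSketch(int maxBuckets) {
        this.maxBuckets = maxBuckets;
    }

    // Adds the newest bucket and drops the oldest one only when this
    // particular window has grown past its maximum size.
    void update(int tasksSinceLastUpdate) {
        buckets.addLast(tasksSinceLastUpdate);
        if (buckets.size() > maxBuckets) {
            buckets.removeFirst();
        }
    }

    // Sum of all buckets currently in the window.
    int total() {
        int sum = 0;
        for (int count : buckets) {
            sum += count;
        }
        return sum;
    }

    public static void main(String[] args) {
        // TaskTracker A already holds [4, 2, 0, 3, 10]; B has just connected.
        BucketWindowSketch a = new BucketWindowSketch(5);
        for (int count : new int[] {4, 2, 0, 3, 10}) {
            a.update(count);
        }
        BucketWindowSketch b = new BucketWindowSketch(5);
        a.update(5);
        b.update(3);
        System.out.println("A=" + a.total() + " B=" + b.total());
    }
}
```

 Replaying the example from the description, A ends at 20 and B keeps its single bucket of 3 instead of dropping back to 0.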


[jira] [Updated] (MAPREDUCE-4964) JobLocalizer#localizeJobFiles can potentially write job.xml to the wrong user's directory

2013-01-27 Thread Matt Foley (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Foley updated MAPREDUCE-4964:
--

Target Version/s: 1.2.0  (was: 1.1.2)

 JobLocalizer#localizeJobFiles can potentially write job.xml to the wrong 
 user's directory
 -

 Key: MAPREDUCE-4964
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4964
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 1.1.1
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: MR-4964.patch


 In the following code, if jobs corresponding to different users (X and Y) are 
 localized simultaneously, it is possible that jobconf can be written to the 
 wrong user's directory. (X's job.xml can be written to Y's directory)
 {code}
   public void localizeJobFiles(JobID jobid, JobConf jConf,
       Path localJobTokenFile, TaskUmbilicalProtocol taskTracker)
       throws IOException, InterruptedException {
     localizeJobFiles(jobid, jConf,
         lDirAlloc.getLocalPathForWrite(JOBCONF, ttConf), localJobTokenFile,
         taskTracker);
   }
 {code}


[jira] [Commented] (MAPREDUCE-4964) JobLocalizer#localizeJobFiles can potentially write job.xml to the wrong user's directory

2013-01-27 Thread Matt Foley (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564042#comment-13564042
 ] 

Matt Foley commented on MAPREDUCE-4964:
---

Since this is still in progress, please target 1.2 (branch-1).  Thanks.

 JobLocalizer#localizeJobFiles can potentially write job.xml to the wrong 
 user's directory
 -

 Key: MAPREDUCE-4964
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4964
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 1.1.1
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: MR-4964.patch



[jira] [Updated] (MAPREDUCE-4964) JobLocalizer#localizeJobFiles can potentially write job.xml to the wrong user's directory

2013-01-27 Thread Karthik Kambatla (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4964:


Attachment: (was: MR-4964.patch)

 JobLocalizer#localizeJobFiles can potentially write job.xml to the wrong 
 user's directory
 -

 Key: MAPREDUCE-4964
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4964
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 1.1.1
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: MR-4964.patch



[jira] [Updated] (MAPREDUCE-4964) JobLocalizer#localizeJobFiles can potentially write job.xml to the wrong user's directory

2013-01-27 Thread Karthik Kambatla (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4964:


Attachment: MR-4964.patch

 JobLocalizer#localizeJobFiles can potentially write job.xml to the wrong 
 user's directory
 -

 Key: MAPREDUCE-4964
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4964
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 1.1.1
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: MR-4964.patch



[jira] [Updated] (MAPREDUCE-4397) Introduce HADOOP_SECURITY_CONF_DIR for task-controller

2013-01-27 Thread Matt Foley (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Foley updated MAPREDUCE-4397:
--

Description: The linux task controller currently hard codes the directory 
in which to look for its config file at compile time (via the HADOOP_CONF_DIR 
macro). Adding a new environment variable to look for task-controller's conf 
dir (with strict permission checks) would make installation much more flexible. 
 (was: The linux task controller concurrently hard code the directory to look 
for its config file at compile time (via the HADOOP_CONF_DIR macro). Adding a 
new environment variable to look for task-controller's conf dir (with strict 
permission checks) would make installation much more flexible.)

 Introduce HADOOP_SECURITY_CONF_DIR for task-controller
 --

 Key: MAPREDUCE-4397
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4397
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: task-controller
Reporter: Luke Lu
Assignee: Yu Gao
 Fix For: 1.1.2

 Attachments: mapreduce-4397-branch-1.patch, test-patch.result


 The linux task controller currently hard codes the directory in which to look 
 for its config file at compile time (via the HADOOP_CONF_DIR macro). Adding a 
 new environment variable to look for task-controller's conf dir (with strict 
 permission checks) would make installation much more flexible.


[jira] [Commented] (MAPREDUCE-4964) JobLocalizer#localizeJobFiles can potentially write job.xml to the wrong user's directory

2013-01-27 Thread Karthik Kambatla (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564079#comment-13564079
 ] 

Karthik Kambatla commented on MAPREDUCE-4964:
-

On closer look, I realized the previous patch might mitigate, but would not 
solve, the problem.

The JobLocalizer constructor makes a shallow copy of the ttConf object. The new 
patch makes a deep copy of the conf object just before setting the 
user-specific local dirs.
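
The difference between sharing the conf object and copying it first can be sketched without Hadoop on the classpath. Everything below is hypothetical: a plain Map stands in for the conf object, and the Localizer class, the key name, and the paths are illustrative only. Copying the map is enough for independence in this sketch because its values are immutable strings.

```java
import java.util.HashMap;
import java.util.Map;

public class ConfCopySketch {
    // Hypothetical key standing in for the user-specific local dirs setting.
    static final String LOCAL_DIR_KEY = "sketch.local.dir";

    static final class Localizer {
        private final Map<String, String> conf;

        // copyFirst == false models sharing the TaskTracker-wide object;
        // copyFirst == true models copying it before the per-user setting.
        Localizer(Map<String, String> ttConf, String user, boolean copyFirst) {
            this.conf = copyFirst ? new HashMap<String, String>(ttConf) : ttConf;
            this.conf.put(LOCAL_DIR_KEY, "/local/" + user);
        }

        // Path where this localizer would write its job.xml.
        String jobConfPath() {
            return conf.get(LOCAL_DIR_KEY) + "/job.xml";
        }
    }

    public static void main(String[] args) {
        // Shared object: Y's setting overwrites X's before X resolves its path.
        Map<String, String> ttConf = new HashMap<>();
        Localizer x = new Localizer(ttConf, "X", false);
        Localizer y = new Localizer(ttConf, "Y", false);
        System.out.println("shared: X writes to " + x.jobConfPath());

        // Independent copies: each localizer keeps its own setting.
        ttConf = new HashMap<>();
        x = new Localizer(ttConf, "X", true);
        y = new Localizer(ttConf, "Y", true);
        System.out.println("copied: X writes to " + x.jobConfPath());
        System.out.println("copied: Y writes to " + y.jobConfPath());
    }
}
```

In the shared case X resolves /local/Y/job.xml, which mirrors the reported symptom; with independent copies X and Y resolve their own directories. The interleaving is simulated sequentially here, so the sketch is deterministic and needs no threads.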

 JobLocalizer#localizeJobFiles can potentially write job.xml to the wrong 
 user's directory
 -

 Key: MAPREDUCE-4964
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4964
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 1.1.1
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: MR-4964.patch



[jira] [Commented] (MAPREDUCE-4396) Make LocalJobRunner work with private distributed cache

2013-01-27 Thread Matt Foley (JIRA)

[
https://issues.apache.org/jira/browse/MAPREDUCE-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564086#comment-13564086
]

Matt Foley commented on MAPREDUCE-4396:
---

The patch for HADOOP-8734 is the same in LocalJobRunner.java. However, that
patch also includes a proper unit test change in TestMRWithDistributedCache.java.
@Yu, your comment above that "This patch can be verified by running unit test
TestMRWithDistributedCache in a user directory with restrictive permission
(e.g. 700)" is correct, but the conclusion that therefore no new unit test is
necessary is incorrect. What's necessary is a unit test that does precisely
that. The patch from HADOOP-8734 provides such a test.

I've merged the unit test patch to hadoop-1 and hadoop-1.1.

Make LocalJobRunner work with private distributed cache
---

Key: MAPREDUCE-4396
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4396
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: client
Affects Versions: 1.0.3
Reporter: Luke Lu
Assignee: Yu Gao
Priority: Minor
Attachments: mapreduce-4396-branch-1.patch, test-afterpatch.result,
test-beforepatch.result, test-patch.result

Some LocalJobRunner-related unit tests fail if the user directory permission
and/or umask is too restrictive.

[jira] [Updated] (MAPREDUCE-4396) Make LocalJobRunner work with private distributed cache

2013-01-27 Thread Matt Foley (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Foley updated MAPREDUCE-4396:
--

Fix Version/s: 1.1.2

 Make LocalJobRunner work with private distributed cache
 ---

 Key: MAPREDUCE-4396
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4396
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client
Affects Versions: 1.0.3
Reporter: Luke Lu
Assignee: Yu Gao
Priority: Minor
 Fix For: 1.1.2

 Attachments: mapreduce-4396-branch-1.patch, test-afterpatch.result, 
 test-beforepatch.result, test-patch.result



[jira] [Updated] (MAPREDUCE-4396) Make LocalJobRunner work with private distributed cache

2013-01-27 Thread Matt Foley (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Foley updated MAPREDUCE-4396:
--

Attachment: HADOOP-8734-LocalJobRunner.patch

 Make LocalJobRunner work with private distributed cache
 ---

 Key: MAPREDUCE-4396
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4396
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client
Affects Versions: 1.0.3
Reporter: Luke Lu
Assignee: Yu Gao
Priority: Minor
 Fix For: 1.1.2

 Attachments: HADOOP-8734-LocalJobRunner.patch, 
 mapreduce-4396-branch-1.patch, test-afterpatch.result, 
 test-beforepatch.result, test-patch.result



[jira] [Updated] (MAPREDUCE-4964) JobLocalizer#localizeJobFiles can potentially write job.xml to the wrong user's directory

2013-01-27 Thread Karthik Kambatla (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4964:


Attachment: MR-4964.patch

Updated the patch with a test that reproduces the issue without the fix and 
verifies that the fix resolves it.

 JobLocalizer#localizeJobFiles can potentially write job.xml to the wrong 
 user's directory
 -

 Key: MAPREDUCE-4964
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4964
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 1.1.1
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: MR-4964.patch, MR-4964.patch



[jira] [Updated] (MAPREDUCE-4964) JobLocalizer#localizeJobFiles can potentially write job.xml to the wrong user's directory

2013-01-27 Thread Karthik Kambatla (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4964:


Status: Patch Available  (was: Open)

 JobLocalizer#localizeJobFiles can potentially write job.xml to the wrong 
 user's directory
 -

 Key: MAPREDUCE-4964
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4964
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 1.1.1
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: MR-4964.patch, MR-4964.patch



[jira] [Commented] (MAPREDUCE-4964) JobLocalizer#localizeJobFiles can potentially write job.xml to the wrong user's directory

2013-01-27 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564096#comment-13564096
 ] 

Hadoop QA commented on MAPREDUCE-4964:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12566723/MR-4964.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3283//console

This message is automatically generated.

 JobLocalizer#localizeJobFiles can potentially write job.xml to the wrong 
 user's directory
 -

 Key: MAPREDUCE-4964
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4964
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 1.1.1
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: MR-4964.patch, MR-4964.patch



