Ok... 
Where are you pulling these questions from? 

Seriously. 


On Nov 7, 2012, at 11:21 AM, Ramasubramanian Narayanan 
<ramasubramanian.naraya...@gmail.com> wrote:

> Hi,
> 
>    I came across the following questions on some sites, and several of 
> the answers they provide seem wrong to me... I might be wrong... Could 
> someone help confirm the right answers for these 11 questions, please? 
> I'd appreciate an explanation if you can provide one...
> 
> *******************************************************************************
> You are running a job that will process a single InputSplit on a cluster 
> which has no other jobs
> currently running. Each node has an equal number of open Map slots. On which 
> node will Hadoop
> first attempt to run the Map task?
> A. The node with the most memory
> B. The node with the lowest system load
> C. The node on which this InputSplit is stored
> D. The node with the most free local disk space
> 
> My Answer            : C 
> Answer Given in site : A
> 
> *******************************************************************************
> What is a Writable?
> A. Writable is an interface that all keys and values in MapReduce must 
> implement. Classes implementing this interface must implement methods 
> for serializing and deserializing themselves.
> B. Writable is an abstract class that all keys and values in MapReduce 
> must extend. Classes extending this abstract base class must implement 
> methods for serializing and deserializing themselves.
> C. Writable is an interface that all keys, but not values, in MapReduce 
> must implement. Classes implementing this interface must implement 
> methods for serializing and deserializing themselves.
> D. Writable is an abstract class that all keys, but not values, in 
> MapReduce must extend. Classes extending this abstract base class must 
> implement methods for serializing and deserializing themselves.
> 
> My Answer            : A
> Answer Given in site : B
> 
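For what it's worth, A is correct: Writable is an interface
(org.apache.hadoop.io.Writable), not an abstract class, and both keys and
values have to be serializable this way. A minimal sketch of a custom
Writable (class and field names invented for illustration):

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.Writable;

// Hypothetical value type that serializes two int fields.
public class PointWritable implements Writable {
    private int x;
    private int y;

    // Hadoop instantiates Writables reflectively, so the no-arg
    // constructor is required.
    public PointWritable() {}

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeInt(x);
        out.writeInt(y);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        x = in.readInt();
        y = in.readInt();
    }
}

Keys additionally implement WritableComparable, since the framework has to
sort them during the shuffle.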
> *******************************************************************************
> 
> You write a MapReduce job to process 100 files in HDFS. Your MapReduce 
> algorithm uses TextInputFormat and the IdentityReducer: the mapper 
> applies a regular expression over input values and emits key-value pairs 
> with the key consisting of the matching text, and the value containing 
> the filename and byte offset. Determine the difference between setting 
> the number of reducers to zero and setting it to one.
> A. There is no difference in output between the two settings.
> B. With zero reducers, no reducer runs and the job throws an exception. 
> With one reducer, instances of matching patterns are stored in a single 
> file on HDFS.
> C. With zero reducers, all instances of matching patterns are gathered 
> together in one file on HDFS. With one reducer, instances of matching 
> patterns are stored in multiple files on HDFS.
> D. With zero reducers, instances of matching patterns are stored in 
> multiple files on HDFS. With one reducer, all instances of matching 
> patterns are gathered together in one file on HDFS.
> 
> My Answer            : D
> Answer Given in site : C
> 
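And D is right here: with zero reducers the job is map-only, so each map
task writes its own output file straight to HDFS; with one reducer,
everything funnels through a single reduce task into a single part file. A
minimal driver sketch (newer mapreduce API; job name invented,
input/output setup omitted):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class ReducerCountDemo {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "regex-match");
        // Map-only job: each mapper writes directly to HDFS, producing
        // one part-m-NNNNN file per map task (multiple output files).
        job.setNumReduceTasks(0);
        // Single reducer: all intermediate pairs are shuffled to one
        // reduce task, which writes a single part-r-00000 file.
        // job.setNumReduceTasks(1);
    }
}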
> *******************************************************************************
> 
> During the standard sort and shuffle phase of MapReduce, keys and values are 
> passed to
> reducers. Which of the following is true?
> A. Keys are presented to a reducer in sorted order; values for a given 
> key are not sorted.
> B. Keys are presented to a reducer in sorted order; values for a given 
> key are sorted in ascending order.
> C. Keys are presented to a reducer in random order; values for a given 
> key are not sorted.
> D. Keys are presented to a reducer in random order; values for a given 
> key are sorted in ascending order.
> 
> My Answer            : A
> Answer Given in site : D
> 
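A is the textbook answer: the shuffle sorts keys, not values. A
pass-through reducer sketch just to show where the guarantee applies:

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class OrderDemoReducer
        extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
            throws IOException, InterruptedException {
        // Keys reach reduce() in sorted order, but the values for a given
        // key arrive in no guaranteed order.
        for (IntWritable v : values) {
            ctx.write(key, v);
        }
    }
}

If you need the values in order, that takes the secondary-sort pattern
(composite key plus a grouping comparator); the framework won't do it for
you.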
> *******************************************************************************
> 
> Which statement best describes the data path of intermediate key-value pairs 
> (i.e., output of the
> mappers)?
> A. Intermediate key-value pairs are written to HDFS. Reducers read the 
> intermediate data from HDFS.
> B. Intermediate key-value pairs are written to HDFS. Reducers copy the 
> intermediate data to the local disks of the machines running the reduce 
> tasks.
> C. Intermediate key-value pairs are written to the local disks of the 
> machines running the map tasks, and then copied to the machines running 
> the reduce tasks.
> D. Intermediate key-value pairs are written to the local disks of the 
> machines running the map tasks, and are then copied to HDFS. Reducers 
> read the intermediate data from HDFS.
> 
> My Answer            : C
> Answer Given in site : B
> 
> *******************************************************************************
> 
> You are developing a combiner that takes as input Text keys, IntWritable 
> values, and emits Text keys, IntWritable values. Which interface should 
> your class implement?
> A. Mapper <Text, IntWritable, Text, IntWritable>
> B. Reducer <Text, Text, IntWritable, IntWritable>
> C. Reducer <Text, IntWritable, Text, IntWritable>
> D. Combiner <Text, IntWritable, Text, IntWritable>
> E. Combiner <Text, Text, IntWritable, IntWritable>
> 
> My Answer            : D
> Answer Given in site : C
> 
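C is actually correct here: Hadoop has no Combiner interface or class at
all; a combiner is just a Reducer whose input types match the map output
types. (The question's "interface" wording fits the old mapred API, where
Reducer is an interface; in the newer mapreduce API you extend the Reducer
class instead.) A sketch against the newer API:

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// A combiner is a Reducer: <Text, IntWritable> in, <Text, IntWritable> out.
public class SumCombiner
        extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable v : values) {
            sum += v.get();
        }
        ctx.write(key, new IntWritable(sum));
    }
}

You wire it in with job.setCombinerClass(SumCombiner.class). That is also
the point of the word-count question further down: the win from a combiner
is fewer key-value pairs shuffled across the network, not faster mappers.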
> *******************************************************************************
> 
> What happens in a MapReduce job when you set the number of reducers to one?
> A. A single reducer gathers and processes all the output from all the 
> mappers. The output is
> written in as many separate files as there are mappers.
> B. A single reducer gathers and processes all the output from all the 
> mappers. The output is
> written to a single file in HDFS.
> C. Setting the number of reducers to one creates a processing 
> bottleneck, and since the number of reducers as specified by the 
> programmer is used as a reference value only, the MapReduce runtime 
> provides a default setting for the number of reducers.
> D. Setting the number of reducers to one is invalid, and an exception is 
> thrown.
> 
> My Answer            : B
> Answer Given in site : C
> 
> *******************************************************************************
> 
> In the standard word count MapReduce algorithm, why might using a combiner 
> reduce the overall
> Job running time?
> A. Because combiners perform local aggregation of word counts, thereby 
> allowing the mappers to
> process input data faster.
> B. Because combiners perform local aggregation of word counts, thereby 
> reducing the number of
> mappers that need to run.
> C. Because combiners perform local aggregation of word counts, and then 
> transfer that data to reducers without writing the intermediate data to 
> disk.
> D. Because combiners perform local aggregation of word counts, thereby 
> reducing the number of key-value pairs that need to be shuffled across 
> the network to the reducers.
> 
> My Answer            : C
> Answer Given in site : A
> 
> *******************************************************************************
> 
> You need to create a GUI application to help your company's sales people add 
> and edit customer
> information. Would HDFS be appropriate for this customer information file?
> A. Yes, because HDFS is optimized for random access writes.
> B. Yes, because HDFS is optimized for fast retrieval of relatively small 
> amounts of data.
> C. No, because HDFS can only be accessed by MapReduce applications.
> D. No, because HDFS is optimized for write-once, streaming access for 
> relatively large files.
> 
> My Answer            : D
> Answer Given in site : A
> 
> *******************************************************************************
> 
> You need to create a job that does frequency analysis on input data. You 
> will do this by writing a Mapper that uses TextInputFormat and splits 
> each value (a line of text from an input file) into individual 
> characters. For each one of these characters, you will emit the 
> character as a key and an IntWritable as the value. Since this will 
> produce proportionally more intermediate data than input data, which 
> resources could you expect to be likely bottlenecks?
> A. Processor and RAM
> B. Processor and disk I/O
> C. Disk I/O and network I/O
> D. Processor and network I/O
> 
> My Answer            : D
> Answer Given in site : B
> 
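The reasoning to check: splitting lines into characters is cheap for the
CPU, but every input byte turns into a whole key-value pair, so the
blow-up lands on the local disks (map-side spills) and the network
(shuffle), i.e. disk I/O and network I/O. A sketch of the mapper being
described (class name invented):

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class CharFrequencyMapper
        extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text ch = new Text();

    @Override
    protected void map(LongWritable offset, Text line, Context ctx)
            throws IOException, InterruptedException {
        String s = line.toString();
        // One (character, 1) pair per input character: the intermediate
        // data is several times larger than the input it came from.
        for (int i = 0; i < s.length(); i++) {
            ch.set(String.valueOf(s.charAt(i)));
            ctx.write(ch, ONE);
        }
    }
}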
> *******************************************************************************
> 
> Which of the following statements best describes how a large (100 GB) 
> file is stored in HDFS?
> A. The file is divided into variable-size blocks, which are stored on 
> multiple datanodes. Each block is replicated three times by default.
> B. The file is replicated three times by default. Each copy of the file 
> is stored on a separate datanode.
> C. The master copy of the file is stored on a single datanode. The 
> replica copies are divided into fixed-size blocks, which are stored on 
> multiple datanodes.
> D. The file is divided into fixed-size blocks, which are stored on 
> multiple datanodes. Each block is replicated three times by default. 
> Multiple blocks from the same file might reside on the same datanode.
> E. The file is divided into fixed-size blocks, which are stored on 
> multiple datanodes. Each block is replicated three times by default. 
> HDFS guarantees that different blocks from the same file are never on 
> the same datanode.
> 
> My Answer            : D
> Answer Given in site : B
> 
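D is the accurate description: fixed-size blocks (64 MB default on the 1.x
line, 128 MB on newer releases), each block replicated three times by
default, and nothing stops two blocks of the same file from sharing a
datanode. You can inspect the layout yourself; a small sketch using the
FileSystem API (path taken from the command line):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockReport {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        FileStatus status = fs.getFileStatus(new Path(args[0]));
        // One entry per fixed-size block, listing the datanodes that hold
        // its replicas. Replicas of a single block land on distinct
        // nodes, but different blocks of the file may share a node.
        for (BlockLocation b :
                fs.getFileBlockLocations(status, 0, status.getLen())) {
            System.out.println(b);
        }
    }
}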
> *******************************************************************************
> 
> regards,
> Rams
