Ok... Where are you pulling these questions from? Seriously.
On Nov 7, 2012, at 11:21 AM, Ramasubramanian Narayanan <ramasubramanian.naraya...@gmail.com> wrote:

> Hi,
>
> I came across the following questions on some sites, and the answers they
> provide seem wrong to me... I might be wrong... Can someone help confirm the
> right answers for these 11 questions please? I'd appreciate an explanation if
> you could provide one...
>
> *******************************************************************************
> You are running a job that will process a single InputSplit on a cluster
> which has no other jobs currently running. Each node has an equal number of
> open Map slots. On which node will Hadoop first attempt to run the Map task?
> A. The node with the most memory
> B. The node with the lowest system load
> C. The node on which this InputSplit is stored
> D. The node with the most free local disk space
>
> My Answer : C
> Answer Given in site : A
>
> *******************************************************************************
> What is a Writable?
> A. Writable is an interface that all keys and values in MapReduce must
> implement. Classes implementing this interface must implement methods for
> serializing and deserializing themselves.
> B. Writable is an abstract class that all keys and values in MapReduce must
> extend. Classes extending this abstract base class must implement methods for
> serializing and deserializing themselves.
> C. Writable is an interface that all keys, but not values, in MapReduce must
> implement. Classes implementing this interface must implement methods for
> serializing and deserializing themselves.
> D. Writable is an abstract class that all keys, but not values, in MapReduce
> must extend. Classes extending this abstract base class must implement
> methods for serializing and deserializing themselves.
>
> My Answer : A
> Answer Given in site : B
>
> *******************************************************************************
> You write a MapReduce job to process 100 files in HDFS. Your MapReduce
> algorithm uses TextInputFormat and the IdentityReducer: the mapper applies a
> regular expression over input values and emits key-value pairs with the key
> consisting of the matching text, and the value containing the filename and
> byte offset. Determine the difference between setting the number of reducers
> to zero and setting the number of reducers to one.
> A. There is no difference in output between the two settings.
> B. With zero reducers, no reducer runs and the job throws an exception. With
> one reducer, instances of matching patterns are stored in a single file on
> HDFS.
> C. With zero reducers, all instances of matching patterns are gathered
> together in one file on HDFS. With one reducer, instances of matching
> patterns are stored in multiple files on HDFS.
> D. With zero reducers, instances of matching patterns are stored in multiple
> files on HDFS. With one reducer, all instances of matching patterns are
> gathered together in one file on HDFS.
>
> My Answer : D
> Answer Given in site : C
>
> *******************************************************************************
> During the standard sort and shuffle phase of MapReduce, keys and values are
> passed to reducers. Which of the following is true?
> A. Keys are presented to a reducer in sorted order; values for a given key
> are not sorted.
> B. Keys are presented to a reducer in sorted order; values for a given key
> are sorted in ascending order.
> C. Keys are presented to a reducer in random order; values for a given key
> are not sorted.
> D. Keys are presented to a reducer in random order; values for a given key
> are sorted in ascending order.
>
> My Answer : A
> Answer Given in site : D
>
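On the Writable question: as far as I know, Writable (org.apache.hadoop.io.Writable) is an interface, not an abstract class, and both keys and values have to implement it, so A looks right to me. A rough sketch of a custom value type (the class and field names here are just made up for illustration):

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;

// Hypothetical value type: a filename plus a byte offset, like the value
// described in the regular-expression question above.
public class FileOffsetWritable implements Writable {
  private Text filename = new Text();
  private LongWritable offset = new LongWritable();

  @Override
  public void write(DataOutput out) throws IOException {
    // Serialize the fields in a fixed order.
    filename.write(out);
    offset.write(out);
  }

  @Override
  public void readFields(DataInput in) throws IOException {
    // Deserialize the fields in exactly the same order they were written.
    filename.readFields(in);
    offset.readFields(in);
  }
}

Keys additionally implement WritableComparable so the framework can sort them, which is also why keys arrive at a reducer in sorted order in the sort-and-shuffle question.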
> *******************************************************************************
> Which statement best describes the data path of intermediate key-value pairs
> (i.e., output of the mappers)?
> A. Intermediate key-value pairs are written to HDFS. Reducers read the
> intermediate data from HDFS.
> B. Intermediate key-value pairs are written to HDFS. Reducers copy the
> intermediate data to the local disks of the machines running the reduce
> tasks.
> C. Intermediate key-value pairs are written to the local disks of the
> machines running the map tasks, and then copied to the machines running the
> reduce tasks.
> D. Intermediate key-value pairs are written to the local disks of the
> machines running the map tasks, and are then copied to HDFS. Reducers read
> the intermediate data from HDFS.
>
> My Answer : C
> Answer Given in site : B
>
> *******************************************************************************
> You are developing a combiner that takes as input Text keys, IntWritable
> values, and emits Text keys, IntWritable values. Which interface should your
> class implement?
> A. Mapper <Text, IntWritable, Text, IntWritable>
> B. Reducer <Text, Text, IntWritable, IntWritable>
> C. Reducer <Text, IntWritable, Text, IntWritable>
> D. Combiner <Text, IntWritable, Text, IntWritable>
> E. Combiner <Text, Text, IntWritable, IntWritable>
>
> My Answer : D
> Answer Given in site : C
>
> *******************************************************************************
> What happens in a MapReduce job when you set the number of reducers to one?
> A. A single reducer gathers and processes all the output from all the
> mappers. The output is written in as many separate files as there are
> mappers.
> B. A single reducer gathers and processes all the output from all the
> mappers. The output is written to a single file in HDFS.
> C. Setting the number of reducers to one creates a processing bottleneck,
> and since the number of reducers as specified by the programmer is used as a
> reference value only, the MapReduce runtime provides a default setting for
> the number of reducers.
> D. Setting the number of reducers to one is invalid, and an exception is
> thrown.
>
> My Answer : B
> Answer Given in site : C
>
> *******************************************************************************
> In the standard word count MapReduce algorithm, why might using a combiner
> reduce the overall job running time?
> A. Because combiners perform local aggregation of word counts, thereby
> allowing the mappers to process input data faster.
> B. Because combiners perform local aggregation of word counts, thereby
> reducing the number of mappers that need to run.
> C. Because combiners perform local aggregation of word counts, and then
> transfer that data to reducers without writing the intermediate data to
> disk.
> D. Because combiners perform local aggregation of word counts, thereby
> reducing the number of key-value pairs that need to be shuffled across the
> network to the reducers.
>
> My Answer : C
> Answer Given in site : A
>
> *******************************************************************************
> You need to create a GUI application to help your company's sales people add
> and edit customer information. Would HDFS be appropriate for this customer
> information file?
> A. Yes, because HDFS is optimized for random access writes.
> B. Yes, because HDFS is optimized for fast retrieval of relatively small
> amounts of data.
> C. No, because HDFS can only be accessed by MapReduce applications.
> D. No, because HDFS is optimized for write-once, streaming access for
> relatively large files.
>
> My Answer : D
> Answer Given in site : A
>
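On the combiner question: Hadoop has no separate Combiner interface; a combiner is just a Reducer whose input and output types match, so C looks right to me. That also bears on the word-count question: the combiner pre-aggregates counts on the map side so far fewer key-value pairs have to be shuffled across the network. A minimal sketch against the older org.apache.hadoop.mapred API (the class name is my own):

import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

// Hypothetical word-count combiner: sums the partial counts emitted by one
// map task before they are shuffled to the reducers.
public class WordCountCombiner extends MapReduceBase
    implements Reducer<Text, IntWritable, Text, IntWritable> {

  @Override
  public void reduce(Text key, Iterator<IntWritable> values,
                     OutputCollector<Text, IntWritable> output,
                     Reporter reporter) throws IOException {
    int sum = 0;
    while (values.hasNext()) {
      sum += values.next().get();   // local aggregation on the map side
    }
    output.collect(key, new IntWritable(sum));
  }
}

With the newer org.apache.hadoop.mapreduce API you would extend the Reducer class instead; either way the class is registered on the job with setCombinerClass(...).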
> *******************************************************************************
> You need to create a job that does frequency analysis on input data. You
> will do this by writing a Mapper that uses TextInputFormat and splits each
> value (a line of text from an input file) into individual characters. For
> each one of these characters, you will emit the character as a key and an
> IntWritable as the value. Since this will produce proportionally more
> intermediate data than input data, which resources could you expect to be
> likely bottlenecks?
> A. Processor and RAM
> B. Processor and disk I/O
> C. Disk I/O and network I/O
> D. Processor and network I/O
>
> My Answer : D
> Answer Given in site : B
>
> *******************************************************************************
> Which of the following statements best describes how a large (100 GB) file
> is stored in HDFS?
> A. The file is divided into variable-size blocks, which are stored on
> multiple datanodes. Each block is replicated three times by default.
> B. The file is replicated three times by default. Each copy of the file is
> stored on a separate datanode.
> C. The master copy of the file is stored on a single datanode. The replica
> copies are divided into fixed-size blocks, which are stored on multiple
> datanodes.
> D. The file is divided into fixed-size blocks, which are stored on multiple
> datanodes. Each block is replicated three times by default. Multiple blocks
> from the same file might reside on the same datanode.
> E. The file is divided into fixed-size blocks, which are stored on multiple
> datanodes. Each block is replicated three times by default. HDFS guarantees
> that different blocks from the same file are never on the same datanode.
>
> My Answer : D
> Answer Given in site : B
>
> *******************************************************************************
>
> regards,
> Rams
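And on the zero-versus-one-reducer questions: the reducer count is just a job setting. With zero reducers the job is map-only and each map task writes its output straight to HDFS, so you get one output file per map task; with one reducer all map output is shuffled to a single reduce task and ends up in a single output file. A minimal driver sketch using the newer org.apache.hadoop.mapreduce API (class and job names are placeholders; it relies on the default identity Mapper and Reducer):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Placeholder driver that only demonstrates the reducer-count setting.
public class ReducerCountDemo {
  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "reducer-count-demo");
    job.setJarByClass(ReducerCountDemo.class);

    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputKeyClass(LongWritable.class);   // byte-offset keys from TextInputFormat
    job.setOutputValueClass(Text.class);         // the line of text as the value

    // 0 => map-only job: each map task writes directly to HDFS, giving as
    //      many output files as there were map tasks.
    // 1 => a single reduce task: all map output is gathered into one file
    //      (part-r-00000) in the output directory.
    job.setNumReduceTasks(1);

    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Setting the count to one is perfectly valid; it just funnels the whole reduce phase through a single task.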