Thanks Mohammad,

I read through the documents, and they are very informative. After reading them, I have one more very basic conceptual question: what are the differences and the relationship between HDFS and the various file formats, such as sequence files (and map files, which are based on them) and HAR files?

I think the answer is that HDFS is the underlying file system. We can upload raw binary files to HDFS directly (without using sequence files, HAR files, etc.), and we can also use specially designed file formats stored on top of HDFS, such as sequence files (and map files based on them) and HAR files. Is that understanding correct?

Thanks in advance.
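A minimal sketch of the layering described above, assuming a toy length-prefixed record layout. This is not the real SequenceFile or HAR on-disk format, and the names `write_records` and `read_records` are hypothetical helpers invented for illustration. It only shows that a container format is itself a stream of bytes that the underlying file system stores the same way it stores any raw binary file.

```python
import io
import struct

HEADER = ">II"  # assumed toy layout: big-endian key length, value length


def write_records(stream, records):
    """Append (key, value) byte pairs as length-prefixed records."""
    for key, value in records:
        stream.write(struct.pack(HEADER, len(key), len(value)))
        stream.write(key)
        stream.write(value)


def read_records(stream):
    """Yield (key, value) pairs until the stream is exhausted."""
    header_size = struct.calcsize(HEADER)
    while True:
        header = stream.read(header_size)
        if not header:
            return
        if len(header) < header_size:
            raise ValueError("truncated record header")
        key_len, value_len = struct.unpack(HEADER, header)
        key = stream.read(key_len)
        value = stream.read(value_len)
        if len(key) != key_len or len(value) != value_len:
            raise ValueError("truncated record body")
        yield key, value


# Case 1: a raw file is just bytes; the file system does not interpret them.
raw_blob = b"\x00\x01 any binary payload"

# Case 2: a container packs many small "files" into one byte stream.
small_files = [(b"a.log", b"first"), (b"b.log", b"second")]
container = io.BytesIO()
write_records(container, small_files)
container_blob = container.getvalue()

# Only a reader that knows the format can recover the individual records.
restored = list(read_records(io.BytesIO(container_blob)))
print(restored)
```

In this picture the file system only ever sees `raw_blob` and `container_blob` as opaque bytes. The structure (keys, values, record boundaries) exists only for code that knows the format, which is the relationship the question is asking about.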
regards,
Lin

On Wed, Nov 28, 2012 at 9:30 PM, Mohammad Tariq <[email protected]> wrote:

> Good pointer by Dyuti. For an explanation on Sequence and HAR files you
> can visit another great post on Cloudera's blog section here :
> http://blog.cloudera.com/blog/2009/02/the-small-files-problem/
>
> Regards,
> Mohammad Tariq
>
> On Wed, Nov 28, 2012 at 6:52 PM, dyuti a <[email protected]> wrote:
>
>> Hi Lin,
>> check this link too
>> http://blog.cloudera.com/blog/2011/01/hadoop-io-sequence-map-set-array-bloommap-files/
>>
>> Hope it helps!
>> dti
>>
>> On Wed, Nov 28, 2012 at 6:42 PM, Lin Ma <[email protected]> wrote:
>>
>>> Thanks Mohammad,
>>>
>>> I searched Hadoop file format, but only find sequence file format, so it
>>> is why I have the confusion.
>>>
>>> 1. Are these file formats built on top of sequence file format?
>>> 2. Appreciate if you could kindly point me to the official
>>> documentation for the file formats.
>>>
>>> regards,
>>> Lin
>>>
>>> On Wed, Nov 28, 2012 at 9:06 PM, Mohammad Tariq <[email protected]> wrote:
>>>
>>>> Hello Lin,
>>>>
>>>> Along with that, Hadoop MapFiles, SetFiles, IFiles , HAR files.
>>>> But each has its own significance and used under different scenarios.
>>>>
>>>> Regards,
>>>> Mohammad Tariq
>>>>
>>>> On Wed, Nov 28, 2012 at 6:29 PM, Lin Ma <[email protected]> wrote:
>>>>
>>>>> Sorry I miss a question mark. I should say "are there any other
>>>>> built-in file format supported by Hadoop?" :-)
>>>>>
>>>>> regards,
>>>>> Lin
>>>>>
>>>>> On Wed, Nov 28, 2012 at 8:58 PM, Lin Ma <[email protected]> wrote:
>>>>>
>>>>>> Hello everyone,
>>>>>>
>>>>>> I have a very basic question. Besides sequence file format (
>>>>>> http://hadoop.apache.org/docs/current/api/org/apache/hadoop/io/SequenceFile.html),
>>>>>> are there any other built-in file format supported by Hadoop?
>>>>>>
>>>>>> thanks in advance,
>>>>>> Lin
