Hello, In one of my use cases, I generate a large number of sequence files. Each file stores a set of key/value pairs: the key is a string, and the value is a list of float values. I know how many float values I am storing, and from that I estimate each file to be approximately 700 KB. However, the size reported in HDFS is much smaller, around 20 KB. I am not using any compression when writing the sequence files. Any clue as to why?
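
The size estimate described above is simple arithmetic. Here is a minimal sketch of it in Python. The record count, floats per record, and key names are hypothetical (the real figures are not stated here), and it assumes each float is stored as a 4-byte single-precision value. It counts only the raw key and value bytes, not any per-record or header overhead added by the file format.

```python
import struct

# Size of one IEEE 754 single-precision float when packed (assumed storage width).
FLOAT_BYTES = struct.calcsize(">f")


def estimate_payload_bytes(records):
    """Return the raw size in bytes of all keys and float values.

    Ignores container overhead such as headers, length prefixes, and sync markers.
    """
    total = 0
    for key, values in records:
        total += len(key.encode("utf-8"))   # key stored as UTF-8 text
        total += FLOAT_BYTES * len(values)  # each float takes FLOAT_BYTES
    return total


# Hypothetical data: 100 keys, each with 1,750 floats.
records = [(f"key-{i:03d}", [0.0] * 1750) for i in range(100)]
print(estimate_payload_bytes(records))
```

If the on-disk size is far below a figure computed this way, it may be worth checking whether every record was actually written and whether the writer was closed before the size was read.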
Regards, rab
