Block sizes are set per-file, not fixed cluster-wide on HDFS. So create your files with a sufficiently large block size (2G is fine if it fits your use case), and they won't be split into multiple blocks, as you want.
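If you're writing the files from Java rather than the shell, you can also pass the block size directly to FileSystem.create(), per file. A rough sketch of what I mean (class name, paths, and the replication/buffer choices below are just placeholders, not something from your setup):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BigBlockWrite {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    // 2G block size, so a file of <= 2G stays in a single block.
    long blockSize = 2L * 1024 * 1024 * 1024;
    short replication = fs.getDefaultReplication();
    int bufferSize = conf.getInt("io.file.buffer.size", 4096);

    // Block size is a per-file argument to create(); no cluster-wide change needed.
    FSDataOutputStream out = fs.create(new Path("/remoteFile"), true,
        bufferSize, replication, blockSize);
    // ... write the file contents ...
    out.close();
  }
}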
For example, to upload a file via the shell with a tweaked block size, I'd do:

hadoop dfs -Ddfs.block.size=2147483648 -copyFromLocal localFile remoteFile

Packet sizes are not what you want to tweak here.

On Tue, Nov 8, 2011 at 1:02 PM, donal0412 <donal0...@gmail.com> wrote:
> Hi,
> I want to store lots of files in HDFS, the file size is <= 2G.
> I don't want the files to be split into blocks, because I need the whole file
> while processing it, and I don't want to transfer blocks to one node when
> processing it.
> An easy way to do this would be to set dfs.write.packet.size to 2G. I wonder if
> someone has similar experience or knows whether this is practicable.
> Will there be performance problems when setting the packet size to a big number?
>
> Thanks!
> donal

--
Harsh J