If there is a hard requirement that each input split be a single block, you
could just make your input splits fit a smaller block size.

Just saying, in case you can't overcome the 2 GB ceiling.
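
For instance, a minimal sketch of writing the input with 1 GB blocks (the
path and replication factor here are made up; the five-argument
FileSystem.create overload takes an explicit block size):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SmallerBlocks {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // 1 GB blocks keep each block-sized split well below the 2 GB int limit
        long blockSize = 1024L * 1024 * 1024;
        FSDataOutputStream out = fs.create(new Path("/data/input.dat"),
                true, 4096, (short) 3, blockSize);
        // ... write the input data ...
        out.close();
    }
}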

J

Sent from my mobile. Please excuse the typos.

On 2010-10-18, at 5:08 PM, "elton sky" <[email protected]> wrote:

>> Why would you want to use a block size of > 2GB?
> For keeping a map's input split in a single block.
> 
> On Tue, Oct 19, 2010 at 9:07 AM, Michael Segel 
> <[email protected]>wrote:
> 
>> 
>> Ok, I'll bite.
>> Why would you want to use a block size of > 2GB?
>> 
>> 
>> 
>>> Date: Mon, 18 Oct 2010 21:33:34 +1100
>>> Subject: BUG: Anyone use block size more than 2GB before?
>>> From: [email protected]
>>> To: [email protected]
>>> 
>>> Hello,
>>> 
>>> In org.apache.hadoop.hdfs.DFSClient.DFSOutputStream.writeChunk(byte[] b,
>>> int offset, int len, byte[] checksum),
>>> the second-to-last line is:
>>> 
>>> int psize = Math.min((int)(blockSize-bytesCurBlock), writePacketSize);
>>> 
>>> When I use a blockSize bigger than 2 GB, which is beyond the range of a
>>> 32-bit int, something weird happens. For example, for a 3 GB block it
>>> creates more than 2 million packets.
>>> 
>>> Anyone noticed this before?
>>> 
>>> Elton
>> 
>> 
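
For anyone following along, a minimal standalone sketch (not the DFSClient
source itself) of the truncation Elton describes: casting a long above 2 GB
to int flips the sign, so psize comes out negative.

public class PacketSizeOverflow {
    public static void main(String[] args) {
        long blockSize = 3L * 1024 * 1024 * 1024; // 3 GB
        long bytesCurBlock = 0L;                  // at the start of the block
        int writePacketSize = 64 * 1024;          // 64 KB, the usual default

        // (int) 3221225472L truncates to -1073741824
        int psize = Math.min((int) (blockSize - bytesCurBlock), writePacketSize);
        System.out.println("psize = " + psize);   // prints a negative size
    }
}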
