[ http://issues.apache.org/jira/browse/HADOOP-212?page=comments#action_12383319 ]
alan wootton commented on HADOOP-212:
-------------------------------------

Ok, I get it now. Even though it's currently impossible for any block, except the last block of a file, to be anything other than 32 MB, it looks like the system would support it. We need to remove all references to BLOCK_SIZE.

I see some problems. FSDataset doesn't know which file it's working with; it always uses BLOCK_SIZE. DFSClient.DFSOutputStream.write() has the same problem.

I'll vote yes.

> allow changes to dfs block size
> -------------------------------
>
>          Key: HADOOP-212
>          URL: http://issues.apache.org/jira/browse/HADOOP-212
>      Project: Hadoop
>         Type: Improvement
>   Components: dfs
>     Versions: 0.2
>     Reporter: Owen O'Malley
>     Assignee: Owen O'Malley
>     Priority: Critical
>      Fix For: 0.3
>
> Trying to change the DFS block size led to the realization that the value
> 32,000,000 was hard-coded into the source code. I propose:
> 1. Change the default block size to 64 * 1024 * 1024.
> 2. Add the config variable dfs.block.size that sets the default block size.
> 3. Add a parameter to the FileSystem, DFSClient, and ClientProtocol create
>    methods that lets the user control the block size.
> 4. Rename FileSystem.getBlockSize to getDefaultBlockSize.
> 5. Add a new method FileSystem.getBlockSize that takes a pathname.
> 6. Use long for the block size in the API, which is what was used before.
>    However, the implementation will not work if the block size is set bigger
>    than 2**31.
> 7. Have InputFormatBase use the block size of each file to determine the
>    split size.
> Thoughts?

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira
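
A rough sketch of how items 1 through 6 of the proposal could hang together is below. The class shape and method names are illustrative only, assumed for the sake of the example; this is not the actual FileSystem, DFSClient, or ClientProtocol code.

    // Hypothetical sketch of the proposed API shape (items 1-6); not the real
    // Hadoop interfaces, just an illustration of where the block size would flow.
    import java.io.IOException;
    import java.io.OutputStream;

    public abstract class SketchFileSystem {

        // Item 4: the old getBlockSize() becomes getDefaultBlockSize(), fed by
        // the dfs.block.size config variable (item 2), defaulting to 64 MB (item 1).
        public long getDefaultBlockSize() {
            return 64L * 1024 * 1024;
        }

        // Item 5: per-file block size, since each file may have been written
        // with a different block size than the current default.
        public abstract long getBlockSize(String path) throws IOException;

        // Item 3: create() gains an explicit block-size parameter, typed as long
        // per item 6 (even though the current implementation would break above 2**31).
        public abstract OutputStream create(String path,
                                            boolean overwrite,
                                            long blockSize) throws IOException;

        // Convenience overload that preserves the old behavior by falling back
        // to the configured default.
        public OutputStream create(String path, boolean overwrite) throws IOException {
            return create(path, overwrite, getDefaultBlockSize());
        }
    }

With a shape like this, the concern in the comment above becomes concrete: once the block size travels with each create() call instead of a global BLOCK_SIZE constant, the write path (e.g. DFSClient.DFSOutputStream.write()) and the datanode-side block storage have to carry that per-file value rather than consult the constant.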
