Thank you for the response. Actually it is not a single file: I have JSON files amounting to 115 GB in total, which need to be processed and loaded into HBase tables on the same cluster for later processing. Leaving aside the disk space required for HBase storage, if I reduce the replication factor to 3, how much more HDFS space will I need?
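The replication arithmetic behind this question can be sketched roughly as follows (a back-of-the-envelope estimate only, using the figures from this thread: 115 GB of JSON, 130 GB total HDFS capacity, and ignoring compression and HBase storage):

```python
def raw_hdfs_space_gb(data_gb: float, replication: int) -> float:
    """Raw HDFS space consumed = logical data size x replication factor."""
    return data_gb * replication

data_gb = 115      # total size of the JSON files
capacity_gb = 130  # total HDFS capacity across the 5 nodes

for repl in (5, 3):
    needed = raw_hdfs_space_gb(data_gb, repl)
    shortfall = needed - capacity_gb
    print(f"replication={repl}: need {needed:.0f} GB raw, "
          f"shortfall {shortfall:.0f} GB")
# replication=5: need 575 GB raw, shortfall 445 GB
# replication=3: need 345 GB raw, shortfall 215 GB
```

So even at replication 3, the raw data alone would consume 345 GB, roughly 215 GB more than the cluster currently has, before counting any intermediate or transformed output.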
Thank you,

On Fri, Jan 11, 2013 at 4:16 AM, Ravi Mutyala <[email protected]> wrote:

> If the file is a txt file, you could get a good compression ratio.
> Changing the replication to 3, the file will fit. But I'm not sure what
> your use case is or what you want to achieve by putting this data there.
> Any transformation on this data and you would need more space to save
> the transformed data.
>
> If you have 5 nodes and they are not virtual machines, you should
> consider adding more hard disks to your cluster.
>
>
> On Thu, Jan 10, 2013 at 9:02 PM, Panshul Whisper <[email protected]> wrote:
>
>> Hello,
>>
>> I have a Hadoop cluster of 5 nodes with a total available HDFS space
>> of 130 GB, with replication set to 5.
>> I have a file of 115 GB, which needs to be copied to HDFS and
>> processed.
>> Do I need any more HDFS space to perform all processing without
>> running into problems, or is this space sufficient?
>>
>> --
>> Regards,
>> Ouch Whisper
>> 010101010101

--
Regards,
Ouch Whisper
010101010101
