RE: Appending to HDFS file

Liu, Yi A Wed, 27 Aug 2014 23:36:35 -0700

Right, please use FileSystem#append

From: Stanley Shi [mailto:[email protected]]
Sent: Thursday, August 28, 2014 2:18 PM
To: [email protected]
Subject: Re: Appending to HDFS file


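You should not use this method, because with overwrite set to true every call
replaces the existing file, which is why you see only the last write: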
FSDataOutputStream fp = fs.create(pt, true)

Here's the Javadoc for this "create" method:

  /**
   * Create an FSDataOutputStream at the indicated Path.
   * @param f the file to create
   * @param overwrite if a file with this name already exists, then if true,
   *   the file will be overwritten, and if false an exception will be thrown.
   */
  public FSDataOutputStream create(Path f, boolean overwrite)
      throws IOException {
    return create(f, overwrite,
                  getConf().getInt("io.file.buffer.size", 4096),
                  getDefaultReplication(f),
                  getDefaultBlockSize(f));
  }
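
To make the effect of the overwrite flag concrete, here is a minimal sketch. It
is only an analogy: it uses the local filesystem through java.nio.file instead
of HDFS so that it runs without a cluster, and the class and method names are
hypothetical. On HDFS the corresponding choice would be between
fs.create(pt, true) on every iteration and FileSystem#append once the file
exists.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

public class AppendVsOverwrite {

    // Local analogue of create(path, overwrite=true): truncates the file on every call.
    static void writeWithOverwrite(Path file, String line) throws IOException {
        Files.write(file, (line + "\n").getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE,
                StandardOpenOption.TRUNCATE_EXISTING,
                StandardOpenOption.WRITE);
    }

    // Local analogue of append: existing content is kept, new bytes go to the end.
    static void writeWithAppend(Path file, String line) throws IOException {
        Files.write(file, (line + "\n").getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE,
                StandardOpenOption.APPEND);
    }

    // Writes three lines in a loop with the chosen mode and returns what is left in the file.
    static List<String> run(boolean append) throws IOException {
        Path file = Files.createTempFile("append-demo", ".txt");
        try {
            for (int i = 1; i <= 3; i++) {
                String line = "key" + i + " value" + i;
                if (append) {
                    writeWithAppend(file, line);
                } else {
                    writeWithOverwrite(file, line);
                }
            }
            return Files.readAllLines(file, StandardCharsets.UTF_8);
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("overwrite: " + run(false)); // only the last line survives
        System.out.println("append:    " + run(true));  // all three lines survive
    }
}
```

With the overwrite mode, only the line from the last loop iteration remains,
which matches the symptom reported in this thread.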

On Wed, Aug 27, 2014 at 2:12 PM, rab ra <[email protected]> wrote:

Hello,

Here is the code snippet I use to append:

def outFile = "${outputFile}.txt"

Path pt = new Path("${hdfsName}/${dir}/${outFile}")

def fs = org.apache.hadoop.fs.FileSystem.get(configuration);

FSDataOutputStream fp = fs.create(pt, true)

fp << "${key} ${value}\n"
On 27 Aug 2014 09:46, "Stanley Shi" <[email protected]> wrote:
Would you please paste the code in the loop?

On Sat, Aug 23, 2014 at 2:47 PM, rab ra <[email protected]> wrote:

Hi

By default, dfs.support.append is true in Hadoop 2.4.1. Nevertheless, I have
set it to true explicitly in hdfs-site.xml. Still, I am not able to append.

Regards
On 23 Aug 2014 11:20, "Jagat Singh" <[email protected]> wrote:
What is the value of dfs.support.append in hdfs-site.xml?

https://hadoop.apache.org/docs/r2.3.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml



On Sat, Aug 23, 2014 at 1:41 AM, rab ra <[email protected]> wrote:
Hello,

I am currently using Hadoop 2.4.1. I am running an MR job using the Hadoop
Streaming utility.

The executable needs to write a large amount of information to a file. However,
this write is not done in a single attempt; the file needs to be appended with
each stream of information as it is generated.

In the code, inside a loop, I open a file in HDFS and append some information.
This is not working, and I see only the last write.

How do I accomplish an append operation in Hadoop? Can anyone share a pointer?




regards
Bala




--
Regards,
Stanley Shi,



