Right, please use FileSystem#append
From: Stanley Shi [mailto:[email protected]]
Sent: Thursday, August 28, 2014 2:18 PM
To: [email protected]
Subject: Re: Appending to HDFS file
You should not use this method, because with overwrite set to true it replaces
the existing file on every call:
FSDataOutputStream fp = fs.create(pt, true)
Here's the Javadoc for this "create" method:
  /**
   * Create an FSDataOutputStream at the indicated Path.
   * @param f the file to create
   * @param overwrite if a file with this name already exists, then if true,
   *   the file will be overwritten, and if false an exception will be thrown.
   */
  public FSDataOutputStream create(Path f, boolean overwrite)
      throws IOException {
    return create(f, overwrite,
                  getConf().getInt("io.file.buffer.size", 4096),
                  getDefaultReplication(f),
                  getDefaultBlockSize(f));
  }
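To illustrate why only the last write survives, here is a minimal sketch of the
same loop pattern on the local filesystem, using java.nio rather than the HDFS
API. The class name OverwriteVsAppend and the helper writeInLoop are made up for
illustration. The truncating branch plays the role of create(pt, true), and the
appending branch plays the role of FileSystem#append.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

// Hypothetical demo class: a local-filesystem analogy, not the Hadoop API.
class OverwriteVsAppend {

    // Writes three records in a loop, reopening the file on every iteration,
    // then returns the lines that are left in the file.
    static List<String> writeInLoop(boolean append) {
        try {
            Path file = Files.createTempFile("append-demo", ".txt");
            try {
                for (int i = 1; i <= 3; i++) {
                    byte[] record = ("key" + i + " value" + i + "\n")
                            .getBytes(StandardCharsets.UTF_8);
                    if (append) {
                        // Analogous to FileSystem#append: keep existing data.
                        Files.write(file, record,
                                StandardOpenOption.CREATE,
                                StandardOpenOption.APPEND);
                    } else {
                        // Analogous to create(pt, true): discard existing data.
                        Files.write(file, record,
                                StandardOpenOption.CREATE,
                                StandardOpenOption.TRUNCATE_EXISTING);
                    }
                }
                return Files.readAllLines(file, StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeInLoop(false)); // [key3 value3]
        System.out.println(writeInLoop(true));  // [key1 value1, key2 value2, key3 value3]
    }
}
```

On HDFS, the corresponding change would presumably be to call create only when
the file does not exist yet and FileSystem#append otherwise, assuming append is
enabled on the cluster.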
On Wed, Aug 27, 2014 at 2:12 PM, rab ra <[email protected]> wrote:
Hello,
Here is the code snippet I use to append:
def outFile = "${outputFile}.txt"
Path pt = new Path("${hdfsName}/${dir}/${outFile}")
def fs = org.apache.hadoop.fs.FileSystem.get(configuration);
FSDataOutputStream fp = fs.create(pt, true)
fp << "${key} ${value}\n"
On 27 Aug 2014 09:46, "Stanley Shi" <[email protected]> wrote:
Would you please paste the code in the loop?
On Sat, Aug 23, 2014 at 2:47 PM, rab ra <[email protected]> wrote:
Hi,
By default, dfs.support.append is true in Hadoop 2.4.1. Nevertheless, I have
set it to true explicitly in hdfs-site.xml. Still, I am not able to append.
Regards
On 23 Aug 2014 11:20, "Jagat Singh" <[email protected]> wrote:
What is the value of dfs.support.append in hdfs-site.xml?
https://hadoop.apache.org/docs/r2.3.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
On Sat, Aug 23, 2014 at 1:41 AM, rab ra <[email protected]> wrote:
Hello,
I am currently using Hadoop 2.4.1. I am running an MR job using the Hadoop
streaming utility.
The executable needs to write a large amount of information to a file. However,
this write is not done in a single attempt. The file needs to be appended with
the streams of information as they are generated.
In the code, inside a loop, I open a file in HDFS and append some information.
This is not working, and I see only the last write.
How do I accomplish an append operation in Hadoop? Can anyone share a pointer?
regards
Bala
--
Regards,
Stanley Shi,