It is defined in hadoop-config.sh.
On Fri, Mar 28, 2014 at 1:19 PM, divye sheth divs.sh...@gmail.com wrote:
Which version of Hadoop are you using? AFAIK the Hadoop mapred home is the
directory where Hadoop is installed, or in other words, where it was untarred.
Thanks
Divye Sheth
On Mar 28, 2014 10:43
Try adding the hadoop bin path to system path.
-Rahul Singh
Can we execute the above command anywhere, or do I need to execute it in any
particular directory?
Thanks
On Thu, Mar 27, 2014 at 11:41 PM, divye sheth divs.sh...@gmail.com wrote:
I believe you are using Hadoop 2. In order to get mapred working, you
need to set the HADOOP_MAPRED_HOME path.
You can execute this command on any machine where you have set the
HADOOP_MAPRED_HOME
Thanks
Divye Sheth
I am not getting where to set HADOOP_MAPRED_HOME and how to set it.
thanks
Yes, use
http://hadoop.apache.org/docs/stable2/api/org/apache/hadoop/fs/FileSystem.html#getFileBlockLocations(org.apache.hadoop.fs.Path, long, long)
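A minimal sketch of using that API (the path is hypothetical, and hadoop-client must be on the classpath):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockLocations {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path file = new Path("/user/example/data.txt"); // hypothetical path
    FileStatus status = fs.getFileStatus(file);
    // One BlockLocation per block in the requested byte range
    BlockLocation[] blocks =
        fs.getFileBlockLocations(status, 0, status.getLen());
    for (BlockLocation b : blocks) {
      System.out.println(b.getOffset() + " " + String.join(",", b.getHosts()));
    }
  }
}
```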
On Fri, Mar 28, 2014 at 7:33 AM, Libo Yu yu_l...@hotmail.com wrote:
Hi all,
hadoop fsck path -files -blocks -locations can list locations for all
Please also indicate your exact Hadoop version in use.
On Fri, Mar 28, 2014 at 9:04 AM, haihong lu ung3...@gmail.com wrote:
Dear all,
I had a problem today: when I executed the command mapred job
-list on a slave, an error came out. The message is shown below:
14/03/28 11:18:47
Hi Avinash,
You can execute the export command on any one machine in the cluster, as of
now. Once you have executed it, i.e. export
HADOOP_MAPRED_HOME=/path/to/your/hadoop/installation, you can then execute
the mapred job -list command from that very same machine.
Thanks
Divye Sheth
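A concrete sketch of the two steps; the install path below is only an example, substitute your own:

```shell
# Point HADOOP_MAPRED_HOME at the directory where Hadoop was untarred
# (the path below is hypothetical).
export HADOOP_MAPRED_HOME=/usr/local/hadoop

# From the same shell, the job listing should now work:
if command -v mapred >/dev/null 2>&1; then
  mapred job -list
else
  echo "mapred not on PATH; add $HADOOP_MAPRED_HOME/bin to PATH"
fi
```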
There is a big chance that your map output is being copied to your
reducer; this could take quite some time if you have a lot of data, and
could be resolved by:
1) having more reducers
2) adjusting the slowstart parameter so that the copying can start while the
map tasks are still running
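For (2), assuming Hadoop 2 / MRv2, the knob is mapreduce.job.reduce.slowstart.completedmaps in mapred-site.xml; the 0.50 below is just an example value (delay the copy until half the maps are done):

```xml
<!-- mapred-site.xml: start the reduce-side copy once 50% of maps finish -->
<property>
  <name>mapreduce.job.reduce.slowstart.completedmaps</name>
  <value>0.50</value>
</property>
```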
Regards,
I have a program that does a map-reduce job and then reads the result
of the job.
I learned that HDFS is not strongly consistent. When is it safe to read the result?
As long as output/_SUCCESS exists?
_SUCCESS implies that the job has successfully terminated, so this seems like
a reasonable criterion.
Regards, Dieter
Thanks. Is the following code safe?
int exitCode = ToolRunner.run(new MyApp(), args);
if (exitCode == 0) {
    // safe to read result
}
How to run the data node block scanner on a data node in a cluster from a
remote machine?
By default the data node executes the block scanner every 504 hours. This is the
default value of dfs.datanode.scan.period. If I want to run the data
node block scanner then one way is to configure the property of
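For reference, the full property name is dfs.datanode.scan.period.hours (the value is in hours); a hypothetical hdfs-site.xml entry that shortens the period from 504 hours (three weeks) to one week:

```xml
<!-- hdfs-site.xml: run the block scanner weekly instead of every 504h -->
<property>
  <name>dfs.datanode.scan.period.hours</name>
  <value>168</value>
</property>
```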
To ensure data I/O integrity, Hadoop uses a CRC-32 mechanism to generate
checksums for the data stored on HDFS. But suppose I have a data node machine
that does not have ECC (error-correcting code) memory. Will Hadoop
HDFS be able to generate checksums for data blocks when
Hello Reena,
No, there isn't a programmatic way to invoke the block scanner. Note,
though, that the property controlling its period is DN-local, so you can
change it on DNs and do a DN rolling restart to make it take effect
without requiring an HDFS downtime.
On Fri, Mar 28, 2014 at 3:07 PM, reena
While the HDFS functionality of computing, storing and validating
checksums for block files does not specifically _require_ ECC, you do
_want_ ECC to avoid frequent checksum failures.
This is noted in Tom's book as well, in the chapter that discusses
setting up your own cluster:
ECC memory is
I was going through this link
http://stackoverflow.com/questions/9406477/data-integrity-in-hdfs-which-data-nodes-verifies-the-checksum
. It's written that in recent versions of Hadoop only the last data node
verifies the checksum, as the write happens in a pipeline fashion.
Now I have a question:
Hi,
How can I be the assignee for a particular issue?
I can't see any option for becoming the assignee on the page.
Thanks.
no doubt
On Mar 23, 2014, at 17:37, Fengyun RAO raofeng...@gmail.com wrote:
What does this exception mean? I googled a lot; all the results tell me it's
because the time is not synchronized between the datanode and namenode.
However, I checked all the servers, and the
Hey,
I looked at replication in HDFS; the filesystem is master x slave.
Is there any way to do master x master?
I just have 1 TB of files on a server and I want to replicate them to another
server, with real-time sync.
Thanks !
Hi All,
I have created a wiki on github:
https://github.com/ercoppa/HadoopDiagrams/wiki
This is an effort to provide updated documentation of how the internals
of Hadoop work. The main idea is to help the user understand the big
picture without removing too many internal details. You can
If you're spitballing options, you might also look at Pattern:
http://www.cascading.org/projects/pattern/
It has some nuances, so be sure to spend the time to vet your specific use case
(i.e. what you're actually doing in R and what you want to accomplish
leveraging data in Hadoop).
From: Sri
Do you mean replication between two different Hadoop clusters, or do you just need
data to be replicated between two different nodes?
What is your compression format: gzip, LZO, or snappy?
For LZO, the final output:
FileOutputFormat.setCompressOutput(conf, true);
FileOutputFormat.setOutputCompressorClass(conf, LzoCodec.class);
In addition, to make LZO splittable, you need to make an LZO index file.
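The index can be built with the indexer shipped in hadoop-lzo; a sketch only, since the jar location and output path vary per installation:

```shell
# Path to the hadoop-lzo jar varies by distribution -- adjust as needed.
LZO_JAR=/usr/lib/hadoop/lib/hadoop-lzo.jar
OUT=/user/example/output   # hypothetical job output directory on HDFS

if command -v hadoop >/dev/null 2>&1 && [ -f "$LZO_JAR" ]; then
  # Builds .index files next to each .lzo so splits can start mid-file
  hadoop jar "$LZO_JAR" com.hadoop.compression.lzo.DistributedLzoIndexer "$OUT"
else
  echo "skipped: hadoop or hadoop-lzo not available here"
fi
```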
On Thu, Mar 27, 2014 at 8:57 PM,
Have you looked into the FileSystem API? This is Hadoop v2.2.0:
http://hadoop.apache.org/docs/r2.2.0/api/org/apache/hadoop/fs/FileSystem.html
It does not exist in
http://hadoop.apache.org/docs/r1.2.0/api/org/apache/hadoop/fs/FileSystem.html
How about adding
ipc.client.connect.max.retries.on.timeouts
(default is 45)? It indicates the number of retries a client will make on
socket timeout to establish a server connection.
Does that help?
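For reference, a core-site.xml sketch of that override (the value 90 is just an example above the default of 45):

```xml
<!-- core-site.xml: raise retries-on-timeout above the default of 45 -->
<property>
  <name>ipc.client.connect.max.retries.on.timeouts</name>
  <value>90</value>
</property>
```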
On Thu, Mar 27, 2014 at 4:23 PM, John Lilley john.lil...@redpoint.net wrote:
It seems to take a
Hi Avin,
You should be added as a sub-project contributor; then you can be an
assignee. You can find how to become a contributor on the Wiki.
Very helpful indeed Emilio, thanks!
If the job completes without any failures, exitCode should be 0 and it is safe
to read the result.
public class MyApp extends Configured implements Tool {
  public int run(String[] args) throws Exception {
    // Configuration processed by ToolRunner
    Configuration conf = getConf();
    // ... set up and submit the job here ...
    return 0;
  }
}
Hi Victor,
if by replication you mean copying from one cluster to another, you can use the
distcp command.
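A sketch of that (the cluster URIs are hypothetical); distcp -update re-copies only changed files, so a scheduled run approximates sync, though it is not real-time:

```shell
# Hypothetical NameNode URIs -- substitute your own clusters.
SRC=hdfs://clusterA:8020/data
DST=hdfs://clusterB:8020/backup/data

if command -v hadoop >/dev/null 2>&1; then
  # -update copies only files that changed since the last run
  hadoop distcp -update "$SRC" "$DST"
else
  echo "hadoop CLI not found; run from a node of either cluster"
fi
```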
Cheers.
Hi Reena,
the pipeline is per block. If you have half of your file on data node A only,
that means the pipeline had only one node (node A in this case, probably
because the replication factor is set to 1), and data node A has the checksums
for its block. The same applies to data node B.
Hello experts,
I am really new to Hadoop. Is it possible, from a Pig or Hive
query, to find out the under-the-hood map-reduce algorithm?
thanks
You can use the ILLUSTRATE and EXPLAIN commands to see the execution plan, if
that is what you mean by 'under the hood algorithm':
http://pig.apache.org/docs/r0.11.1/test.html
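A quick way to try EXPLAIN, sketched with a throwaway script (the pig invocation assumes a local Pig install; the script contents are only an example):

```shell
# Write a tiny Pig script and ask Pig to EXPLAIN it (prints plans, no run).
cat > /tmp/explain_demo.pig <<'EOF'
A = LOAD 'input.txt' AS (line:chararray);
B = GROUP A ALL;
C = FOREACH B GENERATE COUNT(A);
EXPLAIN C;
EOF

if command -v pig >/dev/null 2>&1; then
  pig -x local /tmp/explain_demo.pig
else
  echo "pig not installed; wrote demo script to /tmp/explain_demo.pig"
fi
```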
Regards,
Shahab
Hi Max,
Not sure if you have already, but you might also want to look into
Apache Ambari [1] for provisioning, managing, and monitoring Hadoop
clusters.
Many have successfully deployed Hadoop clusters on EC2 using Ambari.
[1] http://ambari.apache.org/
Yusaku
On Fri, Mar 28, 2014 at 7:07 PM,