I won't be in a position to fix that depending on HDFS-1804, as we are
upgrading to CDH4 in the coming month. I just wanted a short-term solution. I
have read somewhere that manual movement of the blocks would help. Could
someone guide me to the exact steps or precautions I should take while
doing
Hello All,
Can anyone please explain what we mean by streaming data access in HDFS?
Data is usually copied to HDFS, and in HDFS the data is split across
DataNodes in blocks.
Say, for example, I have an input file of 10240 MB (10 GB) in size and a block
size of 64 MB. Then there will be 160
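The arithmetic behind that block count can be sketched in a few lines (the 10240 MB file and 64 MB block size are from the example above; the rounding up reflects that a trailing partial block still occupies a block of its own):

```python
# Compute how many HDFS blocks a file occupies, rounding up so a
# trailing partial block still counts as one block.
import math

def block_count(file_mb: int, block_mb: int) -> int:
    return math.ceil(file_mb / block_mb)

print(block_count(10240, 64))  # → 160
```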
Streaming means processing it as it is coming into HDFS. As for where this is
in Hadoop: Hadoop Streaming enables Hadoop to receive data using executables
of different types.
I hope you have already read this:
http://hadoop.apache.org/docs/r0.18.1/streaming.html#Hadoop+Streaming
Warm Regards ∞
Shashwat
I am getting this error while cloning the Hadoop trunk code from
git.apache.org using the terminal.
The error is:
[cloudera@localhost ~]$ git clone git://git.apache.org/hadoop-common.git hadoop
Initialized empty Git repository in /home/cloudera/hadoop/.git/
fatal: Unable to look up git.apache.org (port
Hi all,
Here is a problem that confuses me.
When I use Java code to manipulate pseudo-distributed Hadoop, it throws an
exception:
java.io.IOException: Failed on local exception: java.io.EOFException; Host
Details : local host is: localhost/127.0.0.1; destination host is:
localhost:9000;
I have
Try this: git clone https://github.com/apache/hadoop-common.git hadoop
On Wed, Mar 5, 2014 at 1:58 PM, Avinash Kujur avin...@gmail.com wrote:
I am getting this error while cloning the Hadoop trunk code from
git.apache.org using the terminal.
The error is:
[cloudera@localhost ~]$ git clone
Hi Shashwat,
This is an excerpt from Hadoop: The Definitive Guide by Tom White:
Hadoop Streaming
Hadoop provides an API to MapReduce that allows you to write your map and reduce
functions in languages other than Java. Hadoop Streaming uses Unix standard
streams
as the interface between Hadoop and
Which version of Hadoop are you using?
This looks similar to your error log:
http://stackoverflow.com/questions/19895969/can-access-hadoop-fs-through-shell-but-not-through-java-main
Regards,
Stanley Shi
On Wed, Mar 5, 2014 at 4:29 PM, 张超 chao.zh...@dianping.com wrote:
Hi all,
Are you asking why data reads/writes from/to HDFS blocks via the MapReduce
framework are done in a streaming manner?
On Wed, Mar 5, 2014 at 2:05 PM, Radhe Radhe radhe.krishna.ra...@live.comwrote:
Hi Shashwat,
This is an excerpt from Hadoop: The Definitive Guide by Tom White:
Hadoop Streaming
Hadoop
You can write a simple tool to move blocks peer to peer. I had such a tool
before, but I cannot find it now.
Background: our cluster is not balanced and the balancer is very slow, so I
wrote this tool to move blocks from one node to another.
On Wed, Mar 5, 2014 at 4:06 PM, divye sheth
Hi Nitin,
I believe Hadoop Streaming is different from Streaming Data Access in HDFS.
We usually copy the data into HDFS, and then the MR application reads the data
through Map and Reduce tasks.
I need to be clear about WHAT is done and HOW in streaming data access in HDFS.
Thanks,
RR
Hadoop Streaming allows you to create and run Map/Reduce jobs with any
executable or script as the mapper and/or the reducer. In other words, you
do not need to learn Java programming to write a simple MapReduce
program.
Whereas streaming data access in HDFS is totally different. When
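To make the first point concrete, here is a minimal sketch of a streaming mapper, a hypothetical word-count mapper in Python (the script itself and the job invocation below are illustrative, not taken from this thread):

```python
#!/usr/bin/env python3
# Hypothetical Hadoop Streaming mapper: reads input records from stdin
# and emits tab-separated "word<TAB>1" pairs on stdout, word-count style.
import sys

def map_words(lines):
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"

if __name__ == "__main__":
    for pair in map_words(sys.stdin):
        print(pair)
```

Such a script would be submitted with the streaming jar, roughly: hadoop jar hadoop-streaming-*.jar -input in -output out -mapper mapper.py (the jar path varies by distribution).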
Does this require any downtime? I guess it should. Are there any other
precautions I should take?
Thanks Azuryy.
On Wed, Mar 5, 2014 at 2:19 PM, Azuryy Yu azury...@gmail.com wrote:
You can write a simple tool to move blocks peer to peer. I had such a tool
before, but I cannot find it now.
After downloading 150 MB it gave this error:
error: RPC failed; result=18, HTTP code = 200 | 47 MiB | 24 KiB/s
I did not get what it means.
On Wed, Mar 5, 2014 at 12:30 AM, Nitin Pawar nitinpawar...@gmail.comwrote:
Try this: git clone https://github.com/apache/hadoop-common.git hadoop
On Wed, Mar
I think you are having issues because of a slow network.
I would suggest you check out the source code from Apache SVN, e.g.:
svn co
http://svn.apache.org/repos/asf/hadoop/common/branches/branch-2.4
You can check out from whichever branch you want to work on.
On Wed, Mar 5, 2014 at 2:48 PM, Avinash Kujur
It looks like your network is unstable. You may consider downloading it as a
zip from GitHub if you just want a copy of the source code. Try this link:
https://github.com/apache/hadoop-common/archive/trunk.zip
On Wed, Mar 5, 2014 at 5:18 PM, Avinash Kujur avin...@gmail.com wrote:
after
It doesn't need any downtime. It works just like the Balancer, but this tool
moves blocks peer to peer: you specify the source node and destination node,
then start it.
On Wed, Mar 5, 2014 at 5:12 PM, divye sheth divs.sh...@gmail.com wrote:
Does this require any downtime? I guess it should and any other
What is the code that you are trying to run?
When I am using this command:
mvn clean install -DskipTests -Pdist
it's giving this error:
[cloudera@localhost ~]$ mvn clean install -DskipTests -Pdist
[INFO] Scanning for projects...
[INFO]
[INFO] BUILD FAILURE
[INFO]
Did you execute the command from /home/cloudera? Does it contain the
Hadoop source code? You need to execute the command from the source code
directory.
On Wed, Mar 5, 2014 at 6:28 PM, Avinash Kujur avin...@gmail.com wrote:
When I am using this command:
mvn clean install -DskipTests -Pdist
[cloudera@localhost hadoop-common-trunk]$ mvn clean install -DskipTests
-Pdist
[INFO] Scanning for projects...
Downloading:
http://repo.maven.apache.org/maven2/org/apache/felix/maven-bundle-plugin/2.4.0/maven-bundle-plugin-2.4.0.pom
[ERROR] The build could not read 1 project - [Help 1]
[ERROR]
/home/cloudera/ contains the Hadoop files.
On Wed, Mar 5, 2014 at 2:40 AM, Avinash Kujur avin...@gmail.com wrote:
[cloudera@localhost hadoop-common-trunk]$ mvn clean install -DskipTests
-Pdist
[INFO] Scanning for projects...
Downloading:
Yes, it has internet access.
On Wed, Mar 5, 2014 at 2:47 AM, Mingjiang Shi m...@gopivotal.com wrote:
See the error message:
Unknown host repo.maven.apache.org - [Help 2]
Does your machine have internet access?
On Wed, Mar 5, 2014 at 6:42 PM, Avinash Kujur avin...@gmail.com wrote:
It looks more like a connection problem, as it complains it cannot access
repo.maven.apache.org.
On Wed, Mar 5, 2014 at 6:49 PM, Avinash Kujur avin...@gmail.com wrote:
yes. it has internet access.
On Wed, Mar 5, 2014 at 2:47 AM, Mingjiang Shi m...@gopivotal.com wrote:
see the error
See the error message:
Unknown host repo.maven.apache.org - [Help 2]
Does your machine have internet access?
On Wed, Mar 5, 2014 at 6:42 PM, Avinash Kujur avin...@gmail.com wrote:
/home/cloudera/ contains the Hadoop files.
On Wed, Mar 5, 2014 at 2:40 AM, Avinash Kujur avin...@gmail.com wrote:
If I follow the repo.maven.apache.org link in my browser, it shows this
message:
Browsing for this directory has been disabled. View this directory's
contents on http://search.maven.org/#browse instead.
So how can I change the link from
Can you access this link?
http://repo.maven.apache.org/maven2/org/apache/felix/maven-bundle-plugin/2.4.0/maven-bundle-plugin-2.4.0.pom
On Wed, Mar 5, 2014 at 6:54 PM, Avinash Kujur avin...@gmail.com wrote:
If I follow the repo.maven.apache.org link in my browser, it shows this
message:
Are you doing this on one standalone box? How large are your test files, and
how long did the jobs of each type take?
Yong
From: anth...@mattas.net
Subject: Benchmarking Hive Changes
Date: Tue, 4 Mar 2014 21:31:42 -0500
To: user@hadoop.apache.org
I’ve been trying to benchmark some of the Hive
Yes, I'm using the Hortonworks Data Platform 2.0 Sandbox, which is a
standalone box.
But, shame on me, it looks like the files are both very tiny (46K). I'm
seeing about 23 seconds per query, which appears to be mostly MR startup
time.
So I'm going to find a new data set and try again. Is there any
Vinod,
One more observation I can share: every time the NM or RM gets
killed, I see the following kind of messages in the NM's log:
2014-03-05 05:33:23,824 DEBUG
org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Node's
health-status : true,
2014-03-05 05:33:23,824
Hello Hadoop enthusiasts,
As you are no doubt aware, ApacheCon North America will be held in Denver,
Colorado starting on April 7th. Hadoop has 25 talks and two tutorials!! Check
it out here: http://apacheconnorthamerica2014.sched.org/?s=hadoop.
We would love to see you in Denver next
Can anyone help me here?
On Tue, Mar 4, 2014 at 3:23 PM, nagarjuna kanamarlapudi
nagarjuna.kanamarlap...@gmail.com wrote:
Yes, I installed it.
mvn clean install -DskipTests was successful; only the import into Eclipse is
failing.
On Tue, Mar 4, 2014 at 12:51 PM, Azuryy Yu azury...@gmail.com
You can safely move block files between disks. Follow the instructions
here:
http://wiki.apache.org/hadoop/FAQ#On_an_individual_data_node.2C_how_do_you_balance_the_blocks_on_the_disk.3F
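A rough sketch of the file move that the FAQ describes, with the DataNode stopped (the directory layout and block name are made up for illustration; real block files live under the configured dfs.data.dir directories):

```python
# Sketch: move one block's files (blk_N plus its blk_N_*.meta companion)
# from one DataNode data directory to another. The DataNode must be
# stopped first; the paths and block name here are hypothetical.
import shutil
from pathlib import Path

def move_block(src_dir: str, dst_dir: str, block_prefix: str) -> list:
    """Move every file starting with block_prefix from src_dir to dst_dir."""
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    moved = []
    for f in sorted(src.glob(block_prefix + "*")):
        shutil.move(str(f), str(dst / f.name))
        moved.append(f.name)
    return moved
```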
On Tue, Mar 4, 2014 at 11:47 PM, divye sheth divs.sh...@gmail.com wrote:
Thanks Harsh. The jira is fixed in
The last iteration of Stinger is coming with Tez.
The HDP 2 sandbox that you're using does not include Tez. You can add it
manually if you like (a doc is available on Hortonworks.com/labs), or
it'll be available in the HDP 2.1 sandbox.
Kind regards
Olivier
On 5 Mar 2014 17:15, Anthony Mattas
Hi,
I'm going to bump this question up, but it's looking like I may have to
write my own implementation to make this work. Any ways around that using
the existing technology? Is this something that would be useful enough to
modify the existing LocalFileSystem class and contribute back?
Thanks,
See second bullet under https://hadoop.apache.org/mailing_lists.html#User
On Wed, Mar 5, 2014 at 11:11 AM, Dibyendu Karmakar
dibyendu.d...@gmail.comwrote:
unsubscribe
--
Dibyendu Karmakar,
dibyendu.d...@gmail.com
The message means it cannot connect to the ResourceManager.
Could you share your configuration? It might be easier to figure out the
real issue.
Thanks
Xuan Gong
On Wed, Mar 5, 2014 at 11:29 AM, Sai Prasanna ansaiprasa...@gmail.comwrote:
Hi,
I have a five node cluster. One master and 4
The thing about YARN is you choose what is right for the workload.
For example, Spark may not be the right choice if the join tables do
not fit in memory.
On Wednesday, March 5, 2014, Anthony Mattas anth...@mattas.net wrote:
With Tez and Spark becoming mainstream what does Map Reduce
Hi Sai,
A few questions:
1. Which version of Hadoop are you using? yarn.resourcemanager.hostname is
a new configuration which is not available in old versions.
2. Does your yarn-site.xml contain yarn.resourcemanager.scheduler.address?
If yes, what's the value?
3. Or you could access
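For reference, the two properties mentioned above would sit in yarn-site.xml along these lines (the hostname is a placeholder, and 8030 is the usual default scheduler port, so treat the values as illustrative only):

```xml
<!-- Illustrative yarn-site.xml fragment; values are examples only. -->
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>rm-host.example.com</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>rm-host.example.com:8030</value>
</property>
```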
I believe in the future the Spark functional-style API will dominate the
big data world. Very few people will use the native MapReduce API. Even now,
users usually use third-party MapReduce libraries such as Cascading,
Scalding, and Scoobi, or script languages like Hive and Pig, rather than the
native MapReduce
Hi,
It used to be possible to submit a request to a servlet
using org.apache.hadoop.hdfs.web.AuthFilter as POST, specifying
user.name (simple authentication) as a form parameter.
For example,
curl -X POST -d 'user.name=foo' 'http://'
after https://issues.apache.org/jira/browse/HADOOP-10193,
Sorry, it should be accessing http://node_manager_ip:8042/conf to check
the value of yarn.resourcemanager.scheduler.address on the node manager.
On Thu, Mar 6, 2014 at 9:36 AM, Mingjiang Shi m...@gopivotal.com wrote:
Hi Sai,
A few questions:
1. which version of hadoop are you using?
Hi,
1) Is it possible to do an in-place migration, while keeping all
data in HDFS safely?
Yes. Stop HDFS first, then run start-dfs.sh -upgrade.
2) If it is yes, is there any doc/guidance to do this?
You just want an HDFS upgrade, so I don't think there is a dedicated doc.
3)
hi,
I am getting an error in between when downloading all the jars using the
Maven command:
mvn clean install -DskipTests -Pdist
The error is:
[INFO] --- hadoop-maven-plugins:3.0.0-SNAPSHOT:protoc (compile-protoc) @
hadoop-common ---
[WARNING] [protoc, --version] failed with error code 1
Please help me.
Hi Jerry,
Refer to the following links for reference:
http://www.michael-noll.com/blog/2011/08/23/performing-an-hdfs-upgrade-of-an-hadoop-cluster/
http://wiki.apache.org/hadoop/Hadoop_Upgrade
Notes:
1. The Hadoop version used in the docs may be different from yours, but they
are good references.
Do you have protobuf installed on your build box?
You can run "which protoc" to check.
It looks like protobuf is missing.
On Thu, Mar 6, 2014 at 2:55 PM, Avinash Kujur avin...@gmail.com wrote:
hi,
i am getting error in between when downloading all th jars usng maven
command:
mvn clean install
Yes, protobuf is installed; I checked. libprotoc 2.4.1.
On Wed, Mar 5, 2014 at 11:04 PM, Gordon Wang gw...@gopivotal.com wrote:
Do you have protobuf installed on your build box?
You can run "which protoc" to check.
Looks like protobuf is missing.
On Thu, Mar 6, 2014 at 2:55 PM, Avinash Kujur