Hi,
Linux kernel provides delay accounting information through a netlink
socket to user space. You can read more about it here:
http://www.mjmwired.net/kernel/Documentation/accounting/taskstats.txt.
I think there's a python tool called iotop that uses this feature.
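If you just want to watch those numbers, the iotop tool mentioned above can be run directly; a quick sketch of the usage (it generally needs root, since reading another process's taskstats is privileged):
sudo iotop
It polls the per-task counters over that same netlink interface and shows per-process I/O.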
Hope this helps.
Regards,
Hi,
As far as I can tell I've followed the setup instructions for a hadoop cluster
to the letter,
but I find that the datanodes can't connect to the namenode on port 9000
because it is only
listening for connections from localhost.
In my case, the namenode is called centos1, and the datanode
Hi,
Please let us know how this works out. Also, it would be nice if
people with experience with RDBMSs other than MySQL and Oracle could
comment on the syntax and performance of their respective RDBMS with
regard to Hadoop. Even if the syntax of the current SQL queries is
valid for
Michael Lynch wrote:
Hi,
As far as I can tell I've followed the setup instructions for a hadoop
cluster to the letter,
but I find that the datanodes can't connect to the namenode on port 9000
because it is only
listening for connections from localhost.
In my case, the namenode is called
On Fri, Feb 13, 2009 at 8:37 AM, Steve Loughran ste...@apache.org wrote:
Michael Lynch wrote:
Hi,
As far as I can tell I've followed the setup instructions for a hadoop
cluster to the letter,
but I find that the datanodes can't connect to the namenode on port 9000
because it is only
Hello,
Running an MR job on 7 machines failed when it came to processing
53GB. Browsing the errors:
    org.saptarshiguha.rhipe.GRMapreduce$GRCombiner.reduce(GRMapreduce.java:149)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.combineAndSpill(MapTask.java:1106)
    at
I had a problem where it listened only on 8020, even though I told it to use
9000.
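For anyone hitting the same thing, here is a hedged sketch of the hadoop-site.xml entry this thread revolves around; the hostname centos1 and the port 9000 are taken from the original report, the rest is illustrative rather than a verified fix. If the port is left off, the namenode falls back to its default of 8020, and it binds to whatever address the hostname resolves to, so a centos1 entry pointing at 127.0.0.1 in /etc/hosts would produce exactly the localhost-only listener described above.

<!-- hadoop-site.xml on the namenode (and the same value on the datanodes) -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://centos1:9000</value>
</property>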
On Fri, Feb 13, 2009 at 7:50 AM, Norbert Burger norbert.bur...@gmail.comwrote:
On Fri, Feb 13, 2009 at 8:37 AM, Steve Loughran ste...@apache.org wrote:
Michael Lynch wrote:
Hi,
As far as I can tell I've
Anum Ali wrote:
yes
On Thu, Feb 12, 2009 at 4:33 PM, Steve Loughran ste...@apache.org wrote:
Anum Ali wrote:
I am working on Hadoop SVN version 0.21.0-dev. Having some problems
running its examples/files from Eclipse.
It gives an error:
Exception in thread main
This only occurs in Linux; in Windows it's fine.
On Fri, Feb 13, 2009 at 7:11 AM, Steve Loughran ste...@apache.org wrote:
Anum Ali wrote:
yes
On Thu, Feb 12, 2009 at 4:33 PM, Steve Loughran ste...@apache.org
wrote:
Anum Ali wrote:
I am working on Hadoop SVN version 0.21.0-dev.
Anum Ali wrote:
This only occurs in Linux; in Windows it's fine.
Do a java -version for me, and an ant -diagnostics, and stick both on the bug report
https://issues.apache.org/jira/browse/HADOOP-5254
It may be that XInclude only went live in Java 1.6u5; I'm running a
JRockit JVM which predates
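To make the version dependence concrete, here is a small hedged sketch (not from the thread) that probes whether the JAXP parser bundled with the running JVM accepts XInclude; on older parsers the setter throws an UnsupportedOperationException, which is the same "this parser does not support" failure mode discussed here.

import javax.xml.parsers.DocumentBuilderFactory;

// Hedged sketch: probe the default JAXP parser for XInclude support.
public class XIncludeProbe {
    public static void main(String[] args) {
        DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
        try {
            f.setXIncludeAware(true);   // older parsers reject this outright
            System.out.println("XInclude aware: " + f.isXIncludeAware());
        } catch (UnsupportedOperationException e) {
            System.out.println("No XInclude support in this JVM's parser: " + e);
        }
    }
}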
Sean,
A few things in your messages are not clear to me. Currently this is what
I make of it:
1) with 1k limit, you do see the problem.
2) with 16 limit - (?) not clear if you see the problem
3) with 8k you don't see the problem
3a) with or without the patch, I don't know.
But if
Does anyone have an expected or experienced write speed to HDFS outside
of Map/Reduce? Any recommendations on properties to tweak in
hadoop-site.xml?
Currently I have a multi-threaded writer where each thread is writing to
a different file. But after a while I get this:
java.io.IOException:
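For comparison, a minimal hedged sketch of the kind of writer being described: several threads, each writing its own file through the FileSystem API. The thread count, path names, and payload size are invented for illustration, and it says nothing about the IOException above.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hedged sketch: each thread writes to a different HDFS file.
public class HdfsWriterSketch {
    public static void main(String[] args) throws Exception {
        final Configuration conf = new Configuration();
        final FileSystem fs = FileSystem.get(conf);
        for (int i = 0; i < 4; i++) {
            final int id = i;
            new Thread(new Runnable() {
                public void run() {
                    try {
                        FSDataOutputStream out = fs.create(new Path("/tmp/writer-" + id + ".dat"));
                        out.write(new byte[64 * 1024]);   // write some payload
                        out.close();                      // close to flush the block
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }
            }).start();
        }
    }
}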
One thing to mention is 'limit' is not SQL standard. Microsoft SQL
Server uses SELECT TOP 100 * FROM table. Some RDBMSs may not support
any such syntax. To be more SQL-compliant you should use some data,
like an auto-increment ID or DATE column, as an offset. It is tricky to write
anything truly database
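To make that concrete, here is a hedged JDBC sketch of the ID-based paging being described; the JDBC URL, table, and column names are invented. setMaxRows caps each batch on the client side, so the SQL itself stays free of vendor-specific LIMIT/TOP clauses (drivers differ in how efficiently they enforce it).

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

// Hedged sketch: page through a table by a monotonically increasing ID column.
public class IdRangePaging {
    public static void main(String[] args) throws Exception {
        Connection conn = DriverManager.getConnection("jdbc:mysql://localhost/testdb", "user", "pass");
        long lastId = 0;
        while (true) {
            PreparedStatement ps = conn.prepareStatement(
                "SELECT id, payload FROM records WHERE id > ? ORDER BY id");
            ps.setMaxRows(100);       // portable row cap instead of LIMIT / TOP
            ps.setLong(1, lastId);
            ResultSet rs = ps.executeQuery();
            int fetched = 0;
            while (rs.next()) {
                lastId = rs.getLong("id");
                // process rs.getString("payload") here
                fetched++;
            }
            rs.close();
            ps.close();
            if (fetched == 0) break;  // no more rows
        }
        conn.close();
    }
}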
I see that there is a patch for the fair scheduler for 0.18.1 in
HADOOP-3746. Does anyone know if there is a similar patch for the capacity
scheduler? I did a search on JIRA but didn't find anything.
Bill
Raghu,
Apologies for the confusion. I was seeing the problem with any setting
for dfs.datanode.max.xcievers... 1k, 2k and 8k. Likewise, I was also seeing
the problem with different open file settings, all the way up to 32k.
Since I installed the patch, HDFS has been performing much better. The
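For anyone following along, a hedged sketch of where the setting under discussion lives; 4096 is only an example value, not something recommended in this thread. Raising it usually goes together with raising the OS open-file limit (ulimit -n), which is the other knob mentioned above.

<!-- hadoop-site.xml on each datanode; 4096 is an arbitrary example -->
<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>4096</value>
</property>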
Sean Knapp wrote:
Raghu,
Apologies for the confusion. I was seeing the problem with any setting
for dfs.datanode.max.xcievers... 1k, 2k and 8k. Likewise, I was also seeing
the problem with different open file settings, all the way up to 32k.
Since I installed the patch, HDFS has been performing
Raghu,
Great, thanks for the help.
Regards,
Sean
2009/2/13 Raghu Angadi rang...@yahoo-inc.com
Sean Knapp wrote:
Raghu,
Apologies for the confusion. I was seeing the problem with any setting
for dfs.datanode.max.xcievers... 1k, 2k and 8k. Likewise, I was also
seeing
the problem with
Is there a way to tell Hadoop to not run Map and Reduce concurrently? I'm
running into a problem where I set the jvm to Xmx768 and it seems like 2
mappers and 2 reducers are running on each machine that only has 1.7GB of
ram, so it complains of not being able to allocate memory...(which makes
Hello,
I would really appreciate any help I can get on this! I've suddenly run into
a very strange error.
when I do:
bin/start-all
I get:
hadoop$ bin/start-all.sh
starting namenode, logging to
/Users/hadoop/hadoop-0.18.2/bin/../logs/hadoop-hadoop-namenode-loteria.cs.tamu.edu.out
starting
Sandy -
I suggest you take a look into your NameNode and DataNode logs. From the
information posted, these likely would be at
/Users/hadoop/hadoop-0.18.2/bin/../logs/hadoop-hadoop-namenode-loteria.cs.tamu.edu.log
Hey Sandy
I had a similar problem with Hadoop. All I did was stop all the daemons
using stop-all.sh, then format the namenode again using hadoop namenode
-format. After this I went on to restart everything by using start-all.sh.
I hope you don't have much data on the datanode,
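For completeness, the sequence described above as a short command sketch; keep in mind that formatting the namenode throws away the existing HDFS namespace, which is why it only makes sense when the data on the datanodes is expendable.

# stop everything, wipe the namespace, start fresh
bin/stop-all.sh
bin/hadoop namenode -format
bin/start-all.sh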
Kris,
This is the case when you have only 1 reducer.
If it doesn't have any side effects for you..
Rasit
2009/2/14 Kris Jirapinyo kjirapi...@biz360.com:
Is there a way to tell Hadoop to not run Map and Reduce concurrently? I'm
running into a problem where I set the jvm to Xmx768 and it seems
What do you mean when I have only 1 reducer?
On Fri, Feb 13, 2009 at 4:11 PM, Rasit OZDAS rasitoz...@gmail.com wrote:
Kris,
This is the case when you have only 1 reducer.
If it doesn't have any side effects for you..
Rasit
2009/2/14 Kris Jirapinyo kjirapi...@biz360.com:
Is there a way
I agree with Amar and James,
if you require permissions for your project,
then
1. create a group in Linux for your user.
2. give group write access to all files in HDFS. (hadoop dfs -chmod -R
g+w / - or something, I'm not totally sure; see the sketch below.)
3. change group ownership of all files in HDFS. (hadoop dfs
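A hedged sketch of steps 2 and 3 above as concrete commands; "devgroup" is an invented group name, and a recursive chmod/chgrp from the root of HDFS is exactly as blunt as it sounds, so scope the path down if you only need it for one project directory.

# give the group write access everywhere (step 2)
hadoop dfs -chmod -R g+w /
# hand group ownership to that group (step 3)
hadoop dfs -chgrp -R devgroup /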
With this configuration, any user in that group will be able
to write to any location.
(I've only tried this on a local network, though.)
2009/2/14 Rasit OZDAS rasitoz...@gmail.com:
I agree with Amar and James,
if you require permissions for your project,
then
1. create a group in linux
Have only one instance of the reduce task. This will run once your map tasks
are completed. You can set this in your job conf by using
conf.setNumReduceTasks(1)
Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz
2009/2/13 Kris Jirapinyo
I can't afford to have only one reducer as my dataset is huge...right now it
is 50GB and so the output.collect() in the reducer will surely run out of
java heap space.
2009/2/13 Amandeep Khurana ama...@gmail.com
Have only one instance of the reduce task. This will run once your map
tasks
are
What you can probably do is have the combine function do some reducing
before the single reducer starts off. That might help.
Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz
2009/2/13 Kris Jirapinyo kris.jirapi...@biz360.com
I can't afford to have only
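To illustrate the combiner idea, here is a hedged sketch against the old (0.18-era) mapred API: a summing reducer that is safe to reuse as a combiner, so much of the aggregation happens on the map side before the single reducer runs. The key/value types and the summing logic are invented for illustration, not taken from the job discussed here.

import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

// Hedged sketch: a reducer whose logic is associative, so it can double as a combiner.
public class SumReducer extends MapReduceBase
        implements Reducer<Text, LongWritable, Text, LongWritable> {
    public void reduce(Text key, Iterator<LongWritable> values,
                       OutputCollector<Text, LongWritable> output, Reporter reporter)
            throws IOException {
        long sum = 0;
        while (values.hasNext()) {
            sum += values.next().get();
        }
        output.collect(key, new LongWritable(sum));
    }

    // Typical wiring in the job setup:
    public static void configure(JobConf conf) {
        conf.setCombinerClass(SumReducer.class);   // run the reducer as a combiner too
        conf.setReducerClass(SumReducer.class);
        conf.setNumReduceTasks(1);                 // the single sequential reducer
    }
}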
Thanks for the recommendation, haven't really looked into how the combiner
might be able to help. Now, are there any downsides to having one 50GB file
as an output? If I understand correctly, the number of reducers you set for
your job is the number of files you will get as output.
2009/2/13
Yes, number of output files = number of reducers. There is no downside to
having a 50GB file. That really isn't too much data. Of course, multiple
reducers would be much faster. But since you want a sequential run, having a
single reducer is the only option I am aware of.
You could consider
Hi
I ran into a use case where I need to keep two contexts for metrics,
one being ganglia and the other being a file context (to do offline
metrics analysis).
I altered JvmMetrics to allow the user to supply a context
instead of getting one by name, and altered file context for it
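For readers who haven't touched the metrics config before, this is roughly what the stock conf/hadoop-metrics.properties looks like (a hedged sketch of the 0.18-era framework, not taken from this message). Each record prefix is bound to exactly one MetricsContext class, which is why emitting the same metrics to both ganglia and a file needed the JvmMetrics change described above.

# Each prefix (jvm, dfs, mapred, ...) selects a single context class.
# Either the file context...
jvm.class=org.apache.hadoop.metrics.file.FileContext
jvm.period=10
jvm.fileName=/tmp/jvmmetrics.log
# ...or the ganglia context, but not both at once out of the box.
# jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext
# jvm.servers=localhost:8649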
The parser problem is related to jar files and can be resolved; it is not a bug.
Forwarding a link to its solution:
http://www.jroller.com/navanee/entry/unsupportedoperationexception_this_parser_does_not
On 2/13/09, Steve Loughran ste...@apache.org wrote:
Anum Ali wrote:
This only occurs in linux ,