Had to ask :D
On 10/02/2012 07:19 PM, Russell Jurney wrote:
I believe he means per node.
Russell Jurney http://datasyndrome.com
On Oct 2, 2012, at 6:15 PM, hadoopman wrote:
Only 24 map and 8 reduce tasks for 38 data nodes? Are you sure that's right?
Sounds VERY low for a cluster
Only 24 map and 8 reduce tasks for 38 data nodes? Are you sure that's
right? Sounds VERY low for a cluster that size.
We have only 10 c2100's and are running I believe 140 map and 70 reduce
slots so far with pretty decent performance.
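For anyone tuning the same thing: in Hadoop 1.x the per-node slot counts live in mapred-site.xml on each TaskTracker. A rough sketch (values purely illustrative; 14 map / 7 reduce per node is what gives 140/70 totals on 10 nodes):

# Sketch only (Hadoop 1.x): these two properties, set on every TaskTracker,
# control per-node slots. Values are examples; size them for your hardware.
cat > /tmp/mapred-slot-snippet.xml <<'EOF'
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>14</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>7</value>
</property>
EOF
# Merge the snippet into the <configuration> element of
# $HADOOP_CONF_DIR/mapred-site.xml and restart the TaskTrackers.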
On 10/02/2012 12:55 PM, Alexander Pivovarov wrote:
38
I'm curious if you have been able to track down the cause of the error?
We've seen similar problems with loading data and I've discovered if I
presort my data before the load that things go a LOT smoother.
When running queries against our data sometimes we've seen it where the
jobtracker just
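On the presorting point, a rough sketch of what that can look like; the file, field and table names below are made up:

# Hypothetical example: sort the raw logs on the key you later partition or
# cluster on, before they ever reach HDFS.
sort -k1,1 raw_weblogs.log > raw_weblogs.sorted.log

# Push the presorted file to HDFS and load it into a staging table.
hadoop fs -put raw_weblogs.sorted.log /user/etl/staging/
hive -e "LOAD DATA INPATH '/user/etl/staging/raw_weblogs.sorted.log'
         INTO TABLE weblogs_staging;"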
http://www.omgubuntu.co.uk/2011/12/java-to-be-removed-from-ubuntu-uninstalled-from-user-machines/
I'm curious what this will mean for Hadoop on Ubuntu systems moving
forward. I tried OpenJDK with Hadoop nearly two years ago. Needless
to say, it was a real problem.
Hopefully we can still
SPAM 2.0 :D
On 08/27/2011 10:06 AM, Shahnawaz Saifi wrote:
What's this?
On Sat, Aug 27, 2011 at 9:35 AM, Senthil wrote:
So we're seeing the following error during some of our hive loads:
2011-07-05 12:26:52,927 Stage-2 map = 100%, reduce = 100%
Ended Job = job_201106302113_3864
Loading data to table default.merged_weblogs partition (day=null)
Failed with exception Number of dynamic partitions created is 1013,
wh
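If it is the usual dynamic-partition limit, raising Hive's caps for the session usually gets past it. A sketch: the limits only need to exceed the 1013 partitions the error reports, and the SELECT below is a placeholder for the real load query.

# Sketch: raise the dynamic-partition limits in the same session as the load.
# Column list and staging table are hypothetical; only merged_weblogs and the
# day partition come from the error above.
hive -e "
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.exec.max.dynamic.partitions=2000;
SET hive.exec.max.dynamic.partitions.pernode=2000;
INSERT OVERWRITE TABLE merged_weblogs PARTITION (day)
SELECT host, request, status, day FROM weblogs_staging;
"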
I've run into similar problems in my hive jobs and will look at the
'mapred.child.ulimit' option. One thing that we've found is when
loading data with insert overwrite into our hive tables we've needed to
include a 'CLUSTER BY' or 'DISTRIBUTE BY' option. Generally that's
fixed our memory issues.
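For reference, the shape of that kind of insert (table and column names are made up): DISTRIBUTE BY the dynamic partition column routes each day's rows to a single reducer, so no reducer has to hold open writers for every partition at once.

# Sketch with hypothetical table/column names.
hive -e "
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
INSERT OVERWRITE TABLE merged_weblogs PARTITION (day)
SELECT host, request, status, day
FROM weblogs_staging
DISTRIBUTE BY day;
"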
Some things which helped us include setting your vm.swappiness to 0 and
mounting your disks with noatime,nodiratime options.
Also make sure your disks aren't setup with RAID (JBOD is recommended)
You might want to run terasort as you tweak your environment. It's very
helpful when checking if
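In concrete terms, the OS side and the benchmark look roughly like this (device names, paths and the row count are only examples):

# OS tuning sketch (run as root; device/mount names are examples).
sysctl -w vm.swappiness=0
echo 'vm.swappiness = 0' >> /etc/sysctl.conf

# Example fstab entry for one JBOD data disk, then remount to pick it up:
echo '/dev/sdb1  /data/1  ext3  defaults,noatime,nodiratime  0 0' >> /etc/fstab
mount -o remount /data/1

# Teragen/terasort/teravalidate as a repeatable benchmark. The examples jar
# name varies by distribution; 10^9 rows x 100 bytes is roughly 100GB.
hadoop jar hadoop-examples.jar teragen 1000000000 /benchmarks/tera-in
hadoop jar hadoop-examples.jar terasort /benchmarks/tera-in /benchmarks/tera-out
hadoop jar hadoop-examples.jar teravalidate /benchmarks/tera-out /benchmarks/tera-report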
When we load data into hive sometimes we've run into situations where
the load fails and the logs show a heap out of memory error. If I load
just a few days (or months) of data then no problem. But then if I try
to load two years (for example) of data then I've seen it fail. Not
with every f
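For reference, the knobs people usually mean here are the task-side heap and the mapred.child.ulimit option mentioned above. A sketch with example sizes only; the table and columns are placeholders for the real load:

# Give the load's map/reduce tasks a bigger heap and keep mapred.child.ulimit
# (in KB) comfortably above it, then run the failing load in the same session.
hive -e "
SET mapred.child.java.opts=-Xmx1536m;
SET mapred.child.ulimit=3145728;
INSERT OVERWRITE TABLE merged_weblogs PARTITION (day)
SELECT host, request, status, day FROM weblogs_staging;
"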
My guess is it's like back in the days when Linux was considered a 'bad'
option for running a production system and people would freak out when
they found out about it. It was so new and people were just learning
what it's all about. Today it's very mainstream but it took people a
while to fi
r...@gmail.com
On Tue, Apr 26, 2011 at 5:59 AM, hadoopman wrote:
Has anyone had problems with the latest version of hadoop and the fair
scheduler not placing jobs into pools correctly? We're digging into it
currently. An older version of hadoop (using our config file) is worki
Thanks & Regards,
Saurabh Bhutyani
Call : 9820083104
Gtalk: s4saur...@gmail.com
On Tue, Apr 26, 2011 at 5:59 AM, hadoopman wrote:
Has anyone had problems with the latest version of hadoop and the fair
scheduler not placing jobs into pools correctly? We're digging into it
curren
Has anyone had problems with the latest version of hadoop and the fair
scheduler not placing jobs into pools correctly? We're digging into it
currently. An older version of hadoop (using our config file) is
working fine however the latest version seems to be putting everything
into the default pool.
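For comparison, a minimal Hadoop 1.x fair scheduler setup looks roughly like this; the pool names, minimums and file path are made up:

# 1) In mapred-site.xml on the JobTracker (shown here only as a reminder):
#    mapred.jobtracker.taskScheduler = org.apache.hadoop.mapred.FairScheduler
#    mapred.fairscheduler.allocation.file = /etc/hadoop/conf/fair-scheduler.xml
#    mapred.fairscheduler.poolnameproperty = mapred.fairscheduler.pool
# 2) Define the pools (illustrative names/limits):
cat > /etc/hadoop/conf/fair-scheduler.xml <<'EOF'
<?xml version="1.0"?>
<allocations>
  <pool name="etl">
    <minMaps>20</minMaps>
    <minReduces>10</minReduces>
  </pool>
  <pool name="adhoc">
    <minMaps>5</minMaps>
    <minReduces>5</minReduces>
  </pool>
</allocations>
EOF
# 3) A job then picks its pool per session:
hive -e "SET mapred.fairscheduler.pool=etl;"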
here. If you use a
32-bit Java this would be a problem.
On Wed, Apr 13, 2011 at 3:16 PM, hadoopman wrote:
Is there an issue with using the regex SerDe with loading into Hive text
files above 2 gigs in size? I've been experiencing out of memory errors
with a select group of logs when ru
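A quick way to check the 32-bit vs 64-bit point (exact wording varies by JVM):

# A 64-bit HotSpot JVM reports "64-Bit Server VM" here; a 32-bit JVM also
# caps -Xmx at roughly 2-3GB regardless of how much RAM the box has.
java -version 2>&1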
Is there an issue with using the regex SerDe with loading into Hive text
files above 2 gigs in size? I've been experiencing out of memory errors
with a select group of logs when running a hive job. I have been able
to load the data if I use split to cut it in half or thirds. No problem.
Goo
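For anyone hitting the same thing, here is the shape of the split workaround plus a hypothetical RegexSerDe table; the regex, columns and paths below are placeholders, not the real ones:

# Workaround sketch: cut the big log into ~50M-line pieces before loading.
split -l 50000000 huge_weblog.log weblog_part_
hadoop fs -put weblog_part_* /user/etl/incoming/

# Hypothetical RegexSerDe table over space-separated fields.
cat > create_weblogs_raw.hql <<'EOF'
-- some installs need the hive-contrib jar added first (ADD JAR ...)
CREATE TABLE weblogs_raw (
  host STRING,
  ts STRING,
  request STRING,
  extra STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  'input.regex' = '([^ ]*) ([^ ]*) ([^ ]*) (.*)'
)
STORED AS TEXTFILE;

LOAD DATA INPATH '/user/etl/incoming/' INTO TABLE weblogs_raw;
EOF
hive -f create_weblogs_raw.hql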
I have a process which is loading data into hive hourly. Loading data
hourly isn't a problem however when I load historical data say 24-48
hours I receive the below error msg. In googling I've come across some
suggestions that jvm memory needs to be increased. Are there any other
options or
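Besides the task-side heap, the Hive client JVM itself can be the one that runs out of memory when planning a big multi-day load. A sketch of raising it; sizes and the script name are examples only:

# HADOOP_HEAPSIZE (MB) and HADOOP_CLIENT_OPTS size the client JVM that the
# hive script launches; load_history.hql stands in for the historical load.
export HADOOP_HEAPSIZE=2048
export HADOOP_CLIENT_OPTS="-Xmx2048m"
hive -f load_history.hql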
Great tip. I'll give it a try.
Thanks!
On 04/04/2011 10:17 PM, Alex Kozlov wrote:
Try using octal, i.e. '\040'.
On Apr 4, 2011, at 8:21 PM, hadoopman wrote:
I had a similar problem though my logs were terminated with carriage return.
Many of the fields in my logs
I had a similar problem though my logs were terminated with carriage
return. Many of the fields in my logs are delimited with a space. We
tried using \s but that basically removed every instance of the letter s
(yeah I thought that was amusing too). In some cases we were able to do
a \\t b
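For the archives, the octal form from the tip above drops straight into the DDL ('\040' is octal for a space); table and column names here are made up:

# Sketch: a plain delimited table over space-separated logs.
hive -e "
CREATE TABLE weblogs_spaced (
  host STRING,
  ts STRING,
  request STRING,
  status STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\040'
STORED AS TEXTFILE;
"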
On 12/06/2010 07:48 PM, Dmitriy Ryaboy wrote:
Do you have the failing task's log?
-Dmitriy
On Sat, Dec 4, 2010 at 12:47 PM, hadoopman wrote:
I'll have to look for it. This is my first full blown installation of
Hadoop. Still a LOT to learn
Is that the name it's t
I've run into an interesting problem with syncing a couple of clusters
using distcp. We've validated that it works to a local installation
from our remote cluster. I suspect our firewalls 'may' be responsible
for the problem we're experiencing. We're using ports 9000, 9001 and
50010. I've ver
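For reference, the command shape and the ports involved (hostnames are placeholders):

# Run from the destination cluster. The job talks to the remote NameNode on
# its RPC port (9000 here) and to every remote DataNode on 50010, so the
# firewall has to allow 50010 from all nodes running the copy.
# (If the clusters run different Hadoop versions, read the source over
# hftp:// on the NameNode web port instead.)
hadoop distcp hdfs://nn-primary:9000/data/weblogs hdfs://nn-dr:9000/data/weblogs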
On 11/30/2010 03:51 AM, Steve Loughran wrote:
On 30/11/10 03:59, hadoopman wrote:
you don't need all the files in the cluster in sync as a lot of them
are intermediate and transient files.
Instead use distcp to copy source files to the two clusters; this
runs across the machines i
We have two Hadoop clusters in two separate buildings. Both clusters
are loading the same data from the same sources (the second cluster is
for DR).
We're looking at how we can recover the primary cluster and catch it
back up again as new data will continue to feed into the DR cluster.
It's
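One pattern worth sketching for the catch-up (hostnames and paths invented): distcp with -update from the DR cluster only copies what the rebuilt primary is missing, so it can be re-run until the two sides converge.

# Catch-up run from DR back to the rebuilt primary; -update skips files that
# are already present with the same size, so repeated runs are cheap.
hadoop distcp -update hdfs://nn-dr:9000/data/weblogs hdfs://nn-primary:9000/data/weblogs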