Hello,
we are trying to configure Hadoop HDP 2.2 running in the Azure cloud to use
Azure Blob storage instead of regular HDFS.
The cluster is up and running, and we can list files in Azure Blob storage via
hadoop fs commands. But when trying to run the TeraGen MapReduce smoke test we
are getting the following
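For reference, pointing HDP at Azure Blob storage is normally done in
core-site.xml. A minimal sketch, assuming the hadoop-azure wasb:// connector;
the CONTAINER and ACCOUNT names are placeholders, not values from this thread:

<property>
  <name>fs.defaultFS</name>
  <value>wasb://CONTAINER@ACCOUNT.blob.core.windows.net</value>
</property>
<property>
  <!-- storage account key so the wasb connector can authenticate -->
  <name>fs.azure.account.key.ACCOUNT.blob.core.windows.net</name>
  <value>YOUR_STORAGE_ACCOUNT_KEY</value>
</property>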
Hello,
I am configuring the Capacity Scheduler. Everything seems OK, but I cannot
find the meaning of the following property:
yarn.scheduler.capacity.root.unfunded.capacity
I just found that it is set to 50 everywhere, and the description is just "No
description".
Can anybody clarify, or point to where to find
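The unfunded entry does not appear in the upstream Capacity Scheduler
documentation, which suggests it is left over from the distribution's template
rather than a real tunable. For comparison, queue capacities are normally
declared in capacity-scheduler.xml like this, sketched here with the stock
default queue:

<property>
  <!-- comma-separated child queues of root -->
  <name>yarn.scheduler.capacity.root.queues</name>
  <value>default</value>
</property>
<property>
  <!-- percentage of cluster capacity given to the default queue -->
  <name>yarn.scheduler.capacity.root.default.capacity</name>
  <value>100</value>
</property>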
Hello experienced users,
we are new to Hadoop and hence use a nearly default configuration, including
the scheduler - which I guess by default is the Capacity Scheduler.
Lately we were confronted with the following behaviour on the cluster. We are
using Apache Oozie for job submission of various data pipes. We
requested (and used, of course) by my pig-script (not as a YARN queue
configuration or some such stuff.. I want to limit it from outside on a
per-job basis. I would ideally like to set the number in my pig-script.)
Can I do this?
Thanks,
Sunil.
or code, then we can, I think. We do have this property mapreduce.job.maps.
Regards,
Shahab
On Tue, Oct 21, 2014 at 2:42 AM, Jakub Stransky stransky...@gmail.com
wrote:
Hello,
as far as I understand, the number of mappers is not something you can drive
directly; it is determined by the input splits. The number of reducers you
can control via the PARALLEL keyword.
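To set this from inside the Pig script itself, a sketch, with the caveat that
mapreduce.job.maps is only a hint to MapReduce, since the actual mapper count
follows the input splits:

SET mapreduce.job.maps '10';  -- hint only; the split count really decides mappers
SET default_parallel 20;      -- script-wide default number of reducers
data = LOAD 'input' AS (key:chararray, val:int);
grp = GROUP data BY key PARALLEL 20;  -- per-operator reducer count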
Distcp?
On 17 Oct 2014 20:51, Alexander Pivovarov apivova...@gmail.com wrote:
try to run on a destination cluster datanode
$ hadoop fs -cp hdfs://from_cluster/ hdfs://to_cluster/
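For larger copies between clusters, the DistCp tool asked about above runs the
copy as a distributed MapReduce job instead of a single client-side stream; a
sketch with placeholder namenode addresses:

$ hadoop distcp hdfs://nn-from:8020/path hdfs://nn-to:8020/path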
On Fri, Oct 17, 2014 at 11:26 AM, Shivram Mani sm...@pivotal.io wrote:
What is your approximate input size?
Do
Hello experienced users,
I tried to profile tasks during a MapReduce job using the following properties:
<property>
  <name>mapreduce.task.profile</name>
  <value>true</value>
</property>
<property>
  <name>mapreduce.task.profile.maps</name>
  <value>0-5</value>
</property>
<property>
Hello experienced hadoop users,
I have one beginner's question regarding CPU utilization on datanodes when
running an MR job. A cluster of 5 machines, 2 NN + 3 DN, really inexpensive
hardware, using the following parameters:
# hadoop - yarn-site.xml
yarn.nodemanager.resource.memory-mb : 2048
2014-09-12 17:51 GMT+02:00 Jakub Stransky stransky...@gmail.com:
(mapreduce.reduce.memory.mb).
If you run the MapReduce app master, you need 1024 MB
(yarn.app.mapreduce.am.resource.mb).
Therefore, when you run a MapReduce job, you can run only 2 containers per
NodeManager (3 x 768 = 2304 > 2048) on your setup.
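Putting the numbers quoted in this thread into config form, a sketch spread
over yarn-site.xml and mapred-site.xml; the values are the ones discussed
above, not recommendations:

<!-- yarn-site.xml: total memory one NodeManager may hand out -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>2048</value>
</property>
<!-- mapred-site.xml: per-container requests -->
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>768</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>1024</value>
</property>
<property>
  <name>yarn.app.mapreduce.am.resource.mb</name>
  <value>1024</value>
</property>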
2014-09-12 20:37 GMT+02:00 Jakub Stransky stransky...@gmail.com
Hello hadoop users,
I am facing the following issue when running an M/R job, during the reduce phase:
Container [pid=22961,containerID=container_1409834588043_0080_01_10] is
running beyond virtual memory limits. Current usage: 636.6 MB of 1 GB
physical memory used; 2.1 GB of 2.1 GB virtual memory
as 768M and reduce memory as 1024M and AM as
1024M.
With the AM and a single map task that is already about 1.7 GB, so it cannot
start another container for the reducer.
Reduce these values and check.
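Besides shrinking the container sizes, the knobs usually suggested for the
"running beyond virtual memory limits" kill live in yarn-site.xml; these are
not from this thread, so treat them as an assumption about the standard fix:

<property>
  <!-- default is 2.1, which is where the "2.1 GB of 2.1 GB virtual memory" cap comes from -->
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>4</value>
</property>
<property>
  <!-- or switch the virtual memory check off entirely -->
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>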
On 9/11/14, Jakub Stransky stransky...@gmail.com wrote:
Hello hadoop users,
I am facing the following issue when running
Hello experienced hadoop users,
I have a data pipeline consisting of two Java MR jobs coordinated by the
Oozie scheduler. Both of them process the same data, but the first one is
more than 10 times slower than the second one. The job counters on the RM
page are not much help in this matter. I have
Hello,
I am getting the following error when running on a 500 MB dataset compressed
in the Avro data format.
Container [pid=22961,containerID=container_1409834588043_0080_01_10] is
running beyond virtual memory limits. Current usage: 636.6 MB of 1 GB
physical memory used; 2.1 GB of 2.1 GB virtual
Hello,
we are using Hadoop 2.2.0 (HDP 2.0) and Avro 1.7.4, running on CentOS 6.3.
I am facing the following issue when using AvroMultipleOutputs with dynamic
output files. My M/R job works fine for a smaller amount of data, or at least
the error hasn't appeared there so far. With a bigger amount of data I