Richard Calaba created KYLIN-1515:
-------------------------------------
Summary: Cube Build - java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses
Key: KYLIN-1515
URL: https://issues.apache.org/jira/browse/KYLIN-1515
Project: Kylin
Issue Type: Bug
Components: Job Engine
Affects Versions: v1.5.0
Environment: MapR - Hadoop 2.5.1
Reporter: Richard Calaba
Assignee: Dong Li
Knowing that MapR is not officially supported, we were nevertheless able to use Kylin 1.2 on our MapR distro successfully.
After upgrading to Kylin 1.5.0 we are facing an issue with the Cube Build process -
the same one which worked on 1.2 without issues. The Cube is created from scratch
(no Kylin metadata migration) on a clean install of Kylin 1.5.0 (the HDFS directory
/kylin and the HBase tables KYLIN* and kylin* were deleted prior to the upgrade from
1.2 to 1.5.0).
The build process fails in Step 1, complaining about the property
"mapreduce.framework.name". According to this post
https://stackoverflow.com/questions/19642862/cannot-initialize-cluster-exception-while-running-job-on-hadoop-2
the solution should be to ensure the respective property is correctly set in
mapred-site.xml.
Originally in our MapR distro the property was commented out (with the value
yarn-tez). Even after adding an active property with the value "yarn", the build
still fails with the same exception. I am not sure what is wrong with our cluster
configuration - does anyone have an idea?
Below is our mapred-site.xml content:
==============================
cat /opt/mapr/hadoop/hadoop-2.5.1/etc/hadoop/mapred-site.xml
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>node1:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>node1:19888</value>
  </property>
  <!--
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn-tez</value>
  </property>
  -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
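As a sanity check on the file itself (not MapR-specific, and only a sketch - it inspects the XML on disk, not whatever config dir the Hive CLI actually picks up from its classpath), the snippet below mirrors the mapred-site.xml above and confirms that the commented-out yarn-tez block is invisible to any XML parser, so the active "yarn" value is the only one a Hadoop-style Configuration would see:

```python
# Minimal sketch: determine the effective value of a property in a
# Hadoop-style *-site.xml. Commented-out <property> blocks are XML
# comments and are never parsed; like Hadoop's Configuration, the last
# active entry for a key wins. The XML mirrors the mapred-site.xml above.
import xml.etree.ElementTree as ET

MAPRED_SITE = """<configuration>
  <!--
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn-tez</value>
  </property>
  -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>"""

def effective_value(xml_text, key):
    """Return the last active <value> for <name>key</name>."""
    result = None
    for prop in ET.fromstring(xml_text).iter("property"):
        if prop.findtext("name") == key:
            result = prop.findtext("value")
    return result

print(effective_value(MAPRED_SITE, "mapreduce.framework.name"))  # yarn
```

If this prints "yarn" but the job still fails, the file being edited may simply not be the one on the job client's classpath.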
Known workaround:
================
A known workaround that makes this error disappear is to delete the following
property section from conf/kylin_hive_conf.xml:
<property>
  <name>dfs.block.size</name>
  <value>32000000</value>
  <description>Want more mappers for in-mem cubing, thus smaller the DFS block size</description>
</property>
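The workaround can be scripted as below - a hedged sketch only: the file path follows the report's conf/kylin_hive_conf.xml, removing the property stops Kylin from emitting "SET dfs.block.size=..." into the generated Hive script, and editing the file by hand works just as well:

```python
# Sketch of the workaround: remove a named <property> block from a
# Hadoop/Kylin-style XML config file (path is an assumption from the
# report). Returns how many matching properties were removed.
import xml.etree.ElementTree as ET

def drop_property(conf_path, key):
    tree = ET.parse(conf_path)
    root = tree.getroot()
    removed = 0
    # Properties are direct children of <configuration>.
    for prop in list(root.findall("property")):
        if prop.findtext("name") == key:
            root.remove(prop)
            removed += 1
    tree.write(conf_path, encoding="utf-8", xml_declaration=True)
    return removed

# Example (commented out; back up the file first):
# drop_property("conf/kylin_hive_conf.xml", "dfs.block.size")
```

Note this removes the symptom (Hive no longer tries to apply the HDFS setting), not the underlying cluster-configuration cause.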
The full log output of Cube Build Step 1 is attached below:
==============================================
OS command error exit with 1 -- hive -e "USE default;
DROP TABLE IF EXISTS
kylin_intermediate_TestCube_clone2_19700101000000_2922789940817071255;
CREATE EXTERNAL TABLE IF NOT EXISTS
kylin_intermediate_TestCube_clone2_19700101000000_2922789940817071255
(
DEFAULT_BATTING_PLAYER_ID string
,DEFAULT_BATTING_YEAR int
,DEFAULT_BATTING_RUNS int
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\177'
STORED AS SEQUENCEFILE
LOCATION
'/kylin/kylin_metadata/kylin-3eb4b652-a2a4-4659-8b6a-dc822e1341fb/kylin_intermediate_TestCube_clone2_19700101000000_2922789940817071255';
SET dfs.replication=2;
SET dfs.block.size=32000000;
SET hive.exec.compress.output=true;
SET hive.auto.convert.join.noconditionaltask=true;
SET hive.auto.convert.join.noconditionaltask.size=300000000;
SET mapreduce.map.output.compress.codec=org.apache.hadoop.io.compress.SnappyCodec;
SET mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.SnappyCodec;
SET hive.merge.mapfiles=true;
SET hive.merge.mapredfiles=true;
SET mapred.output.compression.type=BLOCK;
SET hive.merge.size.per.task=256000000;
SET hive.support.concurrency=false;
SET mapreduce.job.split.metainfo.maxsize=-1;
INSERT OVERWRITE TABLE
kylin_intermediate_TestCube_clone2_19700101000000_2922789940817071255 SELECT
BATTING.PLAYER_ID
,BATTING.YEAR
,BATTING.RUNS
FROM DEFAULT.BATTING as BATTING
LEFT JOIN DEFAULT.TEMP_BATTING as TEMP_BATTING
ON BATTING.PLAYER_ID = TEMP_BATTING.COL_VALUE
;
"
Logging initialized using configuration in jar:file:/opt/mapr/hive/hive-1.0/lib/hive-common-1.0.0-mapr-1510.jar!/hive-log4j.properties
OK
Time taken: 0.611 seconds
OK
Time taken: 0.83 seconds
OK
Time taken: 0.474 seconds
Query ID = mapr_20160321201212_610078b4-5805-43eb-8fd1-87304530a84e
Total jobs = 3
2016-03-21 08:12:32 Starting to launch local task to process map join; maximum memory = 477102080
2016-03-21 08:12:32 Dump the side-table for tag: 1 with group count: 95196 into file: file:/tmp/mapr/b35c5ac2-3231-4ef1-9e6b-216c0a1bd9ef/hive_2016-03-21_20-12-31_085_8296009472449837835-1/-local-10003/HashTable-Stage-9/MapJoin-mapfile01--.hashtable
2016-03-21 08:12:32 Uploaded 1 File to: file:/tmp/mapr/b35c5ac2-3231-4ef1-9e6b-216c0a1bd9ef/hive_2016-03-21_20-12-31_085_8296009472449837835-1/-local-10003/HashTable-Stage-9/MapJoin-mapfile01--.hashtable (7961069 bytes)
2016-03-21 08:12:32 End of local task; Time Taken: 0.853 sec.
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
	at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:121)
	at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:83)
	at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:76)
	at org.apache.hadoop.mapred.JobClient.init(JobClient.java:470)
	at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:449)
	at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:399)
	at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:137)
	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
	at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1619)
	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1379)
	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1192)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1019)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1009)
	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:201)
	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:153)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:364)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:299)
	at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:662)
	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:631)
	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:570)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Job Submission failed with exception 'java.io.IOException(Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)