Thanks Lijun,

I found an answer that fixed my problem.  Enabling HDFS HA mode through Ambari 
reconfigured all of the core Hortonworks-packaged modules (including Hive) to 
work in HA mode.  As a result, Hive no longer accepts the 
hdfs://<hostname>:<port>/ syntax and expects the hdfs://<ha_cluster_name>/ 
format instead.

The issue was that Ambari 2.6 sets only “fs.defaultFS” in core-site.xml, 
pointing to my new HA cluster “hdfs://bdp01”.  Kylin, however, appears to read 
the older key “fs.default.name” (the name that Hadoop itself has deprecated in 
favor of “fs.defaultFS”).  So, I went into the Ambari HDFS advanced configs 
and, under the “Custom core-site” section, added a custom property 
“fs.default.name=hdfs://bdp01”, then rolled out the configs via Ambari.  Kylin 
was then able to pick up the updated core-site.xml file in 
${KYLIN_HOME}/Hadoop-conf/core-site.xml (a symbolic link to the 
Ambari-configured file).  Once this was done, Kylin was once again able to 
build cubes!
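
For anyone who wants to check their own core-site.xml for the same mismatch, here is a minimal Python sketch. It only parses the XML and reports whether the two default-filesystem keys are both set and agree. The function names and the sample XML are illustrative assumptions and are not part of Kylin, Ambari, or Hadoop.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample content; a real check would read the core-site.xml
# that Kylin actually picks up.
SAMPLE_CORE_SITE = """<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://bdp01</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://bdp01</value>
  </property>
</configuration>"""

# The two keys discussed in this thread.
FS_KEYS = ("fs.defaultFS", "fs.default.name")


def read_fs_settings(xml_text):
    """Return {key: value or None} for the two default-filesystem keys."""
    root = ET.fromstring(xml_text)
    found = {key: None for key in FS_KEYS}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name in found:
            found[name] = (prop.findtext("value") or "").strip()
    return found


def fs_keys_agree(xml_text):
    """True only when both keys are set and hold the same URI."""
    values = set(read_fs_settings(xml_text).values())
    return None not in values and len(values) == 1


if __name__ == "__main__":
    print(read_fs_settings(SAMPLE_CORE_SITE))
    print(fs_keys_agree(SAMPLE_CORE_SITE))
```

To run it against a real file, read the file's text and pass it to `fs_keys_agree`; a False result would suggest one of the keys is missing or points somewhere else.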

-Phil

From: Lijun Cao <641507...@qq.com>
Sent: Monday, November 12, 2018 6:46 PM
To: user@kylin.apache.org
Subject: Re: Kylin 2.3.1 failing cube build on HDFS High-Availability cluster

Hi Phil,

How is your HBase cluster deployed? Is it a standalone cluster?

Here is a blog post that covers the NameNode HA settings 
(http://kylin.apache.org/blog/2016/06/10/standalone-hbase-cluster/), although 
its scenario is HBase deployed as a standalone cluster.

See if it can help you.

Best Regards

Lijun Cao


On November 13, 2018, at 02:39, Phil Scott 
<phil.sc...@pricespider.com> wrote:

Folks,

My real question:  Are there any settings in kylin.properties, or in the 
hdfs-site.xml or hive-site.xml, that can clue Kylin into the required syntax 
for HA HDFS urls?

Background:

I have been running Kylin 2.3.1 for almost a year (very happily), on a Hortonworks 
HDP 2.6 cluster.  This weekend, my HDFS namenode had an issue and went down.  I 
decided to upgrade it to HDFS High Availability mode.
See: 
https://docs.hortonworks.com/HDPDocuments/Ambari-2.6.2.2/bk_ambari-operations/content/how_to_configure_namenode_high_availability.html
 for details.

My HDFS cluster is now operating in HA mode, and now my Kylin Cube Builds are 
failing on step 1.  They’ve been working fine up until this change.

Once in HA mode, HDFS clients are supposed to recognize from the hdfs-site.xml 
file that HA mode is enabled, and use a different syntax for HDFS urls.  For 
example, in the logs for cube-build step 1, Kylin is trying to tell Hive to 
create an external table and map its “location” to an HDFS location, using the 
old NameNode’s hostname directly, like this:

(*** CREATE EXTERNAL TABLE code snipped out above ***)

STORED AS SEQUENCEFILE
LOCATION 
'hdfs://pschd01.internaldomain.com:8020/kylin/kylin_metadata/kylin-51b56b1a-0f95-4825-ab13-d23a5ccb90ee/kylin_intermediate_ereputationv2_reviews_distinct_v2_prod_cube_453e6583_b7fb_4e62_8ffc_a330bb4e246f';

(*** ALTER TABLE command comes next ***)


In the above, the ‘hdfs://pschd01.internaldomain.com:8020/’ address is directly 
addressing the old HDFS NameNode.  This throws an error as follows:

Failed with exception Wrong FS: 
hdfs://pschd01.internaldomain.com:8020/kylin/kylin_metadata/kylin-51b56b1a-0f95-4825-ab13-d23a5ccb90ee/kylin_intermediate_ereputationv2_reviews_distinct_v2_prod_cube_453e6583_b7fb_4e62_8ffc_a330bb4e246f/.hive-staging_hive_2018-11-12_01-22-26_487_4731507377031334971-1/-ext-10000,
 expected: hdfs://bdp01
FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.MoveTask


So, Hive is complaining that it is expecting to see the new HA syntax which is: 
  hdfs://<ha_service_name>/  instead of hdfs://<namenode_host>:<namenode_port>/

It looks like Kylin is generating Hive statements that use the old NameNode 
host syntax, and needs to somehow be configured to use the new HDFS HA syntax.
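
To make the difference between the two URI forms concrete, here is a small Python sketch. It treats an explicit port as the sign of a direct NameNode address, which is only a heuristic (a bare hostname without a port would look like a nameservice). The function names are hypothetical, and this is not how Kylin or Hive builds its paths.

```python
from urllib.parse import urlsplit


def uses_fixed_namenode(uri):
    """Heuristic: True when an hdfs:// URI carries an explicit host:port."""
    parts = urlsplit(uri)
    if parts.scheme != "hdfs":
        raise ValueError(f"not an hdfs URI: {uri}")
    return parts.port is not None


def to_nameservice_uri(uri, nameservice):
    """Swap the authority of an hdfs:// URI for an HA nameservice, keeping the path."""
    parts = urlsplit(uri)
    if parts.scheme != "hdfs":
        raise ValueError(f"not an hdfs URI: {uri}")
    return f"hdfs://{nameservice}{parts.path}"


if __name__ == "__main__":
    old = "hdfs://pschd01.internaldomain.com:8020/kylin/kylin_metadata"
    print(uses_fixed_namenode(old))
    print(to_nameservice_uri(old, "bdp01"))
```

Here "bdp01" is the HA service name from the error message above; any other cluster would substitute its own nameservice.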

I appreciate any help!!!

-Phil
