[
https://issues.apache.org/jira/browse/KYLIN-2995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16281254#comment-16281254
]
kangkaisen commented on KYLIN-2995:
-----------------------------------
Not about performance — it's a bug.
It's like the {{bindCurrentConfiguration}} method in {{KylinMapper}} and
{{KylinReducer}}: every MR job must call this method first, because we must
ensure we use {{context.getConfiguration()}} for HDFS access, not a default
Configuration. The same applies in Spark.
For example, if the following config exists in the Kylin server's mountTable.xml
but not in the DataNode's mountTable.xml, then when a Kylin Spark job visits
hdfs://XXXX/kylin, a {{FileNotFoundException}} will be thrown.
{code:xml}
<property>
  <name>fs.viewfs.mounttable.XXXX.link./kylin</name>
  <value>hdfs://XXXX/kylin</value>
</property>
{code}
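A minimal sketch of the intended fix, assuming Kylin's {{HadoopUtil.setCurrentConfiguration}} helper is the binding point (the class name {{SparkCubingConfFix}} and method {{bindHadoopConf}} are illustrative, not the actual patch):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.kylin.common.util.HadoopUtil;
import org.apache.spark.api.java.JavaSparkContext;

public class SparkCubingConfFix {
    // Propagate the Spark driver's Hadoop configuration to HadoopUtil so that
    // later HDFS access (e.g. viewfs mount-table resolution) uses the job's
    // configuration instead of a freshly created default Configuration.
    public static void bindHadoopConf(JavaSparkContext sc) {
        // sc.hadoopConfiguration() carries the cluster-side settings
        // (core-site.xml, mountTable.xml, ...) that the job was launched
        // with, unlike "new Configuration()".
        Configuration conf = sc.hadoopConfiguration();
        HadoopUtil.setCurrentConfiguration(conf);
    }
}
{code}
This mirrors what {{bindCurrentConfiguration}} does for {{KylinMapper}}/{{KylinReducer}} in the MR engine.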
> Set SparkContext.hadoopConfiguration to HadoopUtil in Spark Cubing
> ------------------------------------------------------------------
>
> Key: KYLIN-2995
> URL: https://issues.apache.org/jira/browse/KYLIN-2995
> Project: Kylin
> Issue Type: Bug
> Components: Spark Engine
> Affects Versions: v2.1.0
> Reporter: kangkaisen
> Assignee: kangkaisen
> Attachments: KYLIN-2995.patch
>
>
> Currently, we load metadata from HDFS in Spark Cubing via
> {{AbstractHadoopJob.loadKylinConfigFromHdfs}}, but HadoopUtil uses a new
> Configuration; we should use SparkContext.hadoopConfiguration instead.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)