Should be related to: https://issues.apache.org/jira/browse/KYLIN-975
The patch is available now; you can make a new build by pulling the 0.7-staging branch and then running scripts/package.sh.

On 9/1/15, 7:46 AM, "Diego Pinheiro" <[email protected]> wrote:

>After some changes, I am getting the following error:
>
>[pool-5-thread-2]:[2015-08-31 19:42:13,394][ERROR][org.apache.kylin.job.hadoop.cube.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:83)]
>- error in FactDistinctColumnsJob
>java.io.IOException: NoSuchObjectException(message:DEFAULT.kylin_intermediate_kylin_sales_cube_desc_20120101000000_20150727000000_a7572e8b_2c6a_4904_a19b_d74933e6d008 table not found)
>    at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:97)
>    at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:51)
>    at org.apache.kylin.job.hadoop.cube.FactDistinctColumnsJob.setupMapper(FactDistinctColumnsJob.java:101)
>    at org.apache.kylin.job.hadoop.cube.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:74)
>    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>    at org.apache.kylin.job.common.MapReduceExecutable.doWork(MapReduceExecutable.java:112)
>    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
>    at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
>    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
>    at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:132)
>    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>    at java.lang.Thread.run(Thread.java:745)
>Caused by: NoSuchObjectException(message:DEFAULT.kylin_intermediate_kylin_sales_cube_desc_20120101000000_20150727000000_a7572e8b_2c6a_4904_a19b_d74933e6d008
>table not found)
>    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1560)
>    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>    at java.lang.reflect.Method.invoke(Method.java:606)
>    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
>    at com.sun.proxy.$Proxy45.get_table(Unknown Source)
>    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:997)
>    at org.apache.hive.hcatalog.common.HCatUtil.getTable(HCatUtil.java:191)
>    at org.apache.hive.hcatalog.mapreduce.InitializeInput.getInputJobInfo(InitializeInput.java:105)
>    at org.apache.hive.hcatalog.mapreduce.InitializeInput.setInput(InitializeInput.java:86)
>    at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:95)
>    ... 13 more
>
>Have you already faced this?
>
>
>On Wed, Aug 26, 2015 at 11:20 PM, Diego Pinheiro <[email protected]> wrote:
>> @DroopyHoo, that is good to know, since we are planning to change
>> authentication, but this is not the cause of my error. Following hongbin
>> ma's comments, I changed my Kylin properties to use the same values as the sandbox
>> (which is my VM, HDP 2.1). So I am using LDAP auth for now.
>>
>> @hongbin ma, kylin.job.run.as.remote.cmd was indeed set to true.
>> I changed it to false, but I am still getting the same errors:
>>
>> [pool-5-thread-2]:[2015-08-25 19:16:03,679][ERROR][org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:91)]
>> - error check status
>> java.net.ConnectException: Connection refused
>>     at java.net.PlainSocketImpl.socketConnect(Native Method)
>>     at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
>>     at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:198)
>>     at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
>>     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>>     at java.net.Socket.connect(Socket.java:579)
>>
>> org.apache.kylin.job.exception.ExecuteException: java.lang.NullPointerException
>>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:110)
>>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
>>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>>
>> Just to summarize, this is my current environment:
>>
>> - one machine with HDP 2.1 (on which I already ran Kylin 0.7.2 and it worked well);
>> - HDP 2.1 is my cluster; my machine is the client;
>> - I did not change any configuration in my cluster;
>> - Kylin is on my client machine with the default configuration files;
>> - I downloaded Hadoop 2.4.0, HBase 0.98.0 and Hive 0.13.1 to use on my client as a Hadoop CLI;
>> - In the templates of these files: hadoop/core-site.xml, hbase/hbase-site.xml and hive/hive-site.xml, I just added one or two properties to set my cluster IP. All these tools are apparently OK, since I can access my cluster from them.
>>
>> Given that my cluster is OK, I was wondering whether my problem is in the Hadoop
>> CLI configuration files on the client... what changes have you made
>> in those three configuration files?
>> Did you also change yarn-site.xml or hdfs-site.xml?
>>
>> HDP 2.1 makes Kylin installation really easy. But since I am new to
>> Hadoop, I am facing these problems when setting up the client machine.
>>
>>
>> On Wed, Aug 26, 2015 at 5:40 AM, DroopyHoo <[email protected]> wrote:
>>> Hi Diego,
>>>
>>> We met this error stack when deploying in our Hadoop environment (not
>>> the sandbox). The problem we met is that the function for checking MR job
>>> status does not support Kerberos auth (our Hadoop cluster uses the Kerberos service).
>>> So we made some changes to this part of the source code.
>>>
>>> I'm not sure whether this case could help you analyse the problem.
>>>
>>> On 8/26/15, 10:46 AM, Diego Pinheiro wrote:
>>>
>>>> Hi Bin Mahone,
>>>>
>>>> Sorry for the late reply, and thank you for your support. I didn't know
>>>> about Kylin instances. It is really interesting.
>>>>
>>>> However, let me ask you: I was setting up my Hadoop client machine
>>>> with Kylin to communicate with my sandbox, but things are not working
>>>> well.
>>>>
>>>> I have installed Hadoop 2.4.0, HBase 0.98.0 and Hive 0.13.1. All of them
>>>> are working, and I can access my "remote server" from my client machine
>>>> (actually, I configured Kylin as for the sandbox, since all my Hadoop CLI tools point
>>>> to my sandbox). Then Kylin was built and everything was OK until I
>>>> tried to build the cube.
>>>>
>>>> I got the following errors, always in the second step of the cube build:
>>>>
>>>> [pool-5-thread-2]:[2015-08-25 19:16:03,679][ERROR][org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:91)]
>>>> - error check status
>>>> java.net.ConnectException: Connection refused
>>>>     at java.net.PlainSocketImpl.socketConnect(Native Method)
>>>>     at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
>>>>     at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:198)
>>>>     at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
>>>>     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>>>>     at java.net.Socket.connect(Socket.java:579)
>>>>     at java.net.Socket.connect(Socket.java:528)
>>>>     at java.net.Socket.<init>(Socket.java:425)
>>>>     at java.net.Socket.<init>(Socket.java:280)
>>>>     at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:80)
>>>>     at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:122)
>>>>     at org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
>>>>     at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
>>>>     at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
>>>>     at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
>>>>     at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
>>>>     at org.apache.kylin.job.tools.HadoopStatusGetter.getHttpResponse(HadoopStatusGetter.java:78)
>>>>     at org.apache.kylin.job.tools.HadoopStatusGetter.get(HadoopStatusGetter.java:55)
>>>>     at
>>>> org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:56)
>>>>     at org.apache.kylin.job.common.MapReduceExecutable.doWork(MapReduceExecutable.java:136)
>>>>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>>>>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
>>>>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>>>>     at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:133)
>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>     at java.lang.Thread.run(Thread.java:745)
>>>>
>>>> org.apache.kylin.job.exception.ExecuteException: java.lang.NullPointerException
>>>>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:110)
>>>>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
>>>>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>>>>     at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:133)
>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>     at java.lang.Thread.run(Thread.java:745)
>>>> Caused by: java.lang.NullPointerException
>>>>     at org.apache.kylin.job.common.MapReduceExecutable.onExecuteStart(MapReduceExecutable.java:73)
>>>>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:105)
>>>>     ...
>>>> 6 more
>>>>
>>>> Do you have any thoughts about these errors? (A detailed log is attached.)
>>>>
>>>>
>>>> On Fri, Aug 21, 2015 at 3:21 AM, hongbin ma <[email protected]> wrote:
>>>>>
>>>>> The documentation is summarized at
>>>>> http://kylin.incubator.apache.org/docs/install/kylin_cluster.html
>>>>>
>>>>> On Fri, Aug 21, 2015 at 1:51 PM, hongbin ma <[email protected]> wrote:
>>>>>
>>>>>> Hi Diego,
>>>>>>
>>>>>> The config "kylin.job.run.as.remote.cmd" is somewhat ambiguous. It is
>>>>>> enabled when you cannot run the Kylin server on the same machine as your Hadoop
>>>>>> CLI; for example, if you're starting Kylin from your local IDE and your
>>>>>> Hadoop CLI is a sandbox on another machine, that is the "remote" case.
>>>>>>
>>>>>> In most production deployments we suggest using "non-remote" mode,
>>>>>> that is, the Kylin instance is started on the Hadoop CLI. This picture depicts
>>>>>> the scenario:
>>>>>>
>>>>>> https://github.com/apache/incubator-kylin/blob/0.7/website/images/install/on_cli_install_scene.png
>>>>>>
>>>>>> Kylin instances are stateless; the runtime state is saved in the
>>>>>> "Metadata Store" in HBase (the kylin.metadata.url config in
>>>>>> conf/kylin.properties). For load-balancing purposes it is possible to
>>>>>> start multiple Kylin instances sharing the same metadata store (thus
>>>>>> sharing the same state on table schemas, job status, cube status, etc.).
>>>>>>
>>>>>> Each Kylin instance has a kylin.server.mode entry in
>>>>>> conf/kylin.properties specifying the runtime mode. It has three options:
>>>>>> 1. "job" for running the job engine only; 2. "query" for running the query engine
>>>>>> only; and 3. "all" for running both. Notice that only one server can run the
>>>>>> job engine ("all" mode or "job" mode); the others must all be in "query" mode.
>>>>>>
>>>>>> A typical scenario is depicted in the attached chart.
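[Editor's note: hongbin ma's multi-instance setup above can be sketched as conf/kylin.properties fragments. This is illustrative only; the metadata table name "kylin_metadata" is Kylin's historical default, and your deployment may differ.]

```properties
# Instance A: runs the job engine (only ONE instance may be "job" or "all")
kylin.server.mode=all
# All instances must point at the SAME HBase metadata store so they share
# table schemas, job status and cube status
kylin.metadata.url=kylin_metadata@hbase

# Instances B..N: query engine only, same metadata store
kylin.server.mode=query
kylin.metadata.url=kylin_metadata@hbase
```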
>>>>>>
>>>>>> --
>>>>>> Regards,
>>>>>>
>>>>>> *Bin Mahone | 马洪宾*
>>>>>> Apache Kylin: http://kylin.io
>>>>>> Github: https://github.com/binmahone
>>>>>
>>>>>
>>>>> --
>>>>> Regards,
>>>>>
>>>>> *Bin Mahone | 马洪宾*
>>>>> Apache Kylin: http://kylin.io
>>>>> Github: https://github.com/binmahone
>>>
>>>
>>> --
>>> -------
>>> Wei Hu
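[Editor's note: for readers hitting the two stacks quoted above ("Connection refused" from HadoopStatusChecker, and NoSuchObjectException from the Hive metastore), a rough first check is whether the client can reach the services involved at all. This is a hedged diagnostic sketch, not a fix: "sandbox" stands in for your cluster host, 8088 is the default YARN ResourceManager web port, and 9083 is the default Hive metastore thrift port; substitute your own values.]

```shell
#!/bin/sh
# Hedged connectivity checks for the two errors in this thread.
# "sandbox" and the ports below are assumptions; adjust to your cluster.

# Small helper: report whether a TCP port is reachable from this machine.
check_port() {
  if nc -z -w 2 "$1" "$2" 2>/dev/null; then echo open; else echo closed; fi
}

# 8088: default YARN ResourceManager web/REST port that Kylin's
# HadoopStatusChecker polls. "Connection refused" suggests this is closed
# from the client, or Kylin is configured with the wrong host.
check_port sandbox 8088

# 9083: default Hive metastore port. NoSuchObjectException can also mean
# the client is talking to a different (empty) metastore than the cluster's.
check_port sandbox 9083

# If both are open, verify the intermediate table exists in the metastore
# the client sees (copy the exact table name from the error log), e.g.:
#   hive -e "USE default; SHOW TABLES LIKE 'kylin_intermediate_*';"
```

If the ResourceManager check comes back closed while the cluster itself is healthy, the suspect is the client-side *-site.xml files Diego asks about, rather than Kylin itself.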
