Or you can change “kylin.job.hive.database.for.intermediatetable” back to “default” to bypass this issue.
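The one-line workaround above would look like this in conf/kylin.properties (a sketch; the property name is the one quoted above, and "default" is Hive's built-in default database):

```properties
# conf/kylin.properties
# Workaround: create the intermediate flat table in Hive's "default"
# database, so the job's DEFAULT.<table> lookup succeeds.
kylin.job.hive.database.for.intermediatetable=default
```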
On 9/1/15, 8:59 AM, "Shi, Shaofeng" <[email protected]> wrote:

>This should be related to: https://issues.apache.org/jira/browse/KYLIN-975
>
>The patch is available now; you can make a new build by pulling the
>0.7-staging branch and then running scripts/package.sh
>
>On 9/1/15, 7:46 AM, "Diego Pinheiro" <[email protected]> wrote:
>
>>After some changes, I am getting the following error:
>>
>>[pool-5-thread-2]:[2015-08-31 19:42:13,394][ERROR][org.apache.kylin.job.hadoop.cube.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:83)]
>>- error in FactDistinctColumnsJob
>>java.io.IOException: NoSuchObjectException(message:DEFAULT.kylin_intermediate_kylin_sales_cube_desc_20120101000000_20150727000000_a7572e8b_2c6a_4904_a19b_d74933e6d008 table not found)
>>    at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:97)
>>    at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:51)
>>    at org.apache.kylin.job.hadoop.cube.FactDistinctColumnsJob.setupMapper(FactDistinctColumnsJob.java:101)
>>    at org.apache.kylin.job.hadoop.cube.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:74)
>>    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>>    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>>    at org.apache.kylin.job.common.MapReduceExecutable.doWork(MapReduceExecutable.java:112)
>>    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
>>    at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
>>    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
>>    at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:132)
>>    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>    at java.lang.Thread.run(Thread.java:745)
>>Caused by: NoSuchObjectException(message:DEFAULT.kylin_intermediate_kylin_sales_cube_desc_20120101000000_20150727000000_a7572e8b_2c6a_4904_a19b_d74933e6d008 table not found)
>>    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1560)
>>    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>    at java.lang.reflect.Method.invoke(Method.java:606)
>>    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
>>    at com.sun.proxy.$Proxy45.get_table(Unknown Source)
>>    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:997)
>>    at org.apache.hive.hcatalog.common.HCatUtil.getTable(HCatUtil.java:191)
>>    at org.apache.hive.hcatalog.mapreduce.InitializeInput.getInputJobInfo(InitializeInput.java:105)
>>    at org.apache.hive.hcatalog.mapreduce.InitializeInput.setInput(InitializeInput.java:86)
>>    at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:95)
>>    ... 13 more
>>
>>Did you already face this?
>>
>>On Wed, Aug 26, 2015 at 11:20 PM, Diego Pinheiro <[email protected]> wrote:
>>> @DroopyHoo, it is good to know that, since we are planning to change
>>> authentication, but this is not the cause of my error. Following hongbin
>>> ma's comments, I changed my Kylin properties to use the same values as the
>>> sandbox (which is my VM with HDP 2.1). So I am using LDAP auth for now.
>>>
>>> @hongbin ma, kylin.job.run.as.remote.cmd was set to true indeed.
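The NoSuchObjectException above can be illustrated with a toy model (this is not Kylin or Hive code; all names below are hypothetical): the intermediate flat table is created under the database named by kylin.job.hive.database.for.intermediatetable, but the MR job asks the metastore for it under DEFAULT, so the lookup fails unless the two agree.

```python
# Toy model of a metastore database.table lookup; illustration only.
class TinyMetastore:
    def __init__(self):
        self.tables = {}

    def create_table(self, database, name):
        # Hive identifiers are case-insensitive, so normalize the key.
        self.tables[(database.lower(), name.lower())] = {"db": database, "name": name}

    def get_table(self, database, name):
        key = (database.lower(), name.lower())
        if key not in self.tables:
            # Mirrors "NoSuchObjectException(message:DB.table table not found)"
            raise KeyError(f"NoSuchObject: {database}.{name} table not found")
        return self.tables[key]

store = TinyMetastore()
# Flat table created in a custom database (hypothetical name)...
store.create_table("kylin_flat_db", "kylin_intermediate_example")

# ...but looked up under DEFAULT, as in the stack trace above: it fails.
try:
    store.get_table("DEFAULT", "kylin_intermediate_example")
    lookup_failed = False
except KeyError:
    lookup_failed = True

# When both sides agree on the database, the lookup succeeds.
table = store.get_table("kylin_flat_db", "kylin_intermediate_example")
```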
>>> I changed it to false, but I am still getting the same errors:
>>>
>>> [pool-5-thread-2]:[2015-08-25 19:16:03,679][ERROR][org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:91)]
>>> - error check status
>>> java.net.ConnectException: Connection refused
>>>     at java.net.PlainSocketImpl.socketConnect(Native Method)
>>>     at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
>>>     at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:198)
>>>     at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
>>>     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>>>     at java.net.Socket.connect(Socket.java:579)
>>>
>>> org.apache.kylin.job.exception.ExecuteException: java.lang.NullPointerException
>>>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:110)
>>>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
>>>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>>>
>>> To sum up, this is my current environment:
>>>
>>> - one machine with HDP 2.1 (on which I already ran Kylin 0.7.2, and it worked well);
>>> - HDP 2.1 is my cluster; my machine is the client;
>>> - I did not change any configuration in my cluster;
>>> - Kylin is on my client machine with the default configuration files;
>>> - I downloaded Hadoop 2.4.0, HBase 0.98.0 and Hive 0.13.1 to use on my client as a Hadoop CLI;
>>> - In the templates of these files: hadoop/core-site.xml, hbase/hbase-site.xml and hive/hive-site.xml, I just added one or two properties to set my cluster IP. All these tools are apparently OK, since I can access my cluster from them.
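The "one or two properties" on the client side might look like this sketch of hadoop/core-site.xml (the NameNode address below is a hypothetical placeholder, not taken from this thread; hbase-site.xml and hive-site.xml would point at the cluster in a similar way):

```xml
<!-- hadoop/core-site.xml on the client machine (sketch) -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <!-- hypothetical cluster NameNode address -->
    <value>hdfs://10.0.0.10:8020</value>
  </property>
</configuration>
```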
>>>
>>> Given that my cluster is OK, I was wondering whether my problem is in the
>>> Hadoop CLI configuration files on the client... what changes have you made
>>> in those three configuration files? Did you also change yarn-site.xml or
>>> hdfs-site.xml?
>>>
>>> HDP 2.1 makes Kylin installation really easy. But since I am new to Hadoop,
>>> I am facing these problems when setting up the client machine.
>>>
>>> On Wed, Aug 26, 2015 at 5:40 AM, DroopyHoo <[email protected]> wrote:
>>>> Hi Diego,
>>>>
>>>> We hit this error stack when deploying in our own Hadoop environment (not
>>>> the sandbox). The problem we had is that the function for checking MR job
>>>> status does not support Kerberos auth (our Hadoop cluster uses Kerberos),
>>>> so we made some changes to that part of the source code.
>>>>
>>>> I'm not sure whether this case helps you analyse the problem.
>>>>
>>>> On 15/8/26, 10:46 AM, Diego Pinheiro wrote:
>>>>
>>>>> Hi Bin Mahone,
>>>>>
>>>>> Sorry for the late reply, and thank you for your support. I didn't know
>>>>> about Kylin instances; that is really interesting.
>>>>>
>>>>> However, let me ask you: I was setting up my Hadoop client machine with
>>>>> Kylin to communicate with my sandbox, but things are not working well.
>>>>>
>>>>> I have installed Hadoop 2.4.0, HBase 0.98.0 and Hive 0.13.1. All of them
>>>>> are working, and I can access my "remote server" from my client machine
>>>>> (actually, I configured Kylin as for the sandbox, since all my Hadoop CLI
>>>>> tools point to my sandbox). Then Kylin was built and everything was OK
>>>>> until I tried to build the cube.
>>>>>
>>>>> I always get the following errors in the second step of the cube build:
>>>>>
>>>>> [pool-5-thread-2]:[2015-08-25 19:16:03,679][ERROR][org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:91)]
>>>>> - error check status
>>>>> java.net.ConnectException: Connection refused
>>>>>     at java.net.PlainSocketImpl.socketConnect(Native Method)
>>>>>     at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
>>>>>     at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:198)
>>>>>     at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
>>>>>     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>>>>>     at java.net.Socket.connect(Socket.java:579)
>>>>>     at java.net.Socket.connect(Socket.java:528)
>>>>>     at java.net.Socket.<init>(Socket.java:425)
>>>>>     at java.net.Socket.<init>(Socket.java:280)
>>>>>     at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:80)
>>>>>     at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:122)
>>>>>     at org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
>>>>>     at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
>>>>>     at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
>>>>>     at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
>>>>>     at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
>>>>>     at org.apache.kylin.job.tools.HadoopStatusGetter.getHttpResponse(HadoopStatusGetter.java:78)
>>>>>     at org.apache.kylin.job.tools.HadoopStatusGetter.get(HadoopStatusGetter.java:55)
>>>>>     at org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:56)
>>>>>     at org.apache.kylin.job.common.MapReduceExecutable.doWork(MapReduceExecutable.java:136)
>>>>>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>>>>>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
>>>>>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>>>>>     at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:133)
>>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>     at java.lang.Thread.run(Thread.java:745)
>>>>>
>>>>> org.apache.kylin.job.exception.ExecuteException: java.lang.NullPointerException
>>>>>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:110)
>>>>>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
>>>>>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>>>>>     at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:133)
>>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>     at java.lang.Thread.run(Thread.java:745)
>>>>> Caused by: java.lang.NullPointerException
>>>>>     at org.apache.kylin.job.common.MapReduceExecutable.onExecuteStart(MapReduceExecutable.java:73)
>>>>>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:105)
>>>>>     ... 6 more
>>>>>
>>>>> Do you have any thoughts on these errors? (A detailed log is attached.)
>>>>>
>>>>> On Fri, Aug 21, 2015 at 3:21 AM, hongbin ma <[email protected]> wrote:
>>>>>>
>>>>>> The documentation is summarized at
>>>>>> http://kylin.incubator.apache.org/docs/install/kylin_cluster.html
>>>>>>
>>>>>> On Fri, Aug 21, 2015 at 1:51 PM, hongbin ma <[email protected]> wrote:
>>>>>>
>>>>>>> Hi Diego,
>>>>>>>
>>>>>>> The config "kylin.job.run.as.remote.cmd" is somewhat ambiguous: it is
>>>>>>> enabled when you cannot run the Kylin server on the same machine as your
>>>>>>> Hadoop CLI. For example, if you're starting Kylin from your local IDE and
>>>>>>> your Hadoop CLI is a sandbox on another machine, that is the "remote" case.
>>>>>>>
>>>>>>> In most production deployments we suggest the "non-remote" mode, that
>>>>>>> is, the Kylin instance is started on the Hadoop CLI. This picture depicts
>>>>>>> the scenario:
>>>>>>>
>>>>>>> https://github.com/apache/incubator-kylin/blob/0.7/website/images/install/on_cli_install_scene.png
>>>>>>>
>>>>>>> Kylin instances are stateless; the runtime state is saved in the
>>>>>>> "Metadata Store" in HBase (the kylin.metadata.url config in
>>>>>>> conf/kylin.properties). For load-balancing reasons it is possible to
>>>>>>> start multiple Kylin instances sharing the same metadata store (thus
>>>>>>> sharing the same state on table schemas, job status, cube status, etc.)
>>>>>>>
>>>>>>> Each Kylin instance has a kylin.server.mode entry in
>>>>>>> conf/kylin.properties specifying its runtime mode, which has three
>>>>>>> options: 1. "job" for running the job engine only; 2. "query" for running
>>>>>>> the query engine only; and 3. "all" for running both. Note that only one
>>>>>>> server can run the job engine ("all" mode or "job" mode); the others must
>>>>>>> all be in "query" mode.
>>>>>>>
>>>>>>> A typical scenario is depicted in the attached chart.
>>>>>>>
>>>>>>> --
>>>>>>> Regards,
>>>>>>>
>>>>>>> *Bin Mahone | 马洪宾*
>>>>>>> Apache Kylin: http://kylin.io
>>>>>>> Github: https://github.com/binmahone
>>>>>>
>>>>>> --
>>>>>> Regards,
>>>>>>
>>>>>> *Bin Mahone | 马洪宾*
>>>>>> Apache Kylin: http://kylin.io
>>>>>> Github: https://github.com/binmahone
>>>>
>>>> --
>>>> -------
>>>> Wei Hu
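The layout described in the thread (one shared metadata store, a single job engine) can be sketched as conf/kylin.properties fragments; the metadata URL format shown is the 0.7-era default as far as I can tell, and should be adjusted to your environment:

```properties
# Instance A: runs both engines ("all").
# Only one instance in the cluster may run the job engine.
kylin.metadata.url=kylin_metadata@hbase
kylin.server.mode=all

# Instances B..N: query engines only, sharing the same metadata store.
# kylin.metadata.url=kylin_metadata@hbase
# kylin.server.mode=query
```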
