After some changes, I am getting the following error:
[pool-5-thread-2]:[2015-08-31 19:42:13,394][ERROR][org.apache.kylin.job.hadoop.cube.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:83)] - error in FactDistinctColumnsJob
java.io.IOException: NoSuchObjectException(message:DEFAULT.kylin_intermediate_kylin_sales_cube_desc_20120101000000_20150727000000_a7572e8b_2c6a_4904_a19b_d74933e6d008 table not found)
    at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:97)
    at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:51)
    at org.apache.kylin.job.hadoop.cube.FactDistinctColumnsJob.setupMapper(FactDistinctColumnsJob.java:101)
    at org.apache.kylin.job.hadoop.cube.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:74)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at org.apache.kylin.job.common.MapReduceExecutable.doWork(MapReduceExecutable.java:112)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
    at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
    at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:132)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: NoSuchObjectException(message:DEFAULT.kylin_intermediate_kylin_sales_cube_desc_20120101000000_20150727000000_a7572e8b_2c6a_4904_a19b_d74933e6d008 table not found)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1560)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
    at com.sun.proxy.$Proxy45.get_table(Unknown Source)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:997)
    at org.apache.hive.hcatalog.common.HCatUtil.getTable(HCatUtil.java:191)
    at org.apache.hive.hcatalog.mapreduce.InitializeInput.getInputJobInfo(InitializeInput.java:105)
    at org.apache.hive.hcatalog.mapreduce.InitializeInput.setInput(InitializeInput.java:86)
    at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:95)
    ... 13 more
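
As a sanity check (assuming the hive CLI on my client talks to the same
metastore the job uses), a command like the following should show whether
the intermediate table actually exists:

    hive -e "USE default; SHOW TABLES LIKE 'kylin_intermediate*';"
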
Have you faced this issue before?
On Wed, Aug 26, 2015 at 11:20 PM, Diego Pinheiro
<[email protected]> wrote:
> @DroopyHoo, that is good to know, since we are planning to change
> authentication, but it is not the cause of my error. Following hongbin
> ma's comments, I changed my Kylin properties to the same values as the
> sandbox (which is my HDP 2.1 VM). So I am using LDAP auth for now.
>
> @hongbin ma, kylin.job.run.as.remote.cmd was indeed set to true. I
> changed it to false, but I am still getting the same errors:
>
> [pool-5-thread-2]:[2015-08-25 19:16:03,679][ERROR][org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:91)] - error check status
> java.net.ConnectException: Connection refused
>     at java.net.PlainSocketImpl.socketConnect(Native Method)
>     at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
>     at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:198)
>     at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
>     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>     at java.net.Socket.connect(Socket.java:579)
>
> org.apache.kylin.job.exception.ExecuteException: java.lang.NullPointerException
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:110)
>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>
> Just to summarize, this is my current environment:
>
> - one machine with HDP 2.1 (on which I already ran Kylin 0.7.2, and it
> worked well);
> - HDP 2.1 is my cluster; my machine is the client;
> - I did not change any configuration in my cluster;
> - Kylin is on my client machine with the default configuration files;
> - I downloaded Hadoop 2.4.0, HBase 0.98.0 and Hive 0.13.1 to use on my
> client as a Hadoop CLI;
> - In the templates of these files (hadoop/core-site.xml,
> hbase/hbase-site.xml and hive/hive-site.xml) I just added one or two
> properties pointing at my cluster IP, as sketched below. All these tools
> are apparently OK, since I can access my cluster from them.
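>
> For reference, the kind of change I mean looks roughly like this (a
> sketch; CLUSTER-IP stands in for my real cluster address):
>
>     <!-- hadoop core-site.xml: point HDFS clients at the cluster -->
>     <property>
>       <name>fs.defaultFS</name>
>       <value>hdfs://CLUSTER-IP:8020</value>
>     </property>
>
>     <!-- hbase-site.xml: point HBase clients at the cluster ZooKeeper -->
>     <property>
>       <name>hbase.zookeeper.quorum</name>
>       <value>CLUSTER-IP</value>
>     </property>
>
>     <!-- hive-site.xml: point Hive clients at the remote metastore -->
>     <property>
>       <name>hive.metastore.uris</name>
>       <value>thrift://CLUSTER-IP:9083</value>
>     </property>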
>
> Since my cluster is OK, I am wondering whether my problem is in the
> Hadoop CLI configuration files on the client... What changes have you
> made in those three configuration files? Did you also change
> yarn-site.xml or hdfs-site.xml?
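>
> To rule out basic connectivity, a check like the following from the
> Kylin machine should reach the YARN ResourceManager REST API (the
> "error check status" trace shows Kylin polling job status over HTTP;
> 8088 is the usual RM web port, and CLUSTER-IP stands in for my cluster
> address, so these are assumptions, not my exact values):
>
>     curl http://CLUSTER-IP:8088/ws/v1/cluster/info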
>
> HDP 2.1 makes Kylin installation really easy. But, since I am new to
> Hadoop, I am facing these problems while setting up the client machine.
>
>
> On Wed, Aug 26, 2015 at 5:40 AM, DroopyHoo <[email protected]> wrote:
>> Hi Diego
>>
>> We hit this error stack when deploying in our own Hadoop environment
>> (not the sandbox). The problem we met is that the function that checks
>> MR job status does not support Kerberos auth (our Hadoop cluster uses
>> Kerberos). So we made some changes to that part of the source code.
>>
>> I'm not sure whether this case will help you analyse the problem.
>>
>> On 15/8/26 at 10:46 AM, Diego Pinheiro wrote:
>>
>>> Hi Bin Mahone,
>>>
>>> sorry for the late reply. Thank you for your support. I didn't know
>>> about Kylin instances. It is really interesting.
>>>
>>> However, let me ask you something: I was setting up my Hadoop client
>>> machine with Kylin to communicate with my sandbox, but things are not
>>> working well.
>>>
>>> I have installed Hadoop 2.4.0, HBase 0.98.0 and Hive 0.13.1. All of
>>> them are working, and I can access my "remote server" from my client
>>> machine (actually, I configured Kylin as if for the sandbox, since all
>>> my Hadoop CLI tools point to my sandbox). Then Kylin was built and
>>> everything was OK until I tried to build the cube.
>>>
>>> I always get the following errors in the second step of the cube build:
>>>
>>> [pool-5-thread-2]:[2015-08-25 19:16:03,679][ERROR][org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:91)] - error check status
>>> java.net.ConnectException: Connection refused
>>>     at java.net.PlainSocketImpl.socketConnect(Native Method)
>>>     at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
>>>     at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:198)
>>>     at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
>>>     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>>>     at java.net.Socket.connect(Socket.java:579)
>>>     at java.net.Socket.connect(Socket.java:528)
>>>     at java.net.Socket.<init>(Socket.java:425)
>>>     at java.net.Socket.<init>(Socket.java:280)
>>>     at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:80)
>>>     at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:122)
>>>     at org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
>>>     at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
>>>     at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
>>>     at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
>>>     at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
>>>     at org.apache.kylin.job.tools.HadoopStatusGetter.getHttpResponse(HadoopStatusGetter.java:78)
>>>     at org.apache.kylin.job.tools.HadoopStatusGetter.get(HadoopStatusGetter.java:55)
>>>     at org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:56)
>>>     at org.apache.kylin.job.common.MapReduceExecutable.doWork(MapReduceExecutable.java:136)
>>>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>>>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
>>>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>>>     at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:133)
>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>     at java.lang.Thread.run(Thread.java:745)
>>>
>>> org.apache.kylin.job.exception.ExecuteException: java.lang.NullPointerException
>>>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:110)
>>>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
>>>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>>>     at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:133)
>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>     at java.lang.Thread.run(Thread.java:745)
>>> Caused by: java.lang.NullPointerException
>>>     at org.apache.kylin.job.common.MapReduceExecutable.onExecuteStart(MapReduceExecutable.java:73)
>>>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:105)
>>>     ... 6 more
>>>
>>> Do you have any thoughts about these errors? (detailed log is attached)
>>>
>>>
>>> On Fri, Aug 21, 2015 at 3:21 AM, hongbin ma <[email protected]> wrote:
>>>>
>>>> the document is summarized at
>>>> http://kylin.incubator.apache.org/docs/install/kylin_cluster.html
>>>>
>>>> On Fri, Aug 21, 2015 at 1:51 PM, hongbin ma <[email protected]> wrote:
>>>>
>>>>> hi Diego
>>>>>
>>>>> the config "kylin.job.run.as.remote.cmd" is somewhat ambiguous. It is
>>>>> enabled when you cannot run the Kylin server on the same machine as
>>>>> your hadoop CLI; for example, if you're starting Kylin from your local
>>>>> IDE and your hadoop CLI is a sandbox on another machine, this is the
>>>>> "remote" case.
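>>>>>
>>>>> Roughly, the relevant entries in conf/kylin.properties for the
>>>>> "remote" case look like this (a sketch; the hostname and credentials
>>>>> are placeholders, check your own template for the exact names):
>>>>>
>>>>>     # run hadoop commands on a remote CLI machine over ssh
>>>>>     kylin.job.run.as.remote.cmd=true
>>>>>     kylin.job.remote.cli.hostname=sandbox
>>>>>     kylin.job.remote.cli.username=root
>>>>>     kylin.job.remote.cli.password=...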
>>>>>
>>>>> In most production deployments we suggest using the "non-remote"
>>>>> mode, that is, the Kylin instance is started on the hadoop CLI
>>>>> machine. This picture depicts the scenario:
>>>>>
>>>>> https://github.com/apache/incubator-kylin/blob/0.7/website/images/install/on_cli_install_scene.png
>>>>>
>>>>> Kylin instances are stateless; the runtime state is saved in the
>>>>> "Metadata Store" in hbase (the kylin.metadata.url config in
>>>>> conf/kylin.properties). For load balancing it is possible to start
>>>>> multiple Kylin instances sharing the same metadata store (thus
>>>>> sharing the same state on table schemas, job status, cube status,
>>>>> etc.).
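>>>>>
>>>>> e.g., a sketch of the shared setting (kylin_metadata@hbase is the
>>>>> usual default in the template):
>>>>>
>>>>>     # every instance points at the same HBase-backed metadata store
>>>>>     kylin.metadata.url=kylin_metadata@hbase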
>>>>>
>>>>> Each of the Kylin instances has a kylin.server.mode entry in
>>>>> conf/kylin.properties specifying the runtime mode. It has three
>>>>> options: 1. "job" for running the job engine only, 2. "query" for
>>>>> running the query engine only, and 3. "all" for running both. Notice
>>>>> that only one server can run the job engine ("all" mode or "job"
>>>>> mode); the others must all be in "query" mode.
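>>>>>
>>>>> For example, with three instances sharing one metadata store (a
>>>>> sketch of each instance's conf/kylin.properties):
>>>>>
>>>>>     # instance 1 - the only one allowed to run the job engine
>>>>>     kylin.server.mode=all
>>>>>     # instances 2 and 3 - query engine only
>>>>>     kylin.server.mode=query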
>>>>>
>>>>> A typical scenario is depicted in the attached chart.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Regards,
>>>>>
>>>>> *Bin Mahone | 马洪宾*
>>>>> Apache Kylin: http://kylin.io
>>>>> Github: https://github.com/binmahone
>>>>>
>>>>
>>>>
>>>> --
>>>> Regards,
>>>>
>>>> *Bin Mahone | 马洪宾*
>>>> Apache Kylin: http://kylin.io
>>>> Github: https://github.com/binmahone
>>
>>
>> --
>> -------
>> Wei Hu
>>