@DroopyHoo, that is good to know since we are planning to change the
authentication, but it is not the cause of my error. After hongbin ma's
comments, I changed my kylin.properties to use the same settings as the
sandbox (which is my HDP 2.1 VM). So I am using LDAP auth for now.

@hongbin ma, kylin.job.run.as.remote.cmd was indeed set to true. I
changed it to false (the relevant kylin.properties lines are shown right
after the stack traces below), but I am still getting the same errors:

[pool-5-thread-2]:[2015-08-25 19:16:03,679][ERROR][org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:91)] - error check status
java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:198)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:579)

org.apache.kylin.job.exception.ExecuteException: java.lang.NullPointerException
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:110)
    at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
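For reference, the relevant lines in my conf/kylin.properties are now
roughly the following (the hostname and password are just placeholders,
and the remote CLI entries should only matter when the flag is true):

    kylin.job.run.as.remote.cmd=false
    # kylin.job.remote.cli.hostname=<sandbox-host>
    # kylin.job.remote.cli.username=root
    # kylin.job.remote.cli.password=<password>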

Just to summarize, this is my current environment:

- one machine with HDP 2.1 (on which I already ran Kylin 0.7.2 and it worked well);
- the HDP 2.1 machine is my cluster, and my own machine is the client;
- I did not change any configuration in my cluster;
- Kylin is on my client machine with the default configuration files;
- I downloaded Hadoop 2.4.0, HBase 0.98.0 and Hive 0.13.1 to use on my
client as the Hadoop CLI;
- in the templates of hadoop/core-site.xml, hbase/hbase-site.xml and
hive/hive-site.xml, I just added one or two properties to point to my
cluster IP (sketched right below this list). All these tools are
apparently OK, since I can access my cluster from them.
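
To be concrete, the kind of entries I mean are roughly these (the
cluster IP is a placeholder and the exact ports may differ from my
setup):

    <!-- hadoop/core-site.xml -->
    <property>
      <name>fs.defaultFS</name>
      <value>hdfs://<cluster-ip>:8020</value>
    </property>

    <!-- hbase/hbase-site.xml -->
    <property>
      <name>hbase.zookeeper.quorum</name>
      <value><cluster-ip></value>
    </property>

    <!-- hive/hive-site.xml -->
    <property>
      <name>hive.metastore.uris</name>
      <value>thrift://<cluster-ip>:9083</value>
    </property>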

Since my cluster is OK, I am wondering whether my problem is in the
Hadoop CLI configuration files on the client... What changes have you
made in those three configuration files? Did you also change
yarn-site.xml or hdfs-site.xml?
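
My guess is that the "Connection refused" happens when Kylin polls the
ResourceManager web UI over HTTP to check the MR job status, so maybe
my client's yarn-site.xml also needs something like this (a sketch
only; the IP and port are placeholders):

    <property>
      <name>yarn.resourcemanager.hostname</name>
      <value><cluster-ip></value>
    </property>
    <property>
      <name>yarn.resourcemanager.webapp.address</name>
      <value><cluster-ip>:8088</value>
    </property>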

HDP 2.1 makes the Kylin installation really easy, but since I am new to
Hadoop, I am facing these problems when setting up the client machine.


On Wed, Aug 26, 2015 at 5:40 AM, DroopyHoo <[email protected]> wrote:
> Hi Diego
>
> We met this error stack when deploying in our Hadoop environment (not
> the sandbox). The problem we met is that the function for checking MR
> job status does not support Kerberos auth (our Hadoop cluster uses the
> Kerberos service), so we made some changes to that part of the source
> code.
>
> I'm not sure whether this case will help you analyse the problem.
>
> On 2015/8/26 at 10:46 AM, Diego Pinheiro wrote:
>
>> Hi Bin Mahone,
>>
>> sorry for the late reply, and thank you for your support. I didn't
>> know about Kylin instances; it is really interesting.
>>
>> However, let me ask you something: I was setting up my Hadoop client
>> machine with Kylin to communicate with my sandbox, but things are not
>> working well.
>>
>> I have installed Hadoop 2.4.0, HBase 0.98.0 and Hive 0.13.1. All of
>> them are working and I can access my "remote server" from my client
>> machine (actually, I configured Kylin to use "sandbox" since all my
>> Hadoop CLI tools point to my sandbox). Then Kylin was built and
>> everything was OK until I tried to build the cube.
>>
>> I always get the following errors in the second step of the cube build:
>>
>> [pool-5-thread-2]:[2015-08-25
>>
>> 19:16:03,679][ERROR][org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:91)]
>> - error check status
>> java.net.ConnectException: Connection refused
>>      at java.net.PlainSocketImpl.socketConnect(Native Method)
>>      at
>> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
>>      at
>> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:198)
>>      at
>> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
>>      at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>>      at java.net.Socket.connect(Socket.java:579)
>>      at java.net.Socket.connect(Socket.java:528)
>>      at java.net.Socket.<init>(Socket.java:425)
>>      at java.net.Socket.<init>(Socket.java:280)
>>      at
>> org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:80)
>>      at
>> org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:122)
>>      at
>> org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
>>      at
>> org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
>>      at
>> org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
>>      at
>> org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
>>      at
>> org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
>>      at
>> org.apache.kylin.job.tools.HadoopStatusGetter.getHttpResponse(HadoopStatusGetter.java:78)
>>      at
>> org.apache.kylin.job.tools.HadoopStatusGetter.get(HadoopStatusGetter.java:55)
>>      at
>> org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:56)
>>      at
>> org.apache.kylin.job.common.MapReduceExecutable.doWork(MapReduceExecutable.java:136)
>>      at
>> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>>      at
>> org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
>>      at
>> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>>      at
>> org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:133)
>>      at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>      at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>      at java.lang.Thread.run(Thread.java:745)
>>
>> org.apache.kylin.job.exception.ExecuteException:
>> java.lang.NullPointerException
>>      at
>> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:110)
>>      at
>> org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
>>      at
>> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>>      at
>> org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:133)
>>      at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>      at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>      at java.lang.Thread.run(Thread.java:745)
>> Caused by: java.lang.NullPointerException
>>      at
>> org.apache.kylin.job.common.MapReduceExecutable.onExecuteStart(MapReduceExecutable.java:73)
>>      at
>> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:105)
>>      ... 6 more
>>
>> Do you have any thoughts about these errors? (detailed log is attached)
>>
>>
>> On Fri, Aug 21, 2015 at 3:21 AM, hongbin ma <[email protected]> wrote:
>>>
>>> the document is summarized at
>>> http://kylin.incubator.apache.org/docs/install/kylin_cluster.html
>>>
>>> On Fri, Aug 21, 2015 at 1:51 PM, hongbin ma <[email protected]> wrote:
>>>
>>>> hi Diego
>>>>
>>>> the config "kylin.job.run.as.remote.cmd" is somewhat ambiguous. It
>>>> should be enabled when you cannot run the Kylin server on the same
>>>> machine as your Hadoop CLI; for example, if you're starting Kylin
>>>> from your local IDE and your Hadoop CLI is a sandbox on another
>>>> machine, that is the "remote" case.
>>>>
>>>> In most production deployments we suggest using the "non-remote"
>>>> mode, that is, the Kylin instance is started on the Hadoop CLI
>>>> machine. This picture depicts the scenario:
>>>>
>>>> https://github.com/apache/incubator-kylin/blob/0.7/website/images/install/on_cli_install_scene.png
>>>>
>>>> Kylin instances are stateless; the runtime state is saved in the
>>>> "Metadata Store" in HBase (the kylin.metadata.url config in
>>>> conf/kylin.properties). For load-balancing considerations it is
>>>> possible to start multiple Kylin instances sharing the same metadata
>>>> store (and thus sharing the same state on table schemas, job status,
>>>> cube status, etc.).
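>>>>
>>>> For example (just an illustrative sketch; the actual table name
>>>> depends on your setup), all instances would point at the same HBase
>>>> metadata table in their conf/kylin.properties:
>>>>
>>>>     kylin.metadata.url=kylin_metadata@hbase
>>>>
>>>> where "kylin_metadata" is the HBase table backing the metadata store.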
>>>>
>>>> Each Kylin instance has a kylin.server.mode entry in
>>>> conf/kylin.properties specifying the runtime mode. It has three
>>>> options: 1. "job" for running the job engine only, 2. "query" for
>>>> running the query engine only, and 3. "all" for running both. Notice
>>>> that only one server can run the job engine ("all" mode or "job"
>>>> mode); the others must all be in "query" mode.
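>>>>
>>>> For instance (a sketch only), with two servers sharing the same
>>>> metadata store, the conf/kylin.properties on each could contain:
>>>>
>>>>     # server A: runs both the query engine and the job engine
>>>>     kylin.server.mode=all
>>>>
>>>>     # server B (and any further servers): query engine only
>>>>     kylin.server.mode=query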
>>>>
>>>> A typical scenario is depicted in the attachment chart.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Regards,
>>>>
>>>> *Bin Mahone | 马洪宾*
>>>> Apache Kylin: http://kylin.io
>>>> Github: https://github.com/binmahone
>>>>
>>>
>>>
>>> --
>>> Regards,
>>>
>>> *Bin Mahone | 马洪宾*
>>> Apache Kylin: http://kylin.io
>>> Github: https://github.com/binmahone
>
>
> --
> -------
> Wei Hu
>
