[ 
https://issues.apache.org/jira/browse/KYLIN-3035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16250699#comment-16250699
 ] 

Shaofeng SHI commented on KYLIN-3035:
-------------------------------------

Hi Shawn, please join the Kylin community through the mailing list:
https://kylin.apache.org/community/
We can discuss issues there. JIRA is for bug and task tracking; it is not
recommended for Q&A.

> How to use Kylin on EMR with S3 as hbase storage
> ------------------------------------------------
>
>                 Key: KYLIN-3035
>                 URL: https://issues.apache.org/jira/browse/KYLIN-3035
>             Project: Kylin
>          Issue Type: Bug
>          Components: Metadata
>    Affects Versions: v2.2.0
>         Environment: EMR 5.5.0, Kylin 2.2.0
>            Reporter: Shawn Wang
>            Assignee: Shaofeng SHI
>
> Can somebody give an example of how to use Kylin on EMR with S3 as HBase 
> storage, in a way that supports reusing a previously built cube on a new EMR 
> cluster after the original cluster has been terminated?
> My purpose is simple:
> 1. use a transient EMR cluster to build cubes
> 2. use a persistent cluster to handle query requests
> Of course, the clusters should share the same HBase storage, so I set up the 
> cluster to use S3 as HBase storage. After 2.2.0 fixed the "HFile not written 
> to S3" issue, I built a sample cube successfully, using these configurations:
> EMR:
> {noformat}
> [
>       {
>               "Classification": "hbase-site",
>               "Properties": {
>                       "hbase.rootdir": "s3://kylin-emrfs/hbase-production"
>               }
>       },
>       {
>               "Classification": "hbase",
>               "Properties": {
>                       "hbase.emr.storageMode": "s3"
>               }
>       },
>       {
>               "Classification": "emrfs-site",
>               "Properties": {
>                       "fs.s3.consistent": "true",
>                       "fs.s3.consistent.metadata.tableName": "KylinEmrFSMetadata"
>               }
>       }
> ]
> {noformat}
> kylin.properties:
> {noformat}
> kylin.env.hdfs-working-dir=s3://kylin-emrfs/kylin-working-dir
> kylin.server.mode=all
> {noformat}
> Then I created a new cluster with the same EMR configuration and Kylin in 
> query mode, but Kylin fails to start up, with these errors:
> {noformat}
> 2017-11-13 07:33:44,415 INFO  
> [main-SendThread(ip-172-31-1-10.cn-north-1.compute.internal:2181)] 
> zookeeper.ClientCnxn:876 : Socket connection established to 
> ip-172-31-1-10.cn-north-1.compute.internal/172.31.1.10:2181, initiating 
> session
> 2017-11-13 07:33:44,422 INFO  
> [main-SendThread(ip-172-31-1-10.cn-north-1.compute.internal:2181)] 
> zookeeper.ClientCnxn:1299 : Session establishment complete on server 
> ip-172-31-1-10.cn-north-1.compute.internal/172.31.1.10:2181, sessionid = 
> 0x15fb4173c100156, negotiated timeout = 40000
> 2017-11-13 07:33:48,380 DEBUG [main] hbase.HBaseConnection:279 : HTable 
> 'kylin_metadata' already exists
> Exception in thread "main" java.lang.IllegalArgumentException: Failed to find 
> metadata store by url: kylin_metadata@hbase
>       at 
> org.apache.kylin.common.persistence.ResourceStore.createResourceStore(ResourceStore.java:89)
>       at 
> org.apache.kylin.common.persistence.ResourceStore.getStore(ResourceStore.java:101)
>       at 
> org.apache.kylin.rest.service.AclTableMigrationTool.checkIfNeedMigrate(AclTableMigrationTool.java:94)
>       at 
> org.apache.kylin.tool.AclTableMigrationCLI.main(AclTableMigrationCLI.java:41)
> Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed 
> after attempts=1, exceptions:
> Mon Nov 13 07:33:48 UTC 2017, 
> RpcRetryingCaller{globalStartTime=1510558428667, pause=100, retries=1}, 
> java.net.ConnectException: Connection refused
>       at 
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:159)
>       at org.apache.hadoop.hbase.client.HTable.get(HTable.java:864)
>       at org.apache.hadoop.hbase.client.HTable.get(HTable.java:830)
>       at 
> org.apache.kylin.storage.hbase.HBaseResourceStore.internalGetFromHTable(HBaseResourceStore.java:385)
>       at 
> org.apache.kylin.storage.hbase.HBaseResourceStore.getFromHTable(HBaseResourceStore.java:363)
>       at 
> org.apache.kylin.storage.hbase.HBaseResourceStore.existsImpl(HBaseResourceStore.java:116)
>       at 
> org.apache.kylin.common.persistence.ResourceStore.exists(ResourceStore.java:144)
>       at 
> org.apache.kylin.common.persistence.ResourceStore.createResourceStore(ResourceStore.java:84)
>       ... 3 more
> Caused by: java.net.ConnectException: Connection refused
>       at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>       at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
>       at 
> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>       at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
>       at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
>       at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupConnection(RpcClientImpl.java:416)
>       at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:722)
>       at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:909)
>       at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:873)
>       at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1244)
>       at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:227)
>       at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:336)
>       at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.get(ClientProtos.java:35372)
>       at org.apache.hadoop.hbase.client.HTable$3.call(HTable.java:856)
>       at org.apache.hadoop.hbase.client.HTable$3.call(HTable.java:847)
>       at 
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:137)
>       ... 10 more
> 2017-11-13 07:33:48,709 INFO  [Thread-1] 
> client.ConnectionManager$HConnectionImplementation:2180 : Closing master 
> protocol: MasterService
> 2017-11-13 07:33:48,710 INFO  [Thread-1] 
> client.ConnectionManager$HConnectionImplementation:1718 : Closing zookeeper 
> sessionid=0x15fb4173c100156
> 2017-11-13 07:33:48,712 INFO  [Thread-1] zookeeper.ZooKeeper:684 : Session: 
> 0x15fb4173c100156 closed
> 2017-11-13 07:33:48,712 INFO  [main-EventThread] zookeeper.ClientCnxn:519 : 
> EventThread shut down for session: 0x15fb4173c100156
> ERROR: Unknown error. Please check full log.
> {noformat}
> If I change the Kylin server mode to all, Kylin can start up, but the page 
> on port 7070 cannot be opened, with errors similar to those above.
> I am wondering whether there is some other configuration I have missed, or 
> whether I am going about this the wrong way.
> I would be very grateful if someone could give a complete example showing how 
> to get this working!
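
Since the issue describes two clusters that must be launched with the same
settings, one way to keep them in sync is to generate the settings from a
single script. The following is only a sketch: it reproduces the EMR
classifications and the two kylin.properties lines quoted above, and the
function names emr_configurations and kylin_properties are made up for
illustration. It does not by itself address the startup error reported in the
issue.

```python
import json


def emr_configurations(hbase_rootdir, emrfs_table):
    """Build the EMR configuration list quoted in the issue.

    Hypothetical helper: only the classification names and property keys
    come from the issue; the function itself is illustrative.
    """
    return [
        {
            "Classification": "hbase-site",
            "Properties": {"hbase.rootdir": hbase_rootdir},
        },
        {
            "Classification": "hbase",
            "Properties": {"hbase.emr.storageMode": "s3"},
        },
        {
            "Classification": "emrfs-site",
            "Properties": {
                "fs.s3.consistent": "true",
                "fs.s3.consistent.metadata.tableName": emrfs_table,
            },
        },
    ]


def kylin_properties(working_dir, server_mode):
    """Render the two kylin.properties lines from the issue."""
    # kylin.server.mode accepts "all", "job" or "query".
    if server_mode not in ("all", "job", "query"):
        raise ValueError(f"unexpected kylin.server.mode: {server_mode}")
    return (
        f"kylin.env.hdfs-working-dir={working_dir}\n"
        f"kylin.server.mode={server_mode}\n"
    )


if __name__ == "__main__":
    # Values copied from the issue; adjust to your own bucket and table.
    config = emr_configurations(
        "s3://kylin-emrfs/hbase-production", "KylinEmrFSMetadata"
    )
    print(json.dumps(config, indent=2))
    print(kylin_properties("s3://kylin-emrfs/kylin-working-dir", "query"))
```

The JSON output can be saved to a file and passed as the cluster configuration
when each cluster is created, so the build cluster and the query cluster
differ only in the server mode written to kylin.properties.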

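The stack trace in the issue ends in a java.net.ConnectException raised while
the HBase client opens an RPC connection, after the ZooKeeper session was
established successfully. A first diagnostic step could be to check from the
Kylin node whether the HBase hosts accept TCP connections at all. Below is a
minimal sketch; can_connect is a hypothetical helper, and the host and port to
probe are not shown in the log, so they have to be taken from your own
cluster.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper for illustration; it only tests reachability and
    says nothing about whether the HBase service behind the port is healthy.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

For example, can_connect("<regionserver-host>", <port>) returning False from
the Kylin node would be consistent with the exception above, and would point
at the HBase service or the network rather than at Kylin's own configuration.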


--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
