Hi Sonny,

You can configure a read/write (R/W) separated deployment with two EMR
clusters: one runs Hadoop only and the other is the HBase cluster. On the EC2
instance that runs Kylin, install both the Hadoop and HBase clients and
configurations. Then tell Kylin that Hadoop and HBase are in two clusters
(refer to the blog). Kylin will run jobs in the W cluster and bulk load the
HFiles to the R cluster.

https://kylin.apache.org/blog/2016/06/10/standalone-hbase-cluster/
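
As a rough sketch of what that blog describes, the key setting in
kylin.properties points Kylin at the file system of the HBase (R) cluster. The
host name below is a placeholder, not a real endpoint, and the exact property
name depends on your Kylin version, so please check it against the
documentation for your release:

```properties
# Assumption: property name as used in Kylin 2.x; the 2016 blog uses the
# older name kylin.hbase.cluster.fs for the same setting.
# Placeholder value: replace with the file system of your HBase cluster.
kylin.storage.hbase.cluster-fs=hdfs://hbase-cluster-namenode.example.com:8020
```

With HBase on S3 in EMR, the value would presumably be the S3 location used as
the HBase root directory rather than an HDFS URI; I have not verified this.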

Many Kylin users run this R/W separated architecture. I once tried it on
Azure with two clusters, and it worked well. I have not tested it with EMR,
but I think the setup is similar.


2018-08-06 10:55 GMT+08:00 Sonny Heer <[email protected]>:

> Yea that would be great if Kylin can have a centralized metastore in RDS.
>
> The big problem for us now is this:
>
> 2 emr clusters each running kylin on master node.  Both share hbase s3
> root dir.
>
> Cluster A creates a cube and does a build.  Cluster B can see the cube as
> it builds in “monitor”, but once the cube is finished, it is “ready” only in
> cluster A (where the job was launched from).
>
> We need somewhat isolated kylin nodes that can still share the same
> backend.  This is a big win since then each cluster can scale read/write
> independently in EMR - this is our goal.  Having read/write in the same
> cluster doesn’t work for various reasons...
>
> It seems kylin is really close since the monitoring of the cube is in sync
> when sharing same hbase backend.
>
> Using a read replica did not work - when we tried to log in from the
> replica, kylin wasn't able to work.
>
>
>
> On Sun, Aug 5, 2018 at 7:01 PM ShaoFeng Shi <[email protected]>
> wrote:
>
>> Hi Sonny,
>>
>> EMR HBase read replica is a great feature, but we haven't tried it. Are you
>> going to use this feature, or do you just want to deploy Kylin as a cluster?
>>
>> Would it be easier for you if the Kylin metadata were put in RDS?
>>
>> 2018-08-04 0:05 GMT+08:00 Sonny Heer <[email protected]>:
>>
>>> we'd like to use emr hbase read replicas if possible.  We had some
>>> issues using this strategy since kylin requires write capability from all
>>> nodes (on login for example).
>>>
>>> idea is to cluster kylin using multiple EMRs on master node.  If this
>>> isn't possible we may go with separate instance approach, but that is prone
>>> to errors as emr libs have to be copied around.
>>>
>>> ref:
>>> https://aws.amazon.com/blogs/big-data/setting-up-read-replica-clusters-with-hbase-on-amazon-s3/
>>>
>>> Anyone else have experience or can share their use case on emr?
>>>
>>> Thanks!
>>>
>>> On Thu, Aug 2, 2018 at 2:32 PM Sonny Heer <[email protected]> wrote:
>>>
>>>> Is it possible in the new version of kylin to have multiple EMR
>>>> clusters with Kylin installed on master node but talking to the same S3
>>>> location.
>>>>
>>>> e.g. one Write EMR cluster and one Read EMR cluster
>>>>
>>>> ?
>>>>
>>>
>>
>>
>> --
>> Best regards,
>>
>> Shaofeng Shi 史少锋
>>
>>


-- 
Best regards,

Shaofeng Shi 史少锋
