Hi Sonny,

You can configure an R/W separated deployment with two EMR clusters:
one runs Hadoop only (the W cluster) and the other is the HBase cluster
(the R cluster). On the EC2 instance that runs Kylin, install the client
and configuration for both Hadoop and HBase, then tell Kylin that Hadoop
and HBase are in two separate clusters (refer to the blog). Kylin will
run build jobs in the W cluster and bulk load the HFiles to the R cluster.
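
As a rough sketch of the "tell Kylin" step, assuming Kylin 2.x property naming and a hypothetical S3 bucket (neither is stated in this thread, so check the blog and the documentation for your version), the override in kylin.properties on the Kylin node might look like this:

```properties
# conf/kylin.properties on the EC2 instance that runs Kylin
# Assumption: Kylin 2.x property name; Kylin 1.x releases used
# kylin.hbase.cluster.fs for the same purpose.
# Assumption: s3://example-bucket is a placeholder for the file system
# used by the HBase (R) cluster.
kylin.storage.hbase.cluster-fs=s3://example-bucket
```

With a setting like this, build jobs should still run on the cluster described by the local Hadoop configuration (W), while the generated HFiles are written to the HBase cluster's file system so they can be bulk loaded there.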
https://kylin.apache.org/blog/2016/06/10/standalone-hbase-cluster/

Many Kylin users run this R/W separated architecture. I once tried it on
Azure with two clusters, and it worked well. I have not tested it with
EMR, but I think the setup is similar.

2018-08-06 10:55 GMT+08:00 Sonny Heer:

> Yeah, that would be great if Kylin could have a centralized metastore
> in RDS.
>
> The big problem for us now is this:
>
> Two EMR clusters, each running Kylin on the master node. Both share the
> HBase S3 root dir.
>
> Cluster A creates a cube and does a build. Cluster B can see the cube
> as it builds in “Monitor”, but once the cube is finished, it is “ready”
> only in cluster A (where the job was launched from).
>
> We need somewhat isolated Kylin nodes that can still share the same
> backend. This is a big win since each cluster can then scale read/write
> independently in EMR - this is our goal. Having read/write in the same
> cluster doesn’t work for various reasons...
>
> It seems Kylin is really close, since the monitoring of the cube is in
> sync when sharing the same HBase backend.
>
> Using a read replica did not work - when we tried to log in from the
> replica, Kylin wasn't able to work.
>
> On Sun, Aug 5, 2018 at 7:01 PM ShaoFeng Shi wrote:
>
>> Hi Sonny,
>>
>> EMR HBase read replica is a great feature, but we haven't tried it.
>> Are you going to use this feature, or do you just want to deploy
>> Kylin as a cluster?
>>
>> If Kylin metadata is put in RDS, would that be easier for you?
>>
>> 2018-08-04 0:05 GMT+08:00 Sonny Heer:
>>
>>> We'd like to use EMR HBase read replicas if possible. We had some
>>> issues using this strategy since Kylin requires write capability
>>> from all nodes (on login, for example).
>>>
>>> The idea is to cluster Kylin using multiple EMRs, on the master node.
>>> If this isn't possible we may go with the separate instance approach,
>>> but that is prone to errors as EMR libs have to be copied around.
>>>
>>> ref:
>>> https://aws.amazon.com/blogs/big-data/setting-up-read-replica-clusters-with-hbase-on-amazon-s3/
>>>
>>> Does anyone else have experience, or can share their use case, on EMR?
>>>
>>> Thanks!
>>>
>>> On Thu, Aug 2, 2018 at 2:32 PM Sonny Heer wrote:
>>>
>>>> Is it possible in the new version of Kylin to have multiple EMR
>>>> clusters with Kylin installed on the master node but talking to the
>>>> same S3 location?
>>>>
>>>> e.g. one Write EMR cluster and one Read EMR cluster
>>>>
>>>
>>
>>
>> --
>> Best regards,
>>
>> Shaofeng Shi 史少锋
>>

--
Best regards,

Shaofeng Shi 史少锋
