Re: multiple EMRs sync

Sonny Heer Mon, 06 Aug 2018 10:56:26 -0700

ShaoFeng,

Is Strikingly open to sharing their work?  It appears our use case is
similar, and we would love to see how much of their work matches ours.


On Mon, Aug 6, 2018 at 7:01 AM Sonny Heer <[email protected]> wrote:

> Does that require an HA cluster & Kylin installed on its own instance?  EMR
> doesn't spin up services as HA on its master node.   I'd be curious to see
> what Strikingly has done and if they have it deployed on AWS.
>
>
>
> On Sun, Aug 5, 2018 at 10:57 PM ShaoFeng Shi <[email protected]>
> wrote:
>
>> Hi Sonny,
>>
>> You can configure an R/W separated deployment with two EMRs: one is
>> Hadoop only and the other is the HBase cluster. On the EC2 instance that
>> runs Kylin, install the client and configuration for both Hadoop and HBase.
>> Then tell Kylin that Hadoop and HBase are in two clusters (refer to the
>> blog). Kylin will run jobs in the W cluster and bulk load HFiles to the R
>> cluster.
>>
>> https://kylin.apache.org/blog/2016/06/10/standalone-hbase-cluster/
>>
>> Many Kylin users run this R/W separated architecture. I once tried it
>> on Azure with two clusters, and it worked well. I have not tested it with
>> EMR, but I think they are similar.
>>
>>
>> 2018-08-06 10:55 GMT+08:00 Sonny Heer <[email protected]>:
>>
>>> Yeah, it would be great if Kylin could have a centralized metastore in RDS.
>>>
>>> The big problem for us now is this:
>>>
>>> 2 EMR clusters, each running Kylin on the master node.  Both share the
>>> HBase S3 root dir.
>>>
>>> Cluster A creates a cube and does a build.  Cluster B can see the cube
>>> as it builds in “monitor”, but once the build finishes, the cube is
>>> “ready” only in cluster A (the cluster the job was launched from).
>>>
>>> We need somewhat isolated kylin nodes that can still share the same
>>> backend.  This is a big win since then each cluster can scale read/write
>>> independently in EMR - this is our goal.  Having read/write in the same
>>> cluster doesn’t work for various reasons...
>>>
>>> It seems Kylin is really close, since the monitoring of the cube is in
>>> sync when sharing the same HBase backend.
>>>
>>> Using a read replica did not work - when we tried to log in from the
>>> replica, Kylin wasn't able to work.
>>>
>>>
>>>
>>> On Sun, Aug 5, 2018 at 7:01 PM ShaoFeng Shi <[email protected]>
>>> wrote:
>>>
>>>> Hi Sonny,
>>>>
>>>> EMR HBase read replica is a great feature, but we haven't tried it. Are
>>>> you going to use this feature, or do you just want to deploy Kylin as a
>>>> cluster?
>>>>
>>>> Would putting the Kylin metadata in RDS make it easier for you?
>>>>
>>>> 2018-08-04 0:05 GMT+08:00 Sonny Heer <[email protected]>:
>>>>
>>>>> We'd like to use EMR HBase read replicas if possible.  We had some
>>>>> issues using this strategy since Kylin requires write capability from all
>>>>> nodes (on login for example).
>>>>>
>>>>> The idea is to cluster Kylin using multiple EMRs on the master node.  If
>>>>> this isn't possible we may go with the separate instance approach, but
>>>>> that is prone to errors as EMR libs have to be copied around.
>>>>>
>>>>> ref:
>>>>>
>>>>> https://aws.amazon.com/blogs/big-data/setting-up-read-replica-clusters-with-hbase-on-amazon-s3/
>>>>>
>>>>> Does anyone else have experience with this, or a use case on EMR to share?
>>>>>
>>>>> Thanks!
>>>>>
>>>>> On Thu, Aug 2, 2018 at 2:32 PM Sonny Heer <[email protected]> wrote:
>>>>>
>>>>>> Is it possible in the new version of Kylin to have multiple EMR
>>>>>> clusters with Kylin installed on the master node but talking to the same
>>>>>> S3 location, e.g. one Write EMR cluster and one Read EMR cluster?
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Best regards,
>>>>
>>>> Shaofeng Shi 史少锋
>>>>
>>>>
>>
>>
>> --
>> Best regards,
>>
>> Shaofeng Shi 史少锋
>>
>>
