[
https://issues.apache.org/jira/browse/TAJO-611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010845#comment-14010845
]
Jaehwa Jung commented on TAJO-611:
----------------------------------
Hi [~mafish]
First of all, I really appreciate your contribution. I have left some
comments on your patch.
First, ServiceRegister needs to allow a composite ServiceInstance because there
can be two or more servers for one TajoWorker or one TajoMaster. If I have
misunderstood the patch, I hope you can add more unit test cases that show
how this case is handled.
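A minimal sketch of what a composite instance could look like, assuming one
registered daemon carries several named endpoints. The class name
CompositeServiceInstance, the endpoint names ("rpc", "pull") and the port
numbers are hypothetical and are not taken from the patch:

```java
import java.net.InetSocketAddress;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: one registered daemon that exposes several servers.
final class CompositeServiceInstance {
    private final String daemonId;
    // endpoint name -> address, kept in registration order
    private final Map<String, InetSocketAddress> endpoints = new LinkedHashMap<>();

    CompositeServiceInstance(String daemonId) {
        this.daemonId = daemonId;
    }

    // Adds (or replaces) a named endpoint and returns this instance for chaining.
    CompositeServiceInstance addEndpoint(String name, String host, int port) {
        endpoints.put(name, InetSocketAddress.createUnresolved(host, port));
        return this;
    }

    String getDaemonId() {
        return daemonId;
    }

    // Returns the address of one endpoint, or null if the name is unknown.
    InetSocketAddress getEndpoint(String name) {
        return endpoints.get(name);
    }

    // Read-only view of all endpoints of this daemon.
    Map<String, InetSocketAddress> getEndpoints() {
        return Collections.unmodifiableMap(endpoints);
    }
}
```

A unit test for the patch could then register one TajoWorker with two
endpoints and check that both can be discovered from a single instance.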
Second, ServiceType needs to be clearer. For example, I run an actual cluster
with 4 nodes as follows:
{code}
- tajo1.com: TajoMaster
- tajo2.com: TajoWorker
- tajo3.com: TajoWorker
- tajo4.com: TajoWorker
{code}
If I want to apply ServiceDiscovery on my cluster, I'll run another TajoWorker
on each node on a different port. Or I'll run a TajoWorker on a new node (e.g.
tajo5.com).
But in this case, TajoMaster and QueryMaster don't seem to be able to find the
right TajoWorker because the ServiceType is just TAJOWORKER. I think the Hadoop
NameNode HA configuration will be a helpful reference for you.
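One possible reading of this suggestion, sketched as a small in-memory
registry that qualifies each TajoWorker with an instance id in addition to its
service type. The class QualifiedServiceRegistry, the ids "w1" and "w2", and
the port numbers are hypothetical; this is not the API of the patch or of
Curator:

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory registry, only to illustrate qualified lookups.
final class QualifiedServiceRegistry {
    // service type -> (instance id -> address)
    private final Map<String, Map<String, InetSocketAddress>> services = new TreeMap<>();

    // Registers an instance under both its service type and its own id.
    void register(String serviceType, String instanceId, String host, int port) {
        services.computeIfAbsent(serviceType, k -> new TreeMap<>())
                .put(instanceId, InetSocketAddress.createUnresolved(host, port));
    }

    // Returns one specific instance, or null if it is not registered.
    InetSocketAddress lookup(String serviceType, String instanceId) {
        Map<String, InetSocketAddress> byId = services.get(serviceType);
        return byId == null ? null : byId.get(instanceId);
    }

    // Returns the sorted ids of all instances registered under a service type.
    List<String> instanceIds(String serviceType) {
        Map<String, InetSocketAddress> byId = services.get(serviceType);
        return byId == null ? new ArrayList<>() : new ArrayList<>(byId.keySet());
    }
}
```

With this shape, two TajoWorkers on tajo2.com with different ports stay
distinguishable even though both have the type TAJOWORKER. Hadoop's NameNode HA
configuration follows a similar idea: keys such as
dfs.namenode.rpc-address.[nameservice ID].[namenode ID] qualify each address
with both a nameservice ID and a NameNode ID.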
Third, we need a ServiceType for TajoMaster because of TAJO-704.
Cheers
Jaehwa
> (Umbrella) Service Discovery
> -----------------------------
>
> Key: TAJO-611
> URL: https://issues.apache.org/jira/browse/TAJO-611
> Project: Tajo
> Issue Type: New Feature
> Affects Versions: 0.9.0
> Reporter: Min Zhou
> Assignee: Mafish
> Fix For: 0.9.0
>
> Attachments: TAJO-611.patch, tajo-611-servicediscovery-20140420.patch
>
>
> As we discussed offline, high availability is one of our next goals. Service
> discovery can help us maintain health statuses for all daemons (master and
> workers). Meanwhile, those daemons can find each other easily. Furthermore,
> it's very useful for my current work, TAJO-540, because it can randomly select
> nodes for the Tajo scheduler.
> One of the best candidates is Netflix Curator.
> http://curator.apache.org/curator-x-discovery/index.html
> I'd like to introduce Xuhui to help us with this issue. Xuhui was my
> colleague at Alibaba Group. He was active in the Hive community; one of his
> jobs was adding the multi-distinct aggregation feature to Hive. Currently, he
> is a researcher at Microsoft.
> [~mafish]
> Below is a comment on this issue from Xuhui:
> To my understanding, this feature is for high availability as well as high
> scalability. We don't need to provide all machine info for every service when
> Tajo starts. Instead, we can dynamically register services with service
> discovery when necessary. Also, if a machine fails, the failure can be easily
> detected and the machine replaced.
--
This message was sent by Atlassian JIRA
(v6.2#6252)