Wish list for new cluster management & job dispatcher scheme

hongbin ma Tue, 03 Nov 2015 19:43:21 -0800

Since we're designing a new cluster management scheme to manage LB
servers and streaming job slaves, I think it's a good opportunity for
Kylin users to share their pain points and wish lists, to help improve
the Kylin user experience.


Here are mine:

1. Cluster configuration is troublesome. Currently we have to write down
the server list in kylin.properties and assign a role to each server. This
is hard to maintain. The new cluster management should automate server
discovery, leader election and failover.

2. Log analysis is not easy when multiple servers are running at the same
time (see https://issues.apache.org/jira/browse/KYLIN-1124 for example).
On the query side, we should be able to answer questions like "I submitted
query XXXXX at 10:00, please check why it's slow" and "what are the most
time-consuming queries recently (and their related cube names)?". On the
streaming job dispatcher side, we should be able to identify failed batches
more quickly (and resume them), and to manage each batch's build log better
(when you have tens of slaves, it's difficult to find where a batch's build
log is). A related JIRA ticket is
https://issues.apache.org/jira/browse/KYLIN-1079

3. Streaming batch jobs should be horizontally scalable. If a batch is
found to be too big to fit into a single JVM, we should detect it and
divide the batch into smaller pieces so that we can dispatch the job to
multiple JVMs, and let a subsequent auto-merge job merge them. The related
JIRA is https://issues.apache.org/jira/browse/KYLIN-1042

4. A failed auto-merge job leads to hundreds of accumulated segments, which
greatly harms query performance. Related JIRA:
https://issues.apache.org/jira/browse/KYLIN-1038
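
To make item 3 a bit more concrete, here is a minimal sketch of the
splitting step, assuming we can estimate each message's in-memory size up
front. It is only an illustration, not existing Kylin code: the function
name split_batch and the parameter max_bytes_per_jvm are hypothetical, and
the greedy grouping is just one possible policy.

```python
from typing import List


def split_batch(message_sizes: List[int], max_bytes_per_jvm: int) -> List[List[int]]:
    """Greedily group message indices into sub-batches that fit the budget.

    message_sizes[i] is the estimated size of message i (hypothetical input).
    Returns a list of sub-batches, each a list of message indices in their
    original order, so each sub-batch covers a contiguous range of the batch.
    """
    sub_batches: List[List[int]] = []
    current: List[int] = []
    current_bytes = 0
    for index, size in enumerate(message_sizes):
        if size > max_bytes_per_jvm:
            # A single message that exceeds the budget cannot be placed.
            raise ValueError(f"message {index} alone exceeds the per-JVM budget")
        if current and current_bytes + size > max_bytes_per_jvm:
            # The current sub-batch is full: close it and start a new one.
            sub_batches.append(current)
            current, current_bytes = [], 0
        current.append(index)
        current_bytes += size
    if current:
        sub_batches.append(current)
    return sub_batches
```

Each returned sub-batch could then be dispatched to its own JVM, and the
resulting small segments left for the auto-merge job to combine. That is
also why item 4 matters: the more we split, the more we depend on
auto-merge working reliably.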


-- 
Regards,

*Bin Mahone | 马洪宾*
Apache Kylin: http://kylin.io
Github: https://github.com/binmahone
