Very good inputs.

On Wed, Nov 4, 2015 at 11:42 AM, hongbin ma <[email protected]> wrote:
> Since we're working on designing new cluster management to manage LB
> servers and streaming job slaves, I think it's a good opportunity for
> Kylin users to share their pain points and wish lists to help improve
> the Kylin user experience.
>
> Here are mine:
>
> 1. Cluster configuration is troublesome. Currently we have to write down
> the server list in kylin.properties and assign a role to each server.
> This is hard to maintain. The new cluster management should automate
> server discovery, leader election and failover.
>
> 2. Log analysis is not easy when multiple servers are running at the
> same time (https://issues.apache.org/jira/browse/KYLIN-1124 for example).
> On the query side, we should be able to answer questions like "I
> submitted a query XXXXX at 10:00, please check why it's slow?" and "What
> are the most time-consuming queries recently (and their related cube
> names)?". On the streaming job dispatcher side, we should be able to
> identify failed batches more quickly (and resume them), and to manage
> each batch's build log better (when you have tens of slaves, it's
> difficult to find where a batch's build log is). A related JIRA ticket
> is https://issues.apache.org/jira/browse/KYLIN-1079
>
> 3. Streaming batch jobs should be horizontally scalable. If a batch is
> found to be too big to fit into a single JVM, we should detect it and
> divide the batch into smaller pieces so that we can dispatch the job to
> multiple JVMs, and let a subsequent auto-merge job merge them. Related
> JIRA is https://issues.apache.org/jira/browse/KYLIN-1042
>
> 4. Auto-merge job failures lead to hundreds of accumulated segments,
> which greatly harms query performance. Related JIRA:
> https://issues.apache.org/jira/browse/KYLIN-1038
>
>
> --
> Regards,
>
> *Bin Mahone | 马洪宾*
> Apache Kylin: http://kylin.io
> Github: https://github.com/binmahone
>
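
To make point 3 of the quoted message a bit more concrete, here is a rough sketch of the "divide the batch and dispatch the pieces" idea. It is not Kylin code: the names `split_batch` and `assign_pieces`, the record-count limit, and the worker labels are all assumptions for illustration, and a real dispatcher would more likely size pieces by estimated memory than by record count.

```python
from typing import Dict, List, Sequence, TypeVar

T = TypeVar("T")


def split_batch(records: Sequence[T], max_records_per_piece: int) -> List[List[T]]:
    """Divide one batch into consecutive pieces of at most
    max_records_per_piece records each (hypothetical sizing rule)."""
    if max_records_per_piece <= 0:
        raise ValueError("max_records_per_piece must be positive")
    return [
        list(records[i:i + max_records_per_piece])
        for i in range(0, len(records), max_records_per_piece)
    ]


def assign_pieces(num_pieces: int, workers: Sequence[str]) -> Dict[str, List[int]]:
    """Assign piece indices to workers round-robin.

    'workers' stands in for the JVMs that would build the pieces; the
    resulting small segments would be left for a later merge step.
    """
    if not workers:
        raise ValueError("at least one worker is required")
    assignment: Dict[str, List[int]] = {w: [] for w in workers}
    for idx in range(num_pieces):
        assignment[workers[idx % len(workers)]].append(idx)
    return assignment


if __name__ == "__main__":
    pieces = split_batch(list(range(10)), max_records_per_piece=4)
    print(pieces)
    print(assign_pieces(len(pieces), ["jvm-a", "jvm-b"]))
```

Round-robin is only the simplest possible placement; the point is that splitting and placement are separate steps, so a smarter placement policy could be swapped in without touching the splitting logic.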
