Re: [DISCUSS]Apache Kylin 2.0 Release Features & Criteria

Adunuthula, Seshu Mon, 01 Feb 2016 08:35:57 -0800

A strong -1 on this.

- A better MR cubing algorithm, about 1.5 times faster than 1.x by
comparing hundreds of jobs.
- TopN pre-calculation (more UDFs coming)
- ODBC compatible with Tableau 9.1, MS Excel, MS PowerBI




These are incremental enhancements and do not warrant bumping up to a 2.0
release. We should release them in 1.3.


- Streaming cubing experimental support, source from kafka, build cube
in-mem at minutes interval
- A plugin-able architecture, to allow alternative cube engine / storage
engine / data source.



These are POC-quality features that should not be part of a release. We
have not built a single one of these plugins to production quality.

Luke/Yang, I have told you multiple times not to push out a release when it
is not ready. We nearly brought down the entire HBase cluster at eBay with
the bad design for Streaming. If we scale this up to 100s of Streaming
Cubes, this design will render an HBase cluster unusable.

I have spent substantial time looking into the release and it does not
meet eBay's standards for a quality release.

We will be doing the community a huge disservice by pushing this out by
end of February.

Regards
Seshu Adunuthula


On 1/31/16, 11:46 PM, "Li Yang" <[email protected]> wrote:

>Just to add more color.
>
>The 2.0 rc1 has been stabilizing in the 2.0-rc branch for a few months. The
>2.0 rc1 contains:
>
>- A plugin-able architecture, to allow alternative cube engine / storage
>engine / data source.
>- A better MR cubing algorithm, about 1.5 times faster than 1.x by
>comparing hundreds of jobs.
>- A better storage engine, makes queries roughly 2 times faster (especially
>for slow queries) than 1.x by comparing tens of thousands of SQL queries.
>- Streaming cubing experimental support, source from kafka, build cube
>in-mem at minutes interval
>- TopN pre-calculation (more UDFs coming)
>- ODBC compatible with Tableau 9.1, MS Excel, MS PowerBI
>- SAML authentication support
>
>As the release manager, I will kick off the release process in two weeks
>(once back from vacation). ETA by end of Feb.
>
>Would love to hear more feedback from our community.  :-)
>
>
>Yang
>
>
>
>On Monday, February 1, 2016, Adunuthula, Seshu <[email protected]>
>wrote:
>
>> Hello Folks,
>>
>> We are actively working towards the Apache Kylin 2.0 release and would
>> like a discussion with the community on what they would like to see in
>> the 2.0 release of the product. We have three big-rock items we are
>> working towards in 2.0 and a lot of additional minor feature enhancements.
>>
>> Streaming Data Source support.
>> This feature is semi-baked: the source of Kylin Cubes is Kafka topics.
>> Cube Segments are built on micro batches of messages arriving on Kafka
>> topics. Currently a lot of work is going on to productize this feature.
>> The primary areas of work are Stream Processing Engines/Frameworks to
>> process the micro batches and a UI to support out-of-the-box integration
>> of Kafka topics with Kylin Cubes.
>>
>> Spark based Cube building Engine.
>> The initial performance numbers for a Spark based cubing engine did not
>> show substantial improvement over the MR based engine, but we would like
>> this feature to be baked in for the 2.0 release. A lot of work is underway
>> to stabilize this feature.
>>
>> Amazon EMR Integration
>> We had initial conversations with Amazon EMR to support Apache Kylin on
>> Amazon EMR, which were received well. With Kylin 2.0, Apache Kylin will be
>> an enabled feature on Amazon EMR. Limited work has gone into this area, but
>> this will be an important milestone for 2.0.
>>
>> We are also working towards creating a page for community-driven
>> improvements, similar to Apache Kafka's KIP page:
>> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals
>> Stay tuned.
>>
>> Regards
>> Seshu Adunuthula
>>
>>
>>
>>
>>
