Re: [DISCUSS]Apache Kylin 2.0 Release Features & Criteria

2016-02-03 Thread Jerome liu
I am looking forward to the release of 2.0 and its support for streaming
processing. I hope it will be made available for us to test.

2016-02-03 9:45 GMT+08:00 yu feng :

> We are looking forward to the release of 2.0; the fast cubing algorithm, Spark
> support, and streaming cubes are very useful to us.
> I have tested 2.0-rc in our environment and it works fine; I hope the release
> comes soon.
>
> 2016-02-02 18:02 GMT+08:00 Yerui Sun :
>
> > We’ve been looking forward to the release of 2.0 for a long time.
> > We have also tested the 2.0-rc internally for quite a while, and it has
> > proved stable.
> >
> > We’re confident in the release as it is now.
> >
> > > On Feb 2, 2016, at 17:22, 杨海乐 wrote:
> > >
> > > Hello all,
> > >   As users of Kylin, we all hope Kylin releases version 2.0 as soon as
> > > possible so that we can get better performance. As a member of the Kylin
> > > community, I sincerely hope Kylin will become more powerful.
> > >
> > > --
> > > View this message in context:
> > > http://apache-kylin.74782.x6.nabble.com/DISCUSS-Apache-Kylin-2-0-Release-Features-Criteria-tp3524p3555.html
> > > Sent from the Apache Kylin mailing list archive at Nabble.com.
> >
> >
>



-- 
Welcome to jerome.liuh...@gmail.com !


Re: [DISCUSS]Apache Kylin 2.0 Release Features & Criteria

2016-02-02 Thread yu feng
We are looking forward to the release of 2.0; the fast cubing algorithm, Spark
support, and streaming cubes are very useful to us.
I have tested 2.0-rc in our environment and it works fine; I hope the release
comes soon.

2016-02-02 18:02 GMT+08:00 Yerui Sun :

> We’ve been looking forward to the release of 2.0 for a long time.
> We have also tested the 2.0-rc internally for quite a while, and it has
> proved stable.
>
> We’re confident in the release as it is now.
>
> > On Feb 2, 2016, at 17:22, 杨海乐 wrote:
> >
> > Hello all,
> >   As users of Kylin, we all hope Kylin releases version 2.0 as soon as
> > possible so that we can get better performance. As a member of the Kylin
> > community, I sincerely hope Kylin will become more powerful.
> >
> > --
> > View this message in context:
> > http://apache-kylin.74782.x6.nabble.com/DISCUSS-Apache-Kylin-2-0-Release-Features-Criteria-tp3524p3555.html
> > Sent from the Apache Kylin mailing list archive at Nabble.com.
>
>


Re: [DISCUSS]Apache Kylin 2.0 Release Features & Criteria

2016-02-02 Thread Yerui Sun
We’ve been looking forward to the release of 2.0 for a long time.
We have also tested the 2.0-rc internally for quite a while, and it has
proved stable.

We’re confident in the release as it is now.

> On Feb 2, 2016, at 17:22, 杨海乐 wrote:
> 
> Hello all,
>   As users of Kylin, we all hope Kylin releases version 2.0 as soon as
> possible so that we can get better performance. As a member of the Kylin
> community, I sincerely hope Kylin will become more powerful.
> 
> --
> View this message in context: 
> http://apache-kylin.74782.x6.nabble.com/DISCUSS-Apache-Kylin-2-0-Release-Features-Criteria-tp3524p3555.html
> Sent from the Apache Kylin mailing list archive at Nabble.com.



Re: [DISCUSS]Apache Kylin 2.0 Release Features & Criteria

2016-02-01 Thread Luke Han
Hi Seshu,
  "Done is better than perfect" is one of the practices in our development:
release early, ask users to try and test, then fix bugs, bring in other
features if any, and then release a new version...
It has worked very well in the past, and I believe it will continue to benefit
further development.

  And as you can see, the 2.x branch has been the active development code base
for several months; as Yang mentioned, we are confident about releasing the
first version now. Also, many users in the community are already building
packages from 2.0 and have reported many tickets to help improve Kylin; they
are looking forward to the first release very much. With the Apache release
process, the entire community will help to test and try each release candidate
to make sure there are no critical issues. Please also help log JIRAs if you
find any.

  Back to Spark Cubing: as previously discussed with the Spark community,
there is still one pending JIRA for performance, so Spark Cubing has already
been excluded from the first release. But with the pluggable architecture, it
could be very easy to reintroduce it in a coming version once the community is
happy with it.

And as for the Amazon EMR part, it's more about how to deploy than a
"feature"; it does not make sense to set it as a criterion.

Thanks for bringing up this discussion to help the community :-)

Luke


Best Regards!
-

Luke Han

On Tue, Feb 2, 2016 at 8:48 AM, Adunuthula, Seshu 
wrote:

> Yang,
>
> Implementing the old MR engine on the pluggable architecture does not
> prove that the architecture works. You need two points to draw a line. A
> single point does not prove that the architecture works.
>
> Improving the MR engine performance can be done on the 1.0 code without
> making it pluggable.
>
>
> External talks and POCs are not the release criteria for a feature.
>
> Regards
> Seshu
>
> Sent from my iPhone
>
> > On Feb 1, 2016, at 6:01 PM, Li Yang  wrote:
> >
> > Seshu's understanding of 2.0 and its plugin-able architecture is very
> > wrong. Let me correct it. :-)
> >
> > The plugin-able architecture is rock solid. Its first commit dates back to
> > Jul 2015. On top of it, we built MR cube engine V2 and storage engine V2,
> > which give much improved build and query performance. At the same time, the
> > old V1 engines are still available on the 2.0 branch. The plugin-able
> > architecture allows the coexistence of alternative engines, and users are
> > free to choose whichever engine suits their needs.
> >
> > In the last few months, thorough testing has been done on the 2.0-rc branch.
> > As mentioned, we have rebuilt hundreds of jobs on the V2 engines and
> > compared the results by running tens of thousands of queries against both V1
> > and V2 cubes. The correctness is confirmed and the performance improvement
> > is measured. The 2.0-rc branch is definitely the most well-tested branch so
> > far. I am very confident of its quality.
> >
> > I believe Seshu also agrees with the improved performance and its quality,
> > as he proposed to release it as v1.3. However, he didn't know the improved
> > results are built right on top of the plugin-able architecture.
> >
> > So the claim that the plugin-able architecture is
> >> "POC quality features that should not be part of a release. We have not
> >> built a single of these plugins that are production quality."
> > is very wrong.
> >
> > Streaming cubing is a less mature feature. It is of semi-production
> > quality. As shared in a few public talks, eBay has an SEO dashboard use case
> > that leverages the streaming cubing feature and achieves 5-minute data
> > latency.
> >
> > And I made the point very clear -- "Streaming cubing experimental support,
> > ... minutes interval" -- I think no one will be confused.
> >
> > If there are more concerns about 2.0 quality, I suggest JIRAs be opened and
> > test cases be created, so that we have evidence and can collaborate to
> > improve.
> >
> > Still, many thanks for the comments. Things become clearer through healthy
> > discussions. :-)
> >
> > Cheers
> > Yang
> >
> > On Tuesday, February 2, 2016, Adunuthula, Seshu wrote:
> >
> >> A strong -1 on this.
> >>
> >> - A better MR cubing algorithm, about 1.5 times faster than 1.x by
> >> comparing hundreds of jobs.
> >> - TopN pre-calculation (more UDFs coming)
> >> - ODBC compatible with Tableau 9.1, MS Excel, MS PowerBI
> >>
> >>
> >>
> >> These are incremental enhancements and do not warrant bumping up to a 2.0
> >> release. We should release them as 1.3.
> >>
> >>
> >> - Streaming cubing experimental support, source from kafka, build cube
> >> in-mem at minutes interval
> >> - A plugin-able architecture, to allow alternative cube engine / storage
> >> engine / data source.
> >>
> >>
> >>
> >> These are POC quality features that should not be part of a release. We
> >> have not built a single of these plugins that are production quality.
> >>
> >> Luke/Yang I have told 

Re: [DISCUSS]Apache Kylin 2.0 Release Features & Criteria

2016-02-01 Thread Adunuthula, Seshu
Yes, we will be filing a whole bunch of JIRAs. This release is not Done,
so no point in arguing about whether it is perfect. Luke, I do not want
you to push this release through.

 

On 2/1/16, 7:54 PM, "Luke Han"  wrote:

>Hi Seshu,
>  "Done is better than perfect" is one of the practices in our development:
>release early, ask users to try and test, then fix bugs, bring in other
>features if any, and then release a new version...
>It has worked very well in the past, and I believe it will continue to benefit
>further development.
>
>  And as you can see, the 2.x branch has been the active development code base
>for several months; as Yang mentioned, we are confident about releasing the
>first version now. Also, many users in the community are already building
>packages from 2.0 and have reported many tickets to help improve Kylin; they
>are looking forward to the first release very much. With the Apache release
>process, the entire community will help to test and try each release candidate
>to make sure there are no critical issues. Please also help log JIRAs if you
>find any.
>
>  Back to Spark Cubing: as previously discussed with the Spark community,
>there is still one pending JIRA for performance, so Spark Cubing has already
>been excluded from the first release. But with the pluggable architecture, it
>could be very easy to reintroduce it in a coming version once the community is
>happy with it.
>
>And as for the Amazon EMR part, it's more about how to deploy than a
>"feature"; it does not make sense to set it as a criterion.
>
>Thanks for bringing up this discussion to help the community :-)
>
>Luke
>
>
>Best Regards!
>-
>
>Luke Han
>
>On Tue, Feb 2, 2016 at 8:48 AM, Adunuthula, Seshu 
>wrote:
>
>> Yang,
>>
>> Implementing the old MR engine on the pluggable architecture does not
>> prove that the architecture works. You need two points to draw a line. A
>> single point does not prove that the architecture works.
>>
>> Improving the MR engine performance can be done on the 1.0 code without
>> making it pluggable.
>>
>>
>> External talks and POCs are not the release criteria for a feature.
>>
>> Regards
>> Seshu
>>
>> Sent from my iPhone
>>
>> > On Feb 1, 2016, at 6:01 PM, Li Yang  wrote:
>> >
>> > Seshu's understanding of 2.0 and its plugin-able architecture is very
>> > wrong. Let me correct it. :-)
>> >
>> > The plugin-able architecture is rock solid. Its first commit dates back to
>> > Jul 2015. On top of it, we built MR cube engine V2 and storage engine V2,
>> > which give much improved build and query performance. At the same time, the
>> > old V1 engines are still available on the 2.0 branch. The plugin-able
>> > architecture allows the coexistence of alternative engines, and users are
>> > free to choose whichever engine suits their needs.
>> >
>> > In the last few months, thorough testing has been done on the 2.0-rc branch.
>> > As mentioned, we have rebuilt hundreds of jobs on the V2 engines and
>> > compared the results by running tens of thousands of queries against both V1
>> > and V2 cubes. The correctness is confirmed and the performance improvement
>> > is measured. The 2.0-rc branch is definitely the most well-tested branch so
>> > far. I am very confident of its quality.
>> >
>> > I believe Seshu also agrees with the improved performance and its quality,
>> > as he proposed to release it as v1.3. However, he didn't know the improved
>> > results are built right on top of the plugin-able architecture.
>> >
>> > So the claim that the plugin-able architecture is
>> >> "POC quality features that should not be part of a release. We have not
>> >> built a single of these plugins that are production quality."
>> > is very wrong.
>> >
>> > Streaming cubing is a less mature feature. It is of semi-production
>> > quality. As shared in a few public talks, eBay has an SEO dashboard use case
>> > that leverages the streaming cubing feature and achieves 5-minute data
>> > latency.
>> >
>> > And I made the point very clear -- "Streaming cubing experimental support,
>> > ... minutes interval" -- I think no one will be confused.
>> >
>> > If there are more concerns about 2.0 quality, I suggest JIRAs be opened and
>> > test cases be created, so that we have evidence and can collaborate to
>> > improve.
>> >
>> > Still, many thanks for the comments. Things become clearer through healthy
>> > discussions. :-)
>> >
>> > Cheers
>> > Yang
>> >
>> > On Tuesday, February 2, 2016, Adunuthula, Seshu wrote:
>> >
>> >> A strong -1 on this.
>> >>
>> >> - A better MR cubing algorithm, about 1.5 times faster than 1.x by
>> >> comparing hundreds of jobs.
>> >> - TopN pre-calculation (more UDFs coming)
>> >> - ODBC compatible with Tableau 9.1, MS Excel, MS PowerBI
>> >>
>> >>
>> >>
>> >> These are incremental enhancements and do not warrant bumping up to a 2.0
>> >> release. We should release them as 1.3.
>> >>
>> >>
>> >> - Streaming