Re: ACID with Hive/Kylin

2023-12-11 Thread Xiaoxiang Yu
I don't know GDPR very well. Here is my understanding.

For Hive and HDFS, you can consider using these techniques, which support
ACID in Spark and Hive (I recommend the first one):
1) Delta Lake,
https://docs.databricks.com/en/security/privacy/gdpr-delta.html
2) Hive ACID table, here is a link,
https://docs.cloudera.com/cdp-private-cloud-upgrade/latest/migrate-hive-workloads/topics/hive-acid-migration-regulations.html

For Kylin, there are three places which may store data: index, snapshot,
and dict. Refreshing a snapshot costs less time and fewer resources, while
refreshing an index or dict costs much more. A snapshot refresh is
triggered automatically when you build an index every day.

I think you should consider centralizing user-sensitive columns (email,
phone, address) in dimension tables, so that your fact table only has the
foreign key (for example, uid) which refers to the primary key of the
dimension tables.
When you are modeling in Kylin, for the dim tables which contain
user-sensitive columns, try to:

1. set the dim tables as snapshots by disabling the precomputed join
relation, so these columns won't be built into indexes; refer to
https://kylin.apache.org/5.0/docs/modeling/model_design/precompute_join_relations
2. not create a bitmap measure on these columns, so these columns won't be
built into dict
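
To make the modeling idea concrete, here is a minimal sketch using Python's
built-in sqlite3 module. It is not Kylin, Hive, or Delta Lake code; the table
names (dim_user, fact_order), the columns, and the sample rows are all
hypothetical. It only illustrates that when sensitive columns live in the
dimension table, one delete there removes them while fact-level aggregates
stay computable.

```python
import sqlite3


def build_demo(conn):
    # Hypothetical star schema: sensitive columns live only in dim_user,
    # the fact table keeps just the foreign key uid.
    conn.executescript("""
        CREATE TABLE dim_user (
            uid INTEGER PRIMARY KEY, email TEXT, phone TEXT, address TEXT);
        CREATE TABLE fact_order (
            order_id INTEGER PRIMARY KEY, uid INTEGER, amount REAL);
        INSERT INTO dim_user VALUES
            (1, 'a@example.com', '555-0001', '1 Main St'),
            (2, 'b@example.com', '555-0002', '2 Side St');
        INSERT INTO fact_order VALUES (10, 1, 20.0), (11, 1, 5.0), (12, 2, 7.5);
    """)


def erase_user(conn, uid):
    """Delete one user's sensitive row; return how many rows were removed."""
    cur = conn.execute("DELETE FROM dim_user WHERE uid = ?", (uid,))
    conn.commit()
    return cur.rowcount


def total_amount(conn):
    """Aggregate over the fact table only; no sensitive column is touched."""
    return conn.execute("SELECT SUM(amount) FROM fact_order").fetchone()[0]


def emails_by_join(conn):
    """Emails still reachable by joining facts to the dimension table."""
    rows = conn.execute(
        "SELECT DISTINCT d.email FROM fact_order f "
        "JOIN dim_user d ON f.uid = d.uid ORDER BY d.email").fetchall()
    return [r[0] for r in rows]


conn = sqlite3.connect(":memory:")
build_demo(conn)
removed = erase_user(conn, 1)
```

Whether the remaining uid in the fact table still counts as personal data is
a policy question that this sketch does not answer.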


With warm regard
Xiaoxiang Yu



On Tue, Dec 12, 2023 at 12:11 PM Nam Đỗ Duy  wrote:

> Dear Xiaoxiang, Sirs/Madams
>
> I face an issue with deleting user data according to a GDPR-like policy,
> which means that when a user sends a request to delete their personal data,
> we need to delete it from all systems, that is, to delete data:
>
> 1- from Kylin index (cube)
> 2- from Hive
> 3- from HDFS
>
> Have you had the same use-case before, do you have any suggestions to
> achieve this scenario?
>
> Thank you very much and best regards
>


Re: Pinot/Kylin/Druid quick comparison

2023-12-10 Thread Xiaoxiang Yu
1. The JDBC source is a feature which is in development; it will be
supported later.

2. Kylin supports Kerberos now; I will write a doc as soon as possible.
(I will let you know.)

3. I think Ranger and Kerberos do not do the same thing: Kerberos is for
authentication and Ranger is for authorization. So they cannot replace each
other. Ranger can integrate with Kerberos; please check Ranger's website for
information.


With warm regard
Xiaoxiang Yu



On Sat, Dec 9, 2023 at 8:01 AM Nam Đỗ Duy  wrote:

> Thank you Xiaoxiang for your reply
>
> -
> Do you have any suggestions/wishes for kylin 5(except real-time feature)?
> -
> Yes: please answer to help me clear this headache:
>
> 1. Can Kylin access the existing star schema in an Oracle data warehouse?
> If not, do we have any workaround?
>
> 2. My team is using Kerberos for authentication; do you have any
> document/case study about integrating Kerberos with Kylin 4.x and Kylin 5.x?
>
> 3. Should we use Apache Ranger instead of Kerberos for authentication and
> for security purposes?
>
> Thank you again
>
> On Thu, 7 Dec 2023 at 15:00 Xiaoxiang Yu  wrote:
>
> > I guess the release date should be 2024/01 .
> > Do you have any suggestions/wishes for kylin 5(except real-time feature)?
> >
> > 
> > With warm regard
> > Xiaoxiang Yu
> >
> >
> >
> > On Thu, Dec 7, 2023 at 3:44 PM Nam Đỗ Duy 
> wrote:
> >
> >> Thank you very much xiaoxiang, I did the presentation this morning
> already
> >> so there is no time for you to comment. Next time I will send you in
> >> advance. The meeting result was that we will implement both druid and
> >> kylin
> >> in the next couple of projects because of its realtime feature. Hope
> that
> >> kylin will have same feature soon.
> >>
> >> May I ask when will you release kylin 5.0?
> >>
> >> On Thu, Dec 7, 2023 at 9:26 AM Xiaoxiang Yu  wrote:
> >>
> >> > Since 2018 there are a lot of new features and code refactor.
> >> > If you like, you can share your ppt to me privately, maybe I can
> >> > give some comments.
> >> >
> >> > Here is the reference of advantages of Kylin since 2018:
> >> > - https://kylin.apache.org/blog/2022/01/12/The-Future-Of-Kylin/
> >> > -
> >> >
> >>
> https://kylin.apache.org/blog/2021/07/02/Apache-Kylin4-A-new-storage-and-compute-architecture/
> >> > - https://kylin.apache.org/5.0/docs/development/roadmap
> >> >
> >> > 
> >> > With warm regard
> >> > Xiaoxiang Yu
> >> >
> >> >
> >> >
> >> > On Wed, Dec 6, 2023 at 6:53 PM Nam Đỗ Duy 
> >> wrote:
> >> >
> >> >> Hi Xiaoxiang, tomorrow is the main presentation between Kylin and
> >> Druid in
> >> >> my team.
> >> >>
> >> >> I found this article and would like you to update me the advantages
> of
> >> >> Kylin since 2018 until now (especially with version 5 to be released)
> >> >>
> >> >> Apache Kylin | Why did Meituan develop Kylin On Druid (part 1 of 2)?
> >> >> <
> >> >>
> >>
> https://kylin.apache.org/blog/2018/12/12/why-did-meituan-develop-kylin-on-druid-part1-of-2/
> >> >> >
> >> >>
> >> >> On Wed, Dec 6, 2023 at 9:34 AM Nam Đỗ Duy  wrote:
> >> >>
> >> >> > Thank you very much for your prompt response, I still have several
> >> >> > questions to seek for your help later.
> >> >> >
> >> >> > Best regards and have a good day
> >> >> >
> >> >> >
> >> >> >
> >> >> > On Wed, Dec 6, 2023 at 9:11 AM Xiaoxiang Yu 
> wrote:
> >> >> >
> >> >> >> Done. Github branch changed to kylin5.
> >> >> >>
> >> >> >> 
> >> >> >> With warm regard
> >> >> >> Xiaoxiang Yu
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> On Tue, Dec 5, 2023 at 11:13 AM Xiaoxiang Yu 
> >> wrote:
> >> >> >>
> >> >> >> > A JIRA ticket has been opened, waiting for INFRA :
> >> >> >> > https://issues.apache.org/jira/browse/INFRA-25238 .
> >> >> >> > 

Re: Pinot/Kylin/Druid quick comparison

2023-12-07 Thread Xiaoxiang Yu
I guess the release date should be 2024/01.
Do you have any suggestions/wishes for Kylin 5 (except the real-time feature)?


With warm regard
Xiaoxiang Yu



On Thu, Dec 7, 2023 at 3:44 PM Nam Đỗ Duy  wrote:

> Thank you very much xiaoxiang, I did the presentation this morning already
> so there is no time for you to comment. Next time I will send you in
> advance. The meeting result was that we will implement both druid and kylin
> in the next couple of projects because of its realtime feature. Hope that
> kylin will have same feature soon.
>
> May I ask when will you release kylin 5.0?
>
> On Thu, Dec 7, 2023 at 9:26 AM Xiaoxiang Yu  wrote:
>
> > Since 2018 there are a lot of new features and code refactor.
> > If you like, you can share your ppt to me privately, maybe I can
> > give some comments.
> >
> > Here is the reference of advantages of Kylin since 2018:
> > - https://kylin.apache.org/blog/2022/01/12/The-Future-Of-Kylin/
> > -
> >
> https://kylin.apache.org/blog/2021/07/02/Apache-Kylin4-A-new-storage-and-compute-architecture/
> > - https://kylin.apache.org/5.0/docs/development/roadmap
> >
> > 
> > With warm regard
> > Xiaoxiang Yu
> >
> >
> >
> > On Wed, Dec 6, 2023 at 6:53 PM Nam Đỗ Duy 
> wrote:
> >
> >> Hi Xiaoxiang, tomorrow is the main presentation between Kylin and Druid
> in
> >> my team.
> >>
> >> I found this article and would like you to update me the advantages of
> >> Kylin since 2018 until now (especially with version 5 to be released)
> >>
> >> Apache Kylin | Why did Meituan develop Kylin On Druid (part 1 of 2)?
> >> <
> >>
> https://kylin.apache.org/blog/2018/12/12/why-did-meituan-develop-kylin-on-druid-part1-of-2/
> >> >
> >>
> >> On Wed, Dec 6, 2023 at 9:34 AM Nam Đỗ Duy  wrote:
> >>
> >> > Thank you very much for your prompt response, I still have several
> >> > questions to seek for your help later.
> >> >
> >> > Best regards and have a good day
> >> >
> >> >
> >> >
> >> > On Wed, Dec 6, 2023 at 9:11 AM Xiaoxiang Yu  wrote:
> >> >
> >> >> Done. Github branch changed to kylin5.
> >> >>
> >> >> 
> >> >> With warm regard
> >> >> Xiaoxiang Yu
> >> >>
> >> >>
> >> >>
> >> >> On Tue, Dec 5, 2023 at 11:13 AM Xiaoxiang Yu 
> wrote:
> >> >>
> >> >> > A JIRA ticket has been opened, waiting for INFRA :
> >> >> > https://issues.apache.org/jira/browse/INFRA-25238 .
> >> >> > 
> >> >> > With warm regard
> >> >> > Xiaoxiang Yu
> >> >> >
> >> >> >
> >> >> >
> >> >> > On Tue, Dec 5, 2023 at 10:30 AM Nam Đỗ Duy  >
> >> >> wrote:
> >> >> >
> >> >> >> Thank you Xiaoxiang, please update me when you have changed your
> >> >> default
> >> >> >> branch. In case people are impressed by the numbers then I hope to
> >> turn
> >> >> >> this situation to reverse direction.
> >> >> >>
> >> >> >> On Tue, Dec 5, 2023 at 9:02 AM Xiaoxiang Yu 
> >> wrote:
> >> >> >>
> >> >> >>> The default branch is for 4.X which is a maintained branch, the
> >> active
> >> >> >>> branch is kylin5.
> >> >> >>> I will change the default branch to kylin5 later.
> >> >> >>>
> >> >> >>> 
> >> >> >>> With warm regard
> >> >> >>> Xiaoxiang Yu
> >> >> >>>
> >> >> >>>
> >> >> >>>
> >> >> >>> On Tue, Dec 5, 2023 at 9:12 AM Nam Đỗ Duy  >
> >> >> >>> wrote:
> >> >> >>>
> >> >> >>>> Hi Xiaoxiang, Sirs / Madams
> >> >> >>>>
> >> >> >>>> Can you see the attached photo?
> >> >> >>>>
> >> >> >>>> My boss asked that why druid commit code regularly but kylin had
> >> not
> >> >> >>>> been committed since July
> >> >> >>>>

Re: Pinot/Kylin/Druid quick comparison

2023-12-06 Thread Xiaoxiang Yu
Since 2018 there have been a lot of new features and code refactoring.
If you like, you can share your ppt with me privately; maybe I can
give some comments.

Here are some references on the advantages of Kylin since 2018:
- https://kylin.apache.org/blog/2022/01/12/The-Future-Of-Kylin/
- https://kylin.apache.org/blog/2021/07/02/Apache-Kylin4-A-new-storage-and-compute-architecture/
- https://kylin.apache.org/5.0/docs/development/roadmap


With warm regard
Xiaoxiang Yu



On Wed, Dec 6, 2023 at 6:53 PM Nam Đỗ Duy  wrote:

> Hi Xiaoxiang, tomorrow is the main presentation between Kylin and Druid in
> my team.
>
> I found this article and would like you to update me the advantages of
> Kylin since 2018 until now (especially with version 5 to be released)
>
> Apache Kylin | Why did Meituan develop Kylin On Druid (part 1 of 2)?
> <
> https://kylin.apache.org/blog/2018/12/12/why-did-meituan-develop-kylin-on-druid-part1-of-2/
> >
>
> On Wed, Dec 6, 2023 at 9:34 AM Nam Đỗ Duy  wrote:
>
> > Thank you very much for your prompt response, I still have several
> > questions to seek for your help later.
> >
> > Best regards and have a good day
> >
> >
> >
> > On Wed, Dec 6, 2023 at 9:11 AM Xiaoxiang Yu  wrote:
> >
> >> Done. Github branch changed to kylin5.
> >>
> >> ----
> >> With warm regard
> >> Xiaoxiang Yu
> >>
> >>
> >>
> >> On Tue, Dec 5, 2023 at 11:13 AM Xiaoxiang Yu  wrote:
> >>
> >> > A JIRA ticket has been opened, waiting for INFRA :
> >> > https://issues.apache.org/jira/browse/INFRA-25238 .
> >> > 
> >> > With warm regard
> >> > Xiaoxiang Yu
> >> >
> >> >
> >> >
> >> > On Tue, Dec 5, 2023 at 10:30 AM Nam Đỗ Duy 
> >> wrote:
> >> >
> >> >> Thank you Xiaoxiang, please update me when you have changed your
> >> default
> >> >> branch. In case people are impressed by the numbers then I hope to
> turn
> >> >> this situation to reverse direction.
> >> >>
> >> >> On Tue, Dec 5, 2023 at 9:02 AM Xiaoxiang Yu  wrote:
> >> >>
> >> >>> The default branch is for 4.X which is a maintained branch, the
> active
> >> >>> branch is kylin5.
> >> >>> I will change the default branch to kylin5 later.
> >> >>>
> >> >>> 
> >> >>> With warm regard
> >> >>> Xiaoxiang Yu
> >> >>>
> >> >>>
> >> >>>
> >> >>> On Tue, Dec 5, 2023 at 9:12 AM Nam Đỗ Duy 
> >> >>> wrote:
> >> >>>
> >> >>>> Hi Xiaoxiang, Sirs / Madams
> >> >>>>
> >> >>>> Can you see the attached photo?
> >> >>>>
> >> >>>> My boss asked that why druid commit code regularly but kylin had
> not
> >> >>>> been committed since July
> >> >>>>
> >> >>>>
> >> >>>> On Mon, 4 Dec 2023 at 15:33 Xiaoxiang Yu  wrote:
> >> >>>>
> >> >>>>> I think so.
> >> >>>>>
> >> >>>>> Response time is not the only factor to make a decision. Kylin
> could
> >> >>>>> be cheaper
> >> >>>>> when the query pattern is suitable for the Kylin model, and Kylin
> >> can
> >> >>>>> guarantee
> >> >>>>> reasonable query latency. Clickhouse will be quicker in an ad hoc
> >> >>>>> query scenario.
> >> >>>>>
> >> >>>>> By the way, Youzan and Kyligence combine them together to provide
> >> >>>>> unified data analytics services for their customers.
> >> >>>>>
> >> >>>>> 
> >> >>>>> With warm regard
> >> >>>>> Xiaoxiang Yu
> >> >>>>>
> >> >>>>>
> >> >>>>>
> >> >>>>> On Mon, Dec 4, 2023 at 4:01 PM Nam Đỗ Duy  >
> >> >>>>> wrote:
> >> >>>>>
> >> >>>>>> Hi Xiaoxiang, thank you
> >> >>>>>>
> >> >>>>>> In case my client uses cloud computing service like gcp or aws,

Re: Pinot/Kylin/Druid quick comparison

2023-12-05 Thread Xiaoxiang Yu
Done. The GitHub default branch has been changed to kylin5.


With warm regard
Xiaoxiang Yu



On Tue, Dec 5, 2023 at 11:13 AM Xiaoxiang Yu  wrote:

> A JIRA ticket has been opened, waiting for INFRA :
> https://issues.apache.org/jira/browse/INFRA-25238 .
> 
> With warm regard
> Xiaoxiang Yu
>
>
>
> On Tue, Dec 5, 2023 at 10:30 AM Nam Đỗ Duy  wrote:
>
>> Thank you Xiaoxiang, please update me when you have changed your default
>> branch. In case people are impressed by the numbers then I hope to turn
>> this situation to reverse direction.
>>
>> On Tue, Dec 5, 2023 at 9:02 AM Xiaoxiang Yu  wrote:
>>
>>> The default branch is for 4.X which is a maintained branch, the active
>>> branch is kylin5.
>>> I will change the default branch to kylin5 later.
>>>
>>> 
>>> With warm regard
>>> Xiaoxiang Yu
>>>
>>>
>>>
>>> On Tue, Dec 5, 2023 at 9:12 AM Nam Đỗ Duy 
>>> wrote:
>>>
>>>> Hi Xiaoxiang, Sirs / Madams
>>>>
>>>> Can you see the attached photo?
>>>>
>>>> My boss asked that why druid commit code regularly but kylin had not
>>>> been committed since July
>>>>
>>>>
>>>> On Mon, 4 Dec 2023 at 15:33 Xiaoxiang Yu  wrote:
>>>>
>>>>> I think so.
>>>>>
>>>>> Response time is not the only factor to make a decision. Kylin could
>>>>> be cheaper
>>>>> when the query pattern is suitable for the Kylin model, and Kylin can
>>>>> guarantee
>>>>> reasonable query latency. Clickhouse will be quicker in an ad hoc
>>>>> query scenario.
>>>>>
>>>>> By the way, Youzan and Kyligence combine them together to provide
>>>>> unified data analytics services for their customers.
>>>>>
>>>>> 
>>>>> With warm regard
>>>>> Xiaoxiang Yu
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Dec 4, 2023 at 4:01 PM Nam Đỗ Duy 
>>>>> wrote:
>>>>>
>>>>>> Hi Xiaoxiang, thank you
>>>>>>
>>>>>> In case my client uses cloud computing service like gcp or aws, which
>>>>>> will cost more: precalculation feature of kylin or clickhouse (incase
>>>>>> of
>>>>>> kylin, I have a thought that the query execution has been done once
>>>>>> and
>>>>>> stored in cube to be used many times so kylin uses less cloud
>>>>>> computation,
>>>>>> is that true)?
>>>>>>
>>>>>> On Mon, Dec 4, 2023 at 2:46 PM Xiaoxiang Yu  wrote:
>>>>>>
>>>>>> > Following text is part of an article(
>>>>>> > https://zhuanlan.zhihu.com/p/343394287) .
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> ===
>>>>>> >
>>>>>> > Kylin is suitable for aggregation queries with fixed modes because
>>>>>> of its
>>>>>> > pre-calculated technology, for example, join, group by, and where
>>>>>> condition
>>>>>> > modes in SQL are relatively fixed, etc. The larger the data volume
>>>>>> is, the
>>>>>> > more obvious the advantages of using Kylin are; in particular,
>>>>>> Kylin is
>>>>>> > particularly advantageous in the scenarios of de-emphasis (count
>>>>>> distinct),
>>>>>> > Top N, and Percentile. In particular, Kylin's advantages in
>>>>>> de-weighting
>>>>>> > (count distinct), Top N, Percentile and other scenarios are
>>>>>> especially
>>>>>> > huge, and it is used in a large number of scenarios, such as
>>>>>> Dashboard, all
>>>>>> > kinds of reports, large-screen display, traffic statistics, and user
>>>>>> > behavior analysis. Meituan, Aurora, Shell Housing, etc. use Kylin
>>>>>> to build
>>>>>> > their data service platforms, providing millions to tens of
>>>>>> millions of
>>>>>> > queries per day, and most of the queries can be completed within 2
>

[jira] [Created] (KYLIN-5730) Query dry-run for better modeling

2023-12-04 Thread Xiaoxiang Yu (Jira)
Xiaoxiang Yu created KYLIN-5730:
---

 Summary: Query dry-run for better modeling
 Key: KYLIN-5730
 URL: https://issues.apache.org/jira/browse/KYLIN-5730
 Project: Kylin
  Issue Type: Improvement
Affects Versions: 5.0-beta
Reporter: Xiaoxiang Yu
Assignee: Xiaoxiang Yu
 Fix For: 5.0.0


When a user enables this feature, the query insight page will display helpful
messages that help the user understand why a model is not matched.

These messages will display some query analytics, including at least:
 # RelNode Tree
 # OLAPContext and matched Model for each context
 # Spark Physical Plan

The configuration entry is 'kylin.query.dryrun-enabled', set at the project level.





Re: Pinot/Kylin/Druid quick comparison

2023-12-04 Thread Xiaoxiang Yu
A JIRA ticket has been opened and is waiting for INFRA:
https://issues.apache.org/jira/browse/INFRA-25238

With warm regard
Xiaoxiang Yu



On Tue, Dec 5, 2023 at 10:30 AM Nam Đỗ Duy  wrote:

> Thank you Xiaoxiang, please update me when you have changed your default
> branch. In case people are impressed by the numbers then I hope to turn
> this situation to reverse direction.
>
> On Tue, Dec 5, 2023 at 9:02 AM Xiaoxiang Yu  wrote:
>
>> The default branch is for 4.X which is a maintained branch, the active
>> branch is kylin5.
>> I will change the default branch to kylin5 later.
>>
>> ----
>> With warm regard
>> Xiaoxiang Yu
>>
>>
>>
>> On Tue, Dec 5, 2023 at 9:12 AM Nam Đỗ Duy  wrote:
>>
>>> Hi Xiaoxiang, Sirs / Madams
>>>
>>> Can you see the attached photo?
>>>
>>> My boss asked that why druid commit code regularly but kylin had not
>>> been committed since July
>>>
>>>
>>> On Mon, 4 Dec 2023 at 15:33 Xiaoxiang Yu  wrote:
>>>
>>>> I think so.
>>>>
>>>> Response time is not the only factor to make a decision. Kylin could be
>>>> cheaper
>>>> when the query pattern is suitable for the Kylin model, and Kylin can
>>>> guarantee
>>>> reasonable query latency. Clickhouse will be quicker in an ad hoc query
>>>> scenario.
>>>>
>>>> By the way, Youzan and Kyligence combine them together to provide
>>>> unified data analytics services for their customers.
>>>>
>>>> 
>>>> With warm regard
>>>> Xiaoxiang Yu
>>>>
>>>>
>>>>
>>>> On Mon, Dec 4, 2023 at 4:01 PM Nam Đỗ Duy 
>>>> wrote:
>>>>
>>>>> Hi Xiaoxiang, thank you
>>>>>
>>>>> In case my client uses cloud computing service like gcp or aws, which
>>>>> will cost more: precalculation feature of kylin or clickhouse (incase
>>>>> of
>>>>> kylin, I have a thought that the query execution has been done once and
>>>>> stored in cube to be used many times so kylin uses less cloud
>>>>> computation,
>>>>> is that true)?
>>>>>
>>>>> On Mon, Dec 4, 2023 at 2:46 PM Xiaoxiang Yu  wrote:
>>>>>
>>>>> > Following text is part of an article(
>>>>> > https://zhuanlan.zhihu.com/p/343394287) .
>>>>> >
>>>>> >
>>>>> >
>>>>> ===
>>>>> >
>>>>> > Kylin is suitable for aggregation queries with fixed modes because
>>>>> of its
>>>>> > pre-calculated technology, for example, join, group by, and where
>>>>> condition
>>>>> > modes in SQL are relatively fixed, etc. The larger the data volume
>>>>> is, the
>>>>> > more obvious the advantages of using Kylin are; in particular, Kylin
>>>>> is
>>>>> > particularly advantageous in the scenarios of de-emphasis (count
>>>>> distinct),
>>>>> > Top N, and Percentile. In particular, Kylin's advantages in
>>>>> de-weighting
>>>>> > (count distinct), Top N, Percentile and other scenarios are
>>>>> especially
>>>>> > huge, and it is used in a large number of scenarios, such as
>>>>> Dashboard, all
>>>>> > kinds of reports, large-screen display, traffic statistics, and user
>>>>> > behavior analysis. Meituan, Aurora, Shell Housing, etc. use Kylin to
>>>>> build
>>>>> > their data service platforms, providing millions to tens of millions
>>>>> of
>>>>> > queries per day, and most of the queries can be completed within 2 -
>>>>> 3
>>>>> > seconds. There is no better alternative for such a high concurrency
>>>>> > scenario.
>>>>> >
>>>>> > ClickHouse, because of its MPP architecture, has high computing
>>>>> power and
>>>>> > is more suitable when the query request is more flexible, or when
>>>>> there is
>>>>> > a need for detailed queries with low concurrency. Scenarios include:
>>>>> very
>>>>> > many columns and where conditions are arbitrarily combin

Re: Pinot/Kylin/Druid quick comparison

2023-12-04 Thread Xiaoxiang Yu
The default branch is for 4.X, which is a maintained branch; the active
branch is kylin5.
I will change the default branch to kylin5 later.


With warm regard
Xiaoxiang Yu



On Tue, Dec 5, 2023 at 9:12 AM Nam Đỗ Duy  wrote:

> Hi Xiaoxiang, Sirs / Madams
>
> Can you see the attached photo?
>
> My boss asked that why druid commit code regularly but kylin had not been
> committed since July
>
>
> On Mon, 4 Dec 2023 at 15:33 Xiaoxiang Yu  wrote:
>
>> I think so.
>>
>> Response time is not the only factor to make a decision. Kylin could be
>> cheaper
>> when the query pattern is suitable for the Kylin model, and Kylin can
>> guarantee
>> reasonable query latency. Clickhouse will be quicker in an ad hoc query
>> scenario.
>>
>> By the way, Youzan and Kyligence combine them together to provide
>> unified data analytics services for their customers.
>>
>> 
>> With warm regard
>> Xiaoxiang Yu
>>
>>
>>
>> On Mon, Dec 4, 2023 at 4:01 PM Nam Đỗ Duy  wrote:
>>
>>> Hi Xiaoxiang, thank you
>>>
>>> In case my client uses cloud computing service like gcp or aws, which
>>> will cost more: precalculation feature of kylin or clickhouse (incase of
>>> kylin, I have a thought that the query execution has been done once and
>>> stored in cube to be used many times so kylin uses less cloud
>>> computation,
>>> is that true)?
>>>
>>> On Mon, Dec 4, 2023 at 2:46 PM Xiaoxiang Yu  wrote:
>>>
>>> > Following text is part of an article(
>>> > https://zhuanlan.zhihu.com/p/343394287) .
>>> >
>>> >
>>> >
>>> ===
>>> >
>>> > Kylin is suitable for aggregation queries with fixed modes because of
>>> its
>>> > pre-calculated technology, for example, join, group by, and where
>>> condition
>>> > modes in SQL are relatively fixed, etc. The larger the data volume is,
>>> the
>>> > more obvious the advantages of using Kylin are; in particular, Kylin is
>>> > particularly advantageous in the scenarios of de-emphasis (count
>>> distinct),
>>> > Top N, and Percentile. In particular, Kylin's advantages in
>>> de-weighting
>>> > (count distinct), Top N, Percentile and other scenarios are especially
>>> > huge, and it is used in a large number of scenarios, such as
>>> Dashboard, all
>>> > kinds of reports, large-screen display, traffic statistics, and user
>>> > behavior analysis. Meituan, Aurora, Shell Housing, etc. use Kylin to
>>> build
>>> > their data service platforms, providing millions to tens of millions of
>>> > queries per day, and most of the queries can be completed within 2 - 3
>>> > seconds. There is no better alternative for such a high concurrency
>>> > scenario.
>>> >
>>> > ClickHouse, because of its MPP architecture, has high computing power
>>> and
>>> > is more suitable when the query request is more flexible, or when
>>> there is
>>> > a need for detailed queries with low concurrency. Scenarios include:
>>> very
>>> > many columns and where conditions are arbitrarily combined with the
>>> user
>>> > label filtering, not a large amount of concurrency of complex
>>> on-the-spot
>>> > query and so on. If the amount of data and access is large, you need to
>>> > deploy a distributed ClickHouse cluster, which is a higher challenge
>>> for
>>> > operation and maintenance.
>>> >
>>> > If some queries are very flexible but infrequent, it is more
>>> > resource-efficient to use now-computing. Since the number of queries is
>>> > small, even if each query consumes a lot of computational resources,
>>> it is
>>> > still cost-effective overall. If some queries have a fixed pattern and
>>> the
>>> > query volume is large, it is more suitable for Kylin, because the query
>>> > volume is large, and by using large computational resources to save the
>>> > results, the upfront computational cost can be amortized over each
>>> query,
>>> > so it is the most economical.
>>> >
>>> > --- Translated with DeepL.com (free version)
>>> >
>>> >
>>> > 
>>> > With warm regard
>>> > Xia

Re: Pinot/Kylin/Druid quick comparison

2023-12-04 Thread Xiaoxiang Yu
I think so.

Response time is not the only factor in making a decision. Kylin could be
cheaper when the query pattern is suitable for the Kylin model, and Kylin
can guarantee reasonable query latency. ClickHouse will be quicker in an
ad hoc query scenario.

By the way, Youzan and Kyligence combine them together to provide
unified data analytics services for their customers.


With warm regard
Xiaoxiang Yu



On Mon, Dec 4, 2023 at 4:01 PM Nam Đỗ Duy  wrote:

> Hi Xiaoxiang, thank you
>
> In case my client uses cloud computing service like gcp or aws, which
> will cost more: precalculation feature of kylin or clickhouse (incase of
> kylin, I have a thought that the query execution has been done once and
> stored in cube to be used many times so kylin uses less cloud computation,
> is that true)?
>
> On Mon, Dec 4, 2023 at 2:46 PM Xiaoxiang Yu  wrote:
>
> > Following text is part of an article(
> > https://zhuanlan.zhihu.com/p/343394287) .
> >
> >
> >
> ===
> >
> > Kylin is suitable for aggregation queries with fixed modes because of its
> > pre-calculated technology, for example, join, group by, and where
> condition
> > modes in SQL are relatively fixed, etc. The larger the data volume is,
> the
> > more obvious the advantages of using Kylin are; in particular, Kylin is
> > particularly advantageous in the scenarios of de-emphasis (count
> distinct),
> > Top N, and Percentile. In particular, Kylin's advantages in de-weighting
> > (count distinct), Top N, Percentile and other scenarios are especially
> > huge, and it is used in a large number of scenarios, such as Dashboard,
> all
> > kinds of reports, large-screen display, traffic statistics, and user
> > behavior analysis. Meituan, Aurora, Shell Housing, etc. use Kylin to
> build
> > their data service platforms, providing millions to tens of millions of
> > queries per day, and most of the queries can be completed within 2 - 3
> > seconds. There is no better alternative for such a high concurrency
> > scenario.
> >
> > ClickHouse, because of its MPP architecture, has high computing power and
> > is more suitable when the query request is more flexible, or when there
> is
> > a need for detailed queries with low concurrency. Scenarios include: very
> > many columns and where conditions are arbitrarily combined with the user
> > label filtering, not a large amount of concurrency of complex on-the-spot
> > query and so on. If the amount of data and access is large, you need to
> > deploy a distributed ClickHouse cluster, which is a higher challenge for
> > operation and maintenance.
> >
> > If some queries are very flexible but infrequent, it is more
> > resource-efficient to use now-computing. Since the number of queries is
> > small, even if each query consumes a lot of computational resources, it
> is
> > still cost-effective overall. If some queries have a fixed pattern and
> the
> > query volume is large, it is more suitable for Kylin, because the query
> > volume is large, and by using large computational resources to save the
> > results, the upfront computational cost can be amortized over each query,
> > so it is the most economical.
> >
> > --- Translated with DeepL.com (free version)
> >
> >
> > 
> > With warm regard
> > Xiaoxiang Yu
> >
> >
> >
> > On Mon, Dec 4, 2023 at 3:16 PM Nam Đỗ Duy 
> wrote:
> >
> >> Thank you Xiaoxiang for the near real time streaming feature. That's
> >> great.
> >>
> >> This morning there has been a new challenge to my team: clickhouse
> offered
> >> us the speed of calculating 8 billion rows in millisecond which is
> faster
> >> than my demonstration (I used Kylin to do calculating 1 billion rows in
> >> 2.9
> >> seconds)
> >>
> >> Can you briefly suggest the advantages of kylin over clickhouse so that
> I
> >> can defend my demonstration.
> >>
> >> On Mon, Dec 4, 2023 at 1:55 PM Xiaoxiang Yu  wrote:
> >>
> >> > 1. "In this important scenario of realtime analytics, the reason here
> is
> >> > that
> >> > kylin has lag time due to model update of new segment build, is that
> >> > correct?"
> >> >
> >> > You are correct.
> >> >
> >> > 2. "If that is true, then can you suggest a work-around of combination
> >> of
> >> > ... "
> >> >
> >> > Kylin is 

Re: Pinot/Kylin/Druid quick comparison

2023-12-03 Thread Xiaoxiang Yu
The following text is part of an article
(https://zhuanlan.zhihu.com/p/343394287).

===

Kylin is suitable for aggregation queries with fixed patterns because of its
pre-computation technology, for example when the join, group by, and where
condition patterns in SQL are relatively fixed. The larger the data volume
is, the more obvious the advantages of using Kylin are. In particular,
Kylin's advantages in deduplication (count distinct), Top N, Percentile, and
similar scenarios are especially large, and it is widely used for
dashboards, all kinds of reports, large-screen displays, traffic statistics,
and user behavior analysis. Meituan, Aurora, Shell Housing, etc. use Kylin
to build their data service platforms, serving millions to tens of millions
of queries per day, and most of the queries can be completed within 2 - 3
seconds. There is no better alternative for such a high concurrency
scenario.

ClickHouse, because of its MPP architecture, has high computing power and
is more suitable when query requests are more flexible, or when there is
a need for detail queries with low concurrency. Scenarios include user label
filtering over very many columns with arbitrarily combined where conditions,
complex ad hoc queries without much concurrency, and so on. If the amount of
data and access is large, you need to deploy a distributed ClickHouse
cluster, which is a bigger challenge for operation and maintenance.

If some queries are very flexible but infrequent, it is more
resource-efficient to compute them on the fly. Since the number of queries is
small, even if each query consumes a lot of computational resources, it is
still cost-effective overall. If some queries have a fixed pattern and the
query volume is large, Kylin is more suitable: by spending large
computational resources up front to save the results, the upfront
computational cost can be amortized over every query, so it is the most
economical.

--- Translated with DeepL.com (free version)
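
The amortization argument can be sketched as simple arithmetic. The snippet
below is only an illustration: the function names and the cost figures
(abstract "cost units") are hypothetical and do not come from the article
or from any real cloud bill.

```python
import math


def on_demand_cost(n_queries, cost_per_query):
    """Total cost when every query is computed from raw data."""
    return n_queries * cost_per_query


def precomputed_cost(n_queries, build_cost, cost_per_lookup):
    """Total cost when results are built once and then looked up."""
    return build_cost + n_queries * cost_per_lookup


def break_even_queries(build_cost, cost_per_query, cost_per_lookup):
    """Smallest query count at which precomputation is no more expensive.

    Returns None when a lookup is not cheaper than an on-demand query,
    because the upfront cost can then never be recovered.
    """
    saving = cost_per_query - cost_per_lookup
    if saving <= 0:
        return None
    return math.ceil(build_cost / saving)


# Hypothetical numbers: building costs 1000 units, an on-demand query
# costs 10 units, and a lookup against precomputed results costs 1 unit.
n = break_even_queries(build_cost=1000, cost_per_query=10, cost_per_lookup=1)
```

With these assumed figures, precomputation starts to pay off at 112 queries;
below that, computing on demand is cheaper, which matches the "flexible but
infrequent" case described above.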



With warm regard
Xiaoxiang Yu



On Mon, Dec 4, 2023 at 3:16 PM Nam Đỗ Duy  wrote:

> Thank you Xiaoxiang for the near real time streaming feature. That's great.
>
> This morning there has been a new challenge to my team: clickhouse offered
> us the speed of calculating 8 billion rows in millisecond which is faster
> than my demonstration (I used Kylin to do calculating 1 billion rows in 2.9
> seconds)
>
> Can you briefly suggest the advantages of kylin over clickhouse so that I
> can defend my demonstration.
>
> On Mon, Dec 4, 2023 at 1:55 PM Xiaoxiang Yu  wrote:
>
> > 1. "In this important scenario of realtime analytics, the reason here is
> > that
> > kylin has lag time due to model update of new segment build, is that
> > correct?"
> >
> > You are correct.
> >
> > 2. "If that is true, then can you suggest a work-around of combination of
> > ... "
> >
> > Kylin is planning to introduce NRT streaming(coding is completed but not
> > released),
> > which can make the time-lag to about 3 minutes(that is my estimation but
> I
> > am
> > quite certain about it).
> > NRT stands for 'near real-time', it will run a job and do micro-batch
> > aggregation and persistence periodically. The price is that you need to
> run
> > and monitor a long-running
> >  job. This feature is based on Spark Streaming, so you need knowledge of
> > it.
> >
> > I am curious about what is the maximum time-lag your customers
> > can tolerate?
> > Personally, I guess minute level time-lag is ok for most cases.
> >
> > 
> > With warm regard
> > Xiaoxiang Yu
> >
> >
> >
> > On Mon, Dec 4, 2023 at 12:28 PM Nam Đỗ Duy 
> wrote:
> >
> > > Druid is better in
> > > - Have a real-time datasource like Kafka etc.
> > >
> > > ==
> > >
> > > Hi Xiaoxiang, thank you for your response.
> > >
> > > In this important scenario of realtime alalytics, the reason here is
> that
> > > kylin has lag time due to model update of new segment build, is that
> > > correct?
> > >
> > > If that is true, then can you suggest a work-around of combination of :
> > >
> > > (time - lag kylin cube) + (realtime DB update) to provide
> > > realtime capability ?
> > >
> > > IMO, the point here is to find that (realtime DB update) and integrate
> it
> > > with (time - lag kylin cu

Re: Pinot/Kylin/Druid quick comparision

2023-12-03 Thread Xiaoxiang Yu
1. "In this important scenario of realtime analytics, the reason here is
that kylin has lag time due to model update of new segment build, is that
correct?"

You are correct.

2. "If that is true, then can you suggest a work-around of combination of
... "

Kylin is planning to introduce NRT streaming (coding is complete but not
yet released), which can reduce the time lag to about 3 minutes (that is
my estimate, but I am quite confident in it).
NRT stands for 'near real-time': it runs a job that periodically performs
micro-batch aggregation and persistence. The price is that you need to run
and monitor a long-running job. This feature is based on Spark Streaming,
so you need some knowledge of it.
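
To illustrate what micro-batch aggregation means in general, here is a
plain Python sketch. It is not Kylin's NRT implementation and does not use
the Spark Streaming API; the function name, the event layout, and the
180-second batch size (taken from the "about 3 minutes" estimate) are
assumptions for illustration only.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Event = Tuple[int, str, int]  # (timestamp in seconds, dimension key, value)


def micro_batch_aggregate(events: Iterable[Event],
                          batch_seconds: int = 180) -> Dict[Tuple[int, str], int]:
    """Group events into fixed-size time batches and sum values per key.

    Each event is assigned to the batch whose start time is the event
    timestamp rounded down to a multiple of batch_seconds.
    """
    totals: Dict[Tuple[int, str], int] = defaultdict(int)
    for timestamp, key, value in events:
        batch_start = timestamp - timestamp % batch_seconds
        totals[(batch_start, key)] += value
    return dict(totals)


if __name__ == "__main__":
    sample = [(0, "a", 1), (179, "a", 2), (180, "a", 5), (200, "b", 7)]
    for (batch_start, key), total in sorted(micro_batch_aggregate(sample).items()):
        print(batch_start, key, total)
```

In a scheme like this an event only becomes queryable after its batch has
been aggregated and persisted, which is where a minute-level time lag comes
from.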

I am curious: what is the maximum time lag your customers can tolerate?
Personally, I guess a minute-level time lag is OK for most cases.

----
With warm regard
Xiaoxiang Yu



On Mon, Dec 4, 2023 at 12:28 PM Nam Đỗ Duy  wrote:

> Druid is better in
> - Have a real-time datasource like Kafka etc.
>
> ==
>
> Hi Xiaoxiang, thank you for your response.
>
> In this important scenario of realtime alalytics, the reason here is that
> kylin has lag time due to model update of new segment build, is that
> correct?
>
> If that is true, then can you suggest a work-around of combination of :
>
> (time - lag kylin cube) + (realtime DB update) to provide
> realtime capability ?
>
> IMO, the point here is to find that (realtime DB update) and integrate it
> with (time - lag kylin cube).
>
> On Fri, Dec 1, 2023 at 1:53 PM Xiaoxiang Yu  wrote:
>
> > I researched and tested Druid two years ago(I don't know too much about
> >  the change of Druid in these two years. New features that I know are :
> > new UI, fully on K8s etc).
> >
> > Here are some cases you should consider using Druid other than Kylin
> > at the moment (using Kylin 5.0-beta to compare the Druid which I used two
> > years ago):
> >
> > - Have a real-time datasource like Kafka etc.
> > - Most queries are small(Based on my test result, I think Druid had
> better
> > response time for small queries two years ago.)
> > - Don't know how to optimize Spark/Hadoop, want to use the K8S/public
> >   cloud platform as your deployment platform.
> >
> > But I do think there are many scenarios in which Kylin could be better,
> > like:
> >
> > - Better performance for complex/big queries. Kylin can have a more
> > exact-match/fine-grained
> >   Index for queries containing different `Group By dimensions`.
> > - User-friendly UI for modeling.
> > - Support 'Join' better? (Not sure at the moment)
> > - ODBC driver for different BI.(its website did not show it supports ODBC
> > well)
> > - Looks like Kylin supports ANSI SQL better than Druid.
> >
> >
> > I don't know Pinot, so I have nothing to say about it.
> > Hope to help you, or you are free to share your opinion.
> >
> > 
> > With warm regard
> > Xiaoxiang Yu
> >
> >
> >
> > On Fri, Dec 1, 2023 at 11:11 AM Nam Đỗ Duy 
> wrote:
> >
> >> Dear Xiaoxiang,
> >> Sirs/Madams,
> >>
> >> May I post my boss's question:
> >>
> >> What are the pros and cons of the OLAP platform Kylin compared to Pinot
> >> and
> >> Druid?
> >>
> >> Please kindly let me know
> >>
> >> Thank you very much and best regards
> >>
> >
>


Re: Pinot/Kylin/Druid quick comparision

2023-11-30 Thread Xiaoxiang Yu
I researched and tested Druid two years ago (I don't know much about how
Druid has changed in these two years; the new features I know of are the
new UI, running fully on K8s, etc.).

Here are some cases in which you should consider using Druid rather than
Kylin at the moment (comparing Kylin 5.0-beta with the Druid version I
used two years ago):

- You have a real-time data source such as Kafka.
- Most queries are small (based on my test results, I think Druid had
  better response times for small queries two years ago).
- You don't know how to optimize Spark/Hadoop and want to use K8s or a
  public cloud platform as your deployment platform.

But I do think there are many scenarios in which Kylin could be better,
such as:

- Better performance for complex/big queries. Kylin can have a more
  exact-match/fine-grained index for queries containing different
  `Group By dimensions`.
- User-friendly UI for modeling.
- Better support for 'Join'? (Not sure at the moment.)
- ODBC driver for different BI tools (Druid's website did not show that
  it supports ODBC well).
- Kylin appears to support ANSI SQL better than Druid.


I don't know Pinot, so I have nothing to say about it.
I hope this helps; feel free to share your own opinion.


With warm regard
Xiaoxiang Yu



On Fri, Dec 1, 2023 at 11:11 AM Nam Đỗ Duy  wrote:

> Dear Xiaoxiang,
> Sirs/Madams,
>
> May I post my boss's question:
>
> What are the pros and cons of the OLAP platform Kylin compared to Pinot and
> Druid?
>
> Please kindly let me know
>
> Thank you very much and best regards
>


Re: How to reflect last hour data into Hive and Kylin Insights query window

2023-11-28 Thread Xiaoxiang Yu
Sorry for my incorrect answers before. Let me make it right.

Today I tried again and reproduced the issue you reported.
The Kylin query engine may not read new files because old metadata is
cached and has not been invalidated.
This is a known issue with a proper solution: call a REST API to refresh
the metadata cache:
https://kylin.apache.org/5.0/docs/restapi/query_api#Refresh-cached-data

Here is a sample call on my side:
curl -X PUT --user ADMIN:KYLIN \
  -H "Content-Type: application/json;charset=utf-8" \
  -d '{ "tables": ["DATABASE_NAME.TABLE_NAME"]}' \
  http://localhost:7070/kylin/api/tables/single_catalog_cache

It is caused by a Spark feature (introduced in 3.1.0) that caches HDFS
file listings in the Spark driver
(https://spark.apache.org/docs/latest/sql-ref-syntax-aux-cache-refresh-table.html).
Its configuration entry is spark.sql.metadataCacheTTLSeconds.
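
If you want to trigger the same refresh from a script (for example, right
after each hourly load), the curl call above could be reproduced with the
Python standard library roughly as follows. This is a sketch: the helper
name build_refresh_request is made up, and the URL, credentials, and table
name are simply the ones from my sample call, so replace them with your
own.

```python
import base64
import json
import urllib.request
from typing import Iterable


def build_refresh_request(tables: Iterable[str],
                          base_url: str = "http://localhost:7070",
                          user: str = "ADMIN",
                          password: str = "KYLIN") -> urllib.request.Request:
    """Build (but do not send) the PUT request that refreshes the table cache."""
    body = json.dumps({"tables": list(tables)}).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url}/kylin/api/tables/single_catalog_cache",
        data=body,
        method="PUT",
        headers={
            "Content-Type": "application/json;charset=utf-8",
            "Authorization": f"Basic {token}",  # same effect as curl --user
        },
    )


if __name__ == "__main__":
    request = build_refresh_request(["DATABASE_NAME.TABLE_NAME"])
    print(request.get_method(), request.full_url)
    # To actually send it against a running Kylin instance:
    # with urllib.request.urlopen(request) as response:
    #     print(response.status)
```

Building the request separately from sending it makes it easy to check the
URL and body before pointing the script at a real instance.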

----
With warm regard
Xiaoxiang Yu



On Wed, Nov 22, 2023 at 6:06 PM Xiaoxiang Yu  wrote:

> It is a good question, I can share some articles with you.
>
> 1. How to build a metric repository by Kylin to share among data teams (DA,
> DS, AI), is that the usage of measure in Kylin?
>
> I think the metric repository(or metrics store) is actually which Kylin
> can help. For example,
> Beike(ke.com) did create an indicator/metrics platform whose backend is
> Kylin. They created a metrics
> store on the top of Kylin.
>
> The architecture looks like this
> https://mmbiz.qpic.cn/mmbiz_png/9xAoGyC249Kd9icMaNT1Gs7AlDAZic7PScYNCOkSQF8PqbuSLicoxhdk4w3kJtC0bms4FzW6iby08bNiaVsUzUkBPmg/640?wx_fmt=png=5_lazy=1_co=1
>
>
> Here is technical article which wrote in Chinese about it(I am sorry this
> is not translated):
>  https://mp.weixin.qq.com/s/hsGjuaYfEfParcgTimBLnw
>
>
> 2. How to use Kylin for the Customer segmentation of Marketing dept?
>
> Here are some articles : (sorry again for these are not translated)
> https://kylin.apache.org/blog/2016/11/28/intersect-count/
> https://zhuanlan.zhihu.com/p/100131550
> https://cn.kyligence.io/blog/kylin-chinagreentown-user-portrait-2/
>
> https://cn.kyligence.io/blog/apache-kylin-count-distinct-application-in-user-behavior-analysis/
> https://www.infoq.cn/article/xZYe1DUopNA9CzLwau3O
>
> You can send your presentation material to me if you are willing to share.
>
> 
> With warm regard
> Xiaoxiang Yu
>
>
>
> On Wed, Nov 22, 2023 at 5:36 PM Nam Đỗ Duy  wrote:
>
>> Thank you Xiaoxiang, tomorrow noon is my presentation to the management
>> about kylin so I am pending this issue to focus on following ones, can you
>> please advise:
>>
>> 1. How to build a metric repository by Kylin to share among data teams
>> (DA,
>> DS, AI), is that the usage of measure in Kylin?
>> 2. How to use Kylin for the Customer segmentation of Marketing dept?
>>
>>
>> On Wed, Nov 22, 2023 at 2:10 PM Xiaoxiang Yu  wrote:
>>
>> > Before you try again, you can use spark-sql/spark-shell to check if the
>> > data is loaded
>> > into your table successfully (or if your data is copied to the right
>> > place).
>> > Following is how to start a spark-sql/spark-shell in a container.
>> >
>> > export HADOOP_CONF_DIR=/opt/hadoop-3.2.1/etc/hadoop
>> >
>> > cd /home/kylin/apache-kylin-5.0.0-beta-bin/spark
>> >
>> > bin/spark-shell --executor-cores 1 --num-executors 1 --master yarn
>> >
>> >
>> > The result of spark-sql/spark-shell should be the same as your
>> > saw in Kylin insight page. If there are different results for the same
>> > query,
>> > which should not happen, please let me know.
>> >
>> > Hope you can fix your problem soon.
>> >
>> > 
>> > With warm regard
>> > Xiaoxiang Yu
>> >
>> >
>> >
>> > On Wed, Nov 22, 2023 at 11:59 AM Nam Đỗ Duy 
>> > wrote:
>> >
>> > > Thank you Xiaoxiang, I tried in my place and it worked for the ssb
>> > database
>> > > but it didn't work for my own database.
>> > >
>> > > It only works if I restart kylin so I guess there might be some
>> > > configuration miss in my end.
>> > >
>> > > Thank you very much anyway and will update next time.
>> > >
>> > > Have a good day.
>> > >
>> > > On Fri, Nov 17, 2023 at 5:34 PM Xiaoxiang Yu  wrote:
>> > >
>> > > > I did an easy test to verify if kylin has any bugs for the push down
>> > > > function. And the push
>> > > > down function w

Re: How to use measure of Kylin in query

2023-11-22 Thread Xiaoxiang Yu
Your summary
"I write query normally to query the desired column and Kylin uses the
index mechanism to accelerate query"
is almost right, but I cannot understand exactly what your question is.




With warm regard
Xiaoxiang Yu



On Wed, Nov 22, 2023 at 5:32 PM Nam Đỗ Duy  wrote:

> Dear Sir / Madam
>
> I've searched the web but cannot find the way to use measures of Kylin, for
> example, with this  quote from the URL of document, it seems that the
> measure's magic is as follows: "I write query normally to query the desired
> colummn and Kylin uses the index mechanism to accelerate query", can you
> please advise?
>
> Count Distinct (Precise) | Welcome to Kylin 5 (apache.org)
> <
> https://kylin.apache.org/5.0/docs/modeling/model_design/measure_design/count_distinct_bitmap
> >
>
> Once the measure is added and the model is saved, you need to go to the
> Edit
> Aggregate Index page, add the corresponding dimensions and measures to the
> appropriate aggregate group according to your business scenario, and the
> new aggregate index will be generated after submission. You need to build
> index and load data to complete the precomputation of the target column.
> You can check the job of Build Index in the Job Monitor page. After the
> index is built, you can use the Count Distinct (Precise) measure to do some
> querying.
>


Re: How to reflect last hour data into Hive and Kylin Insights query window

2023-11-22 Thread Xiaoxiang Yu
It is a good question; I can share some articles with you.

1. How to build a metric repository by Kylin to share among data teams (DA,
DS, AI), is that the usage of measure in Kylin?

I think a metric repository (or metrics store) is exactly where Kylin can
help. For example, Beike (ke.com) built an indicator/metrics platform
whose backend is Kylin; they created a metrics store on top of Kylin.

The architecture looks like this
https://mmbiz.qpic.cn/mmbiz_png/9xAoGyC249Kd9icMaNT1Gs7AlDAZic7PScYNCOkSQF8PqbuSLicoxhdk4w3kJtC0bms4FzW6iby08bNiaVsUzUkBPmg/640?wx_fmt=png=5_lazy=1_co=1


Here is a technical article about it, written in Chinese (I am sorry it
is not translated):
 https://mp.weixin.qq.com/s/hsGjuaYfEfParcgTimBLnw


2. How to use Kylin for the Customer segmentation of Marketing dept?

Here are some articles (sorry again that these are not translated):
https://kylin.apache.org/blog/2016/11/28/intersect-count/
https://zhuanlan.zhihu.com/p/100131550
https://cn.kyligence.io/blog/kylin-chinagreentown-user-portrait-2/
https://cn.kyligence.io/blog/apache-kylin-count-distinct-application-in-user-behavior-analysis/
https://www.infoq.cn/article/xZYe1DUopNA9CzLwau3O

You can send your presentation material to me if you are willing to share.


With warm regard
Xiaoxiang Yu



On Wed, Nov 22, 2023 at 5:36 PM Nam Đỗ Duy  wrote:

> Thank you Xiaoxiang, tomorrow noon is my presentation to the management
> about kylin so I am pending this issue to focus on following ones, can you
> please advise:
>
> 1. How to build a metric repository by Kylin to share among data teams (DA,
> DS, AI), is that the usage of measure in Kylin?
> 2. How to use Kylin for the Customer segmentation of Marketing dept?
>
>
> On Wed, Nov 22, 2023 at 2:10 PM Xiaoxiang Yu  wrote:
>
> > Before you try again, you can use spark-sql/spark-shell to check if the
> > data is loaded
> > into your table successfully (or if your data is copied to the right
> > place).
> > Following is how to start a spark-sql/spark-shell in a container.
> >
> > export HADOOP_CONF_DIR=/opt/hadoop-3.2.1/etc/hadoop
> >
> > cd /home/kylin/apache-kylin-5.0.0-beta-bin/spark
> >
> > bin/spark-shell --executor-cores 1 --num-executors 1 --master yarn
> >
> >
> > The result of spark-sql/spark-shell should be the same as your
> > saw in Kylin insight page. If there are different results for the same
> > query,
> > which should not happen, please let me know.
> >
> > Hope you can fix your problem soon.
> >
> > 
> > With warm regard
> > Xiaoxiang Yu
> >
> >
> >
> > On Wed, Nov 22, 2023 at 11:59 AM Nam Đỗ Duy 
> > wrote:
> >
> > > Thank you Xiaoxiang, I tried in my place and it worked for the ssb
> > database
> > > but it didn't work for my own database.
> > >
> > > It only works if I restart kylin so I guess there might be some
> > > configuration miss in my end.
> > >
> > > Thank you very much anyway and will update next time.
> > >
> > > Have a good day.
> > >
> > > On Fri, Nov 17, 2023 at 5:34 PM Xiaoxiang Yu  wrote:
> > >
> > > > I did an easy test to verify if kylin has any bugs for the push down
> > > > function. And the push
> > > > down function works as expected without any mistakes. So I'm 99%
> > certain
> > > > that
> > > > your step "I loaded the incremental data into Hive already" does not
> > > work.
> > > >
> > > > Here are my steps(you can reproduce in a fresh Kylin5 docker
> container
> > in
> > > > one minute) :
> > > >
> > > > 1. Query `select count(*) from SSB.DATES` in project ssb without
> > building
> > > > any index.
> > > > Query result(Answered By: HIVE) is :   2556
> > > >
> > > > 2. Duplicate the file of table `ssb.dates` by following command:
> > > > hadoop fs -cp /user/hive/warehouse/ssb.db/dates/SSB.DATES.csv
> > > > /user/hive/warehouse/ssb.db/dates/SSB.DATES-2.csv
> > > >
> > > > 3. Re-query `select count(*) from SSB.DATES` in project ssb
> > > > Query result(Answered By: HIVE) is :  5112
> > > >
> > > > So, it is clear that the second query incremental data can be found
> by
> > > the
> > > > Kylin query engine.
> > > >
> > > > Finally, to make good use of Kylin in real use cases, good knowledge
> of
> > > > Apache Spark
> > > > and Apache Hadoop is a must-to-have.
> > > >
> > > &

Re: How to reflect last hour data into Hive and Kylin Insights query window

2023-11-21 Thread Xiaoxiang Yu
Before you try again, you can use spark-sql/spark-shell to check whether
the data was loaded into your table successfully (or whether your data was
copied to the right place).
The following shows how to start spark-sql/spark-shell in a container.

export HADOOP_CONF_DIR=/opt/hadoop-3.2.1/etc/hadoop

cd /home/kylin/apache-kylin-5.0.0-beta-bin/spark

bin/spark-shell --executor-cores 1 --num-executors 1 --master yarn


The result from spark-sql/spark-shell should be the same as what you saw
on the Kylin Insight page. If the same query returns different results,
which should not happen, please let me know.

Hope you can fix your problem soon.


With warm regard
Xiaoxiang Yu



On Wed, Nov 22, 2023 at 11:59 AM Nam Đỗ Duy  wrote:

> Thank you Xiaoxiang, I tried in my place and it worked for the ssb database
> but it didn't work for my own database.
>
> It only works if I restart kylin so I guess there might be some
> configuration miss in my end.
>
> Thank you very much anyway and will update next time.
>
> Have a good day.
>
> On Fri, Nov 17, 2023 at 5:34 PM Xiaoxiang Yu  wrote:
>
> > I did an easy test to verify if kylin has any bugs for the push down
> > function. And the push
> > down function works as expected without any mistakes. So I'm 99% certain
> > that
> > your step "I loaded the incremental data into Hive already" does not
> work.
> >
> > Here are my steps(you can reproduce in a fresh Kylin5 docker container in
> > one minute) :
> >
> > 1. Query `select count(*) from SSB.DATES` in project ssb without building
> > any index.
> > Query result(Answered By: HIVE) is :   2556
> >
> > 2. Duplicate the file of table `ssb.dates` by following command:
> > hadoop fs -cp /user/hive/warehouse/ssb.db/dates/SSB.DATES.csv
> > /user/hive/warehouse/ssb.db/dates/SSB.DATES-2.csv
> >
> > 3. Re-query `select count(*) from SSB.DATES` in project ssb
> > Query result(Answered By: HIVE) is :  5112
> >
> > So, it is clear that the second query incremental data can be found by
> the
> > Kylin query engine.
> >
> > Finally, to make good use of Kylin in real use cases, good knowledge of
> > Apache Spark
> > and Apache Hadoop is a must-to-have.
> >
> > 
> > With warm regard
> > Xiaoxiang Yu
> >
> >
> >
> > On Fri, Nov 17, 2023 at 5:52 PM Nam Đỗ Duy 
> wrote:
> >
> > > Have a nice weekend Xiaoxiang, and thank you for helping me to become a
> > > kylin's fan
> > >
> > > You are right I am not familiar with Kylin enough and have little
> > > background of the hadoop system so I will double check here carefully
> > > before
> > > future questions. However I did understand the following mechanism
> > > in quotes.
> > >
> > > quoted
> > >
> > > If incremental data is not loaded into Kylin, Kylin can still answer
> such
> > > queries by
> > > reading the original hive table, but the query is not accelerated.
> > >
> > > If incremental data is loaded into Kylin, Kylin can answer queries by
> > > reading the special Index/Cuboid files, and the query will be
> > accelerated.
> > >
> > > end
> > >
> > > I explain my previous question that was as follows:
> > >
> > > 1. I turned off this configuration kylin.query.cache-enabled (set =
> > false)
> > > 2. Restart Kylin
> > > 3. I loaded the incremental data into Hive already
> > > 4. Turn on Pushdown option to query Hive not model
> > > 5. In Kylin Insights window, I still cannot get the incremental data
> > (which
> > > has been in Hive already)
> > >
> > > That was the reason why I asked you: can I get the incremental result
> by
> > > above 5 steps (without model and index) or do I need to create model
> and
> > > index and segment then I can  get the incremental result by creating a
> > new
> > > segment according to incremental data?
> > >
> > > Hope you get my point or I will explain more
> > >
> > > Thank you very much again
> > >
> > >
> > > On Fri, 17 Nov 2023 at 16:00 Xiaoxiang Yu  wrote:
> > >
> > > > Unfortunately, I guess you are not asking good questions.
> > > > If the answer of a question can be searched on the Internet,
> > > > it is not recommended to ask it in the mailing list. I guess you
> > > > didn't know how Kylin works, so you need to search for documents
> &g

Re: How to reflect last hour data into Hive and Kylin Insights query window

2023-11-17 Thread Xiaoxiang Yu
I did a simple test to verify whether Kylin has any bugs in the push-down
function, and the push-down function works as expected. So I'm 99% certain
that your step "I loaded the incremental data into Hive already" did not
work.

Here are my steps (you can reproduce them in a fresh Kylin 5 docker
container in one minute):

1. Query `select count(*) from SSB.DATES` in project ssb without building
any index.
Query result (Answered By: HIVE): 2556

2. Duplicate the file of table `ssb.dates` with the following command:
hadoop fs -cp /user/hive/warehouse/ssb.db/dates/SSB.DATES.csv \
  /user/hive/warehouse/ssb.db/dates/SSB.DATES-2.csv

3. Re-query `select count(*) from SSB.DATES` in project ssb.
Query result (Answered By: HIVE): 5112

So it is clear that, in the second query, the incremental data was found
by the Kylin query engine.
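
The reason the count doubles is that the table is read as "all files in
the table directory", so a new file in that directory is new data. The
following local Python sketch imitates that with a temporary directory
instead of HDFS; it does not touch Hive or Kylin, and the file contents are
invented, only the row count 2556 is taken from my test.

```python
import csv
import shutil
import tempfile
from pathlib import Path
from typing import Tuple


def count_rows(table_dir: Path) -> int:
    """Count rows across every file in the table directory."""
    total = 0
    for data_file in sorted(table_dir.iterdir()):
        with data_file.open(newline="") as handle:
            total += sum(1 for _ in csv.reader(handle))
    return total


def demo(rows: int = 2556) -> Tuple[int, int]:
    """Return the row count before and after duplicating the data file."""
    with tempfile.TemporaryDirectory() as tmp:
        table_dir = Path(tmp) / "dates"
        table_dir.mkdir()
        original = table_dir / "SSB.DATES.csv"
        with original.open("w", newline="") as handle:
            csv.writer(handle).writerows([i, f"day-{i}"] for i in range(rows))
        before = count_rows(table_dir)
        # Local stand-in for the `hadoop fs -cp` step.
        shutil.copy(original, table_dir / "SSB.DATES-2.csv")
        after = count_rows(table_dir)
        return before, after


if __name__ == "__main__":
    print(demo())  # (2556, 5112)
```

If your own load step puts the file somewhere other than the table
directory, the count will not change, which is worth checking first.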

Finally, to make good use of Kylin in real use cases, good knowledge of
Apache Spark and Apache Hadoop is a must-have.


With warm regard
Xiaoxiang Yu



On Fri, Nov 17, 2023 at 5:52 PM Nam Đỗ Duy  wrote:

> Have a nice weekend Xiaoxiang, and thank you for helping me to become a
> kylin's fan
>
> You are right I am not familiar with Kylin enough and have little
> background of the hadoop system so I will double check here carefully
> before
> future questions. However I did understand the following mechanism
> in quotes.
>
> quoted
>
> If incremental data is not loaded into Kylin, Kylin can still answer such
> queries by
> reading the original hive table, but the query is not accelerated.
>
> If incremental data is loaded into Kylin, Kylin can answer queries by
> reading the special Index/Cuboid files, and the query will be accelerated.
>
> end
>
> I explain my previous question that was as follows:
>
> 1. I turned off this configuration kylin.query.cache-enabled (set = false)
> 2. Restart Kylin
> 3. I loaded the incremental data into Hive already
> 4. Turn on Pushdown option to query Hive not model
> 5. In Kylin Insights window, I still cannot get the incremental data (which
> has been in Hive already)
>
> That was the reason why I asked you: can I get the incremental result by
> above 5 steps (without model and index) or do I need to create model and
> index and segment then I can  get the incremental result by creating a new
> segment according to incremental data?
>
> Hope you get my point or I will explain more
>
> Thank you very much again
>
>
> On Fri, 17 Nov 2023 at 16:00 Xiaoxiang Yu  wrote:
>
> > Unfortunately, I guess you are not asking good questions.
> > If the answer of a question can be searched on the Internet,
> > it is not recommended to ask it in the mailing list. I guess you
> > didn't know how Kylin works, so you need to search for documents
> >  or some tutorials.
> >
> > What does 'get the incremental data from Hive into Kylin' means? Kylin
> > fully relies
> > on Apache Spark for execution.
> >
> > If incremental data is not loaded into Kylin, Kylin can still answer such
> > queries by
> > reading the original hive table, but the query is not accelerated.
> >
> > If incremental data is loaded into Kylin, Kylin can answer queries by
> > reading the special Index/Cuboid files, and the query will be
> accelerated.
> >
> >
> > 
> > With warm regard
> > Xiaoxiang Yu
> >
> >
> >
> > On Fri, Nov 17, 2023 at 4:36 PM Nam Đỗ Duy 
> wrote:
> >
> > > Hi Xiaoxiang,
> > >
> > > Do I really need to create a model in order to get the incremental data
> > > from Hive into Kylin?
> > >
> > > Can I query the incremental data of a pure dim/fact table without a
> > model?
> > >
> > > Thank you very much
> > >
> > > On Fri, Nov 17, 2023 at 9:05 AM Xiaoxiang Yu  wrote:
> > >
> > > > I am not really sure. But I think it is the Query cache make your
> query
> > > > result unchanged.
> > > >
> > > >
> > > > The config entry is kylin.query.cache-enabled , is turn on by
> default.
> > > > This doc links is
> > > > https://kylin.apache.org/5.0/docs/configuration/query_cache
> > > >
> > > >
> > > >
> > > >
> > > > --
> > > >
> > > > Best wishes to you !
> > > > From :Xiaoxiang Yu
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > At 2023-11-17 09:48:55, "Nam Đỗ Duy"  wrote:
> > > > >Hello Te

Re: How to reflect last hour data into Hive and Kylin Insights query window

2023-11-17 Thread Xiaoxiang Yu
Unfortunately, I guess you are not asking good questions.
If the answer to a question can be found by searching the Internet,
it is not recommended to ask it on the mailing list. I guess you
do not yet know how Kylin works, so you need to look for documents
or tutorials first.

What does 'get the incremental data from Hive into Kylin' mean? Kylin
fully relies on Apache Spark for execution.

If incremental data is not loaded into Kylin, Kylin can still answer such
queries by
reading the original hive table, but the query is not accelerated.

If incremental data is loaded into Kylin, Kylin can answer queries by
reading the special Index/Cuboid files, and the query will be accelerated.



With warm regard
Xiaoxiang Yu



On Fri, Nov 17, 2023 at 4:36 PM Nam Đỗ Duy  wrote:

> Hi Xiaoxiang,
>
> Do I really need to create a model in order to get the incremental data
> from Hive into Kylin?
>
> Can I query the incremental data of a pure dim/fact table without a model?
>
> Thank you very much
>
> On Fri, Nov 17, 2023 at 9:05 AM Xiaoxiang Yu  wrote:
>
> > I am not really sure. But I think it is the Query cache make your query
> > result unchanged.
> >
> >
> > The config entry is kylin.query.cache-enabled , is turn on by default.
> > This doc links is
> > https://kylin.apache.org/5.0/docs/configuration/query_cache
> >
> >
> >
> >
> > --
> >
> > Best wishes to you !
> > From :Xiaoxiang Yu
> >
> >
> >
> >
> >
> > At 2023-11-17 09:48:55, "Nam Đỗ Duy"  wrote:
> > >Hello Team, hello Xiaoxiang, can you please help me with this urgent
> > >issue...
> > >
> > >(this is public email group so in general I neglect your specific name
> > from
> > >greeting of first email in the threads, but in fact most of time
> Xiaoxiang
> > >actively answers my issues, thank you very much)
> > >
> > >On Thu, Nov 16, 2023 at 2:59 PM Nam Đỗ Duy  wrote:
> > >
> > >> Dear Dev Team, please kindly advise this scenario
> > >>
> > >> 1. I have a fact table and I use Kylin insights window to query it and
> > get
> > >> 5 million rows.
> > >>
> > >> 2. Then I use following command to load X rows (last hour data) from
> > >> parquet into Hive table
> > >>
> > >> LOAD DATA LOCAL INPATH
> > >> '/opt/LastHour/factUserEventDF_2023_11_16.parquet/14' INTO TABLE
> > >> factUserEvent;
> > >>
> > >> 3. Then I open Kylin insights window to query it but it still returned
> > >> previous number (5 million rows) not adding the last hour data of X
> rows
> > >> which I previously loaded from parquet into hive in step 2)
> > >>
> > >> Can you advise the way to make table refresh and updated?
> > >>
> > >> Thank you very much
> > >>
> >
>


Re: How to reflect last hour data into Hive and Kylin Insights query window

2023-11-16 Thread Xiaoxiang Yu
I am not really sure, but I think it is the query cache that keeps your
query result unchanged.


The config entry is kylin.query.cache-enabled, which is turned on by default.
The doc link is https://kylin.apache.org/5.0/docs/configuration/query_cache




--

Best wishes to you ! 
From :Xiaoxiang Yu





At 2023-11-17 09:48:55, "Nam Đỗ Duy"  wrote:
>Hello Team, hello Xiaoxiang, can you please help me with this urgent
>issue...
>
>(this is public email group so in general I neglect your specific name from
>greeting of first email in the threads, but in fact most of time Xiaoxiang
>actively answers my issues, thank you very much)
>
>On Thu, Nov 16, 2023 at 2:59 PM Nam Đỗ Duy  wrote:
>
>> Dear Dev Team, please kindly advise this scenario
>>
>> 1. I have a fact table and I use Kylin insights window to query it and get
>> 5 million rows.
>>
>> 2. Then I use following command to load X rows (last hour data) from
>> parquet into Hive table
>>
>> LOAD DATA LOCAL INPATH
>> '/opt/LastHour/factUserEventDF_2023_11_16.parquet/14' INTO TABLE
>> factUserEvent;
>>
>> 3. Then I open Kylin insights window to query it but it still returned
>> previous number (5 million rows) not adding the last hour data of X rows
>> which I previously loaded from parquet into hive in step 2)
>>
>> Can you advise the way to make table refresh and updated?
>>
>> Thank you very much
>>


Re: Why we choose Kylin

2023-11-13 Thread Xiaoxiang Yu
I have to say that Kylin is currently focused on OLAP solutions, with very
few use cases in data science or AI.
It also requires a good understanding of Hadoop/Spark if you want to
optimize query performance.

For your question, I found a video of a Kylin Meetup in Shanghai (in
English). The speaker is from eBay, which created the Kylin project, and I
think she understands this better than I do.

The second section of the video summarizes the use cases of Kylin at eBay,
starting at 09:48.
The video also introduces the history of the Kylin project.

Here is the link:

https://www.bilibili.com/video/BV17h41127bV/?spm_id_from=333.337.search-card.all.click_source=233a70cff82cc278ec07b1660fdbc7d2


With warm regard
Xiaoxiang Yu



On Tue, Nov 14, 2023 at 1:42 PM Nam Đỗ Duy  wrote:

> Thank you Xiaoxiang for your reply
>
> This may be final question for our Board of Management to decide to use
> Kylin so please kindly answer:
>
> During your work with those big firms (listed in Who is using Kylin), what
> do you find those big firms are utilizing kylin in their work in terms of
> (for example):
>
> - business area (like Marketing, research, sales support, risk management
> etc)
> - functional teams (data analyst, data science, AI project)
> - scope and scale of the project
> -…
>
> Thank you very much
>
> On Tue, 14 Nov 2023 at 09:58 Xiaoxiang Yu  wrote:
>
> > Yes, we have meetups in different cities, it can be searched. Besides, we
> > communicate by email or zoom.
> > 
> > With warm regard
> > Xiaoxiang Yu
> >
> >
> >
> > On Tue, Nov 14, 2023 at 12:27 AM Nam Đỗ Duy 
> > wrote:
> >
> > > Hi Xiaoxiang,
> > >
> > > Please kindly let us know:
> > >
> > > My boss is asking that: how do you know the companies in “Who is using
> > > Kylin” section are actually using Kylin…do they inform you that fact or
> > do
> > > you have any agreement with them?
> > >
> > > Thank you very much
> > >
> > > On Mon, 13 Nov 2023 at 09:34 Xiaoxiang Yu  wrote:
> > >
> > > > 1. Which companies are using kylin now
> > > > You can visit the home page https://kylin.apache.org, and go to 'Who
> > is
> > > > using Kylin?' part,
> > > > you will be finding logos of these companies.
> > > >
> > > > 2. How do they use kylin’s capabilities in AI/ML projects?
> > > > Currently I am focusing on Kylin itself. I did not have enough
> > knowledge
> > > in
> > > > AI/ML.
> > > > Here is what I know.
> > > > As far as I know, Kylin used to provide a Python library, so Kylin
> can
> > be
> > > > integrated
> > > > with some Python ML tools(such as
> > > > https://docs.byzer.org/#/byzer-lang/en-us/
> > > > ),
> > > > but I don't know if it still works at the moment. I think it needs
> some
> > > > test and modification
> > > > to make these work with Kylin 5.
> > > >
> > > >
> > > >
> > > > 
> > > > With warm regard
> > > > Xiaoxiang Yu
> > > >
> > > >
> > > >
> > > > On Mon, Nov 13, 2023 at 10:00 AM Nam Đỗ Duy 
> > > > wrote:
> > > >
> > > > > Hi Xiaoxiang
> > > > >
> > > > > Regarding the reason why we should choose kylin please provide real
> > > > > use-cases to help me answer our boss’s question:
> > > > >
> > > > > 1. Which companies are using kylin now
> > > > > 2. How do they use kylin’s capabilities in AI/ML projects
> > > > >
> > > > > Thank you very much
> > > > >
> > > > > On Mon, 6 Nov 2023 at 13:42 Xiaoxiang Yu  wrote:
> > > > >
> > > > > > Here are some blogs which can help you to introduce advantages of
> > > > Kylin .
> > > > > >
> > > > > > - https://kylin.apache.org/blog/2022/01/12/The-Future-Of-Kylin/
> > > > > > -
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KYLIN/Why+did+Youzan+choose+Kylin+4
> > > > > > - https://kylin.apache.org/blog/
> > > > > >
> > > > > >
> > > > > > 
> > > > > > With warm regard
> > > > > > Xiaoxiang Yu
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Fri, Nov 3, 2023 at 4:12 PM Nam Đỗ Duy  >
> > > > > wrote:
> > > > > >
> > > > > > > Dear Sir/Madam
> > > > > > >
> > > > > > > I am persuading my company to use Kylin as OLAP platform.
> > > > > > >
> > > > > > > Could you please give some fact or some document to my
> > presentation
> > > > > about
> > > > > > > the reason why we should choose Kylin comparing with other OLAP
> > > > > platform.
> > > > > > >
> > > > > > > Thank you very much
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: Default Hive user/password of docker Kylin 5.0

2023-11-13 Thread Xiaoxiang Yu
https://github.com/apache/kylin/blob/8de5c7a7121dc37729a12ee231041f8d89d1494c/dev-support/release-manager/standalone-docker/all_in_one/entrypoint.sh#L83C48-L83C48


With warm regard
Xiaoxiang Yu



On Mon, Nov 13, 2023 at 7:18 PM Nam Đỗ Duy  wrote:

> Dear Team,
>
> I am using scala to connect to Hive after install Kylin docker 5.0, please
> kindly tell me the default HIVE user/password to fill to this code:
>
> Thank you very much and best regards
>
> // JDBC URL to connect to Hive
> val jdbcURL = "jdbc:hive2://localhost:1/your_database"
>
> // Hive connection properties
> val connectionProperties = new java.util.Properties()
> connectionProperties.setProperty("user", "your_username")
> connectionProperties.setProperty("password", "your_password")
>
> // JDBC driver name and database URL
> val driverName = "org.apache.hive.jdbc.HiveDriver"
>

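A minimal sketch of how the quoted snippet might be completed. It assumes
HiveServer2 listens on its usual default port 10000 and that the hive-jdbc
driver is on the classpath. The object and function names are made up for
illustration, and the user name and password are placeholders, not the real
defaults of the docker image; those are set in the entrypoint.sh linked above.

```scala
import java.sql.{Connection, DriverManager}
import java.util.Properties

object HiveJdbcSketch {
  // Builds a HiveServer2 JDBC URL. Port 10000 is the usual HiveServer2 default (an assumption here).
  def hiveJdbcUrl(host: String, database: String, port: Int = 10000): String =
    s"jdbc:hive2://$host:$port/$database"

  // Collects the credentials into the Properties object that DriverManager expects.
  def connectionProperties(user: String, password: String): Properties = {
    val props = new Properties()
    props.setProperty("user", user)
    props.setProperty("password", password)
    props
  }

  // Opens a connection; this needs the hive-jdbc driver on the classpath.
  def connect(url: String, props: Properties): Connection = {
    Class.forName("org.apache.hive.jdbc.HiveDriver")
    DriverManager.getConnection(url, props)
  }

  def main(args: Array[String]): Unit = {
    val url = hiveJdbcUrl("localhost", "default")
    // Placeholders: replace with the values found in entrypoint.sh.
    val props = connectionProperties("your_username", "your_password")
    val conn = connect(url, props)
    try {
      val rs = conn.createStatement().executeQuery("SHOW TABLES")
      while (rs.next()) println(rs.getString(1))
    } finally conn.close()
  }
}
```

Keeping the URL and property construction in small functions makes it easy to
check them without a running Hive instance.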

Re: Why we choose Kylin

2023-11-13 Thread Xiaoxiang Yu
Yes, we hold meetups in different cities; you can find them by searching
online. Besides that, we communicate by email or Zoom.

With warm regard
Xiaoxiang Yu



On Tue, Nov 14, 2023 at 12:27 AM Nam Đỗ Duy  wrote:

> Hi Xiaoxiang,
>
> Please kindly let us know:
>
> My boss is asking that: how do you know the companies in “Who is using
> Kylin” section are actually using Kylin…do they inform you that fact or do
> you have any agreement with them?
>
> Thank you very much
>
> On Mon, 13 Nov 2023 at 09:34 Xiaoxiang Yu  wrote:
>
> > 1. Which companies are using kylin now
> > You can visit the home page https://kylin.apache.org, and go to 'Who is
> > using Kylin?' part,
> > you will be finding logos of these companies.
> >
> > 2. How do they use kylin’s capabilities in AI/ML projects?
> > Currently I am focusing on Kylin itself. I did not have enough knowledge
> in
> > AI/ML.
> > Here is what I know.
> > As far as I know, Kylin used to provide a Python library, so Kylin can be
> > integrated
> > with some Python ML tools(such as
> > https://docs.byzer.org/#/byzer-lang/en-us/
> > ),
> > but I don't know if it still works at the moment. I think it needs some
> > test and modification
> > to make these work with Kylin 5.
> >
> >
> >
> > 
> > With warm regard
> > Xiaoxiang Yu
> >
> >
> >
> > On Mon, Nov 13, 2023 at 10:00 AM Nam Đỗ Duy 
> > wrote:
> >
> > > Hi Xiaoxiang
> > >
> > > Regarding the reason why we should choose kylin please provide real
> > > use-cases to help me answer our boss’s question:
> > >
> > > 1. Which companies are using kylin now
> > > 2. How do they use kylin’s capabilities in AI/ML projects
> > >
> > > Thank you very much
> > >
> > > On Mon, 6 Nov 2023 at 13:42 Xiaoxiang Yu  wrote:
> > >
> > > > Here are some blogs which can help you to introduce advantages of
> > Kylin .
> > > >
> > > > - https://kylin.apache.org/blog/2022/01/12/The-Future-Of-Kylin/
> > > > -
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KYLIN/Why+did+Youzan+choose+Kylin+4
> > > > - https://kylin.apache.org/blog/
> > > >
> > > >
> > > > 
> > > > With warm regard
> > > > Xiaoxiang Yu
> > > >
> > > >
> > > >
> > > > On Fri, Nov 3, 2023 at 4:12 PM Nam Đỗ Duy 
> > > wrote:
> > > >
> > > > > Dear Sir/Madam
> > > > >
> > > > > I am persuading my company to use Kylin as OLAP platform.
> > > > >
> > > > > Could you please give some fact or some document to my presentation
> > > about
> > > > > the reason why we should choose Kylin comparing with other OLAP
> > > platform.
> > > > >
> > > > > Thank you very much
> > > > >
> > > >
> > >
> >
>


Re: MDX in kylin 5.0

2023-11-12 Thread Xiaoxiang Yu
I just want to point out that, by design (of the ASF), the dev@ mailing
list is for developers to hold discussions and votes, not for user-level
questions. Besides, I do think the questions you asked here are valuable
to other Kylin users. A user who subscribed to user@ but not to dev@
cannot find our discussion, which might be useful to him/her.

I am also willing to help new users of Kylin 5, and I am interested in
collecting feedback from the community. I hope your company will choose
Kylin 5 as its OLAP platform.


With warm regard
Xiaoxiang Yu



On Mon, Nov 13, 2023 at 10:06 AM Nam Đỗ Duy  wrote:

> Sure thank you Xiaoxiang. I am learning everything about the strengths of
> kylin in order to persuade companies to use kylin so forgive me if the
> question is not clear
>
> On Mon, 13 Nov 2023 at 08:56 Xiaoxiang Yu  wrote:
>
> > As far as I know, there is no such plan in 2024.
> > I think such a question is not suitable for dev mailing list, could you
> > use mailing list instead?
> >
> > --------
> > With warm regard
> > Xiaoxiang Yu
> >
> >
> >
> > On Mon, Nov 13, 2023 at 8:34 AM Nam Đỗ Duy 
> wrote:
> >
> > > Dear Sir/Madam
> > >
> > > I found documents on MDX on version 4.x and would like you to let me
> know
> > > whether MDX is still available in version 5.0?
> > >
> > > Thank you very much
> > >
> >
>


Re: Why we choose Kylin

2023-11-12 Thread Xiaoxiang Yu
1. Which companies are using Kylin now?
You can visit the home page, https://kylin.apache.org, and go to the
'Who is using Kylin?' section, where you will find the logos of these
companies.

2. How do they use Kylin’s capabilities in AI/ML projects?
Currently I am focusing on Kylin itself, and I do not have enough
knowledge of AI/ML. Here is what I know.
As far as I know, Kylin used to provide a Python library, so Kylin can be
integrated with some Python ML tools (such as
https://docs.byzer.org/#/byzer-lang/en-us/ ), but I don't know if it still
works at the moment. I think it needs some testing and modification to
work with Kylin 5.




With warm regard
Xiaoxiang Yu



On Mon, Nov 13, 2023 at 10:00 AM Nam Đỗ Duy  wrote:

> Hi Xiaoxiang
>
> Regarding the reason why we should choose kylin please provide real
> use-cases to help me answer our boss’s question:
>
> 1. Which companies are using kylin now
> 2. How do they use kylin’s capabilities in AI/ML projects
>
> Thank you very much
>
> On Mon, 6 Nov 2023 at 13:42 Xiaoxiang Yu  wrote:
>
> > Here are some blogs which can help you to introduce advantages of Kylin .
> >
> > - https://kylin.apache.org/blog/2022/01/12/The-Future-Of-Kylin/
> > -
> >
> >
> https://cwiki.apache.org/confluence/display/KYLIN/Why+did+Youzan+choose+Kylin+4
> > - https://kylin.apache.org/blog/
> >
> >
> > 
> > With warm regard
> > Xiaoxiang Yu
> >
> >
> >
> > On Fri, Nov 3, 2023 at 4:12 PM Nam Đỗ Duy 
> wrote:
> >
> > > Dear Sir/Madam
> > >
> > > I am persuading my company to use Kylin as OLAP platform.
> > >
> > > Could you please give some fact or some document to my presentation
> about
> > > the reason why we should choose Kylin comparing with other OLAP
> platform.
> > >
> > > Thank you very much
> > >
> >
>


Re: MDX in kylin 5.0

2023-11-12 Thread Xiaoxiang Yu
As far as I know, there is no such plan for 2024.
I think such a question is not suitable for the dev mailing list; could you
use the user@ mailing list instead?


With warm regard
Xiaoxiang Yu



On Mon, Nov 13, 2023 at 8:34 AM Nam Đỗ Duy  wrote:

> Dear Sir/Madam
>
> I found documents on MDX on version 4.x and would like you to let me know
> whether MDX is still available in version 5.0?
>
> Thank you very much
>


Re: Integrate Power BI report server with kylin

2023-11-09 Thread Xiaoxiang Yu
It looks like the Model View feature (the switch is
'kylin.query.auto-model-view-enabled')
can help you query a Model directly, but I have not verified that it works
as expected.
Maybe you can try it and tell us if it works for you.

See its explanation at https://kylin.apache.org/5.0/docs/configuration/ .



With warm regard
Xiaoxiang Yu



On Thu, Nov 9, 2023 at 6:05 PM Nam Đỗ Duy  wrote:

> Thank you Xiaoxiang, I change the method to PowerBI connector (mez file)
> and it works with mez. I will try workarount with ODBC later.
>
> I have another question: *From power BI, can I access the cube through
> ODBC/mez or access kylin by custom query because after browsing, in power
> BI, I see only the HIVE dim/fact table only?*
>
> Thank you very much
>
> On Thu, Nov 9, 2023 at 3:03 PM Xiaoxiang Yu  wrote:
>
> > It is possible that you can ask your question on PowerBI's forum? It is
> > looks like a common issue for PowerBI.
> > Here is a link I found :
> >
> >
> https://community.fabric.microsoft.com/t5/Issues/ODBC-ERROR-Driver-could-not-be-loaded-due-to-System-Error-126/idc-p/707282/highlight/true#M44488
> >  .
> > He said 'Reinstalled the Power Bi to 32 bit and use 32 bit ODBC.  All
> > works. '
> >
> > At the same time, maybe you can retry in a Windows vm,
> > I guess you can make it correct in a new and clean env (or latest OS
> > version).
> > 
> > With warm regard
> > Xiaoxiang Yu
> >
> >
> >
> > On Thu, Nov 9, 2023 at 3:30 PM Nam Đỗ Duy 
> wrote:
> >
> > > Thank you Xiaoxiang, both the link you sent me and chatgpt told me that
> > > this error comes from Visual C++ Redistributable for Visual Studio 2015
> > but
> > > I install/reinstall this component many times already still having the
> > same
> > > problem
> > >
> > > Take a rest now and before retrying with this ODBC
> > >
> > > On Wed, Nov 8, 2023 at 4:46 PM Xiaoxiang Yu  wrote:
> > >
> > > > I guess your situation maybe similar to this link:
> > > >  -
> > > >
> > >
> >
> https://dba.stackexchange.com/questions/238135/odbc-error-for-oracle-client-12-2-0-1
> > > >
> > > > Here are some librays that I installed on my PC.
> > > >
> > > >  (My PC is a single language version, its display language can not
> > change
> > > > to English.)
> > > > [image: check-visual-c++-windows.jpeg]
> > > >
> > > >
> > > > 
> > > > With warm regard
> > > > Xiaoxiang Yu
> > > >
> > > >
> > > >
> > > > On Wed, Nov 8, 2023 at 4:00 PM Nam Đỗ Duy 
> > > wrote:
> > > >
> > > >> Hi Xiaoxiang,
> > > >>
> > > >> I got this error when configuring the ODBC, it said the dll file was
> > not
> > > >> found but in fact it is there in the location. Please kindly
> advise...
> > > >>
> > > >> Thank you very much
> > > >>
> > > >> [image: image.png]
> > > >>
> > > >> On Wed, Nov 8, 2023 at 2:25 PM Xiaoxiang Yu 
> wrote:
> > > >>
> > > >>> Hi,
> > > >>> This morning, I verified the Kyligence ODBC Driver's
> connectivity
> > > >>> with Kylin 5.0.0-beta, it works well on my Windows PC, and I can
> > > preview
> > > >>> the table data in the Power BI application correctly.
> > > >>>
> > > >>> Here is my PC and software information:
> > > >>> - PC : A laptop with 4 cores CPU and 16GB RAM
> > > >>> - OS : Windows 11 Home China Version, 22H2
> > > >>> - Kylin : Kylin 5.0.0-beta in docker image(downloaded from
> > > dockerhub)
> > > >>> - Docker Desktop : latest(4.25.0)
> > > >>> - WSL : latest/ubuntu
> > > >>> - Power BI Desktop : 2.122.1066.0 64-bit(October 2023) , the
> > latest
> > > >>> version which download from Microsoft Store
> > > >>> - ODBC Driver: 3.1.12, visit
> > > >>> https://download.kyligence.io/#/download ,
> > > >>> find the link of  'Kyligence ODBC Driver for Windows 64 bit', the
> > > >>> downloaded
> > > >>> file name should similar to
> > > 'Kyligence.ODBC.3.1.12.1008.Windows.x64.exe'
> > > >>>
> > > >>>
> >

Re: Integrate Power BI report server with kylin

2023-11-09 Thread Xiaoxiang Yu
Could you ask your question on the Power BI forum? It looks like a common
issue for Power BI.
Here is a link I found:
https://community.fabric.microsoft.com/t5/Issues/ODBC-ERROR-Driver-could-not-be-loaded-due-to-System-Error-126/idc-p/707282/highlight/true#M44488
 .
The poster said 'Reinstalled the Power Bi to 32 bit and use 32 bit ODBC.  All
works. '

At the same time, maybe you can retry in a Windows VM.
I guess it will work in a new, clean environment (or on the latest OS
version).

With warm regard
Xiaoxiang Yu



On Thu, Nov 9, 2023 at 3:30 PM Nam Đỗ Duy  wrote:

> Thank you Xiaoxiang, both the link you sent me and chatgpt told me that
> this error comes from Visual C++ Redistributable for Visual Studio 2015 but
> I install/reinstall this component many times already still having the same
> problem
>
> Take a rest now and before retrying with this ODBC
>
> On Wed, Nov 8, 2023 at 4:46 PM Xiaoxiang Yu  wrote:
>
> > I guess your situation maybe similar to this link:
> >  -
> >
> https://dba.stackexchange.com/questions/238135/odbc-error-for-oracle-client-12-2-0-1
> >
> > Here are some librays that I installed on my PC.
> >
> >  (My PC is a single language version, its display language can not change
> > to English.)
> > [image: check-visual-c++-windows.jpeg]
> >
> >
> > 
> > With warm regard
> > Xiaoxiang Yu
> >
> >
> >
> > On Wed, Nov 8, 2023 at 4:00 PM Nam Đỗ Duy 
> wrote:
> >
> >> Hi Xiaoxiang,
> >>
> >> I got this error when configuring the ODBC, it said the dll file was not
> >> found but in fact it is there in the location. Please kindly advise...
> >>
> >> Thank you very much
> >>
> >> [image: image.png]
> >>
> >> On Wed, Nov 8, 2023 at 2:25 PM Xiaoxiang Yu  wrote:
> >>
> >>> Hi,
> >>> This morning, I verified the Kyligence ODBC Driver's connectivity
> >>> with Kylin 5.0.0-beta, it works well on my Windows PC, and I can
> preview
> >>> the table data in the Power BI application correctly.
> >>>
> >>> Here is my PC and software information:
> >>> - PC : A laptop with 4 cores CPU and 16GB RAM
> >>> - OS : Windows 11 Home China Version, 22H2
> >>> - Kylin : Kylin 5.0.0-beta in docker image(downloaded from
> dockerhub)
> >>> - Docker Desktop : latest(4.25.0)
> >>> - WSL : latest/ubuntu
> >>> - Power BI Desktop : 2.122.1066.0 64-bit(October 2023) , the latest
> >>> version which download from Microsoft Store
> >>> - ODBC Driver: 3.1.12, visit
> >>> https://download.kyligence.io/#/download ,
> >>> find the link of  'Kyligence ODBC Driver for Windows 64 bit', the
> >>> downloaded
> >>> file name should similar to
> 'Kyligence.ODBC.3.1.12.1008.Windows.x64.exe'
> >>>
> >>>
> >>> Here is how to prepare a Kylin Instance in Windows:
> >>> 1. Install Docker Desktop and WSL, refer
> >>> https://learn.microsoft.com/zh-cn/windows/wsl/tutorials/wsl-containers
> .
> >>> 2. Execute command(
> >>> https://hub.docker.com/r/apachekylin/apache-kylin-standalone) in WSL
> >>> shell(default is Ubuntu)
> >>> to start Kylin Instance.
> >>> 3. Build a segment of Model 'sample_ssb' and make sure the job
> >>> succeeds.
> >>> 4. Install Power BI Desktop(it is free to use)
> >>> 5. Install and configure Kyligence ODBC Driver, and test the
> >>> connectivity
> >>> 6. 'Get Data' from Power BI Desktop, and preview table rows
> >>>
> >>>
> >>> Here is my configuration of 'ODBC Datasource':
> >>> - Host : 127.0.0.1
> >>> - Port :  7070
> >>> - Username : ADMIN
> >>> - Password : KYLIN
> >>> - Project : learn_kylin
> >>>
> >>> Here is my suggestion:
> >>> Unluckily, I didn't have the same software as yours. But I think
> you
> >>> should
> >>> check if you configured it correctly.
> >>> If you cannot make it correct, please go to 'Kyligence ODBC Driver
> >>> DSN
> >>> Setup',
> >>> click the 'Logging Options', and set proper Log Level and Log Path.
> Then
> >>> you
> >>>  should retry and check if log files show any helpful message.
> >>>
> >>> Here are some of my screenshots:
> 

Re: Integrate Power BI report server with kylin

2023-11-08 Thread Xiaoxiang Yu
I guess your situation may be similar to the one in this link:
 -
https://dba.stackexchange.com/questions/238135/odbc-error-for-oracle-client-12-2-0-1

Here are some libraries that I installed on my PC.

 (My PC runs a single-language edition of Windows; its display language
cannot be changed to English.)
[image: check-visual-c++-windows.jpeg]



With warm regard
Xiaoxiang Yu



On Wed, Nov 8, 2023 at 4:00 PM Nam Đỗ Duy  wrote:

> Hi Xiaoxiang,
>
> I got this error when configuring the ODBC, it said the dll file was not
> found but in fact it is there in the location. Please kindly advise...
>
> Thank you very much
>
> [image: image.png]
>
> On Wed, Nov 8, 2023 at 2:25 PM Xiaoxiang Yu  wrote:
>
>> Hi,
>> This morning, I verified the Kyligence ODBC Driver's connectivity
>> with Kylin 5.0.0-beta, it works well on my Windows PC, and I can preview
>> the table data in the Power BI application correctly.
>>
>> Here is my PC and software information:
>> - PC : A laptop with 4 cores CPU and 16GB RAM
>> - OS : Windows 11 Home China Version, 22H2
>> - Kylin : Kylin 5.0.0-beta in docker image(downloaded from dockerhub)
>> - Docker Desktop : latest(4.25.0)
>> - WSL : latest/ubuntu
>> - Power BI Desktop : 2.122.1066.0 64-bit(October 2023) , the latest
>> version which download from Microsoft Store
>> - ODBC Driver: 3.1.12, visit https://download.kyligence.io/#/download
>> ,
>> find the link of  'Kyligence ODBC Driver for Windows 64 bit', the
>> downloaded
>> file name should similar to 'Kyligence.ODBC.3.1.12.1008.Windows.x64.exe'
>>
>>
>> Here is how to prepare a Kylin Instance in Windows:
>> 1. Install Docker Desktop and WSL, refer
>> https://learn.microsoft.com/zh-cn/windows/wsl/tutorials/wsl-containers .
>> 2. Execute command(
>> https://hub.docker.com/r/apachekylin/apache-kylin-standalone) in WSL
>> shell(default is Ubuntu)
>> to start Kylin Instance.
>> 3. Build a segment of Model 'sample_ssb' and make sure the job
>> succeeds.
>> 4. Install Power BI Desktop(it is free to use)
>> 5. Install and configure Kyligence ODBC Driver, and test the
>> connectivity
>> 6. 'Get Data' from Power BI Desktop, and preview table rows
>>
>>
>> Here is my configuration of 'ODBC Datasource':
>> - Host : 127.0.0.1
>> - Port :  7070
>> - Username : ADMIN
>> - Password : KYLIN
>> - Project : learn_kylin
>>
>> Here is my suggestion:
>> Unluckily, I didn't have the same software as yours. But I think you
>> should
>> check if you configured it correctly.
>> If you cannot make it correct, please go to 'Kyligence ODBC Driver DSN
>> Setup',
>> click the 'Logging Options', and set proper Log Level and Log Path. Then
>> you
>>  should retry and check if log files show any helpful message.
>>
>> Here are some of my screenshots:
>> 1. Step 5, ODBC DSN Configuration,
>>
>> https://kylin.apache.org/images/Kylin-ODBC-DSN/odbc-kylin5-powerbi-desktop-windows-1.jpeg
>> 2. Step 6, Power BI Desktop,
>>
>> https://kylin.apache.org/images/Kylin-ODBC-DSN/odbc-kylin5-powerbi-desktop-windows-2.jpeg
>>
>> Hope this reply will help you fix your problem.
>>
>> 
>> With warm regard
>> Xiaoxiang Yu
>>
>>
>>
>> On Tue, Nov 7, 2023 at 7:18 PM Nam Đỗ Duy  wrote:
>>
>> > Hi Xiaoxiang, I downloaded that file but the error is still the same,
>> > please kindly investigate and advise me.
>> >
>> > Thank you very much
>> >
>> > On Tue, Nov 7, 2023 at 5:15 PM Xiaoxiang Yu  wrote:
>> >
>> > > I need some time to try to reproduce and find the cause, but before
>> that,
>> > > could you re-download another driver from
>> > > https://download.kyligence.io/#/download have
>> > > another try ?
>> > > 
>> > > With warm regard
>> > > Xiaoxiang Yu
>> > >
>> > >
>> > >
>> > > On Tue, Nov 7, 2023 at 5:58 PM Nam Đỗ Duy 
>> > wrote:
>> > >
>> > >> Thank you Xiaoxiang, please see the attached photo and advise me the
>> > >> following:
>> > >>
>> > >> 1. I setup the file of x64, is it correct?
>> > >> 2. I run docker kylin 5.0 on Windows 10 Enterprise, mapping port of
>> > >> docker to 7070:7070 but it showed error as attached file. please
>> advise
>> > t

Re: Integrate Power BI report server with kylin

2023-11-07 Thread Xiaoxiang Yu
Hi,
This morning, I verified the Kyligence ODBC Driver's connectivity
with Kylin 5.0.0-beta. It works well on my Windows PC, and I can preview
the table data in the Power BI application correctly.

Here is my PC and software information:
- PC : A laptop with a 4-core CPU and 16GB RAM
- OS : Windows 11 Home China Version, 22H2
- Kylin : Kylin 5.0.0-beta in a docker image (downloaded from Docker Hub)
- Docker Desktop : latest (4.25.0)
- WSL : latest/ubuntu
- Power BI Desktop : 2.122.1066.0 64-bit (October 2023), the latest
version, downloaded from the Microsoft Store
- ODBC Driver: 3.1.12; visit https://download.kyligence.io/#/download ,
find the link for 'Kyligence ODBC Driver for Windows 64 bit'; the
downloaded file name should be similar to
'Kyligence.ODBC.3.1.12.1008.Windows.x64.exe'


Here is how to prepare a Kylin instance on Windows:
1. Install Docker Desktop and WSL; refer to
https://learn.microsoft.com/zh-cn/windows/wsl/tutorials/wsl-containers .
2. Execute the command from
https://hub.docker.com/r/apachekylin/apache-kylin-standalone in a WSL
shell (the default is Ubuntu) to start a Kylin instance.
3. Build a segment of the Model 'sample_ssb' and make sure the job succeeds.
4. Install Power BI Desktop (it is free to use).
5. Install and configure the Kyligence ODBC Driver, and test the
connectivity.
6. 'Get Data' from Power BI Desktop, and preview the table rows.


Here is my configuration of 'ODBC Datasource':
- Host : 127.0.0.1
- Port :  7070
- Username : ADMIN
- Password : KYLIN
- Project : learn_kylin
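
Before configuring the DSN, it may help to confirm that the Kylin port
published by Docker is reachable from Windows. The sketch below is only a
plain TCP check against the host and port listed above; the object and
function names are made up for illustration, it assumes Scala 2.13 or newer
(for scala.util.Using), and a successful check does not prove that the ODBC
driver itself is configured correctly.

```scala
import java.net.{InetSocketAddress, Socket}
import scala.util.Using

object KylinPortCheck {
  // Returns true if a TCP connection to host:port can be opened within timeoutMs.
  // Any failure (connection refused, timeout, unknown host) is reported as false.
  def isReachable(host: String, port: Int, timeoutMs: Int = 2000): Boolean =
    Using(new Socket()) { socket =>
      socket.connect(new InetSocketAddress(host, port), timeoutMs)
      true
    }.getOrElse(false)

  def main(args: Array[String]): Unit = {
    val ok = isReachable("127.0.0.1", 7070)
    println(
      if (ok) "Port 7070 is reachable"
      else "Cannot reach 127.0.0.1:7070; check the docker port mapping"
    )
  }
}
```

If the check fails, the port mapping of the container is the first thing to
look at, before the driver settings.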

Here is my suggestion:
Unfortunately, I don't have the same software as you. But I think you
should check whether you configured it correctly.
If you still cannot make it work, please go to 'Kyligence ODBC Driver DSN
Setup', click 'Logging Options', and set a proper Log Level and Log Path.
Then you should retry and check whether the log files show any helpful
message.

Here are some of my screenshots:
1. Step 5, ODBC DSN Configuration,
https://kylin.apache.org/images/Kylin-ODBC-DSN/odbc-kylin5-powerbi-desktop-windows-1.jpeg
2. Step 6, Power BI Desktop,
https://kylin.apache.org/images/Kylin-ODBC-DSN/odbc-kylin5-powerbi-desktop-windows-2.jpeg

Hope this reply will help you fix your problem.


With warm regard
Xiaoxiang Yu



On Tue, Nov 7, 2023 at 7:18 PM Nam Đỗ Duy  wrote:

> Hi Xiaoxiang, I downloaded that file but the error is still the same,
> please kindly investigate and advise me.
>
> Thank you very much
>
> On Tue, Nov 7, 2023 at 5:15 PM Xiaoxiang Yu  wrote:
>
> > I need some time to try to reproduce and find the cause, but before that,
> > could you re-download another driver from
> > https://download.kyligence.io/#/download have
> > another try ?
> > ----
> > With warm regard
> > Xiaoxiang Yu
> >
> >
> >
> > On Tue, Nov 7, 2023 at 5:58 PM Nam Đỗ Duy 
> wrote:
> >
> >> Thank you Xiaoxiang, please see the attached photo and advise me the
> >> following:
> >>
> >> 1. I setup the file of x64, is it correct?
> >> 2. I run docker kylin 5.0 on Windows 10 Enterprise, mapping port of
> >> docker to 7070:7070 but it showed error as attached file. please advise
> the
> >> configuration.
> >>
> >> Thank you very much
> >>
> >> [image: image.png]
> >>
> >> On Tue, Nov 7, 2023 at 4:51 PM Xiaoxiang Yu  wrote:
> >>
> >>> Oh, you just ignore the 'A running Kyligence Enterprise server', since
> >>> you
> >>> have a Kylin 5 server, you can replace it with your Kylin 5 server.
> >>> That is because I know that these two products share the
> >>>  same API, metadata, and the core design .
> >>> 
> >>> With warm regard
> >>> Xiaoxiang Yu
> >>>
> >>>
> >>>
> >>> On Tue, Nov 7, 2023 at 5:40 PM Nam Đỗ Duy 
> >>> wrote:
> >>>
> >>> > Hi Xiaoxiang,
> >>> >
> >>> > In order to use Kylin ODBC, do I need to buy separate license of
> >>> Kyligence
> >>> > to use it?
> >>> >
> >>> > Prerequisites
> >>> >
> >>> >1.
> >>> >
> >>> >Microsoft Visual C++ 2015
> >>> >
> >>> >During the installation of Kyligence ODBC Driver, Microsoft VC++
> >>> will be
> >>> >    installed first and redistributable is already embedded in the
> >>> > installation
> >>> >package. If Microsoft Visual C++ 2015 is already installed on your
> >>> > machine,
> >

Re: Integrate Power BI report server with kylin

2023-11-07 Thread Xiaoxiang Yu
I need some time to try to reproduce the problem and find the cause, but
before that, could you download the driver again from
https://download.kyligence.io/#/download and give it
another try?

With warm regard
Xiaoxiang Yu



On Tue, Nov 7, 2023 at 5:58 PM Nam Đỗ Duy  wrote:

> Thank you Xiaoxiang, please see the attached photo and advise me the
> following:
>
> 1. I setup the file of x64, is it correct?
> 2. I run docker kylin 5.0 on Windows 10 Enterprise, mapping port of docker
> to 7070:7070 but it showed error as attached file. please advise the
> configuration.
>
> Thank you very much
>
> [image: image.png]
>
> On Tue, Nov 7, 2023 at 4:51 PM Xiaoxiang Yu  wrote:
>
>> Oh, you just ignore the 'A running Kyligence Enterprise server', since you
>> have a Kylin 5 server, you can replace it with your Kylin 5 server.
>> That is because I know that these two products share the
>>  same API, metadata, and the core design .
>> 
>> With warm regard
>> Xiaoxiang Yu
>>
>>
>>
>> On Tue, Nov 7, 2023 at 5:40 PM Nam Đỗ Duy  wrote:
>>
>> > Hi Xiaoxiang,
>> >
>> > In order to use Kylin ODBC, do I need to buy separate license of
>> Kyligence
>> > to use it?
>> >
>> > Prerequisites
>> >
>> >1.
>> >
>> >Microsoft Visual C++ 2015
>> >
>> >During the installation of Kyligence ODBC Driver, Microsoft VC++
>> will be
>> >installed first and redistributable is already embedded in the
>> > installation
>> >package. If Microsoft Visual C++ 2015 is already installed on your
>> > machine,
>> >this step will be skipped.
>> >2.
>> >
>> >A running Kyligence Enterprise server
>> >
>> >Kyligence ODBC Driver will connect to a Kyligence Enterprise server
>> to
>> >verify whether the connection works, so make sure the Kyligence
>> > Enterprise
>> >is running properly.
>> >
>> >
>> > Thank you and best regards
>> >
>> > On Tue, Nov 7, 2023 at 3:46 PM Xiaoxiang Yu  wrote:
>> >
>> > > Kylin Team didn't release a suitable driver at the moment, but you can
>> > try
>> > > an ODBC
>> > > driver from Kyligence(a vendor of Kylin). I did not verify but I
>> supposed
>> > > it should
>> > >  work for 5.0.0-beta. Please let us know if it can meet your needs.
>> > >
>> > > Here are the download and document links by its website:
>> > >
>> > > 1.
>> > >
>> https://kyligence.io/resources/kyligence-odbc-driver-for-apache-kylin-2/
>> > > 2.
>> > >
>> > >
>> >
>> https://docs.kyligence.io/books/v4.6/en/Analyst-and-Business-Users-Guide/integration/
>> > >
>> > >
>> > > 
>> > > With warm regard
>> > > Xiaoxiang Yu
>> > >
>> > >
>> > >
>> > > On Tue, Nov 7, 2023 at 3:14 PM Nam Đỗ Duy 
>> > wrote:
>> > >
>> > > > Dear Sir/Madam
>> > > >
>> > > > Our company is using report server on premise. Please advise me the
>> > > > integration guidelines in two scenarios:
>> > > >
>> > > > 1. POC period:
>> > > > Power BI report server: Windows 10
>> > > > Kylin 5.0 in docker file downloaded from official kylin website
>> > > >
>> > > > 2. Development period:
>> > > > Power BI report server: Windows 10
>> > > > We plan to not use docker this time: Kylin 5.0, Hive, zoo keeper,
>> > hadoop
>> > > in
>> > > > Ubuntu
>> > > >
>> > > > Thank you very much and best regards
>> > > >
>> > >
>> >
>>
>


Re: Integrate Power BI report server with kylin

2023-11-07 Thread Xiaoxiang Yu
Oh, you can just ignore the 'A running Kyligence Enterprise server'
prerequisite. Since you have a Kylin 5 server, you can use it in place of
the Kyligence Enterprise server. That is because, as far as I know, these
two products share the same API, metadata, and core design.

With warm regard
Xiaoxiang Yu



On Tue, Nov 7, 2023 at 5:40 PM Nam Đỗ Duy  wrote:

> Hi Xiaoxiang,
>
> In order to use Kylin ODBC, do I need to buy separate license of Kyligence
> to use it?
>
> Prerequisites
>
>1.
>
>Microsoft Visual C++ 2015
>
>During the installation of Kyligence ODBC Driver, Microsoft VC++ will be
>installed first and redistributable is already embedded in the
> installation
>package. If Microsoft Visual C++ 2015 is already installed on your
> machine,
>this step will be skipped.
>2.
>
>A running Kyligence Enterprise server
>
>Kyligence ODBC Driver will connect to a Kyligence Enterprise server to
>verify whether the connection works, so make sure the Kyligence
> Enterprise
>is running properly.
>
>
> Thank you and best regards
>
> On Tue, Nov 7, 2023 at 3:46 PM Xiaoxiang Yu  wrote:
>
> > Kylin Team didn't release a suitable driver at the moment, but you can
> try
> > an ODBC
> > driver from Kyligence(a vendor of Kylin). I did not verify but I supposed
> > it should
> >  work for 5.0.0-beta. Please let us know if it can meet your needs.
> >
> > Here are the download and document links by its website:
> >
> > 1.
> > https://kyligence.io/resources/kyligence-odbc-driver-for-apache-kylin-2/
> > 2.
> >
> >
> https://docs.kyligence.io/books/v4.6/en/Analyst-and-Business-Users-Guide/integration/
> >
> >
> > 
> > With warm regard
> > Xiaoxiang Yu
> >
> >
> >
> > On Tue, Nov 7, 2023 at 3:14 PM Nam Đỗ Duy 
> wrote:
> >
> > > Dear Sir/Madam
> > >
> > > Our company is using report server on premise. Please advise me the
> > > integration guidelines in two scenarios:
> > >
> > > 1. POC period:
> > > Power BI report server: Windows 10
> > > Kylin 5.0 in docker file downloaded from official kylin website
> > >
> > > 2. Development period:
> > > Power BI report server: Windows 10
> > > We plan to not use docker this time: Kylin 5.0, Hive, zoo keeper,
> hadoop
> > in
> > > Ubuntu
> > >
> > > Thank you very much and best regards
> > >
> >
>


Re: Integrate Power BI report server with kylin

2023-11-07 Thread Xiaoxiang Yu
The Kylin team has not released a suitable driver at the moment, but you
can try an ODBC driver from Kyligence (a vendor of Kylin). I have not
verified it, but I suppose it should work with 5.0.0-beta. Please let us
know if it meets your needs.

Here are the download and documentation links from its website:

1. https://kyligence.io/resources/kyligence-odbc-driver-for-apache-kylin-2/
2.
https://docs.kyligence.io/books/v4.6/en/Analyst-and-Business-Users-Guide/integration/



With warm regard
Xiaoxiang Yu



On Tue, Nov 7, 2023 at 3:14 PM Nam Đỗ Duy  wrote:

> Dear Sir/Madam
>
> Our company is using report server on premise. Please advise me the
> integration guidelines in two scenarios:
>
> 1. POC period:
> Power BI report server: Windows 10
> Kylin 5.0 in docker file downloaded from official kylin website
>
> 2. Development period:
> Power BI report server: Windows 10
> We plan to not use docker this time: Kylin 5.0, Hive, zoo keeper, hadoop in
> Ubuntu
>
> Thank you very much and best regards
>


Re: Why we choose Kylin

2023-11-05 Thread Xiaoxiang Yu
Here are some blogs which can help you introduce the advantages of Kylin.

- https://kylin.apache.org/blog/2022/01/12/The-Future-Of-Kylin/
-
https://cwiki.apache.org/confluence/display/KYLIN/Why+did+Youzan+choose+Kylin+4
- https://kylin.apache.org/blog/



With warm regard
Xiaoxiang Yu



On Fri, Nov 3, 2023 at 4:12 PM Nam Đỗ Duy  wrote:

> Dear Sir/Madam
>
> I am persuading my company to use Kylin as OLAP platform.
>
> Could you please give some fact or some document to my presentation about
> the reason why we should choose Kylin comparing with other OLAP platform.
>
> Thank you very much
>


Re: OLAP functionalities in Kylin 5.0 seems not yet working for me

2023-11-02 Thread Xiaoxiang Yu
1. How can I automate the build index daily for newest data?
I guess your team already manages an ETL pipeline (Jenkins,
DolphinScheduler, etc.). You can call Kylin through its REST API from that
pipeline; here is the link:
https://kylin.apache.org/5.0/docs/restapi/segment_management_api

2. Can I apply the above automate process for near realtime or realtime
data?
In most cases there will be a latency of about 10 minutes to 2 hours; it
depends on how fast the build index job completes.
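
As a rough illustration of the pipeline step in answer 1, the sketch below
computes a one-day time range and builds a JSON body that a scheduler task
could send to Kylin. The field names `project`, `start` and `end`, the
millisecond-epoch format, and the UTC day boundary are assumptions made for
illustration and are not taken from this thread; please check the segment
management API page linked above for the real endpoint, headers, and
parameters before using anything like this.

```python
import json
from datetime import date, datetime, timedelta, timezone


def daily_segment_range(day: date) -> tuple[int, int]:
    """Return the half-open range [start, end) of `day` as epoch milliseconds (UTC)."""
    start = datetime(day.year, day.month, day.day, tzinfo=timezone.utc)
    end = start + timedelta(days=1)
    return int(start.timestamp() * 1000), int(end.timestamp() * 1000)


def build_segment_request(project: str, day: date) -> str:
    """Build a JSON body for a daily build request.

    The keys used here are hypothetical placeholders; align them with the
    segment management API documentation.
    """
    start_ms, end_ms = daily_segment_range(day)
    return json.dumps({"project": project, "start": str(start_ms), "end": str(end_ms)})


if __name__ == "__main__":
    # A daily job would normally load the previous day's data.
    yesterday = date.today() - timedelta(days=1)
    print(build_segment_request("my_project", yesterday))
```

The ETL tool would run this once per day after the Hive load finishes and
pass the body to its HTTP step.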



With warm regard
Xiaoxiang Yu



On Thu, Nov 2, 2023 at 2:10 PM Nam Đỗ Duy  wrote:

> Hi Xiaoxiang
>
> My case is not in date range and I need to do daily.
>
> 1. How can I automate the build index daily for newest data?
>
> 2. Can I apply the above automate process for near realtime or realtime
> data (load realtime data from Hive into new index/segment)
>
> Thank you very much for your help
>
>
> On Thu, 2 Nov 2023 at 12:58 Xiaoxiang Yu  wrote:
>
> > If the new data 's date range is covered by a segment in your model, you
> > should refresh your existing segment, refer to :
> >
> >
> >
> https://kylin.apache.org/5.0/docs/modeling/load_data/segment_operation_settings/intro#segment-operation
> > .
> >
> > If not, create a new segment and build index, refer to :
> >   https://kylin.apache.org/5.0/docs/modeling/load_data/by_date
> >
> > 
> > With warm regard
> > Xiaoxiang Yu
> >
> >
> >
> > > On Thu, Nov 2, 2023 at 11:57 AM Nam Đỗ Duy wrote:
> >
> > > Thank you XiangXiao, I  still have 1 question as follows:
> > >
> > > > When the Hive Datasource to be added with new data, how to reflect
> > > > those in Cube (index) and query result?
> > >
> > >
> > > On Thu, Nov 2, 2023 at 10:00 AM Xiaoxiang Yu  wrote:
> > >
> > > > > Congratulations, hope you will make good use of the ability of Kylin
> > > > > 5 for your use cases.
> > > >
> > > >
> > > > 
> > > > With warm regard
> > > > Xiaoxiang Yu
> > > >
> > > >
> > > >
> > > > On Thu, Nov 2, 2023 at 10:50 AM Nam Đỗ Duy 
> > > wrote:
> > > >
> > > > >> The query is too fast, less than a second, can you make it a little
> > > > >> bit slower so that I can see it clearly
> > > >> [image: image.png]
> > > >>
> > > >> On Thu, Nov 2, 2023 at 9:32 AM Nam Đỗ Duy  wrote:
> > > >>
> > > >>> Thank you Xiaoxiang for the guideline. Will definitely read
> > > >>> it carefully. Kindly help the following questions:
> > > >>>
> > > >>> 1. Computed column
> > > >>>
> > > >>> I created a “computed column” and add it to dimensions (among other
> > > >>> dimensions)
> > > >>>
> > > >>> When I use query to select the computed column it returned error
> > > >>>
> > > > >>> 2. Datatype optimization: will you think that the int be better
> > > > >>> than string for key join columns?
> > > >>>
> > > >>> Please advise
> > > >>>
> > > >>>
> > > >>> On Wed, 1 Nov 2023 at 17:32 Xiaoxiang Yu  wrote:
> > > >>>
> > > >>>> Yes, that is almost correct.
> > > >>>>
> > > > >>>> If you have a lot of complex queries, and you want to using Kylin
> > > > >>>> 5 to accelerate them, the recommended steps of mine are as follows:
> > > > >>>>
> > > > >>>> 1. You analyse all queries and collect all join relation/pattern.
> > > > >>>> 2. You create Models for each specific join relation/pattern, with
> > > > >>>> the join relation you find in above step.
> > > > >>>> 3. You analyse and collect dimensions and measures from all
> > > > >>>> queries, and add them to the corresponding Model.
> > > > >>>> 4. You build segments of all Models with proper data range.
> > > > >>>> 5. You turned off the pushdown switch, and sent all queries to
> > > > >>>> Kylin. If there are some queries which failed, fix them.
> > > > >>>> Here are some common situations.
> 

Re: OLAP functionalities in Kylin 5.0 seems not yet working for me

2023-11-01 Thread Xiaoxiang Yu
If the date range of the new data is covered by a segment in your model, you
should refresh that existing segment; refer to:

https://kylin.apache.org/5.0/docs/modeling/load_data/segment_operation_settings/intro#segment-operation

If not, create a new segment and build the index; refer to:
  https://kylin.apache.org/5.0/docs/modeling/load_data/by_date
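
The decision above can be sketched in a few lines. This is only an
illustration of the rule "covered by an existing segment: refresh it,
otherwise build a new one"; the function name `plan_load` and the
representation of segments as half-open (start, end) date pairs are
assumptions of this sketch, not Kylin APIs.

```python
from datetime import date


def plan_load(segments, new_start, new_end):
    """Decide how to load data for the half-open range [new_start, new_end).

    `segments` is a list of (start, end) half-open date ranges that already
    exist in the model. Returns ("refresh", segment_range) if one segment
    fully covers the new data, else ("build_new", new_range).
    """
    for seg_start, seg_end in segments:
        if seg_start <= new_start and new_end <= seg_end:
            return "refresh", (seg_start, seg_end)
    return "build_new", (new_start, new_end)


existing = [(date(2023, 10, 1), date(2023, 11, 1))]

# Late-arriving rows for 2023-10-15 fall inside the October segment.
print(plan_load(existing, date(2023, 10, 15), date(2023, 10, 16)))

# Rows for 2023-11-01 are outside every segment, so a new one is needed.
print(plan_load(existing, date(2023, 11, 1), date(2023, 11, 2)))
```

A range that only partly overlaps an existing segment is treated as new
here for brevity; a real pipeline would need to handle that case
explicitly.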


With warm regard
Xiaoxiang Yu



On Thu, Nov 2, 2023 at 11:57 AM Nam Đỗ Duy  wrote:

> Thank you XiangXiao, I  still have 1 question as follows:
>
> When the Hive Datasource to be added with new data, how to reflect those in
> Cube (index) and query result?
>
>
> On Thu, Nov 2, 2023 at 10:00 AM Xiaoxiang Yu  wrote:
>
> > Congratulations, hope you will make good use of the ability of Kylin 5
> for
> > your use cases.
> >
> >
> > --------
> > With warm regard
> > Xiaoxiang Yu
> >
> >
> >
> > On Thu, Nov 2, 2023 at 10:50 AM Nam Đỗ Duy 
> wrote:
> >
> >> The query is too fast, less than a second, can you make it a little bit
> >> slower so that I can see it clearly 
> >> [image: image.png]
> >>
> >> On Thu, Nov 2, 2023 at 9:32 AM Nam Đỗ Duy  wrote:
> >>
> >>> Thank you Xiaoxiang for the guideline. Will definitely read
> >>> it carefully. Kindly help the following questions:
> >>>
> >>> 1. Computed column
> >>>
> >>> I created a “computed column” and add it to dimensions (among other
> >>> dimensions)
> >>>
> >>> When I use query to select the computed column it returned error
> >>>
> >>> 2. Datatype optimization: will you think that the int be better than
> >>> string for key join columns?
> >>>
> >>> Please advise
> >>>
> >>>
> >>> On Wed, 1 Nov 2023 at 17:32 Xiaoxiang Yu  wrote:
> >>>
> >>>> Yes, that is almost correct.
> >>>>
> >>>> If you have a lot of complex queries, and you want to using Kylin 5 to
> >>>> accelerate them, the recommended steps of mine are as follows:
> >>>>
> >>>> 1. You analyse all queries and collect all join relation/pattern.
> >>>> 2. You create Models for each specific join relation/pattern, with the
> >>>> join
> >>>> relation you find in above step.
> >>>> 3. You analyse and collect dimensions and measures from all queries,
> and
> >>>> add them to the corresponding Model.
> >>>> 4. You build segments of all Models with proper data range.
> >>>> 5. You turned off the pushdown switch, and sent all queries to Kylin.
> If
> >>>> there are some queries which failed, fix them.
> >>>> Here are some common situations.
> >>>> 5.1 Join relation/pattern is not matched
> >>>> 5.2 If the join relation is matched, the Model might not contain
> >>>> every
> >>>> column that your query needs, please check kylin.query.log with
> keyword
> >>>> '
> >>>> unmatched'.
> >>>> 6. (Optional) If you find some of your queries do not exactly match
> with
> >>>> your Index(your query on [colA, colB], but your index contains more
> >>>> columns
> >>>> than colA and colB), you can add some aggregate groups(or smaller
> Table
> >>>> Index) to optimize the query performance.
> >>>>
> >>>>
> >>>>
> >>>> 
> >>>> With warm regard
> >>>> Xiaoxiang Yu
> >>>>
> >>>>
> >>>>
> >>>> On Wed, Nov 1, 2023 at 5:57 PM Nam Đỗ Duy 
> >>>> wrote:
> >>>>
> >>>> > Thank you Xiaoxiang, I nearly got to the point.
> >>>> >
> >>>> > So can I interpret that: 1 model equal (~) to a set of Joins of
> >>>> (Dim/Fact)
> >>>> > table, that is to say we need to create several models according to
> >>>> > multiple kinds of joins queries?
> >>>> >
> >>>> > Best regards
> >>>> >
> >>>> > On Wed, Nov 1, 2023 at 4:50 PM Xiaoxiang Yu 
> wrote:
> >>>> >
> >>>> >> Have you ever tried to analyse the reason why your query can not
> hit
> >>>> >> Model 'sample_ssb'?
> >>>> >> It is because the join relation of your query is not 

Re: OLAP functionalities in Kylin 5.0 seems not yet working for me

2023-11-01 Thread Xiaoxiang Yu
Congratulations! I hope you will make good use of the abilities of Kylin 5 for
your use cases.



With warm regard
Xiaoxiang Yu



On Thu, Nov 2, 2023 at 10:50 AM Nam Đỗ Duy  wrote:

> The query is too fast, less than a second, can you make it a little bit
> slower so that I can see it clearly 
> [image: image.png]
>
> On Thu, Nov 2, 2023 at 9:32 AM Nam Đỗ Duy  wrote:
>
>> Thank you Xiaoxiang for the guideline. Will definitely read it carefully.
>> Kindly help the following questions:
>>
>> 1. Computed column
>>
>> I created a “computed column” and add it to dimensions (among other
>> dimensions)
>>
>> When I use query to select the computed column it returned error
>>
>> 2. Datatype optimization: will you think that the int be better than
>> string for key join columns?
>>
>> Please advise
>>
>>
>> On Wed, 1 Nov 2023 at 17:32 Xiaoxiang Yu  wrote:
>>
>>> Yes, that is almost correct.
>>>
>>> If you have a lot of complex queries, and you want to using Kylin 5 to
>>> accelerate them, the recommended steps of mine are as follows:
>>>
>>> 1. You analyse all queries and collect all join relation/pattern.
>>> 2. You create Models for each specific join relation/pattern, with the
>>> join
>>> relation you find in above step.
>>> 3. You analyse and collect dimensions and measures from all queries, and
>>> add them to the corresponding Model.
>>> 4. You build segments of all Models with proper data range.
>>> 5. You turned off the pushdown switch, and sent all queries to Kylin. If
>>> there are some queries which failed, fix them.
>>> Here are some common situations.
>>> 5.1 Join relation/pattern is not matched
>>> 5.2 If the join relation is matched, the Model might not contain
>>> every
>>> column that your query needs, please check kylin.query.log with keyword '
>>> unmatched'.
>>> 6. (Optional) If you find some of your queries do not exactly match with
>>> your Index(your query on [colA, colB], but your index contains more
>>> columns
>>> than colA and colB), you can add some aggregate groups(or smaller Table
>>> Index) to optimize the query performance.
>>>
>>>
>>>
>>> 
>>> With warm regard
>>> Xiaoxiang Yu
>>>
>>>
>>>
>>> On Wed, Nov 1, 2023 at 5:57 PM Nam Đỗ Duy 
>>> wrote:
>>>
>>> > Thank you Xiaoxiang, I nearly got to the point.
>>> >
>>> > So can I interpret that: 1 model equal (~) to a set of Joins of
>>> (Dim/Fact)
>>> > table, that is to say we need to create several models according to
>>> > multiple kinds of joins queries?
>>> >
>>> > Best regards
>>> >
>>> > On Wed, Nov 1, 2023 at 4:50 PM Xiaoxiang Yu  wrote:
>>> >
>>> >> Have you ever tried to analyse the reason why your query can not hit
>>> >> Model 'sample_ssb'?
>>> >> It is because the join relation of your query is not suitable for the
>>> >> join relation/pattern of  Model 'sample_ssb'.
>>> >>
>>> >> Your query used a join relation/pattern like: A inner join B.
>>> >> But the Model 'sample_ssb' used a join relation/pattern like : A inner
>>> >> join B inner join C.
>>> >>
>>> >> If you are familiar with the definition of Inner join, you may know
>>> that
>>> >> the
>>> >> relation/pattern 'A inner join B inner join C' will have a chance
>>> >> to lose some rows when compared to pattern 'A inner join B'.
>>> >> So the Model 'sample_ssb' will be excluded to serve your query.
>>> >>
>>> >> That is to say, you need to create a new model that is similar to
>>> Model
>>> >> 'sample_ssb',
>>> >>  but with additional tables removed.
>>> >>
>>> >>
>>> >>
>>> >> 
>>> >> With warm regard
>>> >> Xiaoxiang Yu
>>> >>
>>> >>
>>> >>
>>> >> On Wed, Nov 1, 2023 at 5:21 PM Nam Đỗ Duy 
>>> wrote:
>>> >>
>>> >>> Hi Xiaoxiang,
>>> >>>
>>> >>> Thank you very much
>>> >>>
>>> >>> I have clearer picture of Kylin already thanks to your explanation.
>>

Re: OLAP functionalities in Kylin 5.0 seems not yet working for me

2023-11-01 Thread Xiaoxiang Yu
Hi, for the first question, you did not provide enough detail for analysis.
Please send me your query diagnostic package, which includes your metadata,
query, and logs.

For the second question, I am not sure at the moment.



With warm regard
Xiaoxiang Yu



On Thu, Nov 2, 2023 at 10:33 AM Nam Đỗ Duy  wrote:

> Thank you Xiaoxiang for the guideline. Will definitely read it carefully.
> Kindly help the following questions:
>
> 1. Computed column
>
> I created a “computed column” and add it to dimensions (among other
> dimensions)
>
> When I use query to select the computed column it returned error
>
> 2. Datatype optimization: will you think that the int be better than string
> for key join columns?
>
> Please advise
>
>
> On Wed, 1 Nov 2023 at 17:32 Xiaoxiang Yu  wrote:
>
> > Yes, that is almost correct.
> >
> > If you have a lot of complex queries, and you want to using Kylin 5 to
> > accelerate them, the recommended steps of mine are as follows:
> >
> > 1. You analyse all queries and collect all join relation/pattern.
> > 2. You create Models for each specific join relation/pattern, with the
> join
> > relation you find in above step.
> > 3. You analyse and collect dimensions and measures from all queries, and
> > add them to the corresponding Model.
> > 4. You build segments of all Models with proper data range.
> > 5. You turned off the pushdown switch, and sent all queries to Kylin. If
> > there are some queries which failed, fix them.
> > Here are some common situations.
> > 5.1 Join relation/pattern is not matched
> > 5.2 If the join relation is matched, the Model might not contain
> every
> > column that your query needs, please check kylin.query.log with keyword '
> > unmatched'.
> > 6. (Optional) If you find some of your queries do not exactly match with
> > your Index(your query on [colA, colB], but your index contains more
> columns
> > than colA and colB), you can add some aggregate groups(or smaller Table
> > Index) to optimize the query performance.
> >
> >
> >
> > 
> > With warm regard
> > Xiaoxiang Yu
> >
> >
> >
> > On Wed, Nov 1, 2023 at 5:57 PM Nam Đỗ Duy 
> wrote:
> >
> > > Thank you Xiaoxiang, I nearly got to the point.
> > >
> > > So can I interpret that: 1 model equal (~) to a set of Joins of
> > (Dim/Fact)
> > > table, that is to say we need to create several models according to
> > > multiple kinds of joins queries?
> > >
> > > Best regards
> > >
> > > On Wed, Nov 1, 2023 at 4:50 PM Xiaoxiang Yu  wrote:
> > >
> > >> Have you ever tried to analyse the reason why your query can not hit
> > >> Model 'sample_ssb'?
> > >> It is because the join relation of your query is not suitable for the
> > >> join relation/pattern of  Model 'sample_ssb'.
> > >>
> > >> Your query used a join relation/pattern like: A inner join B.
> > >> But the Model 'sample_ssb' used a join relation/pattern like : A inner
> > >> join B inner join C.
> > >>
> > >> If you are familiar with the definition of Inner join, you may know
> that
> > >> the
> > >> relation/pattern 'A inner join B inner join C' will have a chance
> > >> to lose some rows when compared to pattern 'A inner join B'.
> > >> So the Model 'sample_ssb' will be excluded to serve your query.
> > >>
> > >> That is to say, you need to create a new model that is similar to
> Model
> > >> 'sample_ssb',
> > >>  but with additional tables removed.
> > >>
> > >>
> > >>
> > >> 
> > >> With warm regard
> > >> Xiaoxiang Yu
> > >>
> > >>
> > >>
> > >> On Wed, Nov 1, 2023 at 5:21 PM Nam Đỗ Duy 
> > wrote:
> > >>
> > >>> Hi Xiaoxiang,
> > >>>
> > >>> Thank you very much
> > >>>
> > >>> I have clearer picture of Kylin already thanks to your explanation.
> > >>>
> > >>> Now back to the sample project of SSB in attached photo, when I run
> > this
> > >>> query with push_down option OFF, why the OLAP error appears, and in
> > such
> > >>> case, how to create a new cube for this query?
> > >>>
> > >>> [image: image.png]
> > >>>
> > >>> On Wed, Nov 1, 2023 at 3:49 PM Xiaoxian

Re: OLAP functionalities in Kylin 5.0 seems not yet working for me

2023-11-01 Thread Xiaoxiang Yu
Yes, that is almost correct.

If you have a lot of complex queries and you want to use Kylin 5 to
accelerate them, my recommended steps are as follows:

1. Analyse all queries and collect every join relation/pattern.
2. Create a Model for each specific join relation/pattern found in the
previous step.
3. Analyse the queries again, collect their dimensions and measures, and
add them to the corresponding Model.
4. Build segments for all Models with a proper data range.
5. Turn off the pushdown switch and send all queries to Kylin. If some
queries fail, fix them. Here are some common situations:
   5.1 The join relation/pattern is not matched.
   5.2 The join relation is matched, but the Model does not contain every
   column that your query needs; check kylin.query.log for the keyword
   'unmatched'.
6. (Optional) If some of your queries do not exactly match your Index (your
query is on [colA, colB], but the index contains more columns than colA and
colB), you can add aggregate groups (or a smaller Table Index) to optimize
query performance.
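
For step 5.2, a small script can pull the relevant lines out of
kylin.query.log. The sketch below is a plain keyword filter; the sample
lines are made-up placeholders rather than real Kylin log output, and the
log path in the comment is hypothetical.

```python
def find_unmatched(lines, keyword="unmatched"):
    """Return (line_number, text) for every line containing `keyword` (case-insensitive)."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if keyword in line.lower():
            hits.append((number, line.rstrip("\n")))
    return hits


# Placeholder lines only; a real run would read the log file instead, e.g.
#   with open("/path/to/logs/kylin.query.log", encoding="utf-8") as f:
#       hits = find_unmatched(f)
sample = [
    "placeholder line about query start\n",
    "placeholder line mentioning Unmatched columns\n",
    "placeholder line about query end\n",
]

for number, text in find_unmatched(sample):
    print(f"{number}: {text}")
```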




With warm regard
Xiaoxiang Yu



On Wed, Nov 1, 2023 at 5:57 PM Nam Đỗ Duy  wrote:

> Thank you Xiaoxiang, I nearly got to the point.
>
> So can I interpret that: 1 model equal (~) to a set of Joins of (Dim/Fact)
> table, that is to say we need to create several models according to
> multiple kinds of joins queries?
>
> Best regards
>
> On Wed, Nov 1, 2023 at 4:50 PM Xiaoxiang Yu  wrote:
>
>> Have you ever tried to analyse the reason why your query can not hit
>> Model 'sample_ssb'?
>> It is because the join relation of your query is not suitable for the
>> join relation/pattern of  Model 'sample_ssb'.
>>
>> Your query used a join relation/pattern like: A inner join B.
>> But the Model 'sample_ssb' used a join relation/pattern like : A inner
>> join B inner join C.
>>
>> If you are familiar with the definition of Inner join, you may know that
>> the
>> relation/pattern 'A inner join B inner join C' will have a chance
>> to lose some rows when compared to pattern 'A inner join B'.
>> So the Model 'sample_ssb' will be excluded to serve your query.
>>
>> That is to say, you need to create a new model that is similar to Model
>> 'sample_ssb',
>>  but with additional tables removed.
>>
>>
>>
>> 
>> With warm regard
>> Xiaoxiang Yu
>>
>>
>>
>> On Wed, Nov 1, 2023 at 5:21 PM Nam Đỗ Duy  wrote:
>>
>>> Hi Xiaoxiang,
>>>
>>> Thank you very much
>>>
>>> I have clearer picture of Kylin already thanks to your explanation.
>>>
>>> Now back to the sample project of SSB in attached photo, when I run this
>>> query with push_down option OFF, why the OLAP error appears, and in such
>>> case, how to create a new cube for this query?
>>>
>>> [image: image.png]
>>>
>>> On Wed, Nov 1, 2023 at 3:49 PM Xiaoxiang Yu  wrote:
>>>
>>>> Here is some of my explanation and it may not be perfect.
>>>> Segment in Kylin is part of model/cube pre-computed data, in most
>>>> cases, divided by date column.
>>>>
>>>> Here is some difference between Segment and Snapshot.
>>>> Segment, whose source data comes from one fact table joins some dimension
>>>> tables with 'specific date range', is 'precomputed', and will accelerate
>>>> complex query.
>>>> Snapshot, whose source data comes from one specific dimension table without
>>>> specific date range, is "not precomputed", and can join with segments
>>>> at runtime .
>>>>
>>>> - https://kylin.apache.org/5.0/docs/snapshot/snapshot_management
>>>> -
>>>> https://kylin.apache.org/5.0/docs/modeling/load_data/segment_operation_settings/intro
>>>>
>>>> 
>>>> With warm regard
>>>> Xiaoxiang Yu
>>>>
>>>>
>>>>
>>>> On Wed, Nov 1, 2023 at 3:53 PM Nam Đỗ Duy  wrote:
>>>>
>>>>> Thank you again, very smart of you to automatically select cube for a
>>>>> certain query. Sorry If I ask too much: Is the concept of Segment in Kylin
>>>>> model similar to Slice-and-Dice concept of Cube, what is the different
>>>>> between Kylin Segment and Kylin Snapshot?
>>>>>
>>>>> PS. I sent you the log files for your help in investigating why my
>>>>> cube has not been used.
>>>>>
>>>>> O

Re: OLAP functionalities in Kylin 5.0 seems not yet working for me

2023-11-01 Thread Xiaoxiang Yu
Have you tried to analyse why your query cannot hit Model 'sample_ssb'?
It is because the join relation of your query does not match the join
relation/pattern of Model 'sample_ssb'.

Your query used a join relation/pattern like: A inner join B.
But the Model 'sample_ssb' used a join relation/pattern like: A inner join
B inner join C.

If you are familiar with the definition of inner join, you may know that
the pattern 'A inner join B inner join C' can lose some rows compared
with the pattern 'A inner join B'.
So the Model 'sample_ssb' is excluded from serving your query.

That is to say, you need to create a new model that is similar to Model
'sample_ssb', but with the additional tables removed.
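
To make the row-loss point concrete, here is a tiny self-contained example
with invented tables A, B and C (the column names are hypothetical and are
not the SSB schema). One row of A has no partner in C, so the three-table
inner join returns fewer rows than the two-table one.

```python
def inner_join(left, right, key):
    """Naive inner join of two lists of dicts on a shared column `key`."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


A = [
    {"order_id": 1, "cust_id": 10, "part_id": 100},
    {"order_id": 2, "cust_id": 11, "part_id": 101},
]
B = [
    {"cust_id": 10, "cust_name": "x"},
    {"cust_id": 11, "cust_name": "y"},
]
C = [
    {"part_id": 100, "part_name": "p"},  # part 101 has no row in C
]

a_b = inner_join(A, B, "cust_id")
a_b_c = inner_join(a_b, C, "part_id")

print(len(a_b))    # rows of A inner join B
print(len(a_b_c))  # rows of A inner join B inner join C
```

An index precomputed on the three-table join has already dropped order 2,
so it cannot answer a query that only joins A and B.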




With warm regard
Xiaoxiang Yu



On Wed, Nov 1, 2023 at 5:21 PM Nam Đỗ Duy  wrote:

> Hi Xiaoxiang,
>
> Thank you very much
>
> I have clearer picture of Kylin already thanks to your explanation.
>
> Now back to the sample project of SSB in attached photo, when I run this
> query with push_down option OFF, why the OLAP error appears, and in such
> case, how to create a new cube for this query?
>
> [image: image.png]
>
> On Wed, Nov 1, 2023 at 3:49 PM Xiaoxiang Yu  wrote:
>
>> Here is some of my explanation and it may not be perfect.
>> Segment in Kylin is part of model/cube pre-computed data, in most
>> cases, divided by date column.
>>
>> Here is some difference between Segment and Snapshot.
>> Segment, whose source data comes from one fact table joins some dimension
>> tables with 'specific date range', is 'precomputed', and will accelerate
>> complex query.
>> Snapshot, whose source data comes from one specific dimension table without
>> specific date range, is "not precomputed", and can join with segments at
>> runtime .
>>
>> - https://kylin.apache.org/5.0/docs/snapshot/snapshot_management
>> -
>> https://kylin.apache.org/5.0/docs/modeling/load_data/segment_operation_settings/intro
>>
>> 
>> With warm regard
>> Xiaoxiang Yu
>>
>>
>>
>> On Wed, Nov 1, 2023 at 3:53 PM Nam Đỗ Duy  wrote:
>>
>>> Thank you again, very smart of you to automatically select cube for a
>>> certain query. Sorry If I ask too much: Is the concept of Segment in Kylin
>>> model similar to Slice-and-Dice concept of Cube, what is the different
>>> between Kylin Segment and Kylin Snapshot?
>>>
>>> PS. I sent you the log files for your help in investigating why my cube
>>> has not been used.
>>>
>>> On Wed, Nov 1, 2023 at 2:36 PM Xiaoxiang Yu  wrote:
>>>
>>>> I guess there is a misunderstanding from your sentences.
>>>>
>>>> -- 'I need to select Cube from a combo box below the query window'
>>>> It is not right to use 'need', that combo box is for some specific
>>>> cases(for example, Kylin did not choose a cube which is the most
>>>> efficient), not the most cases.
>>>> In most cases(both for Kylin 4 and Kylin 5), you don't need to select a
>>>> Cube in the combo box, Kylin will do the choice for you.
>>>>
>>>> 
>>>> With warm regard
>>>> Xiaoxiang Yu
>>>>
>>>>
>>>>
>>>> On Wed, Nov 1, 2023 at 3:24 PM Nam Đỗ Duy 
>>>> wrote:
>>>>
>>>>> Hi Xiaoxiang, sorry if I made you confused (Anyway, it is just a
>>>>> question of a beginner)
>>>>>
>>>>> "obviously" means "clearly"
>>>>>
>>>>> because I need to select Cube from a combo box below the query window
>>>>>
>>>>> Thank you very much
>>>>>
>>>>> On Wed, Nov 1, 2023 at 2:20 PM Xiaoxiang Yu  wrote:
>>>>>
>>>>>> From my side, I cannot understand why you say Kylin 4 is 'very
>>>>>> obviously'. Can you give an example?
>>>>>> From the source code, the basic logic of choosing the right
>>>>>> cube/model are similar.
>>>>>> 
>>>>>> With warm regard
>>>>>> Xiaoxiang Yu
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Nov 1, 2023 at 3:01 PM Nam Đỗ Duy  wrote:
>>>>>>
>>>>>>> Thank you for your kind reply, please answer 1 more question about
>>>>>>> version 5:
>>>>>>>
>>>>>>> In version 4.x we run query 

Re: OLAP functionalities in Kylin 5.0 seems not yet working for me

2023-11-01 Thread Xiaoxiang Yu
Here is my explanation; it may not be perfect.
A Segment in Kylin is one part of the pre-computed data of a model/cube, in
most cases divided by a date column.

Here are some differences between Segment and Snapshot.
A Segment, whose source data comes from one fact table joined with some
dimension tables over a specific date range, is precomputed and accelerates
complex queries.
A Snapshot, whose source data comes from one specific dimension table without
a specific date range, is not precomputed and can be joined with segments at
runtime.
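
A toy model may help to see the difference. It is a conceptual sketch only:
the data, the names, and the storage layout are invented and do not reflect
how Kylin actually stores segments or snapshots. The "segment" holds an
aggregate precomputed for one date range, and the "snapshot" is a plain
copy of a dimension table that is looked up when the query runs.

```python
from collections import defaultdict

# Fact rows for one date range: (day, cust_id, amount).
fact = [
    ("2023-11-01", 10, 5.0),
    ("2023-11-01", 11, 7.0),
    ("2023-11-01", 10, 3.0),
]

# Snapshot: a copy of the dimension table, with no date range and no aggregation.
snapshot = {10: "Hanoi", 11: "Shanghai"}


def build_segment(rows):
    """Precompute SUM(amount) grouped by (day, cust_id) at build time."""
    agg = defaultdict(float)
    for day, cust_id, amount in rows:
        agg[(day, cust_id)] += amount
    return dict(agg)


def revenue_by_city(segment, snapshot):
    """Query time: join the precomputed segment with the snapshot."""
    out = defaultdict(float)
    for (_day, cust_id), total in segment.items():
        out[snapshot[cust_id]] += total
    return dict(out)


segment = build_segment(fact)
result = revenue_by_city(segment, snapshot)
print(result)
```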

- https://kylin.apache.org/5.0/docs/snapshot/snapshot_management
- https://kylin.apache.org/5.0/docs/modeling/load_data/segment_operation_settings/intro


With warm regard
Xiaoxiang Yu



On Wed, Nov 1, 2023 at 3:53 PM Nam Đỗ Duy  wrote:

> Thank you again, very smart of you to automatically select cube for a
> certain query. Sorry If I ask too much: Is the concept of Segment in Kylin
> model similar to Slice-and-Dice concept of Cube, what is the different
> between Kylin Segment and Kylin Snapshot?
>
> PS. I sent you the log files for your help in investigating why my cube
> has not been used.
>
> On Wed, Nov 1, 2023 at 2:36 PM Xiaoxiang Yu  wrote:
>
>> I guess there is a misunderstanding from your sentences.
>>
>> -- 'I need to select Cube from a combo box below the query window'
>> It is not right to use 'need', that combo box is for some specific
>> cases(for example, Kylin did not choose a cube which is the most
>> efficient), not the most cases.
>> In most cases(both for Kylin 4 and Kylin 5), you don't need to select a
>> Cube in the combo box, Kylin will do the choice for you.
>>
>> 
>> With warm regard
>> Xiaoxiang Yu
>>
>>
>>
>> On Wed, Nov 1, 2023 at 3:24 PM Nam Đỗ Duy  wrote:
>>
>>> Hi Xiaoxiang, sorry if I made you confused (Anyway, it is just a
>>> question of a beginner)
>>>
>>> "obviously" means "clearly"
>>>
>>> because I need to select Cube from a combo box below the query window
>>>
>>> Thank you very much
>>>
>>> On Wed, Nov 1, 2023 at 2:20 PM Xiaoxiang Yu  wrote:
>>>
>>>> From my side, I cannot understand why you say Kylin 4 is 'very
>>>> obviously'. Can you give an example?
>>>> From the source code, the basic logic of choosing the right cube/model
>>>> are similar.
>>>> 
>>>> With warm regard
>>>> Xiaoxiang Yu
>>>>
>>>>
>>>>
>>>> On Wed, Nov 1, 2023 at 3:01 PM Nam Đỗ Duy  wrote:
>>>>
>>>>> Thank you for your kind reply, please answer 1 more question about
>>>>> version 5:
>>>>>
>>>>> In version 4.x we run query against a Cube very obviously, but in
>>>>> version 5, the cube usage is a implication so can you advise: for a given
>>>>> query, which model will be used, which index (cube) will be used for this
>>>>> query?
>>>>>
>>>>> Thank you
>>>>>
>>>>> On Wed, Nov 1, 2023 at 1:42 PM Xiaoxiang Yu  wrote:
>>>>>
>>>>>> 1. How do I measure the size of the index (cube) in version 5?
>>>>>>    You can check storage of specific Indexes from the Index page.
>>>>>>
>>>>>> https://kylin.apache.org/5.0/docs/modeling/model_design/aggregation_group#view-aggregate-index
>>>>>> or
>>>>>> https://kylin.apache.org/5.0/assets/images/index_1-6ad3f55183d4ed61962359d9408ba192.png
>>>>>>
>>>>>>
>>>>>> 2. How to create the cardinality for each column?
>>>>>>    You should check this link :
>>>>>> https://kylin.apache.org/5.0/docs/datasource/data_sampling/ .
>>>>>>
>>>>>> 3. In your default project sample named SSB project, you have only 4
>>>>>> simple aggregate group index and no table index as in attached file
>>>>>> so what is the best strategy to select index for our OLAP?
>>>>>> 1. There does exist a 'Base Table Index'  by default actually,
>>>>>> its id is 201.
>>>>>> 2. I think it is a good question and Kylin 5 lacks such a guide
>>>>>> for better modeling. You are free to ask your question to
>>>>>> mailing list and I will try to reply.
>>>>>>
>>>>>> 
>>>>>> With warm regard
>

Re: OLAP functionalities in Kylin 5.0 seems not yet working for me

2023-11-01 Thread Xiaoxiang Yu
I guess there is a misunderstanding in your sentence.

-- 'I need to select Cube from a combo box below the query window'
'Need' is not the right word. That combo box is for specific cases
(for example, when Kylin did not choose the most efficient cube), not for
most cases.
In most cases (both for Kylin 4 and Kylin 5), you don't need to select a
Cube in the combo box; Kylin will make the choice for you.


With warm regard
Xiaoxiang Yu



On Wed, Nov 1, 2023 at 3:24 PM Nam Đỗ Duy  wrote:

> Hi Xiaoxiang, sorry if I made you confused (Anyway, it is just a question
> of a beginner)
>
> "obviously" means "clearly"
>
> because I need to select Cube from a combo box below the query window
>
> Thank you very much
>
> On Wed, Nov 1, 2023 at 2:20 PM Xiaoxiang Yu  wrote:
>
>> From my side, I cannot understand why you say Kylin 4 is 'very obviously'.
>> Can you give an example?
>> From the source code, the basic logic of choosing the right cube/model
>> are similar.
>> 
>> With warm regard
>> Xiaoxiang Yu
>>
>>
>>
>> On Wed, Nov 1, 2023 at 3:01 PM Nam Đỗ Duy  wrote:
>>
>>> Thank you for your kind reply, please answer 1 more question about
>>> version 5:
>>>
>>> In version 4.x we run query against a Cube very obviously, but in
>>> version 5, the cube usage is a implication so can you advise: for a given
>>> query, which model will be used, which index (cube) will be used for this
>>> query?
>>>
>>> Thank you
>>>
>>> On Wed, Nov 1, 2023 at 1:42 PM Xiaoxiang Yu  wrote:
>>>
>>>> 1. How do I measure the size of the index (cube) in version 5?
>>>>    You can check storage of specific Indexes from the Index page.
>>>>
>>>> https://kylin.apache.org/5.0/docs/modeling/model_design/aggregation_group#view-aggregate-index
>>>> or
>>>> https://kylin.apache.org/5.0/assets/images/index_1-6ad3f55183d4ed61962359d9408ba192.png
>>>>
>>>>
>>>> 2. How to create the cardinality for each column?
>>>>    You should check this link :
>>>> https://kylin.apache.org/5.0/docs/datasource/data_sampling/ .
>>>>
>>>> 3. In your default project sample named SSB project, you have only 4
>>>> simple aggregate group index and no table index as in attached file
>>>> so what is the best strategy to select index for our OLAP?
>>>> 1. There does exist a 'Base Table Index'  by default actually, its
>>>> id is 201.
>>>> 2. I think it is a good question and Kylin 5 lacks such a guide for
>>>> better modeling. You are free to ask your question to
>>>> mailing list and I will try to reply.
>>>>
>>>> 
>>>> With warm regard
>>>> Xiaoxiang Yu
>>>>
>>>>
>>>>
>>>> On Wed, Nov 1, 2023 at 2:12 PM Xiaoxiang Yu  wrote:
>>>>
>>>>> OK, I didn't read all the mail history so I misunderstand the
>>>>> situation. Looks like you need to analyse
>>>>> the cause why the query didn't hit the cube correctly.
>>>>>
>>>>> Please generate query diagnosis package and send it to me privately. I
>>>>> will analyse the query log.
>>>>> You can refer to the following steps in screenshots.
>>>>>
>>>>> [image: image.png]
>>>>>
>>>>> If the screenshots are not displaying correctly, please read this
>>>>> guide :
>>>>>
>>>>> https://kylin.apache.org/5.0/docs/operations/system-operation/diagnosis/#generate-query-diagnosis-package-in-web-ui
>>>>>
>>>>> By the way, you need to analyse the cause by reading kylin.query.log,
>>>>> not the kylin.log,
>>>>> refer to https://kylin.apache.org/5.0/docs/operations/logs/system_log
>>>>>
>>>>> --------
>>>>> With warm regard
>>>>> Xiaoxiang Yu
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Nov 1, 2023 at 12:18 PM Nam Đỗ Duy  wrote:
>>>>>
>>>>>> Thank you Xiaoxiang for your advice. As my title email shown, I
>>>>>> guessed that the OLAP functionalities has not been correctly set up in my
>>>>>> computer.
>>>>>>
>>>>>> The evidence about it is that: when I disable the Pushdown option box
>>>>>

Re: OLAP functionalities in Kylin 5.0 seems not yet working for me

2023-11-01 Thread Xiaoxiang Yu
From my side, I cannot understand why you say Kylin 4 is 'very obviously'.
Can you give an example?
From the source code, the basic logic of choosing the right cube/model is
similar.

With warm regard
Xiaoxiang Yu



On Wed, Nov 1, 2023 at 3:01 PM Nam Đỗ Duy  wrote:

> Thank you for your kind reply, please answer 1 more question about version
> 5:
>
> In version 4.x we run query against a Cube very obviously, but in version
> 5, the cube usage is implicit, so can you advise: for a given query,
> which model will be used, which index (cube) will be used for this query?
>
> Thank you
>
> On Wed, Nov 1, 2023 at 1:42 PM Xiaoxiang Yu  wrote:
>
>> 1. How do I measure the size of the index (cube) in version 5?
>>You can check storage of specific Indexes from the Index page.
>>
>> https://kylin.apache.org/5.0/docs/modeling/model_design/aggregation_group#view-aggregate-index
>> or
>> https://kylin.apache.org/5.0/assets/images/index_1-6ad3f55183d4ed61962359d9408ba192.png
>>
>>
>> 2. How to create the cardinality for each column?
>>You should check this link :
>> https://kylin.apache.org/5.0/docs/datasource/data_sampling/ .
>>
>> 3. In your default project sample named SSB project, you have only 4
>> simple aggregate group index and no table index as in attached file
>> so what is the best strategy to select index for our OLAP?
>> 1. There does exist a 'Base Table Index'  by default actually, its
>> id is 201.
>> 2. I think it is a good question and Kylin 5 lacks such a guide for
>> better modeling. You are free to ask your question to
>> mailing list and I will try to reply.
>>
>> 
>> With warm regard
>> Xiaoxiang Yu
>>
>>
>>
>> On Wed, Nov 1, 2023 at 2:12 PM Xiaoxiang Yu  wrote:
>>
>>> OK, I didn't read all the mail history so I misunderstand the situation.
>>> Looks like you need to analyse
>>> the cause why the query didn't hit the cube correctly.
>>>
>>> Please generate query diagnosis package and send it to me privately. I
>>> will analyse the query log.
>>> You can refer to the following steps in screenshots.
>>>
>>> [image: image.png]
>>>
>>> If the screenshots are not displaying correctly, please read this guide :
>>>
>>> https://kylin.apache.org/5.0/docs/operations/system-operation/diagnosis/#generate-query-diagnosis-package-in-web-ui
>>>
>>> By the way, you need to analyse the cause by reading kylin.query.log,
>>> not the kylin.log,
>>> refer to https://kylin.apache.org/5.0/docs/operations/logs/system_log
>>>
>>> 
>>> With warm regard
>>> Xiaoxiang Yu
>>>
>>>
>>>
>>> On Wed, Nov 1, 2023 at 12:18 PM Nam Đỗ Duy  wrote:
>>>
>>>> Thank you Xiaoxiang for your advice. As my title email shown, I guessed
>>>> that the OLAP functionalities has not been correctly set up in my computer.
>>>>
>>>> The evidence about it is that: when I disable the Pushdown option box
>>>> to use solely the precomputation cube only, it showed following error:
>>>> Please kindly advise how to properly build the OLAP
>>>>
>>>> LIMIT 500": No realization found for OLAPContext, MODEL_UNMATCHED_JOIN, 
>>>> rel#2240:KapTableScan.OLAP.[](table=[VNEVENT_HIVE_DWH_400MILLION_ROWS, 
>>>> FACTUSEREVENT],ctx=0@null,fields=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 
>>>> 12, 13, 14, 15, 16, 17, 18, 19, 20])
>>>>
>>>>
>>>>
>>>> On Wed, Nov 1, 2023 at 10:40 AM Xiaoxiang Yu  wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Yesterday, I tried to see if query pushdown functions work well in
>>>>> the Kylin5 docker, and all of my queries return proper responses .
>>>>> After checking your logs from Shaofeng, I found these error
>>>>> messages repeated many times:
>>>>> 1. 'java.io.IOException: All datanodes DatanodeInfoWithStorage[
>>>>> 127.0.0.1:9866,DS-5093899b-06c7-4386-95d5-6fc271d92b52,DISK] are bad.
>>>>> Aborting...'
>>>>> 2. 'curator.ConnectionState : Connection timed out for connection
>>>>> string (localhost:2181) and timeout (15000) / elapsed (41794)
>>>>> org.apache.curator.CuratorConnectionLossException: KeeperErrorCode =
>>>>> ConnectionLoss'
>>>>>
>>>>> I guess the root cause 

Re: OLAP functionalities in Kylin 5.0 seems not yet working for me

2023-11-01 Thread Xiaoxiang Yu
1. How do I measure the size of the index (cube) in version 5?
   You can check the storage size of a specific index on the Index page.
https://kylin.apache.org/5.0/docs/modeling/model_design/aggregation_group#view-aggregate-index
or
https://kylin.apache.org/5.0/assets/images/index_1-6ad3f55183d4ed61962359d9408ba192.png


2. How to create the cardinality for each column?
   You should check this link :
https://kylin.apache.org/5.0/docs/datasource/data_sampling/ .

3. In your default project sample named SSB project, you have only 4 simple
aggregate group index and no table index as in attached file
so what is the best strategy to select index for our OLAP?
   1. A 'Base Table Index' does exist by default; its id is 201.
   2. I think it is a good question, and Kylin 5 lacks such a guide for
      better modeling. You are free to ask your questions on the
      mailing list and I will try to reply.
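
For question 2, "cardinality" means the number of distinct values in a column. As a rough illustration only (this is not how Kylin's data sampling is implemented, and the column names and rows below are made up), the idea can be sketched in Python:

```python
def column_cardinality(rows):
    """Return {column name: number of distinct values} for a list of dict rows."""
    distinct = {}
    for row in rows:
        for column, value in row.items():
            # Collect every value seen for this column in a set.
            distinct.setdefault(column, set()).add(value)
    return {column: len(values) for column, values in distinct.items()}


# Hypothetical sample rows; in practice the sampling is done by Kylin itself.
rows = [
    {"uid": 1, "country": "VN"},
    {"uid": 2, "country": "VN"},
    {"uid": 3, "country": "CN"},
]
cardinality = column_cardinality(rows)
print(cardinality)
```

A column such as uid, where nearly every row has a different value, is a high-cardinality column, while country is a low-cardinality one.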


With warm regard
Xiaoxiang Yu



On Wed, Nov 1, 2023 at 2:12 PM Xiaoxiang Yu  wrote:

> OK, I didn't read all the mail history so I misunderstand the situation.
> Looks like you need to analyse
> the cause why the query didn't hit the cube correctly.
>
> Please generate query diagnosis package and send it to me privately. I
> will analyse the query log.
> You can refer to the following steps in screenshots.
>
> [image: image.png]
>
> If the screenshots are not displaying correctly, please read this guide :
>
> https://kylin.apache.org/5.0/docs/operations/system-operation/diagnosis/#generate-query-diagnosis-package-in-web-ui
>
> By the way, you need to analyse the cause by reading kylin.query.log, not
> the kylin.log,
> refer to https://kylin.apache.org/5.0/docs/operations/logs/system_log
>
> ----
> With warm regard
> Xiaoxiang Yu
>
>
>
> On Wed, Nov 1, 2023 at 12:18 PM Nam Đỗ Duy  wrote:
>
>> Thank you Xiaoxiang for your advice. As my title email shown, I guessed
>> that the OLAP functionalities has not been correctly set up in my computer.
>>
>> The evidence about it is that: when I disable the Pushdown option box to
>> use solely the precomputation cube only, it showed following error: Please
>> kindly advise how to properly build the OLAP
>>
>> LIMIT 500": No realization found for OLAPContext, MODEL_UNMATCHED_JOIN, 
>> rel#2240:KapTableScan.OLAP.[](table=[VNEVENT_HIVE_DWH_400MILLION_ROWS, 
>> FACTUSEREVENT],ctx=0@null,fields=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 
>> 13, 14, 15, 16, 17, 18, 19, 20])
>>
>>
>>
>> On Wed, Nov 1, 2023 at 10:40 AM Xiaoxiang Yu  wrote:
>>
>>> Hi,
>>>
>>> Yesterday, I tried to see if query pushdown functions work well in
>>> the Kylin5 docker, and all of my queries return proper responses .
>>> After checking your logs from Shaofeng, I found these error messages
>>> repeated many times:
>>> 1. 'java.io.IOException: All datanodes DatanodeInfoWithStorage[
>>> 127.0.0.1:9866,DS-5093899b-06c7-4386-95d5-6fc271d92b52,DISK] are bad.
>>> Aborting...'
>>> 2. 'curator.ConnectionState : Connection timed out for connection
>>> string (localhost:2181) and timeout (15000) / elapsed (41794)
>>> org.apache.curator.CuratorConnectionLossException: KeeperErrorCode =
>>> ConnectionLoss'
>>>
>>> I guess the root cause is that the container didn't not have enough
>>> resources. I found you query on a table called
>>> 'XXX_hive_dwh_400million_rows', looks like you gave a complex query on a
>>> table which contains 400 million rows?
>>>
>>> Since I am the uploader of kylin5 's docker image, I want to give
>>> some explainment. Kylin5 docker is not a place for performance benchmarks,
>>> it is only for demonstration. It is only allocated with very little
>>> resources(8G memory) if you are using the default command from docker hub
>>> page. Before I uploaded my image, I only tested my image using the ssb
>>> dataset, which the biggest table only contains about 60k rows. If you are
>>> using a larger dataset and complexer queries, you have to scale the
>>> resource properly. Try querying tables which contain not more than 100k
>>> rows by default.
>>>
>>> Here are some tips which may help you to check if the daemon service
>>> is in health status and resources(particularly disk space) is configured
>>> properly.
>>>
>>> 1. Checking HDFS 's web ui(
>>> http://localhost:9870/dfshealth.html#tab-datanode ) to confirm whether
>>> HDFS service is in 'In service' status.
>>> 2. Checking Datanode 's log in
>>> `/opt

Re: OLAP functionalities in Kylin 5.0 seems not yet working for me

2023-11-01 Thread Xiaoxiang Yu
OK, I didn't read all the mail history, so I misunderstood the situation.
It looks like you need to analyse
why the query didn't hit the cube correctly.

Please generate a query diagnosis package and send it to me privately. I will
analyse the query log.
You can refer to the following steps in screenshots.

[image: image.png]

If the screenshots are not displaying correctly, please read this guide :

https://kylin.apache.org/5.0/docs/operations/system-operation/diagnosis/#generate-query-diagnosis-package-in-web-ui

By the way, you need to analyse the cause by reading kylin.query.log, not
kylin.log;
refer to https://kylin.apache.org/5.0/docs/operations/logs/system_log
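
As a small aid for reading the query log, here is a minimal Python sketch that counts the reason code following "No realization found for OLAPContext". The pattern is based only on the error text quoted in this thread, so the real layout of kylin.query.log may differ, and the function name is my own.

```python
import re
from collections import Counter

# Assumption: the reason code directly follows this phrase, as in the quoted error.
NO_REALIZATION = re.compile(r"No realization found for OLAPContext, (\w+)")


def count_unmatched_reasons(lines):
    """Count each reason code reported after 'No realization found for OLAPContext'."""
    reasons = Counter()
    for line in lines:
        match = NO_REALIZATION.search(line)
        if match:
            reasons[match.group(1)] += 1
    return reasons


sample = [
    'LIMIT 500": No realization found for OLAPContext, MODEL_UNMATCHED_JOIN, '
    "rel#2240:KapTableScan.OLAP.[]",
    "an unrelated log line",
]
reasons = count_unmatched_reasons(sample)
print(reasons)
```

To run it on a real file, pass an open file object instead of the sample list; the file location depends on your installation.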


With warm regard
Xiaoxiang Yu



On Wed, Nov 1, 2023 at 12:18 PM Nam Đỗ Duy  wrote:

> Thank you Xiaoxiang for your advice. As my title email shown, I guessed
> that the OLAP functionalities has not been correctly set up in my computer.
>
> The evidence about it is that: when I disable the Pushdown option box to
> use solely the precomputation cube only, it showed following error: Please
> kindly advise how to properly build the OLAP
>
> LIMIT 500": No realization found for OLAPContext, MODEL_UNMATCHED_JOIN, 
> rel#2240:KapTableScan.OLAP.[](table=[VNEVENT_HIVE_DWH_400MILLION_ROWS, 
> FACTUSEREVENT],ctx=0@null,fields=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 
> 13, 14, 15, 16, 17, 18, 19, 20])
>
>
>
> On Wed, Nov 1, 2023 at 10:40 AM Xiaoxiang Yu  wrote:
>
>> Hi,
>>
>> Yesterday, I tried to see if query pushdown functions work well in
>> the Kylin5 docker, and all of my queries return proper responses .
>> After checking your logs from Shaofeng, I found these error messages
>> repeated many times:
>> 1. 'java.io.IOException: All datanodes DatanodeInfoWithStorage[
>> 127.0.0.1:9866,DS-5093899b-06c7-4386-95d5-6fc271d92b52,DISK] are bad.
>> Aborting...'
>> 2. 'curator.ConnectionState : Connection timed out for connection
>> string (localhost:2181) and timeout (15000) / elapsed (41794)
>> org.apache.curator.CuratorConnectionLossException: KeeperErrorCode =
>> ConnectionLoss'
>>
>> I guess the root cause is that the container didn't not have enough
>> resources. I found you query on a table called
>> 'XXX_hive_dwh_400million_rows', looks like you gave a complex query on a
>> table which contains 400 million rows?
>>
>> Since I am the uploader of kylin5 's docker image, I want to give
>> some explainment. Kylin5 docker is not a place for performance benchmarks,
>> it is only for demonstration. It is only allocated with very little
>> resources(8G memory) if you are using the default command from docker hub
>> page. Before I uploaded my image, I only tested my image using the ssb
>> dataset, which the biggest table only contains about 60k rows. If you are
>> using a larger dataset and complexer queries, you have to scale the
>> resource properly. Try querying tables which contain not more than 100k
>> rows by default.
>>
>> Here are some tips which may help you to check if the daemon service
>> is in health status and resources(particularly disk space) is configured
>> properly.
>>
>> 1. Checking HDFS 's web ui(
>> http://localhost:9870/dfshealth.html#tab-datanode ) to confirm whether
>> HDFS service is in 'In service' status.
>> 2. Checking Datanode 's log in
>> `/opt/hadoop-3.2.1/logs/hadoop-root-datanode-Kylin5-Machine.log`, check if
>> there is any error message. Like: cat
>> /opt/hadoop-3.2.1/logs/hadoop-root-datanode-Kylin5-Machine.log | grep ERROR
>> | wc -l
>> 3. Checking if your docker engine is configured with enough disk
>> space, if you are using Docker Desktop like me,please go to "Settings" -
>> "Resources" - "Advanced", make sure you have allocated 40GB+ disk space to
>> the docker container.
>> 4. Checking the available disk space of your container by `df -h`,
>> make sure the 'Use%' of 'overlay' is less than 60% .
>> 5. Checking the load average/ cpu usage/ jvm gc. Make sure these
>> metrics are not really high when you send a query.
>> 
>> With warm regard
>> Xiaoxiang Yu
>>
>>
>>
>> On Tue, Oct 31, 2023 at 5:13 PM Nam Đỗ Duy 
>> wrote:
>>
>>> Hi ShaoFeng
>>>
>>> Thank you very much for your valuable feedback
>>>
>>> I saw the application to be there (if I see it right) as in the
>>> attachment photo. Kindly advise so that I can run this query on OLAP.
>>>
>>> PS. I sent you the log file in private.
>>>
>>> [image: i

Re: OLAP functionalities in Kylin 5.0 seems not yet working for me

2023-10-31 Thread Xiaoxiang Yu
Hi,

Yesterday, I checked whether query pushdown works well in the
Kylin 5 Docker image, and all of my queries returned proper responses.
After checking your logs from Shaofeng, I found these error messages
repeated many times:
1. 'java.io.IOException: All datanodes DatanodeInfoWithStorage[
127.0.0.1:9866,DS-5093899b-06c7-4386-95d5-6fc271d92b52,DISK] are bad.
Aborting...'
2. 'curator.ConnectionState : Connection timed out for connection
string (localhost:2181) and timeout (15000) / elapsed (41794)
org.apache.curator.CuratorConnectionLossException: KeeperErrorCode =
ConnectionLoss'

I guess the root cause is that the container did not have enough
resources. I found that you queried a table called
'XXX_hive_dwh_400million_rows'; it looks like you ran a complex query on a
table which contains 400 million rows?

Since I am the uploader of the Kylin 5 Docker image, I want to give some
explanation. The Kylin 5 Docker image is not a place for performance
benchmarks; it is only for demonstration. It is allocated very few
resources (8G memory) if you are using the default command from the Docker
Hub page. Before I uploaded my image, I only tested it using the SSB dataset,
in which the biggest table contains only about 60k rows. If you are using a
larger dataset and more complex queries, you have to scale the resources
accordingly. With the default resources, try querying tables which contain
no more than 100k rows.

Here are some tips which may help you check whether the daemon services are
healthy and whether resources (particularly disk space) are configured
properly.

1. Check the HDFS web UI (
http://localhost:9870/dfshealth.html#tab-datanode ) to confirm that the
HDFS service is in 'In service' status.
2. Check the DataNode log at
`/opt/hadoop-3.2.1/logs/hadoop-root-datanode-Kylin5-Machine.log` for
error messages. For example: grep -c ERROR
/opt/hadoop-3.2.1/logs/hadoop-root-datanode-Kylin5-Machine.log
3. Check that your Docker engine is configured with enough disk space.
If you are using Docker Desktop like me, please go to "Settings" -
"Resources" - "Advanced" and make sure you have allocated 40GB+ disk space to
the Docker container.
4. Check the available disk space of your container with `df -h`, and make
sure the 'Use%' of 'overlay' is less than 60%.
5. Check the load average, CPU usage and JVM GC. Make sure these
metrics are not very high when you send a query.
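
Tips 2 and 4 can also be scripted. The following is only a sketch in Python with helper names of my own choosing; note that `shutil.disk_usage` reports used/total, which can differ slightly from the 'Use%' column of `df` because of reserved blocks.

```python
import shutil


def count_error_lines(lines):
    """Count the lines that contain the text ERROR."""
    return sum(1 for line in lines if "ERROR" in line)


def used_percent(path="/"):
    """Percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def disk_ok(percent, limit=60.0):
    """True when usage is below the 60% guideline mentioned above."""
    return percent < limit


demo_errors = count_error_lines(["a ERROR b", "INFO ok", "ERROR again"])
demo_percent = used_percent("/")
print(demo_errors, round(demo_percent, 1), disk_ok(demo_percent))
```

Run it inside the container so that "/" refers to the container's own filesystem, and pass an open log file to count_error_lines instead of the demo list.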
----
With warm regard
Xiaoxiang Yu



On Tue, Oct 31, 2023 at 5:13 PM Nam Đỗ Duy  wrote:

> Hi ShaoFeng
>
> Thank you very much for your valuable feedback
>
> I saw the application to be there (if I see it right) as in the attachment
> photo. Kindly advise so that I can run this query on OLAP.
>
> PS. I sent you the log file in private.
>
> [image: image.png]
>
> On Tue, Oct 31, 2023 at 3:11 PM ShaoFeng Shi 
> wrote:
>
>> Can you provide the messages in logs/kylin.log when executing the SQL?
>> and you can also check the Spark UI from yarn resource manager (there
>> should be one running application called Spardar, which is Kylin's backend
>> spark application). If the application is not there, it may indicates the
>> yarn doesn't have resource to startup it.
>>
>> Best regards,
>>
>> Shaofeng Shi 史少锋
>> Apache Kylin PMC,
>> Apache Incubator PMC,
>> Email: shaofeng...@apache.org
>>
>> Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
>> Join Kylin user mail group: user-subscr...@kylin.apache.org
>> Join Kylin dev mail group: dev-subscr...@kylin.apache.org
>>
>>
>>
>>
>> On Tue, Oct 31, 2023 at 10:35 AM Nam Đỗ Duy  wrote:
>>
>>> Dear Sir/Madam,
>>>
>>> I have a fact with 500million rows then I build model, index according
>>> to the website help.
>>>
>>> I chose full incremental because this is the first times I load data
>>>
>>> I create both index types Aggregate group index, table index as photo
>>> attached.
>>>
>>> But the query always failed after timeout of 300 seconds (I run in
>>> docker), I dont want to increase the value of 300 seconds because I wish
>>> the OLAP can run within 1 minutes (is that possible?)
>>>
>>> It seems that the OLAP function in indexing not working to speedup the
>>> query by precomputed cube.
>>>
>>> Can you advise to check whether the index did really work?
>>>
>>> It is quite urgent task for me so prompt response is highly appreciated.
>>>
>>> Thank you very much
>>>
>>


Re: What is the default account name/pass for mysql in kylin docker

2023-10-30 Thread Xiaoxiang Yu
I cannot see the image correctly. I don't know if your index is ready for query.
Have you checked the manual modeling doc: 
https://kylin.apache.org/5.0/docs/modeling/manual_modeling ?







--

Best wishes to you ! 
From :Xiaoxiang Yu




At 2023-10-28 20:11:19, "Nam Đỗ Duy"  wrote:

Hello Xiaoxiang


I have created plenty of index but why it is very slow query and it still 
showed "No model index built": 






On Thu, Oct 19, 2023 at 5:10 PM Nam Đỗ Duy  wrote:

Thank you for sharing.


I have one very important question: Is the concept of Segment or Index now in 
version 5.x similar to columnar OLAP database like clickhouse or druid?


Best regards


On Thu, Oct 19, 2023 at 3:49 PM Xiaoxiang Yu  wrote:


In Kylin 5, the functions of the cube are merged into DataModel, so the power
of the multidimensional cube is kept in the Model of Kylin 5.

Here is a demo for creating and querying a model in Kylin 5:
https://people.apache.org/~xxyu/resources/How_to_create_and_query_Model_in_Kylin_5.pptx




--

Best wishes to you ! 
From :Xiaoxiang Yu





At 2023-10-18 22:38:51, "Nam Đỗ Duy"  wrote:
>Hello ShaoFeng
>
>Thank you for your email.
>
>I am a bit confused when the cube is no more.
>
>Then how can I utilize the power of multidimensional cube when querying
>data?
>
>On Tue, Oct 17, 2023 at 3:08 PM ShaoFeng Shi  wrote:
>
>> Copy to dev@kylin which is the right address:
>>
>> For question 1, yes Kylin 5 embedded the cube concept into the "data
>> model".  In the previous versions, a cube contains a fixed number of
>> cuboids. In Kylin 5, a "data model" can have a flexibile number of "index".
>> The concept of "index" is equal "cuboid".
>>
>> For question 2, due to license issue, there is no open source ODBC driver
>> for Apache Kylin any more. But you can seek third-party, for example:
>> https://kyligence.io/resources/kyligence-odbc-driver-for-apache-kylin-2/
>>
>>
>> Best regards,
>>
>> Shaofeng Shi 史少锋
>> Apache Kylin PMC,
>> Apache Incubator PMC,
>> Email: shaofeng...@apache.org
>>
>> Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
>> Join Kylin user mail group: user-subscr...@kylin.apache.org
>> Join Kylin dev mail group: dev-subscr...@kylin.apache.org
>>
>>
>>
>>
>> On Wed, Oct 11, 2023 at 7:38 PM Nam Đỗ Duy  wrote:
>>
>>> Hello ShaoFeng
>>>
>>> I download apache 5.0 docker and build the model already,
>>>
>>> 1. Is that true that you embed the Cube concept into model in version 5.0?
>>> 2. How can I find ODBC driver to connect PoweBI with kylin model version
>>> 5.0?
>>>
>>> Thank you very much
>>>
>>> On Wed, Sep 20, 2023 at 10:20 PM ShaoFeng Shi 
>>> wrote:
>>>
>>>> Hi Nam, you can check it here:
>>>>
>>>>
>>>> https://github.com/apache/kylin/blob/99574f90e735a7679d8ceed576cccfd8d750beee/build/release/all-in-one-docker/all_in_one/entrypoint.sh#L83C48-L83C50
>>>>
>>>>
>>>> Best regards,
>>>>
>>>> Shaofeng Shi 史少锋
>>>> Apache Kylin PMC,
>>>> Apache Incubator PMC,
>>>> Email: shaofeng...@apache.org
>>>>
>>>> Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
>>>> Join Kylin user mail group: user-subscr...@kylin.apache.org
>>>> Join Kylin dev mail group: dev-subscr...@kylin.apache.org
>>>>
>>>>
>>>>
>>>>
>>>> On Tue, Sep 19, 2023 at 5:21 PM Nam Đỗ Duy  wrote:
>>>>
>>>>> Dear sir/Madam
>>>>>
>>>>> I run docker pull and docker run from the following page and then how
>>>>> can I login to mysql ? please instruct me the mysql credentials
>>>>>
>>>>> apachekylin/apache-kylin-standalone - Docker Image | Docker Hub
>>>>> <https://hub.docker.com/r/apachekylin/apache-kylin-standalone>
>>>>>
>>>>> Thank you very much
>>>>>
>>>>


Re:Query timeout and pushdown

2023-10-30 Thread Xiaoxiang Yu
Did this error happen when you were using Kylin in a Docker container?







--

Best wishes to you ! 
From :Xiaoxiang Yu





At 2023-10-28 14:28:21, "Nam Đỗ Duy"  wrote:
>Dear Sir/Madam
>
>Can you please advise to pass this error which happened many times with me:
>
>The query exceeds the set time limit of 300s. Current step: Collecting
>dataset of push-down.


Re: Issue with Hive Table Synchronization in Apache Kylin

2023-10-27 Thread Xiaoxiang Yu
Could you create a ticket on Kylin's JIRA and paste your logs and screenshots 
there? It looks like your account creation request was approved today.







--

Best wishes to you ! 
From :Xiaoxiang Yu




At 2023-10-26 14:42:36, "Quốc Nguyễn Đình"  wrote:

The problem still remains the same after I add the hive-exec-2.3.7-core.jar 
into Kylin's classpath:


Here is the log file:



On Thu, Oct 26, 2023 at 9:42 AM ShaoFeng Shi  wrote:

Seems the hive-exec-2.3.7-core.jar is not on Kylin's classpath. You can
search it on the local disk, and then copy it to Kylin's lib folder as a
quick workaround.
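
The search step of this workaround can be sketched in Python as follows. The helper name and the demo directory layout are made up for illustration; on a real machine you would pass the root of your Hive or Hadoop installation, and the copy into Kylin's lib folder is left as a manual step.

```python
import fnmatch
import os
import tempfile


def find_jars(root, pattern="hive-exec-*.jar"):
    """Walk `root` and return the sorted paths of files whose name matches `pattern`."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in fnmatch.filter(filenames, pattern):
            matches.append(os.path.join(dirpath, name))
    return sorted(matches)


# Self-contained demo on a temporary directory (hypothetical layout).
with tempfile.TemporaryDirectory() as root:
    lib_dir = os.path.join(root, "hive", "lib")
    os.makedirs(lib_dir)
    for name in ("hive-exec-2.3.7-core.jar", "other.jar"):
        open(os.path.join(lib_dir, name), "w").close()
    found = [os.path.relpath(path, root) for path in find_jars(root)]
print(found)
```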

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC,
Apache Incubator PMC,
Email: shaofeng...@apache.org

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




On Thu, Oct 26, 2023 at 1:11 AM Quốc Nguyễn Đình  wrote:

> Dear Apache Kylin Team,
>
> I hope this email finds you well. I am writing to report an issue I
> encountered while trying to sync Hive tables from Apache Kylin in my Hadoop
> ecosystem. Here are the details of my setup:
>
> Hadoop Version: 2.9.2
> Apache Hive Version: 2.3.7
> Apache Kylin Version: 4.0.3
> Zookeeper: 3.8.3
>
> After successfully setting up my environment and running Kylin, I
> encountered a problem when attempting to synchronize Hive tables. The error
> message in the log is as follows:
> [image: image.png]
> This is the content of kylin.properties
> [image: image.png]
> This is what the Apache Kylin Web UI shows:
> [image: image.png]
>
> I would greatly appreciate your assistance in resolving this problem.
> Could you please provide guidance on how to overcome this issue and
> successfully synchronize Hive tables with Apache Kylin?
>
> Thank you in advance for your support. If you require any additional
> information or log files to diagnose the problem, please let me know, and I
> will provide them promptly.
>
> Looking forward to your guidance and expertise in resolving this matter.
>
> Best regards,
> Nguyen Dinh Quoc
>


Re: What is the default account name/pass for mysql in kylin docker

2023-10-19 Thread Xiaoxiang Yu
In Kylin 5, the functions of the cube are merged into DataModel, so the power
of the multidimensional cube is kept in the Model of Kylin 5.

Here is a demo for creating and querying a model in Kylin 5:
https://people.apache.org/~xxyu/resources/How_to_create_and_query_Model_in_Kylin_5.pptx




--

Best wishes to you ! 
From :Xiaoxiang Yu





At 2023-10-18 22:38:51, "Nam Đỗ Duy"  wrote:
>Hello ShaoFeng
>
>Thank you for your email.
>
>I am a bit confused when the cube is no more.
>
>Then how can I utilize the power of multidimensional cube when querying
>data?
>
>On Tue, Oct 17, 2023 at 3:08 PM ShaoFeng Shi  wrote:
>
>> Copy to dev@kylin which is the right address:
>>
>> For question 1, yes Kylin 5 embedded the cube concept into the "data
>> model".  In the previous versions, a cube contains a fixed number of
>> cuboids. In Kylin 5, a "data model" can have a flexibile number of "index".
>> The concept of "index" is equal "cuboid".
>>
>> For question 2, due to license issue, there is no open source ODBC driver
>> for Apache Kylin any more. But you can seek third-party, for example:
>> https://kyligence.io/resources/kyligence-odbc-driver-for-apache-kylin-2/
>>
>>
>> Best regards,
>>
>> Shaofeng Shi 史少锋
>> Apache Kylin PMC,
>> Apache Incubator PMC,
>> Email: shaofeng...@apache.org
>>
>> Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
>> Join Kylin user mail group: user-subscr...@kylin.apache.org
>> Join Kylin dev mail group: dev-subscr...@kylin.apache.org
>>
>>
>>
>>
>> On Wed, Oct 11, 2023 at 7:38 PM Nam Đỗ Duy  wrote:
>>
>>> Hello ShaoFeng
>>>
>>> I download apache 5.0 docker and build the model already,
>>>
>>> 1. Is that true that you embed the Cube concept into model in version 5.0?
>>> 2. How can I find ODBC driver to connect PoweBI with kylin model version
>>> 5.0?
>>>
>>> Thank you very much
>>>
>>> On Wed, Sep 20, 2023 at 10:20 PM ShaoFeng Shi 
>>> wrote:
>>>
>>>> Hi Nam, you can check it here:
>>>>
>>>>
>>>> https://github.com/apache/kylin/blob/99574f90e735a7679d8ceed576cccfd8d750beee/build/release/all-in-one-docker/all_in_one/entrypoint.sh#L83C48-L83C50
>>>>
>>>>
>>>> Best regards,
>>>>
>>>> Shaofeng Shi 史少锋
>>>> Apache Kylin PMC,
>>>> Apache Incubator PMC,
>>>> Email: shaofeng...@apache.org
>>>>
>>>> Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
>>>> Join Kylin user mail group: user-subscr...@kylin.apache.org
>>>> Join Kylin dev mail group: dev-subscr...@kylin.apache.org
>>>>
>>>>
>>>>
>>>>
>>>> On Tue, Sep 19, 2023 at 5:21 PM Nam Đỗ Duy  wrote:
>>>>
>>>>> Dear sir/Madam
>>>>>
>>>>> I run docker pull and docker run from the following page and then how
>>>>> can I login to mysql ? please instruct me the mysql credentials
>>>>>
>>>>> apachekylin/apache-kylin-standalone - Docker Image | Docker Hub
>>>>> <https://hub.docker.com/r/apachekylin/apache-kylin-standalone>
>>>>>
>>>>> Thank you very much
>>>>>
>>>>


Re:Security Vulnerability - Action Required: XXE vulnerability in org.apache.kylin:kylin-job before version 0.7.2-incubating-job

2023-09-25 Thread Xiaoxiang Yu
Hi,
Version 0.7.2-incubating was released 8 years ago, and the currently
maintained versions are Kylin 3.0+.
The version that uses C3P0ConfigXmlUtils is not a maintained version, so I
think nobody is affected.




--

Best wishes to you ! 
From :Xiaoxiang Yu





At 2023-09-21 16:54:17, "James Watt"  wrote:
>Hi there,
>I think the method
>com.mchange.v2.c3p0.cfg.C3P0ConfigXmlUtils.extractXmlConfigFromInputStream(InputStream
>is) may have an XXE vulnerability which is vulnerable in the
>org.apache.kylin:kylin-job before version 0.7.2-incubating-job. It shares
>similarities to a recent CVE disclosure CVE-2018-20433 in the
>"swaldman/c3p0" project.
> The source vulnerability information is as follows:
>
>> Vulnerability Detail:
>> CVE Identifier: CVE-2018-20433
>> c3p0 0.9.5.2 allows XXE in extractXmlConfigFromInputStream in
>> com/mchange/v2/c3p0/cfg/C3P0ConfigXmlUtils.java during initialization.
>> Reference:https://nvd.nist.gov/vuln/detail/CVE-2018-20433
>> Patch: zhutougg/c3p0@2eb0ea9
>> <https://github.com/zhutougg/c3p0/commit/2eb0ea97f745740b18dd45e4a909112d4685f87b>
>
>
>This may be caused by the fact that the version of c3p0, the component you
>rely on, has not been updated. Maybe I can submit a PR to help you update
>the version? Looking forward to your reply.
>
>Best regards,
>Yiheng Cao


Re:Is cwiki not managed for Kylin 5? Is there a material for detail design and architecture of Kylin 5?

2023-09-12 Thread Xiaoxiang Yu
Hi,

I am glad that you are interested in Kylin 5, and I am sorry that the 
Kylin team
has not provided enough material from a developer's perspective.


Q1: Is there a material to understand detail of Kylin5 ?
From a developer's perspective, there are three Chinese articles 
here (https://kylin.apache.org/5.0/blog), 
but the Kylin team has not translated them into English. I think I will do the 
translation after the next release of Kylin 5; I
am currently doing some preparation (refactoring modules, refining the release 
pipeline, metadata upgrade tools, benchmarks,
etc.) for Kylin 5.0.0.
From a user's perspective, I guess the current 
documentation (https://kylin.apache.org/5.0/docs/quickstart/overview/) 
is enough for you to understand the differences between Kylin 5 and Kylin 3, 
right? (If not, please let me know.)


Q2: Is the contribution of Kylin 5 opensource closed to others that are not in 
Kyligence team ?
That is not true. Kylin is still an Apache project, and Kylin 5 will welcome and 
merge PRs from any team 
if they are qualified.


Q3: Is cwiki not managed for Kylin 5?
Unfortunately, the Kylin team does not have enough maintainers/authors for cwiki, 
so I guess the Kylin team will not 
update cwiki in the future. Technical details will be published at 
https://kylin.apache.org/5.0/blog. 


Finally, if you are interested in the technical details, would it be possible 
for you to use Google Translate or ChatGPT 
to translate and read these articles? I am willing to hear your feedback, and 
if you do not find what you are interested in,
please let me know.




--

Best wishes to you ! 
From :Xiaoxiang Yu





At 2023-09-12 11:25:03, "이윤성"  wrote:
>Hi. I'm Yoonsung Lee in LINE corp.
>I have been using Kylin3 for last 3 years for OLAP query engine over XX PB 
>datasource.
>
>I consider to use and develop Kylin 5 for upcoming future.
>Until kylin 4, I understand core design and detail performance detail in 
>https://cwiki.apache.org/confluence/display/KYLIN
>But I cannot find any document for Kylin 5 in here.
>
>I checked these pages - 
>https://kylin.apache.org/5.0/docs/development/roadmap ,
>https://kylin.apache.org/5.0/docs/development/how_to_understand_kylin_design
>But that pages are not enough to understand the design and architecture of 
>Kylin 5.
>I want to know the core design and detail architecture of Kylin 5 beyond 
>Roadmap for planning when and how I can adopt Kylin 5 and participate in OSS 
>development.
>
>Q1: is there a material to understand detail of Kylin5 ?
>Q2: Is the contribution of Kylin 5 opensource closed to others that are not in 
>Kyligence team ?
>
>Best regards. Thanks


Re:Kylin 5.0.0 Beta Version Issue with Hadoop Conf

2023-09-08 Thread Xiaoxiang Yu
Sorry for the late reply.


Could you execute "ls -al /opt/apache-kylin-5.0.0-beta-bin/spark/jars/ | wc -l" 
and check whether the output is equal to 287?


And could you please send your log files, including:
- /opt/apache-kylin-5.0.0-beta-bin/logs/stderr
- /opt/apache-kylin-5.0.0-beta-bin/logs/stdout
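
If it is easier, the jar count check can be approximated in Python. This sketch assumes the usual `ls -al` behaviour of printing a "total" line plus the "." and ".." entries, so the line count is the number of directory entries plus 3; under that assumption, an output of 287 corresponds to 284 entries. The helper name and the demo directory are my own.

```python
import os
import tempfile


def ls_al_line_count(directory):
    """Approximate the number printed by `ls -al DIRECTORY | wc -l`."""
    # os.listdir includes hidden files but not ".", "..", or the "total" line.
    return len(os.listdir(directory)) + 3


# Self-contained demo: a temporary directory holding two empty files.
with tempfile.TemporaryDirectory() as jars_dir:
    for name in ("a.jar", "b.jar"):
        open(os.path.join(jars_dir, name), "w").close()
    demo_count = ls_al_line_count(jars_dir)
print(demo_count)
```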



--

Best wishes to you ! 
From :Xiaoxiang Yu





At 2023-09-01 23:40:24, "Singh Sonu"  wrote:
> Hi Experts,
>
>I was trying to setup the new Kylin 5.0.0 Beta version and got stuck with
>the below error:
>Kylin 5.0.0 Alpha is working completely fine in my environment.
>
>Looking for your immediate response.
>Thanks in advance.
>
>Error -
>
>Checking check-1200-hadoop-conf.sh
>
>Checking hadoop conf dir...
>WARNING: log4j.properties is not found. HADOOP_CONF_DIR may be incomplete.
>Turn on verbose mode.
>Turn on verbose mode.
>KYLIN_HOME is:/opt/apache-kylin-5.0.0-beta-bin
>KYLIN_CONFIG_FILE is:/opt/apache-kylin-5.0.0-beta-bin/conf/kylin.properties
>SPARK_HOME is:/opt/apache-kylin-5.0.0-beta-bin/spark
>KYLIN_JVM_SETTINGS is -server -Xms1g -Xmx8g -XX:+UseG1GC
>-XX:MaxGCPauseMillis=200 -XX:G1HeapRegionSize=16m -XX:+PrintFlagsFinal
>-XX:+PrintReferenceGC -verbose:gc -XX:+PrintGCDetails
>-XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+PrintAdaptiveSizePolicy
>-XX:+UnlockDiagnosticVMOptions -XX:+G1SummarizeConcMark
> -Xloggc:/opt/apache-kylin-5.0.0-beta-bin/logs/kylin.gc.%p
> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=64M
>-XX:-OmitStackTraceInFastThrow
>-Dlog4j2.contextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector
>-DAsyncLogger.RingBufferSize=8192
>KYLIN_DEBUG_SETTINGS is not set, will not enable remote debuging
>KYLIN_LD_LIBRARY_SETTINGS is not set, it is okay unless you want to specify
>your own native path
>WARNING: log4j.properties is not found. HADOOP_CONF_DIR may be incomplete.
>SPARK_HDP_VERSION is set to 'hadoop'
>/opt/apache-kylin-5.0.0-beta-bin/sbin/do-check-and-prepare-spark.sh: line
>103: /opt/apache-kylin-5.0.0-beta-bin/spark/spark_hdp_version: No such file
>or directory
>cat: /opt/apache-kylin-5.0.0-beta-bin/spark/spark_hdp_version: No such file
>or directory
>cp: cannot stat '/opt/apache-kylin-5.0.0-beta-bin/spark': No such file or
>directory
>/opt/apache-kylin-5.0.0-beta-bin/sbin/do-check-and-prepare-spark.sh: line
>172: /opt/apache-kylin-5.0.0-beta-bin/spark/spark_hdp_version: No such file
>or directory
>hadoop Spark jars exchange SUCCESS
>SPARK_HDP_VERSION save in PATH
>/opt/apache-kylin-5.0.0-beta-bin/spark/spark_hdp_version
>Export SPARK_HOME to /opt/apache-kylin-5.0.0-beta-bin/spark
>Checking hadoop config dir /opt/apache-kylin-5.0.0-beta-bin/hadoop_conf
>Exception in thread "main" java.lang.NoClassDefFoundError:
>org/apache/hadoop/conf/Configuration
>at
>org.apache.kylin.tool.hadoop.CheckHadoopConfDir.getLocalFSAndHitUGIForTheFirstTime(CheckHadoopConfDir.java:84)
>at
>org.apache.kylin.tool.hadoop.CheckHadoopConfDir.main(CheckHadoopConfDir.java:54)
>Caused by: java.lang.ClassNotFoundException:
>org.apache.hadoop.conf.Configuration
>at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
>at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
>at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
>at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
>... 2 more
>ERROR: Check HADOOP_CONF_DIR failed. Please correct hadoop configurations.
>
>
>Regards,
>Sonu


Docker Image for Kylin 5.0-beta is available now

2023-09-08 Thread Xiaoxiang Yu
The following command lets you preview Kylin 5 quickly:


docker run -d \
  --name Kylin5-Machine \
  --hostname Kylin5-Machine \
  -m 8G \
  -p 7070:7070 \
  -p 8088:8088 \
  -p 9870:9870 \
  -p 8032:8032 \
  -p 8042:8042 \
  -p 2181:2181 \
  apachekylin/apache-kylin-standalone:5.0-beta


docker logs --follow Kylin5-Machine




Please visit https://hub.docker.com/r/apachekylin/apache-kylin-standalone for 
more information. 
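
After the container starts, the services inside may need some time before they accept connections. The sketch below is one hedged way to wait for a mapped port; the helper names are hypothetical, and treating port 7070 (taken from the "-p 7070:7070" mapping above) as the Kylin web port is an assumption.

```python
import socket
import time


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_port(host, port, attempts=30, delay=2.0):
    """Poll host:port up to `attempts` times, sleeping `delay` seconds between tries."""
    for _ in range(attempts):
        if port_is_open(host, port):
            return True
        time.sleep(delay)
    return False
```

A call such as wait_for_port("localhost", 7070) only tells you that something is listening on the port; it does not confirm that Kylin itself has finished starting, so "docker logs --follow Kylin5-Machine" remains the more reliable signal.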



--

Best wishes to you ! 
From :Xiaoxiang Yu

[Announce] Apache Kylin 5.0.0-beta released

2023-08-30 Thread Xiaoxiang Yu
The Apache Kylin team is pleased to announce the immediate availability of
the 5.0.0-beta release.

This is the second release for Kylin 5, with 62 new features/improvements
and 124 bug fixes.

You can download the source release and binary packages from Apache Kylin's
download page: https://kylin.apache.org/5.0/docs/download

Apache Kylin is an open-source Distributed Analytical Data Warehouse for
Big Data; it was designed to provide OLAP (Online Analytical Processing)
capability in the big data era. By renovating the multi-dimensional cube
and precalculation technology on Hadoop and Spark, Kylin is able to achieve
near-constant query speed regardless of the ever-growing data volume.
Reducing query latency from minutes to sub-second, Kylin brings online
analytics back to big data.

Apache Kylin lets you query billions of rows at sub-second latency in 3
steps:
1. Identify a Star/Snowflake Schema on Hadoop.
2. Build Model from the identified tables.
3. Query using ANSI-SQL and get results in sub-second, via ODBC, JDBC or
RESTful API.

Thanks to everyone who has contributed to this release.

We welcome your help and feedback. For more information on how to report
problems, and to get involved, visit the project website at
https://kylin.apache.org/


With warm regard
Xiaoxiang Yu


[RESULT][VOTE] Release apache-kylin-5.0.0-beta (RC1)

2023-08-27 Thread Xiaoxiang Yu
Thanks to everyone who has tested the release candidate and given their
comments and votes.

The tally is as follows.

3 binding +1s:
Xiaoxiang Yu
Shaofeng Shi
Chunen Ni

No 0s or -1s.

Therefore I am delighted to announce that the proposal to release
Apache-Kylin-5.0.0-beta has passed.

With warm regard
Xiaoxiang Yu


Re: [VOTE] Release apache-kylin-5.0.0-beta (RC1)

2023-08-27 Thread Xiaoxiang Yu
Thanks, everyone, for voting. Let me announce the result.

With warm regard
Xiaoxiang Yu



On Thu, Aug 24, 2023 at 11:18 PM George Ni  wrote:

> +1 (binding)
>
> On Thu, Aug 24, 2023 at 22:08, ShaoFeng Shi wrote:
>
> > +1 (binding)
> >
> > I checked the source package:
> > - the sha256 hash is correct;
> > - it was signed by PMC xiaoxiang.yu;
> > - it includes the LICENSE and NOTICE file;
> > - compiles successfully with Maven 3.6.3 and Java 1.8.0_161 on macOS 11.4;
> > - tests pass with the command "bash dev-support/unit_testing.sh"
> >
> >
> > Best regards,
> >
> > Shaofeng Shi 史少锋
> > Apache Kylin PMC,
> > Apache Incubator PMC,
> > Email: shaofeng...@apache.org
> >
> > Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
> > Join Kylin user mail group: user-subscr...@kylin.apache.org
> > Join Kylin dev mail group: dev-subscr...@kylin.apache.org
> >
> >
> >
> >
> > On Wed, Aug 23, 2023 at 18:23, Xiaoxiang Yu wrote:
> >
> > > - Basic functions, including data load and query, were tested on my Hadoop
> > > cluster (3.2.1) and succeeded.
> > > - UT passed on my MacBook using "bash dev-support/unit_testing.sh" (build
> > > time was 20 minutes and test time was 1 hour).
> > > - sha256 and gpg checks succeeded.
> > >
> > >
> > > 
> > > With warm regard
> > > Xiaoxiang Yu
> > >
> > >
>
>
> --
>
> -
>
> Best regards,
>
>
>
> Ni Chunen / George
>


Re: [VOTE] Release apache-kylin-5.0.0-beta (RC1)

2023-08-23 Thread Xiaoxiang Yu
- Basic functions, including data load and query, were tested on my Hadoop
cluster (3.2.1) and succeeded.
- UT passed on my MacBook using "bash dev-support/unit_testing.sh" (build
time was 20 minutes and test time was 1 hour).
- sha256 and gpg checks succeeded.



With warm regard
Xiaoxiang Yu





[VOTE] Release apache-kylin-5.0.0-beta (RC1)

2023-08-23 Thread Xiaoxiang Yu
Hi all,


I have created a build for Apache Kylin 5.0.0-beta, release candidate 1.


Changes highlights:


[KYLIN-5465] - kylin5 Embedded Dashboard for Query and Job
[KYLIN-5521] - JoinsGraph optimization: Query SQL association order change 
causes the model to fail to hit
[KYLIN-5491] - Partial Log Governance
[KYLIN-5523] - Kylin5 supports using computable columns as Join Key and 
partition columns
[KYLIN-5544] - Provides Spark DDL & DML execution capability in GUI
[KYLIN-5564] - Introduce Bloom Filter to optimize data scanning based on Spark
[KYLIN-5651] - supports obtaining table comment from Hive
[KYLIN-5634] - Support query executor expansion and contraction
[KYLIN-5550] - Auto load table when importing models
[KYLIN-5681] - SCD2 can not work
[KYLIN-5567] - Make ke-external module obey the open source specs




Thanks to everyone who has contributed to this release.
Here are the release notes:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316121&version=12352635
 


The commit to be voted upon:
https://github.com/apache/kylin/commit/8c281e70708a1d89f35dfe4a29dba584d986add3


Its hash is 8c281e70708a1d89f35dfe4a29dba584d986add3.


The artifacts to be voted on are located here:
https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-5.0.0-beta-rc1/


The hash of the artifact is as follows:
apache-kylin-5.0.0-beta-source-release.zip.sha256 
2c81edd8269773baed4ee3243be6464bedecd33174a4dbd2169b8fbc1a31f053


apache-kylin-5.0.0-beta-bin.tar.gz.sha256
106a43580a70699d41a1769efe3ce5e4e1c93e5b2aa13421ee4cafb30e9f5880
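
Voters typically compare these published digests with a digest computed from the downloaded file. Below is a minimal sketch of that comparison using Python's standard hashlib module; the function names are hypothetical and the file path in the usage note is only an assumed local download location.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, published_hex):
    """Compare the file's digest with a published hex string (case-insensitive)."""
    return sha256_of_file(path) == published_hex.strip().lower()
```

For instance, matches_published_hash("apache-kylin-5.0.0-beta-bin.tar.gz", "106a43580a70699d41a1769efe3ce5e4e1c93e5b2aa13421ee4cafb30e9f5880") should return True for an intact download. This checks integrity only; verifying the GPG signature is a separate step.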


A staged Maven repository is available for review at:
https://repository.apache.org/content/repositories/orgapachekylin-/


Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/xxyu.asc


Please vote on releasing this package as Apache Kylin 5.0.0-beta.


The vote is open for the next 72 hours and passes if a majority of
at least three +1 PMC votes are cast.


[ ] +1 Release this package as Apache Kylin 5.0.0-beta
[ ]  0 I don't feel strongly about it, but I'm okay with the release
[ ] -1 Do not release this package because…
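
The passing rule stated above can be written down as a small check. This is a sketch of one reading of that rule (at least three binding +1 votes, and more binding +1 votes than binding -1 votes); the function name is hypothetical, and the PMC's judgment is still needed for withdrawn or changed votes.

```python
def vote_passes(binding_plus_ones, binding_minus_ones):
    """Return True when there are at least three binding +1 votes
    and the binding +1 votes outnumber the binding -1 votes."""
    return binding_plus_ones >= 3 and binding_plus_ones > binding_minus_ones
```

Under this reading, three binding +1s with no -1s passes, while two binding +1s does not, regardless of how many non-binding votes are cast.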


Here is my vote:


+1 (binding)

--

Best wishes to you ! 
From :Xiaoxiang Yu

[Announce] New committer: Qian Xia(lauraxia)

2023-08-15 Thread Xiaoxiang Yu
The Project Management Committee (PMC) for Apache Kylin
has invited Qian Xia(lauraxia) to become a committer and we are pleased to 
announce that she has accepted.

Being a committer enables easier contribution to the
project since there is no need to go via the patch
submission process. This should enable better productivity.
--
Best wishes to you ! 
From :Xiaoxiang Yu

Re:Get access denied when download dependencies

2023-08-08 Thread Xiaoxiang Yu
Hi,
Sorry, I am using the 'kylin-5-beta-release' branch to prepare a release candidate for
5.0.0-beta, and the branch is not ready for public review yet.
I will post on the dev mailing list when the release candidate for 5.0.0-beta is
ready.




--

Best wishes to you ! 
From :Xiaoxiang Yu





At 2023-08-07 10:18:34, "Nguyên Nguyễn Phúc"  
wrote:
>I'm trying to build a kylin 5 binary from the source branch
>kylin-5-beta-release. I get this access denied when downloading from the
>url
>https://s3.cn-north-1.amazonaws.com.cn/download-resource/kyspark/spark-newten-3.2.0-4.6.9.0.tgz.
>Are there any quick fixes on this issue?
>
>Thanks,
>Nguyen


[Announce] Apache Kylin 5.0.0-alpha released

2023-04-26 Thread Xiaoxiang Yu
The Apache Kylin team is pleased to announce the immediate availability of
the 5.0.0-alpha release.

This is the first release for Kylin 5, with 89 new features/improvements
and 112 bug fixes.

You can download the source release and binary packages from Apache Kylin's
download page: https://kylin.apache.org/5.0/docs/download

Apache Kylin is an open-source Distributed Analytical Data Warehouse for
Big Data; it was designed to provide OLAP (Online Analytical Processing)
capability in the big data era. By renovating the multi-dimensional cube
and precalculation technology on Hadoop and Spark, Kylin is able to achieve
near-constant query speed regardless of the ever-growing data volume.
Reducing query latency from minutes to sub-second, Kylin brings online
analytics back to big data.

Apache Kylin lets you query billions of rows at sub-second latency in 3
steps:
1. Identify a Star/Snowflake Schema on Hadoop.
2. Build Model from the identified tables.
3. Query using ANSI-SQL and get results in sub-second, via ODBC, JDBC or
RESTful API.

Thanks to everyone who has contributed to this release.

We welcome your help and feedback. For more information on how to report
problems, and to get involved, visit the project website at
https://kylin.apache.org/



With warm regard
Xiaoxiang Yu


[RESULT][VOTE] Release apache-kylin-5.0.0-alpha (RC3)

2023-04-25 Thread Xiaoxiang Yu
Thanks to everyone who has tested the release candidate and given
their comments and votes.

The tally is as follows.

4 binding +1s:
Xiaoxiang Yu, Shaofeng Shi, Li Yang, George Ni

0 non-binding +1s:

No 0s or -1s.

Therefore I am delighted to announce that the proposal to release
Apache-Kylin-5.0.0-alpha has passed.

With warm regard
Xiaoxiang Yu (http://people.apache.org/~xxyu/)


Re: [VOTE] Release apache-kylin-5.0.0-alpha (RC3)

2023-04-25 Thread Xiaoxiang Yu
Thanks, everyone, for your suggestions and votes. Let me summarize and send
the result.

With warm regard
Xiaoxiang Yu (http://people.apache.org/~xxyu/)



On Mon, Apr 24, 2023 at 1:58 PM Li Yang  wrote:

> +1 (binding)
>
> Long anticipated. 
>
> On Sun, Apr 23, 2023 at 11:53 PM George Ni  wrote:
>
> > +1 (binding)
> >
> > On Sun, Apr 23, 2023 at 22:12, ShaoFeng Shi wrote:
> >
> > > +1 (binding)
> > >
> > > I checked:
> > > - LICENSE, NOTICE, README.md are included;
> > > - All source files have the right header; no binary files are included;
> > > - "mvn clean package -DskipTests" succeeds on Mac with JDK 1.8;
> > > - "dev-support/unit_testing.sh" runs successfully in docker
> > > "apachekylin/release-machine";
> > > - Signature is by Xiaoxiang Yu;
> > > - SHA256 is correct;
> > >
> > > Best regards,
> > >
> > > Shaofeng Shi 史少锋
> > > Apache Kylin PMC,
> > > Apache Incubator PMC,
> > > Email: shaofeng...@apache.org
> > >
> > > Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
> > > Join Kylin user mail group: user-subscr...@kylin.apache.org
> > > Join Kylin dev mail group: dev-subscr...@kylin.apache.org
> > >
> > >
> > >
> > >
> >
> > --
> >
> > -
> >
> > Best regards,
> >
> >
> >
> > Ni Chunen / George
> >
>


[VOTE] Release apache-kylin-5.0.0-alpha (RC3)

2023-04-22 Thread Xiaoxiang Yu
Hi all,

I have created a build for Apache Kylin 5.0.0-alpha, release candidate 3.

Changes highlights:

[KYLIN-5216] - Upgrade metadata and engine : Kylin 5.0
[KYLIN-5397] - Support sum_lc function
[KYLIN-5387] - Index planner phase 1 for kylin 5
[KYLIN-5443] - Optimize the ER diagram of the model list page
[KYLIN-5459] - Partial Log Governance
[KYLIN-5374] - Provides Spark DDL & DML execution capability in GUI
[KYLIN-5309] - Propose more flexible runtime join scenarios for Kylin
[KYLIN-5297] - Add Kylin5 Jdbc module
[KYLIN-5390] - Build tasks support segment coverage
[KYLIN-5417] - streaming custom data parser
[KYLIN-5235] - Add doc for How to write document
[KYLIN-5236] - Add doc for how to contribute
[KYLIN-5249] - Add node status notification on web ui
[KYLIN-5256] - Add a cache for the system property get by the optional
config in KylinConfigBase
[KYLIN-5274] - Improve performance of getSubstitutor
[KYLIN-5361] - suggest set email content from hard code to configurable
files


Thanks to everyone who has contributed to this release.

Here are the release notes:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316121&version=12352155

The commit to be voted upon:
https://github.com/apache/kylin/commit/4aef5cbce3313d40851470a42bff3b5a6827f609
Its hash is 4aef5cbce3313d40851470a42bff3b5a6827f609.

The artifacts to be voted on, including the source package and one
pre-compiled binary package, are located here:
https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-5.0.0-alpha-rc3/

The hashes of the artifacts are as follows:
apache-kylin-5.0.0-alpha-source-release.zip.sha256
0bafc66e9e9facdb10bdb0433cc47d69e3434b0801dc8b47b28dcbece65092d9

apache-kylin-5.0.0-alpha-bin.tar.gz.sha256
6b778ac63c6470fd02779aca27280e27aa29a4a5c4db62593f684de913a59d41

A staged Maven repository is available for review at:
https://repository.apache.org/content/repositories/orgapachekylin-1109/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/xxyu.asc




Please vote on releasing this package as Apache Kylin 5.0.0-alpha.

The vote is open for the next 72 hours and passes if a majority of
at least three +1 PMC votes are cast.

[ ] +1 Release this package as Apache Kylin 5.0.0-alpha

[ ] 0 I don't feel strongly about it, but I'm okay with the release

[ ] -1 Do not release this package because...


Here is my vote:

+1 (binding)



With warm regard
Xiaoxiang Yu (http://people.apache.org/~xxyu/)


[CANCELLED][VOTE] Release apache-kylin-5.0.0-alpha (RC2)

2023-04-22 Thread Xiaoxiang Yu
This vote is cancelled because it did not receive enough votes.

With warm regard
Xiaoxiang Yu (http://people.apache.org/~xxyu/)


[VOTE] Release apache-kylin-5.0.0-alpha (RC2)

2023-04-12 Thread Xiaoxiang Yu
Hi all,

I have created a build for Apache Kylin 5.0.0-alpha, release candidate 2.

Changes highlights:

[KYLIN-5216] - Upgrade metadata and engine : Kylin 5.0
[KYLIN-5397] - Support sum_lc function
[KYLIN-5387] - Index planner phase 1 for kylin 5
[KYLIN-5443] - Optimize the ER diagram of the model list page
[KYLIN-5459] - Partial Log Governance
[KYLIN-5374] - Provides Spark DDL & DML execution capability in GUI
[KYLIN-5309] - Propose more flexible runtime join scenarios for Kylin
[KYLIN-5297] - Add Kylin5 Jdbc module
[KYLIN-5390] - Build tasks support segment coverage
[KYLIN-5417] - streaming custom data parser
[KYLIN-5235] - Add doc for How to write document
[KYLIN-5236] - Add doc for how to contribute
[KYLIN-5249] - Add node status notification on web ui
[KYLIN-5256] - Add a cache for the system property get by the optional
config in KylinConfigBase
[KYLIN-5274] - Improve performance of getSubstitutor
[KYLIN-5361] - suggest set email content from hard code to configurable
files


Thanks to everyone who has contributed to this release.

Here are the release notes:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316121&version=12352155

The commit to be voted upon:
https://github.com/apache/kylin/commit/32dfef3fd6e0efc283b89ff6d5750feafc0deef4
Its hash is 32dfef3fd6e0efc283b89ff6d5750feafc0deef4.

The artifacts to be voted on, including the source package and one
pre-compiled binary package, are located here:
https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-5.0.0-alpha-rc2/

The hashes of the artifacts are as follows:
apache-kylin-5.0.0-alpha-source-release.zip.sha256
1a24d864b8b637d73a9befdaf7987ff80de47b10764c8c2457eec6ee7575721c

apache-kylin-5.0.0-alpha-bin.tar.gz.sha256
6b778ac63c6470fd02779aca27280e27aa29a4a5c4db62593f684de913a59d41

A staged Maven repository is available for review at:
https://repository.apache.org/content/repositories/orgapachekylin-1107/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/xxyu.asc




Please vote on releasing this package as Apache Kylin 5.0.0-alpha.

The vote is open for the next 72 hours and passes if a majority of
at least three +1 PMC votes are cast.

[ ] +1 Release this package as Apache Kylin 5.0.0-alpha

[ ] 0 I don't feel strongly about it, but I'm okay with the release

[ ] -1 Do not release this package because...


Here is my vote:

+1 (binding)


With warm regard
Xiaoxiang Yu


Re: Create EMR Kylin cluster using metadata and cubes stored in s3

2023-04-10 Thread Xiaoxiang Yu
Hi,

First, Kylin 4 no longer depends on HBase, so you do not need HBase to be
included in EMR.
I suggest you use S3 as cube storage and AWS RDS as metadata storage.
(Actually, Kylin 4 only supports an RDBMS as metadata storage.)

Here is a step-by-step guide that shows how to install Kylin 4 on EMR 5.33.
It is written in Chinese, but you could use DeepL to translate it.
https://blog.csdn.net/mukvintt/article/details/120152854

Besides that, Kylin 5 is close to release and has several advantages over
Kylin 4, so you may want to give it a try.
https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-5.0.0-alpha-rc1/



Xiaoxiang Yu, Apache Kylin PMC
http://people.apache.org/~xxyu/



On Tue, Apr 11, 2023 at 8:24 AM Rodriguez, Gabriela <
gabriela.rodrig...@dowjones.com> wrote:

> Good afternoon,
>
> Currently, our team is evaluating the usage of Apache Kylin,
> we are facing some issues and we would like to get some guidance.
>
> We have installed Kylin v4.0.3 in an EMR cluster (v 6.5.0) and we are storing the
> metadata and kylin cubes inside s3 buckets, these are the properties set in
> conf/kylin.properties to store the metadata in our bucket:
>
> kylin.env.hdfs-working-dir=s3://BUCKET/kylin
> kylin.storage.hbase.cluster-fs=s3://BUCKET/storage
>
> Also we have added this property when creating the cluster:
>
> "hbase.rootdir": "s3://BUCKET/hbase/data",
>
> Whenever we want to create a new EMR kylin cluster, how can we create it
> using the metadata and cubes stored in s3?
>
> These are the versions used:
>
> hadoop 3.2.1
> hive 3.1.2
> spark 3.1.2,
> hbase 2.4.4
> zookeeper 3.5.7
>
> Regards,
> Gabriela Rodriguez
>


[CANCELLED][VOTE] Release apache-kylin-5.0.0-alpha (RC1)

2023-04-07 Thread Xiaoxiang Yu
This vote is cancelled because the source package does not comply with Apache release rules.

Xiaoxiang Yu, Apache Kylin PMC
http://people.apache.org/~xxyu/


Re: [VOTE] Release apache-kylin-5.0.0-alpha (RC1)

2023-04-07 Thread Xiaoxiang Yu
Thanks, I will cancel this vote.

Xiaoxiang Yu, Apache Kylin PMC
http://people.apache.org/~xxyu/



On Fri, Apr 7, 2023 at 2:56 PM ShaoFeng Shi  wrote:

> Thanks Xiaoxiang for preparing this.
>
> However, I have to give a -1, as I found several binary files included in the
> source package:
>
> - build/async-profiler-lib has several .so binary files;
> - jdbc/libs has several log4j jar files;
> - spark-project/examples has a zip file;
> - dev-support/local/images has several PNG files;
>
> Besides, the "kylin-it" and "examples" folders are larger than 100MB, which
> makes the source release package larger than 100MB; that is not reasonable. They
> seem to include a lot of sample data. If they are only for demo or test
> purposes, try to exclude them from the source release.
>
>
> Best regards,
>
> Shaofeng Shi 史少锋
> Apache Kylin PMC,
> Apache Incubator PMC,
> Email: shaofeng...@apache.org
>
> Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
> Join Kylin user mail group: user-subscr...@kylin.apache.org
> Join Kylin dev mail group: dev-subscr...@kylin.apache.org
>
>
>
>
> On Fri, Apr 7, 2023 at 14:25, Xiaoxiang Yu wrote:
>
> > 1. UT has passed in my test env.
> > For some reason, running the UT requires some parameters, so I used the script
> > dev-support/unit_testing.sh
> > (https://github.com/apache/kylin/blob/kylin-5.0.0-alpha/dev-support/unit_testing.sh)
> > in a docker container (https://hub.docker.com/r/apachekylin/release-machine).
> >
> >
> > 2. Happy path (including modeling, query, etc.) passed in my Hadoop
> > env (Hadoop 3.2.1, Hive 3.1.2).
> > Please note: before starting Kylin by executing 'bin/kylin.sh start', please
> > execute 'sbin/download-spark-user.sh' first.
> > The current release candidate binary requires a specific Spark, which is
> > downloaded by sbin/download-spark-user.sh.
> >
> >
> >
> > --
> >
> > Best wishes to you !
> > From :Xiaoxiang Yu
> >
> >
> >
> >
> >
> > At 2023-04-07 12:10:00, "Xiaoxiang Yu"  wrote:
> > >Hi all,
> > >
> > >I have created a build for Apache Kylin 5.0.0-alpha, release candidate 1.
> > >Changes highlights:
> > >
> > >[KYLIN-5216] - Upgrade metadata and engine : Kylin 5.0
> > >[KYLIN-5397] - Support sum_lc function
> > >[KYLIN-5387] - Index planner phase 1 for kylin 5
> > >[KYLIN-5443] - Optimize the ER diagram of the model list page
> > >[KYLIN-5459] - Partial Log Governance
> > >[KYLIN-5374] - Provides Spark DDL & DML execution capability in GUI
> > >[KYLIN-5309] - Propose more flexible runtime join scenarios for Kylin
> > >[KYLIN-5297] - Add Kylin5 Jdbc module
> > >[KYLIN-5390] - Build tasks support segment coverage
> > >[KYLIN-5417] - streaming custom data parser
> > >[KYLIN-5235] - Add doc for How to write document
> > >[KYLIN-5236] - Add doc for how to contribute
> > >[KYLIN-5249] - Add node status notification on web ui
> > >[KYLIN-5256] - Add a cache for the system property get by the optional
> > >config in KylinConfigBase
> > >[KYLIN-5274] - Improve performance of getSubstitutor
> > >[KYLIN-5361] - suggest set email content from hard code to configurable
> > >files
> > >
> > >
> > >
> > >Thanks to everyone who has contributed to this release.
> > >
> > >Here are the release notes:
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316121=12352155
> > >
> > >The commit to being voted upon:
> > >
> >
> https://github.com/apache/kylin/commit/efc9170e2f068dfe925007b11310efc32461b68e
> > >Its hash is efc9170e2f068dfe925007b11310efc32461b68e.
> > >
> > >The artifacts to be voted on, including the source package and one
> > >pre-compiled binary package are located here:
> > >
> >
> https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-5.0.0-alpha-rc1/
> > >
> > >The hash of the artifacts are as follows:
> > >apache-kylin-5.0.0-alpha-source-release.zip.sha256
> > >1f5742861b487229355343815a0373da87e05f7203de1f729a50a0858f0c8204
> > >
> > >apache-kylin-5.0.0-alpha-bin.tar.gz.sha256
> > >5af01b182396533e7bccebb1b1aef7135228f2f64b2ba2017447e7335b5a40b4
> > >
> > >A staged Maven repository is available for review at:
> > >https://repository.apache.org/content/repositories/orgapachekylin-1106/
> > >
> > >Release artifacts are signed with the following key:
> > >https://people.apache.org/keys/committer/xxyu.asc
> > >
> > >
> > >
> > >Please vote on releasing this package as Apache Kylin 5.0.0-alpha.
> > >
> > >The vote is open for the next 72 hours and passes if a majority of
> > >at least three +1 PMC votes are cast.
> > >
> > >[ ] +1 Release this package as Apache Kylin 5.0.0-alpha
> > >[ ] 0 I don't feel strongly about it, but I'm okay with the release
> > >[ ] -1 Do not release this package because...
> > >
> > >Here is my vote:
> > >+1 (binding)
> > >
> > >Xiaoxiang Yu, Apache Kylin PMC
> > >http://people.apache.org/~xxyu/
> >
>


Re:[VOTE] Release apache-kylin-5.0.0-alpha (RC1)

2023-04-07 Thread Xiaoxiang Yu
1. Unit tests (UT) passed in my test environment.
Running the UT requires some parameters, so I used the script
dev-support/unit_testing.sh
(https://github.com/apache/kylin/blob/kylin-5.0.0-alpha/dev-support/unit_testing.sh)
in the Docker container (https://hub.docker.com/r/apachekylin/release-machine).


2. The happy path (including modeling, querying, etc.) passed in my Hadoop
environment (Hadoop 3.2.1, Hive 3.1.2).
Please note: before starting Kylin with 'bin/kylin.sh start', please
run 'sbin/download-spark-user.sh' first.
The current release candidate binary requires a specific Spark build, which is
downloaded by sbin/download-spark-user.sh.




--

Best wishes to you ! 
From :Xiaoxiang Yu





At 2023-04-07 12:10:00, "Xiaoxiang Yu"  wrote:
>Hi all,
>
>I have created a build for Apache Kylin 5.0.0-alpha, release candidate 1.
>Changes highlights:
>
>[KYLIN-5216] - Upgrade metadata and engine : Kylin 5.0
>[KYLIN-5397] - Support sum_lc function
>[KYLIN-5387] - Index planner phase 1 for kylin 5
>[KYLIN-5443] - Optimize the ER diagram of the model list page
>[KYLIN-5459] - Partial Log Governance
>[KYLIN-5374] - Provides Spark DDL & DML execution capability in GUI
>[KYLIN-5309] - Propose more flexible runtime join scenarios for Kylin
>[KYLIN-5297] - Add Kylin5 Jdbc module
>[KYLIN-5390] - Build tasks support segment coverage
>[KYLIN-5417] - streaming custom data parser
>[KYLIN-5235] - Add doc for How to write document
>[KYLIN-5236] - Add doc for how to contribute
>[KYLIN-5249] - Add node status notification on web ui
>[KYLIN-5256] - Add a cache for the system property get by the optional
>config in KylinConfigBase
>[KYLIN-5274] - Improve performance of getSubstitutor
>[KYLIN-5361] - suggest set email content from hard code to configurable
>files
>
>
>
>Thanks to everyone who has contributed to this release.
>
>Here are the release notes:
>https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316121=12352155
>
>The commit to being voted upon:
>https://github.com/apache/kylin/commit/efc9170e2f068dfe925007b11310efc32461b68e
>Its hash is efc9170e2f068dfe925007b11310efc32461b68e.
>
>The artifacts to be voted on, including the source package and one
>pre-compiled binary package are located here:
>https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-5.0.0-alpha-rc1/
>
>The hash of the artifacts are as follows:
>apache-kylin-5.0.0-alpha-source-release.zip.sha256
>1f5742861b487229355343815a0373da87e05f7203de1f729a50a0858f0c8204
>
>apache-kylin-5.0.0-alpha-bin.tar.gz.sha256
>5af01b182396533e7bccebb1b1aef7135228f2f64b2ba2017447e7335b5a40b4
>
>A staged Maven repository is available for review at:
>https://repository.apache.org/content/repositories/orgapachekylin-1106/
>
>Release artifacts are signed with the following key:
>https://people.apache.org/keys/committer/xxyu.asc
>
>
>
>Please vote on releasing this package as Apache Kylin 5.0.0-alpha.
>
>The vote is open for the next 72 hours and passes if a majority of
>at least three +1 PMC votes are cast.
>
>[ ] +1 Release this package as Apache Kylin 5.0.0-alpha
>[ ] 0 I don't feel strongly about it, but I'm okay with the release
>[ ] -1 Do not release this package because...
>
>Here is my vote:
>+1 (binding)
>
>Xiaoxiang Yu, Apache Kylin PMC
>http://people.apache.org/~xxyu/


[VOTE] Release apache-kylin-5.0.0-alpha (RC1)

2023-04-06 Thread Xiaoxiang Yu
Hi all,

I have created a build for Apache Kylin 5.0.0-alpha, release candidate 1.
Changes highlights:

[KYLIN-5216] - Upgrade metadata and engine : Kylin 5.0
[KYLIN-5397] - Support sum_lc function
[KYLIN-5387] - Index planner phase 1 for kylin 5
[KYLIN-5443] - Optimize the ER diagram of the model list page
[KYLIN-5459] - Partial Log Governance
[KYLIN-5374] - Provides Spark DDL & DML execution capability in GUI
[KYLIN-5309] - Propose more flexible runtime join scenarios for Kylin
[KYLIN-5297] - Add Kylin5 Jdbc module
[KYLIN-5390] - Build tasks support segment coverage
[KYLIN-5417] - streaming custom data parser
[KYLIN-5235] - Add doc for How to write document
[KYLIN-5236] - Add doc for how to contribute
[KYLIN-5249] - Add node status notification on web ui
[KYLIN-5256] - Add a cache for the system property get by the optional
config in KylinConfigBase
[KYLIN-5274] - Improve performance of getSubstitutor
[KYLIN-5361] - suggest set email content from hard code to configurable
files



Thanks to everyone who has contributed to this release.

Here are the release notes:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316121&version=12352155

The commit being voted upon:
https://github.com/apache/kylin/commit/efc9170e2f068dfe925007b11310efc32461b68e
Its hash is efc9170e2f068dfe925007b11310efc32461b68e.

The artifacts to be voted on, including the source package and one
pre-compiled binary package are located here:
https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-5.0.0-alpha-rc1/

The hashes of the artifacts are as follows:
apache-kylin-5.0.0-alpha-source-release.zip.sha256
1f5742861b487229355343815a0373da87e05f7203de1f729a50a0858f0c8204

apache-kylin-5.0.0-alpha-bin.tar.gz.sha256
5af01b182396533e7bccebb1b1aef7135228f2f64b2ba2017447e7335b5a40b4
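
For reviewers checking the artifacts, a minimal sketch of comparing a downloaded file against a published SHA-256 digest might look like the following. The function names and the local file path are assumptions for illustration; this is not part of any Kylin tooling.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in fixed-size chunks so large archives do not need to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, published_hex):
    """Compare the file's digest with a published hex digest (case-insensitive)."""
    expected = published_hex.strip().lower()
    return hmac.compare_digest(sha256_of_file(path), expected)
```

For example, assuming the binary package was saved in the current directory, `matches_published_hash("apache-kylin-5.0.0-alpha-bin.tar.gz", "5af01b182396533e7bccebb1b1aef7135228f2f64b2ba2017447e7335b5a40b4")` should return True for an intact download.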

A staged Maven repository is available for review at:
https://repository.apache.org/content/repositories/orgapachekylin-1106/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/xxyu.asc



Please vote on releasing this package as Apache Kylin 5.0.0-alpha.

The vote is open for the next 72 hours and passes if a majority of
at least three +1 PMC votes are cast.

[ ] +1 Release this package as Apache Kylin 5.0.0-alpha
[ ] 0 I don't feel strongly about it, but I'm okay with the release
[ ] -1 Do not release this package because...
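
As an illustration of the passing rule stated above, here is a small sketch of a tally check. It assumes the rule means "at least three binding +1 votes, and more binding +1 votes than binding -1 votes"; that reading, and the function name, are assumptions rather than an official definition.

```python
def release_vote_passes(binding_votes):
    """Decide a release vote from binding (PMC) votes.

    binding_votes: iterable of +1, 0 or -1.
    Assumed rule: at least three +1 votes and more +1 than -1 votes.
    """
    votes = list(binding_votes)
    plus_ones = sum(1 for v in votes if v == 1)
    minus_ones = sum(1 for v in votes if v == -1)
    return plus_ones >= 3 and plus_ones > minus_ones
```

Votes of 0 are counted in neither direction, and non-binding votes are simply left out of the input.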

Here is my vote:
+1 (binding)
----
Xiaoxiang Yu, Apache Kylin PMC
http://people.apache.org/~xxyu/


Re:Question about install Kylin in CDH7.1.7

2023-03-22 Thread Xiaoxiang Yu
Could you please paste the text of the error message into the email?




--

Best wishes to you ! 
From :Xiaoxiang Yu




At 2023-03-20 17:46:03, "Chu, Lea"  wrote:

Hi Kylin developers,

Greetings from Lea, a data engineer at Garmin Taiwan.

According to the official Kylin website, the newest CDH version that has passed 
installation tests is CDH 6.3.2, but our Hadoop cluster runs on Cloudera Private 
Cloud 7.1.7. I still tried to install the Kylin 4.0.0 binary package on the 
master node in the CDH 7.1.7 environment. Unfortunately, error messages appeared 
when I tried to build a cube, as shown below. It seems that the 
“hadoop-common” jar only supports versions up to 3.0.0, not 3.1.1.

Have any developers faced this situation before, or can anyone give me some 
advice about this?

I’m looking forward to your replies. Thank you.

Regards,
Lea Chu
Mail: lea@garmin.com


Re: [VOTE] [Kylin] Accept donation of Kylin New Modeling System

2023-03-02 Thread Xiaoxiang Yu
Thanks for the notification.


The file has been moved to 
https://github.com/apache/kylin/blob/doc5.0/website/blog/2022-12-18-Introduction_of_Metadata/protocol-buffer/metadata.proto
.


There is also a technical article (in Chinese) written by Pengfei, 
https://kylin.apache.org/5.0/blog/introduction_of_metadata_cn, which covers the 
details of the "new metadata design".




--

Best wishes to you ! 
From :Xiaoxiang Yu





At 2023-03-03 10:17:08, "Cheng Pan"  wrote:
> +1 (non-binding), glad to see the Apache Kylin community becomes active
>again.
>
>BTW, the link of "New metadata design"[1] in README seems broken?
>
>[1]
>https://github.com/apache/kylin/blob/kylin5/document/protocol-buffer/metadata.proto
>
>Thanks,
>Cheng Pan
>
>
>On Mar 2, 2023 at 21:54:18, ShaoFeng Shi  wrote:
>
>> The Apache Kylin PMC has voted [1] to accept the donation of the
>> new modeling system.
>>
>> The donation was initially a commit [2] on a separate branch; after review
>> and collaboration with the community over the past several months, the
>> latest version is at branch "kylin5" [3].
>>
>> The code was developed by Kyligence, which has filed a signed CCLA with the
>> secretary, and all contributors to the codebase have signed ICLAs.
>>
>> Here is the IP clearance form:
>>
>> https://incubator.apache.org/ip-clearance/kylin-new-modeling-system.html
>>
>>
>> This lazy consensus vote will be open for at least 72 hours.
>>
>>
>> [1] https://lists.apache.org/thread/2kzld4glj50ojshhpx3m63zsvxm8ps5l
>> [2]
>>
>> https://github.com/apache/kylin/commit/edab8698b6a9770ddc4cd00d9788d718d032b5e8
>> [3] https://github.com/apache/kylin/commits/kylin5
>>
>> Best regards,
>>
>> Shaofeng Shi 史少锋
>> Apache Kylin PMC,
>> Apache Incubator PMC,
>> Email: shaofeng...@apache.org
>>
>> Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
>> Join Kylin user mail group: user-subscr...@kylin.apache.org
>> Join Kylin dev mail group: dev-subscr...@kylin.apache.org
>>


[RESULT][VOTE] Accept donation of new codebase as Apache Kylin 5.0

2023-03-02 Thread Xiaoxiang Yu
Thanks to everyone who has tested the release candidate and given their 
comments and votes.

The tally is as follows.

4 binding +1s:

Xiaoxiang Yu
Yanghong Zhong
Chunen Ni
Shaofeng Shi

No 0s or -1s.

Therefore I am delighted to announce that the proposal has passed.

- https://lists.apache.org/thread/2kzld4glj50ojshhpx3m63zsvxm8ps5l

--

Best wishes to you ! 
From :Xiaoxiang Yu

Re: Support DDL and DML for kylin5

2023-03-02 Thread Xiaoxiang Yu
Thanks for the reply. My colleagues and I think it is a good feature. I have 
two more questions.


1. I think it (KYLIN-5466) is not a small task. Will you divide it into several 
smaller tasks (or PRs) in your ticket? Will you set an ETA for this feature?
2. Have you decided how to implement your new feature? Will you provide some 
technical details in your Google doc? 


Thx.

--

Best wishes to you ! 
From :Xiaoxiang Yu





At 2023-03-02 12:59:28, "Yang Jiang"  wrote:
>Hi Xiaoxiang:
>
>  Thanks for the comment. 
>  > 1. Will this DDL provide a way to add/edit/delete a RulebaseIndex (aggregate 
> group) for a Model?   
>  Not supported for now, but that does not prevent adding it in future development.
>  > 2. Will this DDL provide a way to add model-level config? (Hive provides such 
> grammar: TBLPROPERTIES (property_name=property_value, ...))
> It will be supported in  `ALTER `  
>
>In our environment, the most common use cases are:
>* show job status
>  SHOW JOB job_id;
>* trigger build 
>  INSERT PARTITION project_name.model_test (2023-01-01, 2023-02-10);
>* create model
>
>* delete model
>  DELETE MODEL project_name.model_name;
>* cancel job
>  CANCEL JOB job_id;
> 
>
>On 2023/03/02 02:16:15 Xiaoxiang Yu wrote:
>> Hi Yang Jiang,
>> 
>> 
>> I have check this google 
>> doc(https://docs.google.com/document/d/1Wa5Ih-rKi2Uqqg8cLfUqwuZBEfEryMwXcD55ezER_Fg
>>  ) .I think it is a good feature to me. Here is my questions:
>> 1. Will this DDL provide way to add/edit/delete RulebaseIndex(aggregate 
>> group) for Model?
>> 2. Will this DDL provide way to add model level config (Hive provide such 
>> grammar : TBLPROPERTIES (property_name=property_value, ...) ) 
>> 
>> 
>> 
>> --
>> 
>> Best wishes to you ! 
>> From :Xiaoxiang Yu
>> 
>> 
>> 
>> 
>> 
>> At 2023-03-02 09:25:30, "Yang Jiang"  wrote:
>> >Hi Shaofeng:
>> >   Currently not, kylin will only read meta data form hive. Only create 
>> > meta data in kylin system like `model`, `job` ...
>> > 
>> >On 2023/03/01 11:57:32 ShaoFeng Shi wrote:
>> >> Hi Yang,
>> >> 
>> >> This is cool. Currently Kylin only sync the table metadata from Hive
>> >> metastore, which increases the complexity for the user. This proposal will
>> >> make it easier I think.
>> >> 
>> >> A quick question: if support this feature, will Kylin write back the 
>> >> schema
>> >> info to Hive? Thank you!
>> >> 
>> >> Best regards,
>> >> 
>> >> Shaofeng Shi 史少锋
>> >> Apache Kylin PMC,
>> >> Apache Incubator PMC,
>> >> Email: shaofeng...@apache.org
>> >> 
>> >> Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
>> >> Join Kylin user mail group: user-subscr...@kylin.apache.org
>> >> Join Kylin dev mail group: dev-subscr...@kylin.apache.org
>> >> 
>> >> 
>> >> 
>> >> 
>> >> Jiang, Yang wrote on Tue, Feb 28, 2023 at 14:06:
>> >> 
>> >> > Hi
>> >> >I submit a new feature about introduce DML and DDL in kylin5 in
>> >> > https://issues.apache.org/jira/browse/KYLIN-5466.
>> >> > Any comments are welcome!
>> >> >
>> >> > From Ted-Jiang
>> >> >
>> >> >
>> >> 
>> 


Re: Support DDL and DML for kylin5

2023-03-01 Thread Xiaoxiang Yu
Hi Yang Jiang,


I have checked this Google 
doc (https://docs.google.com/document/d/1Wa5Ih-rKi2Uqqg8cLfUqwuZBEfEryMwXcD55ezER_Fg
). I think it is a good feature. Here are my questions:
1. Will this DDL provide a way to add/edit/delete a RulebaseIndex (aggregate group) 
for a Model?
2. Will this DDL provide a way to add model-level config? (Hive provides such 
grammar: TBLPROPERTIES (property_name=property_value, ...))



--

Best wishes to you ! 
From :Xiaoxiang Yu





At 2023-03-02 09:25:30, "Yang Jiang"  wrote:
>Hi Shaofeng:
>   Currently not. Kylin will only read metadata from Hive, and will only create 
> metadata in the Kylin system, such as `model`, `job` ...
> 
>On 2023/03/01 11:57:32 ShaoFeng Shi wrote:
>> Hi Yang,
>> 
>> This is cool. Currently Kylin only sync the table metadata from Hive
>> metastore, which increases the complexity for the user. This proposal will
>> make it easier I think.
>> 
>> A quick question: if support this feature, will Kylin write back the schema
>> info to Hive? Thank you!
>> 
>> Best regards,
>> 
>> Shaofeng Shi 史少锋
>> Apache Kylin PMC,
>> Apache Incubator PMC,
>> Email: shaofeng...@apache.org
>> 
>> Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
>> Join Kylin user mail group: user-subscr...@kylin.apache.org
>> Join Kylin dev mail group: dev-subscr...@kylin.apache.org
>> 
>> 
>> 
>> 
>> Jiang, Yang wrote on Tue, Feb 28, 2023 at 14:06:
>> 
>> > Hi
>> >I submit a new feature about introduce DML and DDL in kylin5 in
>> > https://issues.apache.org/jira/browse/KYLIN-5466.
>> > Any comments are welcome!
>> >
>> > From Ted-Jiang
>> >
>> >
>> 


[VOTE] Accept donation of new codebase as Apache Kylin 5.0

2023-02-23 Thread Xiaoxiang Yu
Hi,


The Kylin community plans to design and implement a new modeling system.


The new codebase[2] is now receiving regular contributions from Kyligence, 
and eBay has now contributed some features as well.


As we discussed[1] before, I propose that we now accept this new codebase
into Apache Kylin and continue development as part of Kylin.
All contributors have signed ICLAs and there is a branch with the donation[2].


This vote is to determine if the Kylin PMC is in favor of accepting this
donation. If the vote passes, the PMC and the authors of the code will work
together to complete the ASF IP Clearance process (
http://incubator.apache.org/ip-clearance/) .


I have filled out the first draft of the IP clearance form and submitted it
to the svn repo [3].




[ ] +1 : Accept donation of new codebase as Apache Kylin 5.0
[ ] 0  : No opinion 
[ ] -1 : Reject contribution because...


Here is my vote: +1


The vote will be open for at least 72 hours.




[1] https://lists.apache.org/thread/trvv0g2mqcq2bv6x2zpdvjl8hhzcccqr   
[2] https://github.com/apache/kylin/tree/kylin5 
[3] 
https://svn.apache.org/repos/asf/incubator/public/trunk/content/ip-clearance/kylin-new-modeling-system.xml



--

Best wishes to you ! 
From :Xiaoxiang Yu

[jira] [Created] (KYLIN-5464) Index Planner for Kylin 5.0

2023-02-23 Thread Xiaoxiang Yu (Jira)
Xiaoxiang Yu created KYLIN-5464:
---

 Summary: Index Planner for Kylin 5.0
 Key: KYLIN-5464
 URL: https://issues.apache.org/jira/browse/KYLIN-5464
 Project: Kylin
  Issue Type: New Feature
  Components: Job Engine
Affects Versions: 5.0-alpha
Reporter: Xiaoxiang Yu
Assignee: Kun Liu
 Fix For: 5.0-alpha






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[CANCELL][VOTE] Accept donation of new codebase as Apache Kylin 5.0

2023-02-20 Thread Xiaoxiang Yu
Due to a typo in the voting email, I am cancelling this vote.




--

Best wishes to you ! 
From :Xiaoxiang Yu





At 2023-02-20 18:09:15, "Xiaoxiang Yu"  wrote:
>Hi,
>
>The Kylin community plans to design and implement new modeling system.
>
>The new codebase[2] is now receiving regular contributions from Kyligence, 
>and now eBay have contributed some features as well .
>
>As we discussed[1] before, I propose that we should now accept this new 
>codebase to Apache Kylin and continue development as part of the Kylin.
>
>All contributors have signed ICLAs and there is a branch with the donation[2].
>
>This vote is to determine if the Kylin PMC is in favor of accepting this
>donation. If the vote passes, the PMC and the authors of the code will work
>together to complete the ASF IP Clearance process (
>http://incubator.apache.org/ip-clearance/) .
>
>I have filled out the first draft of the IP clearance form and submitted it
>to the svn repo [3].
>
>[ ] +1 : Accept contribution of datafusion-substrait crate 
>[ ] 0 : No opinion 
>[ ] -1 : Reject contribution because...
>
>Here is my vote: +1
>
>The vote will be open for at least 72 hours.
>
>[1] https://lists.apache.org/thread/trvv0g2mqcq2bv6x2zpdvjl8hhzcccqr 
>[2] https://github.com/apache/kylin/tree/kylin5
>[3] 
>https://svn.apache.org/repos/asf/incubator/public/trunk/content/ip-clearance/kylin-new-modeling-system.xml
>
>--
>
>Best wishes to you ! 
>From :Xiaoxiang Yu


[VOTE] Accept donation of new codebase as Apache Kylin 5.0

2023-02-20 Thread Xiaoxiang Yu
Hi,

The Kylin community plans to design and implement new modeling system.

The new codebase[2] is now receiving regular contributions from Kyligence, 
and now eBay have contributed some features as well .

As we discussed[1] before, I propose that we should now accept this new codebase
to Apache Kylin and continue development as part of the Kylin.

All contributors have signed ICLAs and there is a branch with the donation[2].

This vote is to determine if the Kylin PMC is in favor of accepting this
donation. If the vote passes, the PMC and the authors of the code will work
together to complete the ASF IP Clearance process (
http://incubator.apache.org/ip-clearance/) .

I have filled out the first draft of the IP clearance form and submitted it
to the svn repo [3].

[ ] +1 : Accept contribution of datafusion-substrait crate 
[ ] 0 : No opinion 
[ ] -1 : Reject contribution because...

Here is my vote: +1

The vote will be open for at least 72 hours.

[1] https://lists.apache.org/thread/trvv0g2mqcq2bv6x2zpdvjl8hhzcccqr 
[2] https://github.com/apache/kylin/tree/kylin5
[3] 
https://svn.apache.org/repos/asf/incubator/public/trunk/content/ip-clearance/kylin-new-modeling-system.xml


--

Best wishes to you ! 
From :Xiaoxiang Yu

Re: [Discuss] Adopt the new codebase as Apache Kylin 5.0

2023-02-16 Thread Xiaoxiang Yu
Thanks for your reply, Yanghong. It looks like the native vectorized execution 
engine developed by eBay performs much better than before.
We hope the new codebase will help make Kylin better and better.




--

Best wishes to you ! 
From :Xiaoxiang Yu




At 2023-02-17 00:43:10, "Zhong, Yanghong"  wrote:

Thanks Xiaoxiang for raising this discussion.

 

From my point of view, it’s time for our Kylin community to take a big step to 
make Kylin great again. I totally agree with Xiaoxiang’s proposal of adopting 
the new codebase of the branch https://github.com/apache/kylin/tree/kylin5, 
which is mainly contributed by Kyligence.

The Kylin team inside eBay has also been working on that branch together with 
the contributors from Kyligence for nearly half a year. We have just made an 
internal alpha release based on that branch and received lots of positive 
feedback from our trial users. The main breakthrough of Kylin5 is the change 
from cube-based to index-based modeling. Therefore, we can now focus on how to 
make Kylin an excellent index management system. Compared to Kylin4, the 
current Kylin5 already has the following advantages:

- Low barrier to creating a new index model. Indexes are optional and can be 
  adjusted iteratively.
- Easy and transparent index model upgrades. Dimension and measure changes on 
  an existing index model can be made with no downtime.
- Raw data queries can be supported by a new kind of index, the projection index.
- No limit on the number of dimensions for the aggregation index (previously 
  the cuboid).

 

Meanwhile, the code of Kylin5 has also been refactored so that it is 
much easier to introduce different execution engines for querying Kylin 
indexes. The Kylin team inside eBay has been working on a native vectorized 
execution engine (DataFusion+Ballista) with the Apache Arrow community for more 
than one year, and it is used in our internal alpha release. We got an 
incredibly good result: benchmarking on the SSB test data set (1 TB, 
6 billion rows), our version shows an average 20x query performance gain 
compared to Kylin3.

 

 

Thanks also to George for raising his concerns. Actually, eBay Kylin has 
similar issues, as we are still using Kylin3. We are mainly facing 3 
kinds of challenges:

1. How to deal with incompatible interfaces between different Kylin versions?
2. How to make the iteration of index model updates more intelligent and 
   transparent for users?
3. How to migrate Kylin3’s metadata and data to Kylin5?

 

For the first challenge, the eBay Kylin team has been designing and 
implementing Kylin’s own SQL grammar for all of the DDL, DML, DQL and other 
commands. The SQL grammar should be standard and easy to understand. All 
future Kylin versions should follow that SQL grammar so that users will be able 
to use the same SQL across different versions of Kylin without worrying about 
incompatibility. We have finished an initial version, which is now 
under testing and verification, and we will raise issues and PRs with the 
community soon.

 

For the second challenge, the eBay Kylin team has been 
reimplementing a new two-phase index planner for Kylin5 based on Kylin3’s 
two-phase cube planner. The basic idea is as follows:

1. Do index recommendation based on data, such as the row count of the aggregation index
2. Do index recommendation based on user behavior, such as query statistics

The whole iteration process of the index model will be transparent to users. 
Currently, we have proposed a PR for phase one, and it is under review.

 

For the third challenge, the eBay Kylin team has created tools for metadata 
migration from Kylin3 to Kylin5, and we will raise a PR with the community soon.

Overall, I give +1 to this proposal of making Kylin5 the codebase for 
Apache Kylin.

 

Best regards,

Yanghong Zhong 钟阳红
Apache Kylin Committer, PMC,
Apache Arrow Committer,
Email: nju_y...@apache.org

 

From: hit_la...@126.com  on behalf of Xiaoxiang Yu 

Date: Thursday, February 16, 2023 at 14:29
To: dev@kylin.apache.org 
Subject: Re: [Discuss] Adopt the new codebase as Apache Kylin 5.0

External Email

1. Would it be possible for users of Kylin 2-4 to upgrade their metadata to Kylin5 
easily?


I have been talking with some early users of the new codebase. They told me that 
they plan to upgrade from Kylin 3 to the new codebase (Kylin 5), and that they 
plan to develop and contribute metadata upgrade tools. So I think this issue 
will be solved soon.


2. Would the structures (URL, request and response) of the RESTful APIs in 
Kylin 2-4 be kept?


It is a good question, but I have to say that most REST APIs have been rewritten, 
so they are called in a new way. I think the new REST doc will partially help 
with this.


3. Has any benchmark test been done?


I think I will do a benchmark next month.


4. Are features like Realtime Cubing, Cube Planner in Kylin3 ar

Re: [Discuss] Adopt the new codebase as Apache Kylin 5.0

2023-02-15 Thread Xiaoxiang Yu
1. Would it be possible for users of Kylin 2-4 to upgrade their metadata to Kylin5 
easily?


I have been talking with some early users of the new codebase. They told me that 
they plan to upgrade from Kylin 3 to the new codebase (Kylin 5), and that they 
plan to develop and contribute metadata upgrade tools. So I think this issue 
will be solved soon.


2. Would the structures (URL, request and response) of the RESTful APIs in 
Kylin 2-4 be kept?


It is a good question, but I have to say that most REST APIs have been rewritten, 
so they are called in a new way. I think the new REST doc will partially help 
with this.


3. Has any benchmark test been done?


I think I will do a benchmark next month.


4. Are features like Realtime Cubing and Cube Planner in Kylin3 included in 
Kylin5?


Kafka streaming, Realtime cubing, and the JDBC source are implemented in a new 
way, so the previous code no longer exists.
For the cube planner, Liu Kun is working hard to implement it in the new codebase
(see https://github.com/apache/kylin/pull/2089 ).





--

Best wishes to you ! 
From :Xiaoxiang Yu





At 2023-02-15 21:03:40, "George Ni"  wrote:
>Hi,
>
>Overall, I'd like to give +1 to this proposal, as Kylin5 has achieved
>many significant breakthroughs. Below are some of my questions:
>
>1. Would it be possible for users of Kylin 2-4 to upgrade their metadata to
>Kylin5 easily?
>2. Would the structures (URL, request and response) of the RESTful APIs in
>Kylin 2-4 be kept?
>3. Has any benchmark test been done?
>4. Are features like Realtime Cubing and Cube Planner in Kylin3 included
>in Kylin5?
>
>Li Yang wrote on Wed, Feb 15, 2023 at 17:23:
>
>> As Xiaoxiang mentioned, the code donation has a lot of improvements
>> compared to current Kylin 4. Many are long wanted, like
>>
>>- The flexible model can greatly improve the smoothness of adding new
>>dimensions in a production environment.
>>- The computed column can mind the gap of last-mile data transformation.
>>- The new model metadata design that is more friendly to dynamic
>>indexing.
>>- Support of 63+ dimensions.
>>
>> Accepting this code base is a good thing for the whole Kylin community.
>>
>> Cheers
>> Yang
>>
>>
>> On Tue, Feb 14, 2023 at 10:46 PM ShaoFeng Shi 
>> wrote:
>>
>> > The current limitations are very difficult to solve in normal ways. For
>> > example, the Cuboid ID is represented by a Long number, which is 64 bit,
>> > and the sequence of each dimension is fixed. The Cuboid ID appears in
>> every
>> > part of Kylin's source code. This design couldn't be refactored easily.
>> So
>> > I agree that a whole new design is necessary, in long term it can help a
>> > lot.
>> >
>> > Best regards,
>> >
>> > Shaofeng Shi 史少锋
>> > Apache Kylin PMC,
>> > Apache Incubator PMC,
>> > Email: shaofeng...@apache.org
>> >
>> > Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
>> > Join Kylin user mail group: user-subscr...@kylin.apache.org
>> > Join Kylin dev mail group: dev-subscr...@kylin.apache.org
>> >
>> >
>> >
>> >
>> > Xiaoxiang Yu wrote on Tue, Feb 14, 2023 at 14:22:
>> >
>> > > A formatted version of the discussion with the same content:
>> > >
>> > > ## Background ##
>> > >
>> > > As we discussed in the mailing list[2] last year, Kylin 4.0 has
>> achieved
>> > > its goal in new storage (columnar file) and new query engine (Spark
>> > based),
>> > > and gained some adoptions from the community. But due to the old design
>> > > from the early versions, Kylin 4.0 still keep some limitations from
>> > > previous versions, such as max. 63 dimension cap, cube structure
>> couldn't
>> > > be modified once built, etc. We think the only way to solve those
>> > > limitations is to do a whole redesign, especially in the metadata.
>> > >
>> > > The good news is, Kyligence has started to do that from years ago, and
>> > its
>> > > comercial version has been verified by many customers in terms of its
>> > > functionality, performance and stability. Last year, Kyligence open
>> > sourced
>> > > its core under Apache License v2.0, and signed CCLA to Apache Software
>> > > Foundation. We staged it in a separate branch of the github repository
>> > for
>> > > review[1]. Engineers from other teams such as eBay also reviewed the
>> > > codebase, and put forward many new ideas. We think based on the
>> codebase,
>>

Re:[Discuss] Adopt the new codebase as Apache Kylin 5.0

2023-02-13 Thread Xiaoxiang Yu
A formatted version of the discussion with the same content:

## Background ##

As we discussed on the mailing list[2] last year, Kylin 4.0 has achieved its 
goals of a new storage format (columnar files) and a new query engine 
(Spark-based), and has gained some adoption in the community. But due to the old 
design from the early versions, Kylin 4.0 still keeps some limitations of 
previous versions, such as the cap of 63 dimensions, a cube structure that cannot 
be modified once built, etc. We think the only way to remove those limitations 
is a complete redesign, especially of the metadata.
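
To illustrate where a 63-dimension cap can come from, here is a small sketch. It assumes a cuboid ID is a bitmask stored in a signed 64-bit long, with one bit per dimension in a fixed order and the sign bit unused; the function and constant names are hypothetical and this is not Kylin's actual code.

```python
MAX_DIMENSIONS = 63  # a signed 64-bit long has 63 non-sign bits


def cuboid_id(selected, all_dimensions):
    """Encode the selected dimensions as a bitmask.

    all_dimensions fixes the bit order: the first dimension maps to the
    highest used bit, the last one to bit 0.
    """
    if len(all_dimensions) > MAX_DIMENSIONS:
        raise ValueError("more than 63 dimensions cannot fit in a signed 64-bit ID")
    count = len(all_dimensions)
    mask = 0
    for position, dimension in enumerate(all_dimensions):
        if dimension in selected:
            mask |= 1 << (count - 1 - position)
    return mask
```

Under these assumptions, adding or reordering a dimension changes the meaning of every existing ID, which is one way to see why such a design is hard to refactor incrementally.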

The good news is that Kyligence started doing that years ago, and its 
commercial version has been verified by many customers in terms of 
functionality, performance and stability. Last year, Kyligence open-sourced its 
core under Apache License v2.0 and signed a CCLA with the Apache Software 
Foundation. We staged it in a separate branch of the GitHub repository for 
review[1]. Engineers from other teams, such as eBay, also reviewed the codebase 
and put forward many new ideas. We think that, based on this codebase, Kylin 
will gain not only a flexible metadata design and a faster computing engine, but 
also richer user scenarios.

The new codebase has the following features compared with the latest release 
(Kylin 4.0.3):

- More flexible and enhanced data model
  * Allow adding new dimensions and measures to the existing data model
  * The model adapts to table schema changes while retaining the existing 
    index on a best-effort basis
  * Support last-mile data transformation using Computed Column
  * Support raw query (non-aggregation query) using Table Index
  * Support changing dimension table (SCD2)
- Simplified metadata design
  * Merge DataModel and CubeDesc into new DataModel
  * Add DataFlow for more generic data sequence, e.g. streaming like data flow
  * New metadata AuditLog for better cache synchronization
- More flexible index management
  * Add IndexPlan to support flexible index management
  * Add IndexEntity to support different index types
  * Add LayoutEntity to support different storage layouts of the same Index
- Toward a native and vectorized query engine
  * Experiment: Integrate with a native execution engine, leveraging Gluten
  * Support async query
  * Enhance cost-based index optimizer
- More
  * Build engine refactoring and performance optimization
  * New WEB UI based on Vue.js, a brand new front-end framework, to replace 
    AngularJS
  * Smooth modeling process on one canvas



## Proposal ##
So, I'd like to propose adopting the new codebase from Kyligence as Kylin's
future codebase, i.e. Kylin 5. If accepted, we will request an IP clearance in
the Apache Incubator for it as the next step.





## Reference ##
[1] https://github.com/apache/kylin/tree/kylin5
[2] https://lists.apache.org/thread/4fkhyw1fyf0jg5cb18v7vxyqbn6vm3zv


--

Best wishes to you ! 
From :Xiaoxiang Yu






[Discuss] Adopt the new codebase as Apache Kylin 5.0

2023-02-13 Thread Xiaoxiang Yu
Background


As we discussed on the mailing list[2] last year, Kylin 4.0 has achieved its
goal of a new storage format (columnar files) and a new query engine (Spark
based), and has gained some adoption in the community. But because of design
decisions inherited from early versions, Kylin 4.0 still keeps some
limitations, such as the cap of 63 dimensions and the inability to modify a
cube's structure once it is built. We think the only way to remove those
limitations is a complete redesign, especially of the metadata.


The good news is that Kyligence started doing that years ago, and its
commercial version has been verified by many customers in terms of
functionality, performance and stability. Last year, Kyligence open-sourced its
core under the Apache License v2.0 and signed a CCLA with the Apache Software
Foundation. We staged it in a separate branch of the GitHub repository for
review[1]. Engineers from other teams, such as eBay, also reviewed the codebase
and put forward many new ideas. We think that, based on this codebase, Kylin
will gain not only a flexible metadata design and a faster computing engine,
but also richer user scenarios.


The new codebase has the following features compared with the latest release 
(Kylin 4.0.3):
- More flexible and enhanced data model
  * Allow adding new dimensions and measures to the existing data model
  * The model adapts to table schema changes while retaining the existing
    index at the best effort
  * Support last-mile data transformation using Computed Column
  * Support raw query (non-aggregation query) using Table Index
  * Support changing dimension table (SCD2)
- Simplified metadata design
  * Merge DataModel and CubeDesc into new DataModel
  * Add DataFlow for more generic data sequence, e.g. streaming like data flow
  * New metadata AuditLog for better cache synchronization
- More flexible index management
  * Add IndexPlan to support flexible index management
  * Add IndexEntity to support different index type
  * Add LayoutEntity to support different storage layouts of the same Index
- Toward a native and vectorized query engine
  * Experiment: Integrate with a native execution engine, leveraging Gluten
  * Support async query
  * Enhance cost-based index optimizer
- More
  * Build engine refactoring and performance optimization
  * New WEB UI based on Vue.js, a brand new front-end framework, to replace
    AngularJS
  * Smooth modeling process on one canvas

Proposal

So, I'd like to propose adopting the new codebase from Kyligence as Kylin's
future codebase, i.e. Kylin 5. If accepted, we will request an IP clearance in
the Apache Incubator for it as the next step.

Reference

[1] https://github.com/apache/kylin/tree/kylin5
[2] https://lists.apache.org/thread/4fkhyw1fyf0jg5cb18v7vxyqbn6vm3zv
https://kylin.apache.org/5.0/blog/introduction_of_metastore_cn

--

Best wishes to you ! 
From :Xiaoxiang Yu

Re:Kylin 4.x Data source

2023-01-05 Thread Xiaoxiang Yu
I am afraid that Kylin 4 has no plan to support a JDBC source (but Kylin 5
plans to support it). You may consider downgrading to 3.x.




--

Best wishes to you ! 
From :Xiaoxiang Yu





At 2023-01-06 13:38:26, "Garg, Himani"  wrote:
>Hi Kylin Family,
>
>
>I was able to deploy Kylin 4.0.3 on EMR 5.31. However, I wanted to check
>whether Kylin 4.x can source data from Redshift. If yes, can you please share
>some leads? If no, is there any workaround to source Redshift data, or should
>I downgrade to Kylin 3.x?
>
>
>Response will be really appreciated.
>
>Thank you
>Himani Garg
>
>


CVE-2022-43396: Apache Kylin: Command injection by Useless configuration

2022-12-29 Thread Xiaoxiang Yu
Severity: important

Description:

In the fix for CVE-2022-24697, a blacklist is used to filter user-supplied
commands, but the blacklist can be bypassed: a user can control the command by
controlling the kylin.engine.spark-cmd parameter of conf.
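
To illustrate why a blacklist is easy to bypass, here is a minimal Python sketch that compares a deny-list filter with an allow-list filter. It is not Kylin's code: the BLACKLIST entries, the ALLOWED pattern and the payload are all hypothetical and only show the general idea.

```python
import re

# Hypothetical deny-list: rejects a few well-known shell separators.
BLACKLIST = (";", "&&", "|")

# Hypothetical allow-list: accepts only characters expected in a binary path.
ALLOWED = re.compile(r"[A-Za-z0-9_./-]+")


def passes_blacklist(value: str) -> bool:
    """Accept the value unless it contains a blacklisted fragment."""
    return not any(bad in value for bad in BLACKLIST)


def passes_allowlist(value: str) -> bool:
    """Accept the value only if every character is explicitly allowed."""
    return ALLOWED.fullmatch(value) is not None


# Command substitution uses none of the blacklisted fragments.
payload = "spark-submit $(touch /tmp/pwned)"
legit = "/opt/spark/bin/spark-submit"

print(passes_blacklist(payload))   # the deny-list lets the payload through
print(passes_allowlist(payload))   # the allow-list rejects it
print(passes_allowlist(legit))     # a plain path is still accepted
```

The deny-list has to anticipate every dangerous construct, while the allow-list only has to describe what a valid value looks like.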

Work Arounds:

Users of Kylin 2.x, 3.x and 4.x should upgrade to 4.0.3 or apply the patch
https://github.com/apache/kylin/pull/2011

Credit:

Yasax1 Li  (finder)


References:

https://lists.apache.org/thread/o53vqxjdd9q731bwqpgcqyzx9r716qwx
https://kylin.apache.org/
https://www.cve.org/CVERecord?id=CVE-2022-43396

--

Best wishes to you ! 
From :Xiaoxiang Yu

CVE-2022-44621: Apache Kylin: Command injection by Diagnosis Controller

2022-12-29 Thread Xiaoxiang Yu
Severity: important

Description:

The Diagnosis Controller lacks parameter validation, so users may be attacked
through command injection via HTTP requests.

Work Arounds:

Users of Kylin 2.x, 3.x and 4.x should upgrade to 4.0.3 or apply the patch
https://github.com/apache/kylin/pull/2011

Credit:

Messy God  (finder)

References:

https://kylin.apache.org/
https://www.cve.org/CVERecord?id=CVE-2022-44621



Re: [VOTE] Release apache-kylin-4.0.3 (RC1)

2022-12-18 Thread Xiaoxiang Yu
+1


mvn test passed




--

Best wishes to you ! 
From :Xiaoxiang Yu





At 2022-12-17 14:25:29, "Li Yang"  wrote:
>+1
>
>`mvn clean package` passed on commit
>322ab6e5ee9738c5a07165af398c1faeeeacb079
>
>OpenJDK Runtime Environment (build 1.8.0_342-8u342-b07-0ubuntu1~20.04-b07)
>OpenJDK 64-Bit Server VM (build 25.342-b07, mixed mode)
>
>On Fri, Dec 16, 2022 at 7:33 PM ShaoFeng Shi  wrote:
>
>> Hi all,
>>
>> I have created a build for Apache Kylin 4.0.3, release candidate 1.
>>
>> Changes highlights:
>> - [KYLIN-5181] - Support postgresql to store kylin metadata
>> - [KYLIN-5246] - Long running job's log staying in mem, may cause job
>> server oom
>> - [KYLIN-5250] - Add a switch for no hack aggregation group
>> - [KYLIN-5251] - On hadoop 3 platform and start with class not found:
>> org/apache/commons/Configuration
>> - [KYLIN-5008] - When backend spark was failed, but corresponding job
>> status is shown as finished in WebUI
>> - [KYLIN-5245] - When a job is submitted with deployMode=cluster and the
>> application driver is abnormal, Kylin displays the job status as success
>> - [KYLIN-5271] - Query memory leaks
>> - [KYLIN-5285] - Performance optimization for sharby column
>>
>>
>> Thanks to everyone who has contributed to this release.
>> Here are the release notes:
>>
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316121=12352685
>>
>> The commit to being voted upon:
>>
>> https://github.com/apache/kylin/commit/322ab6e5ee9738c5a07165af398c1faeeeacb079
>> Its hash is 322ab6e5ee9738c5a07165af398c1faeeeacb079.
>>
>> The artifacts to be voted on, including the source package and one
>> pre-compiled binary package are located here:
>> https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-4.0.3-rc1/
>>
>> The hash of the artifacts are as follows:
>>
>> apache-kylin-4.0.3-source-release.zip.sha256
>> 725d2e7d96eae04013ce91687bf9e19df3041410476849c640f0cef68e25262b
>>
>> apache-kylin-4.0.3-bin-spark3.tar.gz
>> b44d08b51f7b542ff8a93acaa345ffe26cf4b182c011555435078fae6215356a
>>
>> A staged Maven repository is available for review at:
>> https://repository.apache.org/content/repositories/orgapachekylin-1101/
>>
>> Release artifacts are signed with the following key:
>> https://people.apache.org/keys/committer/shaofengshi.asc
>>
>> Please vote on releasing this package as Apache Kylin 4.0.3.
>>
>> The vote is open for the next 72 hours and passes if a majority of
>> at least three +1 PMC votes are cast.
>>
>> [ ] +1 Release this package as Apache Kylin 4.0.3
>> [ ] 0 I don't feel strongly about it, but I'm okay with the release
>> [ ] -1 Do not release this package because...
>>
>> Here is my vote:
>> +1 (binding)
>>
>> Best regards,
>>
>> Shaofeng Shi 史少锋
>> Apache Kylin PMC,
>> Apache Incubator PMC,
>> Email: shaofeng...@apache.org
>>
>> Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
>> Join Kylin user mail group: user-subscr...@kylin.apache.org
>> Join Kylin dev mail group: dev-subscr...@kylin.apache.org
>>


Re: building a cube for demographic data queries

2022-10-12 Thread Xiaoxiang Yu
Hi Will,
  Glad to see you have completed the 'basic path' of kylin4_on_cloud, which
provides some tools that make deploying Kylin much easier than before. But I
think that, to get satisfying performance (response time, concurrency) from
Kylin, users must have enough knowledge of Apache Spark and Apache Kylin. I
think this article may be helpful:
https://kylin.apache.org/blog/2021/06/17/Why-did-Youzan-choose-Kylin4 .




--

Best wishes to you ! 
From :Xiaoxiang Yu





At 2022-10-12 02:21:13, "Will Glass-Husain"  wrote:
>Thank you -- very helpful.
>
>Regarding limits on the number of dimensions.What are the
>compute/storage constraints on this?  For a given query:
>* Where is the data stored
>* Which nodes is the computation occurring on?
>
>I am trying to figure out -- if we have a large number of dimensions, what
>part of the cloud based kylin  needs to be increased (I'm doing the setup
>from the kylin4_on_cloud branch)
>
>Thanks, WILL
>
>On Tue, Oct 11, 2022 at 1:20 AM Xiaoxiang Yu  wrote:
>
>> 1) The criteria for filtering (e.g. selecting sex='male') and grouping (e.g.
>> group by state) should be dimensions - is this correct?
>> Yes, besides Kylin has limit of 63 dimensions at maximum.  But you should
>> be aware of 'The Curse of Dimensionality'.
>>
>> 2.1) Items that I would like to sum should be measures, is that right?
>> Yes.
>>
>> 2.2) Is there a limit to the number of measures?
>> No, there isn't such limit.
>>
>> 3) Did Kylin support sum(expression)?
>> From mysql doc
>> https://dev.mysql.com/doc/refman/5.7/en/aggregate-functions.html#function_sum
>>  ,
>> we know MySQL supports it.
>> For Kylin, Kylin should support it for Kylin 3.X and the future version
>> 5.x. But unluckily, Kylin 4.x didn't support sum exprssion, and Kylin 4.x
>> is the version you are using.
>>
>> 4) Does Kylin support MEDIAN?
>>
>> Yes, Kylin should support but I didn't test it. In fact, Kylin has a
>> measure PERCENTILE, and I think 50th percentile is equal to MEDIAN, am I
>> right?
>>
>> --
>> *Best wishes to you ! *
>> *From :**Xiaoxiang Yu*
>>
>>
>>
>> At 2022-10-11 14:03:14, "Will Glass-Husain"  wrote:
>> >Hi,
>> >
>> >Thanks for the recent help as I set up my first Kylin system.   I have a
>> >question regarding proper design of a cube to run some
>> >demographic queries.   I want to make this accessible in a webapp, with
>> >reasonable response time.
>> >
>> >I have a CSV file with about 80 columns on sex, race, state, age, internet
>> >access, job, etc.
>> >
>> >Can you advise regarding proper cube design?
>> >
>> >1) The criteria for filtering (e.g. selecting sex='male') and grouping
>> >(e.g. group by state) should be dimensions - is this correct?
>> >
>> >2) Items that I would like to sum should be measures, is that right?   Is
>> >there a limit to the number of measures?  I want to report out up to 300
>> >different measures aggregated by the dimensions.
>> >
>> >3)
>> >In MySQL, I am querying for different values like this
>> >
>> >select SUM((married=1) * weight) as MARRIED_1, SUM((married=2) * weight) as
>> >MARRIED_2 from data group by state;
>> >
>> >This returns the total number of weighted records for records where married
>> >is 1 and where married is 2.
>> >
>> >Question - is there a way to do this in the Kylin query?Or do I need to
>> >pre-compute my weights and create columns MARRIED_1 and MARRIED_2 in the
>> >source data, then sum it in Kylin.
>> >
>> >4) This is a tricky one.  Does Kylin support MEDIAN?   In MySQL, there's no
>> >MEDIAN function but we can calculate it by counting all the records, then
>> >selecting the record at an offset of half the records.   I want to
>> >calculate "median" (not mean) for age and some other variables.
>> >
>> >Thanks for any tips.
>> >
>> >Best, WILL
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >--
>> >William Glass-Husain   /forio  |  +1 (415) 440 7500 x802  |  forio.com
>> ><http://www.forio.com/>
>>
>>
>
>-- 
>William Glass-Husain   /forio  |  +1 (415) 440 7500 x802  |  forio.com
><http://www.forio.com/>


CVE-2022-24697: Apache Kylin: Command injection exists when the configuration overwrites function overwrites system parameters

2022-10-11 Thread Xiaoxiang Yu
Severity: important

Description:

Kylin's cube designer has a command injection vulnerability when system
parameters are overwritten in the configuration overwrites menu. RCE can be
achieved by closing the single quotation marks around the parameter value of
"--conf=" to inject arbitrary operating system commands into the command-line
parameters. This vulnerability affects Kylin 4.0.1 and earlier versions.
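
The quote-closing technique described above can be sketched in a few lines of Python. This is an illustration only, not Kylin's code: the command template, the helper names and the sample value are hypothetical, and shlex.split is used just to show how a POSIX shell would tokenize the resulting string.

```python
import shlex


def naive_command(value: str) -> str:
    """Wrap the value in single quotes without escaping (vulnerable)."""
    return f"spark-submit --conf '{value}'"


def quoted_command(value: str) -> str:
    """Let shlex.quote escape the value so it stays one argument."""
    return f"spark-submit --conf {shlex.quote(value)}"


# Hypothetical input that closes the quote and appends another command.
malicious = "spark.foo=bar'; touch /tmp/pwned; echo '"

naive_tokens = shlex.split(naive_command(malicious))
safe_tokens = shlex.split(quoted_command(malicious))

print(naive_tokens)  # the injected words show up as separate tokens
print(safe_tokens)   # the whole value remains a single --conf argument
```

Escaping the value, or better, passing arguments as a list without going through a shell at all, keeps user input from being interpreted as commands.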

Mitigation:

Users of Kylin 2.x & Kylin 3.x & 4.x should upgrade to 4.0.2 or apply patch 
https://github.com/apache/kylin/pull/1811 .

Credit:

The Kylin team would like to thank Kai Zhao of the ToTU Security Team.



[Announce] Apache Kylin 4.0.2 released

2022-10-11 Thread Xiaoxiang Yu
The Apache Kylin team is pleased to announce the immediate availability of
the 4.0.2 release.

This is a bugfix release after 4.0.1, with 21 bug fixes and enhancements.
All of the changes in this release can be found in:
https://kylin.apache.org/docs/release_notes.html

You can download the source release and binary packages from Apache Kylin's
download page: https://kylin.apache.org/download/

Apache Kylin is an open-source Distributed Analytical Data Warehouse for
Big Data; it was designed to provide OLAP (Online Analytical Processing)
capability in the big data era. By renovating the multi-dimensional cube
and precalculation technology on Hadoop and Spark, Kylin is able to achieve
near-constant query speed regardless of the ever-growing data volume.
Reducing query latency from minutes to sub-second, Kylin brings online
analytics back to big data.

Apache Kylin lets you query billions of rows at sub-second latency in 3
steps:
1. Identify a Star/Snowflake Schema on Hadoop.
2. Build Cube from the identified tables.
3. Query using ANSI-SQL and get results in sub-second, via ODBC, JDBC or
RESTful API.

Thanks to everyone who has contributed to this release.

We welcome your help and feedback. For more information on how to report
problems, and to get involved, visit the project website at
https://kylin.apache.org/

--

Best wishes to you ! 
From :Xiaoxiang Yu

Re:building a cube for demographic data queries

2022-10-11 Thread Xiaoxiang Yu
1) The criteria for filtering (e.g. selecting sex='male') and grouping (e.g.
group by state) should be dimensions - is this correct?
Yes. Note also that Kylin allows at most 63 dimensions, and you should be
aware of 'the curse of dimensionality': every extra dimension multiplies the
number of combinations that may have to be precomputed.
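
To make the curse of dimensionality concrete, here is a small Python sketch. It assumes a full cube in which every combination of dimensions is precomputed and nothing is pruned (for example by aggregation groups), so it is an upper bound rather than what a tuned model would actually build. The function name is made up for this example.

```python
def full_cube_cuboids(num_dimensions: int) -> int:
    """Number of dimension combinations in an unpruned cube: 2^N."""
    if num_dimensions < 0:
        raise ValueError("num_dimensions must be non-negative")
    return 2 ** num_dimensions


# Each added dimension doubles the number of combinations.
for n in (10, 20, 63):
    print(n, full_cube_cuboids(n))
```

With about 80 source columns, it is worth choosing as dimensions only the columns that are really used for filtering or grouping.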


2.1) Items that I would like to sum should be measures, is that right?
Yes.


2.2) Is there a limit to the number of measures?
No, there isn't such limit.


3) Does Kylin support sum(expression)?
From the MySQL doc
https://dev.mysql.com/doc/refman/5.7/en/aggregate-functions.html#function_sum ,
we know MySQL supports it.
Kylin should support it in Kylin 3.x and in the future version 5.x.
Unfortunately, Kylin 4.x, which is the version you are using, does not support
sum(expression).
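
One workaround on 4.x is the option raised in the question: pre-compute the weighted columns in the source data and define plain SUM measures on them. Below is a minimal Python sketch of that preprocessing step. The column names (state, married, weight, MARRIED_1, MARRIED_2) come from the question; the helper name and the sample rows are made up for illustration.

```python
import csv
import io


def add_weighted_flags(rows, column, values, weight="weight"):
    """Yield each row with extra columns such as MARRIED_1, which holds
    the row's weight when married == 1 and 0.0 otherwise."""
    for row in rows:
        out = dict(row)
        for v in values:
            hit = row[column] == str(v)
            out[f"{column.upper()}_{v}"] = float(row[weight]) if hit else 0.0
        yield out


# Tiny in-memory stand-in for the real CSV file.
sample = io.StringIO("state,married,weight\nCA,1,1.5\nCA,2,2.0\nNY,1,0.5\n")
rows = list(add_weighted_flags(csv.DictReader(sample), "married", (1, 2)))

for row in rows:
    print(row["state"], row["MARRIED_1"], row["MARRIED_2"])
```

After loading the enriched file, SUM(MARRIED_1) and SUM(MARRIED_2) grouped by state give the same totals as the original MySQL query.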


4) Does Kylin support MEDIAN?

Yes, Kylin should support it, but I haven't tested it. In fact, Kylin has a
PERCENTILE measure, and I think the 50th percentile is equal to the MEDIAN, am
I right?
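
As a quick sanity check of that equivalence, the Python sketch below computes an exact percentile with linear interpolation and compares the 50th percentile with the median. This is an assumption-laden illustration, not how Kylin evaluates PERCENTILE: a pre-aggregated percentile measure may return an approximation, so it is worth comparing against an exact value on a small sample. The ages list is made-up data.

```python
import statistics


def percentile(values, p):
    """Exact percentile using linear interpolation between closest ranks."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    rank = (len(ordered) - 1) * p / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


ages = [23, 35, 41, 52, 67, 29]

print(percentile(ages, 50))
print(statistics.median(ages))
```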

--

Best wishes to you ! 
From :Xiaoxiang Yu





At 2022-10-11 14:03:14, "Will Glass-Husain"  wrote:
>Hi,
>
>Thanks for the recent help as I set up my first Kylin system.   I have a
>question regarding proper design of a cube to run some
>demographic queries.   I want to make this accessible in a webapp, with
>reasonable response time.
>
>I have a CSV file with about 80 columns on sex, race, state, age, internet
>access, job, etc.
>
>Can you advise regarding proper cube design?
>
>1) The criteria for filtering (e.g. selecting sex='male') and grouping
>(e.g. group by state) should be dimensions - is this correct?
>
>2) Items that I would like to sum should be measures, is that right?   Is
>there a limit to the number of measures?  I want to report out up to 300
>different measures aggregated by the dimensions.
>
>3)
>In MySQL, I am querying for different values like this
>
>select SUM((married=1) * weight) as MARRIED_1, SUM((married=2) * weight) as
>MARRIED_2 from data group by state;
>
>This returns the total number of weighted records for records where married
>is 1 and where married is 2.
>
>Question - is there a way to do this in the Kylin query?Or do I need to
>pre-compute my weights and create columns MARRIED_1 and MARRIED_2 in the
>source data, then sum it in Kylin.
>
>4) This is a tricky one.  Does Kylin support MEDIAN?   In MySQL, there's no
>MEDIAN function but we can calculate it by counting all the records, then
>selecting the record at an offset of half the records.   I want to
>calculate "median" (not mean) for age and some other variables.
>
>Thanks for any tips.
>
>Best, WILL
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>-- 
>William Glass-Husain   /forio  |  +1 (415) 440 7500 x802  |  forio.com
><http://www.forio.com/>


[jira] [Created] (KYLIN-5274) Improve performance of getSubstitutor

2022-10-08 Thread Xiaoxiang Yu (Jira)
Xiaoxiang Yu created KYLIN-5274:
---

 Summary: Improve performance of getSubstitutor
 Key: KYLIN-5274
 URL: https://issues.apache.org/jira/browse/KYLIN-5274
 Project: Kylin
  Issue Type: Improvement
Reporter: Xiaoxiang Yu
Assignee: Zhong Yanghong
 Fix For: 5.0-alpha






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[RESULT][VOTE] Release apache-kylin-4.0.2 (RC1)

2022-09-20 Thread Xiaoxiang Yu
Thanks to everyone who has tested the release candidate and given their
comments and votes.

The tally is as follows.

4 binding +1s:
Xiaoxiang Yu
Chunen Ni
Shaofeng Shi
Yang Li

2 non-binding +1s:
Tengting Xu
Yaqian Zhang

No 0s or -1s.

Therefore I am delighted to announce that the proposal to release
Apache-Kylin-4.0.2 has passed.
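
For reference, the pass rule quoted in the vote mail ("a majority of at least three +1 PMC votes") can be written as a tiny Python check. This is only a sketch of how that sentence reads, with a made-up function name. Only binding votes are counted, so the two non-binding +1s do not enter the calculation.

```python
def vote_passes(binding_plus: int, binding_minus: int) -> bool:
    """At least three binding +1 votes, and more +1 than -1."""
    return binding_plus >= 3 and binding_plus > binding_minus


# Tally from this vote: 4 binding +1s, no -1s.
print(vote_passes(binding_plus=4, binding_minus=0))
```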

--

Best wishes to you ! 
From :Xiaoxiang Yu

Re:[VOTE] Release apache-kylin-4.0.2 (RC1)

2022-09-14 Thread Xiaoxiang Yu
Happy path passed on Hadoop 3.0.0 + Spark 3.1.1
UT passed on CentOS via "mvn clean install -DskipTests ; mvn test"

checksum verified







--

Best wishes to you ! 
From :Xiaoxiang Yu







[VOTE] Release apache-kylin-4.0.2 (RC1)

2022-09-14 Thread Xiaoxiang Yu
Hi all,

I have created a build for Apache Kylin 4.0.2, release candidate 1.

Changes highlights:
[KYLIN-5187] - Support Alluxio Local Cache + Soft Affinity to speed up the
query performance on the cloud
[KYLIN-4954] - Cardinality statistics for Kylin 4
[KYLIN-5166] - Use SQL hint to specify the cube that the user wants to query
[KYLIN-5195] - update spark from 3.1.1 to 3.1.3 in kylin4
[KYLIN-5181] - support postgresql to store kylin metadata

Thanks to everyone who has contributed to this release.
Here are the release notes:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316121=12350849

The commit being voted upon:
https://github.com/apache/kylin/commit/f88d14600ffb1a24d9eb33b29c6114c635600111
Its hash is f88d14600ffb1a24d9eb33b29c6114c635600111.

The artifacts to be voted on, including the source package and one
pre-compiled binary package, are located here:
https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-4.0.2-rc1/

The hashes of the artifacts are as follows:

apache-kylin-4.0.2-source-release.zip.sha256
34920bbe4e3ec118a099fd7a5375515d5852aac45ed5e6c937cfff92f22869be

apache-kylin-4.0.2-bin.tar.gz.sha256
98309df502aff1154c80d3608b18b13ca305fb9926f7dfb8a15e1107f0a32c67
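
Voters who want to check a downloaded artifact against the SHA-256 values above could use a short Python script like the following. It is a minimal sketch using only the standard library. The function names are made up, and the local file path is whatever you saved the download as.

```python
import hashlib
import hmac


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: str, expected_hex: str) -> bool:
    """Compare the file's digest with the published value."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

For example, matches("apache-kylin-4.0.2-bin.tar.gz", "98309df502aff1154c80d3608b18b13ca305fb9926f7dfb8a15e1107f0a32c67") should return True for an intact download of the binary package.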




A staged Maven repository is available for review at:
https://repository.apache.org/content/repositories/orgapachekylin-1100/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/xxyu.asc

Please vote on releasing this package as Apache Kylin 4.0.2.

The vote is open for the next 72 hours and passes if a majority of
at least three +1 PMC votes are cast.

[ ] +1 Release this package as Apache Kylin 4.0.2
[ ] 0 I don't feel strongly about it, but I'm okay with the release
[ ] -1 Do not release this package because...

Here is my vote:
+1 (binding)





--

Best wishes to you ! 
From :Xiaoxiang Yu

[jira] [Created] (KYLIN-5247) Add doc for system operation

2022-08-31 Thread Xiaoxiang Yu (Jira)
Xiaoxiang Yu created KYLIN-5247:
---

 Summary: Add doc for system operation
 Key: KYLIN-5247
 URL: https://issues.apache.org/jira/browse/KYLIN-5247
 Project: Kylin
  Issue Type: Sub-task
Reporter: Xiaoxiang Yu
Assignee: mukvin






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5244) Provided related scripts for grafana and influxdb

2022-08-29 Thread Xiaoxiang Yu (Jira)
Xiaoxiang Yu created KYLIN-5244:
---

 Summary: Provided related scripts for grafana and influxdb
 Key: KYLIN-5244
 URL: https://issues.apache.org/jira/browse/KYLIN-5244
 Project: Kylin
  Issue Type: Sub-task
Reporter: Xiaoxiang Yu
Assignee: mukvin
 Fix For: 5.0-alpha


 

 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5243) Add doc for operation tools

2022-08-26 Thread Xiaoxiang Yu (Jira)
Xiaoxiang Yu created KYLIN-5243:
---

 Summary: Add doc for operation tools
 Key: KYLIN-5243
 URL: https://issues.apache.org/jira/browse/KYLIN-5243
 Project: Kylin
  Issue Type: Bug
Reporter: Xiaoxiang Yu






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5242) Add doc for New Modeling

2022-08-26 Thread Xiaoxiang Yu (Jira)
Xiaoxiang Yu created KYLIN-5242:
---

 Summary: Add doc for New Modeling
 Key: KYLIN-5242
 URL: https://issues.apache.org/jira/browse/KYLIN-5242
 Project: Kylin
  Issue Type: Sub-task
Reporter: Xiaoxiang Yu
Assignee: mukvin






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

