[jira] [Created] (KYLIN-2492) kylin fact table is not getting incremental from hive using curl command to build?

2017-03-08 Thread prasannaP (JIRA)
prasannaP created KYLIN-2492:


             Summary: kylin fact table is not getting incremental from hive using curl command to build?
                 Key: KYLIN-2492
                 URL: https://issues.apache.org/jira/browse/KYLIN-2492
             Project: Kylin
          Issue Type: Bug
          Components: REST Service
    Affects Versions: v1.6.0
            Reporter: prasannaP
            Assignee: Zhong,Jason


I am working with Apache Kylin. My fact table does not get incremental data 
from Hive when I trigger the build through Kylin's RESTful API with curl. If I 
run a manual build in the Kylin GUI, the incremental data does get into the 
fact table. I am using the following curl commands:

/usr/bin/curl -c /home/hdfs/.mozilla/firefox/a7ec5aak.default/cookies.sqlite -X 
POST -H "Authorization: Basic QURNSU46S1lMSU4 =" -H 'Content-Type: 
application/json' http://192.168.1.135:7070/kylin/api/user/authentication

/usr/bin/curl -b /home/hdfs/.mozilla/firefox/a7ec5aak.default/cookies.sqlite -X 
PUT -H 'Content-Type: application/json' -d '{"startTime":'142538400', 
"endTime": '148890720', "buildType":"BUILD"}' 
http://192.168.1.135:7070/kylin/api/cubes/incident_analytics_cube/rebuild

What do I have to do to get the fact table's incremental data into Kylin using 
the curl command? Please advise.

In Kylin, can I use a join query without the fact table?
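
One thing worth checking in the rebuild request above: the startTime and 
endTime values are nine digits long. Assuming the rebuild endpoint expects 
timestamps as milliseconds since the Unix epoch (an assumption here; check the 
REST API documentation for your Kylin version), a date between 2015 and 2017 
would be a 13-digit number. The Python sketch below shows one way to compute 
such values and build the JSON body. The helper names and the example dates 
are hypothetical and are not taken from the report.

```python
import json
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(dt: datetime) -> int:
    """Return milliseconds since the Unix epoch for a timezone-aware datetime."""
    if dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    # Integer division by a timedelta avoids floating-point rounding.
    return (dt - EPOCH) // timedelta(milliseconds=1)


def rebuild_payload(start: datetime, end: datetime, build_type: str = "BUILD") -> str:
    """Build the JSON body for a cube rebuild request (field names as in the report)."""
    if end <= start:
        raise ValueError("end must be later than start")
    return json.dumps({
        "startTime": to_epoch_millis(start),
        "endTime": to_epoch_millis(end),
        "buildType": build_type,
    })


if __name__ == "__main__":
    # Hypothetical segment range; replace with the range you want to build.
    start = datetime(2015, 3, 3, tzinfo=timezone.utc)
    end = datetime(2017, 3, 8, tzinfo=timezone.utc)
    print(rebuild_payload(start, end))
```

The printed string can be passed to curl with -d in place of the hand-written 
body.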



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KYLIN-183) Execute and Display SQL Explain in query page

2017-03-08 Thread Roger Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roger Shi updated KYLIN-183:

Labels: github-import web-enhance  (was: github-import)

> Execute and Display SQL Explain in query page
> -
>
> Key: KYLIN-183
> URL: https://issues.apache.org/jira/browse/KYLIN-183
> Project: Kylin
>  Issue Type: Wish
>  Components: Web 
>Reporter: Luke Han
>Assignee: Zhong,Jason
>  Labels: github-import, web-enhance
> Fix For: Backlog
>
>
> As a user, I would like to:
> 1. Execute a SQL Explain Plan before executing the real SQL 
> 2. Display the SQL Explain Plan with information for the user, such as how many 
> rows will be scanned, how many rows will be returned, and each step's cost (if available)
>  Imported from GitHub 
> Url: https://github.com/KylinOLAP/Kylin/issues/322
> Created by: [lukehan|https://github.com/lukehan]
> Labels: newfeature, 
> Milestone: v2.0 Release
> Created at: Fri Dec 26 15:24:49 CST 2014
> State: open





[jira] [Updated] (KYLIN-189) ODBC prepared statement

2017-03-08 Thread Roger Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roger Shi updated KYLIN-189:

Labels: github-import odbc  (was: github-import)

> ODBC prepared statement
> ---
>
> Key: KYLIN-189
> URL: https://issues.apache.org/jira/browse/KYLIN-189
> Project: Kylin
>  Issue Type: Improvement
>  Components: Driver - ODBC
>Reporter: Luke Han
>Assignee: hongbin ma
>  Labels: github-import, odbc
> Fix For: Backlog
>
>
> Support ODBC prepared statement
>  Imported from GitHub 
> Url: https://github.com/KylinOLAP/Kylin/issues/316
> Created by: [lukehan|https://github.com/lukehan]
> Labels: newfeature, 
> Milestone: Backlog
> Created at: Fri Dec 26 15:15:36 CST 2014
> State: open





[jira] [Updated] (KYLIN-181) Enhance filter on high cardinality in Tableau

2017-03-08 Thread Roger Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roger Shi updated KYLIN-181:

Labels: bi github-import  (was: github-import)

> Enhance filter on high cardinality in Tableau
> -
>
> Key: KYLIN-181
> URL: https://issues.apache.org/jira/browse/KYLIN-181
> Project: Kylin
>  Issue Type: Wish
>Reporter: Luke Han
>  Labels: bi, github-import
> Fix For: Future
>
>
> When a user uses seller_id as a filter, the current ODBC driver shows only the 
> first 100K scan results, and the query is not re-run when a specific value is given.
> An enhancement is needed on the Tableau side to detect such high-cardinality 
> columns and behave differently to avoid this issue.
>  Imported from GitHub 
> Url: https://github.com/KylinOLAP/Kylin/issues/325
> Created by: [lukehan|https://github.com/lukehan]
> Labels: enhancement, 
> Milestone: Backlog
> Created at: Fri Dec 26 15:32:09 CST 2014
> State: open





[jira] [Updated] (KYLIN-187) Data Statistics Analyzer

2017-03-08 Thread Roger Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roger Shi updated KYLIN-187:

Description: 
1 Overview 
We need the statistics data for the following domains:
* Design cube metadata based on query log
* Design HBase row-key based on data distribution (e.g. histogram and 
cardinality)
* Choose execution plan based on cuboid data

2 Data Analyzer 
We need to analyze the hive data and cube data in 2 phases. Firstly, we will 
analyze the hive to guide the 1st round design of row key. Then we will analyze 
the cube data to refine the design of row key and to estimate the cost of query.

2.1 Analyze Hive Data 
We need to analyze the following statistics data on hive table:
* Cardinality of each dimension
* Cardinality of dimension combination (optional)
* Value distribution of each dimension (optional)
Based on the statistics of hive data, we can design row key group from high 
cardinality dimension to low cardinality dimension. BTW, we should evenly split 
dimension into the row key group that will reduce the number of cuboid.

2.2 Analyze Cube Data 
We need to analyze the following statistics on data cube:
* Count of each cuboid
* Group ratio of each cuboid = current cuboid count / lower group base cuboid 
count 

3 Query Analyzer 
TBD

 Imported from GitHub 
Url: https://github.com/KylinOLAP/Kylin/issues/318
Created by: [lukehan|https://github.com/lukehan]
Labels: newfeature, 
Milestone: Backlog
Created at: Fri Dec 26 15:21:24 CST 2014
State: open


  was:
# Overview 
We need the statistics data for the following domains:
* Design cube metadata based on query log
* Design HBase row-key based on data distribution (e.g. histogram and 
cardinality)
* Choose execution plan based on cuboid data

# Data Analyzer 
We need to analyze the hive data and cube data in 2 phases. Firstly, we will 
analyze the hive to guide the 1st round design of row key. Then we will analyze 
the cube data to refine the design of row key and to estimate the cost of query.

## Analyze Hive Data 
We need to analyze the following statistics data on hive table:
* Cardinality of each dimension
* Cardinality of dimension combination (optional)
* Value distribution of each dimension (optional)
Based on the statistics of hive data, we can design row key group from high 
cardinality dimension to low cardinality dimension. BTW, we should evenly split 
dimension into the row key group that will reduce the number of cuboid.

## Analyze Cube Data 
We need to analyze the following statistics on data cube:
* Count of each cuboid
* Group ratio of each cuboid = current cuboid count / lower group base cuboid 
count 

# Query Analyzer 
TBD

 Imported from GitHub 
Url: https://github.com/KylinOLAP/Kylin/issues/318
Created by: [lukehan|https://github.com/lukehan]
Labels: newfeature, 
Milestone: Backlog
Created at: Fri Dec 26 15:21:24 CST 2014
State: open



> Data Statistics Analyzer 
> -
>
> Key: KYLIN-187
> URL: https://issues.apache.org/jira/browse/KYLIN-187
> Project: Kylin
>  Issue Type: New Feature
>  Components: Tools, Build and Test
>Reporter: Luke Han
>  Labels: github-import
> Fix For: Backlog
>
>
> 1 Overview 
> We need the statistics data for the following domains:
> * Design cube metadata based on query log
> * Design HBase row-key based on data distribution (e.g. histogram and 
> cardinality)
> * Choose execution plan based on cuboid data
> 2 Data Analyzer 
> We need to analyze the hive data and cube data in 2 phases. Firstly, we will 
> analyze the hive to guide the 1st round design of row key. Then we will 
> analyze the cube data to refine the design of row key and to estimate the 
> cost of query.
> 2.1 Analyze Hive Data 
> We need to analyze the following statistics data on hive table:
> * Cardinality of each dimension
> * Cardinality of dimension combination (optional)
> * Value distribution of each dimension (optional)
> Based on the statistics of hive data, we can design row key group from high 
> cardinality dimension to low cardinality dimension. BTW, we should evenly 
> split dimension into the row key group that will reduce the number of cuboid.
> 2.2 Analyze Cube Data 
> We need to analyze the following statistics on data cube:
> * Count of each cuboid
> * Group ratio of each cuboid = current cuboid count / lower group base cuboid 
> count 
> 3 Query Analyzer 
> TBD
>  Imported from GitHub 
> Url: https://github.com/KylinOLAP/Kylin/issues/318
> Created by: [lukehan|https://github.com/lukehan]
> Labels: newfeature, 
> Milestone: Backlog
> Created at: Fri Dec 26 15:21:24 CST 2014
> State: open
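
The statistics listed in sections 2.1 and 2.2 above can be illustrated with a 
small Python sketch over in-memory rows. It is only an illustration: the 
sample rows, the column names, and the function names are hypothetical, and 
the "lower group base cuboid" in the group ratio is assumed here to be the 
cuboid over all dimensions, which the description does not spell out.

```python
from typing import Dict, Hashable, List, Sequence

Row = Dict[str, Hashable]


def cardinality(rows: List[Row], dim: str) -> int:
    """Number of distinct values of one dimension."""
    return len({row[dim] for row in rows})


def cuboid_count(rows: List[Row], dims: Sequence[str]) -> int:
    """Number of distinct value combinations over the given dimensions."""
    return len({tuple(row[d] for d in dims) for row in rows})


def group_ratio(rows: List[Row], dims: Sequence[str], base_dims: Sequence[str]) -> float:
    """Cuboid count divided by the count of the assumed base cuboid."""
    return cuboid_count(rows, dims) / cuboid_count(rows, base_dims)


def row_key_order(rows: List[Row], dims: Sequence[str]) -> List[str]:
    """Order dimensions from high to low cardinality (ties keep input order)."""
    return sorted(dims, key=lambda d: cardinality(rows, d), reverse=True)


if __name__ == "__main__":
    # Hypothetical sample data standing in for a hive table.
    sample = [
        {"country": "US", "category": "toys", "seller_id": 1},
        {"country": "US", "category": "toys", "seller_id": 2},
        {"country": "US", "category": "books", "seller_id": 3},
        {"country": "DE", "category": "toys", "seller_id": 4},
    ]
    all_dims = ["country", "category", "seller_id"]
    for dim in all_dims:
        print(dim, cardinality(sample, dim))
    print(row_key_order(sample, all_dims))
    print(group_ratio(sample, ["country", "category"], all_dims))
```

On a real table the distinct counts would come from hive queries or from an 
approximate counter rather than Python sets; the sets are used here only to 
keep the sketch short.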





[jira] [Updated] (KYLIN-187) Data Statistics Analyzer

2017-03-08 Thread Roger Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roger Shi updated KYLIN-187:

Description: 
# Overview 
We need the statistics data for the following domains:
* Design cube metadata based on query log
* Design HBase row-key based on data distribution (e.g. histogram and 
cardinality)
* Choose execution plan based on cuboid data

# Data Analyzer 
We need to analyze the hive data and cube data in 2 phases. Firstly, we will 
analyze the hive to guide the 1st round design of row key. Then we will analyze 
the cube data to refine the design of row key and to estimate the cost of query.

## Analyze Hive Data 
We need to analyze the following statistics data on hive table:
* Cardinality of each dimension
* Cardinality of dimension combination (optional)
* Value distribution of each dimension (optional)
Based on the statistics of hive data, we can design row key group from high 
cardinality dimension to low cardinality dimension. BTW, we should evenly split 
dimension into the row key group that will reduce the number of cuboid.

## Analyze Cube Data 
We need to analyze the following statistics on data cube:
* Count of each cuboid
* Group ratio of each cuboid = current cuboid count / lower group base cuboid 
count 

# Query Analyzer 
TBD

 Imported from GitHub 
Url: https://github.com/KylinOLAP/Kylin/issues/318
Created by: [lukehan|https://github.com/lukehan]
Labels: newfeature, 
Milestone: Backlog
Created at: Fri Dec 26 15:21:24 CST 2014
State: open


  was:
## 1. Overview 
We need the statistics data for the following domains:
* Design cube metadata based on query log
* Design HBase row-key based on data distribution (e.g. histogram and 
cardinality)
* Choose execution plan based on cuboid data

## 2. Data Analyzer 
We need to analyze the hive data and cube data in 2 phases. Firstly, we will 
analyze the hive to guide the 1st round design of row key. Then we will analyze 
the cube data to refine the design of row key and to estimate the cost of query.

 2.1. Analyze Hive Data 
We need to analyze the following statistics data on hive table:
* Cardinality of each dimension
* Cardinality of dimension combination (optional)
* Value distribution of each dimension (optional)
Based on the statistics of hive data, we can design row key group from high 
cardinality dimension to low cardinality dimension. BTW, we should evenly split 
dimension into the row key group that will reduce the number of cuboid.

 2.2. Analyze Cube Data 
We need to analyze the following statistics on data cube:
* Count of each cuboid
* Group ratio of each cuboid = current cuboid count / lower group base cuboid 
count 

 3. Query Analyzer 
TBD

 Imported from GitHub 
Url: https://github.com/KylinOLAP/Kylin/issues/318
Created by: [lukehan|https://github.com/lukehan]
Labels: newfeature, 
Milestone: Backlog
Created at: Fri Dec 26 15:21:24 CST 2014
State: open



> Data Statistics Analyzer 
> -
>
> Key: KYLIN-187
> URL: https://issues.apache.org/jira/browse/KYLIN-187
> Project: Kylin
>  Issue Type: New Feature
>  Components: Tools, Build and Test
>Reporter: Luke Han
>  Labels: github-import
> Fix For: Backlog
>
>
> # Overview 
> We need the statistics data for the following domains:
> * Design cube metadata based on query log
> * Design HBase row-key based on data distribution (e.g. histogram and 
> cardinality)
> * Choose execution plan based on cuboid data
> # Data Analyzer 
> We need to analyze the hive data and cube data in 2 phases. Firstly, we will 
> analyze the hive to guide the 1st round design of row key. Then we will 
> analyze the cube data to refine the design of row key and to estimate the 
> cost of query.
> ## Analyze Hive Data 
> We need to analyze the following statistics data on hive table:
> * Cardinality of each dimension
> * Cardinality of dimension combination (optional)
> * Value distribution of each dimension (optional)
> Based on the statistics of hive data, we can design row key group from high 
> cardinality dimension to low cardinality dimension. BTW, we should evenly 
> split dimension into the row key group that will reduce the number of cuboid.
> ## Analyze Cube Data 
> We need to analyze the following statistics on data cube:
> * Count of each cuboid
> * Group ratio of each cuboid = current cuboid count / lower group base cuboid 
> count 
> # Query Analyzer 
> TBD
>  Imported from GitHub 
> Url: https://github.com/KylinOLAP/Kylin/issues/318
> Created by: [lukehan|https://github.com/lukehan]
> Labels: newfeature, 
> Milestone: Backlog
> Created at: Fri Dec 26 15:21:24 CST 2014
> State: open





[jira] [Updated] (KYLIN-187) Data Statistics Analyzer

2017-03-08 Thread Roger Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roger Shi updated KYLIN-187:

Request participants:   (was: )
 Description: 
#Overview 
We need the statistics data for the following domains:
* Design cube metadata based on query log
* Design HBase row-key based on data distribution (e.g. histogram and 
cardinality)
* Choose execution plan based on cuboid data

#Data Analyzer 
We need to analyze the hive data and cube data in 2 phases. Firstly, we will 
analyze the hive to guide the 1st round design of row key. Then we will analyze 
the cube data to refine the design of row key and to estimate the cost of query.

##Analyze Hive Data 
We need to analyze the following statistics data on hive table:
* Cardinality of each dimension
* Cardinality of dimension combination (optional)
* Value distribution of each dimension (optional)
Based on the statistics of hive data, we can design row key group from high 
cardinality dimension to low cardinality dimension. BTW, we should evenly split 
dimension into the row key group that will reduce the number of cuboid.

##Analyze Cube Data 
We need to analyze the following statistics on data cube:
* Count of each cuboid
* Group ratio of each cuboid = current cuboid count / lower group base cuboid 
count 

# Query Analyzer 
TBD

 Imported from GitHub 
Url: https://github.com/KylinOLAP/Kylin/issues/318
Created by: [lukehan|https://github.com/lukehan]
Labels: newfeature, 
Milestone: Backlog
Created at: Fri Dec 26 15:21:24 CST 2014
State: open


  was:
## 1. Overview 
We need the statistics data for the following domains:
* Design cube metadata based on query log
* Design HBase row-key based on data distribution (e.g. histogram and 
cardinality)
* Choose execution plan based on cuboid data

## 2. Data Analyzer 
We need to analyze the hive data and cube data in 2 phases. Firstly, we will 
analyze the hive to guide the 1st round design of row key. Then we will analyze 
the cube data to refine the design of row key and to estimate the cost of query.

 2.1. Analyze Hive Data 
We need to analyze the following statistics data on hive table:
* Cardinality of each dimension
* Cardinality of dimension combination (optional)
* Value distribution of each dimension (optional)
Based on the statistics of hive data, we can design row key group from high 
cardinality dimension to low cardinality dimension. BTW, we should evenly split 
dimension into the row key group that will reduce the number of cuboid.

 2.2. Analyze Cube Data 
We need to analyze the following statistics on data cube:
* Count of each cuboid
* Group ratio of each cuboid = current cuboid count / lower group base cuboid 
count 

 3. Query Analyzer 
TBD

 Imported from GitHub 
Url: https://github.com/KylinOLAP/Kylin/issues/318
Created by: [lukehan|https://github.com/lukehan]
Labels: newfeature, 
Milestone: Backlog
Created at: Fri Dec 26 15:21:24 CST 2014
State: open



> Data Statistics Analyzer 
> -
>
> Key: KYLIN-187
> URL: https://issues.apache.org/jira/browse/KYLIN-187
> Project: Kylin
>  Issue Type: New Feature
>  Components: Tools, Build and Test
>Reporter: Luke Han
>  Labels: github-import
> Fix For: Backlog
>
>
> #Overview 
> We need the statistics data for the following domains:
> * Design cube metadata based on query log
> * Design HBase row-key based on data distribution (e.g. histogram and 
> cardinality)
> * Choose execution plan based on cuboid data
> #Data Analyzer 
> We need to analyze the hive data and cube data in 2 phases. Firstly, we will 
> analyze the hive to guide the 1st round design of row key. Then we will 
> analyze the cube data to refine the design of row key and to estimate the 
> cost of query.
> ##Analyze Hive Data 
> We need to analyze the following statistics data on hive table:
> * Cardinality of each dimension
> * Cardinality of dimension combination (optional)
> * Value distribution of each dimension (optional)
> Based on the statistics of hive data, we can design row key group from high 
> cardinality dimension to low cardinality dimension. BTW, we should evenly 
> split dimension into the row key group that will reduce the number of cuboid.
> ##Analyze Cube Data 
> We need to analyze the following statistics on data cube:
> * Count of each cuboid
> * Group ratio of each cuboid = current cuboid count / lower group base cuboid 
> count 
> # Query Analyzer 
> TBD
>  Imported from GitHub 
> Url: https://github.com/KylinOLAP/Kylin/issues/318
> Created by: [lukehan|https://github.com/lukehan]
> Labels: newfeature, 
> Milestone: Backlog
> Created at: Fri Dec 26 15:21:24 CST 2014
> State: open




[jira] [Updated] (KYLIN-187) Data Statistics Analyzer

2017-03-08 Thread Roger Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roger Shi updated KYLIN-187:

Description: 
## 1. Overview 
We need the statistics data for the following domains:
* Design cube metadata based on query log
* Design HBase row-key based on data distribution (e.g. histogram and 
cardinality)
* Choose execution plan based on cuboid data

## 2. Data Analyzer 
We need to analyze the hive data and cube data in 2 phases. Firstly, we will 
analyze the hive to guide the 1st round design of row key. Then we will analyze 
the cube data to refine the design of row key and to estimate the cost of query.

 2.1. Analyze Hive Data 
We need to analyze the following statistics data on hive table:
* Cardinality of each dimension
* Cardinality of dimension combination (optional)
* Value distribution of each dimension (optional)
Based on the statistics of hive data, we can design row key group from high 
cardinality dimension to low cardinality dimension. BTW, we should evenly split 
dimension into the row key group that will reduce the number of cuboid.

 2.2. Analyze Cube Data 
We need to analyze the following statistics on data cube:
* Count of each cuboid
* Group ratio of each cuboid = current cuboid count / lower group base cuboid 
count 

 3. Query Analyzer 
TBD

 Imported from GitHub 
Url: https://github.com/KylinOLAP/Kylin/issues/318
Created by: [lukehan|https://github.com/lukehan]
Labels: newfeature, 
Milestone: Backlog
Created at: Fri Dec 26 15:21:24 CST 2014
State: open


  was:
#Overview 
We need the statistics data for the following domains:
* Design cube metadata based on query log
* Design HBase row-key based on data distribution (e.g. histogram and 
cardinality)
* Choose execution plan based on cuboid data

#Data Analyzer 
We need to analyze the hive data and cube data in 2 phases. Firstly, we will 
analyze the hive to guide the 1st round design of row key. Then we will analyze 
the cube data to refine the design of row key and to estimate the cost of query.

##Analyze Hive Data 
We need to analyze the following statistics data on hive table:
* Cardinality of each dimension
* Cardinality of dimension combination (optional)
* Value distribution of each dimension (optional)
Based on the statistics of hive data, we can design row key group from high 
cardinality dimension to low cardinality dimension. BTW, we should evenly split 
dimension into the row key group that will reduce the number of cuboid.

##Analyze Cube Data 
We need to analyze the following statistics on data cube:
* Count of each cuboid
* Group ratio of each cuboid = current cuboid count / lower group base cuboid 
count 

# Query Analyzer 
TBD

 Imported from GitHub 
Url: https://github.com/KylinOLAP/Kylin/issues/318
Created by: [lukehan|https://github.com/lukehan]
Labels: newfeature, 
Milestone: Backlog
Created at: Fri Dec 26 15:21:24 CST 2014
State: open



> Data Statistics Analyzer 
> -
>
> Key: KYLIN-187
> URL: https://issues.apache.org/jira/browse/KYLIN-187
> Project: Kylin
>  Issue Type: New Feature
>  Components: Tools, Build and Test
>Reporter: Luke Han
>  Labels: github-import
> Fix For: Backlog
>
>
> ## 1. Overview 
> We need the statistics data for the following domains:
> * Design cube metadata based on query log
> * Design HBase row-key based on data distribution (e.g. histogram and 
> cardinality)
> * Choose execution plan based on cuboid data
> ## 2. Data Analyzer 
> We need to analyze the hive data and cube data in 2 phases. Firstly, we will 
> analyze the hive to guide the 1st round design of row key. Then we will 
> analyze the cube data to refine the design of row key and to estimate the 
> cost of query.
>  2.1. Analyze Hive Data 
> We need to analyze the following statistics data on hive table:
> * Cardinality of each dimension
> * Cardinality of dimension combination (optional)
> * Value distribution of each dimension (optional)
> Based on the statistics of hive data, we can design row key group from high 
> cardinality dimension to low cardinality dimension. BTW, we should evenly 
> split dimension into the row key group that will reduce the number of cuboid.
>  2.2. Analyze Cube Data 
> We need to analyze the following statistics on data cube:
> * Count of each cuboid
> * Group ratio of each cuboid = current cuboid count / lower group base cuboid 
> count 
>  3. Query Analyzer 
> TBD
>  Imported from GitHub 
> Url: https://github.com/KylinOLAP/Kylin/issues/318
> Created by: [lukehan|https://github.com/lukehan]
> Labels: newfeature, 
> Milestone: Backlog
> Created at: Fri Dec 26 15:21:24 CST 2014
> State: open


