[jira] [Created] (KYLIN-2492) kylin fact table is not getting incremental from hive using curl command to build?
prasannaP created KYLIN-2492:

    Summary: kylin fact table is not getting incremental from hive using curl command to build?
    Key: KYLIN-2492
    URL: https://issues.apache.org/jira/browse/KYLIN-2492
    Project: Kylin
    Issue Type: Bug
    Components: REST Service
    Affects Versions: v1.6.0
    Reporter: prasannaP
    Assignee: Zhong,Jason

I am working on Apache Kylin, but my fact table is not getting incremental data from Hive when I use the curl commands for Kylin's RESTful API. If I do a manual build in the Kylin GUI, I do get incremental data into the fact table. I am using the curl commands as follows:

    /usr/bin/curl -c /home/hdfs/.mozilla/firefox/a7ec5aak.default/cookies.sqlite -X POST -H "Authorization: Basic QURNSU46S1lMSU4 =" -H 'Content-Type: application/json' http://192.168.1.135:7070/kylin/api/user/authentication

    /usr/bin/curl -b /home/hdfs/.mozilla/firefox/a7ec5aak.default/cookies.sqlite -X PUT -H 'Content-Type: application/json' -d '{"startTime":'142538400', "endTime": '148890720', "buildType":"BUILD"}' http://192.168.1.135:7070/kylin/api/cubes/incident_analytics_cube/rebuild

What do I have to do to get the fact table's incremental data into Kylin using the curl command? Please suggest.

In Kylin, am I able to use a join query statement without using the fact table?

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
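A note for anyone reproducing this report: Kylin's REST documentation describes startTime and endTime for the rebuild call as epoch timestamps in milliseconds (13 digits for recent dates), while the values in the command above have nine digits. Whether that is the cause of the behaviour reported here is not confirmed in this thread. The following is a minimal Python sketch, not Kylin code, of producing a request body in that form; the helper names to_epoch_millis and rebuild_payload and the date range are assumptions made for illustration only.

```python
import json
from datetime import datetime, timezone


def to_epoch_millis(year: int, month: int, day: int) -> int:
    """Return midnight UTC of the given date as epoch milliseconds."""
    dt = datetime(year, month, day, tzinfo=timezone.utc)
    return int(dt.timestamp()) * 1000


def rebuild_payload(start_ms: int, end_ms: int, build_type: str = "BUILD") -> str:
    """Return a JSON body with the three fields used in the curl command above."""
    if start_ms >= end_ms:
        raise ValueError("startTime must be earlier than endTime")
    return json.dumps(
        {"startTime": start_ms, "endTime": end_ms, "buildType": build_type}
    )


if __name__ == "__main__":
    # Hypothetical date range; substitute the segment you actually want to build.
    body = rebuild_payload(to_epoch_millis(2015, 3, 3), to_epoch_millis(2017, 3, 9))
    print(body)
```

The printed string could be passed to the -d option of the second curl command. The sketch assumes UTC boundaries; if the cube's partition dates are interpreted in another time zone, the boundaries would need adjusting.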
[jira] [Updated] (KYLIN-183) Execute and Display SQL Explain in query page
[ https://issues.apache.org/jira/browse/KYLIN-183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roger Shi updated KYLIN-183:
    Labels: github-import web-enhance  (was: github-import)

> Execute and Display SQL Explain in query page
> ---------------------------------------------
>
> Key: KYLIN-183
> URL: https://issues.apache.org/jira/browse/KYLIN-183
> Project: Kylin
> Issue Type: Wish
> Components: Web
> Reporter: Luke Han
> Assignee: Zhong,Jason
> Labels: github-import, web-enhance
> Fix For: Backlog
>
>
> As a user, I would like to:
> 1. Execute SQL Explain Plan before execute real SQL
> 2. Display SQL Explain Plan with user information, such as how many rows will
> be scan, how many rows will be returned, and the each step's cost (if have)
> Imported from GitHub
> Url: https://github.com/KylinOLAP/Kylin/issues/322
> Created by: [lukehan|https://github.com/lukehan]
> Labels: newfeature,
> Milestone: v2.0 Release
> Created at: Fri Dec 26 15:24:49 CST 2014
> State: open
[jira] [Updated] (KYLIN-189) ODBC prepared statement
[ https://issues.apache.org/jira/browse/KYLIN-189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roger Shi updated KYLIN-189:
    Labels: github-import odbc  (was: github-import)

> ODBC prepared statement
> -----------------------
>
> Key: KYLIN-189
> URL: https://issues.apache.org/jira/browse/KYLIN-189
> Project: Kylin
> Issue Type: Improvement
> Components: Driver - ODBC
> Reporter: Luke Han
> Assignee: hongbin ma
> Labels: github-import, odbc
> Fix For: Backlog
>
>
> Support ODBC prepared statement
> Imported from GitHub
> Url: https://github.com/KylinOLAP/Kylin/issues/316
> Created by: [lukehan|https://github.com/lukehan]
> Labels: newfeature,
> Milestone: Backlog
> Created at: Fri Dec 26 15:15:36 CST 2014
> State: open
[jira] [Updated] (KYLIN-181) Enhance filter on high cardinality in Tableau
[ https://issues.apache.org/jira/browse/KYLIN-181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roger Shi updated KYLIN-181:
    Labels: bi github-import  (was: github-import)

> Enhance filter on high cardinality in Tableau
> ---------------------------------------------
>
> Key: KYLIN-181
> URL: https://issues.apache.org/jira/browse/KYLIN-181
> Project: Kylin
> Issue Type: Wish
> Reporter: Luke Han
> Labels: bi, github-import
> Fix For: Future
>
>
> When user use seller_id as filter, the current ODBC will only show the first
> 100K scan result. And the query will not re-run when given specific value.
> There's enhancement should be offered from Tableau side to aware such high
> cardinality column and perform different behavior to avoid such issue.
> Imported from GitHub
> Url: https://github.com/KylinOLAP/Kylin/issues/325
> Created by: [lukehan|https://github.com/lukehan]
> Labels: enhancement,
> Milestone: Backlog
> Created at: Fri Dec 26 15:32:09 CST 2014
> State: open
[jira] [Updated] (KYLIN-187) Data Statistics Analyzer
[ https://issues.apache.org/jira/browse/KYLIN-187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roger Shi updated KYLIN-187:
    Description:
1 Overview
We need the statistics data for the following domains:
* Design cube metadata based on query log
* Design HBase row-key based on data distribution (e.g. histogram and cardinality)
* Choose execution plan based on cuboid data
2 Data Analyzer
We need to analyzer the hive data and cube data in 2 phases. Firstly, we will analyze the hive to guide the 1st round design of row key. Then we will analyze the cube data to refine the design of row key and to estimate the cost of query.
2.1 Analyze Hive Data
We need to analyze the following statistics data on hive table:
* Cardinality of each dimension
* Cardinality of dimension combination (optional)
* Value distribution of each dimension (optional)
Based on the statistics of hive data, we can design row key group from high cardinality dimension to low cardinality dimension. BTW, we should evenly split dimension into the row key group that will reduce the number of cuboid.
2.2 Analyze Cube Data
We need to analyze the following statistics on data cube:
* Count of each cuboid
* Group ratio of each cuboid = current cuboid count / lower group base cuboid count
3 Query Analyzer
TBD
Imported from GitHub
Url: https://github.com/KylinOLAP/Kylin/issues/318
Created by: [lukehan|https://github.com/lukehan]
Labels: newfeature,
Milestone: Backlog
Created at: Fri Dec 26 15:21:24 CST 2014
State: open

  was:
# Overview
We need the statistics data for the following domains:
* Design cube metadata based on query log
* Design HBase row-key based on data distribution (e.g. histogram and cardinality)
* Choose execution plan based on cuboid data
# Data Analyzer
We need to analyzer the hive data and cube data in 2 phases. Firstly, we will analyze the hive to guide the 1st round design of row key. Then we will analyze the cube data to refine the design of row key and to estimate the cost of query.
## Analyze Hive Data
We need to analyze the following statistics data on hive table:
* Cardinality of each dimension
* Cardinality of dimension combination (optional)
* Value distribution of each dimension (optional)
Based on the statistics of hive data, we can design row key group from high cardinality dimension to low cardinality dimension. BTW, we should evenly split dimension into the row key group that will reduce the number of cuboid.
## Analyze Cube Data
We need to analyze the following statistics on data cube:
* Count of each cuboid
* Group ratio of each cuboid = current cuboid count / lower group base cuboid count
# Query Analyzer
TBD
Imported from GitHub
Url: https://github.com/KylinOLAP/Kylin/issues/318
Created by: [lukehan|https://github.com/lukehan]
Labels: newfeature,
Milestone: Backlog
Created at: Fri Dec 26 15:21:24 CST 2014
State: open

> Data Statistics Analyzer
> ------------------------
>
> Key: KYLIN-187
> URL: https://issues.apache.org/jira/browse/KYLIN-187
> Project: Kylin
> Issue Type: New Feature
> Components: Tools, Build and Test
> Reporter: Luke Han
> Labels: github-import
> Fix For: Backlog
>
>
> 1 Overview
> We need the statistics data for the following domains:
> * Design cube metadata based on query log
> * Design HBase row-key based on data distribution (e.g. histogram and
> cardinality)
> * Choose execution plan based on cuboid data
> 2 Data Analyzer
> We need to analyzer the hive data and cube data in 2 phases. Firstly, we will
> analyze the hive to guide the 1st round design of row key. Then we will
> analyze the cube data to refine the design of row key and to estimate the
> cost of query.
> 2.1 Analyze Hive Data
> We need to analyze the following statistics data on hive table:
> * Cardinality of each dimension
> * Cardinality of dimension combination (optional)
> * Value distribution of each dimension (optional)
> Based on the statistics of hive data, we can design row key group from high
> cardinality dimension to low cardinality dimension. BTW, we should evenly
> split dimension into the row key group that will reduce the number of cuboid.
> 2.2 Analyze Cube Data
> We need to analyze the following statistics on data cube:
> * Count of each cuboid
> * Group ratio of each cuboid = current cuboid count / lower group base cuboid
> count
> 3 Query Analyzer
> TBD
> Imported from GitHub
> Url: https://github.com/KylinOLAP/Kylin/issues/318
> Created by: [lukehan|https://github.com/lukehan]
> Labels: newfeature,
> Milestone: Backlog
> Created at: Fri Dec 26 15:21:24 CST 2014
> State: open
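The two calculations named in the issue description above (ordering row-key dimensions from high to low cardinality, and the group ratio of a cuboid) can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Kylin code: the function names order_row_key and group_ratio, the dimension names, and the counts are all made up, and "lower group base cuboid count" is taken simply as whatever denominator the caller supplies.

```python
from typing import Dict, List


def order_row_key(cardinality: Dict[str, int]) -> List[str]:
    """Return dimension names sorted from high to low cardinality."""
    return sorted(cardinality, key=lambda dim: cardinality[dim], reverse=True)


def group_ratio(cuboid_count: int, base_cuboid_count: int) -> float:
    """Group ratio = current cuboid count / lower group base cuboid count."""
    if base_cuboid_count <= 0:
        raise ValueError("base cuboid count must be positive")
    return cuboid_count / base_cuboid_count


if __name__ == "__main__":
    # Hypothetical cardinalities, as might be measured on a Hive table.
    dims = {"site": 50, "seller_id": 1_000_000, "category": 300}
    print(order_row_key(dims))
    # Hypothetical row counts for one cuboid and its base cuboid.
    print(group_ratio(250, 1000))
```

Under this reading, a ratio close to 1 would mean the cuboid aggregates little relative to its base cuboid, which is the kind of signal the proposal suggests using to refine the row-key design.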
[jira] [Updated] (KYLIN-187) Data Statistics Analyzer
[ https://issues.apache.org/jira/browse/KYLIN-187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roger Shi updated KYLIN-187:
    Description:
# Overview
We need the statistics data for the following domains:
* Design cube metadata based on query log
* Design HBase row-key based on data distribution (e.g. histogram and cardinality)
* Choose execution plan based on cuboid data
# Data Analyzer
We need to analyzer the hive data and cube data in 2 phases. Firstly, we will analyze the hive to guide the 1st round design of row key. Then we will analyze the cube data to refine the design of row key and to estimate the cost of query.
## Analyze Hive Data
We need to analyze the following statistics data on hive table:
* Cardinality of each dimension
* Cardinality of dimension combination (optional)
* Value distribution of each dimension (optional)
Based on the statistics of hive data, we can design row key group from high cardinality dimension to low cardinality dimension. BTW, we should evenly split dimension into the row key group that will reduce the number of cuboid.
## Analyze Cube Data
We need to analyze the following statistics on data cube:
* Count of each cuboid
* Group ratio of each cuboid = current cuboid count / lower group base cuboid count
# Query Analyzer
TBD
Imported from GitHub
Url: https://github.com/KylinOLAP/Kylin/issues/318
Created by: [lukehan|https://github.com/lukehan]
Labels: newfeature,
Milestone: Backlog
Created at: Fri Dec 26 15:21:24 CST 2014
State: open

  was:
## 1. Overview
We need the statistics data for the following domains:
* Design cube metadata based on query log
* Design HBase row-key based on data distribution (e.g. histogram and cardinality)
* Choose execution plan based on cuboid data
## 2. Data Analyzer
We need to analyzer the hive data and cube data in 2 phases. Firstly, we will analyze the hive to guide the 1st round design of row key. Then we will analyze the cube data to refine the design of row key and to estimate the cost of query.
2.1. Analyze Hive Data
We need to analyze the following statistics data on hive table:
* Cardinality of each dimension
* Cardinality of dimension combination (optional)
* Value distribution of each dimension (optional)
Based on the statistics of hive data, we can design row key group from high cardinality dimension to low cardinality dimension. BTW, we should evenly split dimension into the row key group that will reduce the number of cuboid.
2.2. Analyze Cube Data
We need to analyze the following statistics on data cube:
* Count of each cuboid
* Group ratio of each cuboid = current cuboid count / lower group base cuboid count
3. Query Analyzer
TBD
Imported from GitHub
Url: https://github.com/KylinOLAP/Kylin/issues/318
Created by: [lukehan|https://github.com/lukehan]
Labels: newfeature,
Milestone: Backlog
Created at: Fri Dec 26 15:21:24 CST 2014
State: open

> Data Statistics Analyzer
> ------------------------
>
> Key: KYLIN-187
> URL: https://issues.apache.org/jira/browse/KYLIN-187
> Project: Kylin
> Issue Type: New Feature
> Components: Tools, Build and Test
> Reporter: Luke Han
> Labels: github-import
> Fix For: Backlog
>
>
> # Overview
> We need the statistics data for the following domains:
> * Design cube metadata based on query log
> * Design HBase row-key based on data distribution (e.g. histogram and
> cardinality)
> * Choose execution plan based on cuboid data
> # Data Analyzer
> We need to analyzer the hive data and cube data in 2 phases. Firstly, we will
> analyze the hive to guide the 1st round design of row key. Then we will
> analyze the cube data to refine the design of row key and to estimate the
> cost of query.
> ## Analyze Hive Data
> We need to analyze the following statistics data on hive table:
> * Cardinality of each dimension
> * Cardinality of dimension combination (optional)
> * Value distribution of each dimension (optional)
> Based on the statistics of hive data, we can design row key group from high
> cardinality dimension to low cardinality dimension. BTW, we should evenly
> split dimension into the row key group that will reduce the number of cuboid.
> ## Analyze Cube Data
> We need to analyze the following statistics on data cube:
> * Count of each cuboid
> * Group ratio of each cuboid = current cuboid count / lower group base cuboid
> count
> # Query Analyzer
> TBD
> Imported from GitHub
> Url: https://github.com/KylinOLAP/Kylin/issues/318
> Created by: [lukehan|https://github.com/lukehan]
> Labels: newfeature,
> Milestone: Backlog
> Created at: Fri Dec 26 15:21:24 CST 2014
> State: open
[jira] [Updated] (KYLIN-187) Data Statistics Analyzer
[ https://issues.apache.org/jira/browse/KYLIN-187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roger Shi updated KYLIN-187:
    Request participants:   (was: )
    Description:
#Overview
We need the statistics data for the following domains:
* Design cube metadata based on query log
* Design HBase row-key based on data distribution (e.g. histogram and cardinality)
* Choose execution plan based on cuboid data
#Data Analyzer
We need to analyzer the hive data and cube data in 2 phases. Firstly, we will analyze the hive to guide the 1st round design of row key. Then we will analyze the cube data to refine the design of row key and to estimate the cost of query.
##Analyze Hive Data
We need to analyze the following statistics data on hive table:
* Cardinality of each dimension
* Cardinality of dimension combination (optional)
* Value distribution of each dimension (optional)
Based on the statistics of hive data, we can design row key group from high cardinality dimension to low cardinality dimension. BTW, we should evenly split dimension into the row key group that will reduce the number of cuboid.
##Analyze Cube Data
We need to analyze the following statistics on data cube:
* Count of each cuboid
* Group ratio of each cuboid = current cuboid count / lower group base cuboid count
# Query Analyzer
TBD
Imported from GitHub
Url: https://github.com/KylinOLAP/Kylin/issues/318
Created by: [lukehan|https://github.com/lukehan]
Labels: newfeature,
Milestone: Backlog
Created at: Fri Dec 26 15:21:24 CST 2014
State: open

  was:
## 1. Overview
We need the statistics data for the following domains:
* Design cube metadata based on query log
* Design HBase row-key based on data distribution (e.g. histogram and cardinality)
* Choose execution plan based on cuboid data
## 2. Data Analyzer
We need to analyzer the hive data and cube data in 2 phases. Firstly, we will analyze the hive to guide the 1st round design of row key. Then we will analyze the cube data to refine the design of row key and to estimate the cost of query.
2.1. Analyze Hive Data
We need to analyze the following statistics data on hive table:
* Cardinality of each dimension
* Cardinality of dimension combination (optional)
* Value distribution of each dimension (optional)
Based on the statistics of hive data, we can design row key group from high cardinality dimension to low cardinality dimension. BTW, we should evenly split dimension into the row key group that will reduce the number of cuboid.
2.2. Analyze Cube Data
We need to analyze the following statistics on data cube:
* Count of each cuboid
* Group ratio of each cuboid = current cuboid count / lower group base cuboid count
3. Query Analyzer
TBD
Imported from GitHub
Url: https://github.com/KylinOLAP/Kylin/issues/318
Created by: [lukehan|https://github.com/lukehan]
Labels: newfeature,
Milestone: Backlog
Created at: Fri Dec 26 15:21:24 CST 2014
State: open

> Data Statistics Analyzer
> ------------------------
>
> Key: KYLIN-187
> URL: https://issues.apache.org/jira/browse/KYLIN-187
> Project: Kylin
> Issue Type: New Feature
> Components: Tools, Build and Test
> Reporter: Luke Han
> Labels: github-import
> Fix For: Backlog
>
>
> #Overview
> We need the statistics data for the following domains:
> * Design cube metadata based on query log
> * Design HBase row-key based on data distribution (e.g. histogram and
> cardinality)
> * Choose execution plan based on cuboid data
> #Data Analyzer
> We need to analyzer the hive data and cube data in 2 phases. Firstly, we will
> analyze the hive to guide the 1st round design of row key. Then we will
> analyze the cube data to refine the design of row key and to estimate the
> cost of query.
> ##Analyze Hive Data
> We need to analyze the following statistics data on hive table:
> * Cardinality of each dimension
> * Cardinality of dimension combination (optional)
> * Value distribution of each dimension (optional)
> Based on the statistics of hive data, we can design row key group from high
> cardinality dimension to low cardinality dimension. BTW, we should evenly
> split dimension into the row key group that will reduce the number of cuboid.
> ##Analyze Cube Data
> We need to analyze the following statistics on data cube:
> * Count of each cuboid
> * Group ratio of each cuboid = current cuboid count / lower group base cuboid
> count
> # Query Analyzer
> TBD
> Imported from GitHub
> Url: https://github.com/KylinOLAP/Kylin/issues/318
> Created by: [lukehan|https://github.com/lukehan]
> Labels: newfeature,
> Milestone: Backlog
> Created at: Fri Dec 26 15:21:24 CST 2014
> State: open
[jira] [Updated] (KYLIN-187) Data Statistics Analyzer
[ https://issues.apache.org/jira/browse/KYLIN-187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roger Shi updated KYLIN-187:
    Description:
## 1. Overview
We need the statistics data for the following domains:
* Design cube metadata based on query log
* Design HBase row-key based on data distribution (e.g. histogram and cardinality)
* Choose execution plan based on cuboid data
## 2. Data Analyzer
We need to analyzer the hive data and cube data in 2 phases. Firstly, we will analyze the hive to guide the 1st round design of row key. Then we will analyze the cube data to refine the design of row key and to estimate the cost of query.
2.1. Analyze Hive Data
We need to analyze the following statistics data on hive table:
* Cardinality of each dimension
* Cardinality of dimension combination (optional)
* Value distribution of each dimension (optional)
Based on the statistics of hive data, we can design row key group from high cardinality dimension to low cardinality dimension. BTW, we should evenly split dimension into the row key group that will reduce the number of cuboid.
2.2. Analyze Cube Data
We need to analyze the following statistics on data cube:
* Count of each cuboid
* Group ratio of each cuboid = current cuboid count / lower group base cuboid count
3. Query Analyzer
TBD
Imported from GitHub
Url: https://github.com/KylinOLAP/Kylin/issues/318
Created by: [lukehan|https://github.com/lukehan]
Labels: newfeature,
Milestone: Backlog
Created at: Fri Dec 26 15:21:24 CST 2014
State: open

  was:
#Overview
We need the statistics data for the following domains:
* Design cube metadata based on query log
* Design HBase row-key based on data distribution (e.g. histogram and cardinality)
* Choose execution plan based on cuboid data
#Data Analyzer
We need to analyzer the hive data and cube data in 2 phases. Firstly, we will analyze the hive to guide the 1st round design of row key. Then we will analyze the cube data to refine the design of row key and to estimate the cost of query.
##Analyze Hive Data
We need to analyze the following statistics data on hive table:
* Cardinality of each dimension
* Cardinality of dimension combination (optional)
* Value distribution of each dimension (optional)
Based on the statistics of hive data, we can design row key group from high cardinality dimension to low cardinality dimension. BTW, we should evenly split dimension into the row key group that will reduce the number of cuboid.
##Analyze Cube Data
We need to analyze the following statistics on data cube:
* Count of each cuboid
* Group ratio of each cuboid = current cuboid count / lower group base cuboid count
# Query Analyzer
TBD
Imported from GitHub
Url: https://github.com/KylinOLAP/Kylin/issues/318
Created by: [lukehan|https://github.com/lukehan]
Labels: newfeature,
Milestone: Backlog
Created at: Fri Dec 26 15:21:24 CST 2014
State: open

> Data Statistics Analyzer
> ------------------------
>
> Key: KYLIN-187
> URL: https://issues.apache.org/jira/browse/KYLIN-187
> Project: Kylin
> Issue Type: New Feature
> Components: Tools, Build and Test
> Reporter: Luke Han
> Labels: github-import
> Fix For: Backlog
>
>
> ## 1. Overview
> We need the statistics data for the following domains:
> * Design cube metadata based on query log
> * Design HBase row-key based on data distribution (e.g. histogram and
> cardinality)
> * Choose execution plan based on cuboid data
> ## 2. Data Analyzer
> We need to analyzer the hive data and cube data in 2 phases. Firstly, we will
> analyze the hive to guide the 1st round design of row key. Then we will
> analyze the cube data to refine the design of row key and to estimate the
> cost of query.
> 2.1. Analyze Hive Data
> We need to analyze the following statistics data on hive table:
> * Cardinality of each dimension
> * Cardinality of dimension combination (optional)
> * Value distribution of each dimension (optional)
> Based on the statistics of hive data, we can design row key group from high
> cardinality dimension to low cardinality dimension. BTW, we should evenly
> split dimension into the row key group that will reduce the number of cuboid.
> 2.2. Analyze Cube Data
> We need to analyze the following statistics on data cube:
> * Count of each cuboid
> * Group ratio of each cuboid = current cuboid count / lower group base cuboid
> count
> 3. Query Analyzer
> TBD
> Imported from GitHub
> Url: https://github.com/KylinOLAP/Kylin/issues/318
> Created by: [lukehan|https://github.com/lukehan]
> Labels: newfeature,
> Milestone: Backlog
> Created at: Fri Dec 26 15:21:24 CST 2014
> State: open