add 1.6 document

Project: http://git-wip-us.apache.org/repos/asf/kylin/repo
Commit: http://git-wip-us.apache.org/repos/asf/kylin/commit/59913167
Tree: http://git-wip-us.apache.org/repos/asf/kylin/tree/59913167
Diff: http://git-wip-us.apache.org/repos/asf/kylin/diff/59913167

Branch: refs/heads/document
Commit: 59913167af89d72533d94b02aafeee3a94de7944
Parents: 6616266
Author: shaofengshi <[email protected]>
Authored: Wed Oct 26 11:57:38 2016 +0800
Committer: shaofengshi <[email protected]>
Committed: Wed Oct 26 11:58:17 2016 +0800

----------------------------------------------------------------------
 website/_config.yml                             |    2 +-
 .../_docs16/gettingstarted/best_practices.md    |   27 +
 website/_docs16/gettingstarted/concepts.md      |   64 +
 website/_docs16/gettingstarted/events.md        |   24 +
 website/_docs16/gettingstarted/faq.md           |  119 ++
 website/_docs16/gettingstarted/terminology.md   |   25 +
 website/_docs16/howto/howto_backup_metadata.md  |   60 +
 .../howto/howto_build_cube_with_restapi.md      |   53 +
 website/_docs16/howto/howto_cleanup_storage.md  |   22 +
 website/_docs16/howto/howto_jdbc.md             |   92 ++
 website/_docs16/howto/howto_ldap_and_sso.md     |  121 ++
 website/_docs16/howto/howto_optimize_cubes.md   |  212 +++
 .../_docs16/howto/howto_update_coprocessor.md   |   14 +
 website/_docs16/howto/howto_upgrade.md          |  157 +++
 website/_docs16/howto/howto_use_beeline.md      |   14 +
 website/_docs16/howto/howto_use_restapi.md      | 1066 +++++++++++++++
 .../_docs16/howto/howto_use_restapi_in_js.md    |   46 +
 website/_docs16/index.cn.md                     |   24 +
 website/_docs16/index.md                        |   52 +
 website/_docs16/install/advance_settings.md     |   92 ++
 website/_docs16/install/hadoop_evn.md           |   40 +
 website/_docs16/install/index.cn.md             |   46 +
 website/_docs16/install/index.md                |   35 +
 website/_docs16/install/kylin_cluster.md        |   32 +
 website/_docs16/install/kylin_docker.md         |   10 +
 .../_docs16/install/manual_install_guide.cn.md  |   48 +
 website/_docs16/release_notes.md                | 1214 ++++++++++++++++++
 website/_docs16/tutorial/acl.cn.md              |   35 +
 website/_docs16/tutorial/acl.md                 |   32 +
 website/_docs16/tutorial/create_cube.cn.md      |  129 ++
 website/_docs16/tutorial/create_cube.md         |  198 +++
 website/_docs16/tutorial/cube_build_job.cn.md   |   66 +
 website/_docs16/tutorial/cube_build_job.md      |   67 +
 website/_docs16/tutorial/cube_streaming.md      |  225 ++++
 .../_docs16/tutorial/kylin_client_tool.cn.md    |   97 ++
 website/_docs16/tutorial/kylin_sample.md        |   21 +
 website/_docs16/tutorial/odbc.cn.md             |   34 +
 website/_docs16/tutorial/odbc.md                |   49 +
 website/_docs16/tutorial/powerbi.cn.md          |   56 +
 website/_docs16/tutorial/powerbi.md             |   54 +
 website/_docs16/tutorial/tableau.cn.md          |  116 ++
 website/_docs16/tutorial/tableau.md             |  113 ++
 website/_docs16/tutorial/tableau_91.cn.md       |   51 +
 website/_docs16/tutorial/tableau_91.md          |   50 +
 website/_docs16/tutorial/web.cn.md              |  134 ++
 website/_docs16/tutorial/web.md                 |  123 ++
 website/_layouts/docs16-cn.html                 |   46 +
 website/_layouts/docs16.html                    |   50 +
 .../10_agg_group.png                            |  Bin 0 -> 134624 bytes
 .../Kylin-Cube-Streaming-Tutorial/11_Rowkey.png |  Bin 0 -> 186974 bytes
 .../12_overwrite.png                            |  Bin 0 -> 43343 bytes
 .../13_Query_result.png                         |  Bin 0 -> 83561 bytes
 .../1_Add_streaming_table.png                   |  Bin 0 -> 24514 bytes
 .../2_Define_streaming_table.png                |  Bin 0 -> 326958 bytes
 .../3_Kafka_setting.png                         |  Bin 0 -> 45511 bytes
 .../3_Paser_setting.png                         |  Bin 0 -> 64675 bytes
 .../3_Paser_time.png                            |  Bin 0 -> 55982 bytes
 .../4_Streaming_table.png                       |  Bin 0 -> 86278 bytes
 .../5_Data_model_dimension.png                  |  Bin 0 -> 81626 bytes
 .../6_Data_model_measure.png                    |  Bin 0 -> 33630 bytes
 .../7_Data_model_partition.png                  |  Bin 0 -> 53840 bytes
 .../8_Cube_dimension.png                        |  Bin 0 -> 197656 bytes
 .../9_Cube_measure.png                          |  Bin 0 -> 75748 bytes
 63 files changed, 5456 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kylin/blob/59913167/website/_config.yml
----------------------------------------------------------------------
diff --git a/website/_config.yml b/website/_config.yml
index 40b1dbf..dc84757 100644
--- a/website/_config.yml
+++ b/website/_config.yml
@@ -27,7 +27,7 @@ encoding: UTF-8
 timezone: America/Dawson 
 
 exclude: ["README.md", "Rakefile", "*.scss", "*.haml", "*.sh"]
-include: [_docs,_docs15,_dev]
+include: [_docs,_docs15,_docs16,_dev]
 
 # Build settings
 markdown: kramdown

http://git-wip-us.apache.org/repos/asf/kylin/blob/59913167/website/_docs16/gettingstarted/best_practices.md
----------------------------------------------------------------------
diff --git a/website/_docs16/gettingstarted/best_practices.md 
b/website/_docs16/gettingstarted/best_practices.md
new file mode 100644
index 0000000..5c3a12d
--- /dev/null
+++ b/website/_docs16/gettingstarted/best_practices.md
@@ -0,0 +1,27 @@
+---
+layout: docs16
+title:  "Community Best Practices"
+categories: gettingstarted
+permalink: /docs16/gettingstarted/best_practices.html
+since: v1.3.x
+---
+
+A list of articles about Kylin best practices contributed by the community. Some 
of them are from the Chinese community. Many thanks!
+
+* [Apache 
Kylin在百度地图的实践](http://www.infoq.com/cn/articles/practis-of-apache-kylin-in-baidu-map)
+
+* [Apache Kylin 
大数据时代的OLAP利器](http://www.bitstech.net/2016/01/04/kylin-olap/)(网易案例)
+
+* [Apache 
Kylin在云海的实践](http://www.csdn.net/article/2015-11-27/2826343)(京东案例)
+
+* [Kylin, Mondrian, 
Saiku系统的整合](http://tech.youzan.com/kylin-mondrian-saiku/)(有赞案例)
+
+* [Big Data MDX with Mondrian and Apache 
Kylin](https://www.inovex.de/fileadmin/files/Vortraege/2015/big-data-mdx-with-mondrian-and-apache-kylin-sebastien-jelsch-pcm-11-2015.pdf)
+
+* [Kylin and Mondrian 
Interaction](https://github.com/mustangore/kylin-mondrian-interaction) (Thanks 
to [mustangore](https://github.com/mustangore))
+
+* [Kylin And Tableau 
Tutorial](https://github.com/albertoRamon/Kylin/tree/master/KylinWithTableau) 
(Thanks to [Ramón Portolés, 
Alberto](https://www.linkedin.com/in/alberto-ramon-portoles-a02b523b))
+
+* [Kylin and Qlik 
Integration](https://github.com/albertoRamon/Kylin/tree/master/KylinWithQlik) 
(Thanks to [Ramón Portolés, 
Alberto](https://www.linkedin.com/in/alberto-ramon-portoles-a02b523b))
+
+* [How to use Hue with 
Kylin](https://github.com/albertoRamon/Kylin/tree/master/KylinWithHue) (Thanks 
to [Ramón Portolés, 
Alberto](https://www.linkedin.com/in/alberto-ramon-portoles-a02b523b))
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/kylin/blob/59913167/website/_docs16/gettingstarted/concepts.md
----------------------------------------------------------------------
diff --git a/website/_docs16/gettingstarted/concepts.md 
b/website/_docs16/gettingstarted/concepts.md
new file mode 100644
index 0000000..cf5ce07
--- /dev/null
+++ b/website/_docs16/gettingstarted/concepts.md
@@ -0,0 +1,64 @@
+---
+layout: docs16
+title:  "Technical Concepts"
+categories: gettingstarted
+permalink: /docs16/gettingstarted/concepts.html
+since: v1.2
+---
+ 
+Here are some basic technical concepts used in Apache Kylin; please refer to 
them as needed.
+For domain terminology, please refer to: [Terminology](terminology.html)
+
+## CUBE
+* __Table__ - This is the definition of Hive tables as the source of cubes; 
tables must be synced before building cubes.
+![](/images/docs/concepts/DataSource.png)
+
+* __Data Model__ - This describes a [STAR 
SCHEMA](https://en.wikipedia.org/wiki/Star_schema) data model, which defines 
fact/lookup tables and filter conditions.
+![](/images/docs/concepts/DataModel.png)
+
+* __Cube Descriptor__ - This describes the definition and settings of a cube 
instance: which data model to use, what dimensions and measures it has, how to 
partition data into segments, how to handle auto-merge, etc.
+![](/images/docs/concepts/CubeDesc.png)
+
+* __Cube Instance__ - This is an instance of a cube, built from one cube 
descriptor, consisting of one or more cube segments according to partition 
settings.
+![](/images/docs/concepts/CubeInstance.png)
+
+* __Partition__ - Users can define a DATE/STRING column as the partition 
column in a cube descriptor, to separate one cube into several segments 
covering different date periods.
+![](/images/docs/concepts/Partition.png)
+
+* __Cube Segment__ - This is the actual carrier of cube data, and maps to an 
HTable in HBase. One building job creates one new segment for the cube 
instance. Once data changes in a specific period, we can refresh the related 
segments to avoid rebuilding the whole cube.
+![](/images/docs/concepts/CubeSegment.png)
+
+* __Aggregation Group__ - Each aggregation group is a subset of dimensions; 
cuboids are built from the combinations inside it. It aims at pruning cuboids 
for optimization.
+![](/images/docs/concepts/AggregationGroup.png)
+
+## DIMENSION & MEASURE
+* __Mandatory__ - This dimension type is used for cuboid pruning: if a 
dimension is specified as “mandatory”, combinations without that dimension 
are pruned.
+* __Hierarchy__ - This dimension type is used for cuboid pruning: if 
dimensions A, B, C form a “hierarchy” relation, then only combinations with 
A, AB or ABC remain. 
+* __Derived__ - On lookup tables, some dimensions can be derived from the 
primary key, since there is a specific mapping between them and the foreign key 
of the fact table. These dimensions are DERIVED and do not participate in 
cuboid generation.
+![](/images/docs/concepts/Dimension.png)
+
+* __Count Distinct(HyperLogLog)__ - Exact COUNT DISTINCT is expensive to 
calculate, so an approximate algorithm, 
[HyperLogLog](https://en.wikipedia.org/wiki/HyperLogLog), is introduced to keep 
the error rate at a low level. 
+* __Count Distinct(Precise)__ - Precise COUNT DISTINCT is pre-calculated 
based on RoaringBitmap; currently only int and bigint are supported.
+* __Top N__ - For example, with this measure type, users can easily get a 
specified number of top sellers/buyers, etc. 
+![](/images/docs/concepts/Measure.png)
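As an illustration of the approximate COUNT DISTINCT measure above, here is a minimal HyperLogLog estimator in Python. This is a sketch only, not Kylin's actual implementation (which differs in hashing and bias corrections):

```python
import hashlib
import math

def hll_estimate(values, b=10):
    """Basic HyperLogLog estimator with m = 2^b registers (illustration only,
    not Kylin's implementation)."""
    m = 1 << b
    registers = [0] * m
    for v in values:
        # derive a deterministic 64-bit hash of the value
        h = int.from_bytes(hashlib.md5(str(v).encode()).digest()[:8], "big")
        idx = h & (m - 1)                     # low b bits select a register
        w = h >> b                            # remaining 64-b bits
        rank = (64 - b) - w.bit_length() + 1  # leading zeros + 1
        registers[idx] = max(registers[idx], rank)
    alpha = 0.7213 / (1 + 1.079 / m)
    estimate = alpha * m * m / sum(2.0 ** -r for r in registers)
    zeros = registers.count(0)
    if estimate <= 2.5 * m and zeros:  # small-range (linear counting) correction
        estimate = m * math.log(m / zeros)
    return int(estimate)
```

With m = 1024 registers the standard error is roughly 3%, which is the "keep error rate at a low level" trade-off described above: constant memory per measure in exchange for an approximate count.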
+
+## CUBE ACTIONS
+* __BUILD__ - Given an interval of the partition column, this action builds a 
new cube segment.
+* __REFRESH__ - This action rebuilds the cube segment of a certain partition 
period; it is used when the source data of that period has changed.
+* __MERGE__ - This action merges multiple contiguous cube segments into a 
single one. It can be automated with auto-merge settings in the cube 
descriptor.
+* __PURGE__ - This clears all segments under a cube instance. It only updates 
metadata, and won't delete the cube data from HBase.
+![](/images/docs/concepts/CubeAction.png)
+
+## JOB STATUS
+* __NEW__ - This denotes a job that has just been created.
+* __PENDING__ - This denotes a job that is paused by the job scheduler and is 
waiting for resources.
+* __RUNNING__ - This denotes a job that is in progress.
+* __FINISHED__ - This denotes a job that has finished successfully.
+* __ERROR__ - This denotes a job that was aborted with errors.
+* __DISCARDED__ - This denotes a job that was cancelled by the end user.
+![](/images/docs/concepts/Job.png)
+
+## JOB ACTION
+* __RESUME__ - Once a job is in ERROR status, this action tries to resume it 
from the latest successful point.
+* __DISCARD__ - No matter what status a job is in, the user can end it and 
release resources with the DISCARD action.
+![](/images/docs/concepts/JobAction.png)

http://git-wip-us.apache.org/repos/asf/kylin/blob/59913167/website/_docs16/gettingstarted/events.md
----------------------------------------------------------------------
diff --git a/website/_docs16/gettingstarted/events.md 
b/website/_docs16/gettingstarted/events.md
new file mode 100644
index 0000000..277d580
--- /dev/null
+++ b/website/_docs16/gettingstarted/events.md
@@ -0,0 +1,24 @@
+---
+layout: docs16
+title:  "Events and Conferences"
+categories: gettingstarted
+permalink: /docs16/gettingstarted/events.html
+---
+
+__Conferences__
+
+* [The Evolution of Apache Kylin: Realtime and Plugin Architecture in 
Kylin](https://www.youtube.com/watch?v=n74zvLmIgF0)([slides](http://www.slideshare.net/YangLi43/apache-kylin-15-updates))
 by [Li Yang](https://github.com/liyang-gmt8), at [Hadoop Summit 2016 
Dublin](http://hadoopsummit.org/dublin/agenda/), Ireland, 2016-04-14
+* [Apache Kylin - Balance Between Space and 
Time](http://www.chinahadoop.com/2015/July/Shanghai/agenda.php) 
([slides](http://www.slideshare.net/qhzhou/apache-kylin-china-hadoop-summit-2015-shanghai))
 by [Qianhao Zhou](https://github.com/qhzhou), at Hadoop Summit 2015 in 
Shanghai, China, 2015-07-24
+* [Apache Kylin - Balance Between Space and 
Time](https://www.youtube.com/watch?v=jgvZSFaXPgI) 
([video](https://www.youtube.com/watch?v=jgvZSFaXPgI), 
[slides](http://www.slideshare.net/DebashisSaha/apache-kylin-balance-between-space-and-time-hadop-summit-2015))
 by [Debashis Saha](https://twitter.com/debashis_saha) & [Luke 
Han](https://twitter.com/lukehq), at Hadoop Summit 2015 in San Jose, US, 
2015-06-09
+* [HBaseCon 2015: Apache Kylin; Extreme OLAP Engine for 
Hadoop](https://vimeo.com/128152444) ([video](https://vimeo.com/128152444), 
[slides](http://www.slideshare.net/HBaseCon/ecosystem-session-3b)) by [Seshu 
Adunuthula](https://twitter.com/SeshuAd) at HBaseCon 2015 in San Francisco, US, 
2015-05-07
+* [Apache Kylin - Extreme OLAP Engine for 
Hadoop](http://strataconf.com/big-data-conference-uk-2015/public/schedule/detail/40029)
 
([slides](http://www.slideshare.net/lukehan/apache-kylin-extreme-olap-engine-for-big-data))
 by [Luke Han](https://twitter.com/lukehq) & [Yang 
Li](https://github.com/liyang-gmt8), at Strata+Hadoop World in London, UK, 
2015-05-06
+* [Apache Kylin Open Source 
Journey](http://www.infoq.com/cn/presentations/open-source-journey-of-apache-kylin)
 
([slides](http://www.slideshare.net/lukehan/apache-kylin-open-source-journey-for-qcon2015-beijing))
 by [Luke Han](https://twitter.com/lukehq), at QCon Beijing in Beijing, China, 
2015-04-23
+* [Apache Kylin - OLAP on 
Hadoop](http://cio.it168.com/a2015/0418/1721/000001721404.shtml) by [Yang 
Li](https://github.com/liyang-gmt8), at Database Technology Conference China 
2015 in Beijing, China, 2015-04-18
+* [Apache Kylin – Cubes on 
Hadoop](https://www.youtube.com/watch?v=U0SbrVzuOe4) 
([video](https://www.youtube.com/watch?v=U0SbrVzuOe4), 
[slides](http://www.slideshare.net/Hadoop_Summit/apache-kylin-cubes-on-hadoop)) 
by [Ted Dunning](https://twitter.com/ted_dunning), at Hadoop Summit 2015 Europe 
in Brussels, Belgium, 2015-04-16
+* [Apache Kylin - Hadoop 
上的大规模联机分析平台](http://bdtc2014.hadooper.cn/m/zone/bdtc_2014/schedule3)
 
([slides](http://www.slideshare.net/lukehan/apache-kylin-big-data-technology-conference-2014-beijing-v2))
 by [Luke Han](https://twitter.com/lukehq), at Big Data Technology Conference 
China in Beijing, China, 2014-12-14
+* [Apache Kylin: OLAP Engine on Hadoop - Tech Deep 
Dive](http://v.csdn.hudong.com/s/article.html?arcid=15820707) 
([video](http://v.csdn.hudong.com/s/article.html?arcid=15820707), 
[slides](http://www.slideshare.net/XuJiang2/kylin-hadoop-olap-engine)) by 
[Jiang Xu](https://www.linkedin.com/pub/xu-jiang/4/5a8/230), at Shanghai Big 
Data Summit 2014 in Shanghai, China , 2014-10-25
+
+__Meetup__
+
+* [Apache Kylin Meetup @Bay 
Area](http://www.meetup.com/Cloud-at-ebayinc/events/218914395/), in San Jose, 
US, 6:00PM - 7:30PM, Thursday, 2014-12-04
+

http://git-wip-us.apache.org/repos/asf/kylin/blob/59913167/website/_docs16/gettingstarted/faq.md
----------------------------------------------------------------------
diff --git a/website/_docs16/gettingstarted/faq.md 
b/website/_docs16/gettingstarted/faq.md
new file mode 100644
index 0000000..0ecb44e
--- /dev/null
+++ b/website/_docs16/gettingstarted/faq.md
@@ -0,0 +1,119 @@
+---
+layout: docs16
+title:  "FAQ"
+categories: gettingstarted
+permalink: /docs16/gettingstarted/faq.html
+since: v0.6.x
+---
+
+#### 1. "bin/find-hive-dependency.sh" can locate hive/hcat jars locally, but 
Kylin reports an error like "java.lang.NoClassDefFoundError: 
org/apache/hive/hcatalog/mapreduce/HCatInputFormat"
+
+  * Kylin needs many dependent jars (hadoop/hive/hcat/hbase/kafka) on the 
classpath to work, but Kylin doesn't ship them. It seeks these jars from your 
local machine by running commands like `hbase classpath`, `hive -e set`, etc. 
The found jars' paths are appended to the environment variable 
*HBASE_CLASSPATH* (Kylin uses the `hbase` shell command to start up, which 
reads this). But in some Hadoop distributions (like EMR 5.0), the `hbase` shell 
doesn't keep the origin `HBASE_CLASSPATH` value, which causes the 
"NoClassDefFoundError".
+
+  * To fix this, find the hbase shell script (in the hbase/bin folder), 
search for *HBASE_CLASSPATH*, and check whether it overwrites the value, like:
+
+  {% highlight Groff markup %}
+  export 
HBASE_CLASSPATH=$HADOOP_CONF:$HADOOP_HOME/*:$HADOOP_HOME/lib/*:$ZOOKEEPER_HOME/*:$ZOOKEEPER_HOME/lib/*
+  {% endhighlight %}
+
+  * If so, change it to preserve the original value, like:
+
+   {% highlight Groff markup %}
+  export 
HBASE_CLASSPATH=$HADOOP_CONF:$HADOOP_HOME/*:$HADOOP_HOME/lib/*:$ZOOKEEPER_HOME/*:$ZOOKEEPER_HOME/lib/*:$HBASE_CLASSPATH
+  {% endhighlight %}
+
+#### 2. Get "java.lang.IllegalArgumentException: Too high cardinality is not 
suitable for dictionary -- cardinality: 5220674" in "Build Dimension 
Dictionary" step
+
+  * Kylin uses "Dictionary" encoding to encode/decode the dimension values 
(check [this blog](/blog/2015/08/13/kylin-dictionary/)). Usually a dimension's 
cardinality is below a million, so the "Dict" encoding is good to use. As the 
dictionary needs to be persisted and loaded into memory, a dimension with very 
high cardinality would have a tremendous memory footprint, so Kylin adds a 
check on this. If you see this error, first identify the ultra-high-cardinality 
(UHC) dimension and re-evaluate the design (does it need to be a dimension?). 
If you must keep it, you can bypass this error in a couple of ways: 1) change 
to another encoding (like `fixed_length`, `integer`), or 2) set a bigger value 
for `kylin.dictionary.max.cardinality` in `conf/kylin.properties`.
+
+#### 3. Build cube failed due to "error check status"
+
+  * Check if `kylin.log` contains 
*yarn.resourcemanager.webapp.address:http://0.0.0.0:8088* and 
*java.net.ConnectException: Connection refused*
+  * If yes, the problem is that the address of the resource manager is not 
available in yarn-site.xml
+  * A workaround is to update `kylin.properties` and set 
`kylin.job.yarn.app.rest.check.status.url=http://YOUR_RM_NODE:8088/ws/v1/cluster/apps/${job_id}?anonymous=true`
+
+#### 4. HBase cannot get master address from ZooKeeper on Hortonworks Sandbox
+   
+  * By default Hortonworks disables HBase; you'll have to start HBase from 
the Ambari homepage first.
+
+#### 5. Map Reduce Job information cannot display on Hortonworks Sandbox
+   
+  * Check out 
[https://github.com/KylinOLAP/Kylin/issues/40](https://github.com/KylinOLAP/Kylin/issues/40)
+
+#### 6. How to Install Kylin on CDH 5.2 or Hadoop 2.5.x
+
+  * Check out discussion: 
[https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/kylin-olap/X0GZfsX1jLc/nzs6xAhNpLkJ](https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/kylin-olap/X0GZfsX1jLc/nzs6xAhNpLkJ)
+
+  {% highlight Groff markup %}
+  I was able to deploy Kylin with following option in POM.
+  <hadoop2.version>2.5.0</hadoop2.version>
+  <yarn.version>2.5.0</yarn.version>
+  <hbase-hadoop2.version>0.98.6-hadoop2</hbase-hadoop2.version>
+  <zookeeper.version>3.4.5</zookeeper.version>
+  <hive.version>0.13.1</hive.version>
+  My Cluster is running on Cloudera Distribution CDH 5.2.0.
+  {% endhighlight %}
+
+
+#### 7. SUM(field) returns a negative result while all the numbers in this 
field are > 0
+  * If a column is declared as integer in Hive, the SQL engine (Calcite) will 
use the column's type (integer) as the data type for "SUM(field)", while the 
aggregated value on this field may exceed the range of integer; in that case 
the cast will cause a negative value to be returned. The workaround is to alter 
that column's type to BIGINT in Hive, and then sync the table schema to Kylin 
(the cube doesn't need a rebuild). Keep in mind: always declare an integer 
column as BIGINT in Hive if it will be used as a measure in Kylin. See Hive 
numeric types: 
[https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-NumericTypes](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-NumericTypes)
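The overflow in FAQ #7 can be reproduced outside Kylin. A minimal sketch (hypothetical helper, for illustration only) simulating 32-bit signed accumulation, the way a Java/Calcite `int` SUM behaves:

```python
def sum_as_int32(values):
    """Accumulate with 32-bit signed wrap-around semantics, mimicking how an
    `int`-typed SUM overflows (illustration, not Kylin/Calcite code)."""
    total = 0
    for v in values:
        total = (total + v) & 0xFFFFFFFF          # keep only the low 32 bits
    # reinterpret the 32-bit pattern as a signed integer
    return total - (1 << 32) if total >= (1 << 31) else total
```

Three positive values of 2^30 already exceed INT's maximum of 2^31 - 1 and come back negative, which is exactly the symptom described; declaring the column as BIGINT avoids the wrap-around.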
+
+#### 8. Why does Kylin need to extract the distinct columns from the fact 
table before building the cube?
+  * Kylin uses a dictionary to encode the values in each column; this greatly 
reduces the cube's storage size. To build the dictionary, Kylin needs to fetch 
the distinct values of each column.
+
+#### 9. Why does Kylin calculate the Hive table cardinality?
+  * The cardinality of dimensions is an important measure of cube complexity. 
The higher the cardinality, the bigger the cube, and thus the longer it takes 
to build and the slower it is to query. Cardinality > 1,000 is worth attention 
and > 1,000,000 should be avoided as much as possible. For optimal cube 
performance, try to reduce high cardinality by categorizing values or deriving 
features.
+
+#### 10. How to add a new user or change the default password?
+  * Kylin web's security is implemented with the Spring Security framework, 
where kylinSecurity.xml is the main configuration file:
+
+   {% highlight Groff markup %}
+   ${KYLIN_HOME}/tomcat/webapps/kylin/WEB-INF/classes/kylinSecurity.xml
+   {% endhighlight %}
+
+  * The password hashes for the pre-defined test users can be found in the 
"sandbox,testing" profile part. To change the default password, you need to 
generate a new hash and update it there; please refer to the code snippet in: 
[https://stackoverflow.com/questions/25844419/spring-bcryptpasswordencoder-generate-different-password-for-same-input](https://stackoverflow.com/questions/25844419/spring-bcryptpasswordencoder-generate-different-password-for-same-input)
+  * When you deploy Kylin for more users, switching to LDAP authentication is 
recommended.
+
+#### 11. Using sub-query for un-supported SQL
+
+{% highlight Groff markup %}
+Original SQL:
+select fact.slr_sgmt,
+sum(case when cal.RTL_WEEK_BEG_DT = '2015-09-06' then gmv else 0 end) as W36,
+sum(case when cal.RTL_WEEK_BEG_DT = '2015-08-30' then gmv else 0 end) as W35
+from ih_daily_fact fact
+inner join dw_cal_dt cal on fact.cal_dt = cal.cal_dt
+group by fact.slr_sgmt
+{% endhighlight %}
+
+{% highlight Groff markup %}
+Using sub-query
+select a.slr_sgmt,
+sum(case when a.RTL_WEEK_BEG_DT = '2015-09-06' then gmv else 0 end) as W36,
+sum(case when a.RTL_WEEK_BEG_DT = '2015-08-30' then gmv else 0 end) as W35
+from (
+    select fact.slr_sgmt as slr_sgmt,
+    cal.RTL_WEEK_BEG_DT as RTL_WEEK_BEG_DT,
+    sum(gmv) as gmv36,
+    sum(gmv) as gmv35
+    from ih_daily_fact fact
+    inner join dw_cal_dt cal on fact.cal_dt = cal.cal_dt
+    group by fact.slr_sgmt, cal.RTL_WEEK_BEG_DT
+) a
+group by a.slr_sgmt
+{% endhighlight %}
+
+#### 12. Build kylin meet NPM errors 
(中国大陆地区用户请特别注意此问题)
+
+  * Please add a proxy for your NPM:  
+  `npm config set proxy http://YOUR_PROXY_IP`
+
+  * Please update your local NPM repository to use a mirror of npmjs.org, 
such as Taobao NPM (请更新您本地的NPM仓库以使用国内
的NPM镜像,例如淘宝NPM镜像) :  
+  [http://npm.taobao.org](http://npm.taobao.org)
+
+#### 13. Failed to run BuildCubeWithEngineTest, saying failed to connect to 
HBase while HBase is active
+  * You may get this error when running an HBase client for the first time. 
Check the error trace to see whether it says a folder like 
"/hadoop/hbase/local/jars" couldn't be accessed; if that folder doesn't 
exist, create it.
+
+
+
+

http://git-wip-us.apache.org/repos/asf/kylin/blob/59913167/website/_docs16/gettingstarted/terminology.md
----------------------------------------------------------------------
diff --git a/website/_docs16/gettingstarted/terminology.md 
b/website/_docs16/gettingstarted/terminology.md
new file mode 100644
index 0000000..1fad135
--- /dev/null
+++ b/website/_docs16/gettingstarted/terminology.md
@@ -0,0 +1,25 @@
+---
+layout: docs16
+title:  "Terminology"
+categories: gettingstarted
+permalink: /docs16/gettingstarted/terminology.html
+since: v0.5.x
+---
+ 
+
+Here are some domain terms used in Apache Kylin, for your reference.   
+They are basic knowledge of Apache Kylin, and will also help you understand 
related concepts, terms and theory of Data Warehousing and Business 
Intelligence for analytics. 
+
+* __Data Warehouse__: a data warehouse (DW or DWH), also known as an 
enterprise data warehouse (EDW), is a system used for reporting and data 
analysis, [wikipedia](https://en.wikipedia.org/wiki/Data_warehouse)
+* __Business Intelligence__: Business intelligence (BI) is the set of 
techniques and tools for the transformation of raw data into meaningful and 
useful information for business analysis purposes, 
[wikipedia](https://en.wikipedia.org/wiki/Business_intelligence)
+* __OLAP__: OLAP is an acronym for [online analytical 
processing](https://en.wikipedia.org/wiki/Online_analytical_processing)
+* __OLAP Cube__: an OLAP cube is an array of data understood in terms of its 0 
or more dimensions, [wikipedia](http://en.wikipedia.org/wiki/OLAP_cube)
+* __Star Schema__: the star schema consists of one or more fact tables 
referencing any number of dimension tables, 
[wikipedia](https://en.wikipedia.org/wiki/Star_schema)
+* __Fact Table__: a Fact table consists of the measurements, metrics or facts 
of a business process, [wikipedia](https://en.wikipedia.org/wiki/Fact_table)
+* __Lookup Table__: a lookup table is an array that replaces runtime 
computation with a simpler array indexing operation, 
[wikipedia](https://en.wikipedia.org/wiki/Lookup_table)
+* __Dimension__: A dimension is a structure that categorizes facts and 
measures in order to enable users to answer business questions. Commonly used 
dimensions are people, products, place and time, 
[wikipedia](https://en.wikipedia.org/wiki/Dimension_(data_warehouse))
+* __Measure__: a measure is a property on which calculations (e.g., sum, 
count, average, minimum, maximum) can be made, 
[wikipedia](https://en.wikipedia.org/wiki/Measure_(data_warehouse))
+* __Join__: a SQL join clause combines records from two or more tables in a 
relational database, [wikipedia](https://en.wikipedia.org/wiki/Join_(SQL))
+
+
+

http://git-wip-us.apache.org/repos/asf/kylin/blob/59913167/website/_docs16/howto/howto_backup_metadata.md
----------------------------------------------------------------------
diff --git a/website/_docs16/howto/howto_backup_metadata.md 
b/website/_docs16/howto/howto_backup_metadata.md
new file mode 100644
index 0000000..0d295aa
--- /dev/null
+++ b/website/_docs16/howto/howto_backup_metadata.md
@@ -0,0 +1,60 @@
+---
+layout: docs16
+title:  Backup Metadata
+categories: howto
+permalink: /docs16/howto/howto_backup_metadata.html
+---
+
+Kylin organizes all of its metadata (including cube descriptions and 
instances, projects, inverted index descriptions and instances, jobs, tables 
and dictionaries) as a hierarchical file system. However, Kylin uses HBase to 
store it, rather than a normal file system. If you check your Kylin 
configuration file (kylin.properties) you will find such a line:
+
+{% highlight Groff markup %}
+## The metadata store in hbase
+kylin.metadata.url=kylin_metadata@hbase
+{% endhighlight %}
+
+This indicates that the metadata is saved in an HTable called 
`kylin_metadata`. You can scan the htable in the hbase shell to check it out.
+
+## Backup Metadata Store with binary package
+
+Sometimes you need to back up Kylin's metadata store from HBase to your local 
file system.
+In such cases, assuming you're on the Hadoop CLI (or sandbox) where you 
deployed Kylin, you can go to KYLIN_HOME and run:
+
+{% highlight Groff markup %}
+./bin/metastore.sh backup
+{% endhighlight %}
+
+to dump your metadata to a local folder under KYLIN_HOME/meta_backups; the 
folder is named after the current time with the syntax: 
KYLIN_HOME/meta_backups/meta_year_month_day_hour_minute_second
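The naming convention above can be sketched as a small helper. This is a hypothetical illustration mirroring the documented pattern, not part of metastore.sh:

```python
import os
from datetime import datetime

def backup_dir(kylin_home, now=None):
    """Build a backup folder path following the documented pattern:
    KYLIN_HOME/meta_backups/meta_year_month_day_hour_minute_second."""
    now = now or datetime.now()
    return os.path.join(kylin_home, "meta_backups",
                        now.strftime("meta_%Y_%m_%d_%H_%M_%S"))
```

For example, a backup taken at 2016-10-26 11:57:38 would land in KYLIN_HOME/meta_backups/meta_2016_10_26_11_57_38.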
+
+## Restore Metadata Store with binary package
+
+In case you find your metadata store messed up, and you want to restore to a 
previous backup:
+
+First, reset the metadata store (this will clean everything in the Kylin 
metadata store in HBase, so make sure you have a backup):
+
+{% highlight Groff markup %}
+./bin/metastore.sh reset
+{% endhighlight %}
+
+Then upload the backup metadata to Kylin's metadata store:
+{% highlight Groff markup %}
+./bin/metastore.sh restore $KYLIN_HOME/meta_backups/meta_xxxx_xx_xx_xx_xx_xx
+{% endhighlight %}
+
+## Backup/restore metadata in development env (available since 0.7.3)
+
+When developing/debugging Kylin, you typically have a dev machine with an IDE 
and a backend sandbox. Usually you'll write code and run test cases on the dev 
machine. It would be troublesome if you always had to put a binary package in 
the sandbox to check the metadata. There is a helper class called 
SandboxMetastoreCLI to help you download/upload metadata locally on your dev 
machine. Follow its usage information and run it in your IDE.
+
+## Cleanup unused resources from Metadata Store (available since 0.7.3)
+As time goes on, some resources like dictionaries and table snapshots become 
useless (as cube segments are dropped or merged), but they still take up 
space; you can run a command to find and clean them up from the metadata store:
+
+First, run a check; this is safe as it will not change anything:
+{% highlight Groff markup %}
+./bin/metastore.sh clean
+{% endhighlight %}
+
+The resources that will be dropped will be listed;
+
+Next, add the "--delete true" parameter to clean up those resources; before 
doing this, make sure you have made a backup of the metadata store;
+{% highlight Groff markup %}
+./bin/metastore.sh clean --delete true
+{% endhighlight %}

http://git-wip-us.apache.org/repos/asf/kylin/blob/59913167/website/_docs16/howto/howto_build_cube_with_restapi.md
----------------------------------------------------------------------
diff --git a/website/_docs16/howto/howto_build_cube_with_restapi.md 
b/website/_docs16/howto/howto_build_cube_with_restapi.md
new file mode 100644
index 0000000..0ccd486
--- /dev/null
+++ b/website/_docs16/howto/howto_build_cube_with_restapi.md
@@ -0,0 +1,53 @@
+---
+layout: docs16
+title:  Build Cube with RESTful API
+categories: howto
+permalink: /docs16/howto/howto_build_cube_with_restapi.html
+---
+
+### 1. Authentication
+*   Currently, Kylin uses [basic 
authentication](http://en.wikipedia.org/wiki/Basic_access_authentication).
+*   Add the `Authorization` header to the first request for authentication
+*   Or you can do a specific request to `POST 
http://localhost:7070/kylin/api/user/authentication`
+*   Once authenticated, the client can issue subsequent requests with cookies.
+{% highlight Groff markup %}
+POST http://localhost:7070/kylin/api/user/authentication
+    
+Authorization:Basic xxxxJD124xxxGFxxxSDF
+Content-Type: application/json;charset=UTF-8
+{% endhighlight %}
+
+### 2. Get details of the cube. 
+*   `GET 
http://localhost:7070/kylin/api/cubes?cubeName={cube_name}&limit=15&offset=0`
+*   The client can find the cube segment date ranges in the returned cube detail.
+{% highlight Groff markup %}
+GET 
http://localhost:7070/kylin/api/cubes?cubeName=test_kylin_cube_with_slr&limit=15&offset=0
+
+Authorization:Basic xxxxJD124xxxGFxxxSDF
+Content-Type: application/json;charset=UTF-8
+{% endhighlight %}
+### 3. Submit a build job for the cube. 
+*   `PUT http://localhost:7070/kylin/api/cubes/{cube_name}/rebuild`
+*   For the PUT request body details please refer to the [Build Cube API](howto_use_restapi.html#build-cube). 
+    *   `startTime` and `endTime` should be UTC timestamps.
+    *   `buildType` can be `BUILD`, `MERGE` or `REFRESH`. `BUILD` is for building a new segment, `REFRESH` for refreshing an existing segment, and `MERGE` for merging multiple existing segments into one bigger segment.
+*   This method returns a newly created job instance, whose `uuid` is the unique id of the job, used to track job status.
+{% highlight Groff markup %}
+PUT http://localhost:7070/kylin/api/cubes/test_kylin_cube_with_slr/rebuild
+
+Authorization:Basic xxxxJD124xxxGFxxxSDF
+Content-Type: application/json;charset=UTF-8
+    
+{
+    "startTime": 0,
+    "endTime": 1388563200000,
+    "buildType": "BUILD"
+}
+{% endhighlight %}
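Since `startTime` and `endTime` are UTC timestamps in milliseconds, a quick conversion sketch (illustration only; the `1388563200000` value from the example above corresponds to 2014-01-01 08:00:00 UTC):

```python
from datetime import datetime, timezone

# Convert a UTC datetime to the millisecond epoch value Kylin expects.
def to_kylin_ts(dt):
    return int(dt.replace(tzinfo=timezone.utc).timestamp() * 1000)

end_time = to_kylin_ts(datetime(2014, 1, 1, 8, 0, 0))
print(end_time)  # 1388563200000
```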
+
+### 4. Track job status. 
+*   `GET http://localhost:7070/kylin/api/jobs/{job_uuid}`
+*   The returned `job_status` represents the current status of the job.
+
+### 5. If the job has errors, you can resume it. 
+*   `PUT http://localhost:7070/kylin/api/jobs/{job_uuid}/resume`
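Putting steps 4 and 5 together, a client-side polling loop inspects `job_status` and decides whether to keep waiting, resume, or stop. A minimal sketch, with the caveat that the exact set of status strings is an assumption and may vary by Kylin version:

```python
import json

# Decide the next client action from a job-status response body.
# The status values here are assumed examples, not an exhaustive list.
def next_action(response_body):
    status = json.loads(response_body)["job_status"]
    if status in ("PENDING", "RUNNING"):
        return "wait"    # keep polling GET /kylin/api/jobs/{job_uuid}
    if status == "ERROR":
        return "resume"  # PUT /kylin/api/jobs/{job_uuid}/resume
    return "done"        # e.g. FINISHED or DISCARDED

print(next_action('{"job_status": "ERROR"}'))  # resume
```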

http://git-wip-us.apache.org/repos/asf/kylin/blob/59913167/website/_docs16/howto/howto_cleanup_storage.md
----------------------------------------------------------------------
diff --git a/website/_docs16/howto/howto_cleanup_storage.md 
b/website/_docs16/howto/howto_cleanup_storage.md
new file mode 100644
index 0000000..233d32d
--- /dev/null
+++ b/website/_docs16/howto/howto_cleanup_storage.md
@@ -0,0 +1,22 @@
+---
+layout: docs16
+title:  Cleanup Storage (HDFS & HBase)
+categories: howto
+permalink: /docs16/howto/howto_cleanup_storage.html
+---
+
+Kylin generates intermediate files in HDFS during cube building; besides, when purging/dropping/merging cubes, some HBase tables may be left in HBase and will no longer be queried; although Kylin has started to do some
+automated garbage collection, it might not cover all cases; you can do an offline storage cleanup periodically:
+
+Steps:
+1. Check which resources can be cleaned up; this will not remove anything:
+{% highlight Groff markup %}
+export KYLIN_HOME=/path/to/kylin_home
+${KYLIN_HOME}/bin/kylin.sh 
org.apache.kylin.storage.hbase.util.StorageCleanupJob --delete false
+{% endhighlight %}
+2. Pick 1 or 2 resources and check whether they are no longer referred to; then add the "--delete true" option to start the cleanup:
+{% highlight Groff markup %}
+${KYLIN_HOME}/bin/kylin.sh 
org.apache.kylin.storage.hbase.util.StorageCleanupJob --delete true
+{% endhighlight %}
+When it finishes, the intermediate HDFS locations and HTables will have been dropped;

http://git-wip-us.apache.org/repos/asf/kylin/blob/59913167/website/_docs16/howto/howto_jdbc.md
----------------------------------------------------------------------
diff --git a/website/_docs16/howto/howto_jdbc.md 
b/website/_docs16/howto/howto_jdbc.md
new file mode 100644
index 0000000..9990df6
--- /dev/null
+++ b/website/_docs16/howto/howto_jdbc.md
@@ -0,0 +1,92 @@
+---
+layout: docs16
+title:  Use JDBC Driver
+categories: howto
+permalink: /docs16/howto/howto_jdbc.html
+---
+
+### Authentication
+
+###### Built on the Apache Kylin authentication RESTful service. Supported parameters:
+* user : username 
+* password : password
+* ssl : true/false. Defaults to false; if true, all service calls will use HTTPS.
+
+### Connection URL format:
+{% highlight Groff markup %}
+jdbc:kylin://<hostname>:<port>/<kylin_project_name>
+{% endhighlight %}
+* If "ssl" is true, the "port" should be the Kylin server's HTTPS port; 
+* If "port" is not specified, the driver will use the default port: HTTP 80, HTTPS 443;
+* The "kylin_project_name" must be specified, and the user needs to ensure it exists on the Kylin server;
+
+### 1. Query with Statement
+{% highlight Groff markup %}
+Driver driver = (Driver) 
Class.forName("org.apache.kylin.jdbc.Driver").newInstance();
+
+Properties info = new Properties();
+info.put("user", "ADMIN");
+info.put("password", "KYLIN");
+Connection conn = 
driver.connect("jdbc:kylin://localhost:7070/kylin_project_name", info);
+Statement state = conn.createStatement();
+ResultSet resultSet = state.executeQuery("select * from test_table");
+
+while (resultSet.next()) {
+    assertEquals("foo", resultSet.getString(1));
+    assertEquals("bar", resultSet.getString(2));
+    assertEquals("tool", resultSet.getString(3));
+}
+{% endhighlight %}
+
+### 2. Query with PreparedStatement
+
+###### Supported prepared statement parameters:
+* setString
+* setInt
+* setShort
+* setLong
+* setFloat
+* setDouble
+* setBoolean
+* setByte
+* setDate
+* setTime
+* setTimestamp
+
+{% highlight Groff markup %}
+Driver driver = (Driver) 
Class.forName("org.apache.kylin.jdbc.Driver").newInstance();
+Properties info = new Properties();
+info.put("user", "ADMIN");
+info.put("password", "KYLIN");
+Connection conn = 
driver.connect("jdbc:kylin://localhost:7070/kylin_project_name", info);
+PreparedStatement state = conn.prepareStatement("select * from test_table 
where id=?");
+state.setInt(1, 10);
+ResultSet resultSet = state.executeQuery();
+
+while (resultSet.next()) {
+    assertEquals("foo", resultSet.getString(1));
+    assertEquals("bar", resultSet.getString(2));
+    assertEquals("tool", resultSet.getString(3));
+}
+{% endhighlight %}
+
+### 3. Get query result set metadata
+The Kylin JDBC driver supports metadata listing methods:
+list catalogs, schemas, tables and columns with SQL pattern filters (such as %).
+
+{% highlight Groff markup %}
+Driver driver = (Driver) 
Class.forName("org.apache.kylin.jdbc.Driver").newInstance();
+Properties info = new Properties();
+info.put("user", "ADMIN");
+info.put("password", "KYLIN");
+Connection conn = 
driver.connect("jdbc:kylin://localhost:7070/kylin_project_name", info);
+
+ResultSet tables = conn.getMetaData().getTables(null, null, "dummy", null);
+while (tables.next()) {
+    for (int i = 0; i < 10; i++) {
+        assertEquals("dummy", tables.getString(i + 1));
+    }
+}
+{% endhighlight %}

http://git-wip-us.apache.org/repos/asf/kylin/blob/59913167/website/_docs16/howto/howto_ldap_and_sso.md
----------------------------------------------------------------------
diff --git a/website/_docs16/howto/howto_ldap_and_sso.md 
b/website/_docs16/howto/howto_ldap_and_sso.md
new file mode 100644
index 0000000..d8988dc
--- /dev/null
+++ b/website/_docs16/howto/howto_ldap_and_sso.md
@@ -0,0 +1,121 @@
+---
+layout: docs16
+title: Enable Security with LDAP and SSO
+categories: howto
+permalink: /docs16/howto/howto_ldap_and_sso.html
+---
+
+## Enable LDAP authentication
+
+Kylin supports LDAP authentication for enterprise or production deployments; this is implemented with the Spring Security framework; before enabling LDAP, please contact your LDAP administrator to get the necessary information, like the LDAP server URL, username/password, and search patterns;
+
+#### Configure LDAP server info
+
+First, provide the LDAP URL, and a username/password if the LDAP server is secured; the password in kylin.properties needs to be encrypted; you can run "org.apache.kylin.rest.security.PasswordPlaceholderConfigurer AES your_password" to get the encrypted value.
+
+```
+ldap.server=ldap://<your_ldap_host>:<port>
+ldap.username=<your_user_name>
+ldap.password=<your_password_hash>
+```
+
+Secondly, provide the user search patterns; these depend on your LDAP design, so here is just a sample:
+
+```
+ldap.user.searchBase=OU=UserAccounts,DC=mycompany,DC=com
+ldap.user.searchPattern=(&(AccountName={0})(memberOf=CN=MYCOMPANY-USERS,DC=mycompany,DC=com))
+ldap.user.groupSearchBase=OU=Group,DC=mycompany,DC=com
+```
+
+If you have service accounts (e.g., for system integration) which also need to be authenticated, configure them in ldap.service.*; otherwise, leave them empty;
+
+#### Configure the administrator group and default role
+
+To map an LDAP group to the admin group in Kylin, set "acl.adminRole" to "ROLE_" + GROUP_NAME. For example, if in LDAP the group "KYLIN-ADMIN-GROUP" is the list of administrators, set it as:
+
+```
+acl.adminRole=ROLE_KYLIN-ADMIN-GROUP
+acl.defaultRole=ROLE_ANALYST,ROLE_MODELER
+```
+
+The "acl.defaultRole" is a list of the default roles granted to everyone; keep it as-is.
+
+#### Enable LDAP
+
+Set "kylin.security.profile=ldap" in conf/kylin.properties, then restart Kylin 
server.
+
+## Enable SSO authentication
+
+From v1.5, Kylin provides SSO with SAML. The implementation is based on the Spring Security SAML Extension. You can read [this reference](http://docs.spring.io/autorepo/docs/spring-security-saml/1.0.x-SNAPSHOT/reference/htmlsingle/) to get an overall understanding.
+
+Before trying this, you should have successfully enabled LDAP and managed users with it; as the SSO server may only do authentication, Kylin needs to search LDAP to get the user's detailed information.
+
+### Generate IDP metadata xml
+Contact your IDP (identity provider) and ask them to generate the SSO metadata file; usually you need to provide three pieces of information:
+
+  1. Partner entity ID, which is a unique ID of your app, e.g.: https://host-name/kylin/saml/metadata 
+  2. App callback endpoint, to which the SAML assertion will be posted; it needs to be: https://host-name/kylin/saml/SSO
+  3. Public certificate of the Kylin server; the SSO server will encrypt the message with it.
+
+### Generate JKS keystore for Kylin
+As Kylin needs to send encrypted messages (signed with Kylin's private key) to the SSO server, a keystore (JKS) needs to be provided. There are a couple of ways to generate the keystore; below is a sample.
+
+Assume kylin.crt is the public certificate file and kylin.key is the private key file; first create a PKCS#12 file with openssl, then convert it to JKS with keytool: 
+
+```
+$ openssl pkcs12 -export -in kylin.crt -inkey kylin.key -out kylin.p12
+Enter Export Password: <export_pwd>
+Verifying - Enter Export Password: <export_pwd>
+
+
+$ keytool -importkeystore -srckeystore kylin.p12 -srcstoretype PKCS12 
-srcstorepass <export_pwd> -alias 1 -destkeystore samlKeystore.jks -destalias 
kylin -destkeypass changeit
+
+Enter destination keystore password:  changeit
+Re-enter new password: changeit
+```
+
+It will put the keys into "samlKeystore.jks" with alias "kylin";
+
+### Enable Higher Ciphers
+
+Make sure your environment is ready to handle higher-strength crypto keys; you may need to download the Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy Files and copy local_policy.jar and US_export_policy.jar to $JAVA_HOME/jre/lib/security.
+
+### Deploy IDP xml file and keystore to Kylin
+
+The IDP metadata and keystore file need to be deployed on the Kylin web app's classpath in $KYLIN_HOME/tomcat/webapps/kylin/WEB-INF/classes 
+
+  1. Rename the IDP file to sso_metadata.xml and then copy it to Kylin's classpath;
+  2. Name the keystore "samlKeystore.jks" and then copy it to Kylin's classpath;
+  3. If you use another alias or password, remember to update kylinSecurity.xml accordingly:
+
+```
+<!-- Central storage of cryptographic keys -->
+<bean id="keyManager" 
class="org.springframework.security.saml.key.JKSKeyManager">
+       <constructor-arg value="classpath:samlKeystore.jks"/>
+       <constructor-arg type="java.lang.String" value="changeit"/>
+       <constructor-arg>
+               <map>
+                       <entry key="kylin" value="changeit"/>
+               </map>
+       </constructor-arg>
+       <constructor-arg type="java.lang.String" value="kylin"/>
+</bean>
+
+```
+
+### Other configurations
+In conf/kylin.properties, add the following properties with your server 
information:
+
+```
+saml.metadata.entityBaseURL=https://host-name/kylin
+saml.context.scheme=https
+saml.context.serverName=host-name
+saml.context.serverPort=443
+saml.context.contextPath=/kylin
+```
+
+Please note, Kylin assumes there is an "email" attribute in the SAML message representing the login user, and the name before @ will be used to search LDAP. 
+
+### Enable SSO
+Set "kylin.security.profile=saml" in conf/kylin.properties, then restart the Kylin server; after that, accessing a URL like "/kylin" or "/kylin/cubes" will redirect to the SSO for login, and jump back after you are authorized. Login with LDAP is still available; you can go to "/kylin/login" to use the original way. The REST API (/kylin/api/*) still uses LDAP + basic authentication, with no impact.
+

http://git-wip-us.apache.org/repos/asf/kylin/blob/59913167/website/_docs16/howto/howto_optimize_cubes.md
----------------------------------------------------------------------
diff --git a/website/_docs16/howto/howto_optimize_cubes.md 
b/website/_docs16/howto/howto_optimize_cubes.md
new file mode 100644
index 0000000..fbc3586
--- /dev/null
+++ b/website/_docs16/howto/howto_optimize_cubes.md
@@ -0,0 +1,212 @@
+---
+layout: docs16
+title:  Optimize Cube
+categories: howto
+permalink: /docs16/howto/howto_optimize_cubes.html
+---
+
+## Hierarchies:
+
+Theoretically for N dimensions you'll end up with 2^N dimension combinations. However for some groups of dimensions there is no need to create so many combinations. For example, if you have three dimensions: continent, country, city (in hierarchies, the "bigger" dimension comes first), you will only need the following three group-by combinations when you do drill-down analysis:
+
+group by continent
+group by continent, country
+group by continent, country, city
+
+In such cases the combination count is reduced from 2^3=8 to 3, which is a great optimization. The same goes for the YEAR, QUARTER, MONTH, DATE case.
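The reduction can be counted with a small sketch (illustration only, not Kylin code): without the hierarchy every subset of the dimensions is a candidate, with it only the prefixes of the ordered dimensions are needed.

```python
from itertools import combinations

hierarchy = ["continent", "country", "city"]

# Without the hierarchy: every subset of the 3 dimensions,
# 2^3 = 8 counting the empty group-by.
all_subsets = [c for r in range(len(hierarchy) + 1)
               for c in combinations(hierarchy, r)]

# With the hierarchy: only prefixes of the ordered dimensions.
prefixes = [tuple(hierarchy[:r]) for r in range(1, len(hierarchy) + 1)]

print(len(all_subsets), len(prefixes))  # 8 3
```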
+
+If we denote the hierarchy dimensions as H1,H2,H3, typical scenarios would be:
+
+
+A. Hierarchies on lookup table
+
+
+<table>
+  <tr>
+    <td align="center">Fact table</td>
+    <td align="center">(joins)</td>
+    <td align="center">Lookup Table</td>
+  </tr>
+  <tr>
+    <td>column1,column2,,,,,, FK</td>
+    <td></td>
+    <td>PK,,H1,H2,H3,,,,</td>
+  </tr>
+</table>
+
+---
+
+B. Hierarchies on fact table
+
+
+<table>
+  <tr>
+    <td align="center">Fact table</td>
+  </tr>
+  <tr>
+    <td>column1,column2,,,H1,H2,H3,,,,,,, </td>
+  </tr>
+</table>
+
+---
+
+
+There is a special case for scenario A, where the PK of the lookup table happens to be part of the hierarchy. For example, we have a calendar lookup table where cal_dt is the primary key:
+
+A*. Hierarchies on lookup table over its primary key
+
+
+<table>
+  <tr>
+    <td align="center">Lookup Table(Calendar)</td>
+  </tr>
+  <tr>
+    <td>cal_dt(PK), week_beg_dt, month_beg_dt, quarter_beg_dt,,,</td>
+  </tr>
+</table>
+
+---
+
+
+For cases like A*, what you need is another optimization called "Derived Columns".
+
+## Derived Columns:
+
+A derived column is used when one or more dimensions (they must be dimensions on a lookup table; these columns are called "derived") can be deduced from another column (usually the corresponding FK; this is called the "host column").
+
+For example, suppose we have a lookup table that we join with the fact table on "where DimA = DimX". Notice that in Kylin, if you choose an FK as a dimension, the corresponding PK will be automatically queryable, without any extra cost. The secret is that since FK and PK values are always identical, Kylin can apply filters/group-by on the FK first, and transparently replace them with the PK. This indicates that if we want DimA(FK), DimX(PK), DimB and DimC in our cube, we can safely choose DimA, DimB, DimC only.
+
+<table>
+  <tr>
+    <td align="center">Fact table</td>
+    <td align="center">(joins)</td>
+    <td align="center">Lookup Table</td>
+  </tr>
+  <tr>
+    <td>column1,column2,,,,,, DimA(FK) </td>
+    <td></td>
+    <td>DimX(PK),,DimB, DimC</td>
+  </tr>
+</table>
+
+---
+
+
+Let's say that DimA (the dimension representing the FK/PK) has a special mapping to DimB:
+
+
+<table>
+  <tr>
+    <th>dimA</th>
+    <th>dimB</th>
+    <th>dimC</th>
+  </tr>
+  <tr>
+    <td>1</td>
+    <td>a</td>
+    <td>?</td>
+  </tr>
+  <tr>
+    <td>2</td>
+    <td>b</td>
+    <td>?</td>
+  </tr>
+  <tr>
+    <td>3</td>
+    <td>c</td>
+    <td>?</td>
+  </tr>
+  <tr>
+    <td>4</td>
+    <td>a</td>
+    <td>?</td>
+  </tr>
+</table>
+
+
+In this case, given a value in DimA, the value of DimB is determined, so we say DimB can be derived from DimA. When we build a cube that contains both DimA and DimB, we simply include DimA, and mark DimB as derived. The derived column (DimB) does not participate in cuboid generation:
+
+original combinations:
+ABC,AB,AC,BC,A,B,C
+
+combinations when deriving B from A:
+AC,A,C
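The pruning above can be reproduced with a short sketch (illustration only): a cuboid is kept only if it contains no derived column.

```python
from itertools import combinations

dims = ["A", "B", "C"]
derived = {"B"}  # B is derived from its host column A

# All non-empty cuboids: ABC, AB, AC, BC, A, B, C.
cuboids = [set(c) for r in range(1, len(dims) + 1)
           for c in combinations(dims, r)]

# Derived columns are excluded from cuboid generation.
kept = [c for c in cuboids if not (c & derived)]
print(len(cuboids), len(kept))  # 7 3
```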
+
+At runtime, for queries like "select count(*) from fact_table inner join lookup1 group by lookup1.dimB", a cuboid containing DimB is expected to answer the query. However, DimB will appear in NONE of the cuboids due to the derived optimization. In this case, we modify the execution plan to make it group by DimA (its host column) first; we'll get an intermediate answer like:
+
+
+<table>
+  <tr>
+    <th>DimA</th>
+    <th>count(*)</th>
+  </tr>
+  <tr>
+    <td>1</td>
+    <td>1</td>
+  </tr>
+  <tr>
+    <td>2</td>
+    <td>1</td>
+  </tr>
+  <tr>
+    <td>3</td>
+    <td>1</td>
+  </tr>
+  <tr>
+    <td>4</td>
+    <td>1</td>
+  </tr>
+</table>
+
+
+Afterwards, Kylin will replace DimA values with DimB values (since both of their values are in the lookup table, Kylin can load the whole lookup table into memory and build a mapping for them), and the intermediate result becomes:
+
+
+<table>
+  <tr>
+    <th>DimB</th>
+    <th>count(*)</th>
+  </tr>
+  <tr>
+    <td>a</td>
+    <td>1</td>
+  </tr>
+  <tr>
+    <td>b</td>
+    <td>1</td>
+  </tr>
+  <tr>
+    <td>c</td>
+    <td>1</td>
+  </tr>
+  <tr>
+    <td>a</td>
+    <td>1</td>
+  </tr>
+</table>
+
+
+After this, the runtime SQL engine (Calcite) will further aggregate the intermediate result to:
+
+
+<table>
+  <tr>
+    <th>DimB</th>
+    <th>count(*)</th>
+  </tr>
+  <tr>
+    <td>a</td>
+    <td>2</td>
+  </tr>
+  <tr>
+    <td>b</td>
+    <td>1</td>
+  </tr>
+  <tr>
+    <td>c</td>
+    <td>1</td>
+  </tr>
+</table>
+
+
+This step happens at query runtime, and this is what is meant by "at the cost of extra runtime aggregation".
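The whole replace-then-reaggregate flow can be sketched as follows (illustration only; Kylin performs this inside its query engine and Calcite):

```python
from collections import Counter

# Lookup table snapshot: host column DimA -> derived column DimB.
lookup = {1: "a", 2: "b", 3: "c", 4: "a"}

# Intermediate answer from the cuboid, grouped by the host column DimA.
by_dim_a = {1: 1, 2: 1, 3: 1, 4: 1}

# Replace DimA keys with DimB values, then aggregate again.
by_dim_b = Counter()
for a, cnt in by_dim_a.items():
    by_dim_b[lookup[a]] += cnt

print(dict(by_dim_b))  # {'a': 2, 'b': 1, 'c': 1}
```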

http://git-wip-us.apache.org/repos/asf/kylin/blob/59913167/website/_docs16/howto/howto_update_coprocessor.md
----------------------------------------------------------------------
diff --git a/website/_docs16/howto/howto_update_coprocessor.md 
b/website/_docs16/howto/howto_update_coprocessor.md
new file mode 100644
index 0000000..1aa8b0e
--- /dev/null
+++ b/website/_docs16/howto/howto_update_coprocessor.md
@@ -0,0 +1,14 @@
+---
+layout: docs16
+title:  How to Update HBase Coprocessor
+categories: howto
+permalink: /docs16/howto/howto_update_coprocessor.html
+---
+
+Kylin leverages an HBase coprocessor to optimize query performance. After a new version is released, the RPC protocol may change, so users need to redeploy the coprocessor to the HTables.
+
+There's a CLI tool to update HBase Coprocessor:
+
+{% highlight Groff markup %}
+$KYLIN_HOME/bin/kylin.sh 
org.apache.kylin.storage.hbase.util.DeployCoprocessorCLI 
$KYLIN_HOME/lib/kylin-coprocessor-*.jar all
+{% endhighlight %}

http://git-wip-us.apache.org/repos/asf/kylin/blob/59913167/website/_docs16/howto/howto_upgrade.md
----------------------------------------------------------------------
diff --git a/website/_docs16/howto/howto_upgrade.md 
b/website/_docs16/howto/howto_upgrade.md
new file mode 100644
index 0000000..7294bb7
--- /dev/null
+++ b/website/_docs16/howto/howto_upgrade.md
@@ -0,0 +1,157 @@
+---
+layout: docs16
+title:  Upgrade From Old Versions
+categories: howto
+permalink: /docs16/howto/howto_upgrade.html
+since: v1.5.1
+---
+
+
+## Upgrade from 1.5.2 to v1.5.3
+Kylin v1.5.3 metadata is compatible with v1.5.2; your cubes don't need to be rebuilt, but, as usual, some actions need to be performed:
+
+#### 1. Update HBase coprocessor
+The HBase tables for existing cubes need to be updated to the latest coprocessor; follow [this guide](howto_update_coprocessor.html) to update;
+
+#### 2. Update conf/kylin_hive_conf.xml
+From 1.5.3, Kylin doesn't need Hive to merge small files anymore; for users who copied conf/ from a previous version, please remove the "merge" related properties in kylin_hive_conf.xml, including "hive.merge.mapfiles", "hive.merge.mapredfiles", and "hive.merge.size.per.task"; this will save time when extracting data from Hive.
+
+
+## Upgrade from 1.5.1 to v1.5.2
+Kylin v1.5.2 metadata is compatible with v1.5.1; your cubes don't need to be upgraded, while some actions need to be performed:
+
+#### 1. Update HBase coprocessor
+The HBase tables for existing cubes need to be updated to the latest coprocessor; follow [this guide](howto_update_coprocessor.html) to update;
+
+#### 2. Update conf/kylin.properties
+In v1.5.2 several properties are deprecated, and several new ones are added:
+
+Deprecated:
+
+* kylin.hbase.region.cut.small=5
+* kylin.hbase.region.cut.medium=10
+* kylin.hbase.region.cut.large=50
+
+New:
+
+* kylin.hbase.region.cut=5
+* kylin.hbase.hfile.size.gb=2
+
+These new parameters determine how to split HBase regions; to use a different size you can overwrite these params at the cube level. 
+
+When copying from an old kylin.properties file, it is suggested to remove the deprecated ones and add the new ones.
+
+#### 3. Add conf/kylin\_job\_conf\_inmem.xml
+A new job conf file named "kylin\_job\_conf\_inmem.xml" is added in the "conf" folder; Kylin 1.5 introduced the "fast cubing" algorithm, which aims to leverage more memory to do the in-mem aggregation; Kylin will use this new conf file for submitting the in-mem cube build job, which requests different memory than a normal job; please update it properly according to your cluster capacity.
+
+Besides, if you have used separate config files for cubes of different capacities, for example "kylin\_job\_conf\_small.xml", "kylin\_job\_conf\_medium.xml" and "kylin\_job\_conf\_large.xml", please note that they are deprecated now; only "kylin\_job\_conf.xml" and "kylin\_job\_conf\_inmem.xml" will be used for submitting cube jobs; if you have cube-level job configurations (like using a different YARN job queue), you can customize at the cube level; check [KYLIN-1706](https://issues.apache.org/jira/browse/KYLIN-1706)
+
+## Upgrade from prior 1.5 to v1.5.1
+
+Kylin 1.5.1 is not backward compatible in terms of metadata (the built cubes are still functional after the metadata upgrade). So if you want to deploy v1.5.x code on your pre-1.5 metadata store (in the following text we'll use v1.3.0 as an example), you need to upgrade the metadata with the following steps:
+
+#### 1. Backup metadata on v1.3.0
+
+To avoid data loss during the upgrade, a backup at the very beginning is 
always suggested. In case of upgrade failure, you can roll back to original 
state with the backup.
+
+```
+export KYLIN_HOME="<path_of_1_3_0_installation>" 
+$KYLIN_HOME/bin/metastore.sh backup
+``` 
+
+It will print the backup folder; write it down and make sure it will not be deleted before the upgrade is finished. We'll later reference this folder as BACKUP_FOLDER.
+
+#### 2. Stop Kylin v1.3.0 instance
+
+Before deploying the Kylin v1.5.1 instance, you need to stop the old instance. Note that end users cannot access the Kylin service from this point.
+
+```
+$KYLIN_HOME/bin/kylin.sh stop
+```
+
+#### 3. Install Kylin v1.5.1 and copy back "conf"
+
+Download the new Kylin v1.5.1 binary package from Kylin's download page; extract it to a different folder other than the current KYLIN_HOME; before copying back the "conf" folder, do a compare and merge between the old and new kylin.properties to ensure newly introduced properties are kept.
+
+#### 4. Automatically upgrade metadata
+
+The Kylin v1.5.1 package provides a tool for automatic metadata upgrade. In this upgrade, all cubes' metadata will be updated to a v1.5.1 compatible format. The treatment of empty cubes and non-empty cubes is different though. For empty cubes, we'll upgrade the cube's storage engine and cubing engine to the latest, so that new features will be enabled for new cubing jobs. But non-empty cubes carry legacy cube segments, so we'll retain their old storage engine and cubing engine. In other words, the non-empty cubes will not enjoy the performance and storage-wise gains released in the 1.5.x versions. Check the last section to see how to deal with non-empty cubes.
+To avoid corrupting the metadata store, metadata upgrade is performed against a copy of the local metadata backup, i.e. a copy of BACKUP_FOLDER.
+To avoid corrupting the metadata store, metadata upgrade is performed against 
a copy of the local metadata backup, i.e. a copy of BACKUP_FOLDER.
+
+```
+export KYLIN_HOME="<path_of_1_5_0_installation>" 
+$KYLIN_HOME/bin/kylin.sh  
org.apache.kylin.cube.upgrade.entry.CubeMetadataUpgradeEntry_v_1_5_1 
<path_of_BACKUP_FOLDER>
+```
+
+The above command will first copy the BACKUP_FOLDER to ${BACKUP_FOLDER}_workspace, and perform the upgrade against the workspace folder on local disk. Check the output; if no error happened, then you have 1.5.1 compatible metadata saved in the workspace folder now. Otherwise the upgrade process was not successful; please don't take further actions. 
+The next thing to do is to override the metadata store with the new metadata in the workspace:
+
+```
+$KYLIN_HOME/bin/metastore.sh reset
+$KYLIN_HOME/bin/metastore.sh restore <path_of_workspace>
+```
+
+The last thing to do is to upgrade all cubes' coprocessor:
+
+```
+$KYLIN_HOME/bin/kylin.sh 
org.apache.kylin.storage.hbase.util.DeployCoprocessorCLI 
$KYLIN_HOME/lib/kylin-coprocessor*.jar all
+```
+
+#### 5. Start Kylin v1.5.1 instance
+
+```
+$KYLIN_HOME/bin/kylin.sh start
+```
+
+Check the log and open web UI to see if the upgrade succeeded.
+
+## Rollback if the upgrade failed
+
+If the new version couldn't start up normally, you need to roll back to the original v1.3.0 version. The steps are as follows:
+
+#### 1. Stop Kylin v1.5.1 instance
+
+```
+$KYLIN_HOME/bin/kylin.sh stop
+```
+
+#### 2. Restore 1.3.0 metadata from backup folder
+
+```
+export KYLIN_HOME="<path_of_1_3_0_installation>"
+$KYLIN_HOME/bin/metastore.sh reset
+$KYLIN_HOME/bin/metastore.sh restore <path_of_BACKUP_FOLDER>
+``` 
+
+#### 3. Deploy coprocessor of v1.3.0
+
+Since the coprocessors of the HTables in use were upgraded to v1.5.1, you need to manually downgrade them with this command.
+
+```
+$KYLIN_HOME/bin/kylin.sh org.apache.kylin.job.tools.DeployCoprocessorCLI 
$KYLIN_HOME/lib/kylin-coprocessor*.jar all
+```
+
+#### 4. Start Kylin v1.3.0 instance
+
+```
+$KYLIN_HOME/bin/kylin.sh start
+```
+
+## For non-empty cubes
+
+Old cubes built with v1.3.0 cannot leverage the new features of v1.5.1. If you must have them on your cubes, you can choose one of these solutions:
+
+#### 1. Rebuild cubes
+
+This is the simplest way, if the cost of rebuilding is acceptable: purge the cube before the metadata upgrade. After the upgrade is done, you need to manually rebuild those segments yourself.
+
+#### 2. Use hybrid model
+
+If you can't rebuild segments, but want to leverage the new features for new segments, you can use a hybrid model, which contains not only your old segments, but also a new empty cube which has the same model as the old one. For the empty cube, you can do incremental building with the new v1.5.x features. For the old cube, you can refresh existing segments only.
+
+Here is the command to create hybrid model:
+
+```
+export KYLIN_HOME="<path_of_1_5_0_installation>"
+$KYLIN_HOME/bin/kylin.sh 
org.apache.kylin.storage.hbase.util.ExtendCubeToHybridCLI <project_name> 
<cube_name>
+```

http://git-wip-us.apache.org/repos/asf/kylin/blob/59913167/website/_docs16/howto/howto_use_beeline.md
----------------------------------------------------------------------
diff --git a/website/_docs16/howto/howto_use_beeline.md 
b/website/_docs16/howto/howto_use_beeline.md
new file mode 100644
index 0000000..7c3148a
--- /dev/null
+++ b/website/_docs16/howto/howto_use_beeline.md
@@ -0,0 +1,14 @@
+---
+layout: docs16
+title:  Use Beeline for Hive Commands
+categories: howto
+permalink: /docs16/howto/howto_use_beeline.html
+---
+
+[Beeline](https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients) is recommended by many vendors to replace the Hive CLI. By default Kylin uses the Hive CLI to synchronize Hive tables, create flattened intermediate tables, etc. With simple configuration changes you can set Kylin to use Beeline instead.
+
+Edit $KYLIN_HOME/conf/kylin.properties by:
+
+  1. change kylin.hive.client=cli to kylin.hive.client=beeline
+  2. add "kylin.hive.beeline.params"; this is where you can specify Beeline command parameters, like username (-n), JDBC URL (-u), etc. There's a sample kylin.hive.beeline.params included in the default kylin.properties, however it's commented out. You can modify the sample based on your real environment.
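For example, the two changes together might look like this in kylin.properties (the HiveServer2 URL and username below are placeholders; adjust to your environment):

```
kylin.hive.client=beeline
kylin.hive.beeline.params=-n hive -u 'jdbc:hive2://localhost:10000'
```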
+
