Author: lidong
Date: Thu Jan 10 14:09:26 2019
New Revision: 1850940

URL: http://svn.apache.org/viewvc?rev=1850940&view=rev
Log:
Update cube_spark document with KYLIN-3607

Modified:
    kylin/site/blog/index.html
    kylin/site/cn/docs/tutorial/cube_spark.html
    kylin/site/docs/tutorial/cube_spark.html
    kylin/site/feed.xml

Modified: kylin/site/blog/index.html
URL: http://svn.apache.org/viewvc/kylin/site/blog/index.html?rev=1850940&r1=1850939&r2=1850940&view=diff
==============================================================================
--- kylin/site/blog/index.html (original)
+++ kylin/site/blog/index.html Thu Jan 10 14:09:26 2019
@@ -6047,21 +6047,21 @@ var _hmt = _hmt || [];
  </div>
  <div class="col-md-6 col-lg-6 col-xs-12">
- <a class="blog-card" href="/cn/blog/2018/09/20/release-v2.5.0/">
+ <a class="blog-card" href="/blog/2018/09/20/release-v2.5.0/">
  <div class="blog-pic">
  <img width="20" src="../assets/images/icon_blog_w.png" />
  </div>
- <p class="blog-title">Apache Kylin v2.5.0 Officially Released</p>
+ <p class="blog-title">Apache Kylin v2.5.0 Release Announcement</p>
  <p align="left" class="post-meta">posted: Sep 20, 2018</p>
  </a>
  </div>
  <div class="col-md-6 col-lg-6 col-xs-12">
- <a class="blog-card" href="/blog/2018/09/20/release-v2.5.0/">
+ <a class="blog-card" href="/cn/blog/2018/09/20/release-v2.5.0/">
  <div class="blog-pic">
  <img width="20" src="../assets/images/icon_blog_w.png" />
  </div>
- <p class="blog-title">Apache Kylin v2.5.0 Release Announcement</p>
+ <p class="blog-title">Apache Kylin v2.5.0 Officially Released</p>
  <p align="left" class="post-meta">posted: Sep 20, 2018</p>
  </a>
  </div>
@@ -6347,21 +6347,21 @@ var _hmt = _hmt || [];
  </div>
  <div class="col-md-6 col-lg-6 col-xs-12">
- <a class="blog-card" href="/cn/blog/2016/03/16/release-v1.3.0/">
+ <a class="blog-card" href="/blog/2016/03/16/release-v1.3.0/">
  <div class="blog-pic">
  <img width="20" src="../assets/images/icon_blog_w.png" />
  </div>
- <p class="blog-title">Apache Kylin v1.3.0 Officially Released</p>
+ <p class="blog-title">Apache Kylin v1.3.0 Release Announcement</p>
  <p align="left" class="post-meta">posted: Mar 16, 2016</p>
  </a>
  </div>
  <div class="col-md-6 col-lg-6 col-xs-12">
- <a
class="blog-card" href="/blog/2016/03/16/release-v1.3.0/">
+ <a class="blog-card" href="/cn/blog/2016/03/16/release-v1.3.0/">
  <div class="blog-pic">
  <img width="20" src="../assets/images/icon_blog_w.png" />
  </div>
- <p class="blog-title">Apache Kylin v1.3.0 Release Announcement</p>
+ <p class="blog-title">Apache Kylin v1.3.0 Officially Released</p>
  <p align="left" class="post-meta">posted: Mar 16, 2016</p>
  </a>
  </div>
@@ -6387,41 +6387,41 @@ var _hmt = _hmt || [];
  </div>
  <div class="col-md-6 col-lg-6 col-xs-12">
- <a class="blog-card" href="/cn/blog/2015/12/25/support-powerbi-tableau9/">
+ <a class="blog-card" href="/blog/2015/12/25/support-powerbi-tableau9/">
  <div class="blog-pic">
  <img width="20" src="../assets/images/icon_blog_w.png" />
  </div>
- <p class="blog-title">Apache Kylin adds support for Tableau 9 and Microsoft Excel, Power BI</p>
+ <p class="blog-title">Apache Kylin supports Tableau 9 and MS Excel, Power BI now</p>
  <p align="left" class="post-meta">posted: Dec 25, 2015</p>
  </a>
  </div>
  <div class="col-md-6 col-lg-6 col-xs-12">
- <a class="blog-card" href="/blog/2015/12/25/support-powerbi-tableau9/">
+ <a class="blog-card" href="/cn/blog/2015/12/25/support-powerbi-tableau9/">
  <div class="blog-pic">
  <img width="20" src="../assets/images/icon_blog_w.png" />
  </div>
- <p class="blog-title">Apache Kylin supports Tableau 9 and MS Excel, Power BI now</p>
+ <p class="blog-title">Apache Kylin adds support for Tableau 9 and Microsoft Excel, Power BI</p>
  <p align="left" class="post-meta">posted: Dec 25, 2015</p>
  </a>
  </div>
  <div class="col-md-6 col-lg-6 col-xs-12">
- <a class="blog-card" href="/cn/blog/2015/12/23/release-v1.2/">
+ <a class="blog-card" href="/blog/2015/12/23/release-v1.2/">
  <div class="blog-pic">
  <img width="20" src="../assets/images/icon_blog_w.png" />
  </div>
- <p class="blog-title">Apache Kylin v1.2 Officially Released</p>
+ <p class="blog-title">Apache Kylin v1.2 Release Announcement</p>
  <p align="left" class="post-meta">posted: Dec 23, 2015</p>
  </a>
  </div>
  <div class="col-md-6 col-lg-6 col-xs-12">
- <a
class="blog-card" href="/blog/2015/12/23/release-v1.2/">
+ <a class="blog-card" href="/cn/blog/2015/12/23/release-v1.2/">
  <div class="blog-pic">
  <img width="20" src="../assets/images/icon_blog_w.png" />
  </div>
- <p class="blog-title">Apache Kylin v1.2 Release Announcement</p>
+ <p class="blog-title">Apache Kylin v1.2 Officially Released</p>
  <p align="left" class="post-meta">posted: Dec 23, 2015</p>
  </a>
  </div>

Modified: kylin/site/cn/docs/tutorial/cube_spark.html
URL: http://svn.apache.org/viewvc/kylin/site/cn/docs/tutorial/cube_spark.html?rev=1850940&r1=1850939&r2=1850940&view=diff
==============================================================================
--- kylin/site/cn/docs/tutorial/cube_spark.html (original)
+++ kylin/site/cn/docs/tutorial/cube_spark.html Thu Jan 10 14:09:26 2019
@@ -292,6 +292,22 @@ $KYLIN_HOME/bin/kylin.sh start</code></p
 <p>Click a specific job to see its detailed runtime information, which is very helpful for troubleshooting and performance tuning.</p>

+<p>On some Hadoop versions, you may encounter the following error in the “Convert Cuboid Data to HFile” step:</p>
+
+<div class="highlight"><pre><code class="language-groff" data-lang="groff">Caused by: java.lang.RuntimeException: Could not create interface org.apache.hadoop.hbase.regionserver.MetricsRegionServerSourceFactory Is the hadoop compatibility jar on the classpath?
+    at org.apache.hadoop.hbase.CompatibilitySingletonFactory.getInstance(CompatibilitySingletonFactory.java:73)
+    at org.apache.hadoop.hbase.io.MetricsIO.<init>(MetricsIO.java:31)
+    at org.apache.hadoop.hbase.io.hfile.HFile.<clinit>(HFile.java:192)
+    ... 15 more
+Caused by: java.util.NoSuchElementException
+    at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:365)
+    at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
+    at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
+    at org.apache.hadoop.hbase.CompatibilitySingletonFactory.getInstance(CompatibilitySingletonFactory.java:59)
+    ...
17 more</code></pre></div>
+
+<p>The workaround is: copy <code class="highlighter-rouge">hbase-hadoop2-compat-*.jar</code> and <code class="highlighter-rouge">hbase-hadoop-compat-*.jar</code> into the <code class="highlighter-rouge">$KYLIN_HOME/spark/jars</code> directory (both jar files can be found in HBase’s lib directory); if you have already built the Spark assembly jar and uploaded it to HDFS, you need to re-package and re-upload it. After that, retry the failed cube job and it should succeed. The related JIRA issue is KYLIN-3607, which will be fixed in a future version.</p>
+
 <h2 id="section-2">Go further</h2>

 <p>If you are a Kylin administrator but new to Spark, we suggest you go through the <a href="https://spark.apache.org/docs/2.1.2/">Spark documentation</a>, and don’t forget to update the configurations accordingly. You can enable Spark’s <a href="https://spark.apache.org/docs/2.1.2/job-scheduling.html#dynamic-resource-allocation">Dynamic Resource Allocation</a> so that it scales automatically for different workloads. Spark’s performance relies on the cluster’s memory and CPU resources, and Kylin’s cube build is a heavy task when a complex data model and a huge dataset are built at one time. If your cluster resources are insufficient, Spark executors will throw errors such as “OutOfMemory”, so please use it sensibly. For a cube with UHC dimensions, too many combinations (e.g., a cube with more than 12 dimensions), or memory-hungry measures (Count Distinct, Top-N), we suggest using the MapReduce engine. If your cube model is simple, all measures are SUM/MIN/MAX/COUNT, and the source data is small to medium scale, the Spark engine will be a good choice.</p>

Modified: kylin/site/docs/tutorial/cube_spark.html
URL: http://svn.apache.org/viewvc/kylin/site/docs/tutorial/cube_spark.html?rev=1850940&r1=1850939&r2=1850940&view=diff
==============================================================================
--- kylin/site/docs/tutorial/cube_spark.html (original)
+++ kylin/site/docs/tutorial/cube_spark.html Thu Jan 10 14:09:26 2019
@@ -6048,7 +6048,7 @@ export KYLIN_HOME=/usr/local/apache-kyli
 <h2 id="check-spark-configuration">Check Spark configuration</h2>

-<p>Kylin embeds a Spark binary (v2.1.0) in $KYLIN_HOME/spark, all the Spark configurations can be managed in $KYLIN_HOME/conf/kylin.properties with prefix <em>“kylin.engine.spark-conf.”</em>.
These properties will be extracted and applied when Kylin submits a Spark job; e.g., if you configure “kylin.engine.spark-conf.spark.executor.memory=4G”, Kylin will use “--conf spark.executor.memory=4G” as a parameter when executing “spark-submit”.</p>
+<p>Kylin embeds a Spark binary (Spark v2.1 for Kylin 2.4 and 2.5) in $KYLIN_HOME/spark, all the Spark configurations can be managed in $KYLIN_HOME/conf/kylin.properties with prefix <em>“kylin.engine.spark-conf.”</em>. These properties will be extracted and applied when Kylin submits a Spark job; e.g., if you configure “kylin.engine.spark-conf.spark.executor.memory=4G”, Kylin will use “--conf spark.executor.memory=4G” as a parameter when executing “spark-submit”.</p>

 <p>Before you run Spark cubing, we suggest taking a look at these configurations and customizing them for your cluster. Below are the recommended configurations:</p>

@@ -6149,6 +6149,22 @@ $KYLIN_HOME/bin/kylin.sh start</code></p
 <p>Click a specific job to see its detailed runtime information, which is very helpful for troubleshooting and performance tuning.</p>

+<p>On some Hadoop releases, you may encounter the following error in the “Convert Cuboid Data to HFile” step:</p>
+
+<div class="highlight"><pre><code class="language-groff" data-lang="groff">Caused by: java.lang.RuntimeException: Could not create interface org.apache.hadoop.hbase.regionserver.MetricsRegionServerSourceFactory Is the hadoop compatibility jar on the classpath?
+    at org.apache.hadoop.hbase.CompatibilitySingletonFactory.getInstance(CompatibilitySingletonFactory.java:73)
+    at org.apache.hadoop.hbase.io.MetricsIO.<init>(MetricsIO.java:31)
+    at org.apache.hadoop.hbase.io.hfile.HFile.<clinit>(HFile.java:192)
+    ...
15 more
+Caused by: java.util.NoSuchElementException
+    at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:365)
+    at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
+    at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
+    at org.apache.hadoop.hbase.CompatibilitySingletonFactory.getInstance(CompatibilitySingletonFactory.java:59)
+    ... 17 more</code></pre></div>
+
+<p>The workaround is: add <code class="highlighter-rouge">hbase-hadoop2-compat-*.jar</code> and <code class="highlighter-rouge">hbase-hadoop-compat-*.jar</code> into <code class="highlighter-rouge">$KYLIN_HOME/spark/jars</code> (the two jar files can be found in HBase’s lib folder); if you have already built the Spark assembly jar and uploaded it to HDFS, you may need to re-package it and re-upload it to HDFS. After that, resume the failed job and it should succeed. The related issue is KYLIN-3607, which will be fixed in a later version.</p>
+
 <h2 id="go-further">Go further</h2>

 <p>If you’re a Kylin administrator but new to Spark, we suggest you go through the <a href="https://spark.apache.org/docs/2.1.0/">Spark documents</a>, and don’t forget to update the configurations accordingly. You can enable Spark <a href="https://spark.apache.org/docs/2.1.0/job-scheduling.html#dynamic-resource-allocation">Dynamic Resource Allocation</a> so that it can automatically scale up or down for different workloads. Spark’s performance relies on the cluster’s memory and CPU resources, while Kylin’s cube build is a heavy task when it has a complex data model and a huge dataset to build at one time. If your cluster resources are insufficient, errors like “OutOfMemory” will be thrown in Spark executors, so please use it properly. For a cube which has UHC dimensions, many combinations (e.g., a full cube with more than 12 dimensions), or memory-hungry measures (Count Distinct, Top-N), we suggest using the MapReduce engine.
If your cube model is simple, all measures are SUM/MIN/MAX/COUNT, and the source data is small to medium scale, the Spark engine would be a good choice. Besides, streaming build isn’t supported in this engine so far (KYLIN-2484).</p>

Modified: kylin/site/feed.xml
URL: http://svn.apache.org/viewvc/kylin/site/feed.xml?rev=1850940&r1=1850939&r2=1850940&view=diff
==============================================================================
--- kylin/site/feed.xml (original)
+++ kylin/site/feed.xml Thu Jan 10 14:09:26 2019
@@ -19,8 +19,8 @@
  <description>Apache Kylin Home</description>
  <link>http://kylin.apache.org/</link>
  <atom:link href="http://kylin.apache.org/feed.xml" rel="self" type="application/rss+xml"/>
- <pubDate>Wed, 09 Jan 2019 05:59:25 -0800</pubDate>
- <lastBuildDate>Wed, 09 Jan 2019 05:59:25 -0800</lastBuildDate>
+ <pubDate>Thu, 10 Jan 2019 05:59:22 -0800</pubDate>
+ <lastBuildDate>Thu, 10 Jan 2019 05:59:22 -0800</lastBuildDate>
  <generator>Jekyll v2.5.3</generator>

  <item>
@@ -235,6 +235,70 @@ Graphic 10 Process of Querying Cube</
  </item>

  <item>
+ <title>Apache Kylin v2.5.0 Officially Released</title>
+ <description><p>The Apache Kylin community is pleased to announce that Apache Kylin 2.5.0 has been officially released.</p>
+
+<p>Apache Kylin is an open source distributed analytics engine designed to provide a SQL interface and multi-dimensional analysis (OLAP) on extremely large datasets.</p>
+
+<p>This is a new feature release following 2.4.0. It introduces many valuable improvements; for the complete list of changes, see the <a href="https://kylin.apache.org/docs/release_notes.html">release notes</a>. Some of the major improvements are described here:</p>
+
+<h3 id="all-in-spark--cubing-">All-in-Spark cubing engine</h3>
+<p>Kylin’s Spark engine now uses Spark to run all distributed jobs in cube computation, including fetching the distinct values of each dimension, converting cuboid files to HBase HFiles, merging segments, merging dictionaries, and so on. The default Spark configuration has also been tuned so that users get an out-of-the-box experience. The related development tasks are KYLIN-3427, KYLIN-3441, and KYLIN-3442.</p>
+
+<p>Spark job management has also improved: once a Spark job starts running, you can get the job link in the web console; if you discard the job, Kylin terminates the Spark job immediately to release resources in time; if Kylin is restarted, it can resume from the previous job instead of submitting a new one.</p>
+
+<h3 id="mysql--kylin-">MySQL as Kylin metadata storage</h3>
+<p>In the past, HBase was the only choice for Kylin metadata storage. In some cases HBase is not suitable, for example when multiple HBase clusters are used to provide cross-region high availability for Kylin; the replicated HBase cluster is read-only, so it cannot serve as the metadata store. We have now introduced the MySQL Metastore to meet this need. This feature is currently in beta. See KYLIN-3488 for more.</p>
+
+<h3 id="hybrid-model-">Hybrid model GUI</h3>
+<p>Hybrid is an advanced model for assembling multiple cubes. It can be used when a cube’s schema needs to change. This feature had no GUI in the past, so only a small number of users knew about it. We have now enabled it in the web interface so that more users can try it.</p>
+
+<h3 id="cube-planner">Cube planner enabled by default</h3>
+<p>Cube planner can greatly optimize the cube structure and reduce the number of cuboids built, saving computing/storage resources and improving query performance. It was introduced in v2.3 but was not enabled by default. To let more users see it and try it, we enable it by default in v2.5. The algorithm automatically optimizes the cuboid set based on data statistics when the first segment is built.</p>
+
+<h3 id="segment-">Improved segment pruning</h3>
+<p>Segment (partition) pruning can effectively reduce disk and network I/O, and therefore greatly improves query performance. In the past, Kylin pruned segments only by the value of the partition column (partition date column). If a query did not use the partition column as a filter condition, pruning did not work and all segments were scanned.<br />
+Starting with v2.5, Kylin records the min/max value of every dimension at the segment level. Before scanning a segment, it compares the query conditions with the min/max index. If they do not match, the segment is skipped. See KYLIN-3370 for more information.</p>
+
+<h3 id="yarn-">Merging dictionaries on YARN</h3>
+<p>When segments are merged, their dictionaries also need to be merged. In the past, dictionary merging happened in Kylin’s JVM, which required a large amount of local memory and CPU resources. In extreme cases (if there were several concurrent jobs), it could crash the Kylin process. As a result, some users had to allocate more memory to the Kylin job node, or run multiple job nodes to balance the workload.<br />
+Starting with v2.5, Kylin submits this task to Hadoop MapReduce and Spark, which resolves this bottleneck. See KYLIN-3471 for more information.</p>
+
+<h3 id="cube-">Improved build performance for cubes using a global dictionary</h3>
+<p>The Global Dictionary (GD) is a prerequisite for bitmap-based exact count distinct. If the distinct-count column has very high cardinality, the GD can be very large. In the cube build phase, Kylin needs to convert non-integer values to integers through the GD. Although the GD has been split into multiple slices that can be loaded into memory separately, the values of the distinct-count column are unordered, so Kylin has to swap slices in and out repeatedly, which makes the build job very slow.<br />
+This enhancement introduces a new step that builds a shrunken dictionary from the global dictionary for each data block. Each task then only needs to load the shrunken dictionary, which avoids frequent swapping in and out. Performance can be 3 times faster than before. See KYLIN-3491 for more information.</p>
+
+<h3 id="topn-count-distinct--cube-">Improved size estimation for cubes with TOPN, COUNT DISTINCT</h3>
+<p>The cube size is estimated in advance at build time and is used by several subsequent steps, such as deciding the number of partitions of the MR/Spark job and computing the HBase region splits. Its accuracy has a great impact on build performance. When there are COUNT DISTINCT or TOPN measures, the estimate may deviate greatly from the real value because their sizes are flexible. In the past, users had to tune several parameters to bring the size estimate closer to the actual size, which was somewhat difficult for ordinary users.<br />
+Now Kylin automatically adjusts the size estimate based on the collected statistics, which brings the estimate closer to the actual size. See KYLIN-3453 for more information.</p>
+
+<h3 id="hadoop-30hbase-20">Hadoop 3.0/HBase 2.0 support</h3>
+<p>Hadoop 3 and HBase 2 are starting to be adopted by many users. Kylin now provides new binary packages compiled with the new Hadoop and HBase APIs. We have tested them on Hortonworks HDP 3.0 and Cloudera CDH 6.0.</p>
+
+<p><strong>Download</strong></p>
+
+<p>To download the Apache Kylin v2.5.0 source code or binary packages, visit the <a href="http://kylin.apache.org/download">download page</a>.</p>
+
+<p><strong>Upgrade</strong></p>
+
+<p>See the <a href="/docs/howto/howto_upgrade.html">upgrade guide</a>.</p>
+
+<p><strong>Feedback</strong></p>
+
+<p>If you run into problems or have questions, please send mail to the Apache Kylin dev or user mailing list: d...@kylin.apache.org, u...@kylin.apache.org; before sending, please make sure you have subscribed to the mailing list by sending an email to dev-subscr...@kylin.apache.org or user-subscr...@kylin.apache.org.</p>
+
+<p><em>Many thanks to everyone who contributed to Apache Kylin!</em></p>
+</description>
+ <pubDate>Thu, 20 Sep 2018 13:00:00 -0700</pubDate>
+ <link>http://kylin.apache.org/cn/blog/2018/09/20/release-v2.5.0/</link>
+ <guid isPermaLink="true">http://kylin.apache.org/cn/blog/2018/09/20/release-v2.5.0/</guid>
+
+
+ <category>blog</category>
+
+ </item>
+
+ <item>
+ <title>Apache Kylin v2.5.0
Release Announcement</title>
  <description><p>The Apache Kylin community is pleased to announce the release of Apache Kylin v2.5.0.</p>

@@ -303,70 +367,6 @@ Graphic 10 Process of Querying Cube</

  <category>blog</category>
-
- </item>
-
- <item>
- <title>Apache Kylin v2.5.0 Officially Released</title>
- <description><p>The Apache Kylin community is pleased to announce that Apache Kylin 2.5.0 has been officially released.</p>
-
-<p>Apache Kylin is an open source distributed analytics engine designed to provide a SQL interface and multi-dimensional analysis (OLAP) on extremely large datasets.</p>
-
-<p>This is a new feature release following 2.4.0. It introduces many valuable improvements; for the complete list of changes, see the <a href="https://kylin.apache.org/docs/release_notes.html">release notes</a>. Some of the major improvements are described here:</p>
-
-<h3 id="all-in-spark--cubing-">All-in-Spark cubing engine</h3>
-<p>Kylin’s Spark engine now uses Spark to run all distributed jobs in cube computation, including fetching the distinct values of each dimension, converting cuboid files to HBase HFiles, merging segments, merging dictionaries, and so on. The default Spark configuration has also been tuned so that users get an out-of-the-box experience. The related development tasks are KYLIN-3427, KYLIN-3441, and KYLIN-3442.</p>
-
-<p>Spark job management has also improved: once a Spark job starts running, you can get the job link in the web console; if you discard the job, Kylin terminates the Spark job immediately to release resources in time; if Kylin is restarted, it can resume from the previous job instead of submitting a new one.</p>
-
-<h3 id="mysql--kylin-">MySQL as Kylin metadata storage</h3>
-<p>In the past, HBase was the only choice for Kylin metadata storage. In some cases HBase is not suitable, for example when multiple HBase clusters are used to provide cross-region high availability for Kylin; the replicated HBase cluster is read-only, so it cannot serve as the metadata store. We have now introduced the MySQL Metastore to meet this need. This feature is currently in beta. See KYLIN-3488 for more.</p>
-
-<h3 id="hybrid-model-">Hybrid model GUI</h3>
-<p>Hybrid is an advanced model for assembling multiple cubes. It can be used when a cube’s schema needs to change. This feature had no GUI in the past, so only a small number of users knew about it. We have now enabled it in the web interface so that more users can try it.</p>
-
-<h3 id="cube-planner">Cube planner enabled by default</h3>
-<p>Cube planner can greatly optimize the cube structure and reduce the number of cuboids built, saving computing/storage resources and improving query performance. It was introduced in v2.3 but was not enabled by default. To let more users see it and try it, we enable it by default in v2.5. The algorithm automatically optimizes the cuboid set based on data statistics when the first segment is built.</p>
-
-<h3 id="segment-">Improved segment pruning</h3>
-<p>Segment (partition) pruning can effectively reduce disk and network I/O, and therefore greatly improves query performance. In the past, Kylin pruned segments only by the value of the partition column (partition date column). If a query did not use the partition column as a filter condition, pruning did not work and all segments were scanned.<br />
-Starting with v2.5, Kylin records the min/max value of every dimension at the segment level. Before scanning a segment, it compares the query conditions with the min/max index. If they do not match, the segment is skipped. See KYLIN-3370 for more information.</p>
-
-<h3 id="yarn-">Merging dictionaries on YARN</h3>
-<p>When segments are merged, their dictionaries also need to be merged. In the past, dictionary merging happened in Kylin’s JVM, which required a large amount of local memory and CPU resources. In extreme cases (if there were several concurrent jobs), it could crash the Kylin process. As a result, some users had to allocate more memory to the Kylin job node, or run multiple job nodes to balance the workload.<br />
-Starting with v2.5, Kylin submits this task to Hadoop MapReduce and Spark, which resolves this bottleneck. See KYLIN-3471 for more information.</p>
-
-<h3 id="cube-">Improved build performance for cubes using a global dictionary</h3>
-<p>The Global Dictionary (GD) is a prerequisite for bitmap-based exact count distinct. If the distinct-count column has very high cardinality, the GD can be very large. In the cube build phase, Kylin needs to convert non-integer values to integers through the GD. Although the GD has been split into multiple slices that can be loaded into memory separately, the values of the distinct-count column are unordered, so Kylin has to swap slices in and out repeatedly, which makes the build job very slow.<br />
-This enhancement introduces a new step that builds a shrunken dictionary from the global dictionary for each data block. Each task then only needs to load the shrunken dictionary, which avoids frequent swapping in and out. Performance can be 3 times faster than before. See KYLIN-3491 for more information.</p>
-
-<h3 id="topn-count-distinct--cube-">Improved size estimation for cubes with TOPN, COUNT DISTINCT</h3>
-<p>The cube size is estimated in advance at build time and is used by several subsequent steps, such as deciding the number of partitions of the MR/Spark job and computing the HBase region splits. Its accuracy has a great impact on build performance. When there are COUNT DISTINCT or TOPN measures, the estimate may deviate greatly from the real value because their sizes are flexible. In the past, users had to tune several parameters to bring the size estimate closer to the actual size, which was somewhat difficult for ordinary users.<br />
-Now Kylin automatically adjusts the size estimate based on the collected statistics, which brings the estimate closer to the actual size. See KYLIN-3453 for more information.</p>
-
-<h3 id="hadoop-30hbase-20">Hadoop 3.0/HBase 2.0 support</h3>
-<p>Hadoop 3 and HBase 2 are starting to be adopted by many users. Kylin now provides new binary packages compiled with the new Hadoop and HBase APIs. We have tested them on Hortonworks HDP 3.0 and Cloudera CDH 6.0.</p>
-
-<p><strong>Download</strong></p>
-
-<p>To download the Apache Kylin v2.5.0 source code or binary packages, visit the <a href="http://kylin.apache.org/download">download page</a>.</p>
-
-<p><strong>Upgrade</strong></p>
-
-<p>See the <a href="/docs/howto/howto_upgrade.html">upgrade guide</a>.</p>
-
-<p><strong>Feedback</strong></p>
-
-<p>If you run into problems or have questions, please send mail to the Apache Kylin dev or user mailing list: d...@kylin.apache.org, u...@kylin.apache.org; before sending, please make sure you have subscribed to the mailing list by sending an email to dev-subscr...@kylin.apache.org or user-subscr...@kylin.apache.org.</p>
-
-<p><em>Many thanks to everyone who contributed to Apache Kylin!</em></p>
-</description>
- <pubDate>Thu, 20 Sep 2018 13:00:00 -0700</pubDate>
- <link>http://kylin.apache.org/cn/blog/2018/09/20/release-v2.5.0/</link>
- <guid isPermaLink="true">http://kylin.apache.org/cn/blog/2018/09/20/release-v2.5.0/</guid>
-
-
- <category>blog</category>

  </item>
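
Note on the "kylin.engine.spark-conf." prefix described in the cube_spark.html change in this commit: the mapping from prefixed properties to spark-submit parameters can be illustrated with a minimal Python sketch. This is not Kylin's implementation (Kylin does this in its own Java code, which may differ); the function name spark_conf_args and the sample property values are assumptions made only for this example.

```python
PREFIX = "kylin.engine.spark-conf."


def spark_conf_args(properties_text):
    """Turn prefixed kylin.properties entries into spark-submit arguments.

    Hypothetical helper for illustration only; it mirrors the behaviour
    described in the tutorial text, not Kylin's actual code.
    """
    args = []
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        key = key.strip()
        # Ignore malformed lines and properties without the Spark prefix.
        if not sep or not key.startswith(PREFIX):
            continue
        # Drop the Kylin prefix; what remains is the Spark property name.
        spark_key = key[len(PREFIX):]
        args += ["--conf", f"{spark_key}={value.strip()}"]
    return args


if __name__ == "__main__":
    # Sample kylin.properties fragment; the values are made up.
    sample = """
    kylin.engine.spark-conf.spark.executor.memory=4G
    kylin.engine.spark-conf.spark.executor.cores=2
    kylin.metadata.url=kylin_metadata@hbase
    """
    print(" ".join(spark_conf_args(sample)))
```

Only the two prefixed entries in the sample produce a "--conf" pair; the unprefixed property is ignored, which matches the tutorial's statement that the prefixed properties are the ones extracted for "spark-submit".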