update faq and advanced_settings


class="post-content" > 
-                                                       <h4 
id="build-cube-failed-due-to-error-check-status">Build cube failed due to 
“error check status”</h4>
+                                                       <h4 
 “bin/” can locate hive/hcat jars in local, but 
Kylin reports error like “java.lang.NoClassDefFoundError: 
+  <li>
+    <p>Kylin needs many dependent jars (hadoop/hive/hcat/hbase/kafka) on 
the classpath to work, but Kylin doesn’t ship them. It looks for these jars on 
your local machine by running commands like <code 
class="highlighter-rouge">hbase classpath</code>, <code 
class="highlighter-rouge">hive -e set</code> etc. The paths of the jars found 
are appended to the environment variable <em>HBASE_CLASSPATH</em> (Kylin uses 
the <code class="highlighter-rouge">hbase</code> shell command to start up, which 
reads this variable). But in some Hadoop distributions (like EMR 5.0), the <code 
class="highlighter-rouge">hbase</code> shell doesn’t keep the original <code 
class="highlighter-rouge">HBASE_CLASSPATH</code> value, which causes the 
+  </li>
+  <li>
+    <p>To fix this, find the hbase shell script (in the hbase/bin folder), 
search for <em>HBASE_CLASSPATH</em>, and check whether it overwrites the value, like:</p>
+  </li>
+<div class="highlight"><pre><code class="language-groff" 
+  <li>If so, change it to keep the original value, like:</li>
+<div class="highlight"><pre><code class="language-groff" 
 Get “java.lang.IllegalArgumentException: Too high cardinality is not 
suitable for dictionary – cardinality: 5220674” in “Build Dimension 
Dictionary” step</h4>
+  <li>Kylin uses “Dictionary” encoding to encode/decode the dimension 
values (check <a href="/blog/2015/08/13/kylin-dictionary/">this blog</a>). 
Usually a dimension’s cardinality is below the millions, so the “Dict” 
encoding works well. Because the dictionary must be persisted and loaded into 
memory, a very high cardinality dimension gives it a tremendous memory 
footprint, so Kylin adds a check for this. If you see this error, first 
identify the ultra-high-cardinality (UHC) dimension and re-evaluate the design 
(does it really need to be a dimension?). If you must keep it, you can bypass this error in 
a couple of ways: 1) change to another encoding (like <code 
class="highlighter-rouge">fixed_length</code>, <code 
class="highlighter-rouge">integer</code>) 2) or set a bigger value for <code 
class="highlighter-rouge">kylin.dictionary.max.cardinality</code> in <code 
+<h4 id="build-cube-failed-due-to-error-check-status">3. Build cube failed due 
to “error check status”</h4>
   <li>Check if <code class="highlighter-rouge">kylin.log</code> contains 
<em>yarn.resourcemanager.webapp.address:</em> and 
<em> Connection refused</em></li>
   <li>A workaround is to update <code 
class="highlighter-rouge"></code>, set <code 
 cannot get master address from ZooKeeper on Hortonworks Sandbox</h4>
HBase cannot get master address from ZooKeeper on Hortonworks Sandbox</h4>
   <li>By default, Hortonworks disables HBase; you’ll have to start HBase from the 
Ambari homepage first.</li>
-<h4 id="map-reduce-job-information-cannot-display-on-hortonworks-sandbox">Map 
Reduce Job information cannot display on Hortonworks Sandbox</h4>
+<h4 id="map-reduce-job-information-cannot-display-on-hortonworks-sandbox">5. 
Map Reduce Job information cannot display on Hortonworks Sandbox</h4>
   <li>Check out <a 
-<h4 id="install-kylin-on-cdh-52-or-hadoop-25x">Install Kylin on CDH 5.2 or 
Hadoop 2.5.x</h4>
+<h4 id="how-to-install-kylin-on-cdh-52-or-hadoop-25x">6. How to Install Kylin 
on CDH 5.2 or Hadoop 2.5.x</h4>
   <li>Check out discussion: <a 
   My Cluster is running on Cloudera Distribution CDH 5.2.0.</code></pre></div>
 to load a big cube as HTable, with java.lang.OutOfMemoryError: unable to 
create new native thread</h4>
-  <li>HBase (as of writing) allocates one thread per region when bulk loading 
an HTable. Try to reduce the number of regions of your cube by setting its 
“capacity” to “MEDIUM” or “LARGE”. Tweaking the OS &amp; JVM can also 
allow more threads; for example, see <a 
href="";>this article</a>.</li>
 returns a negative result while all the numbers in this field are &gt; 0</h4>
 SUM(field) returns a negative result while all the numbers in this field are 
&gt; 0</h4>
   <li>If a column is declared as integer in Hive, the SQL engine (Calcite) 
uses the column’s type (integer) as the data type for “SUM(field)”, while 
the aggregated value of this field may exceed the range of integer; in that 
case the cast causes a negative value to be returned. The workaround is to alter 
that column’s type to BIGINT in Hive, and then sync the table schema to Kylin 
(the cube doesn’t need to be rebuilt). Keep in mind to always declare an integer 
column as BIGINT in Hive if it will be used as a measure in Kylin. See 
Hive number types: <a 
 Kylin needs to extract the distinct columns from the Fact Table before building 
 Why Kylin needs to extract the distinct columns from the Fact Table before building 
   <li>Kylin uses a dictionary to encode the values in each column, which greatly 
reduces the cube’s storage size. To build the dictionary, Kylin needs to fetch the 
distinct values for each column.</li>
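As a rough illustration of the idea above (not Kylin's actual dictionary implementation), the following shell sketch collects the distinct values of one column and assigns each an integer ID. The function name `build_dict` and the tab-separated output format are assumptions made for this example.

```shell
#!/bin/sh
# Toy dictionary builder: reads one column value per line on stdin and
# prints "<id><TAB><value>" for each distinct value.
# build_dict is a hypothetical helper, not part of Kylin.
build_dict() {
  # LC_ALL=C gives a stable byte-wise sort order; sort -u keeps distinct values.
  LC_ALL=C sort -u | awk '{ printf "%d\t%s\n", NR - 1, $0 }'
}
```

For example, `printf 'US\nCN\nUS\n' | build_dict` yields two entries, because the repeated value is stored only once. Storing a small integer ID instead of the full value is what shrinks the encoded data.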
-<h4 id="why-kylin-calculate-the-hive-table-cardinality">Why does Kylin calculate 
the Hive table cardinality?</h4>
+<h4 id="why-kylin-calculate-the-hive-table-cardinality">9. Why does Kylin calculate 
the Hive table cardinality?</h4>
   <li>The cardinality of dimensions is an important measure of cube 
complexity. The higher the cardinality, the bigger the cube, and thus the 
longer it takes to build and the slower it is to query. Cardinality &gt; 1,000 is worth 
attention and &gt; 1,000,000 should be avoided where possible. For optimal cube 
performance, try to reduce high cardinality by categorizing values or deriving 
-<h4 id="how-to-add-new-user-or-change-the-default-password">How to add new 
user or change the default password?</h4>
+<h4 id="how-to-add-new-user-or-change-the-default-password">10. How to add new 
user or change the default password?</h4>
   <li>Kylin web’s security is implemented with the Spring Security framework, 
where kylinSecurity.xml is the main configuration file:</li>
   <li>The password hashes for the pre-defined test users can be found in the profile 
“sandbox,testing” part. To change the default password, generate a 
new hash and update it here; please refer to the code snippet in: <a 
-  <li>When you deploy Kylin for more users, switching to LDAP authentication is 
recommended. To enable LDAP authentication, set “kylin.sandbox” in 
conf/ to false, and also configure the ldap.* properties in 
+  <li>When you deploy Kylin for more users, switching to LDAP authentication is 
-<h4 id="using-sub-query-for-un-supported-sql">Using sub-query for un-supported 
+<h4 id="using-sub-query-for-un-supported-sql">11. Using sub-query for 
un-supported SQL</h4>
 <div class="highlight"><pre><code class="language-groff" 
data-lang="groff">Original SQL:
 select fact.slr_sgmt,
@@ -2192,11 +2212,11 @@ from (
 ) a
 group by a.slr_sgmt</code></pre></div>
-<h4 id="build-kylin-meet-npm-errors-">Build kylin meet NPM errors 
+<h4 id="build-kylin-meet-npm-errors-">12. Build kylin meet NPM errors 
-    <p>Please add proxy for your NPM (请为NPM设置代理):  <br />
+    <p>Please add a proxy for your NPM:  <br />
   <code class="highlighter-rouge">npm config set proxy 
@@ -2205,7 +2225,7 @@ group by a.slr_sgmt</code></pre></div>
 to run BuildCubeWithEngineTest, saying failed to connect to hbase while hbase 
is active</h4>
 Failed to run BuildCubeWithEngineTest, saying failed to connect to hbase while 
hbase is active</h4>
   <li>You may get this error the first time you run the HBase client. Check 
the error trace to see whether there is an error saying it couldn’t access a 
folder like “/hadoop/hbase/local/jars”; if that folder doesn’t exist, 
create it.</li>
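A minimal sketch of that last step, assuming a POSIX shell and that you have permission to create the folder. The helper name `ensure_dir` is hypothetical, and the path you pass should be the one reported in your own error trace.

```shell
#!/bin/sh
# ensure_dir is a hypothetical helper: create the directory (and any missing
# parents) only when it does not exist yet.
ensure_dir() {
  if [ ! -d "$1" ]; then
    mkdir -p "$1"
  fi
}
```

With the example path from the text, the call would be `ensure_dir /hadoop/hbase/local/jars`. Calling it again is harmless, since it does nothing when the folder already exists.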

 <p>Compression settings only take effect after restarting the Kylin server 
+<h2 id="allocate-more-memory-to-kylin-instance">Allocate more memory to Kylin 
+<p>Open <code class="highlighter-rouge">bin/</code>, which has two 
sample settings for the <code class="highlighter-rouge">KYLIN_JVM_SETTINGS</code> 
environment variable. The default setting is small (4GB at most); you can 
comment it out and un-comment the next line to allocate 16GB:</p>
+<div class="highlight"><pre><code class="language-groff" 
data-lang="groff">export KYLIN_JVM_SETTINGS="-Xms1024M -Xmx4096M -Xss1024K 
-XX:MaxPermSize=128M -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps 
-Xloggc:$KYLIN_HOME/logs/kylin.gc.$$ -XX:+UseGCLogFileRotation 
-XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=64M"
+# export KYLIN_JVM_SETTINGS="-Xms16g -Xmx16g -XX:MaxPermSize=512m 
-XX:NewSize=3g -XX:MaxNewSize=3g -XX:SurvivorRatio=4 
-XX:+CMSClassUnloadingEnabled -XX:+CMSParallelRemarkEnabled 
-XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode 
-XX:CMSInitiatingOccupancyFraction=70 -XX:+DisableExplicitGC 
 <h2 id="enable-ldap-or-sso-authentication">Enable LDAP or SSO 
 <p>Check <a href="../howto/howto_ldap_and_sso.html">How to Enable Security 
with LDAP and SSO</a></p>
 <p>Restart the Kylin server for the change to take effect. To disable, set <code 
class="highlighter-rouge">mail.enabled</code> back to <code 
+<p>Administrators get notifications for all jobs. Modelers and Analysts need to 
enter their email address into the “Notification List” on the first page of the cube 
wizard to be notified for that cube.</p>

     <description>Apache Kylin Home</description>
     <atom:link href=""; rel="self" 
-    <pubDate>Sun, 16 Oct 2016 07:26:42 -0700</pubDate>
-    <lastBuildDate>Sun, 16 Oct 2016 07:26:42 -0700</lastBuildDate>
+    <pubDate>Tue, 18 Oct 2016 06:59:12 -0700</pubDate>
+    <lastBuildDate>Tue, 18 Oct 2016 06:59:12 -0700</lastBuildDate>
     <generator>Jekyll v2.5.3</generator>

