Hi,
Here is my suggestion; please check whether it satisfies your request.

First, check the dictionary metadata for the column you want and get its path in HDFS.
Second, fetch the dictionary file to your local disk.
Third, use DumpDictionaryCLI to dump the dictionary's contents.


The following is my output:

hadoop fs -get /kylin/kylin_4117/resources/dict/LACUS.USERACTIONLOG/CITY/139315b2-44ba-b5ff-dea5-431c308cd399.dict
sh bin/kylin.sh org.apache.kylin.cube.cli.DumpDictionaryCLI 139315b2-44ba-b5ff-dea5-431c308cd399.dict


[root@cdh-client apache-kylin-3.0.0-SNAPSHOT-bin]# sh bin/kylin.sh org.apache.kylin.cube.cli.DumpDictionaryCLI 139315b2-44ba-b5ff-dea5-431c308cd399.dict
Using cached dependency...
KYLIN_JVM_SETTINGS is -Xms1024M -Xmx4096M -Dcalcite.debug -Xss1024K 
-XX:MaxPermSize=512M -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps 
-Xloggc:/root/xiaoxiang.yu/apache-kylin-3.0.0-SNAPSHOT-bin/logs/kylin.gc.23118 
-XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=64M
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512M; 
support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: Using incremental CMS is deprecated 
and will likely be removed in a future release
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/root/xiaoxiang.yu/apache-kylin-3.0.0-SNAPSHOT-bin/tool/kylin-tool-3.0.0-SNAPSHOT.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/opt/cloudera/parcels/CDH-5.7.6-1.cdh5.7.6.p0.6/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
============================================================================
File: 
/root/xiaoxiang.yu/apache-kylin-3.0.0-SNAPSHOT-bin/139315b2-44ba-b5ff-dea5-431c308cd399.dict
Thu Aug 08 20:47:26 CST 2019
{
  "uuid" : "139315b2-44ba-b5ff-dea5-431c308cd399",
  "last_modified" : 1565268446094,
  "version" : "2.6.0.20500",
  "source_table" : "LACUS.USERACTIONLOG",
  "source_column" : "CITY",
  "source_column_index" : 10,
  "data_type" : "varchar(30)",
  "input" : {
    "path" : 
"hdfs://cdh-master:8020/kylin/kylin_4117/kylin-86514b4e-ae55-ca6f-935a-b38bf55cf190/IntersectCountCube/fact_distinct_columns/USERACTIONLOG.CITY",
    "size" : 439,
    "last_modified_time" : 1565268427282
  },
  "dictionary_class" : "org.apache.kylin.dict.TrieDictionaryForest",
  "cardinality" : 9
}
TrieDictionaryForest
baseId:0
value divide:beijing
offset divide:0
----tree 0--------
Total 9 values
0 (0): beijing
1 (1): chongqin
2 (2): guangzhou
3 (3): hangzhou
4 (4): nanjing
5 (5): shanghai
6 (6): shenzhen
7 (7): taibei
8 (8): xianggang
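
If you want the distinct values programmatically (for example, to feed them into Elasticsearch), one option is to strip the id prefixes from the DumpDictionaryCLI output. This is a rough sketch, assuming the dump above is saved to a file named dict.txt (a hypothetical filename) and uses the `id (offset): value` line format shown:

```shell
# Keep only lines of the form "id (offset): value", then drop the prefix,
# leaving one dictionary value per line.
grep -E '^[0-9]+ \([0-9]+\): ' dict.txt | sed -E 's/^[0-9]+ \([0-9]+\): //'
```

The resulting list can then be loaded into whatever store you prefer; the dictionary itself stays untouched.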



----------------
Best wishes,
Xiaoxiang Yu


From: lk_hadoop <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Monday, October 21, 2019, 13:32
To: user <[email protected]>, dev <[email protected]>
Subject: can I read or write "Extract Fact Table Distinct Columns" result to somewhere

hi, all
    Some dimensions, like product name, may have many different values. I need to list all the values so users can select exactly the value they want. Since step 3, "Extract Fact Table Distinct Columns", has already calculated each dimension's distinct values, can I directly read that result or write it somewhere like Elasticsearch? Is there an easy way to do this?

2019-10-21
________________________________
lk_hadoop
