Hi,
??????
hadoop3.1.2
kylin:2.6.1

????????????#3 step ?? Extract Fact Table Distinct Columns)







kylin.log ??????????????~??
??????????????:
2019-03-08 11:15:08,660 INFO  [Scheduler 190936888 Job 
edba142b-b9ec-2379-89f6-2a34511d0425-177] spark.SparkExecutable:262 : cmd: 
export HADOOP_CONF_DIR=/home/apache/hbase/conf && 
/home/apache/kylin/spark/bin/spark-submit --class 
org.apache.kylin.common.util.SparkEntry  --conf spark.executor.instances=40  
--conf spark.yarn.archive=hdfs://master:9820/kylin/spark/spark-libs.jar  --conf 
spark.network.timeout=600  --conf spark.yarn.queue=default  --conf 
spark.history.fs.logDirectory=hdfs:///kylin/spark-history  --conf 
spark.io.compression.codec=org.apache.spark.io.SnappyCompressionCodec  --conf 
spark.dynamicAllocation.enabled=true  --conf spark.master=yarn  --conf 
spark.storage.memoryFraction=0.5  --conf 
spark.dynamicAllocation.executorIdleTimeout=300  --conf 
spark.hadoop.yarn.timeline-service.enabled=false  --conf 
spark.executor.memory=4G  --conf spark.eventLog.enabled=true  --conf 
spark.eventLog.dir=hdfs:///kylin/spark-history  --conf 
spark.dynamicAllocation.minExecutors=1  --conf spark.executor.cores=2  --conf 
spark.yarn.executor.memoryOverhead=1024  --conf spark.hadoop.dfs.replication=2  
--conf spark.dynamicAllocation.maxExecutors=1000  --conf 
spark.executor.memoryOverhead=1024  --conf spark.driver.memory=2G  --conf 
spark.driver.memoryOverhead=256  --conf spark.submit.deployMode=cluster  --conf 
spark.shuffle.service.enabled=true --jars 
/home/apache/kylin/lib/kylin-job-2.6.1.jar 
/home/apache/kylin/lib/kylin-job-2.6.1.jar -className 
org.apache.kylin.engine.spark.SparkFactDistinct -counterOutput 
hdfs://kylincluster/kylin/kylin_metadata/kylin-edba142b-b9ec-2379-89f6-2a34511d0425/spark_cube_low/counter
 -statisticssamplingpercent 100 -cubename spark_cube_low -hiveTable 
default.kylin_intermediate_spark_cube_low_63876e8b_97c8_074d_0356_3f779a6ca1b6 
-output 
hdfs://kylincluster/kylin/kylin_metadata/kylin-edba142b-b9ec-2379-89f6-2a34511d0425/spark_cube_low/fact_distinct_columns
 -input 
hdfs://kylincluster/kylin/kylin_metadata/kylin-edba142b-b9ec-2379-89f6-2a34511d0425/kylin_intermediate_spark_cube_low_63876e8b_97c8_074d_0356_3f779a6ca1b6
 -segmentId 63876e8b-97c8-074d-0356-3f779a6ca1b6 -metaUrl 
kylin_metadata@hdfs,path=hdfs://kylincluster/kylin/kylin_metadata/kylin-edba142b-b9ec-2379-89f6-2a34511d0425/spark_cube_low/metadata
2019-03-08 11:15:09,423 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:09 WARN  SparkConf:66 - The configuration key 
'spark.yarn.executor.memoryOverhead' has been deprecated as of Spark 2.3 and 
may be removed in the future. Please use the new key 
'spark.executor.memoryOverhead' instead.
2019-03-08 11:15:10,478 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:10 INFO  Client:54 - Requesting a new application from cluster 
with 2 NodeManagers
2019-03-08 11:15:10,541 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:10 INFO  Client:54 - Verifying our application has not 
requested more than the maximum memory capability of the cluster (8000 MB per 
container)
2019-03-08 11:15:10,542 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:10 INFO  Client:54 - Will allocate AM container, with 2304 MB 
memory including 256 MB overhead
2019-03-08 11:15:10,542 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:10 INFO  Client:54 - Setting up container launch context for 
our AM
2019-03-08 11:15:10,544 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:10 INFO  Client:54 - Setting up the launch environment for our 
AM container
2019-03-08 11:15:10,552 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:10 INFO  Client:54 - Preparing resources for our AM container
2019-03-08 11:15:11,553 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:11 INFO  Client:54 - Uploading resource 
hdfs://master:9820/kylin/spark/spark-libs.jar -> 
hdfs://kylincluster/user/root/.sparkStaging/application_1552042715496_0010/spark-libs.jar
2019-03-08 11:15:13,978 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:13 INFO  Client:54 - Uploading resource 
file:/home/apache/kylin/lib/kylin-job-2.6.1.jar -> 
hdfs://kylincluster/user/root/.sparkStaging/application_1552042715496_0010/kylin-job-2.6.1.jar
2019-03-08 11:15:14,485 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:14 WARN  Client:66 - Same name resource 
file:///home/apache/kylin/lib/kylin-job-2.6.1.jar added multiple times to 
distributed cache
2019-03-08 11:15:14,594 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:14 INFO  Client:54 - Uploading resource 
file:/tmp/spark-1e734412-567c-4c59-a6c6-9c58c83355db/__spark_conf__6590808330343993974.zip
 -> 
hdfs://kylincluster/user/root/.sparkStaging/application_1552042715496_0010/__spark_conf__.zip
2019-03-08 11:15:14,796 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:14 INFO  SecurityManager:54 - Changing view acls to: root
2019-03-08 11:15:14,797 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:14 INFO  SecurityManager:54 - Changing modify acls to: root
2019-03-08 11:15:14,798 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:14 INFO  SecurityManager:54 - Changing view acls groups to: 
2019-03-08 11:15:14,798 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:14 INFO  SecurityManager:54 - Changing modify acls groups to: 
2019-03-08 11:15:14,800 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:14 INFO  SecurityManager:54 - SecurityManager: authentication 
disabled; ui acls disabled; users  with view permissions: Set(root); groups 
with view permissions: Set(); users  with modify permissions: Set(root); groups 
with modify permissions: Set()
2019-03-08 11:15:14,811 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:14 INFO  Client:54 - Submitting application 
application_1552042715496_0010 to ResourceManager
2019-03-08 11:15:15,038 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:15 INFO  YarnClientImpl:273 - Submitted application 
application_1552042715496_0010
2019-03-08 11:15:16,042 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:16 INFO  Client:54 - Application report for 
application_1552042715496_0010 (state: ACCEPTED)
2019-03-08 11:15:16,046 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:16 INFO  Client:54 - 
2019-03-08 11:15:16,046 INFO  [pool-16-thread-1] spark.SparkExecutable:38 :     
 client token: N/A
2019-03-08 11:15:16,046 INFO  [pool-16-thread-1] spark.SparkExecutable:38 :     
 diagnostics: AM container is launched, waiting for AM container to Register 
with RM
2019-03-08 11:15:16,046 INFO  [pool-16-thread-1] spark.SparkExecutable:38 :     
 ApplicationMaster host: N/A
2019-03-08 11:15:16,046 INFO  [pool-16-thread-1] spark.SparkExecutable:38 :     
 ApplicationMaster RPC port: -1
2019-03-08 11:15:16,046 INFO  [pool-16-thread-1] spark.SparkExecutable:38 :     
 queue: default
2019-03-08 11:15:16,047 INFO  [pool-16-thread-1] spark.SparkExecutable:38 :     
 start time: 1552043714819
2019-03-08 11:15:16,047 INFO  [pool-16-thread-1] spark.SparkExecutable:38 :     
 final status: UNDEFINED
2019-03-08 11:15:16,047 INFO  [pool-16-thread-1] spark.SparkExecutable:38 :     
 tracking URL: http://master:8088/proxy/application_1552042715496_0010/
2019-03-08 11:15:16,050 INFO  [pool-16-thread-1] spark.SparkExecutable:38 :     
 user: root
2019-03-08 11:15:17,048 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:17 INFO  Client:54 - Application report for 
application_1552042715496_0010 (state: ACCEPTED)
2019-03-08 11:15:18,050 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:18 INFO  Client:54 - Application report for 
application_1552042715496_0010 (state: ACCEPTED)
2019-03-08 11:15:19,052 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:19 INFO  Client:54 - Application report for 
application_1552042715496_0010 (state: ACCEPTED)
2019-03-08 11:15:20,054 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:20 INFO  Client:54 - Application report for 
application_1552042715496_0010 (state: ACCEPTED)
2019-03-08 11:15:21,056 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:21 INFO  Client:54 - Application report for 
application_1552042715496_0010 (state: ACCEPTED)
2019-03-08 11:15:22,058 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:22 INFO  Client:54 - Application report for 
application_1552042715496_0010 (state: RUNNING)
2019-03-08 11:15:22,059 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:22 INFO  Client:54 - 
2019-03-08 11:15:22,059 INFO  [pool-16-thread-1] spark.SparkExecutable:38 :     
 client token: N/A
2019-03-08 11:15:22,059 INFO  [pool-16-thread-1] spark.SparkExecutable:38 :     
 diagnostics: N/A
2019-03-08 11:15:22,059 INFO  [pool-16-thread-1] spark.SparkExecutable:38 :     
 ApplicationMaster host: 192.168.0.110
2019-03-08 11:15:22,059 INFO  [pool-16-thread-1] spark.SparkExecutable:38 :     
 ApplicationMaster RPC port: 0
2019-03-08 11:15:22,059 INFO  [pool-16-thread-1] spark.SparkExecutable:38 :     
 queue: default
2019-03-08 11:15:22,059 INFO  [pool-16-thread-1] spark.SparkExecutable:38 :     
 start time: 1552043714819
2019-03-08 11:15:22,059 INFO  [pool-16-thread-1] spark.SparkExecutable:38 :     
 final status: UNDEFINED
2019-03-08 11:15:22,059 INFO  [pool-16-thread-1] spark.SparkExecutable:38 :     
 tracking URL: http://master:8088/proxy/application_1552042715496_0010/
2019-03-08 11:15:22,063 INFO  [pool-16-thread-1] spark.SparkExecutable:38 :     
 user: root
2019-03-08 11:15:23,061 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:23 INFO  Client:54 - Application report for 
application_1552042715496_0010 (state: RUNNING)
2019-03-08 11:15:24,062 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:24 INFO  Client:54 - Application report for 
application_1552042715496_0010 (state: RUNNING)
2019-03-08 11:15:24,647 INFO  [FetcherRunner 1227064704-51] 
threadpool.DefaultFetcherRunner:94 : Job Fetcher: 1 should running, 1 actual 
running, 1 stopped, 0 ready, 22 already succeed, 29 error, 0 discarded, 0 others
2019-03-08 11:15:25,065 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:25 INFO  Client:54 - Application report for 
application_1552042715496_0010 (state: RUNNING)
2019-03-08 11:15:26,067 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:26 INFO  Client:54 - Application report for 
application_1552042715496_0010 (state: RUNNING)
2019-03-08 11:15:27,069 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:27 INFO  Client:54 - Application report for 
application_1552042715496_0010 (state: RUNNING)
2019-03-08 11:15:28,071 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:28 INFO  Client:54 - Application report for 
application_1552042715496_0010 (state: RUNNING)
2019-03-08 11:15:29,073 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:29 INFO  Client:54 - Application report for 
application_1552042715496_0010 (state: RUNNING)
2019-03-08 11:15:30,075 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:30 INFO  Client:54 - Application report for 
application_1552042715496_0010 (state: RUNNING)
2019-03-08 11:15:31,077 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:31 INFO  Client:54 - Application report for 
application_1552042715496_0010 (state: RUNNING)
2019-03-08 11:15:32,079 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:32 INFO  Client:54 - Application report for 
application_1552042715496_0010 (state: RUNNING)
2019-03-08 11:15:33,081 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:33 INFO  Client:54 - Application report for 
application_1552042715496_0010 (state: RUNNING)
2019-03-08 11:15:33,487 DEBUG [http-nio-7070-exec-4] common.KylinConfig:328 : 
KYLIN_CONF property was not set, will seek KYLIN_HOME env variable
2019-03-08 11:15:33,487 INFO  [http-nio-7070-exec-4] common.KylinConfig:334 : 
Use KYLIN_HOME=/home/apache/kylin
2019-03-08 11:15:34,083 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:34 INFO  Client:54 - Application report for 
application_1552042715496_0010 (state: RUNNING)
2019-03-08 11:15:34,242 DEBUG [http-nio-7070-exec-3] 
badquery.BadQueryHistoryManager:65 : Loaded 0 Bad Query(s)
2019-03-08 11:15:35,085 INFO  [pool-16-thread-1] spark.SparkExecutable:38 : 
2019-03-08 11:15:35 INFO  Client:54 - Application report for 
application_15520427



spark ?????????? INFO  Client:54 - Application report for 
application_1552042715496_0010 (state: RUNNING)
????????????????????????????10%
spark??job????,??????



??cube??map reduce????????????????????


hadoop-root-resourcemanager-master.log ????????????????yarn??container?? 
??????1-4??????????
??????resourcemanager??????
2019-03-08 11:32:34,829 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
 Allocation proposal accepted
2019-03-08 11:32:37,590 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
container_e89_1552042715496_0010_01_000409 Container Transitioned from ACQUIRED 
to RELEASED
2019-03-08 11:32:37,590 INFO 
org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=root     
IP=192.168.0.110        OPERATION=AM Released Container TARGET=SchedulerApp     
RESULT=SUCCESS  APPID=application_1552042715496_0010    
CONTAINERID=container_e89_1552042715496_0010_01_000409  RESOURCE=<memory:6144, 
vCores:1>        QUEUENAME=default
2019-03-08 11:32:37,591 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
container_e89_1552042715496_0010_01_000410 Container Transitioned from 
ALLOCATED to ACQUIRED
2019-03-08 11:32:37,833 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.AbstractContainerAllocator:
 assignedContainer application attempt=appattempt_1552042715496_0010_000001 
container=null queue=default clusterResource=<memory:32000, vCores:24> 
type=NODE_LOCAL requestedPartition=
2019-03-08 11:32:37,833 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
container_e89_1552042715496_0010_01_000411 Container Transitioned from NEW to 
ALLOCATED
2019-03-08 11:32:37,833 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerNode:
 Assigned container container_e89_1552042715496_0010_01_000411 of capacity 
<memory:6144, vCores:1> on host master:33725, which has 2 containers, 
<memory:12288, vCores:2> used and <memory:3712, vCores:10> available after 
allocation
2019-03-08 11:32:37,833 INFO 
org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=root     
OPERATION=AM Allocated Container        TARGET=SchedulerApp     RESULT=SUCCESS  
APPID=application_1552042715496_0010    
CONTAINERID=container_e89_1552042715496_0010_01_000411  RESOURCE=<memory:6144, 
vCores:1>        QUEUENAME=default
2019-03-08 11:32:37,833 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: 
assignedContainer queue=root usedCapacity=0.512 absoluteUsedCapacity=0.512 
used=<memory:16384, vCores:3> cluster=<memory:32000, vCores:24>
2019-03-08 11:32:37,833 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
 Allocation proposal accepted
2019-03-08 11:32:40,595 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
container_e89_1552042715496_0010_01_000410 Container Transitioned from ACQUIRED 
to RELEASED
2019-03-08 11:32:40,595 INFO 
org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=root     
IP=192.168.0.110        OPERATION=AM Released Container TARGET=SchedulerApp     
RESULT=SUCCESS  APPID=application_1552042715496_0010    
CONTAINERID=container_e89_1552042715496_0010_01_000410  RESOURCE=<memory:6144, 
vCores:1>        QUEUENAME=default
2019-03-08 11:32:40,596 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
container_e89_1552042715496_0010_01_000411 Container Transitioned from 
ALLOCATED to ACQUIRED
2019-03-08 11:32:40,836 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.AbstractContainerAllocator:
 assignedContainer application attempt=appattempt_1552042715496_0010_000001 
container=null queue=default clusterResource=<memory:32000, vCores:24> 
type=NODE_LOCAL requestedPartition=
2019-03-08 11:32:40,836 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
container_e89_1552042715496_0010_01_000412 Container Transitioned from NEW to 
ALLOCATED
2019-03-08 11:32:40,836 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerNode:
 Assigned container container_e89_1552042715496_0010_01_000412 of capacity 
<memory:6144, vCores:1> on host master:33725, which has 2 containers, 
<memory:12288, vCores:2> used and <memory:3712, vCores:10> available after 
allocation
2019-03-08 11:32:40,836 INFO 
org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=root     
OPERATION=AM Allocated Container        TARGET=SchedulerApp     RESULT=SUCCESS  
APPID=application_1552042715496_0010    
CONTAINERID=container_e89_1552042715496_0010_01_000412  RESOURCE=<memory:6144, 
vCores:1>        QUEUENAME=default
2019-03-08 11:32:40,836 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: 
assignedContainer queue=root usedCapacity=0.512 absoluteUsedCapacity=0.512 
used=<memory:16384, vCores:3> cluster=<memory:32000, vCores:24>
2019-03-08 11:32:40,836 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
 Allocation proposal accepted







??????diagnosis
??????????????????????????

Best regards

Reply via email to