[jira] [Commented] (KYLIN-2642) Relax check in RowKeyColDesc to keep backward compatibility
[ https://issues.apache.org/jira/browse/KYLIN-2642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027271#comment-16027271 ]

liyang commented on KYLIN-2642:
-------------------------------

I don't see a big problem with relaxing the check. I just want to double-check the GUI: can a user select fixed_len encoding on integers? If not, then I think we are good.

> Relax check in RowKeyColDesc to keep backward compatibility
> -----------------------------------------------------------
>
> Key: KYLIN-2642
> URL: https://issues.apache.org/jira/browse/KYLIN-2642
> Project: Kylin
> Issue Type: Bug
> Components: Metadata
> Affects Versions: v2.0.0
> Reporter: kangkaisen
> Assignee: kangkaisen
> Priority: Minor
> Attachments: KYLIN-2642.patch
>
>
> This check will make the cube DESCBROKEN if the user used FixedLenDimEnc to encode an integer column:
> {code:java}
> if (encodingName.startsWith(FixedLenDimEnc.ENCODING_NAME) && (type.isIntegerFamily() || type.isNumberFamily())) {
>     throw new IllegalArgumentException(colRef + " type is " + type + " and cannot apply fixed_length encoding");
> }
> {code}

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (KYLIN-2638) Cannot install Kylin
[ https://issues.apache.org/jira/browse/KYLIN-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

liyang updated KYLIN-2638:
--------------------------
    Labels: newbie  (was: )

> Cannot install Kylin
> --------------------
>
> Key: KYLIN-2638
> URL: https://issues.apache.org/jira/browse/KYLIN-2638
> Project: Kylin
> Issue Type: Bug
> Components: Environment
> Affects Versions: v2.0.0
> Reporter: Gergely
> Assignee: hongbin ma
> Priority: Blocker
> Labels: newbie
>
> Hi All,
> We have a blocking issue installing Kylin on CDH 5.11. Running the following line on CentOS 7 always returns an empty string.
> Could you please advise?
> bash $KYLIN_HOME/bin/get-properties.sh kylin.env.hdfs-working-dir
> KYLIN_HOME points to the right location. The result is also empty when we call the script directly:
> bash get-properties.sh kylin.env.hdfs-working-dir
> Many thanks in advance.
> Regards
[jira] [Commented] (KYLIN-2619) Use newCachedThreadPool instead of newFixedThreadPool in Broadcaster
[ https://issues.apache.org/jira/browse/KYLIN-2619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027275#comment-16027275 ]

Shaofeng SHI commented on KYLIN-2619:
-------------------------------------

The newCachedThreadPool isn't recommended because there is no limit on the maximum size of the pool. It will cause too many threads to be created when the app is busy or the network isn't stable. The fixed-size pool isn't good either. I will directly create a "ThreadPoolExecutor" object with a min/max pool size and a keep-alive time. That will avoid cachedThreadPool's limitation and be more dynamic than a fixed-size pool.

> Use newCachedThreadPool instead of newFixedThreadPool in Broadcaster
> --------------------------------------------------------------------
>
> Key: KYLIN-2619
> URL: https://issues.apache.org/jira/browse/KYLIN-2619
> Project: Kylin
> Issue Type: Improvement
> Components: Metadata
> Affects Versions: v2.0.0
> Reporter: kangkaisen
> Assignee: kangkaisen
> Fix For: v2.1.0
>
> Attachments: KYLIN-2619.patch
>
>
> We should use newCachedThreadPool instead of newFixedThreadPool in Broadcaster because newCachedThreadPool is more flexible.
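
For readers following the KYLIN-2619 thread, the bounded pool proposed in the comment could look roughly like the sketch below. It is not the code that was committed to Kylin: the class name, the sizes (2, 8, 60 seconds), the SynchronousQueue and the CallerRunsPolicy are all assumptions chosen for illustration. Only `ThreadPoolExecutor` and its standard constructor come from the JDK.

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; not the Broadcaster implementation.
class BoundedPoolSketch {
    // Assumed values; the thread does not state the real ones.
    static final int MIN_THREADS = 2;
    static final int MAX_THREADS = 8;
    static final long KEEP_ALIVE_SECONDS = 60L;

    static ThreadPoolExecutor newBoundedPool() {
        // A SynchronousQueue hands each task directly to a thread, so the
        // pool grows from MIN_THREADS up to MAX_THREADS under load. With an
        // unbounded queue the pool would never grow past its core size.
        return new ThreadPoolExecutor(
                MIN_THREADS, MAX_THREADS,
                KEEP_ALIVE_SECONDS, TimeUnit.SECONDS,
                new SynchronousQueue<Runnable>(),
                // When all MAX_THREADS are busy, run the task on the
                // submitting thread instead of creating more threads.
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor pool = newBoundedPool();
        for (int i = 0; i < 20; i++) {
            final int id = i;
            pool.execute(() -> System.out.println(
                    "task " + id + " on " + Thread.currentThread().getName()));
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
    }
}
```

Threads above the minimum are reclaimed after the keep-alive time, which gives the elasticity of a cached pool while keeping the hard upper bound of a fixed pool.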
[jira] [Commented] (KYLIN-2511) java.lang.NoClassDefFoundError: org/apache/hadoop/hive/serde2/typeinfo/TypeInfo
[ https://issues.apache.org/jira/browse/KYLIN-2511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027287#comment-16027287 ]

liyang commented on KYLIN-2511:
-------------------------------

Since 2.0, another workaround besides "kylin.engine.mr.lib-dir" is to set the environment variables below.

export HADOOP_CONF_DIR=/etc/hadoop/conf
export HIVE_LIB=/usr/lib/hive
export HIVE_CONF=/etc/hive/conf
export HCAT_HOME=/usr/lib/hive-hcatalog

Then "bin/find-hive-dependencies.sh" should pick up the right hive jars.

> java.lang.NoClassDefFoundError: org/apache/hadoop/hive/serde2/typeinfo/TypeInfo
> -------------------------------------------------------------------------------
>
> Key: KYLIN-2511
> URL: https://issues.apache.org/jira/browse/KYLIN-2511
> Project: Kylin
> Issue Type: Bug
> Reporter: LIUCUN
>
> When I use apache-kylin-1.6.0-cdh5.7-bin.tar.gz with CDH 5.8 to build the sample cube, an error occurs at the Extract Fact Table Distinct Columns step. The error info is as follows:
> 2017-03-16 11:21:18,357 ERROR [pool-9-thread-6] threadpool.DefaultScheduler:140 : ExecuteException job:02941fbe-6f3a-4a2c-bb99-4bb61b388f80
> org.apache.kylin.job.exception.ExecuteException: org.apache.kylin.job.exception.ExecuteException: java.lang.NoClassDefFoundError: org/apache/hadoop/hive/serde2/typeinfo/TypeInfo
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:123)
>     at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:136)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.kylin.job.exception.ExecuteException: java.lang.NoClassDefFoundError: org/apache/hadoop/hive/serde2/typeinfo/TypeInfo
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:123)
>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
>     ... 4 more
> Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/hive/serde2/typeinfo/TypeInfo
>     at org.apache.kylin.source.hive.HiveMRInput$HiveTableInputFormat.configureJob(HiveMRInput.java:105)
>     at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.setupMapper(FactDistinctColumnsJob.java:119)
>     at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:103)
>     at org.apache.kylin.engine.mr.MRUtil.runMRJob(MRUtil.java:92)
>     at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:120)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
>     ... 6 more
> Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.serde2.typeinfo.TypeInfo
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>     ... 12 more
[jira] [Commented] (KYLIN-2634) kylin build stops while building dimension dictionary with file not found exception
[ https://issues.apache.org/jira/browse/KYLIN-2634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027310#comment-16027310 ]

liyang commented on KYLIN-2634:
-------------------------------

Confirmed: this is caused by corrupted metadata. HBase metadata falls back to HDFS for big resources (> 10 MB if I remember correctly). In this case, the resource "/dict/DEFAULT.LOG_DATA_170416/IP/56abfe1f-1fdf-4bec-baec-43721e693c32.dict" was deleted by accident, which causes this error.

> kylin build stops while building dimension dictionary with file not found exception
> -----------------------------------------------------------------------------------
>
> Key: KYLIN-2634
> URL: https://issues.apache.org/jira/browse/KYLIN-2634
> Project: Kylin
> Issue Type: Bug
> Components: Job Engine
> Affects Versions: v2.0.0
> Environment: CentOS release 6.8 (Final) x64
> CDH 5.9
> Reporter: flashput
> Assignee: Dong Li
> Attachments: kylin_hive_conf.xml, kylin_job_conf.xml, kylin.log, kylin.properties
>
>
> bq.
> +------+----------+----------+----------+--------+-------+
> | ip   | u_domain | u_page   | r_domain | r_page | agent |
> +------+----------+----------+----------+--------+-------+
> | 2519 | 2012     | 20118849 | 2000     | 2000   | 2022  |
> | 2113 | 2012     | 20118850 | 2000     | 2000   | 2022  |
> | 2247 | 2012     | 20118851 | 2000     | 2000   | 2022  |
> | 2325 | 2012     | 20118852 | 2000     | 2000   | 2022  |
> | 2247 | 2012     | 20118853 | 2000     | 2000   | 2022  |
> +------+----------+----------+----------+--------+-------+
> +----------+
> | count(*) |
> +----------+
> | 25452592 |
> +----------+
> Model description:
> {
>   "uuid": "c39058c4-3e9d-4c0c-a908-c8efef41cc91",
>   "last_modified": 1495117591531,
>   "version": "2.0.0",
>   "name": "LOG_PV",
>   "owner": "ADMIN",
>   "description": "",
>   "fact_table": "DEFAULT.LOG_DATA_170416",
>   "lookups": [],
>   "dimensions": [
>     {
>       "table": "LOG_DATA_170416",
>       "columns": [
>         "U_DOMAIN",
>         "U_PAGE",
>         "R_DOMAIN",
>         "R_PAGE",
>         "AGENT",
>         "IP"
>       ]
>     }
>   ],
>   "metrics": [
>     "LOG_DATA_170416.LOAD_TIME",
>     "LOG_DATA_170416.ARTICLE_CONTENT_HEIGHT"
>   ],
>   "filter_condition": "",
>   "partition_desc": {
>     "partition_date_column": null,
>     "partition_time_column": null,
>     "partition_date_start": 0,
"partition_date_format": "MMdd", > "partition_time_format": "HH:mm:ss", > "partition_type": "APPEND", > "partition_condition_builder": > "org.apache.kylin.metadata.model.PartitionDesc$DefaultPartitionConditionBuilder" > }, > "capacity": "MEDIUM" > } > Cube description json: > { > "uuid": "b475f98a-1ec2-45ad-a2eb-90217aa83d9b", > "last_modified": 1495117617084, > "version": "2.0.0", > "name": "cc", > "model_name": "LOG_PV", > "description": "", > "null_string": null, > "dimensions": [ > { > "name": "IP", > "table": "LOG_DATA_170416", > "column": "IP", > "derived": null > }, > { > "name": "U_DOMAIN", > "table": "LOG_DATA_170416", > "column": "U_DOMAIN", > "derived": null > }, > { > "name": "U_PAGE", > "table": "LOG_DATA_170416", > "column": "U_PAGE", > "derived": null > }, > { > "name": "R_DOMAIN", > "table": "LOG_DATA_170416", > "column": "R_DOMAIN", > "derived": null > }, > { > "name": "R_PAGE", > "table": "LOG_DATA_170416", > "column": "R_PAGE", > "derived": null > }, > { > "name": "AGENT", > "table": "LOG_DATA_170416", > "column": "AGENT", > "derived": null > } > ], > "measures": [ > { > "name": "_COUNT_", > "function": { > "expression": "COUNT", > "parameter": { > "type": "constant", > "value": "1" > }, > "returntype": "bigint" > } > } > ], > "dictionaries": [], > "rowkey": { > "rowkey_columns": [ > { > "column": "LOG_DATA_170416.IP", > "encoding": "dict", > "isShardBy": false > }, > { > "column": "LOG_DATA_170416.U_DOMAIN", > "encoding": "dict", > "isShardBy": false > }, > { > "column": "LOG_DATA_170416.U_PAGE", > "encoding": "dict", > "isShardBy": false > }, > { > "column": "LOG_DATA_170416.R_DOMAIN", > "encoding": "dict", > "isShardBy": false > }, > { > "column": "LOG_DATA_170416.R_PAGE", > "encoding": "dict", > "isShardBy": false > }, > { > "column": "LOG_DATA_170416.AGENT", > "encoding": "dict", >
[jira] [Updated] (KYLIN-2634) HBaseResourceStore should throw clearer error
[ https://issues.apache.org/jira/browse/KYLIN-2634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

liyang updated KYLIN-2634:
--------------------------
    Summary: HBaseResourceStore should throw clearer error  (was: kylin build stops while building dimension dictionary with file not found exception)

> HBaseResourceStore should throw clearer error
> ---------------------------------------------
>
> Key: KYLIN-2634
> URL: https://issues.apache.org/jira/browse/KYLIN-2634
> Project: Kylin
> Issue Type: Bug
> Components: Job Engine
> Affects Versions: v2.0.0
> Environment: CentOS release 6.8 (Final) x64
> CDH 5.9
> Reporter: flashput
> Assignee: Dong Li
> Attachments: kylin_hive_conf.xml, kylin_job_conf.xml, kylin.log, kylin.properties
>
>
> bq.
> +------+----------+----------+----------+--------+-------+
> | ip   | u_domain | u_page   | r_domain | r_page | agent |
> +------+----------+----------+----------+--------+-------+
> | 2519 | 2012     | 20118849 | 2000     | 2000   | 2022  |
> | 2113 | 2012     | 20118850 | 2000     | 2000   | 2022  |
> | 2247 | 2012     | 20118851 | 2000     | 2000   | 2022  |
> | 2325 | 2012     | 20118852 | 2000     | 2000   | 2022  |
> | 2247 | 2012     | 20118853 | 2000     | 2000   | 2022  |
> +------+----------+----------+----------+--------+-------+
> +----------+
> | count(*) |
> +----------+
> | 25452592 |
> +----------+
> Model description:
> {
>   "uuid": "c39058c4-3e9d-4c0c-a908-c8efef41cc91",
>   "last_modified": 1495117591531,
>   "version": "2.0.0",
>   "name": "LOG_PV",
>   "owner": "ADMIN",
>   "description": "",
>   "fact_table": "DEFAULT.LOG_DATA_170416",
>   "lookups": [],
>   "dimensions": [
>     {
>       "table": "LOG_DATA_170416",
>       "columns": [
>         "U_DOMAIN",
>         "U_PAGE",
>         "R_DOMAIN",
>         "R_PAGE",
>         "AGENT",
>         "IP"
>       ]
>     }
>   ],
>   "metrics": [
>     "LOG_DATA_170416.LOAD_TIME",
>     "LOG_DATA_170416.ARTICLE_CONTENT_HEIGHT"
>   ],
>   "filter_condition": "",
>   "partition_desc": {
>     "partition_date_column": null,
>     "partition_time_column": null,
>     "partition_date_start": 0,
>     "partition_date_format": "MMdd",
>     "partition_time_format": "HH:mm:ss",
>     "partition_type": "APPEND",
>     "partition_condition_builder": "org.apache.kylin.metadata.model.PartitionDesc$DefaultPartitionConditionBuilder"
>   },
"capacity": "MEDIUM" > } > Cube description json: > { > "uuid": "b475f98a-1ec2-45ad-a2eb-90217aa83d9b", > "last_modified": 1495117617084, > "version": "2.0.0", > "name": "cc", > "model_name": "LOG_PV", > "description": "", > "null_string": null, > "dimensions": [ > { > "name": "IP", > "table": "LOG_DATA_170416", > "column": "IP", > "derived": null > }, > { > "name": "U_DOMAIN", > "table": "LOG_DATA_170416", > "column": "U_DOMAIN", > "derived": null > }, > { > "name": "U_PAGE", > "table": "LOG_DATA_170416", > "column": "U_PAGE", > "derived": null > }, > { > "name": "R_DOMAIN", > "table": "LOG_DATA_170416", > "column": "R_DOMAIN", > "derived": null > }, > { > "name": "R_PAGE", > "table": "LOG_DATA_170416", > "column": "R_PAGE", > "derived": null > }, > { > "name": "AGENT", > "table": "LOG_DATA_170416", > "column": "AGENT", > "derived": null > } > ], > "measures": [ > { > "name": "_COUNT_", > "function": { > "expression": "COUNT", > "parameter": { > "type": "constant", > "value": "1" > }, > "returntype": "bigint" > } > } > ], > "dictionaries": [], > "rowkey": { > "rowkey_columns": [ > { > "column": "LOG_DATA_170416.IP", > "encoding": "dict", > "isShardBy": false > }, > { > "column": "LOG_DATA_170416.U_DOMAIN", > "encoding": "dict", > "isShardBy": false > }, > { > "column": "LOG_DATA_170416.U_PAGE", > "encoding": "dict", > "isShardBy": false > }, > { > "column": "LOG_DATA_170416.R_DOMAIN", > "encoding": "dict", > "isShardBy": false > }, > { > "column": "LOG_DATA_170416.R_PAGE", > "encoding": "dict", > "isShardBy": false > }, > { > "column": "LOG_DATA_170416.AGENT", > "encoding": "dict", > "isShardBy": false > } > ] > }, > "hbase_mapping": { > "column_family": [ > { > "name": "F1", > "columns": [ > { > "qualifier": "M", > "measure_refs": [ > "_COUNT_" >
[jira] [Commented] (KYLIN-2642) Relax check in RowKeyColDesc to keep backward compatibility
[ https://issues.apache.org/jira/browse/KYLIN-2642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027436#comment-16027436 ]

kangkaisen commented on KYLIN-2642:
-----------------------------------

I have just checked the code. The front end gets the valid encoding types for each column type from {{EncodingController.getValidEncodings}}, so a user cannot select fixed_len encoding for an integer column in the web UI.

> Relax check in RowKeyColDesc to keep backward compatibility
> -----------------------------------------------------------
>
> Key: KYLIN-2642
> URL: https://issues.apache.org/jira/browse/KYLIN-2642
> Project: Kylin
> Issue Type: Bug
> Components: Metadata
> Affects Versions: v2.0.0
> Reporter: kangkaisen
> Assignee: kangkaisen
> Priority: Minor
> Attachments: KYLIN-2642.patch
>
>
> This check will make the cube DESCBROKEN if the user used FixedLenDimEnc to encode an integer column:
> {code:java}
> if (encodingName.startsWith(FixedLenDimEnc.ENCODING_NAME) && (type.isIntegerFamily() || type.isNumberFamily())) {
>     throw new IllegalArgumentException(colRef + " type is " + type + " and cannot apply fixed_length encoding");
> }
> {code}
[jira] [Commented] (KYLIN-2619) Use newCachedThreadPool instead of newFixedThreadPool in Broadcaster
[ https://issues.apache.org/jira/browse/KYLIN-2619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027439#comment-16027439 ]

kangkaisen commented on KYLIN-2619:
-----------------------------------

We have added a timeout for HTTP requests, so newCachedThreadPool will terminate idle threads. I agree with you: adding a min/max pool size is better.

> Use newCachedThreadPool instead of newFixedThreadPool in Broadcaster
> --------------------------------------------------------------------
>
> Key: KYLIN-2619
> URL: https://issues.apache.org/jira/browse/KYLIN-2619
> Project: Kylin
> Issue Type: Improvement
> Components: Metadata
> Affects Versions: v2.0.0
> Reporter: kangkaisen
> Assignee: kangkaisen
> Fix For: v2.1.0
>
> Attachments: KYLIN-2619.patch
>
>
> We should use newCachedThreadPool instead of newFixedThreadPool in Broadcaster because newCachedThreadPool is more flexible.
[jira] [Resolved] (KYLIN-2648) kylin.env.hdfs-working-dir should be qualified and absolute path
[ https://issues.apache.org/jira/browse/KYLIN-2648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

liyang resolved KYLIN-2648.
---------------------------
    Resolution: Fixed
    Fix Version/s: v2.1.0

> kylin.env.hdfs-working-dir should be qualified and absolute path
> ----------------------------------------------------------------
>
> Key: KYLIN-2648
> URL: https://issues.apache.org/jira/browse/KYLIN-2648
> Project: Kylin
> Issue Type: Bug
> Components: Job Engine
> Affects Versions: v2.0.0
> Environment: hadoop :cdh5.4.0 (both main and hbase env)
> hbase : hbase-1.2.0-cdh5.7.6
> hive: apache-hive-2.1.1
> kylin version: 2.0
> Reporter: suheng.cloud
> Assignee: Dong Li
> Fix For: v2.1.0
>
>
> I am trying to deploy Kylin on one node of a standalone HBase cluster (hdfs://cdh5-mini/), which is separate from the main Hive cluster (hdfs://cdh5/).
> Following the blog post "Deploy Apache Kylin with Standalone HBase Cluster", which says to make sure the Hadoop and Hive configurations point to the main cluster, I cloned the Hadoop directory to another path, changed "fs.defaultFS" in core-site.xml to "hdfs://cdh5/", and exported HADOOP_HOME to this new path at the head of kylin.sh.
> Everything went well (including cube build/refresh) until I executed a cube merge.
> The merge error occurs at step "#9 Step Name: Garbage Collection on HDFS".
> The stacktrace as follows: > 2017-05-25 17:28:07,070 INFO [pool-9-thread-1] > threadpool.DefaultScheduler:114 : > CubingJob{id=c6709f0b-8858-4e66-a4c2-320ebc70a2e3, name=kylin_sales_cube - > 2012010100_2014020100 - MERGE - GMT+08:00 2017-05-25 16:51:30, > state=READY} prepare to schedule > 2017-05-25 17:28:07,073 INFO [pool-9-thread-1] > threadpool.DefaultScheduler:117 : > CubingJob{id=c6709f0b-8858-4e66-a4c2-320ebc70a2e3, name=kylin_sales_cube - > 2012010100_2014020100 - MERGE - GMT+08:00 2017-05-25 16:51:30, > state=READY} scheduled > 2017-05-25 17:28:07,075 INFO [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] > execution.AbstractExecutable:110 : Executing AbstractExecutable > (kylin_sales_cube - 2012010100_2014020100 - MERGE - GMT+08:00 > 2017-05-25 16:51:30) > 2017-05-25 17:28:07,078 DEBUG [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] > dao.ExecutableDao:217 : updating job output, id: > c6709f0b-8858-4e66-a4c2-320ebc70a2e3 > 2017-05-25 17:28:07,083 INFO [pool-9-thread-1] > threadpool.DefaultScheduler:124 : Job Fetcher: 0 should running, 1 actual > running, 0 stopped, 1 ready, 19 already succeed, 0 error, 11 discarded, 0 > others > 2017-05-25 17:28:07,083 INFO [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] > execution.ExecutableManager:389 : job id:c6709f0b-8858-4e66-a4c2-320ebc70a2e3 > from READY to RUNNING > 2017-05-25 17:28:07,105 INFO [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] > execution.AbstractExecutable:110 : Executing AbstractExecutable (Garbage > Collection on HDFS) > 2017-05-25 17:28:07,106 DEBUG [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] > dao.ExecutableDao:217 : updating job output, id: > c6709f0b-8858-4e66-a4c2-320ebc70a2e3-08 > 2017-05-25 17:28:07,111 INFO [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] > execution.ExecutableManager:389 : job > id:c6709f0b-8858-4e66-a4c2-320ebc70a2e3-08 from READY to RUNNING > 2017-05-25 17:28:07,154 DEBUG [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] > steps.HDFSPathGarbageCollectionStep:78 : Drop HDFS 
path on FileSystem: > hdfs://cdh5 > 2017-05-25 17:28:07,217 DEBUG [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] > steps.HDFSPathGarbageCollectionStep:90 : HDFS path > hdfs:///kylin/kylin_metadata/kylin-a11d510f-d8a5-45c1-b430-bc7def851432 not > exists. > 2017-05-25 17:28:07,249 DEBUG [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] > steps.HDFSPathGarbageCollectionStep:90 : HDFS path > hdfs:///kylin/kylin_metadata/kylin-0c1ed2d0-f595-4f58-aaea-2dbe7b41a550 not > exists. > 2017-05-25 17:28:07,320 DEBUG [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] > steps.HDFSPathGarbageCollectionStep:78 : Drop HDFS path on FileSystem: > hdfs://cdh5-mini > 2017-05-25 17:28:07,324 ERROR [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] > execution.AbstractExecutable:126 : error running Executable: > HDFSPathGarbageCollectionStep{id=c6709f0b-8858-4e66-a4c2-320ebc70a2e3-08, > name=Garbage Collection on HDFS, state=RUNNING} > 2017-05-25 17:28:07,326 DEBUG [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] > dao.ExecutableDao:217 : updating job output, id: > c6709f0b-8858-4e66-a4c2-320ebc70a2e3-08 > 2017-05-25 17:28:07,331 DEBUG [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] > dao.ExecutableDao:217 : updating job output, id: > c6709f0b-8858-4e66-a4c2-320ebc70a2e3-08 > 2017-05-25 17:28:07,334 INFO [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] > execution.ExecutableManager:389 : job > id:c6709f0b-8858-4e66-a4c2-320ebc70a2e3-08 from RUNNING
[jira] [Commented] (KYLIN-2603) Try push 'having' filter down to storage
[ https://issues.apache.org/jira/browse/KYLIN-2603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027431#comment-16027431 ]

kangkaisen commented on KYLIN-2603:
-----------------------------------

{code:java}
BytesUtil.writeByteArray(TupleFilterSerializer.serialize(value.havingFilterPushDown, StringCodeSystem.INSTANCE), out);
TupleFilter sGTHavingFilter = TupleFilterSerializer.deserialize(BytesUtil.readByteArray(in), StringCodeSystem.INSTANCE);
{code}
Changing the serialize and deserialize methods for GTScanRequest means that users must update the coprocessor first when they upgrade Kylin, and must upgrade the JobServer and QueryServer at the same time. In other words, users cannot upgrade gradually. The cost is too high.

> Try push 'having' filter down to storage
> ----------------------------------------
>
> Key: KYLIN-2603
> URL: https://issues.apache.org/jira/browse/KYLIN-2603
> Project: Kylin
> Issue Type: New Feature
> Reporter: liyang
>
> We know pushing filters down to storage is good and have done that for the 'where' filter. Is it possible to push the 'having' filter down to storage as well?
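
The compatibility concern in the KYLIN-2603 comment can be illustrated with a toy wire format. The sketch below does not use Kylin's GTScanRequest, BytesUtil or TupleFilterSerializer; the class, the single int field and the length-prefixed byte array are hypothetical stand-ins. It only shows why a reader that expects an extra field fails on bytes produced by a writer that predates the field.

```java
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;

// Hypothetical toy format; not Kylin's actual serialization.
class WireFormatSketch {
    // "Old" writer: a single int field.
    static byte[] writeOld(int limit) {
        return ByteBuffer.allocate(4).putInt(limit).array();
    }

    // "New" writer: the same int followed by a length-prefixed byte array,
    // standing in for a serialized having filter.
    static byte[] writeNew(int limit, byte[] filter) {
        return ByteBuffer.allocate(4 + 4 + filter.length)
                .putInt(limit).putInt(filter.length).put(filter).array();
    }

    // "New" reader: expects the extra field unconditionally.
    static byte[] readNewFilter(byte[] bytes) {
        ByteBuffer in = ByteBuffer.wrap(bytes);
        in.getInt();                          // skip the int field
        byte[] filter = new byte[in.getInt()]; // underflows on old bytes
        in.get(filter);
        return filter;
    }

    // Returns false because old bytes end before the extra field.
    static boolean newReaderAcceptsOldBytes() {
        try {
            readNewFilter(writeOld(10));
            return true;
        } catch (BufferUnderflowException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(newReaderAcceptsOldBytes());
        System.out.println(readNewFilter(writeNew(10, new byte[] {1, 2, 3})).length);
    }
}
```

Because both sides must agree on the layout, every component that writes or reads the request has to be upgraded together unless the format carries a version marker or an optional trailing field.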
[jira] [Commented] (KYLIN-2647) Should get FileSystem from HBaseConfiguration in HBaseResourceStore
[ https://issues.apache.org/jira/browse/KYLIN-2647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027323#comment-16027323 ] liyang commented on KYLIN-2647: --- +1 looks good > Should get FileSystem from HBaseConfiguration in HBaseResourceStore > --- > > Key: KYLIN-2647 > URL: https://issues.apache.org/jira/browse/KYLIN-2647 > Project: Kylin > Issue Type: Bug > Components: Metadata >Affects Versions: v2.0.0 >Reporter: kangkaisen >Assignee: kangkaisen >Priority: Critical > Attachments: KYLIN-2647.patch > > > KYLIN-2351 introduced a bug if User use Standalone HBase Cluster. > {code:java} >Error while executing SQL "SELECT SUM(revenue) AS revenue, SUM(profit) AS > profit, SUM(repay_profit) AS repayProfit, SUM(fraud_profit) AS fraudProfit, > SUM(share_profit) AS shareProfit, SUM(consume) AS consume, SUM(repay_consume) > AS repayConsume, SUM(fraud_consume) AS fraudConsume, SUM(share_consume) AS > shareConsume, SUM(cost) AS cost, SUM(fraud_cost) AS fraudCost, > SUM(repay_cost) AS repayCost, poi_cate2_id AS poiCategory2Id, poi_cate2_name > AS poiCategory2Name, main_poi_id AS orgId, main_poi_name AS orgName, > COUNT(DISTINCT NEW_OBJECT) AS newDeal, COUNT(DISTINCT ONLINE_OBJECT) AS > onlineDeal, partition_date AS dateStr FROM > mart_catering.app_shu_v5_trade_view WHERE (bd_id = 2084324 AND c_platform IN > ('mt', 'dp') AND partition_date = '2017-05-24') GROUP BY poi_cate2_id, > poi_cate2_name, partition_date, main_poi_id, main_poi_name LIMIT 5": > java.io.FileNotFoundException: File does not exist: > /user/kylin2x/prod/kylin2x_metadata_prod/resources/dict/MART_CATERING.APP_SHU_V5_TRADE_VIEW/C_OBJECT_ID/854df823-abc8-4e19-9035-def12f8af3e2.dict > at > org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:71) > at > org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:61) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1850) > at > 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1821) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1729) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:589) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:365) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969) at > org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049) at > org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045) at > java.security.AccessController.doPrivileged(Native Method) at > javax.security.auth.Subject.doAs(Subject.java:415) at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043) > at > org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:299) > at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:793) > at > org.apache.kylin.storage.hbase.HBaseResourceStore.getInputStream(HBaseResourceStore.java:206) > at > org.apache.kylin.storage.hbase.HBaseResourceStore.getResourceImpl(HBaseResourceStore.java:226) > at > org.apache.kylin.common.persistence.ResourceStore.getResource(ResourceStore.java:148) > at > org.apache.kylin.dict.DictionaryManager.load(DictionaryManager.java:448) > at > org.apache.kylin.dict.DictionaryManager$1.load(DictionaryManager.java:105) > at > org.apache.kylin.dict.DictionaryManager$1.load(DictionaryManager.java:102) > at > com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3599) > at > 
com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2379) > at > com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2342) > at > com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2257) > at com.google.common.cache.LocalCache.get(LocalCache.java:4000) > at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:4004) > at > com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874) > at > org.apache.kylin.dict.DictionaryManager.getDictionaryInfo(DictionaryManager.java:122) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KYLIN-2648) Encounter cube merge error when deploy kylin on stand alone hbase cluster
[ https://issues.apache.org/jira/browse/KYLIN-2648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027363#comment-16027363 ]

liyang commented on KYLIN-2648:
-------------------------------

{{fs.defaultFS}} is obsolete. Should use {{fs.default.name}} instead.

This is caused by {{kylin.env.hdfs-working-dir}} not being a qualified path. I'm fixing the code on the master branch. Meanwhile, users can work around it by setting {{kylin.env.hdfs-working-dir=hdfs://cdh5}}

> Encounter cube merge error when deploy kylin on stand alone hbase cluster
> -------------------------------------------------------------------------
>
> Key: KYLIN-2648
> URL: https://issues.apache.org/jira/browse/KYLIN-2648
> Project: Kylin
> Issue Type: Bug
> Components: Job Engine
> Affects Versions: v2.0.0
> Environment: hadoop :cdh5.4.0 (both main and hbase env)
> hbase : hbase-1.2.0-cdh5.7.6
> hive: apache-hive-2.1.1
> kylin version: 2.0
> Reporter: suheng.cloud
> Assignee: Dong Li
>
> I am trying to deploy Kylin on one node of a standalone HBase cluster (hdfs://cdh5-mini/), which is separate from the main Hive cluster (hdfs://cdh5/).
> Following the blog post "Deploy Apache Kylin with Standalone HBase Cluster", which says to make sure the Hadoop and Hive configurations point to the main cluster, I cloned the Hadoop directory to another path, changed "fs.defaultFS" in core-site.xml to "hdfs://cdh5/", and exported HADOOP_HOME to this new path at the head of kylin.sh.
> Everything went well (including cube build/refresh) until I executed a cube merge.
> The merge error occurs at step "#9 Step Name: Garbage Collection on HDFS".
> The stacktrace as follows: > 2017-05-25 17:28:07,070 INFO [pool-9-thread-1] > threadpool.DefaultScheduler:114 : > CubingJob{id=c6709f0b-8858-4e66-a4c2-320ebc70a2e3, name=kylin_sales_cube - > 2012010100_2014020100 - MERGE - GMT+08:00 2017-05-25 16:51:30, > state=READY} prepare to schedule > 2017-05-25 17:28:07,073 INFO [pool-9-thread-1] > threadpool.DefaultScheduler:117 : > CubingJob{id=c6709f0b-8858-4e66-a4c2-320ebc70a2e3, name=kylin_sales_cube - > 2012010100_2014020100 - MERGE - GMT+08:00 2017-05-25 16:51:30, > state=READY} scheduled > 2017-05-25 17:28:07,075 INFO [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] > execution.AbstractExecutable:110 : Executing AbstractExecutable > (kylin_sales_cube - 2012010100_2014020100 - MERGE - GMT+08:00 > 2017-05-25 16:51:30) > 2017-05-25 17:28:07,078 DEBUG [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] > dao.ExecutableDao:217 : updating job output, id: > c6709f0b-8858-4e66-a4c2-320ebc70a2e3 > 2017-05-25 17:28:07,083 INFO [pool-9-thread-1] > threadpool.DefaultScheduler:124 : Job Fetcher: 0 should running, 1 actual > running, 0 stopped, 1 ready, 19 already succeed, 0 error, 11 discarded, 0 > others > 2017-05-25 17:28:07,083 INFO [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] > execution.ExecutableManager:389 : job id:c6709f0b-8858-4e66-a4c2-320ebc70a2e3 > from READY to RUNNING > 2017-05-25 17:28:07,105 INFO [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] > execution.AbstractExecutable:110 : Executing AbstractExecutable (Garbage > Collection on HDFS) > 2017-05-25 17:28:07,106 DEBUG [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] > dao.ExecutableDao:217 : updating job output, id: > c6709f0b-8858-4e66-a4c2-320ebc70a2e3-08 > 2017-05-25 17:28:07,111 INFO [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] > execution.ExecutableManager:389 : job > id:c6709f0b-8858-4e66-a4c2-320ebc70a2e3-08 from READY to RUNNING > 2017-05-25 17:28:07,154 DEBUG [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] > steps.HDFSPathGarbageCollectionStep:78 : Drop HDFS 
path on FileSystem: > hdfs://cdh5 > 2017-05-25 17:28:07,217 DEBUG [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] > steps.HDFSPathGarbageCollectionStep:90 : HDFS path > hdfs:///kylin/kylin_metadata/kylin-a11d510f-d8a5-45c1-b430-bc7def851432 not > exists. > 2017-05-25 17:28:07,249 DEBUG [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] > steps.HDFSPathGarbageCollectionStep:90 : HDFS path > hdfs:///kylin/kylin_metadata/kylin-0c1ed2d0-f595-4f58-aaea-2dbe7b41a550 not > exists. > 2017-05-25 17:28:07,320 DEBUG [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] > steps.HDFSPathGarbageCollectionStep:78 : Drop HDFS path on FileSystem: > hdfs://cdh5-mini > 2017-05-25 17:28:07,324 ERROR [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] > execution.AbstractExecutable:126 : error running Executable: > HDFSPathGarbageCollectionStep{id=c6709f0b-8858-4e66-a4c2-320ebc70a2e3-08, > name=Garbage Collection on HDFS, state=RUNNING} > 2017-05-25 17:28:07,326 DEBUG [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] > dao.ExecutableDao:217 : updating job output, id: > c6709f0b-8858-4e66-a4c2-320ebc70a2e3-08 > 2017-05-25 17:28:07,331 DEBUG [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] > dao.ExecutableDao:217 : updating
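
To make the "qualified and absolute path" requirement from the KYLIN-2648 comment concrete, here is a rough check written with plain `java.net.URI`. It is not the fix that went into Kylin and it does not use Hadoop's `Path` or `FileSystem` classes; the class and method names are made up, and `hdfs://cdh5/kylin` is an assumed example value. The unqualified form `hdfs:///kylin/kylin_metadata` is the one that appears in the log above.

```java
import java.net.URI;

// Hypothetical helper; not part of Kylin or Hadoop.
class WorkingDirCheck {
    // True when the value names its file system explicitly (scheme and
    // authority) and the path inside that file system is absolute.
    static boolean isQualifiedAbsolute(String dir) {
        URI uri = URI.create(dir);
        String path = uri.getPath();
        return uri.getScheme() != null
                && uri.getAuthority() != null
                && path != null
                && path.startsWith("/");
    }

    public static void main(String[] args) {
        // No scheme: resolved against whatever the default file system is.
        System.out.println(isQualifiedAbsolute("/kylin"));
        // Scheme but no authority: still depends on the default cluster.
        System.out.println(isQualifiedAbsolute("hdfs:///kylin/kylin_metadata"));
        // Scheme, authority and absolute path.
        System.out.println(isQualifiedAbsolute("hdfs://cdh5/kylin"));
    }
}
```

With two clusters in play (hdfs://cdh5 and hdfs://cdh5-mini), a path without an authority can resolve to a different cluster depending on which configuration is used, which matches the garbage collection step looking for the same path on both file systems.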
[jira] [Updated] (KYLIN-2648) kylin.env.hdfs-working-dir should be qualified and absolute path
[ https://issues.apache.org/jira/browse/KYLIN-2648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

liyang updated KYLIN-2648:
--
Summary: kylin.env.hdfs-working-dir should be qualified and absolute path (was: Encounter cube merge error when deploy kylin on stand alone hbase cluster)

> kylin.env.hdfs-working-dir should be qualified and absolute path
>
> Key: KYLIN-2648
> URL: https://issues.apache.org/jira/browse/KYLIN-2648
> Project: Kylin
> Issue Type: Bug
> Components: Job Engine
> Affects Versions: v2.0.0
> Environment: hadoop: cdh5.4.0 (both main and hbase env)
>              hbase: hbase-1.2.0-cdh5.7.6
>              hive: apache-hive-2.1.1
>              kylin version: 2.0
> Reporter: suheng.cloud
> Assignee: Dong Li
>
> I tried to deploy Kylin on one node of a standalone HBase cluster (hdfs://cdh5-mini/), which is separate from the main Hive cluster (hdfs://cdh5/).
> Following the blog post "Deploy Apache Kylin with Standalone HBase Cluster" (make sure the configurations of hadoop and hive point to the main cluster), I cloned the hadoop dir to another path, changed "fs.defaultFS" in core-site.xml to "hdfs://cdh5/", and exported HADOOP_HOME to this new path at the head of kylin.sh.
> Everything went well (including cube build/refresh) until I executed a cube merge.
> The merge error occurs at step "#9 Step Name: Garbage Collection on HDFS".
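For illustration only, here is a minimal sketch of what "qualified and absolute" could mean for a kylin.env.hdfs-working-dir value. It assumes the trouble comes from values such as hdfs:///kylin/kylin_metadata, which name no cluster and so may be resolved against whichever fs.defaultFS is in effect. The class and method names are hypothetical and not part of Kylin; the check uses only java.net.URI rather than Hadoop's own path handling.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not Kylin code: checks that a working-dir value
// names a scheme, an authority (the cluster), and an absolute path.
class WorkingDirCheck {

    static boolean isQualifiedAbsolute(String value) {
        try {
            URI uri = new URI(value);
            String path = uri.getPath();
            return uri.getScheme() != null        // e.g. "hdfs"
                    && uri.getAuthority() != null // e.g. "cdh5"; absent in "hdfs:///..."
                    && path != null
                    && path.startsWith("/");      // absolute path component
        } catch (URISyntaxException e) {
            // A value that is not a valid URI cannot be a qualified path.
            return false;
        }
    }

    public static void main(String[] args) {
        // Sample values; the first one is an assumed setting, the second
        // mirrors the unqualified form seen in the log of this issue.
        String[] samples = {
            "hdfs://cdh5/kylin",
            "hdfs:///kylin/kylin_metadata",
            "/kylin"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isQualifiedAbsolute(s));
        }
    }
}
```

Under this assumption, a value that spells out the main cluster (for example hdfs://cdh5/ followed by the working directory) passes the check, while the hdfs:/// form and a bare /kylin do not.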
> The stack trace is as follows:
> 2017-05-25 17:28:07,070 INFO [pool-9-thread-1] threadpool.DefaultScheduler:114 : CubingJob{id=c6709f0b-8858-4e66-a4c2-320ebc70a2e3, name=kylin_sales_cube - 2012010100_2014020100 - MERGE - GMT+08:00 2017-05-25 16:51:30, state=READY} prepare to schedule
> 2017-05-25 17:28:07,073 INFO [pool-9-thread-1] threadpool.DefaultScheduler:117 : CubingJob{id=c6709f0b-8858-4e66-a4c2-320ebc70a2e3, name=kylin_sales_cube - 2012010100_2014020100 - MERGE - GMT+08:00 2017-05-25 16:51:30, state=READY} scheduled
> 2017-05-25 17:28:07,075 INFO [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] execution.AbstractExecutable:110 : Executing AbstractExecutable (kylin_sales_cube - 2012010100_2014020100 - MERGE - GMT+08:00 2017-05-25 16:51:30)
> 2017-05-25 17:28:07,078 DEBUG [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] dao.ExecutableDao:217 : updating job output, id: c6709f0b-8858-4e66-a4c2-320ebc70a2e3
> 2017-05-25 17:28:07,083 INFO [pool-9-thread-1] threadpool.DefaultScheduler:124 : Job Fetcher: 0 should running, 1 actual running, 0 stopped, 1 ready, 19 already succeed, 0 error, 11 discarded, 0 others
> 2017-05-25 17:28:07,083 INFO [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] execution.ExecutableManager:389 : job id:c6709f0b-8858-4e66-a4c2-320ebc70a2e3 from READY to RUNNING
> 2017-05-25 17:28:07,105 INFO [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] execution.AbstractExecutable:110 : Executing AbstractExecutable (Garbage Collection on HDFS)
> 2017-05-25 17:28:07,106 DEBUG [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] dao.ExecutableDao:217 : updating job output, id: c6709f0b-8858-4e66-a4c2-320ebc70a2e3-08
> 2017-05-25 17:28:07,111 INFO [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] execution.ExecutableManager:389 : job id:c6709f0b-8858-4e66-a4c2-320ebc70a2e3-08 from READY to RUNNING
> 2017-05-25 17:28:07,154 DEBUG [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] steps.HDFSPathGarbageCollectionStep:78 : Drop HDFS path on FileSystem: hdfs://cdh5
> 2017-05-25 17:28:07,217 DEBUG [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] steps.HDFSPathGarbageCollectionStep:90 : HDFS path hdfs:///kylin/kylin_metadata/kylin-a11d510f-d8a5-45c1-b430-bc7def851432 not exists.
> 2017-05-25 17:28:07,249 DEBUG [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] steps.HDFSPathGarbageCollectionStep:90 : HDFS path hdfs:///kylin/kylin_metadata/kylin-0c1ed2d0-f595-4f58-aaea-2dbe7b41a550 not exists.
> 2017-05-25 17:28:07,320 DEBUG [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] steps.HDFSPathGarbageCollectionStep:78 : Drop HDFS path on FileSystem: hdfs://cdh5-mini
> 2017-05-25 17:28:07,324 ERROR [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] execution.AbstractExecutable:126 : error running Executable: HDFSPathGarbageCollectionStep{id=c6709f0b-8858-4e66-a4c2-320ebc70a2e3-08, name=Garbage Collection on HDFS, state=RUNNING}
> 2017-05-25 17:28:07,326 DEBUG [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] dao.ExecutableDao:217 : updating job output, id: c6709f0b-8858-4e66-a4c2-320ebc70a2e3-08
> 2017-05-25 17:28:07,331 DEBUG [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] dao.ExecutableDao:217 : updating job output, id: c6709f0b-8858-4e66-a4c2-320ebc70a2e3-08
> 2017-05-25 17:28:07,334 INFO [Job c6709f0b-8858-4e66-a4c2-320ebc70a2e3-128] >