Re: NPE In QueryMetricsFacade
I cannot reproduce it on my Kylin cluster. Could anyone show us how to reproduce this error? I found that other users have met the same error; could anyone share the details? They may be useful. Please attach your details to https://issues.apache.org/jira/browse/KYLIN-3609. Thanks.

俞霄翔/Xiaoxiang Yu
Software Engineer
Mobile: 18516298930
http://kyligence.io

On [DATE], "[NAME]" <[ADDRESS]> wrote:

Hi Team,

Query metrics are not getting updated on "host:7070/kylin/dashboard". I found the stack trace below, regarding the query metrics facade, in the Kylin logs. My Kylin version is v2.5 with HBase v1.x. Am I missing something in the configuration that causes the error below?

2018-10-21 18:34:41,625 WARN [Query 7b16f84c-3053-f64c-2cd2-4ae5348c295c-121] service.QueryService:421 : Write metric error.
java.lang.NullPointerException
    at org.apache.kylin.rest.metrics.QueryMetricsFacade.updateMetricsToReservoir(QueryMetricsFacade.java:148)
    at org.apache.kylin.rest.metrics.QueryMetricsFacade.updateMetrics(QueryMetricsFacade.java:73)
    at org.apache.kylin.rest.service.QueryService.recordMetric(QueryService.java:503)
    at org.apache.kylin.rest.service.QueryService.doQueryWithCache(QueryService.java:419)
    at org.apache.kylin.rest.service.QueryService.doQueryWithCache(QueryService.java:351)
    at org.apache.kylin.rest.controller.QueryController.query(QueryController.java:86)
    at sun.reflect.GeneratedMethodAccessor253.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205)
    at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:133)
    at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:97)
    at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:827)
    at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:738)
    at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:85)
    at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:967)
    at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:901)
    at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:970)
    at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:872)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:650)
    at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:846)

Thank You,
Shrikant Bang.
Re: [DISCUSS] New Kylin Streaming Solution From eBay
Hi Gang,

I am glad to know that eBay has a solution for real-time OLAP on Kylin. I have a few small questions:

1. Is it possible to use Yarn as the cluster manager for index tasks? The coordinator process would start them at a specified period, and Yarn would manage:
   a) retrying tasks that fail
   b) resource allocation
   c) log collection
2. As far as I know, eBay's new Kylin streaming solution uses a replica set to ensure that incoming messages are not lost if some processes are lost. I think a replica set is a set of Kafka consumer processes responsible for ingesting messages and building the base cuboid in memory. Could you please show me some details about how the replica set provides an HA guarantee? How is it configured? A link or paper is OK. I found one, but I don't know whether it means the same thing as your replica set:
   a) [Mongodb replication](https://docs.mongodb.com/manual/replication/)
3. How do you add or remove a node of a replica set in a production env? How do you monitor the health and pressure of a replica set cluster?
4. Are all measures supported in eBay's new Kylin streaming solution? What about count distinct (bitmap)?
5. It seems that eBay's new Kylin streaming solution uses a custom columnar storage. Why not use a mature open source columnar storage solution? Have you ever compared the performance of your custom columnar storage with open source columnar storage solutions?

Best wishes,
Xiaoxiang Yu

From: Ma Gang
Reply-To: "dev@kylin.apache.org"
Date: Tuesday, October 30, 2018 at 15:24
To: "dev@kylin.apache.org"
Subject: [DISCUSS] New Kylin Streaming Solution From eBay

Hi all,

The eBay Kylin team has developed a new Kylin streaming solution. The basic idea is to build a streaming cluster that ingests data from a streaming source (Kafka) and provides queries on real-time data. The data preparation latency is milliseconds, which means the data is queryable almost as soon as it is ingested. Attached is the architecture design doc. We would like to contribute the feature to the community; please let us know if you have any concerns.

Thanks,
Gang (Allen) Ma
Re: [DISCUSS] New Kylin Streaming Solution From eBay
Thank you for your reply. Maybe I can help to improve your Kylin Streaming Solution in the future.

Best wishes,
Xiaoxiang Yu

On [DATE], "[NAME]" <[ADDRESS]> wrote:

Thanks Xiaoxiang,

Very good questions! Please see my comments starting with [Gang]:

1. Is it possible to use Yarn as the cluster manager for index tasks? The coordinator process would start them at a specified period.

[Gang] I think it is possible, but in the current design the indexing task is a long-running task that also provides query service. This makes the whole system very simple and efficient, so I don't think we need to stop and start indexing tasks from time to time. Using Yarn to manage the resources is possible, but we would need to redesign the existing coordinator to make it easy to deploy to Yarn, Kubernetes, etc. I hope this can be done after the contribution to the community.

2. As far as I know, eBay's new Kylin streaming solution uses a replica set to ensure that incoming messages are not lost if some processes are lost. I think a replica set is a set of Kafka consumer processes responsible for ingesting messages and building the base cuboid in memory. Could you please show me some details about how the replica set provides an HA guarantee? How is it configured? A link or paper is OK. I found one, but I don't know whether it means the same thing as your replica set.

[Gang] Yes, it is similar to MongoDB replication, but currently we don't replicate data from a primary node. We just assign the same Kafka topic/partitions to the receivers in a ReplicaSet, and all receivers in a ReplicaSet consume data from Kafka. If one receiver is down, the other receivers in the ReplicaSet are still consuming the same Kafka data, so consumption and queries are not impacted. We don't guarantee that the receivers in a ReplicaSet have the same consuming rate, but we can guarantee that the user sees data consistently by sticking the queries for one cube to one receiver. The HA implementation is a little naive, but it is simple and it works. Maybe in the future we can do HA by replication, to support other streaming sources that don't support multiple consumers and don't have a persistent store.

3. How do you add or remove a node of a replica set in a production env? How do you monitor the health and pressure of a replica set cluster?

[Gang] Currently we have a UI and a RESTful API that let an admin add a node to, or remove a node from, a ReplicaSet, and a simple UI that lets an admin monitor the health and consuming rate of each receiver/cube. All metrics are collected using the Yammer metrics framework, so they are easy to expose to other monitoring systems.

4. Are all measures supported in eBay's new Kylin streaming solution? What about count distinct (bitmap)?

[Gang] Most measures are supported, but precise count distinct (bitmap) is not supported when the distinct dimension is not of int type. As you know, supporting precise count distinct for a non-int dimension requires building a global dictionary, which is not possible in the streaming env.

5. It seems that eBay's new Kylin streaming solution uses a custom columnar storage. Why not use a mature open source columnar storage solution? Have you ever compared the performance of your custom columnar storage with open source columnar storage solutions?

[Gang] Most open source columnar formats, such as Parquet and ORC, are designed for use in a Hadoop env, whereas the streaming data is on local disk, so I didn't consider them at the beginning. It is not very hard to define a columnar format to store Kylin-specific data. With a customized columnar storage you can use mmap files to scan data and add a row-level inverted index for all dimensions, so I think the performance will be better than with a common columnar format. I didn't compare the performance, but the storage engine is pluggable; you may contribute a Parquet storage if you are interested.

At 2018-11-01 12:42:25, "Xiaoxiang Yu" wrote:
>Hi Gang, I am glad to know that eBay has a solution for real-time OLAP on Kylin. I have a few small questions:
>
>1. Is it possible to use Yarn as the cluster manager for index tasks? The coordinator process would start them at a specified period, and Yarn would manage:
>
>a) retrying tasks that fail
>
>b) resource allocation
>
>c) log collection
>
>2. As far as I know, eBay's new Kylin streaming solution uses a replica set to ensure that incoming messages are not lost if some processes are lost. I think a replica set is a set of Kafka consumer processes responsible for ingesting messages and building the base cuboid in memory. Could you please show me some details about how the replica set provides an HA guarantee? How is it configured? A link or paper is OK. I found one, but I don't know whether it means the same thing as your replica set.
>
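Gang's answer to question 2 says that queries for one cube are stuck to one receiver so that users see consistent data even when receivers consume at different rates. A minimal sketch of that idea follows. The class and method names (StickyReceiverRouter, pick) and the receiver names are hypothetical and are not taken from the Kylin code base; the hashing scheme is only one possible way to pin a cube to a receiver.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical illustration, not the actual Kylin implementation.
public class StickyReceiverRouter {

    // Choose one live receiver of a ReplicaSet for the given cube.
    // The same cube maps to the same receiver as long as the set of
    // live receivers does not change.
    public static String pick(String cubeName, List<String> replicaSet, Set<String> downReceivers) {
        List<String> live = new ArrayList<>();
        for (String receiver : replicaSet) {
            if (!downReceivers.contains(receiver)) {
                live.add(receiver);
            }
        }
        if (live.isEmpty()) {
            throw new IllegalStateException("no live receiver in the replica set");
        }
        // Sort so that the choice does not depend on the order of the input list.
        Collections.sort(live);
        return live.get(Math.floorMod(cubeName.hashCode(), live.size()));
    }

    public static void main(String[] args) {
        List<String> replicaSet = Arrays.asList("receiver-1", "receiver-2");
        // Normal operation: the cube is pinned to one of the two receivers.
        System.out.println(pick("cube_a", replicaSet, Collections.<String>emptySet()));
        // receiver-1 is down: the query falls back to the remaining receiver.
        System.out.println(pick("cube_a", replicaSet, Collections.singleton("receiver-1")));
    }
}
```

In this sketch a cube may move to another receiver when a receiver goes down or comes back, which matches the "naive but simple" trade-off described in the reply: availability is kept, while consistency is only guaranteed while the cube stays on the same receiver.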
Re: [VOTE] Release apache-kylin-2.5.1 (RC1)
+1

Mvn test passed

Best wishes,
Xiaoxiang Yu

On [DATE], "[NAME]" <[ADDRESS]> wrote:

Hi all,

I have created a build for Apache Kylin 2.5.1, release candidate 1.

Changes highlights:
[KYLIN-3531] - Login failed with case-insensitive username
[KYLIN-3604] - Can't build cube with spark in HBase standalone mode
[KYLIN-3613] - Kylin with Standalone HBase Cluster could not find the main cluster namespace at "Create HTable" step
[KYLIN-3634] - When the filter column has null value may cause incorrect query result
[KYLIN-3635] - Percentile calculation on Spark engine is wrong
[KYLIN-3644] - NumberFormatExcetion on null values when building cube with Spark
[KYLIN-3599] - Bulk Add Measures

Thanks to everyone who has contributed to this release.
Here are the release notes:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316121=12344108

The commit to be voted upon:
https://github.com/apache/kylin/commit/24e2452309a450ec4ef62339b003343eabe23016

Its hash is 24e2452309a450ec4ef62339b003343eabe23016.

The artifacts to be voted on are located here:
https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-2.5.1-rc1/

The hash of the artifact is as follows:
apache-kylin-2.5.1-source-release.zip.sha256 21db5dab4d3900a49237b9083b5d270c8471d1882a5427cddf1cc74873df42f2

A staged Maven repository is available for review at:
https://repository.apache.org/content/repositories/orgapachekylin-1056/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/shaofengshi.asc

Please vote on releasing this package as Apache Kylin 2.5.1.

The vote is open for the next 72 hours and passes if a majority of at least three +1 PPMC votes are cast.

[ ] +1 Release this package as Apache Kylin 2.5.1
[ ] 0 I don't feel strongly about it, but I'm okay with the release
[ ] -1 Do not release this package because...

Here is my vote:

+1 (binding)

--
Best regards,
Shaofeng Shi 史少锋
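Voters on a release candidate usually check the published SHA-256 hash against the downloaded source package before running the tests. The following is a minimal sketch of such a check using only the JDK; the class name Sha256Check and its command-line arguments are assumptions made for illustration and are not part of the Kylin release tooling.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for checking a release artifact against its published hash.
public class Sha256Check {

    // Return the SHA-256 digest of the given bytes as lowercase hex.
    public static String sha256Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Hash a whole file; reads it fully into memory, which is acceptable
    // for a source release zip of moderate size.
    public static String sha256Hex(Path file) throws IOException {
        return sha256Hex(Files.readAllBytes(file));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: Sha256Check <file> <expected-sha256>");
            return;
        }
        String actual = sha256Hex(Paths.get(args[0]));
        System.out.println(actual.equalsIgnoreCase(args[1]) ? "OK" : "MISMATCH: " + actual);
    }
}
```

It would be run with the path of the downloaded source release zip as the first argument and the hash quoted in the vote email as the second.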
Re: [DISCUSS] Not sending Github PR notifications to dev@kylin
+1 Xiaoxiang Yu xiaoxiang...@kyligence.io On [DATE], "[NAME]" <[ADDRESS]> wrote: Hello, Kylin dev subscribers, Recently I received several complaints saying that there are many emails sent to the "dev@kylin.apache.org" from the github.com pull request since we enabled the Gitbox service for Kylin. Today most patches and code reviews are performed on GitHub. Each pull request action (even add a comment) will emit an email to dev@kylin, instead of the individual contributor or reviewer; This generates many spams and causes the emails from people are left in the basket. Now I plan to change the Gitbox email notifications rule: removing dev@kylin, use author and reviewer instead, as follows: *For Github issues, please notify iss...@kylin.apache.org ;For Github PR, please notify the author, reviewer and iss...@kylin.apache.org * The related JIRA to Apache Infra is https://issues.apache.org/jira/browse/INFRA-17073 Please +1 if you agree with the new rule, or -1 if you want to keep as today. If no objection, we will move on with the new rule. -- Best regards, Shaofeng Shi 史少锋
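Re: I am using Kylin 2.5.2 and often find that configuration files such as kylin.properties and kylin_hive_conf.xml cannot be found; below are some of my screenshots. Why is the value of this variable random even though I have set it? This causes me to hit this error frequently. I now suspect that this variable is related to the file loading order, but the stack trace for loading the configuration files is too long and I cannot trace it any further.
Hi,

I cannot see your screenshot files because they were not uploaded. Please add the screenshots as email attachments and try again, or include a link in your email that other people can open to view your screenshots.

Best wishes,
Xiaoxiang Yu

From: 冯广彬
Reply-To: "dev@kylin.apache.org"
Date: Saturday, January 19, 2019 at 18:09
To: "dev@kylin.apache.org"
Subject: I am using Kylin 2.5.2 and often find that configuration files such as kylin.properties and kylin_hive_conf.xml cannot be found; below are some of my screenshots. Why is the value of this variable random even though I have set it? This causes me to hit this error frequently. I now suspect that this variable is related to the file loading order, but the stack trace for loading the configuration files is too long and I cannot trace it any further.

[fail_to_locate_kylin.properties.jpg] [很多文件都找不到.jpg] [没设置.jpg] [设置了KYLIN_CONF.jpg] [又换了位置.jpg] [又换了位置.png]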
Re: Kylin real-time streaming is ready on realtime-streaming branch
Hi everyone,

I have been reading the source code of real-time streaming and found a way to run it that may be helpful to others who are interested in this feature. If you are interested in eBay's new real-time streaming solution but don't know how it may help you, the following link will help you run or debug it on your laptop.

https://github.com/hit-lacus/hit-lacus.github.io/issues/13#issuecomment-448449318

Best wishes,
Xiaoxiang Yu

From: Ma Gang
Reply-To: "u...@kylin.apache.org"
Date: Sunday, December 23, 2018 at 13:33
To: kylin_dev, kylin_user
Subject: Kylin real-time streaming is ready on realtime-streaming branch

Hi all,

The Kylin real-time streaming feature has been staged in the Kylin code repository for public review and evaluation. You can check out the "realtime-streaming" branch to read the code, and make a binary build to run an example. The detailed design doc and usage doc can be found in the attachments of this JIRA: https://issues.apache.org/jira/browse/KYLIN-3654. This is just the first version; any comments and pull requests are welcome!

Thanks,
Ma, Gang
Re: [Question:]The version of HDP sandbox for set up kylin dev environment
Hi,

I am using HDP 2.4 (2.4.0.0-169) to develop Kylin (starting DebugTomcat in the IDE), and I think it works well. It is also necessary to use Java 8, because the source uses some APIs and features introduced in Java 8.

Best wishes,
Xiaoxiang Yu

On [DATE], "[NAME]" <[ADDRESS]> wrote:

Hi engineers,

Could you please tell me which version of the HDP sandbox you use for developing Kylin 2.6.0? The doc says that the HDP sandbox version used to set up the Kylin dev env is 2.4, and that its JDK needs to be updated to 1.8. But I find that the latest version, Kylin 2.6.0, can support Hadoop 3.0, so this baffles me. Please help me.
Re: Fwd: issue: the same cube in different projects returns different results.
Hi Liang Gang,

Using your sample data, SQL query and Hive DDL, Kylin gave the same result as Hive in my sandbox env, so I think you should double-check your codebase or env. If you later confirm that the result is different from the result returned by Hive, please report a JIRA and attach the required files.

Best wishes,
Xiaoxiang Yu

From: "liangg...@qutoutiao.net"
Date: Saturday, December 15, 2018 at 18:43
To: Xiaoxiang Yu
Cc: "dev@kylin.apache.org"
Subject: Fwd: issue: the same cube in different projects returns different results.

Hi Xiaoxiang,

Thanks for your response. I have attached the data files so that you can check my issue. Thanks.

“GRAPH_DETAIL_CUBE_ VIEW” is the fact table and “DIM_METRIC_INFO” is the dimension table; the Kylin model uses “inner join”.

When I query with 1.sql, it returns one row, as in the snapshot below:
[cid:image005.jpg@01D4949A.40AAB230]

When I add ‘7464’ to the “exp_id” filter, as in 2.sql, it returns no data, as in the snapshot below:
[cid:image008.jpg@01D4949A.40AAB230]

The columns below are measure columns; all others are dimension columns.
SAMPLE_SIZE
SUM_X
SUM_SQUARE_X
ACT_DEV
[cid:image001.jpg@01D494A1.F2122D30]

I have checked this in Kylin versions 2.3.2 and 2.5.0, and both have the same issue, so I think it is a bug in Kylin. Please help me check, thanks!

From: Xiaoxiang Yu [mailto:xiaoxiang...@kyligence.io]
Sent: December 13, 2018 11:13
To: u...@kylin.apache.org
Cc: liangg...@qutoutiao.net
Subject: Re: issue: the same cube in different projects returns different results.

Hi Liang Gang,

I haven't seen such a problem. Could you please provide the following details for deeper research:

1. Sample data that can reproduce the inconsistent result (maybe a csv/json file containing the data).
2. Your model and cube metadata (some json files).
3. The SQL query whose result is not correct (in a sql file).
4. Your Kylin version and Spark/HBase/MapReduce versions.

If you confirm that the inconsistent result can be reproduced on sample data, please open a JIRA ticket on https://issues.apache.org/jira/projects/KYLIN and attach your files.

--------
Best wishes,
Xiaoxiang Yu

From: "liangg...@qutoutiao.net"
Reply-To: "u...@kylin.apache.org"
Date: Wednesday, December 12, 2018 at 17:43
To: "u...@kylin.apache.org"
Cc: "liangg...@qutoutiao.net"
Subject: issue: the same cube in different projects returns different results.

Hi All,

I have encountered a strange problem with my cube data. Please help me check what is wrong with Kylin.

I created two identical cubes in the “ABtest” and “ABtest_prod” projects. The cube structure is the same, I built the same time range of data, and the data size is the same. But when I use the same SQL to query, the returned data is different, as in the snapshots below.

ABtest project’s snapshot:
[cid:image001.png@01D4923F.A7D44B90]

ABtest_prod project’s snapshot:
[cid:image002.png@01D4923F.A7D44B90]

One other strange thing: when I use “trim(a.os)” in the query filter in the ABtest_prod project, 20 rows are returned and the data is correct, as in the snapshot below:
[cid:image003.png@01D49241.0DC91B00]

I have been checking this for a long time. I have confirmed that the cube structure is the same, and that the time range and data size are also the same. I use version 2.3.2 of Kylin. Currently I don't know what the reason is. Please help me. Thank you very much!
Re: [VOTE] Release apache-kylin-2.5.2 (RC2)
+1

Local CI passed

Best wishes,
Xiaoxiang Yu

On [DATE], "[NAME]" <[ADDRESS]> wrote:

Hi all,

I have created a build for Apache Kylin 2.5.2, release candidate 2.

Changes:
[KYLIN-3187] - JDK APIs using the default locale, time zone or character set should be avoided
[KYLIN-3636] - Wrong "storage_type" in CubeDesc causing cube building error
[KYLIN-3666] - Mege cube step 2: Update dictionary throws IllegalStateException
[KYLIN-3672] - Performance is poor when multiple queries occur in a short period
[KYLIN-3676] - Update to custom calcite and remove the "atopcalcite"
[KYLIN-3678] - CacheStateChecker may remove a cache file that under a building
[KYLIN-3683] - Package org.apache.commons.lang3 not exists
[KYLIN-3689] - When the startTime is equal to the endTime in build request, the segment will build all data.
[KYLIN-3693] - TopN, Count distinct incorrect in Spark engine
[KYLIN-3705] - Segment Pruner mis-functions when the source data has Chinese characters

Thanks to everyone who has contributed to this release.
Here are the release notes:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316121=12344466

The commit to be voted upon:
https://github.com/apache/kylin/commit/0e519d859e217fbfadd534313376e532d2c647fa

Its hash is 0e519d859e217fbfadd534313376e532d2c647fa.

The artifacts to be voted on are located here:
https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-2.5.2-rc2/

The hash of the artifact is as follows:
apache-kylin-2.5.2-source-release.zip.sha256 fca5688cf64442ea595e07c2a4a4b2b549836d268ce8f10f3d559f05c22b61d0

A staged Maven repository is available for review at:
https://repository.apache.org/content/repositories/orgapachekylin-1058/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/shaofengshi.asc

Please vote on releasing this package as Apache Kylin 2.5.2.

The vote is open for the next 72 hours and passes if a majority of at least three +1 PPMC votes are cast.

[ ] +1 Release this package as Apache Kylin 2.5.2
[ ] 0 I don't feel strongly about it, but I'm okay with the release
[ ] -1 Do not release this package because...

Here is my vote:

+1 (binding)

Best regards,
Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org
Re: [DISCUSS] Stop inserting git diffs to JIRA ticket
+1 Good idea. Best wishes, Xiaoxiang Yu On [DATE], "[NAME]" <[ADDRESS]> wrote: Hello Kylin developers, After we enable the git box for Kylin code repository, when there is a PR merged, the "ASF Github Bot" will insert the git diff to the associated JIRA. We noticed this function will make the JIRA very long when the code change is big. Besides, when cherry-picking the change to another branch, it will append again. This makes it is too hard for a human to read the JIRA, the important message may be overlooked. A typical sample is this: https://issues.apache.org/jira/browse/KYLIN-3187 My proposal is, stopping sync the code change from GitHub to JIRA; Only keep necessary notifications like "A PR is created/closed" etc. For the code change, people should go to GitHub code history, not JIRA. Please express your ideas; If no objection in the next couple of days, we will raise a change request to the infrastructure team. Thanks for your input! Best regards, Shaofeng Shi 史少锋 Apache Kylin PMC Work email: shaofeng@kyligence.io Kyligence Inc: https://kyligence.io/ Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html Join Kylin user mail group: user-subscr...@kylin.apache.org Join Kylin dev mail group: dev-subscr...@kylin.apache.org
Re: dont complete a job in apache kylin.
In your picture, I can see that 'Memory Total' is 0B, so it seems that your Yarn is not configured correctly. The jobs submitted by Kylin are accepted by Yarn, but Yarn has no resources to allocate.

Best wishes,
Xiaoxiang Yu

On [DATE], "[NAME]" <[ADDRESS]> wrote:

Hi,

I built a job and can see it in the Kylin monitor; its status is running. I see a new table in Hive and all tables in Yarn (hadoop: http://localhost:8088/cluster/apps/ACCEPTED), but the job does not make full progress (final status=UNDEFINED). I checked kylin.log but did not find any error.

---
<http://apache-kylin.74782.x6.nabble.com/file/t799/Screenshot_from_2018-11-18_07-48-55.png>
<http://apache-kylin.74782.x6.nabble.com/file/t799/Screenshot_from_2018-11-18_07-49-20.png>
<http://apache-kylin.74782.x6.nabble.com/file/t799/Screenshot_from_2018-11-18_07-51-32.png>
..
Thank you.

--
Sent from: http://apache-kylin.74782.x6.nabble.com/
Re: [VOTE] Release apache-kylin-2.6.0 (RC1)
+1

mvn test passed

Best wishes,
Xiaoxiang Yu

On [DATE], "[NAME]" <[ADDRESS]> wrote:

+1 binding

mvn test passed on my env:

Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-18T02:33:14+08:00)
Maven home: /usr/local/Cellar/maven/3.5.4
Java version: 1.8.0_91, vendor: Oracle Corporation, runtime: /Library/Java/JavaVirtualMachines/jdk1.8.0_91.jdk/Contents/Home/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "mac os x", version: "10.14.2", arch: "x86_64", family: "mac"

With Warm regards

Billy Liu

On Thu, Jan 10, 2019 at 6:30 PM, 李 栋 wrote:
>
> +1(binding) mvn test passed
>
> Dong Li
>
> -Original Message-
> From: Ma Gang
> Sent: Thursday, January 10, 2019 1:16 PM
> To: dev@kylin.apache.org
> Subject: Re:[VOTE] Release apache-kylin-2.6.0 (RC1)
>
> +1, mvn test passed
>
> At 2019-01-10 09:59:55, "Yichen Zhou" wrote:
> >+1
> >mvn test passed
> >
> >Regards,
> >Yichen
> >
> >On Wed, Jan 9, 2019 at 5:58 PM Rongchuan Jin wrote:
> >
> >> +1
> >>
> >> 金荣钏/Rongchuan.Jin
> >>
> >> On 2019/1/10, 9:49 AM, "ShaoFeng Shi" wrote:
> >>
> >>     Checked the source package, the signature, and the sha256 hash;
> >>
> >>     Mvn package and test are all successful with jdk 1.8.0_111 on Mac;
> >>
> >>     +1 (binding)
> >>
> >>     Best regards,
> >>
> >>     Shaofeng Shi 史少锋
> >>     Apache Kylin PMC
> >>     Work email: shaofeng@kyligence.io
> >>     Kyligence Inc: https://kyligence.io/
> >>
> >>     Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
> >>     Join Kylin user mail group: user-subscr...@kylin.apache.org
> >>     Join Kylin dev mail group: dev-subscr...@kylin.apache.org
> >>
> >>     On Thu, Jan 10, 2019 at 9:42 AM, JiaTao Tao wrote:
> >>
> >>     > +1
> >>     > mvn test passed
> >>     >
> >>     > On Wed, Jan 9, 2019 at 2:46 AM, Yanghong Zhong wrote:
> >>     >
> >>     > > Hi all,
> >>     > >
> >>     > > I have created a build for Apache Kylin 2.6.0, release candidate 1.
> >>     > >
> >>     > > Changes highlights:
> >>     > > [KYLIN-2895] - Refine query cache by changing the query cache expiration strategy by signature checking and introducing memcached as distributed cache
> >>     > > [KYLIN-2932] - Simplify the thread model for in-memory cubing
> >>     > > [KYLIN-3021] - Check MapReduce job failed reason and include the diagnostics into email notification
> >>     > > [KYLIN-3272] - Upgrade Spark dependency to 2.3.2
> >>     > > [KYLIN-3540] - Improve Mandatory Cuboid Recommendation Algorithm
> >>     > > [KYLIN-3552] - Data Source SDK to ingest data from different JDBC sources
> >>     > > [KYLIN-3611] - Upgrade Tomcat to 7.0.91, 8.5.34 or later
> >>     > > [KYLIN-3656] - Improve HLLCounter performance
> >>     > > [KYLIN-3700] - Quote sql identities when creating flat table
> >>     > > [KYLIN-3729] - CLUSTER BY CAST(field AS STRING) will accelerate base cuboid build with UHC global dict
> >>     > >
> >>     > > Thanks to everyone who has contributed to this release.
> >>     > > Here’s release notes:
> >>     > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316121=12344003
> >>     > >
> >>     > > The commit to be voted upon:
> >>     > > https://github.com/apache/kylin/commit/8737bc1f555a2789a67462c8f8420b6ab3be9
Re: Unsubscribe
Please send a short message to dev-unsubscr...@kylin.apache.org to unsubscribe from the dev mailing list. For more information, please visit http://kylin.apache.org/community/

Best wishes,
Xiaoxiang Yu

On [DATE], "[NAME]" <[ADDRESS]> wrote:

Please unsubscribe me, thanks.
Re: [Discuss] Moving toward Apache Kylin 3.0
+1, I am looking forward to real-time streaming feature. Wish more dev/user’s participation. Best wishes, Xiaoxiang Yu On [DATE], "[NAME]" <[ADDRESS]> wrote: Hi Kylin developers, In last week, Kylin released v2.6.0, with the enhanced & distributed query cache and JDBC data source SDK. After this release, the next batch candidate features include real-time streaming, parquet storage, and druid storage. These features were developed in the past 1-2 years by different Kylin players and were open sourced in the past 6 months. They have already been staged in separate branches and are under evaluation by the community. We have received much feedback from the community. These candidate features are big supplements to as-is Kylin functions; For example, the real-time streaming feature will bring Kylin from batch & historical analytics into real-time analytics. The parquet storage will make the deployment more flexible and more cloud-friendly. Of course, stabilizing and improving these features need additional time and effort. So, when we merging and releasing them, we'd better give it a new version number so that user can clearly know the difference with current 2.x versions. I discussed this with several developers offline, we think it is time to move toward Kylin 3.0. So, if one of the above features is merged, the version will be 3.0. The current 2.6 will be maintained until 3.x is ready for production use. Your comments, ideas, and suggestions are welcomed! Best regards, Shaofeng Shi 史少锋 Apache Kylin PMC Work email: shaofeng@kyligence.io Kyligence Inc: https://kyligence.io/ Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html Join Kylin user mail group: user-subscr...@kylin.apache.org Join Kylin dev mail group: dev-subscr...@kylin.apache.org
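Re: Constellation schema question: multiple fact tables all need to be aggregated; is there a solution?
Hi,

I have some ideas which may be helpful.

1. Try to join these fact tables into another Hive table (a new fact table) which contains all columns, but be careful with some measures (such as SUM), because rows will be duplicated after the join.
2. Create a single cube for each fact table, and merge the results on your side.

If you find any mistake, please let me know.

Best wishes,
Xiaoxiang Yu

On 2019/3/27, 20:09, "奥威软件" <3513797...@qq.com> wrote:

Hi,

Constellation schema question: in a case where multiple fact tables are joined and the measures of several fact tables need to be aggregated, is there a solution?

If a fact table is treated as a dimension table, its measures can no longer be aggregated, right?

Please help me work out how to solve this, thanks!

Best regards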
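The second suggestion (one cube per fact table, with the results merged on the client side) can be sketched as follows. This assumes that each cube is queried separately with the same GROUP BY dimension and that each result is available as a map from dimension value to aggregated measure; the class name MergeCubeResults and the sample values are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical client-side merge of two per-cube query results.
public class MergeCubeResults {

    // Combine two results that are grouped by the same dimension key.
    // Each output row is {measureFromA, measureFromB}; a key that is
    // missing on one side gets 0 for that side.
    public static Map<String, double[]> merge(Map<String, Double> a, Map<String, Double> b) {
        Map<String, double[]> merged = new TreeMap<>();
        for (Map.Entry<String, Double> e : a.entrySet()) {
            merged.computeIfAbsent(e.getKey(), k -> new double[2])[0] = e.getValue();
        }
        for (Map.Entry<String, Double> e : b.entrySet()) {
            merged.computeIfAbsent(e.getKey(), k -> new double[2])[1] = e.getValue();
        }
        return merged;
    }

    public static void main(String[] args) {
        // Result of SUM(...) GROUP BY month on the cube of the first fact table.
        Map<String, Double> sales = new HashMap<>();
        sales.put("2019-03", 100.0);
        // Result of SUM(...) GROUP BY month on the cube of the second fact table.
        Map<String, Double> refunds = new HashMap<>();
        refunds.put("2019-03", 7.0);
        refunds.put("2019-04", 3.0);

        for (Map.Entry<String, double[]> row : merge(sales, refunds).entrySet()) {
            System.out.println(row.getKey() + " " + row.getValue()[0] + " " + row.getValue()[1]);
        }
    }
}
```

Because each measure is aggregated inside its own cube before the merge, this approach avoids the duplicated rows that inflate SUM when the fact tables are joined first, which is the caveat attached to the first suggestion.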
Re: SQL Server JDBC data source issue: Kylin cannot read tables when the table prefix is dbo
Hi,

We cannot see your screenshots, but here is my solution; I hope it helps. (The following steps have been tested in my dev env.)

1. Copy mssql-jdbc-6.4.0.jre8.jar to your $KYLIN_HOME/lib.
2. Add the following config in kylin.properties or at project level.
3. Start the Kylin process and import the table.

Config for step 2:

kylin.source.default=16
kylin.source.jdbc.adaptor=org.apache.kylin.sdk.datasource.adaptor.DefaultAdaptor
kylin.source.jdbc.connection-url=jdbc:sqlserver://XXX.com:1433;database=sample
kylin.source.jdbc.dialect=mssql
kylin.source.jdbc.driver=com.microsoft.sqlserver.jdbc.SQLServerDriver
kylin.source.jdbc.pass=XXX
kylin.source.jdbc.sqoop-home=/opt/cloudera/parcels/CDH-5.15.1-1.cdh5.15.1.p0.4
kylin.source.jdbc.user=root

For screenshots, refer to: https://github.com/hit-lacus/hit-lacus.github.io/issues/32#issuecomment-469134060

If you find any mistakes, please let me know. Thank you.

Best wishes,
Xiaoxiang Yu

From: 奥威软件 <3513797...@qq.com>
Reply-To: "dev@kylin.apache.org"
Date: Wednesday, February 27, 2019 at 21:54
To: dev
Subject: SQL Server JDBC data source issue: Kylin cannot read tables when the table prefix is dbo

Hi,

SQL Server JDBC data source issue: when the table prefix is dbo, Kylin cannot read the table information.
[cid:2AFCFD0C@78E74533.9E96765C]

When querying the data source, all tables have the prefix dbo.
[cid:31FEFC0B@7518E300.9E96765C]

When the table prefix is not dbo, Kylin can read the table data.
[cid:2D00F70A@DB1CFE19.9E96765C]

Please help me work out how to solve this, thanks!

Best regards

--
Regards!
Aron Tao
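Re: Setting up a development environment
Hi,

Can you provide the error log and stack trace from kylin.log or kylin.out? Please also check that your HBase cluster is in a healthy state (this can be checked in the Ambari web UI). I think kylin.properties should be provided as well.

Best wishes,
Xiaoxiang Yu

On 2019/3/4, 19:38, "294936039" wrote:

Hello everyone! I am following the setup tutorial on the Kylin website to set up a development environment for version 2.6. When I start DebugTomcat, it reports that the HBase region server cannot be found. Could someone help, or has anyone put together a clearer setup guide? Thanks! I am a beginner, so please bear with me.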
Re: [VOTE] Release apache-kylin-2.6.1 (RC1)
+1 mvn test passed Best wishes, Xiaoxiang Yu On 2019/3/4, 18:35, "ShaoFeng Shi" wrote: Hi all, I have created a build for Apache Kylin 2.6.1, release candidate 1. Changes highlights: [KYLIN-3494] - Build cube with spark reports ArrayIndexOutOfBoundsException [KYLIN-3537] - Use Spark to build Cube on Yarn failed at Setp8 on HDP3. [KYLIN-3815] - Unexpected behavior when joining the streaming table and hive table [KYLIN-3828] - ArrayIndexOutOfBoundsException thrown when building a streaming cube with empty data in its first dimension [KYLIN-3833] - Potential OOM in Spark Extract Fact Table Distinct Columns step [KYLIN-3826] - MergeCuboidJob only uploads necessary segment's dictionary Thanks to everyone who has contributed to this release. Here’s the release notes: https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316121=12344845 The commit to being voted upon: https://github.com/apache/kylin/commit/270cfe68ecc94c66141b29e2ccf20b9ec25e23dd Its hash is 270cfe68ecc94c66141b29e2ccf20b9ec25e23dd. The artifacts to be voted on are located here: https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-2.6.1-rc1/ The hash of the artifact is as follows: apache-kylin-2.6.1-source-release.zip.sha256 961b8c8d0e781fe7936efb7f33cebb9661b4fbf83082669769a41b47cea19001 A staged Maven repository is available for review at: https://repository.apache.org/content/repositories/orgapachekylin-1060/ Release artifacts are signed with the following key: https://people.apache.org/keys/committer/shaofengshi.asc Please vote on releasing this package as Apache Kylin 2.6.1. The vote is open for the next 72 hours and passes if a majority of at least three +1 PMC votes are cast. [ ] +1 Release this package as Apache Kylin 2.6.1 [ ] 0 I don't feel strongly about it, but I'm okay with the release [ ] -1 Do not release this package because... 
Here is my vote: +1 (binding) Best regards, Shaofeng Shi 史少锋 Apache Kylin PMC Work email: shaofeng@kyligence.io Kyligence Inc: https://kyligence.io/ Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html Join Kylin user mail group: user-subscr...@kylin.apache.org Join Kylin dev mail group: dev-subscr...@kylin.apache.org
[Discussion] Enable shrunken dictionary by default
Dear all, I suggest enabling "kylin.dictionary.shrunken-from-global-enabled" by default (it is currently disabled by default), because I found that enabling it speeds up the cube build process when the cube has a count distinct (bitmap) measure on a high-cardinality column. This feature was contributed in KYLIN-3491. When a count distinct (bitmap) measure is used on a high-cardinality column (this requires a global dictionary), the build base cuboid step needs frequent cache swaps, so it cannot finish within a reasonable period. KYLIN-3491 adds a new step that builds a separate dictionary for each InputSplit before the BuildBaseCuboid step. Each mapper of the BuildBaseCuboid step then only has to fetch a smaller dictionary for itself (without unused values) instead of the larger global dictionary. This reduces cache swapping and lets the BuildBaseCuboid step run as quickly as possible. In my test environment, the Hadoop cluster is a CDH cluster with 56 vcores and 110 GB of memory. I created a model with a fact table (153326740 rows) and three dimension tables; there are three count distinct (bitmap) measures, and the largest cardinality of a single column is 55200325. With ShrunkenDict disabled, BuildBaseCuboid could not complete in 22 hours. By comparison, with ShrunkenDict enabled, the build process completed in a reasonable duration (Extra Dictionary took 5 minutes, Build Base Cuboid took 5 minutes). https://user-images.githubusercontent.com/14030549/54363305-ad25e200-46a5-11e9-8bc7-fe2c385c0278.png If you want to know more, please check https://issues.apache.org/jira/browse/KYLIN-3491. If you have any suggestions, please let me know. Best wishes, Xiaoxiang Yu
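The idea of a per-split dictionary described in this message can be sketched roughly as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical and are not Kylin's implementation, and the dictionaries are modeled as plain in-memory maps.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: hypothetical names, not Kylin classes.
final class ShrunkenDictSketch {

    // Keep only the global-dictionary entries whose values occur in one input split.
    // IDs are copied unchanged, so they stay consistent with the global dictionary.
    static Map<String, Integer> shrink(Map<String, Integer> globalDict,
                                       Iterable<String> splitValues) {
        Map<String, Integer> shrunken = new LinkedHashMap<>();
        for (String value : splitValues) {
            Integer id = globalDict.get(value);
            if (id != null) {
                // Repeated values simply overwrite the same entry.
                shrunken.put(value, id);
            }
        }
        return shrunken;
    }
}
```

A mapper working on one split would then look values up in the small map only, which is why the shrunken dictionary avoids swapping slices of the large global dictionary in and out of cache.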
Re: Reply: Hive table data source: build cube fails when a column name in the Hive table contains Chinese characters
Hi, Thanks for reporting; this issue does exist. If you have fixed it, please submit a PR. Best wishes, Xiaoxiang Yu

On 2019/3/6, 14:18, "奥威软件" <3513797...@qq.com> wrote:

The datasource is Hive, as described in the title.

-- Original message --
From: "PENG Zhengshuai";
Sent: Wednesday, March 6, 2019, 2:07 PM
To: "dev@kylin.apache.org";
Subject: Re: Hive table data source: build cube fails when a column name in the Hive table contains Chinese characters

Hi, Let’s confirm the following:
1. Is the datasource Hive or an RDBMS?
2. If an RDBMS is the datasource, which one? MySQL or MSSQL?
3. Have you changed any configuration, such as disabling quoting in SQL?
BR PENG Zhengshuai

> On Mar 6, 2019, at 1:28 PM, 奥威软件 <3513797...@qq.com> wrote:
>
> The SQL statement generated by Kylin (in Hive):
> CREATE EXTERNAL TABLE IF NOT EXISTS kylin_intermediate_ch_cube_5cc1_a6f3_f9f8_b90a_3e52abe0760b
> (
> ICSTOCKBILL_1W_C_门店ID int
> ,ICSTOCKBILL_1W_C_客户ID int
> ,ICSTOCKBILL_1W_C_时间 timestamp
> ,ICSTOCKBILL_1W_C_商品ID int
> ,GOODS_C_商品ID int
> ,GOODS_C_品类ID int
> ,DEPARTMENT_C_门店ID int
> ,DEPARTMENT_C_区域ID int
> ,GOODSCLASS_C_品类ID int
> ,DEPARTMENTCLASS_C_区域ID int
> ,ICSTOCKBILL_1W_C_数量 int
> ,ICSTOCKBILL_1W_C_进货价 decimal(20,3)
> ,ICSTOCKBILL_1W_C_总售价 decimal(20,3)
> ,ICSTOCKBILL_1W_C_售价 decimal(20,3)
> ,ICSTOCKBILL_1W_C_总成本 decimal(20,3)
> )
> STORED AS SEQUENCEFILE
> LOCATION 'hdfs://kylincluster/kylin/kylin_metadata/kylin-6613a735-0452-1bd5-aa22-e63013366c2a/kylin_intermediate_ch_cube_5cc1_a6f3_f9f8_b90a_3e52abe0760b';
>
> The error message is: FAILED: ParseException line 3:19 cannot recognize input near 'ID' 'int' ',' in column type
>
> A Hive SQL statement that works (the difference is that every column name is wrapped in backticks ` ):
> CREATE EXTERNAL TABLE IF NOT EXISTS kylin_intermediate_ch_cube_5cc1_a6f3_f9f8_b90a_3e52abe0760b
> (
> `ICSTOCKBILL_1W_C_门店ID` int
> ,`ICSTOCKBILL_1W_C_客户ID` int
> ,`ICSTOCKBILL_1W_C_时间` timestamp
> ,`ICSTOCKBILL_1W_C_商品ID` int
> ,`GOODS_C_商品ID` int
> ,`GOODS_C_品类ID` int
> ,`DEPARTMENT_C_门店ID` int
> ,`DEPARTMENT_C_区域ID` int
> ,`GOODSCLASS_C_品类ID` int
> ,`DEPARTMENTCLASS_C_区域ID` int
> ,`ICSTOCKBILL_1W_C_数量` int
> ,`ICSTOCKBILL_1W_C_进货价` decimal(20,3)
> ,`ICSTOCKBILL_1W_C_总售价` decimal(20,3)
> ,`ICSTOCKBILL_1W_C_售价` decimal(20,3)
> ,`ICSTOCKBILL_1W_C_总成本` decimal(20,3)
> )
> STORED AS SEQUENCEFILE
> LOCATION 'hdfs://kylincluster/kylin/kylin_metadata/kylin-6613a735-0452-1bd5-aa22-e63013366c2a/kylin_intermediate_ch_cube_5cc1_a6f3_f9f8_b90a_3e52abe0760b';
>
> -- Original message --
> From: "PENG Zhengshuai";
> Sent: Wednesday, March 6, 2019, 1:09 PM
> To: "dev@kylin.apache.org";
> Subject: Re: Hive table data source: build cube fails when a column name in the Hive table contains Chinese characters
>
> Hi,
>
> Can you show the Hive SQL in kylin.log from the cube build?
>
> BR
> PENG Zhengshuai
>
>> On Mar 5, 2019, at 3:25 PM, 奥威软件 <3513797...@qq.com> wrote:
>>
>> Hi,
>> Hive table data source: build cube fails when a column name in the Hive table contains Chinese characters
>> kylin2.6.0 hadoop3
>> The error is as follows:
>>
>> NoViableAltException(24@[])
>> at org.apache.hadoop.hive.ql.parse.HiveParser.type(HiveParser.java:36813)
>> at org.apache.hadoop.hive.ql.parse.HiveParser.colType(HiveParser.java:36595)
>> at org.apache.hadoop.hive.ql.parse.HiveParser.columnNameTypeConstraint(HiveParser.java:34322)
>> at org.apache.hadoop.hive.ql.parse.HiveParser.columnNameTypeOrConstraint(HiveParser.java:34075)
>> at org.apache.hadoop.hive.ql.parse.HiveParser.columnNameTypeOrConstraintList(HiveParser.java:29819)
>> at org.apache.hadoop.hive.ql.parse.HiveParser.createTableStatement(HiveParser.java:6662)
>> at org.apache.hadoop.hive.ql.parse.HiveParser.ddlStatement(HiveParser.java:4295)
>> at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:2494)
>> at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1420)
>> at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:220)
>> at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:74)
>> at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:67)
>> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:616)
>> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1826)
>> at org.apache.hadoop.hive.ql.Driver.compile
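The workaround reported in this thread, wrapping every column name in backticks, could be sketched as below. This is a minimal illustration, not Kylin's actual flat-table SQL generator: the class and method names are hypothetical, and it assumes column names never contain a backtick themselves.

```java
// Illustrative only: hypothetical helper, not Kylin's flat-table SQL generator.
final class HiveIdentifierQuoting {

    // Wrap an identifier in backticks so Hive accepts non-ASCII column names.
    // Assumption: the name itself contains no backtick; such names are rejected.
    static String quote(String identifier) {
        if (identifier.indexOf('`') >= 0) {
            throw new IllegalArgumentException("backtick in identifier: " + identifier);
        }
        return "`" + identifier + "`";
    }

    // Render {name, type} pairs as the column section of a CREATE TABLE statement,
    // using the leading-comma layout seen in the generated DDL above.
    static String columnDefinitions(String[][] nameTypePairs) {
        StringBuilder sql = new StringBuilder("(\n");
        for (int i = 0; i < nameTypePairs.length; i++) {
            sql.append(i == 0 ? "" : ",")
               .append(quote(nameTypePairs[i][0]))
               .append(' ')
               .append(nameTypePairs[i][1])
               .append('\n');
        }
        return sql.append(")").toString();
    }
}
```

Passing a pair such as `ICSTOCKBILL_1W_C_门店ID` and `int` yields a quoted column definition of the same form as the working statement in the thread.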
Re: Unexpected behavior when joining streaming table and hive table
Hi lifei, After checking your model.json, I found that you use "HOUR_START" as your partition_date_column, which is not correct. I think you should change it to "timestamp" and try again. See the source code at https://github.com/apache/kylin/blob/master/source-kafka/src/main/java/org/apache/kylin/source/kafka/TimedJsonStreamParser.java#L111 If you find any mistake, please let me know. ---- Best wishes, Xiaoxiang Yu

On [DATE], "[NAME]" <[ADDRESS]> wrote:

Hello, I am evaluating Kylin and tried to join a streaming table and a Hive table, but got unexpected behavior. All the scripts can be found at https://gist.github.com/OstCollector/a4ac396e3169aa42a416d96db3021195 (you may need to modify some scripts to match your environment).

Environment:
Centos 7
Hadoop on CDH-5.8
dedicated Kafka-2.1 (not included in CDH)

How to reproduce this problem:
1. run gen_station.pl to generate dim table data
2. run import-data.sh to build dim table in Hive
3. run factdata.pl and pipe its output into kafka
4. create tables TEST_WEATHER.STATION_INFO(hive) TEST_WEATHER.WEATHER(streaming) in Kylin
5. create model and cube in Kylin, join WEATHER.SATION_ID = STATION.ID
6. build the cube

Expected behavior: The cube is built correctly and I can get data when querying.

Actual behavior:
On apache-kylin-2.6.0-bin-cdh57: build failed at step #2 (Create Intermediate Flat Hive Table)
On apache-kylin-2.5.2-bin-cdh57: got an empty cube

I also tried this case without streaming, with the format of the timestamp column changed to "%Y-%m-%d %H:%M:%S", and an additional table to store the mapping of timestamp and {hour,day,month,year}_start. In this case, the cube is built as expected.

In both failed cases, the intermediate fact table on Hive built in step #2 seems to have the wrong column order, e.g.
on version 2.5.2-cdh57, the schema and content of temp table are shown below:

CREATE EXTERNAL TABLE IF NOT EXISTS kylin_intermediate_weather_f32241e6_53c6_2949_b737_d9a88a4618df_fact
(
DAY_START date
,YEAR_START date
,STATION_ID string
,QUARTER_START date
,MONTH_START date
,TEMPERATURE bigint
,HOUR_START timestamp
)
STORED AS SEQUENCEFILE
LOCATION 'hdfs://hz-dev-hdfs-service/user/admin/kylin-2/kylin_metadata/kylin-5dbe40eb-55ba-2245-c0b5-1e9efcb67937/kylin_intermediate_weather_f32241e6_53c6_2949_b737_d9a88a4618df_fact';
ALTER TABLE kylin_intermediate_weather_f32241e6_53c6_2949_b737_d9a88a4618df_fact SET TBLPROPERTIES('auto.purge'='true');

hive> select * from kylin_intermediate_weather_f32241e6_53c6_2949_b737_d9a88a4618df_fact limit 10;
OK
NULL 2010-01-01 2010-01-01 2010-01-01 2010-01-01 NULL NULL
NULL 2009-01-01 2009-10-01 2009-12-01 2009-12-31 NULL NULL
NULL 2009-01-01 2009-10-01 2009-12-01 2009-12-31 NULL NULL
NULL 2009-01-01 2009-10-01 2009-12-01 2009-12-31 NULL NULL
NULL 2009-01-01 2009-10-01 2009-12-01 2009-12-31 NULL NULL
NULL 2010-01-01 2010-01-01 2010-01-01 2010-01-01 NULL NULL
NULL 2010-01-01 2010-01-01 2010-01-01 2010-01-01 NULL NULL
NULL 2009-01-01 2009-10-01 2009-12-01 2009-12-31 NULL NULL
NULL 2009-01-01 2009-10-01 2009-12-01 2009-12-31 NULL NULL
NULL 2010-01-01 2010-01-01 2010-01-01 2010-01-01 NULL NULL
Time taken: 0.421 seconds, Fetched: 10 row(s)

While the content of temp file is:

# hdfs dfs -text hdfs://hz-dev-hdfs-service/user/admin/kylin-2/kylin_metadata/kylin-5dbe40eb-55ba-2245-c0b5-1e9efcb67937/kylin_intermediate_weather_f32241e6_53c6_2949_b737_d9a88a4618df_fact/part-m-1 | head -n 10
19/02/13 11:44:12 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
19/02/13 11:44:12 INFO compress.CodecPool: Got brand-new decompressor [.deflate]
19/02/13 11:44:12 INFO compress.CodecPool: Got brand-new decompressor [.deflate]
19/02/13 11:44:12 INFO compress.CodecPool: Got brand-new decompressor [.deflate]
19/02/13 11:44:12 INFO compress.CodecPool: Got brand-new decompressor [.deflate]
0030322010-01-012010-01-012010-01-012010-01-012010-01-01 07:00:001706
0075762010-01-012010-01-012010-01-012010-01-012010-01-01 07:00:002605
0113882010-01-012010-01-012010-01-012010-01-012010-01-01 07:00:002963
0214922010-01-012010-01-012010-01-012010-01-012010-01-01 07:00:001769
0303062010-01-012010-01-012010-01-012010-01-012010-01-01 0
Re: Hdfs Working directory usage
Hi Ketan, This is what I found:

- cuboid
  - This directory contains the cuboid data; each row contains a dimensions array and a MeasureAggregator array.
  - Its size depends on the cardinality of each column, and it is often very large.
  - When a merge job completes, the cuboid files of all segments that were merged successfully are deleted automatically.
- fact_distinct_columns
  - This directory contains the distinct values of each column.
  - It should be deleted after the current segment build job succeeds.
- hfile
  - This directory contains the data files that are bulk loaded into HBase.
  - It should be deleted after the current segment build job succeeds.
- rowkey_stats
  - Files under this directory are often very small, so you may not need to delete them yourself.
  - These files are used to partition the hfile.

I think you should update your auto-merge settings so that auto-merge runs more often. If you find any mistakes, please let me know. Thank you! Best wishes, Xiaoxiang Yu

On [DATE], "[NAME]" <[ADDRESS]> wrote:

Hi team, Any updates on the same? Thanks, Ketan

> On 01-Feb-2019, at 11:39 AM, ketan dikshit wrote:
>
> Hi Team,
>
> We have a lot of data accumulated in our hdfs-working-directory, so we want to understand the usage of the following job data once the job has completed and the segment has been created successfully.
>
> cuboid
> fact_distinct_columns
> hfile
> rowkey_stats
>
> Basically I need to understand the purpose of cuboid, fact_distinct_columns, hfile and rowkey_stats after the job has built the cube segment (assuming we don’t use any merging/automerging of segments on the cube later).
>
> The space taken up by this data in hdfs-working-dir is quite huge (affecting our costs), and it is not getting cleaned up by the cleanup job (org.apache.kylin.tool.StorageCleanupJob). So we need to understand whether we will run into any issues later if we clean this up manually.
>
> Thanks,
> Ketan@Exponential
Re: SQL Server JDBC Datasource doesn't list tables.
Hi Gladson, I reproduced the same situation in my dev environment. The problem appears to be caused by org.apache.kylin.source.jdbc.metadata.SQLServerJdbcMetadata.java (Line:53) not handling the null value reported in the "TABLE_CATALOG" column. This can be verified with the following code:

    import java.sql.Connection;
    import java.sql.DatabaseMetaData;
    import java.sql.DriverManager;
    import java.sql.ResultSet;

    public class SqlServerMetadataCheck {
        public static void main(String[] args) throws Exception {
            String sqlServerUrl = "jdbc:sqlserver://cdh1.cloudera.com:1433;user=SA;password=Kyligence2019;DatabaseName=xiaoxiang_test";
            Connection con = DriverManager.getConnection(sqlServerUrl);
            DatabaseMetaData meta = con.getMetaData();
            System.out.println("");
            // List every catalog (database) the driver reports.
            ResultSet rs1 = meta.getCatalogs();
            while (rs1.next()) {
                System.out.println(String.format("Catalog \t%s", rs1.getString(1)));
            }
            // List every schema: column 1 is TABLE_SCHEM, column 2 is TABLE_CATALOG.
            rs1 = meta.getSchemas();
            System.out.println("");
            while (rs1.next()) {
                System.out.println(String.format("Schemas \t%s\t%s", rs1.getString(1), rs1.getString(2)));
            }
        }
    }

The output is as follows:

    Catalog master
    Catalog model
    Catalog msdb
    Catalog tempdb
    Catalog xiaoxiang_test

    Schemas db_accessadmin null
    Schemas db_backupoperator null
    Schemas db_datareader null
    Schemas db_datawriter null
    Schemas db_ddladmin null
    Schemas db_denydatareader null
    Schemas db_denydatawriter null
    Schemas db_owner null
    Schemas db_securityadmin null
    Schemas dbo null
    Schemas guest null
    Schemas INFORMATION_SCHEMA null
    Schemas sys null

By the way, which version of SQL Server are you using, and which version of the JDBC driver? Maybe we can try to fix this by changing the way the metadata (catalog) is fetched in SQLServerJdbcMetadata. If you find any mistake, please let me know. Best wishes, Xiaoxiang Yu

On [DATE], "[NAME]" <[ADDRESS]> wrote:

Hi, I followed all the steps at http://kylin.apache.org/docs/tutorial/setup_jdbc_datasource.html , but when I click the Load table button or Load table from tree, no tables seem to be loaded from the SQL Server data source. There are no errors/exceptions in the logs either.
kylin.properties:
kylin.source.default=8
kylin.source.jdbc.connection-url=jdbc:sqlserver://hostname:1433;database=sample
kylin.source.jdbc.driver=com.microsoft.sqlserver.jdbc.SQLServerDriver
kylin.source.jdbc.dialect=mssql
kylin.source.jdbc.user=user
kylin.source.jdbc.pass=pass
kylin.source.jdbc.sqoop-home=sqoophome
kylin.source.jdbc.filed-delimiter=|
kylin.source.jdbc.sqoop-mapper-num=4

kylin.log
2019-01-28 22:52:27,948 DEBUG [http-bio-7070-exec-1] common.KylinConfig:328 : KYLIN_CONF property was not set, will seek KYLIN_HOME env variable
2019-01-28 22:52:28,017 INFO [FetcherRunner 992042775-44] threadpool.DefaultFetcherRunner:94 : Job Fetcher: 0 should running, 0 actual running, 0 stopped, 0 ready, 0 already succeed, 0 error, 0 discarded, 0 others
2019-01-28 22:52:28,123 INFO [http-bio-7070-exec-4] common.KylinConfig:455 : Creating new manager instance of class org.apache.kylin.metadata.project.ProjectManager
2019-01-28 22:52:28,125 INFO [http-bio-7070-exec-4] project.ProjectManager:81 : Initializing ProjectManager with metadata url kylin_metadata@hbase
2019-01-28 22:52:28,129 DEBUG [http-bio-7070-exec-4] cachesync.CachedCrudAssist:118 : Reloading ProjectInstance from kylin_metadata(key='/project')@kylin_metadata@hbase
2019-01-28 22:52:28,188 DEBUG [http-bio-7070-exec-4] cachesync.CachedCrudAssist:127 : Loaded 1 ProjectInstance(s) out of 1 resource
2019-01-28 22:52:28,304 DEBUG [http-bio-7070-exec-5] project.ProjectL2Cache:198 : Loading L2 project cache for Testing
2019-01-28 22:52:28,325 INFO [http-bio-7070-exec-5] common.KylinConfig:455 : Creating new manager instance of class org.apache.kylin.metadata.TableMetadataManager
2019-01-28 22:52:28,326 DEBUG [http-bio-7070-exec-5] cachesync.CachedCrudAssist:118 : Reloading TableDesc from kylin_metadata(key='/table')@kylin_metadata@hbase
2019-01-28 22:52:28,333 INFO [http-bio-7070-exec-1] common.KylinConfig:455 : Creating new manager instance of class org.apache.kylin.metadata.model.DataModelManager
2019-01-28 22:52:28,360 DEBUG [http-bio-7070-exec-5] cachesync.CachedCrudAssist:127 : Loaded 0 TableDesc(s) out of 0 resource
2019-01-28 22:52:28,361 DEBUG [http-bio-7070-exec-5] cachesync.CachedCrudAssist:118 : Reloading TableExtDesc from kylin_metadata(key='/table_exd')@kylin_metadata@hbase
2019-01-28 22:52:28,391 DEBUG [http-bio-7070-exec-5] cachesync.CachedCrudAssist:127 : Loaded 0 TableExtDesc(s) out of 0 resource
2019-01-28 22:52:28,392 DEBUG [http-bio-7070-exec-5] cachesync.CachedCrudAssist:118 : Reloading ExternalFilterDesc from kylin_metadata(key='/ext_filter')@kylin_metadata@hbase
2019-01-28 22:52:28,421 DEBUG [http-bio-7070-exec-5] cachesync.CachedCrudAssist:127 : Loaded 0 ExternalFilte
Re: Hive ORC tables and empty lookup tables in the cube
Hi Vadym Antsut, 1. "The cube built successfully, but all lookup tables are empty." How do you know your lookup tables are empty? Are they empty in Hive or in Kylin? 2. "I build the cube on the Hive Parquet tables in the model" Does that mean you dropped your previous table and created a new one with "STORED AS PARQUET"? Have you checked your lookup table in the Hive CLI? ---- Best wishes, Xiaoxiang Yu On 2019/6/2 16:03, "Вадим Анцут" wrote: Hello! Can anyone help me with a problem: I have one fact table and a few lookup tables in Hive in the model for a Kylin cube. The cube built successfully, but all lookup tables are empty. If I build the cube on the Hive Parquet tables in the model with the same values, then the lookup tables are not empty. -- WBR, Vadym Antsut
Re: [ANNOUNCE] Gang Ma joins the Apache Kylin PMC
Thanks to Gang Ma for his selfless help and encouragement in reviewing my code. I hope I can get more guidance from you in the future and make Kylin better. Congratulations! Best wishes, Xiaoxiang Yu On 2019/6/3 13:32, "ShaoFeng Shi" wrote: On behalf of the Apache Kylin PMC, I am pleased to announce that Gang Ma (马刚) has accepted our invitation to become a PMC member on the Apache Kylin project. We appreciate Gang stepping up to take more responsibility in the Kylin project. Please join me in welcoming Gang to the Kylin PMC! Best Regards, Shaofeng Shi 史少锋 Apache Kylin PMC Email: shaofeng...@apache.org Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html Join Kylin user mail group: user-subscr...@kylin.apache.org Join Kylin dev mail group: dev-subscr...@kylin.apache.org
Re: Hive ORC tables and empty lookup tables in the cube
Hi Antsut, Sorry for my late reply. First, let's make it clear: you have two cubes (let us call them cube1 and cube2); cube1's lookup table is in ORC format and cube2's lookup table is in Parquet format. Now you find that, with the same SQL query, the result from cube1 is correct and the result from cube2 is wrong. To rule out a cause in the data source (external to Kylin), please run the same SQL query in the Hive CLI. If the result from Kylin is different from what you see in the Hive CLI, I guess it is a Kylin bug. In that case, please report a JIRA at https://issues.apache.org/jira/projects/KYLIN, with details that help us reproduce the bug (the version of Kylin you are using, the version of Hive/Hadoop, the DDL of the lookup table, and screenshots of the query result from the Hive CLI and from Kylin). Best wishes, Xiaoxiang Yu On 2019/6/3 15:40, "Vadajer" wrote: Hello Xiaoxiang Yu! Thanks for the reply! 1. I am selecting in the Kylin Insight GUI from a lookup table, or from the fact table joined to a lookup table. All values from the lookup table are empty. <http://apache-kylin.74782.x6.nabble.com/file/t967/empty.png> 2. I created the same tables in Hive (names with a prefix) with "STORED AS PARQUET" and created another cube (see attached pictures). <http://apache-kylin.74782.x6.nabble.com/file/t967/parquet.png> WBR, Vadym Antsut -- Sent from: http://apache-kylin.74782.x6.nabble.com/
Re: Error occurred during build cube
Hi, I have a simple workaround which may help you. Could you please use HDP 3.0 (in which the Hive version is 3.1, the HBase version is 2.0, and MySQL is used as the Hive metastore)? I am sure Kylin works well in that environment because I have verified it. If you have to use Oracle as the Hive metastore and are sure that the error can be reproduced, please report a JIRA at https://issues.apache.org/jira/browse/KYLIN (with your Kylin version and environment details). Thank you very much. Best wishes, Xiaoxiang Yu From: 平 Reply-To: "u...@kylin.apache.org" Date: Thursday, June 13, 2019, 17:35 To: "u...@kylin.apache.org" Cc: "dev@kylin.apache.org" Subject: Error occurred during build cube Hello, I am a beginner and really do not know how to solve this problem. When building a cube, I get ORA-00904: "B0"."CATALOG_NAME": invalid identifier. My Kylin version is 2.6.2 for Hadoop 3, the Hadoop version is 3.1, the Hive version is 3.1.0, the HBase version is 2.0, and the operating system is CentOS 7. I also checked the DBS table, and it indeed has no CATALOG_NAME column. I looked through the initialization scripts of several Hive versions and could not find a CATALOG_NAME column in any of them, so I do not know what went wrong. I would be very grateful if someone could point me in the right direction! java.lang.RuntimeException: java.io.IOException: MetaException(message:Exception thrown when executing query : SELECT DISTINCT 'org.apache.hadoop.hive.metastore.model.MTable' AS NUCLEUS_TYPE,A0.CREATE_TIME,A0.LAST_ACCESS_TIME,A0.OWNER,A0.OWNER_TYPE,A0.RETENTION,A 0.REWRITE_ENABLED,A0.TBL_NAME,A0.TBL_TYPE,A0.TBL_ID FROM TBLS A0 LEFT OUTER JOIN DBS B0 ON A0.DB_ID = B0.DB_ID WHERE A0.TBL_NAME = ? AND B0."NAME" = ? AND B0."CATALOG_NAME" = ?) 
at org.apache.kylin.source.hive.HiveMRInput$HiveTableInputFormat.configureJob(HiveMRInput.java:83) at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.setupMapper(FactDistinctColumnsJob.java:126) at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:104) at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:131) at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:167) at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71) at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:167) at org.apache.kylin.job.impl.threadpool.DistributedScheduler$JobRunner.run(DistributedScheduler.java:110) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: java.io.IOException: MetaException(message:Exception thrown when executing query : SELECT DISTINCT 'org.apache.hadoop.hive.metastore.model.MTable' AS NUCLEUS_TYPE,A0.CREATE_TIME,A0.LAST_ACCESS_TIME,A0.OWNER,A0.OWNER_TYPE,A0.RETENTION,A 0.REWRITE_ENABLED,A0.TBL_NAME,A0.TBL_TYPE,A0.TBL_ID FROM TBLS A0 LEFT OUTER JOIN DBS B0 ON A0.DB_ID = B0.DB_ID WHERE A0.TBL_NAME = ? AND B0."NAME" = ? AND B0."CATALOG_NAME" = ?) at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:97) at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:51) at org.apache.kylin.source.hive.HiveMRInput$HiveTableInputFormat.configureJob(HiveMRInput.java:80) ... 
10 more Caused by: MetaException(message:Exception thrown when executing query : SELECT DISTINCT 'org.apache.hadoop.hive.metastore.model.MTable' AS NUCLEUS_TYPE,A0.CREATE_TIME,A0.LAST_ACCESS_TIME,A0.OWNER,A0.OWNER_TYPE,A0.RETENTION,A 0.REWRITE_ENABLED,A0.TBL_NAME,A0.TBL_TYPE,A0.TBL_ID FROM TBLS A0 LEFT OUTER JOIN DBS B0 ON A0.DB_ID = B0.DB_ID WHERE A0.TBL_NAME = ? AND B0."NAME" = ? AND B0."CATALOG_NAME" = ?) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:208) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) at com.sun.proxy.$Proxy70.get_table_req(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:1578) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:1570) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:208) at com.sun.proxy.$Proxy71.getTable(Unknown Source) at org.apache.hive.hcatalog.common.HCatUtil.getTable(HCatUtil.java:191) at org.apache.hive.hcatalog.mapreduce.InitializeInput.getInputJobInfo(InitializeInput.java:105) at org.apache.hive.hcatalog.mapreduce.InitializeInput.setInput(InitializeInput.java:88) at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:95) ... 12 more Caused by: javax.jdo.JDOException: Exception thrown when executing query : SELECT DISTINCT 'org.apache.ha
Re: Question about cube rowkey ordering
Hi wangfx, Kylin converts a SQL query into two parameters (start_key and end_key) of a range Scan operation in HBase. A well-designed rowkey makes query filtering and data positioning more effective, reduces the number of I/O operations, and improves query speed; the order of the dimensions in the rowkey has a significant impact on query performance. The following two principles should be combined when adjusting the rowkey order: 1. Dimensions that are used as filter conditions in queries are placed before dimensions that are not used as filters. 2. Dimensions with a higher cardinality are placed before dimensions with a lower cardinality. So, in your situation, I suggest the order a, b, c, d (if you have only these four dimensions). This link may also help: https://kyligence.io/zh/blog/apache-kylin-optimizer-kybot-rowkey/. Best wishes, Xiaoxiang Yu On 2019/6/11 09:56, "wangfx" <945517...@qq.com> wrote: The cube has several dimensions. Among them, a and b are mandatory dimensions that always appear in the WHERE clause, and b has a very low cardinality (only 3 distinct values); c and d never appear in WHERE, only in SELECT and GROUP BY. The cardinalities are c>d>a>b, and the remaining dimensions are regular dimensions used in WHERE. How should a, b, c, d and the other dimensions be ordered in the rowkey?
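The two ordering principles in the reply can be combined into a single comparator. The following is only a sketch under stated assumptions: the class, the `Dimension` type and the cardinality numbers used with it are hypothetical, and this is not a Kylin API.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a hypothetical model of the two ordering principles, not a Kylin API.
final class RowkeyOrderSketch {

    static final class Dimension {
        final String name;
        final boolean usedAsFilter;
        final long cardinality;

        Dimension(String name, boolean usedAsFilter, long cardinality) {
            this.name = name;
            this.usedAsFilter = usedAsFilter;
            this.cardinality = cardinality;
        }
    }

    // Principle 1: filter dimensions come first.
    // Principle 2: within each group, higher cardinality comes first.
    static List<String> suggestOrder(List<Dimension> dimensions) {
        List<Dimension> sorted = new ArrayList<>(dimensions);
        sorted.sort((x, y) -> {
            if (x.usedAsFilter != y.usedAsFilter) {
                return x.usedAsFilter ? -1 : 1;
            }
            return Long.compare(y.cardinality, x.cardinality);
        });
        List<String> names = new ArrayList<>();
        for (Dimension d : sorted) {
            names.add(d.name);
        }
        return names;
    }
}
```

With made-up cardinalities that respect c>d>a>b, and with a and b marked as filter dimensions, the sketch produces the order a, b, c, d suggested in the reply.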
Re: ORC lookup tables missed in cube calculation.
Thanks for reporting; we will examine it closely. Best wishes, Xiaoxiang Yu On 2019/6/11 01:22, "Александр Сидорчук" wrote: Hello, Please help with finding a solution. ORC is the best way to store data in Hive, but we have a blocking issue with it. Please look into it: https://issues.apache.org/jira/browse/KYLIN-4038 I have already checked the source code, and it looks like the problem is in HCatalog, which is used by Kylin. But I have no idea what I can do next... I will be glad of any suggestions. Thanks!
Reply: How does Kylin handle null values when summing
Hi, I verified this in my test environment, and what you reported is indeed what happens.

SELECT SUM(INTEREST_SCORE) AS A, AAA as B, AAA + SUM(INTEREST_SCORE) as C
FROM USERACTION
LEFT JOIN ( SELECT SUM(INTEREST_SCORE) AS AAA FROM USERACTION WHERE DT = '2012-01-02' ) A on 1 = 1
WHERE DT = '2012-01-01'
GROUP BY AAA

What Kylin returns looks like the following:

| A    | B | C |
| 5.85 |   |   |

Running the same SQL in the Hive CLI (1.2.X) gives the same result:

OK
5.851 NULL NULL

So Kylin returns the same result as Hive, and Kylin does not appear to be making a mistake. I also checked other NULL-related queries in Kylin and found no mistakes. - - Best wishes to you! From: Xiaoxiang Yu

At 2019-06-25 10:58:22, "肖孟华" wrote:
>Executing: select sum( TRAAMT ) AAA from TM_SR_SKY_T31293716 where AREA_CODE ='370799'
>The query result is null; TRAAMT is of type number.
>
>Executing: select sum( TRAAMT ) AAA from TM_SR_SKY_T31293716 where AREA_CODE ='370783'
>The query result is not null and has a value; TRAAMT is of type number.
>
>Executing: select sum( TRAAMT ) A , AAA AS B, sum( TRAAMT ) + AAA AS C from TM_SR_SKY_T31293716
>left join
>(select sum( TRAAMT ) AAA from TM_SR_SKY_T31293716 where AREA_CODE ='370799') A
>on 1=1
>where AREA_CODE ='370783' group by aaa
>After successful execution, result A is not null, result B is null, and the sum C of A and B is null.
>
>Question: When a null value and a non-null value are combined by addition, subtraction, multiplication, or division, the result is always null, and the sum() function has the same problem in use. Is it possible to treat null values specially in SQL before the calculation, converting them to 0 first? If so, how can it be done, and what needs to be handled?
>
>Sender: 肖孟华
>Phone: (+86)17616716362
>Address: Weifang Software Park, Jiankang Street, High-tech Zone, Weifang, Shandong Province, China
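The behavior discussed in this thread is standard SQL NULL propagation, and it can be modeled in a few lines. The sketch below uses Java's null to stand in for SQL NULL; the helper names are hypothetical. The usual SQL remedy for the question asked is a COALESCE-style default applied before the arithmetic, but whether a given Kylin version accepts that in this position is an assumption to verify, not something confirmed in the thread.

```java
// Illustrative only: models SQL NULL arithmetic with Java's null; helper names are hypothetical.
final class NullSumSketch {

    // SQL-style addition: the result is NULL (null here) if either operand is NULL.
    static Double add(Double left, Double right) {
        if (left == null || right == null) {
            return null;
        }
        return left + right;
    }

    // COALESCE-style default: replace NULL with 0 before doing arithmetic.
    static double orZero(Double value) {
        return value == null ? 0.0 : value;
    }
}
```

In terms of the query above, column C corresponds to `add(A, B)` with B being NULL, which is why it comes back empty; applying `orZero` to B first would give C the same value as A.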
Re: [VOTE] Release apache-kylin-2.6.2 (RC1)
+1 mvn test passed Best wishes, Xiaoxiang Yu On 2019/5/14 09:10, "ShaoFeng Shi" wrote: Hi all, I have created a build for Apache Kylin 2.6.2, release candidate 1. Changes highlights: [KYLIN-3892] - Set cubing job priority [KYLIN-3839] - Storage clean up after refreshing or deleting a segment [KYLIN-3873] - Fix inappropriate use of memory in SparkFactDistinct.java [KYLIN-3905] - Enable shrunken dictionary default [KYLIN-3922] - Fail to update coprocessor when run DeployCoprocessorCLI [KYLIN-3936] - MR/Spark task will still run after the job is stopped. Thanks to everyone who has contributed to this release. Here’s release notes: https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316121=12345051 The commit to being voted upon: https://github.com/apache/kylin/commit/c507ae29fa64bc7234efd6a002dcfe990969ad35 Its hash is c507ae29fa64bc7234efd6a002dcfe990969ad35. The artifacts to be voted on are located here: https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-2.6.2-rc1/ The hash of the artifact is as follows: apache-kylin-2.6.2-source-release.zip.sha256 db2ab59d3e66d635462e9c9ef49fd7ca29342f07ff4eea0730e52777287e2ebf A staged Maven repository is available for review at: https://repository.apache.org/content/repositories/orgapachekylin-1062/ Release artifacts are signed with the following key: https://people.apache.org/keys/committer/shaofengshi.asc Please vote on releasing this package as Apache Kylin 2.6.2. The vote is open for the next 72 hours and passes if a majority of at least three +1 PMC votes are cast. [ ] +1 Release this package as Apache Kylin 2.6.2 [ ] 0 I don't feel strongly about it, but I'm okay with the release [ ] -1 Do not release this package because... 
Here is my vote: +1 (binding) Best regards, Shaofeng Shi 史少锋 Apache Kylin PMC Email: shaofeng...@apache.org Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html Join Kylin user mail group: user-subscr...@kylin.apache.org Join Kylin dev mail group: dev-subscr...@kylin.apache.org
Re : Could not open client transport for any of the Server URI's in ZooKeeper: Unable to read HiveServer2 configs from ZooKeeper
Hi, From the error stack trace, I guess it may be caused by the ZooKeeper server not being in a healthy state, or your Hive client may not be configured correctly. I think you should contact your Hadoop admin for help. Best wishes, Xiaoxiang Yu On 2019/5/20 11:19, "王廉鑫" wrote: Hi, With Kylin 2.5.2 on FusionInsight HD, when I click "load table metadata from tree" in the web UI datasource, I get this error: 2019-05-20 11:15:58,350 INFO [http-bio-7070-exec-5-EventThread] zookeeper.ClientCnxn:614 : EventThread shut down for session: 0xfb0ab634ba4f 2019-05-20 11:15:58,350 INFO [http-bio-7070-exec-5] zookeeper.ZooKeeper:1325 : Session: 0xfb0ab634ba4f closed 2019-05-20 11:15:58,351 ERROR [http-bio-7070-exec-5] controller.TableController:190 : java.sql.SQLException: Could not open client transport for any of the Server URI's in ZooKeeper: Unable to read HiveServer2 configs from ZooKeeper java.lang.RuntimeException: java.sql.SQLException: Could not open client transport for any of the Server URI's in ZooKeeper: Unable to read HiveServer2 configs from ZooKeeper at org.apache.kylin.source.hive.BeelineHiveClient.init(BeelineHiveClient.java:76) at org.apache.kylin.source.hive.BeelineHiveClient.(BeelineHiveClient.java:66) at org.apache.kylin.source.hive.HiveClientFactory.getHiveClient(HiveClientFactory.java:29) at org.apache.kylin.source.hive.HiveMetadataExplorer.(HiveMetadataExplorer.java:43) at org.apache.kylin.source.hive.HiveSource.getSourceMetadataExplorer(HiveSource.java:41) at org.apache.kylin.rest.service.TableService.getSourceDbNames(TableService.java:276) at org.apache.kylin.rest.controller.TableController.showHiveDatabases(TableController.java:188) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at 
org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205) at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:133) at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:97) at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:827) at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:738) at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:85) at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:967) at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:901) at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:970) at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:861) at javax.servlet.http.HttpServlet.service(HttpServlet.java:624) at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:846) at javax.servlet.http.HttpServlet.service(HttpServlet.java:731) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:303) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208) at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208) at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:317) at 
org.springframework.security.web.access.intercept.FilterSecurityInterceptor.invoke(FilterSecurityInterceptor.java:127) at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.doFilter(FilterSecurityInterceptor.java:91) at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331) at org.springframework.security.web.access.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:114) at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331
Re: the MySQL datasource and the Hive datasource are not available at the same time
Hi,

Currently Kylin does not support multiple source types, such as RDBMS and Hive, in a single project. I think managing each source type in its own project should not be much trouble. Maybe you can share your use case with us.

--
Best wishes to you!
From: Xiaoxiang Yu

At 2019-07-11 18:06:16, "ALERT MAIL" wrote:
> hi,
>
> Sorry to trouble you, I need to load tables from mysql datasource and hive
> datasource on the same kylin project at the same time. but if I configure
> the "kylin.source.default=8", it can't load hive table; if I configure the
> "kylin.source.default=0", it can't load the mysql table, are there other
> solutions to solve this problem.
>
> thanks!
Re:Kylin interacting with AWS EMR
Dear friend,

I am sorry to hear that you have run into this trouble. I have deployed Kylin on a CDH Hadoop cluster and know less about AWS EMR, but I can share what I know.

First question: how to deploy Kylin outside the Hadoop cluster? As far as I can see, you should deploy Kylin on a router/client node of the Hadoop cluster. A router node is a node that has the Hadoop binaries (such as Hive/HDFS) and conf files deployed, but runs no DataNode/NodeManager (so it carries no heavy workload). The router/client node gives you full access to the Hive CLI/HBase CLI/HDFS CLI, which makes it suitable for deploying Kylin.

On the other hand, I think deploying Kylin outside the Hadoop cluster is not suitable, because Kylin needs to upload/download large amounts of data to/from the Hadoop cluster. Deploying Kylin outside the cluster makes the network a bottleneck, which hurts Kylin's performance.

Second question: the entry "kylin.job.use-remote-cli=true" is meant for Kylin developers, not for Kylin users. If you are interested in it, please check http://kylin.apache.org/development/dev_env.html for details.

Besides, I have invited you to a Slack channel (https://apache-kylin.slack.com). Some Kylin users have deployed Kylin successfully on EMR, so you may ask them more questions.

--
Best wishes to you!
From: Xiaoxiang Yu

At 2019-07-09 00:34:01, "Fábio Teixeira" wrote:
>Dear all,
>
>First of all, thank you very much for building and maintaining Apache Kylin,
>it is a really awesome, the work you are doing.
>
>I had to try it out, so I first configured Apache Kylin into an AWS EMR
>cluster which worked pretty well and then I wanted to really go crazy and have
>it outside the AWS EMR cluster.
>
>I’ve already setup a Kylin cluster using MySQL as metastore but I am
>struggling on making it interacting with the EMR cluster.
>
>My issue:
>On the first build step of a cube, It is fetching data using sqoop and should
>add it to the Hive table, but there it is timing out because it tries to
>connect to 127.0.0.1:50010 which obviously is not the AWS EMR cluster. I was
>trying to find where I could change the ip for the datanode without success.
>
>Considering my issue, I was checking the code and I saw that there is the
>possibility of running the jobs using remote cli and I was wondering if this
>should be the way to go on a Production environment.
>
>Would you be so kind and provide me some guidance on the following topics?:
>Setting up kylin.job.use-remote-cli=true is the configuration that one should
>use when Apache Kylin is not inside the Hadoop cluster.
>If not then could you provide me any kind of guidance where I can find
>documentation for doing that kind of configuration (Kylin and Hadoop
>separated)?
>I was already investigating the
>https://github.com/apache/kylin/tree/master/examples/test_case_data/sandbox
>Do you have more updated documentation for having Kylin outside the Hadoop
>cluster?
>Is it recommended to use Kylin outside the Hadoop cluster on a production
>environment?
>
>Thank you in advance.
>
>I look forward to hearing from you.
>
>Kind regards,
>Fábio Teixeira
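The router/client node described in the reply is one where the Hadoop command-line clients are already installed. A minimal sketch of how one might check a candidate node before installing Kylin follows. The helper name `missing_clients` and the list of commands checked are assumptions made for illustration, not something Kylin ships.

```shell
#!/usr/bin/env bash
# Print the names of the given commands that are NOT found on PATH.
# (Hypothetical helper, written for this illustration only.)
missing_clients() {
  local cmd missing=""
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      missing="$missing $cmd"
    fi
  done
  # Strip the leading space before printing.
  echo "${missing# }"
}

# Assumed set of clients a Kylin node needs; adjust to your distribution.
result="$(missing_clients hadoop hdfs hive hbase)"
if [ -z "$result" ]; then
  echo "all Hadoop clients found on this node"
else
  echo "missing clients: $result"
fi
```

If any client is reported missing, the node is probably not a usable client node yet. This only checks that the binaries are on PATH, not that their configuration points at the right cluster.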
Re: [DISCUSS] Support multiple pushdown query engines
Hi,

Thanks for your enthusiasm for the Kylin community. I expect this will be a great feature in the next release.

--
Best wishes to you!
From: Xiaoxiang Yu

At 2019-07-07 15:05:49, "codingfor...@126.com" wrote:
>Thanks nichunen and Xiaoxiang Yu for replying, I will create a jira with
>label `new-feature` and implement it.
>
>> On Jul 7, 2019, at 14:49, Xiaoxiang Yu wrote:
>>
>> Hi,
>> +1.
>> I agree with this proposal. Kylin should support multi-level pushdown:
>> a query that cannot be matched by a cube should be pushed down to several
>> engines in order, such as presto -> SparkSQL -> Hive, which is more
>> reasonable and lets the query be answered whenever possible. Maybe it is
>> worth opening a JIRA.
>>
>> --
>> Best wishes to you!
>> From: Xiaoxiang Yu
>>
>> On 2019-07-05 15:37:41, "nichunen" wrote:
>>> +1
>>>
>>> Sounds useful and not difficult to develop.
>>>
>>> Best regards,
>>>
>>> Ni Chunen / George
>>>
>>> On 07/5/2019 15:20, codingfor...@126.com wrote:
>>> Hi, all:
>>> Currently (version 3.0.0-SNAPSHOT), Kylin supports only one kind of pushdown
>>> query engine. In some users' scenarios, queries need to be pushed down to
>>> mysql, spark sql, hive etc.
>>> I think Kylin needs to support multiple pushdown engines. I want to discuss
>>> with you whether this is needed.
>>> Any suggestion is welcome. Thanks.
Re:regarding glue metastore
Hi Krishna,

I guess that you set EMR to use the AWS Glue catalog as the Hive metastore, and Kylin is missing the AWS library com.amazonaws.glue on its classpath. Maybe it is /usr/lib/hive/auxlib/aws-glue-datacatalog-hive2-client.jar (https://github.com/awslabs/aws-glue-data-catalog-client-for-apache-hive-metastore/blob/master/aws-glue-datacatalog-hive2-client/src/main/java/com/amazonaws/glue/catalog/metastore/AWSGlueDataCatalogHiveClientFactory.java)?

You should find the library in the EMR cluster and add it to your classpath (maybe $KYLIN_HOME/lib). If you cannot find the right jar, you can package it manually; the repo should be https://github.com/awslabs/aws-glue-data-catalog-client-for-apache-hive-metastore. You may also consider asking EMR customer service for help.

--
Best wishes to you!
From: Xiaoxiang Yu

At 2019-07-16 10:40:10, "Krishna Bandaru" wrote:

hi
I created Kylin cluster with HA (3 masters and 2 cores)

java.lang.RuntimeException: java.io.IOException: MetaException(message:Unable to instantiate a metastore client factory com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory due to: java.lang.ClassNotFoundException: Class com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory not found)
    at org.apache.kylin.source.hive.HiveMRInput$HiveTableInputFormat.configureJob(HiveMRInput.java:97)
    at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.setupMapper(FactDistinctColumnsJob.java:122)
    at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:100)
    at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:131)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:163)
    at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:69)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:163)
    at org.apache.kylin.job.impl.threadpool.DistributedScheduler$JobRunner.run(DistributedScheduler.java:111)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: MetaException(message:Unable to instantiate a metastore client factory com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory due to: java.lang.ClassNotFoundException: Class com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory not found)
    at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:97)
    at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:51)
    at org.apache.kylin.source.hive.HiveMRInput$HiveTableInputFormat.configureJob(HiveMRInput.java:94)
    ... 10 more
Caused by: MetaException(message:Unable to instantiate a metastore client factory com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory due to: java.lang.ClassNotFoundException: Class com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory not found)
    at org.apache.hadoop.hive.ql.metadata.HiveUtils.createMetaStoreClientFactory(HiveUtils.java:525)
    at org.apache.hadoop.hive.ql.metadata.HiveUtils.createMetaStoreClient(HiveUtils.java:506)
    at org.apache.hive.hcatalog.common.HiveClientCache.getNonCachedHiveMetastoreClient(HiveClientCache.java:99)
    at org.apache.hive.hcatalog.common.HiveClientCache$5.call(HiveClientCache.java:318)
    at org.apache.hive.hcatalog.common.HiveClientCache$5.call(HiveClientCache.java:315)
    at com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4791)
    at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3584)
    at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2372)
    at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2335)
    at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2250)
    at com.google.common.cache.LocalCache.get(LocalCache.java:3985)
    at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4788)
    at org.apache.hive.hcatalog.common.HiveClientCache.getOrCreate(HiveClientCache.java:315)
    at org.apache.hive.hcatalog.common.HiveClientCache.get(HiveClientCache.java:277)
    at org.apache.hive.hcatalog.common.HCatUtil.getHiveMetastoreClient(HCatUtil.java:558)
    at org.apache.hive.hcatalog.mapreduce.InitializeInput.getInputJobInfo(InitializeInput.java:104)
    at org.apache.hive.hcatalog.mapreduce.InitializeInput.setInput(InitializeInput.java:88)
    at org.apache.hive.hcatalog.mapreduce.HCatInputForma
Re: regarding glue metastore
Hi Krishna,

I guess that you set EMR to use the AWS Glue catalog as the Hive metastore, and Kylin is missing the AWS library com.amazonaws.glue on its classpath.

You should find the library in the EMR cluster and add it to your classpath (maybe $KYLIN_HOME/lib). If you cannot find the right jar, you can package it manually; the repo should be https://github.com/awslabs/aws-glue-data-catalog-client-for-apache-hive-metastore. You may also consider asking EMR customer service for help.

==
Xiaoxiang Yu
Best wishes to you!

> On July 16, 2019, at 10:40, Krishna Bandaru wrote:
>
> hi I created Kylin cluster with HA (3 masters and 2 cores)
>
> java.lang.RuntimeException: java.io.IOException: MetaException(message:Unable to instantiate a metastore client factory com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory due to: java.lang.ClassNotFoundException: Class com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory not found)
>     at org.apache.kylin.source.hive.HiveMRInput$HiveTableInputFormat.configureJob(HiveMRInput.java:97)
>     at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.setupMapper(FactDistinctColumnsJob.java:122)
>     at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:100)
>     at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:131)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:163)
>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:69)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:163)
>     at org.apache.kylin.job.impl.threadpool.DistributedScheduler$JobRunner.run(DistributedScheduler.java:111)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: MetaException(message:Unable to instantiate a metastore client factory com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory due to: java.lang.ClassNotFoundException: Class com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory not found)
>     at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:97)
>     at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:51)
>     at org.apache.kylin.source.hive.HiveMRInput$HiveTableInputFormat.configureJob(HiveMRInput.java:94)
>     ... 10 more
> Caused by: MetaException(message:Unable to instantiate a metastore client factory com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory due to: java.lang.ClassNotFoundException: Class com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory not found)
>     at org.apache.hadoop.hive.ql.metadata.HiveUtils.createMetaStoreClientFactory(HiveUtils.java:525)
>     at org.apache.hadoop.hive.ql.metadata.HiveUtils.createMetaStoreClient(HiveUtils.java:506)
>     at org.apache.hive.hcatalog.common.HiveClientCache.getNonCachedHiveMetastoreClient(HiveClientCache.java:99)
>     at org.apache.hive.hcatalog.common.HiveClientCache$5.call(HiveClientCache.java:318)
>     at org.apache.hive.hcatalog.common.HiveClientCache$5.call(HiveClientCache.java:315)
>     at com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4791)
>     at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3584)
>     at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2372)
>     at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2335)
>     at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2250)
>     at com.google.common.cache.LocalCache.get(LocalCache.java:3985)
>     at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4788)
>     at org.apache.hive.hcatalog.common.HiveClientCache.getOrCreate(HiveClientCache.java:315)
>     at org.apache.hive.hcatalog.common.HiveClientCache.get(HiveClientCache.java:277)
>     at org.apache.hive.hcatalog.common.HCatUtil.getHiveMetastoreClient(HCatUtil.java:558)
>     at org.apache.hive.hcatalog.mapreduce.InitializeInput.getInputJobInfo(InitializeInput.java:104)
>     at org.apache.hive.hcatalog.mapreduce.InitializeInput.setInput(InitializeInput.java:88)
Re: [VOTE] Release apache-kylin-2.6.3 (RC1)
+1
mvn test passed

--
Best wishes to you!
From: Xiaoxiang Yu

At 2019-07-01 15:51:58, "Wang rupeng" wrote:
>+1
>mvn test passed
>
>On 2019/7/1 2:16 PM, "Chao Long" wrote:
>
>+1
>mvn test passed
>
>On Mon, Jul 1, 2019 at 2:09 PM Cheng wang wrote:
>
>> +1(binding)
>>
>> Best regards,
>> Cheng Wang
>>
>>
>> On 7/1/19, 9:27 AM, "ShaoFeng Shi" wrote:
>>
>> Hi all,
>>
>> I have created a build for Apache Kylin 2.6.3, release candidate 1.
>>
>> Changes highlights:
>> - [KYLIN-4024] - Support pushdown to Presto
>> - [KYLIN-3977] - Avoid mistaken deleting dicts by storage cleanup while
>>   building jobs are running
>> - [KYLIN-4015] - Fix build cube error at the "Build UHC Dictionary" step
>> - [KYLIN-4022] - Error with message "Unrecognized column type:
>>   DECIMAL(xx,xx)" happens when doing query pushdown
>>
>> Thanks to everyone who has contributed to this release.
>> Here's release notes:
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316121&version=12345582
>>
>> The commit to being voted upon:
>> https://github.com/apache/kylin/commit/0d5f85b0a40c301134122de927204a0d17ad65fa
>>
>> Its hash is 0d5f85b0a40c301134122de927204a0d17ad65fa.
>>
>> The artifacts to be voted on are located here:
>> https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-2.6.3-rc1/
>>
>> The hash of the artifact is as follows:
>> apache-kylin-2.6.3-source-release.zip.sha256
>> 50d1cad423f1a15a5e25f1c3c68748c7ce10e0116fd67fa9e38c1470a11d389c
>>
>> A staged Maven repository is available for review at:
>> https://repository.apache.org/content/repositories/orgapachekylin-1063/
>>
>> Release artifacts are signed with the following key:
>> https://people.apache.org/keys/committer/shaofengshi.asc
>>
>> Please vote on releasing this package as Apache Kylin 2.6.3.
>>
>> The vote is open for the next 72 hours and passes if a majority of
>> at least three +1 PMC votes are cast.
>>
>> [ ] +1 Release this package as Apache Kylin 2.6.3
>> [ ] 0 I don't feel strongly about it, but I'm okay with the release
>> [ ] -1 Do not release this package because...
>>
>>
>> Here is my vote:
>>
>> +1 (binding)
>>
>> Best regards,
>>
>> Shaofeng Shi 史少锋
>> Apache Kylin PMC
>> Email: shaofeng...@apache.org
>>
>> Apache Kylin FAQ:
>> https://kylin.apache.org/docs/gettingstarted/faq.html
>> Join Kylin user mail group: user-subscr...@kylin.apache.org
>> Join Kylin dev mail group: dev-subscr...@kylin.apache.org
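Voters usually check the published checksum before casting a vote. The sketch below shows one way to compare a downloaded file against the SHA-256 value quoted in the vote mail. The function names are made up for this example, and the artifact is assumed to have been downloaded into the current directory under the name shown in the mail.

```shell
#!/usr/bin/env bash
# Print the SHA-256 hex digest of a file, using whichever tool is installed.
sha256_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | awk '{print $1}'
  else
    shasum -a 256 "$1" | awk '{print $1}'
  fi
}

# Return 0 when the file's digest equals the expected hex string.
verify_sha256() {
  [ "$(sha256_of "$1")" = "$2" ]
}

# File name and digest as quoted in the vote mail; the local path is an assumption.
artifact="apache-kylin-2.6.3-source-release.zip"
expected="50d1cad423f1a15a5e25f1c3c68748c7ce10e0116fd67fa9e38c1470a11d389c"

if [ -f "$artifact" ]; then
  if verify_sha256 "$artifact" "$expected"; then
    echo "checksum OK"
  else
    echo "checksum MISMATCH"
  fi
else
  echo "artifact not found: $artifact"
fi
```

A matching checksum only shows the download is intact. Checking the signature against the key linked in the mail is a separate step that this sketch does not cover.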
Re: Unsubscribe
Hi,

We are sad to hear that you are leaving. To unsubscribe, please send an email to dev-unsubscr...@kylin.apache.org. For more details, see http://www.apache.org/foundation/mailinglists.html and http://kylin.apache.org/community/ .

Best wishes,
Xiaoxiang Yu

On 2019/8/22 15:28, "徐时永" wrote:

Please help me unsubscribe. Thanks.
Re: Failed to read big resource /dict/xxxx
e that marker. Since this is a broken metadata entry, deleting it will do no harm. After the deletion, the following rebuild job will succeed.

Here are some related reports:
1. http://apache-kylin.74782.x6.nabble.com/How-to-repair-the-cube-that-it-lost-someone-dictionary-td12989.html
2. http://mail-archives.apache.org/mod_mbox/kylin-user/201908.mbox/%3c4bcca64e.4af8.16cdb473a62.coremail.itzhangqi...@163.com%3e

I think we should fix this in the next release by deleting the broken metadata entry when one is found. I also want to thank the issue's reporters for their patience and assistance. If anyone finds a mistake, please let me know. Thank you.

----
Best wishes,
Xiaoxiang Yu

From: Johnson
Reply-To: "u...@kylin.apache.org"
Date: Thursday, August 29, 2019 21:44
To: Wang rupeng
Cc: "u...@kylin.apache.org"
Subject: Re: Failed to read big resource /dict/

1. Kylin version is 2.6.2. Looking through the mailing list, another user hit this problem before, also on version 2.6.2.
2. The dict named in the error does not exist. This cube has been running stably for months, and permissions on all components are fine.
3. I found nothing suspicious in the Kylin log.
4. I then repeatedly test-built the cube while watching the Kylin metadata directory on HDFS, and found that the dict file for the failing dimension was never generated. The error message was the same, so I suspect a metadata problem.
5. I cleaned up storage (including some metadata) with the Kylin CLI tool and rebuilt; the error was the same.
6. For the Hive table used by the failing cube, I created a Hive view (under a new name), loaded it into Kylin, and created an identical cube. Building this new cube raised no error. I suspect a problem with the metadata Kylin built earlier for that Hive table.

itzhangqiang
Email: itzhangqi...@163.com

On 08/29/2019 19:38, Wang rupeng <wangrup...@live.cn> wrote:

Hi,
When this error occurs, you can use "$KYLIN_HOME/bin/metastore.sh fetch /dict" to download the dictionaries locally and check whether the dictionary file exists. You can also check your HDFS permissions. Otherwise, please show us more information about your situation, such as your working scenario.

---
Best wishes,
Rupeng Wang

From: Johnson <itzhangqi...@163.com>
Reply-To: "u...@kylin.apache.org" <u...@kylin.apache.org>
Date: Thursday, August 29, 2019 10:49
To: "u...@kylin.apache.org" <u...@kylin.apache.org>
Subject: Failed to read big resource /dict/

Hi everyone,

Today I found a failed job. It failed at #4 Step Name: Build Dimension Dictionary, with the error message below. I then dropped the job and rebuilt, but it keeps reporting the same error. Does anyone know how to solve this?

org.apache.kylin.engine.mr.exception.HadoopShellException: java.lang.RuntimeException: java.io.IOException: Failed to read big resource /dict/KYLIN_VIEW./COUNTRY/66292068-e8eb-975a-3e44-b56c933c14cc.dict
    at org.apache.kylin.dict.DictionaryManager.getDictionaryInfo(DictionaryManager.java:108)
    at org.apache.kylin.dict.DictionaryManager.checkDupByContent(DictionaryManager.java:173)
    at org.apache.kylin.dict.DictionaryManager.trySaveNewDict(DictionaryManager.java:151)
    at org.apache.kylin.dict.DictionaryManager.saveDictionary(DictionaryManager.java:320)
    at org.apache.kylin.cube.CubeManager$DictionaryAssist.saveDictionary(CubeManager.java:1127)
    at org.apache.kylin.cube.CubeManager.saveDictionary(CubeManager.java:1089)
    at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:74)
    at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:55)
    at org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:73)
    at org.apache.kylin.engine.mr.MRUtil.runMRJob(MRUtil.java:93)
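The check suggested in this thread (fetch the dictionaries, then see whether the file named in the error is really there) could be scripted roughly as below. This is only a sketch. `dict_exists` is a hypothetical helper, the default resource path and local directory are placeholders rather than real values, and the directory that `metastore.sh fetch` writes to should be taken from that command's own output instead of assumed.

```shell
#!/usr/bin/env bash
# Return 0 when the dictionary resource exists under a locally fetched
# metadata directory. Both arguments are supplied by the caller.
# (Hypothetical helper, written for this illustration only.)
dict_exists() {
  local meta_dir="$1" resource="$2"
  [ -f "${meta_dir}${resource}" ]
}

# Placeholders: pass the resource path from your own error message and the
# local directory that the fetch step reported.
resource="${1:-/dict/EXAMPLE_DB.EXAMPLE_TABLE/EXAMPLE_COLUMN/example.dict}"
meta_dir="${2:-./meta_backup}"

# Only call Kylin's tool when it is actually installed on this machine.
if [ -n "${KYLIN_HOME:-}" ] && [ -x "$KYLIN_HOME/bin/metastore.sh" ]; then
  "$KYLIN_HOME/bin/metastore.sh" backup        # keep a full copy before touching anything
  "$KYLIN_HOME/bin/metastore.sh" fetch /dict   # the command quoted in the thread
fi

if dict_exists "$meta_dir" "$resource"; then
  echo "dictionary present: $resource"
else
  echo "dictionary missing: $resource"
fi
```

Taking the backup first means a later deletion of a broken entry can be undone if it turns out to be the wrong one.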
Re: Unable to create cube in Spark Mode -Apache Kylin on Cloudera
Hi,

As far as I can see, Kylin should be installed on a separate gateway node in the CDH cluster. On a gateway node, you should not install any long-running process/component such as DataNode/RegionServer/NodeManager. Instead, the node carries gateway roles that give you access to HDFS/HBase/Yarn/Hive etc. The following screenshot shows a gateway node in my test env.

[image: image001.png]

Best wishes,
Xiaoxiang Yu

From: Wang rupeng
Reply-To: "u...@kylin.apache.org"
Date: Friday, August 30, 2019 15:31
To: Gourav Gupta, "u...@kylin.apache.org"
Subject: Re: Unable to create cube in Spark Mode -Apache Kylin on Cloudera

Hi Gupta,

You can change the Kylin port with the following command; the new port is 7070 plus the number you set:
$KYLIN_HOME/bin/kylin-port-replace-util.sh set
If the Kylin web UI cannot be opened, check the Kylin log at $KYLIN_HOME/logs/kylin.log for more details.

Here are some suggestions for your questions:
1. You need to set the environment variable SPARK_HOME=/local/path/to/spark so that Kylin can start successfully, even if you don't use Spark to build cubes. You had better use the suggested version of Spark (spark-2.3.2), which you can download with ./$KYLIN_HOME/bin/down-spark.sh .
2. The CDH versions Kylin supports are cdh5.7+, cdh6.0 and cdh6.1, and you don't have to care about the HBase version if you are using CDH. If you are using cdh5.16, you can download apache-kylin--bin-cdh57.tar.gz from http://kylin.apache.org/download/
3. You don't have to install Kylin on the master node; any other node in the cluster would be OK.

---
Best wishes,
Rupeng Wang

From: Gourav Gupta
Date: Friday, August 30, 2019 02:03
To: Wang rupeng
Cc: "dev@kylin.apache.org"
Subject: Re: Unable to create cube in Spark Mode -Apache Kylin on Cloudera

Thanks a lot Wang for the prompt and helpful reply.

Today I removed the old version of Kylin and successfully installed Apache Kylin 2.6 for CDH, but now we are unable to open the Kylin web UI. I changed port number 7070 to another number in server.xml (Tomcat directory), but I still face the same issue.

I have some doubts about configuring Kylin, listed below:
1. Do I have to give the path of the Spark master node, or the path of the Spark that comes with Kylin?
2. Which tar file is suitable for Cloudera 5.16? What is the need for the Kylin-HBase version?
3. Do I need to install and configure Kylin on the master node? Will an installation on the edge node work?

We are trying to switch the visualization layer from a SQL(OLAP) - PowerBI pipeline to KYLIN-Mean Stack (Open Source/Enterprise version), so your help is much appreciated.

I am waiting for your positive response.

Regards,
Gourav Gupta

On Thu, Aug 29, 2019 at 5:43 PM Wang rupeng <wangrup...@live.cn> wrote:

Hi,

The problem seems to be the following:
"60505 [dispatcher-event-loop-6] ERROR org.apache.spark.scheduler.cluster.YarnScheduler - Lost executor 1 on *: Container marked as failed:"
This usually appears when Yarn does not have enough memory, so the Yarn container is closed for lack of memory. You can go to the Yarn resource manager web page and look at the Yarn log for more details.
If it is a memory issue, you can try to allocate more memory to the Spark Yarn executor by changing the following configuration item in "$KYLIN_HOME/conf/kylin.properties":
kylin.engine.spark-conf.spark.yarn.executor.memoryOverhead=384

---
Best wishes,
Rupeng Wang

On 2019/8/29 14:57, "Gourav Gupta" <techgouravgu...@gmail.com> wrote:

Hi Sir,

I have installed and configured Apache Kylin 2.4 on the Cloudera platform for creating cubes. I have been able to create a cube in MapReduce mode, but I get the error below when it runs in Spark mode. I have followed all the steps and tried many remedies to debug the problem. Please let me know how to resolve this bug. Thanks in advance.
1091 [main] ERROR org.apache.spark.SparkContext - Error adding jar (java.lang.IllegalArgumentException: requirement failed: JAR kylin-job-2.4.0.jar already registered.), was the --addJars option used? [Stage 0:> (0 + 0) / 2] [Stage 0:> (0 + 2) / 2] 60505 [dispatcher-event-loop-6] ERROR org.apache.spark.scheduler.cluster.YarnScheduler - Lost executor 1 on ** ***: Container marked as failed: container_e62_1566915974858_6628_01_03 on host: ***. Exit status: 50. Diagnostics: Exception from container-launch. Container id: container_e62_1566915974858_6628_01_03 Exit code: 50 Stack trace: ExitCodeException exitCode=50: at org.apache.hadoop.util.Shell.runCommand(Sh
Re: Unable to create cube in Spark Mode -Apache Kylin on Cloudera
Dear Gourav,

Thank you for your update.

Best wishes,
Xiaoxiang Yu

From: Gourav Gupta
Date: Wednesday, September 4, 2019 00:09
To: Xiaoxiang Yu, Wang rupeng
Cc: "dev@kylin.apache.org"
Subject: Re: Unable to create cube in Spark Mode -Apache Kylin on Cloudera

Dear Xiaoxiang,

Thanks for the helpful reply.

I have resolved all the issues and am now able to create a cube in MapReduce mode. The last problem, "FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask", was resolved when I set "hive.auto.convert.join = false" in kylin-hive-site.xml.

Thanks for the support; I appreciate the quick responses from you and the Kylin team. I will ask for your help again if I face any other issue when building a cube in Spark mode.

Best Regards,
Gourav Gupta

On Sun, Sep 1, 2019 at 10:54 AM Xiaoxiang Yu <xiaoxiang...@kyligence.io> wrote:

Hi friend,

I am glad to hear that you have resolved some of the problems after a lot of effort, and it is very kind of you to share what you found about kylin-port-replace-util.sh with us.

It seems that you have hit another problem in the first step of your cube build, which uses Hive to create a flat table. As far as I can see, the message you provided, "FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask", indicates that your Hive is NOT configured correctly: your Hive command runs in local mode rather than Yarn mode. That is strange. Is the node you chose for deploying Kylin configured correctly? Maybe you should ask your Hadoop administrator for help. Or could you please provide more details about how you deployed Kylin?

If you are using Kylin for the first time and are familiar with Docker, maybe you can run a Docker container to get a technical preview. Please refer to http://kylin.apache.org/docs/install/kylin_docker.html.
-------- Best wishes, Xiaoxiang Yu 发件人: Gourav Gupta mailto:techgouravgu...@gmail.com>> 日期: 2019年9月1日 星期日 01:24 收件人: Wang rupeng mailto:wangrup...@live.cn>>, Xiaoxiang Yu mailto:xiaoxiang...@kyligence.io>>, "dev@kylin.apache.org<mailto:dev@kylin.apache.org>" mailto:dev@kylin.apache.org>> 主题: Re: Unable to create cube in Spark Mode -Apache Kylin on Cloudera Dear Wang and Xiaoxiang, Thanks for providing the suggestions and solutions for all those queries which I had mentioned in the previous trailing mail. Truly appreciated!!! As the answers have been received from you, I did the port number amendment in "./$KYLIN_HOME/bin/Kylin-port-replace-util.sh set", but still thereafter I was facing with the same issue. After doing hours of brainstorming, I was able to resolve the aforesaid issue(Not able to access Kylin UI), Actually, one of the java application was running on 9009 port no. and we also know that Kylin takes 3 ports 7070,9009 & 7443. Was able to access the Kylin Web UI while I stopped the already running script on 9009. At this time I am facing with one caveat i.e "FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask" when I am going to create a cube in Map-Reduce mode. I googled the same and did the amendment( Kylin and Hive property) as per the solution I got over the shared link(https://stackoverflow.com/questions/22977790/hive-query-execution-error-return-code-3-from-mapredlocaltask) but still, I am not able to resolve. Please let me know is there any way of resolving this issue. Attaching the screenshot of the error. Thanks in advance. Best Regards, Gourav Gupta On Sat, Aug 31, 2019 at 10:49 PM Gourav Gupta mailto:techgouravgu...@gmail.com>> wrote: Dear Wang and Xiaoxiang, Thanks for providing the suggestions and solutions for all those queries which I had mentioned in the previous trailing mail. Truly appreciated!!! 
As the answers have been received from you, I did the port number amendment in "./$KYLIN_HOME/bin/Kylin-port-replace-util.sh set", but still thereafter I was facing with the same issue. After doing hours of brainstorming, I was able to resolve the aforesaid issue(Not able to access Kylin UI), Actually, one of the java application was running on 9009 port no. and we also know that Kylin takes 3 ports 7070,9009 & 7443. Was able to access the Kylin Web UI while I stopped the already running script on 9009. At this time I am facing with one caveat i.e "FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask" when I am going to create a cube in Map-Reduce mode. I googled the same and did the amendment( Kylin and Hive property) as per the solution I got over the shared link(https://stackoverflow.com/questions/22977790/hive-query-execution-error-return-code-3-from-mapredlocaltask) but still, I am not able to resolve. Please let me know
Re: Unable to create cube in Spark Mode -Apache Kylin on Cloudera
Hi friend,

I am glad to hear that you have resolved some of the problems after a lot of effort, and it is very kind of you to share what you found about kylin-port-replace-util.sh with us.

It seems that you have hit another problem in the first step of your cube build, which uses Hive to create a flat table. As far as I can see, the message you provided, "FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask", indicates that your Hive is NOT configured correctly: your Hive command runs in local mode rather than Yarn mode. That is strange. Is the node you chose for deploying Kylin configured correctly? Maybe you should ask your Hadoop administrator for help. Or could you please provide more details about how you deployed Kylin?

If you are using Kylin for the first time and are familiar with Docker, maybe you can run a Docker container to get a technical preview. Please refer to http://kylin.apache.org/docs/install/kylin_docker.html.

Best wishes,
Xiaoxiang Yu

From: Gourav Gupta
Date: Sunday, September 1, 2019 01:24
To: Wang rupeng, Xiaoxiang Yu, "dev@kylin.apache.org"
Subject: Re: Unable to create cube in Spark Mode -Apache Kylin on Cloudera

Dear Wang and Xiaoxiang,

Thanks for providing suggestions and solutions for all the queries I mentioned in the previous mail. Truly appreciated!!!

Following your answers, I changed the port number with "./$KYLIN_HOME/bin/Kylin-port-replace-util.sh set", but I was still facing the same issue.

After hours of brainstorming, I was able to resolve that issue (not being able to access the Kylin UI). A Java application was already running on port 9009, and Kylin takes 3 ports: 7070, 9009 and 7443. I was able to access the Kylin web UI once I stopped the script already running on 9009.
I am now facing another problem: "FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask" when I try to create a cube in MapReduce mode. I searched for the error and changed the Kylin and Hive properties as per the solution at the shared link (https://stackoverflow.com/questions/22977790/hive-query-execution-error-return-code-3-from-mapredlocaltask), but I am still not able to resolve it. Please let me know whether there is any way of resolving this issue. I am attaching a screenshot of the error. Thanks in advance.
Best Regards, Gourav Gupta

On Fri, Aug 30, 2019 at 1:00 PM Wang rupeng <wangrup...@live.cn> wrote:

Hi Gupta, You can change the Kylin port with the following command; the new port is 7070 plus the number you set: ./$KYLIN_HOME/bin/kylin-port-replace-util.sh set If the Kylin web UI cannot be opened, check the Kylin log at $KYLIN_HOME/logs/kylin.log for more details. Here are some suggestions for your questions:
1. You need to add the environment variable SPARK_HOME=/local/path/to/spark so that Kylin can start successfully, even if you don't use Spark to build cubes. You had better use the suggested version of Spark (spark-2.3.2); you can download it with ./$KYLIN_HOME/bin/download-spark.sh.
2. The CDH versions supported by Kylin are cdh5.7+, cdh6.0 and cdh6.1, and you don't have to care about the HBase version if you are using CDH. If you are using cdh5.16, you can download apache-kylin--bin-cdh57.tar.gz from http://kylin.apache.org/download/
3. You don't have to install kylin on master node, an
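The port conflict described in this thread (another process already listening on 9009) can be checked before starting Kylin. The following is only a minimal sketch: the three default ports are the ones named above, while the helper names `shifted_ports` and `port_in_use`, and the assumption that the offset passed to kylin-port-replace-util.sh is added to each of the three ports, are illustrative and not taken from Kylin itself.

```python
import socket

# Default ports named in the thread. Assumption: the offset given to
# kylin-port-replace-util.sh is added to each of them.
DEFAULT_PORTS = (7070, 9009, 7443)


def shifted_ports(offset):
    """Return the ports Kylin would use after applying the offset."""
    return [port + offset for port in DEFAULT_PORTS]


def port_in_use(port, host="127.0.0.1", timeout=0.5):
    """Return True if something already accepts TCP connections on the port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 when the connection succeeds.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    for port in shifted_ports(0):
        state = "in use" if port_in_use(port) else "free"
        print(f"{port}: {state}")
```

If any of the ports is reported as in use while Kylin is stopped, either stop the other process or pick an offset whose shifted ports are all free.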
Re: How to migrate model/cube metadat across cluster
Hi Lionel,

Sorry for my misunderstanding. You are right; in your situation, I think using metastore.sh backup is the better way.

Best wishes, Xiaoxiang Yu

On 2019/9/19 10:42, "lionel@oocl.com" wrote:

Hi, "Note that the different Kylin environments should share the same Hadoop cluster, including HDFS, HBase and HIVE." The doc says that the environments should share the same Hadoop cluster, including HDFS, HBase and HIVE, right? Thanks & Regards Lionel

-----Original Message----- From: Xiaoxiang Yu Sent: Thursday, September 19, 2019 10:32 AM To: LIONEL TAO (OPS-IRIS-ISD-OOCLL/ZHA); dev@kylin.apache.org Subject: Re: How to migrate model/cube metadat across cluster

Hi, As the doc says, it can migrate a cube between different clusters. If you have a Kylin instance in the QA cluster (qa_node:7070) and you want to migrate to a PROD cluster (prod_node:7070), you can use qa_node:7070 as srcKylinConfigUri and prod_node:7070 as dstKylinConfigUri, and thus migrate across clusters. Best wishes, Xiaoxiang Yu

On 2019/9/19 10:21, "lionel@oocl.com" wrote:

Hi Xiaoxiang, "CubeMigrationCLI.java can migrate a cube from a Kylin environment to another, for example, promote a well tested cube from the testing env to production env. Note that the different Kylin environments should share the same Hadoop cluster, including HDFS, HBase and HIVE." Per the document, it cannot migrate across clusters? We are looking for a solution to do this, and QA and PROD are in different clusters. Can you give more advice? Thanks & Regards Lionel

-----Original Message----- From: Xiaoxiang Yu Sent: Thursday, September 19, 2019 10:07 AM To: dev@kylin.apache.org; LIONEL TAO (OPS-IRIS-ISD-OOCLL/ZHA) Subject: Re: How to migrate model/cube metadat across cluster

Hi Lionel, It is good practice to do a full test in the test env and then migrate to another env. I think you can use CubeMigrationCLI to meet your request.
Please check http://kylin.apache.org/docs/howto/howto_use_cli.html for more detail. Best wishes, Xiaoxiang Yu

On 2019/9/19 01:12, "lionel@oocl.com" wrote:

Hi Kylin dev, We are now trying Kylin for our OLAP system and have some questions to clarify with you:
1. We see that the CLI cannot be used to migrate data from the QA environment to PROD if they are in different clusters. What is the limitation when migrating across clusters, and how can we do the migration?
2. We have tried metastore.sh backup and then restore in another cluster. Will this be a solution for migration? Is there any impact if we use this solution?
Thanks & Regards Lionel

Disclaimer: This email and all contents are subject to the following disclaimer: http://emaildisclaimer.oocl.com/default.html
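The backup-and-restore route discussed in this thread can be sketched as two commands, one run on the source (QA) cluster and one on the target (PROD) cluster. This is a minimal illustration, assuming the `backup` and `restore <directory>` subcommands of `bin/metastore.sh`; the helper name `metastore_commands` and the example paths are hypothetical, and the commands are only assembled here, not executed.

```python
import posixpath


def metastore_commands(kylin_home, backup_dir):
    """Build the metastore.sh command lines for a backup and a restore.

    kylin_home and backup_dir are illustrative values supplied by the caller.
    """
    script = posixpath.join(kylin_home, "bin", "metastore.sh")
    backup = [script, "backup"]                 # run on the source cluster
    restore = [script, "restore", backup_dir]   # run on the target cluster
    return backup, restore


if __name__ == "__main__":
    backup, restore = metastore_commands("/opt/kylin", "/tmp/meta_backup")
    print(" ".join(backup))
    print(" ".join(restore))
```

The backup directory has to be copied between clusters by some other means, and it is worth checking how a restore affects metadata that already exists on the target before running it there.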
Re: [jira] [Created] (KYLIN-4125) Kylin upgraded from springmvc architecture to spring boot architecture
Hi zjt, Glad to hear your suggestion. Best wishes, Xiaoxiang Yu

On 2019/8/6 14:36, "zjt" wrote:

Hi Team: Next, I will do the following to upgrade Kylin from the Spring MVC architecture to the Spring Boot architecture.
1. Modify the project's pom.xml to import the Spring Boot dependency.
2. Upgrade kylin-server from Spring MVC to Spring Boot 1.5.5, and modify the kylin-server code to support Spring Boot.
3. Modify the build script to package Kylin into a war package and run it in external Tomcat mode. This is the same as the current deployment.
I have implemented part of these features and conducted a simple test. I expect to submit my code to the "KYLIN-4125" branch this week.

Forwarded message: From: "zhao jintao (JIRA)" Date: 2019-08-06 11:53:00 To: dev@kylin.apache.org Subject: [jira] [Created] (KYLIN-4125) Kylin upgraded from springmvc architecture to spring boot architecture

zhao jintao created KYLIN-4125:
Summary: Kylin upgraded from springmvc architecture to spring boot architecture
Key: KYLIN-4125
URL: https://issues.apache.org/jira/browse/KYLIN-4125
Project: Kylin
Issue Type: Improvement
Components: REST Service
Reporter: zhao jintao
Assignee: zhao jintao

Hi Team: Kylin is based on the Spring MVC architecture, but the Spring MVC configuration is complicated, and integrating new components is cumbersome. The mainstream of the industry is now based on the Spring Boot architecture. Spring Boot can be configured automatically, which reduces the complexity of project integration and promotes the expansion and implementation of a microservice architecture. More and more projects have been upgraded from Spring MVC to Spring Boot. Kylin can also be upgraded from the Spring MVC architecture to the Spring Boot architecture. Do you have any suggestions?

-- This message was sent by Atlassian JIRA (v7.6.14#76016)
Re:Using kylin in simplified way
Hi Asim, I have heard that a lot of Kylin users have the same wish (to remove the Hadoop components). The Kylin community has already implemented RDBMS as a metadata store (the default is HBase). The next plan is to add Parquet/Druid as the storage layer (to replace HBase) and to use Spark instead of MapReduce. A newer version of Kylin deployed in the cloud may depend only on MySQL (metadata), Spark/Parquet (as storage layer and compute engine) and S3 as distributed storage. Currently there are the branches https://github.com/apache/kylin/tree/kylin-on-parquet and https://github.com/apache/kylin/tree/kylin-on-druid, but these features have some limitations and may not be mature enough. We may need to wait for further development work. Best wishes to you! From: Xiaoxiang Yu At 2019-07-25 11:29:22, "Asim Ali" wrote: >Hi All, >I tried using Kylin on Hadoop environment, but overhead of hadoop is too >much >for our medium scale need. >Is there any way we can use kylin Olap engine with minimal requirements of >underlaying storage layer. >What are the best practices and architecture to support this, where we >possibly can use it without hadoop components. >Thanks > >Asim Ali >*Software Developer* > >Email: a...@easyemployer.com > >Phone: 1300 855 642 <1300855642> >Website: www.easyemployer.com > > >On Fri, 21 Jun 2019 at 15:22, 敏丞 wrote: > >> Hi, >> After check cube.json provided by you, I can reproduce this error in >> my development env. Looks like this kind exception occurs when you have a >> cube which *have both Raw measure and CountDistinct(Bitmap) on the same >> column*. I find the reason should be the Raw Measure choose the wrong >> dictionary (AppendTrieDictionary cannot used to decode). Maybe you should >> try use two cubes in this situation. 
>> And if you don't mind, I have a question, have you ever use this type >> of query "select * from FACT_TABLE" in old version of kylin in such kind of >> cube(raw measure and count_distinct both on the same column) and get >> correct result successfully? >> >> >> If you have find anything wrong or other information, please let me >> know. Thank you. >> >> >> >> >> *-* >> *-* >> *Best wishes to you ! * >> *From :**Xiaoxiang Yu* >> >> >> At 2019-06-20 16:00:10, "greatelvisw...@gmail.com" >> wrote: >> >hi,all: >> > >> > I got an error like "AppendTrieDictionary can't retrieve value from id" >> > while query the cube data, the following is the cube info and exception >> > info. >> > >> >I found the same error in this >> >thread(https://lists.apache.org/thread.html/63981bc08ef7b97c41921ed692de79ef9a744f6329192e5199074ba3@%3Cdev.kylin.apache.org%3E), >> > but I just use the bitmaps (count distinct) as measure, and never use it >> >as dimension. >> > >> >So please help me to resolve it. >> > >> >cube data: >> >{ >> > "uuid": "eb0b4a32-fbc0-b197-b3f0-4c9cd5fb3916", >> > "last_modified": 1561014544528, >> > "version": "2.6.2.0", >> > "name": "dev_cube_user_currency", >> > "is_draft": false, >> > "model_name": "user_currency", >> > "description": "", >> > "null_string": null, >> > "dimensions": [ >> >{ >> > "name": "TYPE", >> > "table": "DEV_DWD_USER_CURRENCY", >> > "column": "TYPE", >> > "derived": null >> >}, >> >{ >> > "name": "SUB_TYPE", >> > "table": "DEV_DWD_USER_CURRENCY", >> > "column": "SUB_TYPE", >> > "derived": null >> >}, >> >{ >> > "name": "SOURCE_TYPE", >> > "table": "DEV_DWD_USER_CURRENCY", >> > "column": "SOURCE_TYPE", >> > "derived": null >> >}, >> >{ >> > "name": "SOURCE", >> > "table": "DEV_DWD_USER_CURRENCY", >> > "column": "SOURCE", >> > "derived": null >> >}, >> >{ >&g
Re: How to migrate model/cube metadat across cluster
Hi Lionel, It is good practice to do a full test in the test env and then migrate to another env. I think you can use CubeMigrationCLI to meet your request. Please check http://kylin.apache.org/docs/howto/howto_use_cli.html for more detail. Best wishes, Xiaoxiang Yu

On 2019/9/19 01:12, "lionel@oocl.com" wrote:

Hi Kylin dev, We are now trying Kylin for our OLAP system and have some questions to clarify with you:
1. We see that the CLI cannot be used to migrate data from the QA environment to PROD if they are in different clusters. What is the limitation when migrating across clusters, and how can we do the migration?
2. We have tried metastore.sh backup and then restore in another cluster. Will this be a solution for migration? Is there any impact if we use this solution?
Thanks & Regards Lionel

Disclaimer: This email and all contents are subject to the following disclaimer: http://emaildisclaimer.oocl.com/default.html
Re: [VOTE] Release apache-kylin-3.0.0-beta (RC1)
+1
mvn test passed

Best wishes, Xiaoxiang Yu

On 2019/9/27 10:03, "Wang rupeng" wrote:

+1

Best wishes, Rupeng Wang

On 2019/9/27 09:45, "Chao Long" wrote:

+1
mvn test passed

On Thu, Sep 26, 2019 at 8:44 PM Yaqian Zhang wrote:

+1
mvn test passed

On Sep 26, 2019, at 20:10, nichunen wrote:

+1

Best regards, Ni Chunen / George

On 09/26/2019 16:41, ShaoFeng Shi wrote:

Hi all,

I have created a build for Apache Kylin 3.0.0-beta, release candidate 1.

Changes highlights:
[KYLIN-4122] - Add Kylin user and group management modules
[KYLIN-4167] - Refactor streaming coordinator
[KYLIN-4114] - Provided a self-contained docker image for Kylin
[KYLIN-4137] - Accelerate metadata reloading

Thanks to everyone who has contributed to this release. Here are the release notes:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316121&version=12345686

The commit to be voted upon:
https://github.com/apache/kylin/commit/721be80866223fecad9a6231fa2427a847bc8f48

Its hash is 721be80866223fecad9a6231fa2427a847bc8f48.

The artifacts to be voted on, including the source package and the pre-compiled binary packages, are located here:
https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-3.0.0-beta-rc1/

The hashes of the artifacts are as follows:
apache-kylin-3.0.0-beta-source-release.zip.sha256
53547e8a94eb74cdcd329777ff03f1c79209020016c2f9a62351e8c73ac8e0bd
apache-kylin-3.0.0-beta-bin-hbase1x.tar.gz.sha256
1d50348660899baa9005b78cf45243e0eb2495fa0403d6250b3439ff50bf1731
apache-kylin-3.0.0-beta-bin-cdh57.tar.gz.sha256
bc9e303154901d4061dbac3876157cb4be25f23307f4c709d083da70aa18524b
apache-kylin-3.0.0-beta-bin-hadoop3.tar.gz.sha256
681452450248f56ebe107d278e3ccb1478e42137875a2dded953db8c03488f9a
apache-kylin-3.0.0-beta-bin-cdh60.tar.gz.sha256
2f66497ed39d7d78ea5a634a8796ab408586dce369edc97ed9374ba90a88b03d

A staged Maven repository is available for review at:
https://repository.apache.org/content/repositories/orgapachekylin-1066/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/shaofengshi.asc

Please vote on releasing this package as Apache Kylin 3.0.0-beta.

The vote is open for the next 72 hours and passes if a majority of at least three +1 PMC votes are cast.

[ ] +1 Release this package as Apache Kylin 3.0.0-beta
[ ] 0 I don't feel strongly about it, but I'm okay with the release
[ ] -1 Do not release this package because...

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org
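Voters usually compare the published SHA-256 values with the hash of the file they actually downloaded. The following is a minimal sketch of that check using only the Python standard library; the helper names `sha256_of` and `matches` are illustrative and not part of any Kylin or Apache release tooling.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare a file's digest with a published value, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches("apache-kylin-3.0.0-beta-bin-cdh57.tar.gz", "bc9e3031...")` with the full value from the vote mail should return True for an intact download. A hash check does not replace verifying the signature against the release manager's key.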
Re: Install Apache Kylin in custom environment
Dear sir, Hadoop has a long history and is complex, so we recommend that you use a well-tested Hadoop distribution such as CDH or HDP rather than a custom Hadoop environment. If you are doing a PoC and want to learn Kylin quickly, please use the Docker image https://hub.docker.com/r/apachekylin/apache-kylin-standalone. If you want to use Kylin in a more formal Hadoop environment, could you please use a CDH 5.x or HDP 2.x Hadoop distribution? Best wishes, Xiaoxiang Yu

On 2019/9/27 11:24, "Ngọc Thiên Nguyễn" wrote:

Dear Sir, I'm trying to install Apache Kylin in a custom environment, but I ran into a problem: https://stackoverflow.com/questions/58126981/install-apache-kylin-in-custom-environment. Can you help me? Looking forward to hearing from you. Thanks & Regards
Re: Build cube by JDBC
Hi,

Sorry for my late reply. I have reproduced that error using Oracle 11g as the data source.

First, I met the error "java.sql.SQLSyntaxErrorException: ORA-00911: invalid character". This can be fixed by adding the configuration "kylin.source.hive.quote-enabled=false" to kylin.properties. After restarting the Kylin process, I met another error, "java.sql.SQLSyntaxErrorException: ORA-00933: SQL command not properly ended", and I found that this exception is caused by "AS" in the FROM clause. I will create a JIRA and fix it later; you may wait for the next release.

Following is my configuration:

kylin.source.default=8
kylin.source.jdbc.connection-url=jdbc:oracle:thin:@hdp30-qa:49161/XE
kylin.source.jdbc.driver=oracle.jdbc.driver.OracleDriver
kylin.source.jdbc.dialect=oracle
kylin.source.jdbc.user=system
kylin.source.jdbc.pass=oracle
kylin.source.jdbc.sqoop-home=/opt/cloudera/parcels/CDH/lib/sqoop
kylin.source.jdbc.field-delimiter=|
kylin.source.hive.quote-enabled=false

If you have any suggestions or find any mistakes, please let me know. Thank you very much.

Best wishes, Xiaoxiang Yu

On 2019/6/10 17:03, "高铭潮" wrote:

Hi all, When I build the cube through Oracle JDBC, an error occurs. This is the error message: java.io.IOException: OS command error exit with return code: 1, error message: Warning: /Users/gmc/Technology/sqoop/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail. Please set $HCAT_HOME to the root of your HCatalog installation. Warning: /Users/gmc/Technology/sqoop/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail. Please set $ACCUMULO_HOME to the root of your Accumulo installation. Warning: /Users/gmc/Technology/sqoop/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail. Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation. SLF4J: Class path contains multiple SLF4J bindings. 
SLF4J: Found binding in [jar:file:/Users/gmc/Technology/hadoop/hadoop-3.1.2/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/Users/gmc/Technology/hbase/hbase-2.0.5/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] 2019-06-10 15:57:06,965 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7 2019-06-10 15:57:06,996 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead. 2019-06-10 15:57:07,088 WARN sqoop.ConnFactory: Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time. 2019-06-10 15:57:07,106 INFO manager.SqlManager: Using default fetchSize of 1000 2019-06-10 15:57:07,106 INFO tool.CodeGenTool: Beginning code generation 2019-06-10 15:57:07,587 INFO manager.SqlManager: Executing SQL statement: SELECT `WFPROCESSINST`.`PROCESSINSTID` as `WFPROCESSINST_PROCESSINSTID` ,`WFPROCESSINST`.`PROCESSINSTNAME` as `WFPROCESSINST_PROCESSINSTNAME` ,`WFPROCESSINST`.`CREATOR` as `WFPROCESSINST_CREATOR` ,`WFPROCESSINST`.`OWNER` as `WFPROCESSINST_OWNER` ,`WFPROCESSINST`.`RELATEDATA` as `WFPROCESSINST_RELATEDATA` ,`WFPROCESSINST`.`STARTTIME` as `WFPROCESSINST_STARTTIME` ,`WFPROCESSINST`.`ENDTIME` as `WFPROCESSINST_ENDTIME` ,`WFPROCESSINST`.`FINALTIME` as `WFPROCESSINST_FINALTIME` ,`WFPROCESSINST`.`REMINDTIME` as `WFPROCESSINST_REMINDTIME` ,`WFPROCESSINST`.`CURRENTSTATE` as `WFPROCESSINST_CURRENTSTATE` ,`WFPROCESSINST`.`PARENTACTID` as `WFPROCESSINST_PARENTACTID` FROM `SIE_EMS`.`WFPROCESSINST` as `WFPROCESSINST` INNER JOIN `SIE_EMS`.`SMBP_PROCESSINSTBIZRELA` as 
`SMBP_PROCESSINSTBIZRELA` ON `WFPROCESSINST`.`PROCESSINSTID` = `SMBP_PROCESSINSTBIZRELA`.`PROCESSINSTID` INNER JOIN `SIE_EMS`.`WFWORKITEM` as `WFWORKITEM` ON `WFPROCESSINST`.`PROCESSINSTID` = `WFWORKITEM`.`PROCESSINSTID` WHERE 1=1 AND (`WFPROCESSINST`.CREATETIME >= '2016-01-01 00:00:00' AND `WFPROCESSINST`.CREATETIME < '2019-06-10 00:00:00') AND (1 = 0) 2019-06-10 15:57:07,623 ERROR manager.SqlManager: Error executing statement: java.sql.SQLSyntaxErrorException: ORA-00911: 无效字符 java.sql.SQLSyntaxErrorException: ORA-00911: 无效字符 at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:447) at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:396) at oracle.jdbc.driver.T4C8Oall.processError(T4C8Oall.java:951) at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:513) at oracle.
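The two Oracle errors in this thread both come from the text of the generated flat-table SQL: backtick-quoted identifiers (ORA-00911) and `AS` before a table alias (ORA-00933). The following is a rough, text-level sketch of those two rewrites so the effect is easy to see. It is not how Kylin or Sqoop implement it; the function names are hypothetical, and the regular expressions only handle simple statements like the one in the log.

```python
import re

# Matches a backtick-quoted identifier such as `WFPROCESSINST`.
_BACKTICKED = re.compile(r"`([^`]+)`")
# Matches "FROM <table> AS " or "JOIN <table> AS " (table aliases only).
_TABLE_ALIAS_AS = re.compile(r"\b(FROM|JOIN)\s+([\w.\"]+)\s+AS\s+", re.IGNORECASE)


def requote(sql, quote='"'):
    """Replace backtick quoting with another quote character ('' drops quoting)."""
    return _BACKTICKED.sub(lambda m: f"{quote}{m.group(1)}{quote}", sql)


def drop_table_alias_as(sql):
    """Remove AS before table aliases; column aliases keep their AS."""
    return _TABLE_ALIAS_AS.sub(r"\1 \2 ", sql)


if __name__ == "__main__":
    original = (
        "SELECT `T`.`A` as `T_A` FROM `S`.`T` as `T` "
        "INNER JOIN `S`.`U` as `U` ON `T`.`A` = `U`.`A`"
    )
    print(drop_table_alias_as(requote(original, "")))
```

Dropping the quotes altogether mirrors the effect described for kylin.source.hive.quote-enabled=false in the reply above, while passing `'"'` as the quote character produces standard double-quoted identifiers instead.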
Re:[DISCUSS] Support multiple pushdown query engines
Hi, +1. I agree with this proposal. Kylin should support multi-level pushdown: a query that cannot be matched by any cube should be pushed down to several engines in order, such as Presto -> SparkSQL -> Hive. This is more reasonable and lets as many queries as possible be answered. It may be worth opening a JIRA. Best wishes to you! From: Xiaoxiang Yu

At 2019-07-05 15:37:41, "nichunen" wrote:

+1. Sounds useful and not difficult to develop. Best regards, Ni Chunen / George

On 07/5/2019 15:20, codingfor...@126.com wrote:

Hi all: Currently (version 3.0.0-SNAPSHOT), Kylin supports only one kind of pushdown query engine. In some users' scenarios, queries need to be pushed down to MySQL, Spark SQL, Hive, etc. I think Kylin needs to support multiple pushdown engines. I want to discuss with you whether this is needed. Any suggestion is welcome. Thanks.
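The ordered fallback proposed in this thread (for example Presto, then SparkSQL, then Hive) can be sketched as a simple chain that returns the first successful answer. This is only an illustration of the idea under discussion, not Kylin's pushdown API: `run_with_pushdown`, `PushdownError` and the engine callables are hypothetical names.

```python
class PushdownError(Exception):
    """Raised when every engine in the chain fails to answer the query."""


def run_with_pushdown(sql, engines):
    """Try each (name, execute) pair in order and return the first success.

    engines is an ordered list such as
    [("presto", run_presto), ("sparksql", run_spark), ("hive", run_hive)],
    where each callable takes the SQL text and returns a result or raises.
    """
    failures = []
    for name, execute in engines:
        try:
            return name, execute(sql)
        except Exception as exc:  # a real system would catch narrower errors
            failures.append(f"{name}: {exc}")
    raise PushdownError("all engines failed: " + "; ".join(failures))
```

Returning the engine name together with the result makes it possible to log which engine finally answered a query, which would matter for diagnosing slow pushdown queries.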
Re:When building a cube, because the backquote is reported incorrectly
Hi friend,

I think it is a good question. For an RDBMS source, making the right choice is not easy; it depends on your requirements and on your understanding of these features.

Firstly, source type = 8 is the older one. It uses Sqoop to ingest data from the RDBMS into Hive, which makes the "create flat table" step possible. It has been tested with MySQL and SQL Server. After the cube is built successfully, it can answer all measure types provided by Kylin, and it is quicker than querying the underlying RDBMS source directly.

Secondly, source type = 16 (which we call the DataSource SDK) is the newer one. Not only Sqoop but also Apache Calcite has been introduced into Kylin to make it stronger. In addition to ingesting data from the RDBMS, the DataSource SDK can rewrite your query so that more queries can be answered by Kylin. In some scenarios, when your query cannot be matched with your cube, Kylin will try pushdown and let the real source answer the query. But pushdown may fail because of differences between RDBMS dialects. The DataSource SDK can help by rewriting your query based on a mapping file (an XML file) provided by you. It also provides some SPIs (Service Provider Interfaces) that let you implement your own rewrite logic, which is more powerful but also more difficult. You should read the source code carefully to understand how it should be used.

So, as far as I can see, source type = 8 is fine for most scenarios; it works out of the box and should work well. On the other hand, if you want pushdown queries to run more smoothly, you should try source type = 16.

If you have more questions, or if I have made any mistake, please let me know. Thank you very much.

Best wishes to you!
From: Xiaoxiang Yu

At 2019-07-04 20:11:10, "紫电_恶魔" wrote:

The following content was translated from Chinese using translation software, so errors are inevitable; please bear with me. In kylin.properties, the official documentation says kylin.source.default for a JDBC source can be set to 8 or 16, and I do not know the difference. http://kylin.apache.org/cn/development/datasource_sdk.html Based on this document, I created a new configuration file, postgresql.xml, as follows. The SQL executed when building the cube is as follows. Please tell me what other configuration needs to be done. Thank you.
Re:Detailed Structure Reference
Hi, I don't know the exact meaning of "Kylin detailed structure". Could you please provide a more specific description? By the way, if you want to know how Kylin is implemented, you may read Kylin's documentation and some books. I think this book should suit most Chinese readers: https://book.douban.com/subject/26975003/. Best wishes to you! From: Xiaoxiang Yu

At 2019-07-05 09:38:01, "shicheng31...@gmail.com" wrote:

Hi all: Is there any reference for Kylin's detailed structure? I have some questions about it, but I could not find any answer in the official introduction. Thanks.

shicheng31...@gmail.com
Re: using mySql as metadata storage
Dear Sir, I want to share my opinion, and I will be glad if it helps you make a better decision. First, the RDBMS metadata store was contributed by an experienced dev team that provides professional solutions based on Apache Kylin; they verified its stability in their customers' prod environments before contributing it. Second, I know of some users who deployed Kylin on AWS EMR and chose MySQL as the metadata store; their Kylin has been running smoothly for several months. I think you could try deploying a smaller Kylin cluster that uses MySQL as the metadata store and monitor its stability. If you face any issue in the future, please let us know your problem. Thank you. Best wishes, Xiaoxiang Yu

On 2019/11/14 13:57, "听风看雨" wrote:

Hey, may I ask whether the support for using MySQL as the metadata store is stable now? I've been looking through the official Kylin website for some time, but found no more hints in the release notes after version 2.5.0. Are there any details I have missed? Best wishes!
Re: Kylin 2.6.4 error when building cubes with hadoop 3.1.3
Hi, Do you use Kylin on CDH 6.3? I have heard of one user who deployed Kylin on CDH 6.3 and met the same problem. Best wishes, Xiaoxiang Yu

From: "zx张笑(深圳)"
Reply-To: "dev@kylin.apache.org"
Date: Saturday, November 16, 2019 01:52
To: "dev@kylin.apache.org"
Subject: Kylin 2.6.4 error when building cubes with hadoop 3.1.3

Hi developers, I'm a user of Kylin. I'm facing the following errors at the third step when building a cube (Kylin_Fact_Distinct_Columns): [image: image001.png] The versions in my environment are as follows: [image: image002.png] I think it may be a problem caused by Guava versions. Hadoop 3.1.3 uses guava-27.0, but Kylin calls functions from Guava versions lower than guava-16. So, how can I fix this problem? Thank you! Best Regards, Sunshine
Re: New committer: Temple Zhou
Temple Zhou, congratulations! Best wishes, Xiaoxiang Yu

On 2019/11/17 22:57, "Temple Zhou" wrote:

Sorry for the late reply, and thank you, everyone. The Kylin community is a very open and friendly community. I'm very honored to become a Kylin committer. I will make Kylin more reliable and excellent in every way I can. The more people join us, the better the community will be.

On Sun, Nov 17, 2019, 09:15 Yaqian Zhang wrote:

Congratulations!

On Nov 16, 2019, at 14:27, codingfor...@126.com wrote:

Congratulations!

On Nov 16, 2019, at 13:57, nichunen wrote:

Congratulations!
Re: Kylin to PostgreSQL Error in Cube build Step 1
Hi Andrey,

Firstly, thank you for testing our build. I have some questions:
1. When you set kylin.source.default=16, you said you got "Oops… Failed to take action." Did you see which exception Kylin threw? Could you please show us the error message in kylin.log? Our patch works when kylin.source.default=16, so the error message thrown by Kylin when you set it to 8 is not what this issue/PR is about. The important thing is what happened when you saw "Oops… Failed to take action."
2. If you could provide more detail about your related configuration, maybe I can find something useful.

Best wishes, Xiaoxiang Yu

From: Andrey Molotov
Date: Thursday, November 21, 2019 16:19
To: Xiaoxiang Yu
Cc: "dev@kylin.apache.org"
Subject: Re: Kylin to PostgreSQL Error in Cube build Step 1

Hello, Sir. I've installed the Kylin binary you provided. I've also prepared the data tables that you used to test your build, https://github.com/apache/kylin/pull/902. If I set the property kylin.source.default=16 and click on Load Table Metadata From Tree, I get an error: "Oops… Failed to take action." So I was forced to use kylin.source.default=8. I prepared the model and cube just like you did, but still got the error at the first step.

My env:
• PostgreSQL 9.5.20
• cdh 5.16.2
• Kylin build from master branch

Here is the log: java.io.IOException: OS command error exit with return code: 1, error message: Warning: /home/hadoop/sqoop/../hcatalog does not exist! HCatalog jobs will fail. Please set $HCAT_HOME to the root of your HCatalog installation. Warning: /home/hadoop/sqoop/../accumulo does not exist! Accumulo imports will fail. Please set $ACCUMULO_HOME to the root of your Accumulo installation. Warning: /home/hadoop/sqoop/../zookeeper does not exist! Accumulo imports will fail. Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation. WARNING: HADOOP_PREFIX has been replaced by HADOOP_HOME. Using value of HADOOP_PREFIX. SLF4J: Class path contains multiple SLF4J bindings. 
SLF4J: Found binding in [jar:file:/home/hadoop/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/home/hadoop/hbase/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] 2019-11-21 10:51:09,835 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7 2019-11-21 10:51:09,872 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead. 2019-11-21 10:51:09,982 WARN sqoop.ConnFactory: Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time. 2019-11-21 10:51:09,997 INFO manager.SqlManager: Using default fetchSize of 1000 2019-11-21 10:51:09,997 INFO tool.CodeGenTool: Beginning code generation 2019-11-21 10:51:10,443 INFO manager.SqlManager: Executing SQL statement: SELECT `FILM_PLAY`.`AUDIENCE_ID` as `FILM_PLAY_AUDIENCE_ID` ,`FILM_PLAY`.`FILM_ID` as `FILM_PLAY_FILM_ID` ,`FILM_PLAY`.`WATCH_TIME` ,`FILM_PLAY`.`PAYMENT` as `FILM_PLAY_PAYMENT` FROM `SC1`.`FILM_PLAY` as `FILM_PLAY` INNER JOIN `PUBLIC`.`FILM` as `FILM` ON `FILM_PLAY`.`FILM_ID` = `FILM`.`FILM_ID` INNER JOIN `SC2`.`AUDIENCE` as `AUDIENCE` ON `FILM_PLAY`.`AUDIENCE_ID` = `AUDIENCE`.`AUDIENCE_ID` WHERE 1=1 AND (`FILM_PLAY`.`WATCH_TIME` >= '2017-01-01 00:00:00' AND `FILM_PLAY`.`WATCH_TIME` < '2017-12-01 00:00:00') AND (1 = 0) 2019-11-21 10:51:10,454 ERROR manager.SqlManager: Error executing statement: org.postgresql.util.PSQLException: ERROR: syntax error at or near "." Position: 19 org.postgresql.util.PSQLException: ERROR: syntax error at or near "." 
Position: 19 at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2284) at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2003) at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:200) at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:424) at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:161) at org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:114) at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:777) at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:786) at org.apache.sqoop.manager.SqlManager.getColumnInfoForRawQuery(SqlManager.java:289) at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:260) at org.apache.sqoop.manag
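The statement in the log above quotes identifiers with backticks, and PostgreSQL reports a syntax error at position 19, which is the "." right after the first backtick-quoted name. PostgreSQL expects ANSI double quotes around quoted identifiers. A minimal sketch of the difference, assuming a hypothetical helper class IdentifierQuoting that is not part of Kylin or Sqoop:

```java
public class IdentifierQuoting {
    /** Quote one identifier with the given quote character, doubling any embedded quote characters. */
    public static String quote(String identifier, char q) {
        String s = String.valueOf(q);
        return s + identifier.replace(s, s + s) + s;
    }

    /** Build a dotted reference such as "SC1"."FILM_PLAY" from its parts. */
    public static String qualified(char q, String... parts) {
        StringBuilder sb = new StringBuilder();
        for (String p : parts) {
            if (sb.length() > 0) {
                sb.append('.');
            }
            sb.append(quote(p, q));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Backtick style, as seen in the failing statement in the log.
        System.out.println(qualified('`', "FILM_PLAY", "AUDIENCE_ID"));
        // ANSI double-quote style, which PostgreSQL accepts.
        System.out.println(qualified('"', "FILM_PLAY", "AUDIENCE_ID"));
    }
}
```

Note that PostgreSQL treats double-quoted identifiers as case-sensitive, so "FILM_PLAY" only matches a table that was created with that exact upper-case name.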
Re: Kylin to PostgreSQL Error in Cube build Step 1
Yes, there is indeed an NPE thrown by JdbcExplorer, but it is not the first exception related to JdbcExplorer; the earlier exception should be the root cause. Search for "jdbc.extensible" or "JdbcSource" in kylin.log and you will find more information. My guess is that your JDBC settings, such as user, password or URL, are not configured correctly. Could you please find that exception and share it with us? You can also send me the whole log file and I will check it. ---- Best wishes, Xiaoxiang Yu From: Andrey Molotov Date: Thursday, November 21, 2019, 18:47 To: Xiaoxiang Yu Cc: "dev@kylin.apache.org" Subject: Re: Kylin to PostgreSQL Error in Cube build Step 1 Dear Sir, Thank you for your reply. 1. Here is kylin.log at the moment when “Oops… Failed to take action.” was thrown: 2019-11-21 13:04:32,365 INFO [FetcherRunner 308979117-47] threadpool.DefaultFetcherRunner:94 : Job Fetcher: 0 should running, 0 actual running, 0 stopped, 0 ready, 0 already succeed, 6 error, 2 discarded, 0 others 2019-11-21 13:05:02,365 INFO [FetcherRunner 308979117-47] threadpool.DefaultFetcherRunner:94 : Job Fetcher: 0 should running, 0 actual running, 0 stopped, 0 ready, 0 already succeed, 6 error, 2 discarded, 0 others 2019-11-21 13:05:08,277 INFO [BadQueryDetector] service.BadQueryDetector:147 : Detect bad query. 2019-11-21 13:05:32,365 INFO [FetcherRunner 308979117-47] threadpool.DefaultFetcherRunner:94 : Job Fetcher: 0 should running, 0 actual running, 0 stopped, 0 ready, 0 already succeed, 6 error, 2 discarded, 0 others 2019-11-21 13:06:02,365 INFO [FetcherRunner 308979117-47] threadpool.DefaultFetcherRunner:94 : Job Fetcher: 0 should running, 0 actual running, 0 stopped, 0 ready, 0 already succeed, 6 error, 2 discarded, 0 others 2019-11-21 13:06:08,278 INFO [BadQueryDetector] service.BadQueryDetector:147 : Detect bad query. 
2019-11-21 13:06:32,365 INFO [FetcherRunner 308979117-47] threadpool.DefaultFetcherRunner:94 : Job Fetcher: 0 should running, 0 actual running, 0 stopped, 0 ready, 0 already succeed, 6 error, 2 discarded, 0 others 2019-11-21 13:07:02,365 INFO [FetcherRunner 308979117-47] threadpool.DefaultFetcherRunner:94 : Job Fetcher: 0 should running, 0 actual running, 0 stopped, 0 ready, 0 already succeed, 6 error, 2 discarded, 0 others 2019-11-21 13:07:08,278 INFO [BadQueryDetector] service.BadQueryDetector:147 : Detect bad query. 2019-11-21 13:07:32,365 INFO [FetcherRunner 308979117-47] threadpool.DefaultFetcherRunner:94 : Job Fetcher: 0 should running, 0 actual running, 0 stopped, 0 ready, 0 already succeed, 6 error, 2 discarded, 0 others 2019-11-21 13:07:54,609 DEBUG [http-nio-7070-exec-3] security.KylinAuthenticationProvider:114 : User ADMIN authorities : [ROLE_ADMIN, ROLE_ANALYST, ROLE_MODELER] 2019-11-21 13:07:54,609 DEBUG [http-nio-7070-exec-3] security.KylinAuthenticationProvider:57 : User cache [-108, 112, -63, -32, 41, -87, -81, 81, -32, 61, -35, -111, 7, 56, -29, -59] is removed due to EXPIRED 2019-11-21 13:07:54,609 DEBUG [http-nio-7070-exec-3] security.KylinAuthenticationProvider:128 : Authenticated user org.springframework.security.authentication.UsernamePasswordAuthenticationToken@3704d9a0: Principal: org.springframework.security.core.userdetails.User@3b40b2f: Username: ADMIN; Password: [PROTECTED]; Enabled: true; AccountNonExpired: true; credentialsNonExpired: true; AccountNonLocked: true; Granted Authorities: ROLE_ADMIN,ROLE_ANALYST,ROLE_MODELER; Credentials: [PROTECTED]; Authenticated: true; Details: org.springframework.security.web.authentication.WebAuthenticationDetails@e21a: RemoteIpAddress: 172.0.0.66; SessionId: null; Granted Authorities: ROLE_ADMIN, ROLE_ANALYST, ROLE_MODELER 2019-11-21 13:07:54,610 DEBUG [http-nio-7070-exec-3] controller.UserController:52 : User login: org.springframework.security.core.userdetails.User@3b40b2f: Username: ADMIN; 
Password: [PROTECTED]; Enabled: true; AccountNonExpired: true; credentialsNonExpired: true; AccountNonLocked: true; Granted Authorities: ROLE_ADMIN,ROLE_ANALYST,ROLE_MODELER 2019-11-21 13:08:02,365 INFO [FetcherRunner 308979117-47] threadpool.DefaultFetcherRunner:94 : Job Fetcher: 0 should running, 0 actual running, 0 stopped, 0 ready, 0 already succeed, 6 error, 2 discarded, 0 others 2019-11-21 13:08:08,279 INFO [BadQueryDetector] service.BadQueryDetector:147 : Detect bad query. 2019-11-21 13:08:32,365 INFO [FetcherRunner 308979117-47] threadpool.DefaultFetcherRunner:94 : Job Fetcher: 0 should running, 0 actual running, 0 stopped, 0 ready, 0 already succeed, 6 error, 2 discarded, 0 others 2019-11-21 13:09:02,365 INFO [FetcherRunner 308979117-47] threadpool.DefaultFetcherRunner:94 : Job Fetcher: 0 should running, 0 actual running, 0 stopped, 0 ready, 0 already succeed, 6 error, 2 discarded, 0 others 2019-11-21 13:09:08,279 INFO [BadQueryDetector] service.BadQueryDetector:147 : Detect bad query. 2019-11-21 13:09:32,365 INF
Re: Error on EMR
Hi, I have deployed the latest version of Kylin (3.0 beta) on AWS EMR 5.27 and built a few cubes successfully; maybe you can give it a try? The cluster was created with a CLI command like this, and I deployed Kylin on the MASTER node: aws emr create-cluster --applications Name=Hadoop Name=Hive Name=Pig Name=Spark Name=Sqoop Name=Tez Name=Zeppelin Name=ZooKeeper Name=Ganglia\ --release-label emr-5.27.0 \ --instance-groups '[{"InstanceCount":4,"EbsConfiguration":{"EbsBlockDeviceConfigs":[{"VolumeSpecification":{"SizeInGB":200,"VolumeType":"gp2"},"VolumesPerInstance":1}]},"InstanceGroupType":"CORE","InstanceType":"m4.2xlarge","Name":"Worker Cluster"},{"InstanceCount":1,"EbsConfiguration":{"EbsBlockDeviceConfigs":[{"VolumeSpecification":{"SizeInGB":100,"VolumeType":"gp2"},"VolumesPerInstance":1}]},"InstanceGroupType":"MASTER","InstanceType":"c4.4xlarge","Name":"MasterQuery"}]' \ --configurations '[{"Classification":"hdfs-site","Properties":{"dfs.replication":"2"}}]' \ --ebs-root-volume-size 100 \--enable-debugging \ --name 'BenchmarkCluster' \ --scale-down-behavior TERMINATE_AT_TASK_COMPLETION \ --region cn-northwest-1 Best wishes, Xiaoxiang Yu On 2019/12/2 20:38, “Tanmay Movva” wrote: Hello, We have installed Kylin on our EMR master along with HBase, Hadoop and Hive. I installed Spark using download-spark.sh from KYLIN_HOME/bin. We followed the steps in the "Install KYLIN on AWS EMR" guide to configure the Kylin working dir and HBase storage on S3, and also made the necessary zkquorum changes. When we run sample.sh or check-env.sh we don't get any errors. But when we run the cube build job from the UI, the job fails at stage 2, "Redistribute Flat Hive Tables". As the step "Create Intermediate Hive tables" completed successfully, I don't think there is an error with Hive. Can anyone help us with this? Thank You. 
java.lang.NoClassDefFoundError: org/apache/hadoop/hive/conf/HiveConf at org.apache.kylin.source.hive.CLIHiveClient.(CLIHiveClient.java:47) at org.apache.kylin.source.hive.HiveClientFactory.getHiveClient(HiveClientFactory.java:27) at org.apache.kylin.source.hive.RedistributeFlatHiveTableStep.computeRowCount(RedistributeFlatHiveTableStep.java:40) at org.apache.kylin.source.hive.RedistributeFlatHiveTableStep.doWork(RedistributeFlatHiveTableStep.java:91) at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:167) at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71) at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:167) at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:114) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1928) at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1771) ... 11 more -- Regards, Tanmay Krishna Movva Razorpay
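The trace above ends in a ClassNotFoundException for org.apache.hadoop.hive.conf.HiveConf, raised by Tomcat's web application class loader. One way to narrow this down is to ask the JVM whether a class is visible and which jar it comes from. A minimal sketch, assuming a hypothetical helper class ClassLocator (not part of Kylin); it only reports what the class loader of the process it runs in can see, so it has to be started with the same classpath as the Kylin server to be meaningful:

```java
import java.security.CodeSource;

public class ClassLocator {
    /** Returns where the named class would be loaded from, or a short message if it cannot be found. */
    public static String locate(String className) {
        try {
            // initialize=false: only look the class up, do not run its static initializers.
            Class<?> c = Class.forName(className, false, ClassLocator.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // Classes that ship with the JDK have no code source.
            return src == null ? "bootstrap/JDK" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "NOT FOUND: " + e;
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "org.apache.hadoop.hive.conf.HiveConf";
        System.out.println(name + " -> " + locate(name));
    }
}
```

If the class is reported as not found, the Hive jars are probably missing from the classpath that Kylin builds at startup; the thread does not confirm this, so treat it as a starting point only.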
Re: Kylin v2.6.4 support Spark 2.4 or above
Hi friend, In my view, the latest version of Kylin should support Spark 2.4, but it does not support Spark SQL as a data source. I have seen some patches that try to achieve this. This is one of them, https://github.com/apache/kylin/pull/927, which you may want to look at. Best wishes, Xiaoxiang Yu On 2019/12/9 21:38, “Madhusudhan Maankar” wrote: Hi, I would appreciate it if you could let me know the details on the points below. 1. Does Apache Kylin v2.6.4 support Spark 2.4 and above? 2. Can Kylin work without Hive, with Spark only? Thanks, Madhusudhan Maankar. -- Sent from: http://apache-kylin.74782.x6.nabble.com/
Re: Releasing Apache Kylin v3.0-GA
Good news, I cannot wait for the next generation of Kylin. Best wishes, Xiaoxiang Yu From: George Ni Reply-To: "u...@kylin.apache.org" Date: Friday, December 6, 2019, 10:25 To: "u...@kylin.apache.org" , "dev@kylin.apache.org" Subject: Releasing Apache Kylin v3.0-GA Hi Community, As we have released v3.0-alpha, v3.0-alpha2 and v3.0-beta, we have enough confidence to release the GA version of v3.0 next week, and I’m planning to create a branch for the release. Detailed features, improvements and bug fixes will come later; the main features are: 1. Realtime OLAP 2. Job scheduler with Apache Curator 3. User and user group management Please feel free to leave your comments here. - Best regards, Ni Chunen / George
Re: Kylin 2.6.4 streaming build does not work with HDP 3.1.4.0 (issue 3970)
Hi Sir, I cannot see your screenshot. Could you please paste the stack trace (from kylin.log) or give us a more detailed description? Have you tried the workaround provided in KYLIN-3970 to fix your problem? Best wishes, Xiaoxiang Yu From: 田家铮 Reply-To: "dev@kylin.apache.org" Date: Thursday, October 24, 2019, 11:44 To: dev Subject: Kylin 2.6.4 streaming build does not work with HDP 3.1.4.0 (issue 3970) Environment: kylin-2.6.4, HDP-3.1.4.0 Issue: 3970 https://issues.apache.org/jira/browse/KYLIN-3970?jql=text%20~%20%22partition.assignment.strategy%22 Problem: the streaming build fails when reading data from Kafka, while building from Hive works fine. [two inline screenshots, not preserved]
Re: Kylin 2.6.4 streaming build does not work with HDP 3.1.4.0 (issue 3970)
Hi Sir, If I am not mistaken, the logs you provided show that the workaround/steps from my later mail FIXED the previous error. The previous error occurred at Step 1 (Collect data from Kafka) and was a class conflict, as shown by text like "Error: org.apache.kafka.clients.consumer.ConsumerConfig.configNames()Ljava/util/Set;". The new error occurs at Step 4 (Fact Distinct Column), as shown by text like "org.apache.kylin.engine.mr.steps.FactDistinctColumnsMapper". I think it is a totally NEW problem (unrelated to KYLIN-3970, because this step does not use the Kafka client), and it may be caused by the MapReduce framework. You could try again with less data, or modify the MapReduce configuration (see http://kylin.apache.org/docs/install/configuration.html#mr-config-override). If I have made a mistake or you have more findings, please let me know. Thank you. ---- Best wishes, Xiaoxiang Yu On 2019/10/24 19:10, “tianjiazheng” wrote: Yes, I think this is the same problem. Can you tell me the ways? -- Sent from 邮洽 _ From: dev@kylin.apache.org Sent: 2019-10-24 18:58:53 To: dev@kylin.apache.org Subject: Re: Kylin 2.6.4 streaming build does not work with HDP 3.1.4.0 (issue 3970) Does your error log look like the one in KYLIN-3970? If yes, there are some ways to solve this problem that may be helpful. On 2019/10/24 17:30, “田家铮” wrote: I followed your steps and got an error again. Can you help me fix this problem? 2019-10-24 05:25:11,171 ERROR [pool-11-thread-2] threadpool.DefaultScheduler:116 : ExecuteException job:9722333e-789b-1f30-50e6-38e9905802c0 org.apache.kylin.job.exception.ExecuteException: org.apache.kylin.job.exception.ExecuteException: org.apache.kylin.engine.mr.exception.MapReduceException: no counters for job job_1571906689528_0004Job Diagnostics:Task failed task_1571906689528_0004_m_00 Job failed as tasks failed. 
failedMaps:1 failedReduces:0 killedMaps:0 killedReduces: 0 Failure task Diagnostics: [2019-10-24 05:25:00.454]Exception from container-launch. Container id: container_e01_1571906689528_0004_01_05 Exit code: 255 [2019-10-24 05:25:00.456]Container exited with a non-zero exit code 255. Error file: prelaunch.err. Last 4096 bytes of prelaunch.err : Last 4096 bytes of stderr : log4j:WARN No appenders could be found for logger (org.apache.kylin.engine.mr.steps.FactDistinctColumnsMapper). log4j:WARN Please initialize the log4j system properly. log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. [2019-10-24 05:25:00.457]Container exited with a non-zero exit code 255. Error file: prelaunch.err. Last 4096 bytes of prelaunch.err : Last 4096 bytes of stderr : log4j:WARN No appenders could be found for logger (org.apache.kylin.engine.mr.steps.FactDistinctColumnsMapper). log4j:WARN Please initialize the log4j system properly. log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:182) at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:114) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.kylin.job.exception.ExecuteException: org.apache.kylin.engine.mr.exception.MapReduceException: no counters for job job_1571906689528_0004Job Diagnostics:Task failed task_1571906689528_0004_m_00 Job failed as tasks failed. failedMaps:1 failedReduces:0 killedMaps:0 killedReduces: 0 Failure task Diagnostics: [2019-10-24 05:25:00.454]Exception from container-launch. Container id: container_e01_1571906689528_0004_01_05 Exit code: 255 [2019-10-24 05:25:00.456]Container exited with a non-zero exit code 255. Error file: prelaunch.err. 
Last 4096 bytes of prelaunch.err : Last 4096 bytes of stderr : log4j:WARN No appenders could be found for logger (org.apache.kylin.engine.mr.steps.FactDistinctColumnsMapper). log4j:WARN Please initialize the log4j system properly. log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. [2019-10-24 05:25:00.457]Container exited with a non-zero exit code 255. Error file: prelaunch.err. Last 4096 bytes of prelaunch.err : Last 4096 bytes of stderr : log4j:WARN No appenders could
Re: Kylin 2.6.4 streaming build does not work with HDP 3.1.4.0 (issue 3970)
Hello, my WeChat ID is hit-lacus. You can add me on WeChat and we can see whether we can solve this together. Best wishes, Xiaoxiang Yu On 2019/10/25 10:37, “田家铮” wrote: I have replaced all the guava versions in mapreduce.tar.gz with guava-11.0.2.jar and uploaded them to HDFS again, but the error is still the same! -- java.lang.NoSuchMethodError: com.google.common.hash.Hasher.putString(Ljava/lang/CharSequence;)Lcom/google/common/hash/Hasher; --
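The NoSuchMethodError quoted above means that the Hasher class actually loaded at runtime has no putString method taking a single CharSequence, whichever Guava jar it came from. Listing the signatures that the loaded class exposes can confirm this. A minimal sketch using only reflection, assuming a hypothetical helper class SignatureProbe (not part of Kylin or Guava) that is run with the same classpath as the failing task:

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SignatureProbe {
    /** Lists the public signatures of all methods with the given name on the named class. */
    public static List<String> signatures(String className, String methodName) {
        Class<?> c;
        try {
            c = Class.forName(className, false, SignatureProbe.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("class not on classpath: " + className, e);
        }
        List<String> out = new ArrayList<>();
        for (Method m : c.getMethods()) {
            if (!m.getName().equals(methodName)) {
                continue;
            }
            StringBuilder sb = new StringBuilder(methodName).append('(');
            Class<?>[] params = m.getParameterTypes();
            for (int i = 0; i < params.length; i++) {
                if (i > 0) {
                    sb.append(", ");
                }
                sb.append(params[i].getName());
            }
            out.add(sb.append(')').toString());
        }
        Collections.sort(out); // stable output regardless of reflection order
        return out;
    }

    public static void main(String[] args) {
        // With a Guava jar on the classpath, this prints the putString variants it provides.
        for (String s : signatures("com.google.common.hash.Hasher", "putString")) {
            System.out.println(s);
        }
    }
}
```

If putString(java.lang.CharSequence) is missing from the output, a different Guava jar than the intended one is being loaded first.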
Re: Questions about using Oracle as a data source for Kylin
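Hi friend, I am happy to hear that you are interested in Kylin. In fact, Kylin does NOT promise to support Oracle as a JDBC source at the moment, so you cannot find any doc about the Oracle source because none exists. But I think you can still give it a try. By the way, we cannot see your screenshot. It would be great if you could provide your Kylin config. Best wishes, Xiaoxiang Yu From: 莪哭ㄋ誰疼 <13205288...@qq.com> Reply-To: "dev@kylin.apache.org" Date: Friday, November 22, 2019, 15:13 To: dev Subject: Questions about using Oracle as a data source for Kylin To the Kylin development team: Greetings. I am a Kylin enthusiast and want to learn to use Kylin in depth, but I have run into a problem. I use Oracle as the source of the raw data. I followed the JDBC data source configuration instructions, everything was configured successfully, and the tables can all be loaded into Kylin. However, building my cube fails; I have put the error in the attachment. From the SQL statement printed by the backend, I believe the SQL dialect is wrong, but I could not find the relevant configuration in the official documentation, so I would appreciate some guidance. I have also pasted my configuration in this mail. Sender: an unknown nobody [inline screenshot, not preserved]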
Re: metastore clean OutOfMemoryError
You should have a log file like hs_err_pid19438.log; could you please show its content to us? Best wishes, Xiaoxiang Yu On 2019/11/22 15:00, “MrWell” wrote: Hi, Kylin Team. When I execute "bin/metastore.sh clean --delete true", I get an "OutOfMemoryError" like this: java.lang.OutOfMemoryError: Java heap space Dumping heap to java_pid4839.hprof ... Heap dump file created [317991670bytes in 2.120 secs] # # java.lang.OutOfMemoryError: Java heap space # -XX:OnOutOfMemoryError="kill -9 %p" # Executing /bin/sh -c "kill -9 4839"... bin/metastore.sh: line 109: 4839 Killed ${KYLIN_HOME}/bin/kylin.sh org.apache.kylin.tool.MetadataCleanupJob "${@:2}" I have configured the 'setenv.sh' file like this: export KYLIN_JVM_SETTINGS="-Xms16g -Xmx16g -XX:MaxPermSize=512m -XX:NewSize=3g -XX:MaxNewSize=3g -XX:SurvivorRatio=4 -XX:+CMSClassUnloadingEnabled -XX:+CMSParallelRemarkEnabled -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly -XX:+DisableExplicitGC -XX:+HeapDumpOnOutOfMemoryError -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:$KYLIN_HOME/logs/kylin.gc.$$ -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=64M" Does it mean the heap memory is still too small?
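The heap dump in the log above is 317991670 bytes, roughly 303 MiB, which is far below the 16 GB requested in KYLIN_JVM_SETTINGS. This suggests, as an assumption the thread does not confirm, that the JVM started by metastore.sh did not pick up those settings. A minimal sketch for checking the heap limit a JVM really runs with, assuming a hypothetical class HeapCheck started with the same options as the tool:

```java
public class HeapCheck {
    /** Converts a byte count to mebibytes (1 MiB = 1024 * 1024 bytes). */
    public static double toMiB(long bytes) {
        return bytes / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        // Upper bound of the heap for this JVM, as set by -Xmx or the JVM default.
        System.out.printf("max heap of this JVM: %.0f MiB%n", toMiB(Runtime.getRuntime().maxMemory()));
        // Size of the heap dump reported in the log, for comparison.
        System.out.printf("heap dump in the log: %.0f MiB%n", toMiB(317991670L));
    }
}
```

If the first number is much smaller than 16384 MiB when the class is launched the same way as the cleanup job, the -Xmx option is not reaching that JVM.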
Re: Cube data tables for cube-wizard tutorial
Unfortunately, the image you sent to the mailing list is NOT shown correctly. Could you please re-send your image, together with the content of your kylin.log and the version information of your Hadoop env, to the mailing list? Thank you. You can store your image somewhere on the Internet and paste its URL in the mail. Best wishes, Xiaoxiang Yu From: Ben Lee Reply-To: "dev@kylin.apache.org" Date: Thursday, November 28, 2019, 08:03 To: "dev@kylin.apache.org" Subject: Cube data tables for cube-wizard tutorial Hi team, I'm trying to follow the tutorial http://kylin.apache.org/docs/tutorial/create_cube.html to create a cube, but I noticed that I am missing the following DB and tables (I tried to run samples.sh). [inline image, not preserved] Can anyone share where to get the data, or share any other tutorial that comes bundled with its data? Thanks, Ming
Re: Cube data tables for cube-wizard tutorial
Hi Ming, I do not understand what happened in your environment. Could you please describe your problem again in more detail? It looks like recharge_detail was loaded into Kylin successfully. Do you mean that you loaded a table into Kylin but cannot find it when you want to create a model? Have you checked your kylin.log? You may find some clues there. Best wishes, Xiaoxiang Yu From: Ben Lee Date: Friday, November 29, 2019, 00:27 To: Xiaoxiang Yu Subject: Re: Cube data tables for cube-wizard tutorial Hi Xiaoxiang, Thanks for your response. Here is the picture again: [inline image, not preserved] Thanks, Ming. On Thu, 28 Nov 2019 at 00:51, Xiaoxiang Yu <xiaoxiang...@kyligence.io> wrote: Unfortunately, the image you sent to the mailing list is NOT shown correctly. Could you please re-send your image, together with the content of your kylin.log and the version information of your Hadoop env, to the mailing list? Thank you. You can store your image somewhere on the Internet and paste its URL in the mail. Best wishes, Xiaoxiang Yu From: Ben Lee <benmin...@gmail.com> Reply-To: "dev@kylin.apache.org" <dev@kylin.apache.org> Date: Thursday, November 28, 2019, 08:03 To: "dev@kylin.apache.org" <dev@kylin.apache.org> Subject: Cube data tables for cube-wizard tutorial Hi team, I'm trying to follow the tutorial http://kylin.apache.org/docs/tutorial/create_cube.html to create a cube, but I noticed that I am missing the following DB and tables (I tried to run samples.sh). [inline image, not preserved] Can anyone share where to get the data, or share any other tutorial that comes bundled with its data? Thanks, Ming
Re: Kylin to PostgreSQL Error in Cube build Step 1
Hi Molotov, The PR is under review and test, and on my side it works. You can check the test screenshots on the PR page (https://github.com/apache/kylin/pull/902) to see whether it is tested well. If you want to test it in your env, please let me know and I will send the binary to you. Best wishes, Xiaoxiang Yu On 2019/10/28 15:08, “Andrey Molotov” wrote: Hello, thank you for your answer. I pulled the commit you provided and compiled the jar file (two jar files, actually: kylin-source-jdbc-3.0.0-SNAPSHOT.jar and kylin-jdbc-3.0.0-SNAPSHOT.jar). Then for each of these files I did the following: renamed it and put it in place of the existing kylin-jdbc-2.6.4.jar file in the kylin/lib directory. But unfortunately this did not help me resolve my problem with the backtick in the SQL query. Is there any other way to get a proper query line for PostgreSQL, or maybe I did something wrong? Thanks in advance. > On 16 Oct 2019, at 02:51, "codingfor...@126.com" wrote: > > Hi, Molotov, because PostgreSQL's syntax and metadata have certain specialities, some development work is needed. PR https://github.com/apache/kylin/pull/747 is doing this kind of thing; it is in review now. > >>> On 15 Oct 2019, at 20:54, Andrey Molotov wrote: >> Hello, everyone. >> I’ve set up Kylin to access a PostgreSQL Database using JDBC as described in http://kylin.apache.org/docs/tutorial/setup_jdbc_datasource.html . >> I’ve also set kylin.source.default=16 and kylin.source.hive.enable.quote=false in kylin.properties. >> But when I try to build a cube I get an error on #1 Step Name: Sqoop To Flat Hive Table. >> My Kylin Version is 2.6.4. >> Here is log: >> java.io.IOException: OS command error exit with return code: 1, error message: Error: Could not find or load main class org.apache.hadoop.hbase.util.GetJavaProperty >> SLF4J: Class path contains multiple SLF4J bindings. 
>> SLF4J: Found binding in [jar:file:/opt/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class] >> SLF4J: Found binding in [jar:file:/opt/hive/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class] >> SLF4J: Found binding in [jar:file:/opt/hbase/lib/client-facing-thirdparty/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class] >> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. >> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] >> 2019-10-15 08:40:23,908 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7 >> 2019-10-15 08:40:23,936 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead. >> 2019-10-15 08:40:24,004 WARN sqoop.ConnFactory: Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time. 
>> 2019-10-15 08:40:24,017 INFO manager.SqlManager: Using default fetchSize of 1000 >> 2019-10-15 08:40:24,017 INFO tool.CodeGenTool: Beginning code generation >> 2019-10-15 08:40:24,164 INFO manager.SqlManager: Executing SQL statement: SELECT "installations"."city" AS "INSTALLATIONS_CITY", "installations"."device_type" AS "INSTALLATIONS_DEVICE_TYPE", "installations"."install_datetime" >> FROM "data"."installations" AS "installations" >> WHERE 1 = 1 AND ("installations"."install_datetime" >= '2019-01-01' AND "installations"."install_datetime" < '2019-01-03') AND (1 = 0) >> 2019-10-15 08:40:24,176 INFO manager.SqlManager: Executing SQL statement: SELECT "installations"."city" AS "INSTALLATIONS_CITY", "installations"."device_type" AS "INSTALLATIONS_DEVICE_TYPE", "installations"."install_datetime" >> FROM "data"."installations" AS "installations" >> WHERE 1 = 1 AND ("installations"."install_datetime" >= '2019-01-01' AND "installations"."install_datetime" < '2019-01-03') AND (1 = 0) >> 2019-10-15 08:40:24,200 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/hadoop >> Note: /tmp/sqoop-hadoop/compile/33bbb7f633bb5f8338ed0a8e1e7ce3cc/QueryResult.java uses or overrides a deprecated API. >> Note: Recompile with -Xlint:deprecation for details. >>
Re: New committer: Chao Long
Chao, Congratulations!! Best wishes, Xiaoxiang Yu On 2019/10/7 10:01, “zjsy...@163.com on behalf of nichunen” wrote: Congratulations! Best regards, Ni Chunen / George On 10/7/2019 09:22, Yichen Zhou wrote: Congratulations, Chao!!! Best, Yichen On Sun, Oct 6, 2019 at 6:19 PM ShaoFeng Shi wrote: The Project Management Committee (PMC) for Apache Kylin has invited Chao Long to become a committer and we are pleased to announce that he has accepted. Chao Long (龙超, email: wayn...@qq.com) started contributing to Kylin last year. To date, he has made 81 commits on the master branch and resolved 71 JIRAs. His contributions include: making the fact distinct job run in Spark, merging dictionaries on Yarn, improving the cube planner, a parquet storage PoC, and many bug fixes. Besides, he has also answered many questions on the mailing lists. Congratulations, Chao! Best regards, Shaofeng Shi 史少锋 Apache Kylin PMC Email: shaofeng...@apache.org Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html Join Kylin user mail group: user-subscr...@kylin.apache.org Join Kylin dev mail group: dev-subscr...@kylin.apache.org
Re: [Announce] Apache Kylin 3.0.0 released
May the new version of Kylin be even more powerful and stable! Best wishes, Xiaoxiang Yu On 2019/12/24 13:53, “Wang rupeng” wrote: Congratulations! --- Best wishes, Rupeng Wang On 2019/12/24 12:08, “Xiaoyuan Gu” wrote: Big congrats! Looking forward to seeing Kylin embrace more state-of-the-art features. Kudos to all contributors! Bests, Xiaoyuan At 2019-12-20 20:45:16, "ShaoFeng Shi" wrote: >The Apache Kylin team is pleased to announce the immediate availability of >the 3.0.0 release. > >This is the GA release of Kylin’s next generation after 2.x, with the new >real-time OLAP feature, Kylin can query streaming data with sub-second >latency. All of the > changes in this release can be found in: >https://kylin.apache.org/docs/release_notes.html > > >You can download the source release and binary packages from Apache Kylin's >download page:https://kylin.apache.org/download/ > > >Apache Kylin is an open-source Distributed Analytics Engine designed to >provide SQL interface and multi-dimensional analysis (OLAP) on Apache >Hadoop, supporting extremely > large datasets. > > >Apache Kylin lets you query massive dataset at sub-second latency in 3 >steps: >1. Identify a star schema or snowflake schema data set on Hadoop. >2. Build Cube on Hadoop. >3. Query data with ANSI-SQL and get results in sub-second, via ODBC, JDBC >or RESTful API. > > >Thanks to everyone who has contributed to this release. > > >We welcome your help and feedback. For more information on how to report >problems, and to get involved, visit the project website at >https://kylin.apache.org/ > >Best regards, > >Shaofeng Shi 史少锋 >Apache Kylin PMC >Email: shaofeng...@apache.org > >Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html >Join Kylin user mail group: user-subscr...@kylin.apache.org >Join Kylin dev mail group: dev-subscr...@kylin.apache.org
Re:Kylin start up failing on EMR 5.28
Dear friend, If I am not mistaken, it looks like you are facing two problems. The first one is how to set up Kylin in a single EMR cluster. I tested this last month, and I think http://kylin.apache.org/docs31/install/kylin_aws_emr.html and https://github.com/hit-lacus/hit-lacus.github.io/issues/81 may help you solve your problem. The second one is about R/W separated deployment. In this case, could you please show us your configuration in kylin.properties? I will test it when I have some free time. -- Best wishes to you ! From :Xiaoxiang Yu At 2020-02-09 11:55:44, "Raghu Ram Reddy Medapati" wrote: >I've set up Kylin on an EMR edge node; when I try to start the service, I get the >below error. >Kylin -->Latest (3.0) >EMR --> 5.28 >ERROR [localhost-startStop-1] org.springframework.web.context.ContextLoader - >Context initialization failed >org.springframework.beans.factory.BeanCreationException: Error creating bean >with name >'org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter': > Instantiation of bean failed; nested exception is >org.springframework.beans.BeanInstantiationException: Failed to instantiate >[org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter]: > Constructor threw exception; nested exception is >java.lang.ClassCastException: com.fasterxml.jackson.datatype.joda.JodaModule >cannot be cast to com.fasterxml.jackson.databind.Module > at > org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.instantiateBean(AbstractAutowireCapableBeanFactory.java:1155) > ~[spring-beans-4
Re:Need support
Hi friend, Have you checked this doc: http://kylin.apache.org/docs/install/index.html ? What concrete problem did you run into? As far as I know, the latest Kylin binary has passed tests on Hortonworks HDP 2.4-2.6 and 3.0-3.1. -- Best wishes to you ! From :Xiaoxiang Yu At 2020-02-16 12:11:17, "Dinesh Dhanasekaran" wrote: >Hi Team, > >I need to install Kylin. Could you please help me with the installation on Ambari? > >Regards, >Dinesh Dhanasekaran
Re: [VOTE] Release apache-kylin-2.6.5 (RC2)
+1 mvn test passed in my dev env -- Best wishes to you ! From :Xiaoxiang Yu At 2020-02-16 13:45:19, "Yaqian Zhang" wrote: >+1 >mvn test passed > >> On 16 Feb 2020, at 10:45, Xiaoyuan Gu wrote: >> >> +1 >> mvn test passed >> >> >> >> >> Bests, >> Xiaoyuan Gu >> >> >> >> >> >> At 2020-02-14 21:21:30, "George Ni" wrote: >>> Hi all, >>> >>> >>> >>> I have created a build for Apache Kylin 2.6.5, release candidate 2. >>> >>> >>> >>> Changes highlights: >>> >>> [KYLIN-4374] - Fix security issues reported by code analysis platform LGTM >>> >>> [KYLIN-4291] - Parallel segment building may causes WriteConflictException >>> >>> [KYLIN-4263] - Inappropriate exception handling causes job stuck on running >>> status >>> >>> [KYLIN-4169] - Too many logs during DataModelManager’s initiation, cause >>> the first RESTful API to hang for a long time >>> >>> >>> >>> Thanks to everyone who has contributed to this release. >>> >>> Here are the release notes: >>> >>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12346281==12316121 >>> >>> >>> >>> The commit to being voted upon: >>> >>> https://github.com/apache/kylin/commit/73d42edec5f6492b3d3ffc222c26dce4bdfe7263 >>> >>> >>> Its hash is 73d42edec5f6492b3d3ffc222c26dce4bdfe7263. 
>>> >>> >>> >>> The artifacts to be voted on, including the source package and four >>> >>> pre-compiled binary packages are located here: >>> >>> https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-2.6.5-rc2/ >>> >>> >>> >>> The hash of the artifacts are as follows: >>> >>> apache-kylin-2.6.5-source-release.zip.sha256 >>> >>> 32fb722a58ed49318dc5f72da429506238ee2089170236c742f22240d93080a3 >>> >>> apache-kylin-2.6.5-bin-hbase1x.tar.gz.sha256 >>> >>> f8e04fafa7e63cdeb28aa4e8c35bc2f54456888bef37d4cca060c424be722e1e >>> >>> apache-kylin-2.6.5-bin-cdh57.tar.gz.sha256 >>> >>> 71deb9bc84e5b75320e2c8ca358449f808ef2decbee11679e7f4339abeeb3d3f >>> >>> apache-kylin-2.6.5-bin-hadoop3.tar.gz.sha256 >>> >>> e77218204800a7b6be42e90e0fcb03e6e8458d600125e44bc112a8a359291fc4 >>> >>> apache-kylin-2.6.5-bin-cdh60.tar.gz.sha256 >>> >>> 2f7003eba8b528198aa7f886be04ecb75e86ece9e2e67dba4c15a89fb8817f7a >>> >>> >>> >>> A staged Maven repository is available for review at: >>> >>> https://repository.apache.org/content/repositories/orgapachekylin-1075/ >>> >>> >>> >>> Release artifacts are signed with the following key: >>> >>> https://people.apache.org/keys/committer/nic.asc >>> >>> >>> >>> Please vote on releasing this package as Apache Kylin 2.6.5 >>> >>> >>> >>> The vote is open for the next 72 hours and passes if a majority of >>> >>> at least three +1 PMC votes are cast. >>> >>> >>> >>> [ ] +1 Release this package as Apache Kylin 2.6.5 >>> >>> [ ] 0 I don't feel strongly about it, but I'm okay with the release >>> >>> [ ] -1 Do not release this package because... >>> >>> >>> >>> Here is my vote: >>> >>> >>> >>> +1 (binding) >>> >>> -- >>> >>> - >>> >>> Best regards, >>> >>> >>> >>> Ni Chunen / George
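The vote mail above lists a SHA-256 hash for each artifact, and a reviewer can recompute the hash of a downloaded file and compare it with the listed value. A minimal sketch using only the JDK, assuming a hypothetical class Sha256Check; it reads the whole file into memory, which is simple but not ideal for very large artifacts:

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Sha256Check {
    /** Returns the SHA-256 digest of the data as a lower-case hex string. */
    public static String sha256Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        // args[0]: path to the downloaded artifact, args[1]: expected hash from the vote mail.
        String actual = sha256Hex(Files.readAllBytes(Paths.get(args[0])));
        System.out.println(actual.equalsIgnoreCase(args[1]) ? "OK" : "MISMATCH: " + actual);
    }
}
```

The hash check only shows that the download matches the published value; verifying the signature against the release manager's key is a separate step.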
Re:[DISCUSS] Collect Kylin best practices with Apache Wiki
+1 Great, the wiki is a good place to share knowledge and it is easy to use. I will share what I have found when I have some free time. -- Best wishes to you ! From :Xiaoxiang Yu At 2020-02-18 20:15:52, "ShaoFeng Shi" wrote: Hello Kylin users, I'm proposing to collect the Kylin best practices with Apache Wiki. I have created an entry page and started to compose some there. If you want to share or contribute, please email the group, then we will review and add it. Each practice should be brief and easy to understand; if it needs to dive into detail, a reference link can be provided with it. Let's try, thank you! Here is the wiki link: https://cwiki.apache.org/confluence/display/KYLIN/Best+practices Best regards, Shaofeng Shi 史少锋 Apache Kylin PMC Email: shaofeng...@apache.org Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html Join Kylin user mail group: user-subscr...@kylin.apache.org Join Kylin dev mail group: dev-subscr...@kylin.apache.org
Re:Kylin Building Engine With SparkSql & Parquet
Great news! I can foresee that Kylin could become more cloud-native once Parquet storage matures. And I hope the developer team will share more details of its design. -- Best wishes to you ! From :Xiaoxiang Yu At 2020-01-19 22:22:30, "George Ni" wrote: >Hi Kylin users & developers, > >By-layer Spark Cubing has been introduced into Apache Kylin since v2.0 to >achieve better performance and it does run much faster compared to MR >engine. Also Hbase has been Kylin’s trustful storage engine since Kylin was >born and it has been proved to be a success for providing the ability to >handle high concurrency queries in extremely large data scale with low >latency. But there are also limitations for HBase, such as filtering is not >flexible as we could only filter by RowKey, measures are usually combined >together which causes more data to be scanned than requested. > > > >So in order to optimize Kylin in both building strategy and storage engine, >development team of Kyligence is introducing a new cube building engine >which uses Spark Sql to construct cuboids with a new strategy and stores >cube results in Parquet files. The building strategy allows Kylin to build >cuboids in a smarter way by choosing and building on the optimal cuboid >source. And Parquet, a columnar storage format available to any project in >the Hadoop ecosystem, will power the filtering ability with the page-level >column index and reduce I/O by saving measures in different columns. Also >with Storing cuboid in Parquet instead of Hbase, we can utilize Kylin in >Cloud Native way. More information on design and technique details will >come soon. > > > >Below is the comparison in building duration and size of results between >By-layer Spark Cubing and the new cubing strategy. > > > >Environment > >4-nodes Hadoop cluster > >YARN has 400GB RAM and 128 cores in total; > >CDH 5.1, Apache Kylin 3.0. 
> > > >Spark > >Spark 2.4.1-kylin-r17 > > > >Test Data > >SSB data > >Cube: 15 dimensions, 3 measures (SUM) > > > >Test Scenarios > >Build the cube at different source size level: 30 million, 60 million >source rows; Compare the build time with Spark (by layer) + Hbase and >SparkSql + Parquet. > > >Besides, we attempt to resolve many drawbacks in current query engine, >which relies heavily on Apache Calcite, such as the performance bottleneck >in aggregating large query results which currently can only be operated by >a single worker. By embracing SparkSql, this kind of expensive computing >can be done distributedly. Also combined with Parquet format, plenty of >filtering optimizations could be applied,which will boost Kylin’s query >performance significantly. The features will be open source along with >technique details in the near future. > > > > - https://issues.apache.org/jira/browse/KYLIN-4188 > > >-- > >- > >Best regards, > > > >Ni Chunen / George
Re: New committer: Xiaoxiang Yu
Thank you. I am very grateful to everyone who gave me guidance and support during this period, especially Shaofeng Shi and Gang Ma. I hope Kylin will be even more powerful and user-friendly in 2020! -- Best wishes to you ! From :Xiaoxiang Yu On 2019-12-30 16:46:29, "JiaTao Tao" wrote: >Congratulations Xiaoxiang! > >-- > > >Regards! > >Aron Tao > >> On December 29, 2019, at 17:17, ShaoFeng Shi wrote: >> >> Hi folks, >> >> The Project Management Committee (PMC) for Apache Kylin >> has invited Xiaoxiang Yu to become a committer and we are pleased to >> announce that he has accepted. >> >> Xiaoxiang Yu (俞霄翔, email hit_la...@126.com) is one of the big data >> engineers from Kyligence; He started to work on the Kylin project since the >> middle of 2018. In the past time, he fixed many issues, investigated and >> verified many new features (especially the v3.0 real-time streaming), >> enhancements and bug fixes. Thank you and congratulations, Xiaoxiang! >> >> Let's warmly welcome Xiaoxiang as the Kylin committer! >> >> Best regards, >> >> Shaofeng Shi 史少锋 >> Apache Kylin PMC >> Email: shaofeng...@apache.org >> >> Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html >> Join Kylin user mail group: user-subscr...@kylin.apache.org >> Join Kylin dev mail group: dev-subscr...@kylin.apache.org
Re: kafka-kylin real time streaming
Hello Sir: I am very glad to hear your comment. Here are my replies: 1. I see the subject of this email is "real time streaming", which seems to indicate that you are using the "Realtime OLAP" feature (see http://kylin.apache.org/docs/tutorial/realtime_olap.html ), which was introduced in Kylin 3.0, not the NRT streaming feature (see http://kylin.apache.org/docs/tutorial/cube_streaming.html ), which was introduced in Kylin v1.6, am I right? But I see you are using the REST API " http://localhost:7070/kylin/api/cubes/{your_cube_name}/init_start_offsets ", which is only for NRT Streaming, not Realtime OLAP. 2. For a real-time OLAP cube, please check this if you need to update a segment (http://kylin.apache.org/docs/tutorial/lambda_mode_and_timezone_realtime_olap.html). 3. What is a real-time OLAP cube? It is a cube whose fact table is loaded via the "Add streaming table V2" icon. 4. I have updated my message via a PR (https://github.com/apache/kylin/pull/1018 ); when it is merged, you can check the new documentation. Thank you again for your comments. If you have more suggestions, please let us know. ---- Best wishes, Xiaoxiang Yu On 2019/12/25 16:35, "newUser" wrote: Hello, I am using Kylin version 3.0. I managed to create a streaming cube, but I can't refresh the cube. I am getting this error: The new refreshing segment kafka_kylin_cube[20191225093200_20191225093200] does not match any exisiting segment in cube CUBE[name=kylin_cube] But when I run this code I get a successful message but still can't refresh the cube. curl -X PUT --user ADMIN:KYLIN -H "Content-Type: application/json;charset=utf-8" -d '{ "sourceOffsetStart": 0, "sourceOffsetEnd": 9223372036854775807, "buildType": "REFRESH"}' http://localhost:7070/kylin/api/cubes/{your_cube_name}/init_start_offsets -- Sent from: http://apache-kylin.74782.x6.nabble.com/
Re:Kylin GUI error "Cannot get HiveTableMeta" on AWS EMR with Glue as hive metastore
For 3.0.2, we are trying to fix a few security issues, and the code change is under review. I think we may start planning a release candidate for 3.0.2 in the next two weeks, depending on the review result. -- Best wishes to you ! From :Xiaoxiang Yu At 2020-04-10 09:21:59, "mvishnubhatta" wrote: >Ah, that makes sense. I didn't look closely at the "31" in the link. I will >try out your suggestions, but is there any planned timeline for 3.0.2 or >3.1? > >Appreciate your quick response. > >-- >Sent from: http://apache-kylin.74782.x6.nabble.com/
Re:Kylin GUI error "Cannot get HiveTableMeta" on AWS EMR with Glue as hive metastore
Hi, Glad to hear that you have fixed the previous issue, and thank you for your update. AWS Glue support in Kylin will be introduced in a future release (it should be in 3.0.2 and 3.1.0); please check https://issues.apache.org/jira/browse/KYLIN-4206 . The doc https://kylin.apache.org/docs31/install/kylin_aws_emr.html is for Kylin 3.1, which is not released yet. For now, you can build a binary package from the master branch and test it, this is my https://github.com/hit-lacus/hit-lacus.github.io/issues/81 . Besides, another committer, Kaige, has another suggestion (which I have not tested); you can check it: https://issues.apache.org/jira/browse/KYLIN-3685 . -- Best wishes to you ! From :Xiaoxiang Yu At 2020-04-10 05:29:04, "mvishnubhatta" wrote: >Hi, > >I am trying to set up Kylin on an EMR cluster (trying both on a master node >and on an edge node). While the GUI shows up, I get a "Cannot get >HiveTableMeta" error when trying to load a hive table. > >The hive metastore is AWS Glue. > >Curiously, the most recent version of the document ><https://kylin.apache.org/docs/install/kylin_aws_emr.html> does not say >anything about Glue) > >Looking at an older version of the document ><https://kylin.apache.org/docs31/install/kylin_aws_emr.html> , I tried the >following: > >cp /usr/lib/hive/auxlib/aws-glue-datacatalog-hive2-client.jar >$KYLIN_HOME/lib/ >cp >/usr/share/aws/hmclient/lib/aws-glue-datacatalog-client-common-1.11.0-SNAPSHOT.jar >$KYLIN_HOME/lib/ >#Modify kylin.properties to uncomment the entry >kylin.source.hive.client=cli >#Modify kylin.properties to add an entry that says >kylin.source.hive.metadata-type=gluecatalog > >But I still get the same error on the GUI. > >I feel this Kylin server is not even aware that Glue is the hive catalog >since I cannot find any reference to Glue in any of the logs or error >messages. > >Any ideas on how to set up the hive metastore correctly? 
> >Hadoop version is Hadoop 2.8.5-amzn-4 >Kylin version is 3.0.1 >Hive 2.3.5-amzn-0 > >I exported the following variables: >export KYLIN_HOME=/usr/local/kylin/apache-kylin-3.0.1-bin-hbase1x >#export HADOOP_HOME=/usr/lib/hadoop/etc/hadoop >export HADOOP_HOME=/usr/lib/hadoop >export HBASE_HOME=/usr/lib/hbase/ >export HIVE_HOME=/usr/lib/hive/ >export HADOOP_CONF_DIR=/etc/hadoop/conf >export HIVE_LIB=/usr/lib/hive/lib >export HIVE_CONF=/etc/hive/conf >export HCAT_HOME=/usr/lib/hive-hcatalog > >The kylin.log file throws this error: >2020-04-09 21:17:21,243 ERROR [http-bio-7070-exec-1] >controller.TableController:130 : Failed to load Hive Table >java.lang.RuntimeException: cannot get HiveTableMeta >at >org.apache.kylin.source.hive.HiveMetadataExplorer.loadTableMetadata(HiveMetadataExplorer.java:68) >at >org.apache.kylin.rest.service.TableService.extractHiveTableMeta(TableService.java:211) >at >org.apache.kylin.rest.service.TableService.loadHiveTablesToProject(TableService.java:137) >at >org.apache.kylin.rest.controller.TableController.loadHiveTables(TableController.java:114) >at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >at >sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) >at >sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >at java.lang.reflect.Method.invoke(Method.java:498) >at >org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205) >at >org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:133) >at >org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:97) >at >org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:827) >at 
>org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:738) >at >org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:85) >at >org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:967) >at >org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:901) >at >org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:970) >at >org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:872) >at javax.servlet.http.HttpServlet.service(HttpS
Re:Kylin Start fails with NoClassDefFoundError on AWS EMR Edge Node
Hi friend, I have a suggestion, though I am not sure it is the best practice and I don't know whether it works. But you can give it a try if you are willing. I guess you have installed Kylin successfully on the master node of the EMR cluster, right? After the Kylin instance has started successfully (if you are using Kylin 3.0 or above), you will see some files named cached-*-dependency.sh under $KYLIN_HOME/bin; in these files you can find the locations of the jars that Kylin needs. You may try copying the jars from the master node to the same folders on the edge node, then restart Kylin on the edge node. On the other hand, you don't need to modify the property "kylin.job.mr.lib.dir" when starting a Kylin instance. The files are as follows: cached-hadoop-conf-dir.sh cached-hbase-dependency.sh cached-hive-dependency.sh cached-kafka-dependency.sh cached-spark-dependency.sh [root@cdh-master all]# cat bin/cached-hive-dependency.sh export hive_dependency=/opt/cloudera/parcels/CDH-5.7.6-1.cdh5.7.6.p0.6/bin/../lib/hive/conf:/opt/cloudera/parcels/CDH-5.7.6-1.cdh5.7.6.p0.6/bin/../lib/hive/lib/hive-jdbc-standalone.jar:/opt/cloudera/parcels/CDH-5.7.6-1.cdh5.7.6.p0.6/bin/../lib/hive/lib/junit-4.11.jar:/opt/cloudera/parcels/CDH-5.7.6-1.cdh5.7.6.p0.6/bin/../lib/hive/lib/commons-collections-3.2.2.jar:/opt/cloudera/parcels/CDH-5.7.6-1.cdh5.7.6.p0.6/bin/../lib/hive/lib/pentaho-aggdesigner-algorithm-5.1.5-jhyde.jar:/opt/cloudera/parcels/CDH-5.7.6-1.cdh5.7.6.p0.6/bin/../lib/hive/lib/hive-service.jar:/opt/cloudera/parcels/CDH-5.7.6-1.cdh5.7.6.p0.6/bin/../lib/hive/lib/metrics-jvm-3.0.2.jar:/opt/cloudera/parcels/CDH-5.7.6-1.cdh5.7.6.p0.6/bin/../lib/hive/lib/ant-launcher-1.9.1.jar:/opt/cloudera/parcels/CDH-5.7.6-1.cdh5.7.6.p0.6/bin/../lib/hive/lib/log4j-1.2.16.jar:/opt/cloudera/parcels/CDH-5.7.6-1.cdh5.7.6.p0.6/bin/../lib/hive/lib/datanucleus-api-jdo-3.2.6.jar:/opt/cloudera/parcels/CDH-5.7.6-1.cdh5.7.6.p0.6/bin/../lib/hive/lib/hive-cli-1.1.0-cdh5.7.6.jar:/opt/cloudera/parcels/CDH-5.7.6-1.cdh5.7.6.p0.6/bin/.
./lib/hive/lib/hive-ant.jar:/opt/cloudera/parcels/CDH-5.7.6-1.cdh5.7.6.p0.6/bin/../lib/hive/lib . -- Best wishes to you ! From :Xiaoxiang Yu At 2020-04-09 09:22:59, "mvishnubhatta" wrote: >Hi, > >I am trying to set up a Kylin server on an AWS EMR edge node. When doing >that the web URL throws the error that "The origin server did not find a >current representation for the target resource or is not willing to disclose >that one exists". > >The tomcat localhost log file shows a NoClassDefFoundError for the class >org/apache/hadoop/hive/metastore/api/NoSuchObjectException > >But I see this class in the hive library >/usr/lib/hive/lib/hive-metastore-2.3.5-amzn-0.jar and the path is exported >as HIVE_LIB and as HIVE_HOME (used HIVE_LIB based on suggestion here >https://issues.apache.org/jira/browse/KYLIN-2511). > >This appears to be a case where some CLASSPATH somewhere is not correctly >including this path. > >So, I tried modifying the kylin.properties file to add a property called >kylin.job.mr.lib.dir pointing to the /usr/lib/hive/lib/ but that didnt help >either. (This too was based on another suggested workaround here). > >Can someone point out what I am missing? > >Here are the full details: > >Hadoop version is Hadoop 2.8.5-amzn-4 >Kylin version is 3.0.1 >Hive 2.3.5-amzn-0 > >I have an AWS EMR Cluster and a separate EC2 machine that is set up as an >edge node. From this edge node I am able to run hive queries to connect to >the EMR Master node. Hbase also works on this node. > >On this edge node, I are trying to start a Kylin server. 
I followed the >instructions given here: >https://kylin.apache.org/docs/install/kylin_aws_emr.html ><https://kylin.apache.org/docs/install/kylin_aws_emr.html> > >I exported the following variables: >export KYLIN_HOME=/usr/local/kylin/apache-kylin-3.0.1-bin-hbase1x >#export HADOOP_HOME=/usr/lib/hadoop/etc/hadoop >export HADOOP_HOME=/usr/lib/hadoop >export HBASE_HOME=/usr/lib/hbase/ >export HIVE_HOME=/usr/lib/hive/lib/ >export HADOOP_CONF_DIR=/etc/hadoop/conf >export HIVE_LIB=/usr/lib/hive/lib >export HIVE_CONF=/etc/hive/conf >export HCAT_HOME=/usr/lib/hive-hcatalog > > >The command I run is: $KYLIN_HOME/bin/kylin.sh start > >After starting it, I get this message: > >/A new Kylin instance is started by hadoop. To stop it, run 'kylin.sh stop' >Check the log at >/usr/local/kylin/apache-kylin-3.0.1-bin-hbase1x/logs/kylin.log >Web UI is at http://ip-x.compute.internal:7070/kylin/ > >When I navigate to the URL, I get the following error. > >"The origin server did not find a current representation for the target >resource or is not willing to d
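As a follow-up to the suggestion in this thread about the cached-*-dependency.sh files, here is a rough sketch of how one might list which classpath entries from such a file are absent on the edge node. It assumes the file holds a single `export name=a:b:c` line, as in the `cached-hive-dependency.sh` output shown in the reply; the function names `parse_dependency_export` and `missing_entries` are invented for this example and are not part of Kylin.

```python
import os


def parse_dependency_export(line):
    """Split an `export name=a:b:c` line into (name, [a, b, c])."""
    text = line.strip()
    if text.startswith("export "):
        text = text[len("export "):]
    # Everything before the first '=' is the variable name.
    name, _, value = text.partition("=")
    # Drop empty pieces caused by leading, trailing, or doubled colons.
    entries = [entry for entry in value.split(":") if entry]
    return name.strip(), entries


def missing_entries(entries):
    """Return the entries that do not exist on the local filesystem."""
    return [entry for entry in entries if not os.path.exists(entry)]
```

Running `missing_entries` on the edge node against the entries parsed from the master node's file would show which jars or conf directories still need to be copied over, which may help narrow down a NoClassDefFoundError like the one reported here.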
RE: Kylin 3.0 - Need some help - Can't open the Kylin UI
Hi friend, I can feel your pain, but please be patient. In my opinion, to make good use of Apache Kylin, or any other complex system, some prerequisites should be met. Firstly, I think understanding Kylin's overall architecture and knowing how to analyse Kylin's logs (kylin.log/kylin.out) are very important; only in this way can you narrow down the problem you are facing. Once the problem is identified, if you have some knowledge of how to operate Hadoop components such as Apache HBase, Apache HDFS and Apache Hive, I think you can fix the problem yourself or ask the community for assistance. Hortonworks has provided some documentation which looks good to me (https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.0/starting-hive/content/hive_start_a_command_line_query_locally.html); maybe you can give it a try. Besides, have you tried using the Docker image to start Kylin with one command? It will give you a learning environment without any Hadoop cluster. It works well on my laptop (MacBook Pro, 15 inch, with the latest Docker Desktop installed); here is the link: https://hub.docker.com/r/apachekylin/apache-kylin-standalone -- Best wishes to you ! From :Xiaoxiang Yu At 2020-04-15 21:46:49, "Phillip Poirier" wrote: >Not that misery always loves company, but I had the previous Kylin version >and was trying to use it with the Hortonworks 2.0 distro, and even though I >could eventually get the service to start, the UI would never launch. The >few people that responded to me told me to try things I already tried (like >try a different URL or try using different browsers) and it never worked. >As someone who's spent decades working in BI and OLAP (especially in SSAS, >Essbase and TM1) I too was very interested in trying Kylin as a big data >OLAP option and even took the Udemy course on it. Unfortunately, I just had >to sit and watch as the course instructor performed cube-building tasks as I >was never able to get the UI to launch. 
> > > >If you get a real answer to this issue, I'd like to try again. I have the >Hortonworks 3.0 distro now, and would like to try the new version of Kylin, >if possible. If not, I'll take a look at Druid. > > > >Good luck. > > > >-Phil > > > >From: Rubio Piqueras, David [mailto:david.ru...@gft.com] >Sent: Tuesday, April 14, 2020 11:58 AM >To: dev@kylin.apache.org >Subject: Kylin 3.0 - Need some help - Can't open the Kylin UI > > > >Hi guys, > > > >Just starting using Kylin too which looks so interesting. > >We downloaded your Kylin 3.0 image version: > > > > > >However, I can't open the Kylin UI anytime I try to open it. Checking the >logs, this is what we are seeing: > > > >2020-04-14 13:58:23,853 INFO [main-SendThread(localhost:2181)] >zookeeper.ClientCnxn:1235 : Session establishment complete on server >localhost/127.0.0.1:2181, sessionid = 0x17178f6fbd6000a, negotiated timeout >= 4 > >Exception in thread "main" java.lang.IllegalArgumentException: Failed to >find metadata store by url: kylin_metadata@hbase > >at >org.apache.kylin.common.persistence.ResourceStore.createResourceStore(Resour >ceStore.java:101) > >at >org.apache.kylin.common.persistence.ResourceStore.getStore(ResourceStore.jav >a:113) > >at >org.apache.kylin.rest.service.AclTableMigrationTool.checkIfNeedMigrate(AclTa >bleMigrationTool.java:99) > >at >org.apache.kylin.tool.AclTableMigrationCLI.main(AclTableMigrationCLI.java:43 >) > >Caused by: java.lang.reflect.InvocationTargetException > >at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native >Method) > >at >sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAcces >sorImpl.java:62) > >at >sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstruc >torAccessorImpl.java:45) > >at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > >at >org.apache.kylin.common.persistence.ResourceStore.createResourceStore(Resour >ceStore.java:94) > >... 
3 more > >Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed >after attempts=1, exceptions: > >Tue Apr 14 13:58:24 UTC 2020, >RpcRetryingCaller{globalStartTime=1586872703980, pause=100, retries=1}, >java.net.ConnectException: Connection refused > > > >at >org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetrying >Caller.java:147) > >at >org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture >.run(ResultBoundedCompletionService.java:64) > >at >java.util.concurrent.ThreadPoolExecutor.runWorker(Thre
Re:Kylin GUI error "Cannot get HiveTableMeta" on AWS EMR with Glue as hive metastore
Thank you for the update on this, and I appreciate Kaige's contribution. -- Best wishes to you ! From :Xiaoxiang Yu At 2020-04-10 22:38:00, "mvishnubhatta" wrote: >Great. Thank you. I tried out Kaige's suggestion on >https://issues.apache.org/jira/browse/KYLIN-3685 that you pointed me to and >that worked for me. > >-- >Sent from: http://apache-kylin.74782.x6.nabble.com/
Re:[VOTE] Release apache-kylin-2.6.6 (RC1)
Hi Kylin team, I am sad to say that I found one patch that was not included in either of the two release candidates, which causes some serious Hive integration errors, for example with Glue support. Here is the patch link: https://github.com/apache/kylin/pull/1115/ . So my vote is: -1. -- Best wishes to you ! From :Xiaoxiang Yu At 2020-05-11 20:51:14, "George Ni" wrote: >Hi all, > > > >I have created a build for Apache Kylin 2.6.6, release candidate 1. > > > >Changes highlights: > >[KYLIN-4390] - Update tomcat to 7.0.100 > >[KYLIN-4426] - Refine CliCommandExecutor > >[KYLIN-4206] - Support Glue as Hive Metatdata > > > >Thanks to everyone who has contributed to this release. > >Here are the release notes: > >https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12346976==12316121 > > > >The commit to being voted upon: >https://github.com/apache/kylin/commit/b2f6fea3368f6b75892cf38294c2d18696758fa7 > > >Its hash is b2f6fea3368f6b75892cf38294c2d18696758fa7. > > > >The artifacts to be voted on, including the source package and four > >pre-compiled binary packages are located here: > >https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-2.6.6-rc1/ > > > >The hash of the artifacts are as follows: > >apache-kylin-2.6.6-source-release.zip.sha256 >1a796080df7bc3e7ed75a2f07a8795de11275686cca8b851f53a3df21f0cb824 > >apache-kylin-2.6.6-bin-hbase1x.tar.gz.sha256 >39dee0f749cb8d83505ec8d24bbfc1c98681c2cfcc6ca7aec2ec058e20a68e21 > >apache-kylin-2.6.6-bin-cdh57.tar.gz.sha256 >48169ddbe9ebba1977a490333e6a767c9eb5e172134d1c53cf581cff7064cc98 > >apache-kylin-2.6.6-bin-hadoop3.tar.gz.sha256 >589dd14afaf5751ee1277d2920748a7000e44ecfef12d21574c3adf83825149b > >apache-kylin-2.6.6-bin-cdh60.tar.gz.sha256 >1768b8dc916f35d437c5c6080fa3c9ce4debeb2d3a7a442eace9bbf7a8840c26 > > > >A staged Maven repository is available for review at: > >https://repository.apache.org/content/repositories/orgapachekylin-1076/ > > > >Release artifacts are signed with the following key: > 
>https://people.apache.org/keys/committer/nic.asc > > > >Please vote on releasing this package as Apache Kylin 2.6.6. > > > >The vote is open for the next 72 hours and passes if a majority of > >at least three +1 PMC votes are cast. > > > >[ ] +1 Release this package as Apache Kylin 2.6.6 > >[ ] 0 I don't feel strongly about it, but I'm okay with the release > >[ ] -1 Do not release this package because... > > > > > >Here is my vote: > > > >+1 (binding) > >-- > >- > >Best regards, > > > >Ni Chunen / George
Re:[VOTE] Release apache-kylin-3.0.2 (RC2)
+1 . Maven test passed and happy path passed in CDH5.7. -- Best wishes to you ! From :Xiaoxiang Yu At 2020-05-14 17:11:28, "George Ni" wrote: >Hi all, > > > >I have created a build for Apache Kylin 3.0.2, release candidate 2. > > > >Changes highlights: > >[KYLIN-4390] - Update tomcat to 7.0.100 > >[KYLIN-4426] - Refine CliCommandExecutor > >[KYLIN-4206] - Support Glue as Hive Metatdata > > > >Thanks to everyone who has contributed to this release. > >Here are the release notes: > >https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12346975==12316121 > > > >The commit to being voted upon: > >https://github.com/apache/kylin/commit/57090efe4bdc079ccfde4f9c8729d69ba3a90624 > > >Its hash is 57090efe4bdc079ccfde4f9c8729d69ba3a90624. > > > >The artifacts to be voted on, including the source package and four > >pre-compiled binary packages are located here: > >https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-3.0.2-rc2/ > > > >The hash of the artifacts are as follows: > >apache-kylin-3.0.2-source-release.zip.sha256 >1add5892bd1d5994e0e467846e9a844758420f14819ceef63370c07a6aa0b8af > >apache-kylin-3.0.2-bin-hbase1x.tar.gz.sha256 >086397d9ecbccf80517977a4b65b660b8e1496ad097d890226bd78a34a9fe190 > >apache-kylin-3.0.2-bin-cdh57.tar.gz.sha256 >181929fcd35a63a81b6dc097137a3dd1e129fd1f81400e09f64019dcb7ac8a21 > >apache-kylin-3.0.2-bin-hadoop3.tar.gz.sha256 >c2250734fed971f32d242036a55ba955bcf8de91e0e73704e07cfb09124d9899 > >apache-kylin-3.0.2-bin-cdh60.tar.gz.sha256 >83a68d2aec32e634475c490434981ebc91e8680dbb6388edc4ed919687ad1dac > > > >A staged Maven repository is available for review at: > >https://repository.apache.org/content/repositories/orgapachekylin-1078/ > > > >Release artifacts are signed with the following key: > >https://people.apache.org/keys/committer/nic.asc > > > >Please vote on releasing this package as Apache Kylin 3.0.2. > > > >The vote is open for the next 72 hours and passes if a majority of > >at least three +1 PMC votes are cast. 
> > > >[ ] +1 Release this package as Apache Kylin 3.0.2 > >[ ] 0 I don't feel strongly about it, but I'm okay with the release > >[ ] -1 Do not release this package because... > > > > > >Here is my vote: > > > >+1 (binding) > >-- > >- > >Best regards, > > > >Ni Chunen / George
Re:[VOTE] Release apache-kylin-2.6.6 (RC2)
+1 . Maven test passed and happy path passed in CDH5.7. -- Best wishes to you ! From :Xiaoxiang Yu At 2020-05-14 17:06:55, "George Ni" wrote: >Hi all, > > > >I have created a build for Apache Kylin 2.6.6, release candidate 2. > > > >Changes highlights: > >[KYLIN-4390] - Update tomcat to 7.0.100 > >[KYLIN-4426] - Refine CliCommandExecutor > >[KYLIN-4206] - Support Glue as Hive Metatdata > > > >Thanks to everyone who has contributed to this release. > >Here are the release notes: > >https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12346976==12316121 > > > >The commit to being voted upon: > >https://github.com/apache/kylin/commit/ddd5f8ecd4157b8f889b047e421dd9cfae7e1142 > > >Its hash is ddd5f8ecd4157b8f889b047e421dd9cfae7e1142. > > > >The artifacts to be voted on, including the source package and four > >pre-compiled binary packages are located here: > >https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-2.6.6-rc2/ > > > >The hash of the artifacts are as follows: > >apache-kylin-2.6.6-source-release.zip.sha256 >6d38671f494e3d5f2bb26dfb94d996a5ceb0c00c2a17b9c181ad853639198d3a > >apache-kylin-2.6.6-bin-hbase1x.tar.gz.sha256 >6a42962efbce5a51e2ce4bf8db0b8fa7341ef0b30e4f02e876a5c2fb0500944b > >apache-kylin-2.6.6-bin-cdh57.tar.gz.sha256 >85cb22e7d6d9adad214854f9ab285b7d47e874eb9f9df1c5bd01882877171762 > >apache-kylin-2.6.6-bin-hadoop3.tar.gz.sha256 >f060f8e16f909ae74d9e3c188bb071fcfa87e0a21fd7581fc968e1bcf00e5121 > >apache-kylin-2.6.6-bin-cdh60.tar.gz.sha256 >8d85a3036d312b47030e3b309af526afe4484720be156f7e3f05e626c02bf531 > > > >A staged Maven repository is available for review at: > >https://repository.apache.org/content/repositories/orgapachekylin-1077/ > > > >Release artifacts are signed with the following key: > >https://people.apache.org/keys/committer/nic.asc > > > >Please vote on releasing this package as Apache Kylin 2.6.6. > > > >The vote is open for the next 72 hours and passes if a majority of > >at least three +1 PMC votes are cast. 
> > > >[ ] +1 Release this package as Apache Kylin 2.6.6 > >[ ] 0 I don't feel strongly about it, but I'm okay with the release > >[ ] -1 Do not release this package because... > > > > > >Here is my vote: > > > >+1 (binding) > >-- > >- > >Best regards, > > > >Ni Chunen / George
Re:[VOTE] Release apache-kylin-3.0.2 (RC1)
Hi Kylin team, I am sad to say that I found one patch that was not included in either of the two release candidates, which causes some serious Hive integration errors, for example with Glue support. Here is the patch link: https://github.com/apache/kylin/pull/1115/ . So my vote is: -1. -- Best wishes to you ! From :Xiaoxiang Yu At 2020-05-11 20:36:42, "George Ni" wrote: >Hi all, > > > >I have created a build for Apache Kylin 3.0.2, release candidate 1. > > > >Changes highlights: > >[KYLIN-4390] - Update tomcat to 7.0.100 > >[KYLIN-4426] - Refine CliCommandExecutor > >[KYLIN-4206] - Support Glue as Hive Metatdata > > > >Thanks to everyone who has contributed to this release. > >Here are the release notes: > >https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12346975==12316121 > > > >The commit to being voted upon: > >https://github.com/apache/kylin/commit/d0d3e124372991331d96f881b2361b865bf4f2d9 > > > >Its hash is d0d3e124372991331d96f881b2361b865bf4f2d9. > > > >The artifacts to be voted on, including the source package and four > >pre-compiled binary packages are located here: > >https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-3.0.2-rc1/ > > > >The hash of the artifacts are as follows: > >apache-kylin-3.0.2-source-release.zip.sha256 >7f68ba3d9ffd73cb405cc1bea6b14f988274e851e1be0e93c75682120308c994 > >apache-kylin-3.0.2-bin-hbase1x.tar.gz.sha256 >3119db9f3fdcf530a031d31f605148da092dc41be8c69194b609766091915ea0 > >apache-kylin-3.0.2-bin-cdh57.tar.gz.sha256 >7a4f2f1aeb66d68012b42e03f0b92ad7a8abd3d5b6afcfbc9a3c27c9c3e7a219 > >apache-kylin-3.0.2-bin-hadoop3.tar.gz.sha256 >7678390303e03c98fcf2d1233d21d95ac04d3e36424ba7f2b23155103d089c7b > >apache-kylin-3.0.2-bin-cdh60.tar.gz.sha256 >db5998d22679b19cf038b707faf6ad70178742b788eec2dcb215ccfb443c3433 > > > >A staged Maven repository is available for review at: > >https://repository.apache.org/content/repositories/orgapachekylin-1076/ > > > >Release artifacts are signed with the following key: > 
>https://people.apache.org/keys/committer/nic.asc > > > >Please vote on releasing this package as Apache Kylin 3.0.2. > > > >The vote is open for the next 72 hours and passes if a majority of > >at least three +1 PMC votes are cast. > > > >[ ] +1 Release this package as Apache Kylin 3.0.2 > >[ ] 0 I don't feel strongly about it, but I'm okay with the release > >[ ] -1 Do not release this package because... > > > > > >Here is my vote: > > > >+1 (binding) > >-- > >- > >Best regards, > > > >Ni Chunen / George
Re:Re:[VOTE] Release apache-kylin-4.0.0-alpha (RC1)
+1 mvn test passed on my machine; happy path passed on CDH 5.7. -- Best wishes to you ! From :Xiaoxiang Yu On 2020-09-09 08:21:57, "恩爸" <441586...@qq.com> wrote: >+1 from my side. (non-binding) > > > > > > > >Best regards, >Zhichao Zhang > > > > > > > > > >--Original Message-- >From: > "dev" > Sent: Tuesday, September 8, 2020, 11:49 PM >To: "dev" >Subject: [VOTE] Release apache-kylin-4.0.0-alpha (RC1) > > > >Hi all, > >I have created a build for Apache Kylin 4.0.0-alpha, release candidate >1. Please note, this release is built on kylin-on-parquet-v2 branch. > > >Changes highlights: >[KYLIN-4213] - The new build engine with Spark-SQL >[KYLIN-4450] - Add the feature that adjusting spark driver memory adaptively >[KYLIN-4458] - FilePruner prune shards >[KYLIN-4475] - Support intersect count for Kylin on Parquet >[KYLIN-4462] - Support Count Distinct,TopN and Percentile by kylin on Parquet >[KYLIN-4713] - Support use diff spark schedule pool for diff query >[KYLIN-4468] - Support Percentile by kylin on Parquet >[KYLIN-4662] - Migrate from third-party Spark to offical Apache Spark >[KYLIN-4701] - Upgrade front-end from HBase Storage to Parquet Storage >[KYLIN-4644] - New tool to clean up intermediate files for Kylin 4.0 >[KYLIN-4744] - Add tracking URL for build spark job on yarn > >Thanks to everyone who has contributed to this release. > >Here are the release >notes:https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316121version=12348093 > > >The commit to being voted >upon:https://github.com/apache/kylin/commit/a285f9a5b84affc36c5466ce5a1b2fcdb4348b37 > >Its hash is a285f9a5b84affc36c5466ce5a1b2fcdb4348b37. 
> >The artifacts to be voted on, including the source package and two >pre-compiled binary packages are located >here:https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-4.0.0-alpha-rc1/ > >The hash of the artifacts are as follows: >apache-kylin-4.0.0-alpha-source-release.zip.sha256 >f98da070a9839251c8cf3806c274c1191bb11d0b251288b4d2586e034f6ac291 >apache-kylin-4.0.0-alpha-bin-hadoop2.tar.gz.sha256 >8075af2608b62177f04bc5a528194c555959775ed69ae80e5ccaf9a37ec1bf74 >apache-kylin-4.0.0-alpha-bin-cdh57.tar.gz.sha256 >f98da070a9839251c8cf3806c274c1191bb11d0b251288b4d2586e034f6ac291 > > >A staged Maven repository is available for review >at:https://repository.apache.org/content/repositories/orgapachekylin-1081/ > >Release artifacts are signed with the following >key:https://people.apache.org/keys/committer/nic.asc > > >Please vote on releasing this package as Apache Kylin 4.0.0-alpha. > > >The vote is open for the next 72 hours and passes if a majority of at >least three +1 binding votes are cast. > > >[ ] +1 Release this package as Apache Kylin 4.0.0-alpha >[ ] 0 I don't feel strongly about it, but I'm okay with the release >[ ] -1 Do not release this package because... > > >Here is my vote: > >+1 (binding) > >- > >Best regards, >Ni Chunen / George
Re:Kylin v4 query engine tuning
Hi, Could you please check this doc: https://cwiki.apache.org/confluence/display/KYLIN/Improve+query+performance+by+setting+shard+by+column . -- Best wishes to you ! From :Xiaoxiang Yu At 2020-10-05 20:11:42, "hubert stefani" wrote: >Hi, >We are currently using Kylin v4 on AWS EMR for tests and benchmarks. > >We have successfully optimized the building parameters to speed-up cube >building. > >We are now searching for tuning tips regarding the spark query engine >(sparder) : changing parameters (memory, cpu) doesn't seem to have any effect >on performances ( and nothing seems to change in the monitoring views). > >do you have any recommendation ? > >Regards, >Hubert.
Re: new committer: Rupeng Wang
Congrats to Rupeng. 敏丞 Email: hit_la...@126.com On 10/14/2020 22:25, ShaoFeng Shi wrote: The Project Management Committee (PMC) for Apache Kylin has invited Rupeng Wang (王汝鹏, wangrup...@apache.org) to become a committer and we are pleased to announce that he has accepted. Being a committer enables easier contribution to the project since there is no need to go via the patch submission process. This should enable better productivity. Congratulations, Rupeng! Best regards, Shaofeng Shi 史少锋 Apache Kylin PMC Email: shaofeng...@apache.org Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html Join Kylin user mail group: user-subscr...@kylin.apache.org Join Kylin dev mail group: dev-subscr...@kylin.apache.org
[VOTE] Release apache-kylin-3.1.1 (RC1)
Hi all, I have created a build for Apache Kylin 3.1.1, release candidate 1. Change highlights: [KYLIN-4612] - Support job status write to kafka [KYLIN-4712] - Optimize CubeMetaIngester.java CLI [KYLIN-4657] - dead-loop in org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork [KYLIN-4688] - Too many tmp files in HDFS tmp directory [KYLIN-4619] - Make shrunken dict able to coexist with mr-hive global dict Thanks to everyone who has contributed to this release. Here are the release notes: https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316121&version=12348354 The commit being voted upon: https://github.com/apache/kylin/commit/d8f5b1b40da42401df90f6205e5f650be05c81c4 Its hash is d8f5b1b40da42401df90f6205e5f650be05c81c4. The artifacts to be voted on, including the source package and four pre-compiled binary packages, are located here: https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-3.1.1-rc1/ The hashes of the artifacts are as follows: apache-kylin-3.1.1-source-release.zip.sha256 1f4e28dd53e2ef72faf40c3313f6a53d61205000250a57658d45800ad243594a apache-kylin-3.1.1-bin-hbase1x.tar.gz.sha256 23dcc21c3aa3d496afe39749a2e6832e3aeb4cabc83819a283a1468d70248302 apache-kylin-3.1.1-bin-cdh57.tar.gz.sha256 a0d50fb19f11918a9849ab93bd7a6033ae0e8a7fa5ffcfd7c4e8b5889e4b4829 apache-kylin-3.1.1-bin-cdh60.tar.gz.sha256 856cb8e3fbb1a3593121e3ba9c9f5b528ff96d156fd0648fa3ee71804d946283 apache-kylin-3.1.1-bin-hadoop3.tar.gz.sha256 4a0090acaa627e3c2611a1827ab49b822c33a43fc316b26e9efb0a0117031ddf A staged Maven repository is available for review at: https://repository.apache.org/content/repositories/orgapachekylin-1083/ Release artifacts are signed with the following key: https://people.apache.org/keys/committer/xxyu.asc Please vote on releasing this package as Apache Kylin 3.1.1. The vote is open for the next 72 hours and passes if a majority of at least three +1 binding votes are cast. 
[ ] +1 Release this package as Apache Kylin 3.1.1 [ ] 0 I don't feel strongly about it, but I'm okay with the release [ ] -1 Do not release this package because... Here is my vote: +1 (binding) -- Best wishes to you ! From :Xiaoxiang Yu
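For anyone checking the artifacts before voting, the SHA-256 comparison can be sketched in Python roughly as follows. This is only a minimal sketch: it assumes the artifact has already been downloaded to a local path and that the expected digest is copied from the vote email. The function names are illustrative and are not part of any Kylin release tooling.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare the local digest with the digest published in the vote email."""
    actual = sha256_of_file(path)
    # Normalize whitespace and case before comparing the two hex strings.
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

A call such as `matches_published_hash("apache-kylin-3.1.1-source-release.zip", "<digest from the email>")` would then return `True` only when the two digests agree. Checking the GPG signature against the release key is a separate step that this sketch does not cover.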
[RESULT][VOTE] Release apache-kylin-3.1.1 (RC1)
Thanks to everyone who has tested the release candidate and given their comments and votes. The tally is as follows. 3 binding +1s: Chunen Ni Shaofeng Shi Xiaoxiang Yu 6 non-binding +1s: Yaqian Zhang Rupeng Wang Chuxiao Zhichao Zhang Chao Long Johnson No 0s or -1s. Therefore I am delighted to announce that the proposal to release Apache-Kylin-3.1.1 has passed. -- Best wishes to you ! From :Xiaoxiang Yu
Re:Query over Rest API results in bad query
Hello, What kind of exception/error did you meet? Could you please share some more of the log with us? Actually, "org.apache.kylin.rest.service.BadQueryDetector - Detect bad query." does not provide enough information for cause analysis. -- Best wishes to you ! From :Xiaoxiang Yu At 2020-10-17 22:45:34, "Aditya Rohilla" wrote: >Hey, > > > >I am trying to send this query : > > > >{ "sql": "select process, avg(w) as `avg_w`, sum(vol) as `sum_vol` from >WHY_XYZ where id in ('BLA') and balance_date between '2020-03-06' and >'2020-09-06' group by process order by process asc, "offset": 0, "limit": >1000, "acceptPartial": false, "project": null } > > > >And it results in bad query as shown in the logs: > > > >2020-10-15T22:29:37,092 INFO [BadQueryDetector] >org.apache.kylin.rest.service.BadQueryDetector - Detect bad query. > > >Any idea how I should send the sql query? > >I am using mysql as sqldialect. > > > >Thanks and Regards, > >Aditya Rohilla
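One thing that may be worth ruling out while collecting more logs is the request body itself: in the payload as quoted, the "sql" string is never closed before "offset", so the body is not valid JSON as written. A minimal Python sketch that leaves the quoting to `json.dumps` is shown below. The field names are the ones from the quoted request; the helper name, the simplified SQL, and the project name `my_project` are hypothetical, and this is not a confirmed explanation of the "bad query" log line.

```python
import json


def build_query_body(sql, project, offset=0, limit=1000, accept_partial=False):
    """Serialize a query request; json.dumps handles quoting and escaping."""
    payload = {
        "sql": sql,
        "offset": offset,
        "limit": limit,
        "acceptPartial": accept_partial,
        "project": project,
    }
    return json.dumps(payload)


body = build_query_body(
    "select process, sum(vol) as sum_vol from WHY_XYZ group by process",
    project="my_project",  # hypothetical project name
)
# Parsing the body back confirms that it is well-formed JSON.
assert json.loads(body)["limit"] == 1000
```

Building the body from a dictionary instead of a hand-written string avoids unbalanced quotes regardless of what the SQL text contains.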
[SECURITY][CVE-2020-13937] Unauthenticated Configuration Disclosure
Versions Affected: Kylin 2.0.0, 2.1.0, 2.2.0, 2.3.0, 2.3.1, 2.3.2, 2.4.0, 2.4.1, 2.5.0, 2.5.1, 2.5.2, 2.6.0, 2.6.1, 2.6.2, 2.6.3, 2.6.4, 2.6.5, 2.6.6, 3.0.0-alpha, 3.0.0-alpha2, 3.0.0-beta, 3.0.0, 3.0.1, 3.0.2, 3.1.0, 4.0.0-alpha. Description: Kylin has a RESTful API that exposes Kylin's configuration information without any authentication, so confidential configuration entries can be disclosed to anyone. Mitigation: Users can edit "$KYLIN_HOME/WEB-INF/classes/kylinSecurity.xml" and remove this line "". After that, restart all Kylin instances for the change to take effect. Alternatively, upgrade Kylin to 3.1.1. Credit: This issue was discovered by Ngo Wei Lin (@Creastery) of STAR Labs (@starlabs_sg). -- Best wishes to you ! From :Xiaoxiang Yu
Re: Failed to enable Real-time OLAP cube(SSL configuration not work)
After some research, I suggest adding "kylin.source.kafka.config-override.xxx" properties for the consumer-related settings in the cube-level configuration and trying again. If you have some time, you can check the source code at https://github.com/apache/kylin/blob/fb4cdb32828a6508dcb8fd2cd953762dbd8a7e02/stream-source-kafka/src/main/java/org/apache/kylin/stream/source/kafka/KafkaSource.java#L75 to confirm. -- Best wishes to you ! From :Xiaoxiang Yu At 2020-10-20 15:10:06, "Xiaoxiang Yu" wrote: >Hi, >1. From "Caused by: org.apache.kafka.common.errors.TimeoutException: > Timeout expired while fetching topic metadata" in your log files, I guess you > did not configure kafka broker list in correct way. Please >double check this doc : >http://kylin.apache.org/docs/tutorial/realtime_olap.html . >2. kylin-kafka-consumer.xml won't take effect for Realtime OLAP because it > is used for NRT ( http://kylin.apache.org/docs/tutorial/cube_streaming.html > ) . > >-- > >Best wishes to you ! >From :Xiaoxiang Yu > >At 2020-10-20 14:18:58, "张敏" wrote: > >Hi, >Sorry about the screenshots. >I attached kylin-kafka-consumer.xml & kylin instance log kylin.out in the > attachment. Receiver instances have no log actually. >Thank you a lot. > >zhang_...@startimes.com.cn >On 10/20/2020 12:06, Xiaoxiang Yu wrote: >Hi, >I cannot see your screenshots, maybe you can try to use plain text format > to share with us your error messages/exceptions, I prefer you to upload logs > both from kylin instance & receiver instances. > >-- > >Best wishes to you ! 
>From :Xiaoxiang Yu > >At 2020-10-20 10:57:18, "张敏" wrote: > >encryption between clients and brokers in my kafka cluster : >- port 9092: Plaintext >- port 9094: TLS > >it's all good to test with kafka-console-producer.sh and >kafka-console-consumer.sh on the machine where kylin is located > >it's also ok when specify 9092 in kylin, like cube kylin_test_cube, while >there is no other configuration > >but failed to enable kylin_ssl_test_cube which is ssl encryption(port 9094), I >did update conf/kylin-kafka-consumer.xml > >Did I make any wrong configuration? thank you
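To make the "kylin.source.kafka.config-override.xxx" suggestion in this thread more concrete, here is a small Python sketch that prefixes ordinary Kafka consumer properties so they can be pasted into the cube-level configuration. The prefix is the one named in the reply; the SSL property names are standard Kafka client settings, while the truststore path, the password, and the helper name are placeholders. Whether a given property is picked up this way should still be confirmed against KafkaSource.java.

```python
PREFIX = "kylin.source.kafka.config-override."


def to_cube_overrides(consumer_props):
    """Prefix each Kafka consumer property for use as a cube-level override."""
    return {PREFIX + key: value for key, value in consumer_props.items()}


ssl_props = {
    "security.protocol": "SSL",
    "ssl.truststore.location": "/path/to/truststore.jks",  # placeholder path
    "ssl.truststore.password": "changeit",  # placeholder password
}

# Print key=value pairs that could be entered as cube-level properties.
for name, value in sorted(to_cube_overrides(ssl_props).items()):
    print(f"{name}={value}")
```

Keeping the plain consumer properties in one dictionary makes it easy to test them first with kafka-console-consumer.sh and then reuse the same values for the cube.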
Re: Failed to enable Real-time OLAP cube(SSL configuration not work)
Hi, 1. From "Caused by: org.apache.kafka.common.errors.TimeoutException: Timeout expired while fetching topic metadata" in your log files, I guess you did not configure the Kafka broker list correctly. Please double check this doc : http://kylin.apache.org/docs/tutorial/realtime_olap.html . 2. kylin-kafka-consumer.xml won't take effect for Realtime OLAP because it is used for NRT ( http://kylin.apache.org/docs/tutorial/cube_streaming.html ) . -- Best wishes to you ! From :Xiaoxiang Yu At 2020-10-20 14:18:58, "张敏" wrote: Hi, Sorry about the screenshots. I attached kylin-kafka-consumer.xml & kylin instance log kylin.out in the attachment. Receiver instances have no log actually. Thank you a lot. zhang_...@startimes.com.cn On 10/20/2020 12:06, Xiaoxiang Yu wrote: Hi, I cannot see your screenshots, maybe you can try to use plain text format to share with us your error messages/exceptions, I prefer you to upload logs both from kylin instance & receiver instances. -- Best wishes to you ! From :Xiaoxiang Yu At 2020-10-20 10:57:18, "张敏" wrote: encryption between clients and brokers in my kafka cluster : - port 9092: Plaintext - port 9094: TLS it's all good to test with kafka-console-producer.sh and kafka-console-consumer.sh on the machine where kylin is located it's also ok when specify 9092 in kylin, like cube kylin_test_cube, while there is no other configuration but failed to enable kylin_ssl_test_cube which is ssl encryption(port 9094), I did update conf/kylin-kafka-consumer.xml Did I make any wrong configuration? thank you
Re: Failed to enable Real-time OLAP cube(SSL configuration not work)
Glad to hear that. Thanks for the update. -- Best wishes to you ! From :Xiaoxiang Yu At 2020-10-20 17:30:00, "张敏" wrote: It works! thank you : ) 张敏 zhang_...@startimes.com.cn On 10/20/2020 16:37, Xiaoxiang Yu wrote: After research, I guess you can try to add "kylin.source.kafka.config-override.xxx" for consumer related properties in Cube level configuration and try again. If you have some time, you can check source code at https://github.com/apache/kylin/blob/fb4cdb32828a6508dcb8fd2cd953762dbd8a7e02/stream-source-kafka/src/main/java/org/apache/kylin/stream/source/kafka/KafkaSource.java#L75 to confirm . -- Best wishes to you ! From :Xiaoxiang Yu At 2020-10-20 15:10:06, "Xiaoxiang Yu" wrote: >Hi, >1. From "Caused by: org.apache.kafka.common.errors.TimeoutException: > Timeout expired while fetching topic metadata" in your log files, I guess you > did not configure kafka broker list in correct way. Please >double check this doc : >http://kylin.apache.org/docs/tutorial/realtime_olap.html . >2. kylin-kafka-consumer.xml won't take effect for Realtime OLAP because it > is used for NRT ( http://kylin.apache.org/docs/tutorial/cube_streaming.html > ) . > >-- > >Best wishes to you ! >From :Xiaoxiang Yu > >At 2020-10-20 14:18:58, "张敏" wrote: > >Hi, >Sorry about the screenshots. >I attached kylin-kafka-consumer.xml & kylin instance log kylin.out in the > attachment. Receiver instances have no log actually. >Thank you a lot. > >zhang_...@startimes.com.cn >On 10/20/2020 12:06, Xiaoxiang Yu wrote: >Hi, >I cannot see your screenshots, maybe you can try to use plain text format > to share with us your error messages/exceptions, I prefer you to upload logs > both from kylin instance & receiver instances. > >-- > >Best wishes to you ! 
>From :Xiaoxiang Yu > >At 2020-10-20 10:57:18, "张敏" wrote: > >encryption between clients and brokers in my kafka cluster : >- port 9092: Plaintext >- port 9094: TLS > >it's all good to test with kafka-console-producer.sh and >kafka-console-consumer.sh on the machine where kylin is located > >it's also ok when specify 9092 in kylin, like cube kylin_test_cube, while >there is no other configuration > >but failed to enable kylin_ssl_test_cube which is ssl encryption(port 9094), I >did update conf/kylin-kafka-consumer.xml > >Did I make any wrong configuration? thank you
Re: Hello, how do I compile the v3.1.0-cdh6.0/cdh6.1 version?
Hi, Please check out the "master-hadoop3" branch and use "build/script/package.sh -P cdh60" to build the Kylin package. -- Best wishes to you ! From :Xiaoxiang Yu At 2020-10-14 11:18:48, "李 小元" wrote: >Which branch should I switch to? > >Can I use the IDEA maven package command, or should I use the commands below? > > > >cd kylin > >build/script/package.sh -P cdh5.7 > >cd kylin > >build/script/package.sh -P cdh6.0 > > >Thank you!!!
[Announce] Apache Kylin 3.1.1 released
The Apache Kylin team is pleased to announce the immediate availability of the 3.1.1 release. This is a bugfix release after 3.1.0, with 21 bug fixes and 37 enhancements. All of the changes in this release can be found in: https://kylin.apache.org/docs/release_notes.html You can download the source release and binary packages from Apache Kylin's download page: https://kylin.apache.org/download/ Apache Kylin is an open-source Distributed Analytical Data Warehouse for Big Data; it was designed to provide OLAP (Online Analytical Processing) capability in the big data era. By renovating the multi-dimensional cube and precalculation technology on Hadoop and Spark, Kylin is able to achieve near-constant query speed regardless of the ever-growing data volume. Reducing query latency from minutes to sub-second, Kylin brings online analytics back to big data. Apache Kylin lets you query billions of rows at sub-second latency in 3 steps: 1. Identify a Star/Snowflake Schema on Hadoop. 2. Build Cube from the identified tables. 3. Query using ANSI-SQL and get results in sub-second, via ODBC, JDBC or RESTful API. Thanks to everyone who has contributed to this release. We welcome your help and feedback. For more information on how to report problems, and to get involved, visit the project website at https://kylin.apache.org/ -- Best wishes to you ! From :Xiaoxiang Yu