[jira] [Commented] (KYLIN-5268) kylin4.0.1服务每天都会内存溢出java.lang.OutOfMemoryError: Java heap space
[ https://issues.apache.org/jira/browse/KYLIN-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17612010#comment-17612010 ] liyang commented on KYLIN-5268: --- > 如果tableau发出的奇怪sql不多固定,似乎修改源码的 org.apache.kylin.rest.util.TableauInterceptor > 类更合理 +1 > kylin4.0.1服务每天都会内存溢出java.lang.OutOfMemoryError: Java heap space > --- > > Key: KYLIN-5268 > URL: https://issues.apache.org/jira/browse/KYLIN-5268 > Project: Kylin > Issue Type: Bug >Reporter: zhenkuan_zhang >Priority: Major > Attachments: kylin_query.log > > > Hi,最近发现我们的kylin服务非常不稳定,三个节点,每天都会有节点出现内存溢出的状况,jvm从16G调到48G也同样会内存溢出。通过分析Dump 出 > java_pid34184.hprof文件,发现内存会被两个查询完全占满导致jvm内存溢出,sql是通过tableau发送来的。sql如下(该事实表行数上亿): > 1. > SELECT "自訂 SQL 查詢"."FAIL" AS "FAIL" FROM ( > SELECT DT TEST_DATE,TEST_STATION_CODE,SUBSTRING(TEST_LINE,1,6) > TEST_FLOOR,TEST_LINE,CONFIG,REGION,LOCALLIZATION, > SUBSTRING(ATL FROM 1 for POSITION('*-' IN ATL from 1)-1 ) > VENDERATL,SUBSTRING(ATL FROM POSITION('-' IN ATL FROM 5)+1 FOR 3) > WEEKATL,SUBSTRING(ATL FROM POSITION('-' IN ATL FROM 6)+1 FOR 4) > DATEATL,SUBSTRING(ATL FROM POSITION('*-' IN ATL FROM 1)+7 FOR 4) ATL , > INPUT,FAIL,RETEST FROM BI_DW.KPY_FACT_SUM_DATA > ) "自訂 SQL 查詢" > 2. > SELECT "自訂 SQL 查詢"."RETEST" AS "RETEST" FROM ( > SELECT DT TEST_DATE,TEST_STATION_CODE,SUBSTRING(TEST_LINE,1,6) > TEST_FLOOR,TEST_LINE,CONFIG,REGION,LOCALLIZATION, > SUBSTRING(ATL FROM 1 for POSITION('*-' IN ATL from 1)-1 ) > VENDERATL,SUBSTRING(ATL FROM POSITION('-' IN ATL FROM 5)+1 FOR 3) > WEEKATL,SUBSTRING(ATL FROM POSITION('-' IN ATL FROM 6)+1 FOR 4) > DATEATL,SUBSTRING(ATL FROM POSITION('*-' IN ATL FROM 1)+7 FOR 4) ATL , > INPUT,FAIL,RETEST FROM BI_DW.KPY_FACT_SUM_DATA > ) "自訂 SQL 查詢" > 一方面我们在与tableau方联系,沟通为什么会发出这么奇怪的sql。 > 另一方面希望得到kylin的支持,看看有没有什么解决方法, > 比如kylin.query.force-limit=1000 可以为 select * from table 加上 limit 1000,有没有办法限制 > select a from table。 > 或者有没有其他的解决方法。 > 期待回复,十分感谢!!! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KYLIN-5268) kylin4.0.1服务每天都会内存溢出java.lang.OutOfMemoryError: Java heap space
[ https://issues.apache.org/jira/browse/KYLIN-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607490#comment-17607490 ] Liu Zhao commented on KYLIN-5268: - hi,我刚刚看了下,如果tableau发出的奇怪sql不多固定,似乎修改源码的 org.apache.kylin.rest.util.TableauInterceptor 类更合理,相较于修改 QueryUtil.java 强制为没有limit的sql加limit,前者可以直接拦截不提交spark查询作业,后者还是会发起spark查询作业有资源消耗。 {code:java} // QueryService.java conn = QueryConnection.getConnection(sqlRequest.getProject()); String userInfo = SecurityContextHolder.getContext().getAuthentication().getName(); QueryContext context = QueryContextFacade.current(); context.setUsername(userInfo); context.setGroups(AclPermissionUtil.getCurrentUserGroups()); context.setProject(sqlRequest.getProject()); final Collection grantedAuthorities = SecurityContextHolder.getContext() .getAuthentication().getAuthorities(); for (GrantedAuthority grantedAuthority : grantedAuthorities) { userInfo += ","; userInfo += grantedAuthority.getAuthority(); } SQLResponse fakeResponse = TableauInterceptor.tableauIntercept(sqlRequest.getSql()); if (null != fakeResponse) { logger.debug("Return fake response, is exception? " + fakeResponse.getIsException()); return fakeResponse; } String correctedSql = QueryUtil.massageSql(sqlRequest.getSql(), sqlRequest.getProject(), sqlRequest.getLimit(), sqlRequest.getOffset(), conn.getSchema(), Constant.FakeCatalogName); if (!correctedSql.equals(sqlRequest.getSql())) { logger.info("The corrected query: " + correctedSql); //CAUTION: should not change sqlRequest content! //sqlRequest.setSql(correctedSql); } {code} > kylin4.0.1服务每天都会内存溢出java.lang.OutOfMemoryError: Java heap space > --- > > Key: KYLIN-5268 > URL: https://issues.apache.org/jira/browse/KYLIN-5268 > Project: Kylin > Issue Type: Bug >Reporter: zhenkuan_zhang >Priority: Major > Attachments: kylin_query.log > > > Hi,最近发现我们的kylin服务非常不稳定,三个节点,每天都会有节点出现内存溢出的状况,jvm从16G调到48G也同样会内存溢出。通过分析Dump 出 > java_pid34184.hprof文件,发现内存会被两个查询完全占满导致jvm内存溢出,sql是通过tableau发送来的。sql如下(该事实表行数上亿): > 1. > SELECT "自訂 SQL 查詢"."FAIL" AS "FAIL" FROM ( > SELECT DT TEST_DATE,TEST_STATION_CODE,SUBSTRING(TEST_LINE,1,6) > TEST_FLOOR,TEST_LINE,CONFIG,REGION,LOCALLIZATION, > SUBSTRING(ATL FROM 1 for POSITION('*-' IN ATL from 1)-1 ) > VENDERATL,SUBSTRING(ATL FROM POSITION('-' IN ATL FROM 5)+1 FOR 3) > WEEKATL,SUBSTRING(ATL FROM POSITION('-' IN ATL FROM 6)+1 FOR 4) > DATEATL,SUBSTRING(ATL FROM POSITION('*-' IN ATL FROM 1)+7 FOR 4) ATL , > INPUT,FAIL,RETEST FROM BI_DW.KPY_FACT_SUM_DATA > ) "自訂 SQL 查詢" > 2. > SELECT "自訂 SQL 查詢"."RETEST" AS "RETEST" FROM ( > SELECT DT TEST_DATE,TEST_STATION_CODE,SUBSTRING(TEST_LINE,1,6) > TEST_FLOOR,TEST_LINE,CONFIG,REGION,LOCALLIZATION, > SUBSTRING(ATL FROM 1 for POSITION('*-' IN ATL from 1)-1 ) > VENDERATL,SUBSTRING(ATL FROM POSITION('-' IN ATL FROM 5)+1 FOR 3) > WEEKATL,SUBSTRING(ATL FROM POSITION('-' IN ATL FROM 6)+1 FOR 4) > DATEATL,SUBSTRING(ATL FROM POSITION('*-' IN ATL FROM 1)+7 FOR 4) ATL , > INPUT,FAIL,RETEST FROM BI_DW.KPY_FACT_SUM_DATA > ) "自訂 SQL 查詢" > 一方面我们在与tableau方联系,沟通为什么会发出这么奇怪的sql。 > 另一方面希望得到kylin的支持,看看有没有什么解决方法, > 比如kylin.query.force-limit=1000 可以为 select * from table 加上 limit 1000,有没有办法限制 > select a from table。 > 或者有没有其他的解决方法。 > 期待回复,十分感谢!!! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KYLIN-5268) kylin4.0.1服务每天都会内存溢出java.lang.OutOfMemoryError: Java heap space
[ https://issues.apache.org/jira/browse/KYLIN-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607475#comment-17607475 ] zhenkuan_zhang commented on KYLIN-5268: --- 非常感谢您的回复。 根据您提供的方案,我们目前在跟tableau方沟通, 并设置 spark.driver.maxResultSize =500m,kylin.query.timeout-seconds=180, 关于kylin.query.force-limit 我们设置成了 5, 并修改了源码query/src/main/java/org/apache/kylin/query/util/QueryUtil.java,把判断sql是否包含*的判断去掉了,自动为所有sql加limit。 今天没有出现OOM的状况,我们会再持续观察几天,如果有更好的方案,希望可以再通知我们,谢谢! > kylin4.0.1服务每天都会内存溢出java.lang.OutOfMemoryError: Java heap space > --- > > Key: KYLIN-5268 > URL: https://issues.apache.org/jira/browse/KYLIN-5268 > Project: Kylin > Issue Type: Bug >Reporter: zhenkuan_zhang >Priority: Major > Attachments: kylin_query.log > > > Hi,最近发现我们的kylin服务非常不稳定,三个节点,每天都会有节点出现内存溢出的状况,jvm从16G调到48G也同样会内存溢出。通过分析Dump 出 > java_pid34184.hprof文件,发现内存会被两个查询完全占满导致jvm内存溢出,sql是通过tableau发送来的。sql如下(该事实表行数上亿): > 1. > SELECT "自訂 SQL 查詢"."FAIL" AS "FAIL" FROM ( > SELECT DT TEST_DATE,TEST_STATION_CODE,SUBSTRING(TEST_LINE,1,6) > TEST_FLOOR,TEST_LINE,CONFIG,REGION,LOCALLIZATION, > SUBSTRING(ATL FROM 1 for POSITION('*-' IN ATL from 1)-1 ) > VENDERATL,SUBSTRING(ATL FROM POSITION('-' IN ATL FROM 5)+1 FOR 3) > WEEKATL,SUBSTRING(ATL FROM POSITION('-' IN ATL FROM 6)+1 FOR 4) > DATEATL,SUBSTRING(ATL FROM POSITION('*-' IN ATL FROM 1)+7 FOR 4) ATL , > INPUT,FAIL,RETEST FROM BI_DW.KPY_FACT_SUM_DATA > ) "自訂 SQL 查詢" > 2. > SELECT "自訂 SQL 查詢"."RETEST" AS "RETEST" FROM ( > SELECT DT TEST_DATE,TEST_STATION_CODE,SUBSTRING(TEST_LINE,1,6) > TEST_FLOOR,TEST_LINE,CONFIG,REGION,LOCALLIZATION, > SUBSTRING(ATL FROM 1 for POSITION('*-' IN ATL from 1)-1 ) > VENDERATL,SUBSTRING(ATL FROM POSITION('-' IN ATL FROM 5)+1 FOR 3) > WEEKATL,SUBSTRING(ATL FROM POSITION('-' IN ATL FROM 6)+1 FOR 4) > DATEATL,SUBSTRING(ATL FROM POSITION('*-' IN ATL FROM 1)+7 FOR 4) ATL , > INPUT,FAIL,RETEST FROM BI_DW.KPY_FACT_SUM_DATA > ) "自訂 SQL 查詢" > 一方面我们在与tableau方联系,沟通为什么会发出这么奇怪的sql。 > 另一方面希望得到kylin的支持,看看有没有什么解决方法, > 比如kylin.query.force-limit=1000 可以为 select * from table 加上 limit 1000,有没有办法限制 > select a from table。 > 或者有没有其他的解决方法。 > 期待回复,十分感谢!!! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KYLIN-5268) kylin4.0.1服务每天都会内存溢出java.lang.OutOfMemoryError: Java heap space
[ https://issues.apache.org/jira/browse/KYLIN-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607283#comment-17607283 ] Liu Zhao commented on KYLIN-5268: - 1. 从tableau角度解决根因 2. 可以从github上拿源码自行编译,对tableau奇怪的sql拦截或者使kylin.query.force-limit对所有sql生效 3. 是否可以在 tableau kylin之间增加一层代理拦截异常sql > kylin4.0.1服务每天都会内存溢出java.lang.OutOfMemoryError: Java heap space > --- > > Key: KYLIN-5268 > URL: https://issues.apache.org/jira/browse/KYLIN-5268 > Project: Kylin > Issue Type: Bug >Reporter: zhenkuan_zhang >Priority: Major > Attachments: kylin_query.log > > > Hi,最近发现我们的kylin服务非常不稳定,三个节点,每天都会有节点出现内存溢出的状况,jvm从16G调到48G也同样会内存溢出。通过分析Dump 出 > java_pid34184.hprof文件,发现内存会被两个查询完全占满导致jvm内存溢出,sql是通过tableau发送来的。sql如下(该事实表行数上亿): > 1. > SELECT "自訂 SQL 查詢"."FAIL" AS "FAIL" FROM ( > SELECT DT TEST_DATE,TEST_STATION_CODE,SUBSTRING(TEST_LINE,1,6) > TEST_FLOOR,TEST_LINE,CONFIG,REGION,LOCALLIZATION, > SUBSTRING(ATL FROM 1 for POSITION('*-' IN ATL from 1)-1 ) > VENDERATL,SUBSTRING(ATL FROM POSITION('-' IN ATL FROM 5)+1 FOR 3) > WEEKATL,SUBSTRING(ATL FROM POSITION('-' IN ATL FROM 6)+1 FOR 4) > DATEATL,SUBSTRING(ATL FROM POSITION('*-' IN ATL FROM 1)+7 FOR 4) ATL , > INPUT,FAIL,RETEST FROM BI_DW.KPY_FACT_SUM_DATA > ) "自訂 SQL 查詢" > 2. > SELECT "自訂 SQL 查詢"."RETEST" AS "RETEST" FROM ( > SELECT DT TEST_DATE,TEST_STATION_CODE,SUBSTRING(TEST_LINE,1,6) > TEST_FLOOR,TEST_LINE,CONFIG,REGION,LOCALLIZATION, > SUBSTRING(ATL FROM 1 for POSITION('*-' IN ATL from 1)-1 ) > VENDERATL,SUBSTRING(ATL FROM POSITION('-' IN ATL FROM 5)+1 FOR 3) > WEEKATL,SUBSTRING(ATL FROM POSITION('-' IN ATL FROM 6)+1 FOR 4) > DATEATL,SUBSTRING(ATL FROM POSITION('*-' IN ATL FROM 1)+7 FOR 4) ATL , > INPUT,FAIL,RETEST FROM BI_DW.KPY_FACT_SUM_DATA > ) "自訂 SQL 查詢" > 一方面我们在与tableau方联系,沟通为什么会发出这么奇怪的sql。 > 另一方面希望得到kylin的支持,看看有没有什么解决方法, > 比如kylin.query.force-limit=1000 可以为 select * from table 加上 limit 1000,有没有办法限制 > select a from table。 > 或者有没有其他的解决方法。 > 期待回复,十分感谢!!! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KYLIN-5268) kylin4.0.1服务每天都会内存溢出java.lang.OutOfMemoryError: Java heap space
[ https://issues.apache.org/jira/browse/KYLIN-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17606831#comment-17606831 ] zhenkuan_zhang commented on KYLIN-5268: --- 2022-09-15 21:46:54,588 INFO [BadQueryDetector] service.BadQueryDetector:204 : Slow query has been running 1328.617 seconds (project:KPY_RANGER, thread: 0x42, user:h26, query id:fb8d154f-726c-3f74-bfba-486b484952a9) -- SELECT "自訂 SQL 查詢"."FAIL" AS "FAIL" FROM ()"自訂 SQL 查詢" 2022-09-15 21:46:54,647 DEBUG [BadQueryDetector] badquery.BadQueryHistoryManager:65 : Loaded 50 Bad Query(s) 2022-09-15 21:46:54,690 INFO [localhost-startStop-1-EventThread] state.ConnectionStateManager:228 : State change: LOST 2022-09-15 21:46:54,692 ERROR [localhost-startStop-1-EventThread] imps.CuratorFrameworkImpl:557 : Background operation retry gave up org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) at org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:728) at org.apache.curator.framework.imps.CuratorFrameworkImpl.processBackgroundOperation(CuratorFrameworkImpl.java:516) at org.apache.curator.framework.imps.BackgroundSyncImpl$1.processResult(BackgroundSyncImpl.java:53) at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:609) at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498) 2022-09-15 21:46:54,707 INFO [BadQueryDetector] service.BadQueryDetector:192 : Problematic thread 0x42 Query fb8d154f-726c-3f74-bfba-486b484952a9-66, query id: fb8d154f-726c-3f74-bfba-486b484952a9 at org.apache.spark.sql.execution.SparkPlan$$anon$1.next(SparkPlan.scala:284) at org.apache.spark.sql.execution.SparkPlan$$anon$1.next(SparkPlan.scala:278) at scala.collection.Iterator$class.foreach(Iterator.scala:893) at org.apache.spark.sql.execution.SparkPlan$$anon$1.foreach(SparkPlan.scala:278) at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeCollect$1.apply(SparkPlan.scala:300) at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeCollect$1.apply(SparkPlan.scala:299) at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33) at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186) at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:299) at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collectFromPlan(Dataset.scala:3389) 2022-09-15 21:46:54,708 INFO [BadQueryDetector] service.BadQueryDetector:204 : Slow query has been running 1328.68 seconds > kylin4.0.1服务每天都会内存溢出java.lang.OutOfMemoryError: Java heap space > --- > > Key: KYLIN-5268 > URL: https://issues.apache.org/jira/browse/KYLIN-5268 > Project: Kylin > Issue Type: Bug >Reporter: zhenkuan_zhang >Priority: Major > Attachments: kylin_query.log > > > Hi,最近发现我们的kylin服务非常不稳定,三个节点,每天都会有节点出现内存溢出的状况,jvm从16G调到48G也同样会内存溢出。通过分析Dump 出 > java_pid34184.hprof文件,发现内存会被两个查询完全占满导致jvm内存溢出,sql是通过tableau发送来的。sql如下(该事实表行数上亿): > 1. > SELECT "自訂 SQL 查詢"."FAIL" AS "FAIL" FROM ( > SELECT DT TEST_DATE,TEST_STATION_CODE,SUBSTRING(TEST_LINE,1,6) > TEST_FLOOR,TEST_LINE,CONFIG,REGION,LOCALLIZATION, > SUBSTRING(ATL FROM 1 for POSITION('*-' IN ATL from 1)-1 ) > VENDERATL,SUBSTRING(ATL FROM POSITION('-' IN ATL FROM 5)+1 FOR 3) > WEEKATL,SUBSTRING(ATL FROM POSITION('-' IN ATL FROM 6)+1 FOR 4) > DATEATL,SUBSTRING(ATL FROM POSITION('*-' IN ATL FROM 1)+7 FOR 4) ATL , > INPUT,FAIL,RETEST FROM BI_DW.KPY_FACT_SUM_DATA > ) "自訂 SQL 查詢" > 2. > SELECT "自訂 SQL 查詢"."RETEST" AS "RETEST" FROM ( > SELECT DT TEST_DATE,TEST_STATION_CODE,SUBSTRING(TEST_LINE,1,6) > TEST_FLOOR,TEST_LINE,CONFIG,REGION,LOCALLIZATION, > SUBSTRING(ATL FROM 1 for POSITION('*-' IN ATL from 1)-1 ) > VENDERATL,SUBSTRING(ATL FROM POSITION('-' IN ATL FROM 5)+1 FOR 3) > WEEKATL,SUBSTRING(ATL FROM POSITION('-' IN ATL FROM 6)+1 FOR 4) > DATEATL,SUBSTRING(ATL FROM POSITION('*-' IN ATL FROM 1)+7 FOR 4) ATL , > INPUT,FAIL,RETEST FROM BI_DW.KPY_FACT_SUM_DATA > ) "自訂 SQL 查詢" > 一方面我们在与tableau方联系,沟通为什么会发出这么奇怪的sql。 > 另一方面希望得到kylin的支持,看看有没有什么解决方法, > 比如kylin.query.force-limit=1000 可以为 select * from table 加上 limit 1000,有没有办法限制 > select a from table。 > 或者有没有其他的解决方法。 -- This message was sent by Atlassian Jira (v8.20.10#820010)