[jira] [Commented] (HIVE-17395) HiveServer2 parsing a command with a lot of "("
[ https://issues.apache.org/jira/browse/HIVE-17395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961647#comment-16961647 ] Greg Senia commented on HIVE-17395: --- After spending many hours troubleshooting this enhancement seems to be the root of the LParen problem > HiveServer2 parsing a command with a lot of "(" > --- > > Key: HIVE-17395 > URL: https://issues.apache.org/jira/browse/HIVE-17395 > Project: Hive > Issue Type: Bug > Components: Beeline, HiveServer2 >Affects Versions: 2.3.0 >Reporter: dan young >Priority: Major > > Hello, > We're seeing what appears to be the same issue that was outlined in > HIVE-15388 where the query parser spends a lot of time (never returns and I > need to kill the beeline process) parsing a command with a lot of "(" . I > tried this in both 2.2 and now 2.3. > Here's an example query (this is auto generated SQL BTW) in beeline that > never completes/parses, I end up just killing the beeline process. > It looks like something similar was addressed as part of HIVE-15388. Any > ideas on how to address this? write better SQL? patch? > Regards, > Dano > {noformat} > Connected to: Apache Hive (version 2.3.0) > Driver: Hive JDBC (version 2.3.0) > Transaction isolation: TRANSACTION_REPEATABLE_READ > Beeline version 2.3.0 by Apache Hive > 0: jdbc:hive2://localhost:1/test_db> SELECT > ((UNIX_TIMESTAMP(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP(CONCAT(ADD_MONTHS(CAST(CONCAT(CAST(YEAR(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 AS STRING), '-', > LPAD(CAST(((CAST(CEIL(MONTH(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 / 3) AS INT) - 1) * 3) + 1 AS STRING), > 2, '0'), '-01 00:00:00') AS TIMESTAMP), > 1),SUBSTRING(CAST(CONCAT(CAST(YEAR(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 AS STRING), '-', > LPAD(CAST(((CAST(CEIL(MONTH(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 / 3) AS INT) - 1) * 3) + 1 AS STRING), > 2, '0'), '-01 00:00:00') AS TIMESTAMP),11))), 'MM'))), > -3),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP(CONCAT(ADD_MONTHS(CAST(CONCAT(CAST(YEAR(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 AS STRING), '-', > LPAD(CAST(((CAST(CEIL(MONTH(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 / 3) AS INT) - 1) * 3) + 1 AS STRING), > 2, '0'), '-01 00:00:00') AS TIMESTAMP), > 1),SUBSTRING(CAST(CONCAT(CAST(YEAR(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 AS STRING), '-', > LPAD(CAST(((CAST(CEIL(MONTH(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 / 3) AS INT) - 1) * 3) + 1 AS STRING), > 2, '0'), '-01 00:00:00') AS TIMESTAMP),11))), 'MM'))),11)); > When I did a jstack on the HiveServer2, it appears the be stuck/running in > the HiveParser/antlr. > "e62658bd-5ea9-43c4-898f-3048d913f192 HiveServer2-Handler-Pool: Thread-96" > #96 prio=5 os_prio=0 tid=0x7fb78c366000 nid=0x4476 runnable > [0x7fb77d7bb000] >java.lang.Thread.State: RUNNABLE > at > org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser$DFA36.specialStateTransition(HiveParser_IdentifiersParser.java:31502) > at org.antlr.runtime.DFA.predict(DFA.java:80) > at > org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.atomExpression(HiveParser_IdentifiersParser.java:6746) > at > org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceFieldExpression(HiveParser_IdentifiersParser.java:6988) > at > org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceUnaryPrefixExpression(HiveParser_IdentifiersParser.java:7324) > at >
[jira] [Commented] (HIVE-17395) HiveServer2 parsing a command with a lot of "("
[ https://issues.apache.org/jira/browse/HIVE-17395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807152#comment-16807152 ] Julian Hyde commented on HIVE-17395: HIVE-17395 seems to be a duplicate of HIVE-18624. Do others agree? > HiveServer2 parsing a command with a lot of "(" > --- > > Key: HIVE-17395 > URL: https://issues.apache.org/jira/browse/HIVE-17395 > Project: Hive > Issue Type: Bug > Components: Beeline, HiveServer2 >Affects Versions: 2.3.0 >Reporter: dan young >Priority: Major > > Hello, > We're seeing what appears to be the same issue that was outlined in > HIVE-15388 where the query parser spends a lot of time (never returns and I > need to kill the beeline process) parsing a command with a lot of "(" . I > tried this in both 2.2 and now 2.3. > Here's an example query (this is auto generated SQL BTW) in beeline that > never completes/parses, I end up just killing the beeline process. > It looks like something similar was addressed as part of HIVE-15388. Any > ideas on how to address this? write better SQL? patch? > Regards, > Dano > {noformat} > Connected to: Apache Hive (version 2.3.0) > Driver: Hive JDBC (version 2.3.0) > Transaction isolation: TRANSACTION_REPEATABLE_READ > Beeline version 2.3.0 by Apache Hive > 0: jdbc:hive2://localhost:1/test_db> SELECT > ((UNIX_TIMESTAMP(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP(CONCAT(ADD_MONTHS(CAST(CONCAT(CAST(YEAR(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 AS STRING), '-', > LPAD(CAST(((CAST(CEIL(MONTH(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 / 3) AS INT) - 1) * 3) + 1 AS STRING), > 2, '0'), '-01 00:00:00') AS TIMESTAMP), > 1),SUBSTRING(CAST(CONCAT(CAST(YEAR(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 AS STRING), '-', > LPAD(CAST(((CAST(CEIL(MONTH(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 / 3) AS INT) - 1) * 3) + 1 AS STRING), > 2, '0'), '-01 00:00:00') AS TIMESTAMP),11))), 'MM'))), > -3),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP(CONCAT(ADD_MONTHS(CAST(CONCAT(CAST(YEAR(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 AS STRING), '-', > LPAD(CAST(((CAST(CEIL(MONTH(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 / 3) AS INT) - 1) * 3) + 1 AS STRING), > 2, '0'), '-01 00:00:00') AS TIMESTAMP), > 1),SUBSTRING(CAST(CONCAT(CAST(YEAR(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 AS STRING), '-', > LPAD(CAST(((CAST(CEIL(MONTH(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 / 3) AS INT) - 1) * 3) + 1 AS STRING), > 2, '0'), '-01 00:00:00') AS TIMESTAMP),11))), 'MM'))),11)); > When I did a jstack on the HiveServer2, it appears the be stuck/running in > the HiveParser/antlr. > "e62658bd-5ea9-43c4-898f-3048d913f192 HiveServer2-Handler-Pool: Thread-96" > #96 prio=5 os_prio=0 tid=0x7fb78c366000 nid=0x4476 runnable > [0x7fb77d7bb000] >java.lang.Thread.State: RUNNABLE > at > org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser$DFA36.specialStateTransition(HiveParser_IdentifiersParser.java:31502) > at org.antlr.runtime.DFA.predict(DFA.java:80) > at > org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.atomExpression(HiveParser_IdentifiersParser.java:6746) > at > org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceFieldExpression(HiveParser_IdentifiersParser.java:6988) > at > org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceUnaryPrefixExpression(HiveParser_IdentifiersParser.java:7324) > at >
[jira] [Commented] (HIVE-17395) HiveServer2 parsing a command with a lot of "("
[ https://issues.apache.org/jira/browse/HIVE-17395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16802180#comment-16802180 ] Julian Hyde commented on HIVE-17395: [~kgyrtkirk] Thanks. I was aware of HIVE-15388 but it does look as if HIVE-18624 is a better match. This might be a duplicate. It does match the timescale when this issue appeared. > HiveServer2 parsing a command with a lot of "(" > --- > > Key: HIVE-17395 > URL: https://issues.apache.org/jira/browse/HIVE-17395 > Project: Hive > Issue Type: Bug > Components: Beeline, HiveServer2 >Affects Versions: 2.3.0 >Reporter: dan young >Priority: Major > > Hello, > We're seeing what appears to be the same issue that was outlined in > HIVE-15388 where the query parser spends a lot of time (never returns and I > need to kill the beeline process) parsing a command with a lot of "(" . I > tried this in both 2.2 and now 2.3. > Here's an example query (this is auto generated SQL BTW) in beeline that > never completes/parses, I end up just killing the beeline process. > It looks like something similar was addressed as part of HIVE-15388. Any > ideas on how to address this? write better SQL? patch? > Regards, > Dano > {noformat} > Connected to: Apache Hive (version 2.3.0) > Driver: Hive JDBC (version 2.3.0) > Transaction isolation: TRANSACTION_REPEATABLE_READ > Beeline version 2.3.0 by Apache Hive > 0: jdbc:hive2://localhost:1/test_db> SELECT > ((UNIX_TIMESTAMP(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP(CONCAT(ADD_MONTHS(CAST(CONCAT(CAST(YEAR(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 AS STRING), '-', > LPAD(CAST(((CAST(CEIL(MONTH(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 / 3) AS INT) - 1) * 3) + 1 AS STRING), > 2, '0'), '-01 00:00:00') AS TIMESTAMP), > 1),SUBSTRING(CAST(CONCAT(CAST(YEAR(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 AS STRING), '-', > LPAD(CAST(((CAST(CEIL(MONTH(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 / 3) AS INT) - 1) * 3) + 1 AS STRING), > 2, '0'), '-01 00:00:00') AS TIMESTAMP),11))), 'MM'))), > -3),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP(CONCAT(ADD_MONTHS(CAST(CONCAT(CAST(YEAR(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 AS STRING), '-', > LPAD(CAST(((CAST(CEIL(MONTH(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 / 3) AS INT) - 1) * 3) + 1 AS STRING), > 2, '0'), '-01 00:00:00') AS TIMESTAMP), > 1),SUBSTRING(CAST(CONCAT(CAST(YEAR(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 AS STRING), '-', > LPAD(CAST(((CAST(CEIL(MONTH(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 / 3) AS INT) - 1) * 3) + 1 AS STRING), > 2, '0'), '-01 00:00:00') AS TIMESTAMP),11))), 'MM'))),11)); > When I did a jstack on the HiveServer2, it appears the be stuck/running in > the HiveParser/antlr. > "e62658bd-5ea9-43c4-898f-3048d913f192 HiveServer2-Handler-Pool: Thread-96" > #96 prio=5 os_prio=0 tid=0x7fb78c366000 nid=0x4476 runnable > [0x7fb77d7bb000] >java.lang.Thread.State: RUNNABLE > at > org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser$DFA36.specialStateTransition(HiveParser_IdentifiersParser.java:31502) > at org.antlr.runtime.DFA.predict(DFA.java:80) > at > org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.atomExpression(HiveParser_IdentifiersParser.java:6746) > at > org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceFieldExpression(HiveParser_IdentifiersParser.java:6988) > at >
[jira] [Commented] (HIVE-17395) HiveServer2 parsing a command with a lot of "("
[ https://issues.apache.org/jira/browse/HIVE-17395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16801679#comment-16801679 ] Zoltan Haindrich commented on HIVE-17395: - [~julianhyde]: I'm not sure if it makes your problem go away or not; but HIVE-18624 made this issue less severe (intervals and subq both need a '(' or an explicit keyword - so in most cases the recursion spiral was started by "function") unfortunately that patch right now is only on branch-3/master; and there is no Hive releases which contain it... about antlr4it would be great but it doesn't seem possible. > HiveServer2 parsing a command with a lot of "(" > --- > > Key: HIVE-17395 > URL: https://issues.apache.org/jira/browse/HIVE-17395 > Project: Hive > Issue Type: Bug > Components: Beeline, HiveServer2 >Affects Versions: 2.3.0 >Reporter: dan young >Priority: Major > > Hello, > We're seeing what appears to be the same issue that was outlined in > HIVE-15388 where the query parser spends a lot of time (never returns and I > need to kill the beeline process) parsing a command with a lot of "(" . I > tried this in both 2.2 and now 2.3. > Here's an example query (this is auto generated SQL BTW) in beeline that > never completes/parses, I end up just killing the beeline process. > It looks like something similar was addressed as part of HIVE-15388. Any > ideas on how to address this? write better SQL? patch? > Regards, > Dano > {noformat} > Connected to: Apache Hive (version 2.3.0) > Driver: Hive JDBC (version 2.3.0) > Transaction isolation: TRANSACTION_REPEATABLE_READ > Beeline version 2.3.0 by Apache Hive > 0: jdbc:hive2://localhost:1/test_db> SELECT > ((UNIX_TIMESTAMP(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP(CONCAT(ADD_MONTHS(CAST(CONCAT(CAST(YEAR(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 AS STRING), '-', > LPAD(CAST(((CAST(CEIL(MONTH(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 / 3) AS INT) - 1) * 3) + 1 AS STRING), > 2, '0'), '-01 00:00:00') AS TIMESTAMP), > 1),SUBSTRING(CAST(CONCAT(CAST(YEAR(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 AS STRING), '-', > LPAD(CAST(((CAST(CEIL(MONTH(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 / 3) AS INT) - 1) * 3) + 1 AS STRING), > 2, '0'), '-01 00:00:00') AS TIMESTAMP),11))), 'MM'))), > -3),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP(CONCAT(ADD_MONTHS(CAST(CONCAT(CAST(YEAR(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 AS STRING), '-', > LPAD(CAST(((CAST(CEIL(MONTH(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 / 3) AS INT) - 1) * 3) + 1 AS STRING), > 2, '0'), '-01 00:00:00') AS TIMESTAMP), > 1),SUBSTRING(CAST(CONCAT(CAST(YEAR(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 AS STRING), '-', > LPAD(CAST(((CAST(CEIL(MONTH(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 / 3) AS INT) - 1) * 3) + 1 AS STRING), > 2, '0'), '-01 00:00:00') AS TIMESTAMP),11))), 'MM'))),11)); > When I did a jstack on the HiveServer2, it appears the be stuck/running in > the HiveParser/antlr. > "e62658bd-5ea9-43c4-898f-3048d913f192 HiveServer2-Handler-Pool: Thread-96" > #96 prio=5 os_prio=0 tid=0x7fb78c366000 nid=0x4476 runnable > [0x7fb77d7bb000] >java.lang.Thread.State: RUNNABLE > at > org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser$DFA36.specialStateTransition(HiveParser_IdentifiersParser.java:31502) > at org.antlr.runtime.DFA.predict(DFA.java:80) > at > org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.atomExpression(HiveParser_IdentifiersParser.java:6746)
[jira] [Commented] (HIVE-17395) HiveServer2 parsing a command with a lot of "("
[ https://issues.apache.org/jira/browse/HIVE-17395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800973#comment-16800973 ] Julian Hyde commented on HIVE-17395: I'm running into this issue also, in my testing of Looker against Hive. It seems to be a regression from earlier versions of Hive. Looker generates deeply nested expressions, and so hits this problem hard; we are recommending that our customers do not upgrade to Hive 2.2, 2.3 or 3 because of this issue. I am more of an expert on JavaCC than Antlr, but I agree with [~kgyrtkirk] that the problem seems to be lookaheads. The calls to {{org.antlr.runtime.DFA.predict}} on the stack are evidence of that. Each call to predict will be followed by a call to actually parse, so each call to predict doubles the running time. There are 12 calls, which would suggest a 4096x slowdown. I don't know whether an upgrade to antlr v4 is possible or planned. [A post on stackoverflow|https://stackoverflow.com/questions/17054285/is-it-possible-to-lookahead-in-antlr4-without-actually-matching-a-token] suggests that "=>" (the lookahead operator) is no longer necessary on antlr v4; antlr verifies lookahead as it parses. If true, that 2 ^ 12 number above would be become 1 ^ 12, a much nicer number! > HiveServer2 parsing a command with a lot of "(" > --- > > Key: HIVE-17395 > URL: https://issues.apache.org/jira/browse/HIVE-17395 > Project: Hive > Issue Type: Bug > Components: Beeline, HiveServer2 >Affects Versions: 2.3.0 >Reporter: dan young >Priority: Major > > Hello, > We're seeing what appears to be the same issue that was outlined in > HIVE-15388 where the query parser spends a lot of time (never returns and I > need to kill the beeline process) parsing a command with a lot of "(" . I > tried this in both 2.2 and now 2.3. > Here's an example query (this is auto generated SQL BTW) in beeline that > never completes/parses, I end up just killing the beeline process. > It looks like something similar was addressed as part of HIVE-15388. Any > ideas on how to address this? write better SQL? patch? > Regards, > Dano > {noformat} > Connected to: Apache Hive (version 2.3.0) > Driver: Hive JDBC (version 2.3.0) > Transaction isolation: TRANSACTION_REPEATABLE_READ > Beeline version 2.3.0 by Apache Hive > 0: jdbc:hive2://localhost:1/test_db> SELECT > ((UNIX_TIMESTAMP(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP(CONCAT(ADD_MONTHS(CAST(CONCAT(CAST(YEAR(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 AS STRING), '-', > LPAD(CAST(((CAST(CEIL(MONTH(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 / 3) AS INT) - 1) * 3) + 1 AS STRING), > 2, '0'), '-01 00:00:00') AS TIMESTAMP), > 1),SUBSTRING(CAST(CONCAT(CAST(YEAR(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 AS STRING), '-', > LPAD(CAST(((CAST(CEIL(MONTH(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 / 3) AS INT) - 1) * 3) + 1 AS STRING), > 2, '0'), '-01 00:00:00') AS TIMESTAMP),11))), 'MM'))), > -3),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP(CONCAT(ADD_MONTHS(CAST(CONCAT(CAST(YEAR(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 AS STRING), '-', > LPAD(CAST(((CAST(CEIL(MONTH(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 / 3) AS INT) - 1) * 3) + 1 AS STRING), > 2, '0'), '-01 00:00:00') AS TIMESTAMP), > 1),SUBSTRING(CAST(CONCAT(CAST(YEAR(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 AS STRING), '-', > LPAD(CAST(((CAST(CEIL(MONTH(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 / 3) AS INT) - 1) * 3) + 1 AS STRING), > 2, '0'), '-01 00:00:00') AS
[jira] [Commented] (HIVE-17395) HiveServer2 parsing a command with a lot of "("
[ https://issues.apache.org/jira/browse/HIVE-17395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16160987#comment-16160987 ] Zoltan Haindrich commented on HIVE-17395: - It seems to me that all the relevant fixes which are related to '(' parsing were already present at 2.3 release.. the problematic rule seems to be atomexpression; the parser spends quite a lot of time trying to judge whenever its a subquery/iterval or not... https://github.com/apache/hive/blob/92f9d8fb4de7896cc9161bfbd461fac4f2917a35/ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g#L412 > HiveServer2 parsing a command with a lot of "(" > --- > > Key: HIVE-17395 > URL: https://issues.apache.org/jira/browse/HIVE-17395 > Project: Hive > Issue Type: Bug > Components: Beeline, HiveServer2 >Affects Versions: 2.3.0 >Reporter: dan young > > Hello, > We're seeing what appears to be the same issue that was outlined in > HIVE-15388 where the query parser spends a lot of time (never returns and I > need to kill the beeline process) parsing a command with a lot of "(" . I > tried this in both 2.2 and now 2.3. > Here's an example query (this is auto generated SQL BTW) in beeline that > never completes/parses, I end up just killing the beeline process. > It looks like something similar was addressed as part of HIVE-15388. Any > ideas on how to address this? write better SQL? patch? > Regards, > Dano > Connected to: Apache Hive (version 2.3.0) > Driver: Hive JDBC (version 2.3.0) > Transaction isolation: TRANSACTION_REPEATABLE_READ > Beeline version 2.3.0 by Apache Hive > 0: jdbc:hive2://localhost:1/test_db> SELECT > ((UNIX_TIMESTAMP(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP(CONCAT(ADD_MONTHS(CAST(CONCAT(CAST(YEAR(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 AS STRING), '-', > LPAD(CAST(((CAST(CEIL(MONTH(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 / 3) AS INT) - 1) * 3) + 1 AS STRING), > 2, '0'), '-01 00:00:00') AS TIMESTAMP), > 1),SUBSTRING(CAST(CONCAT(CAST(YEAR(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 AS STRING), '-', > LPAD(CAST(((CAST(CEIL(MONTH(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 / 3) AS INT) - 1) * 3) + 1 AS STRING), > 2, '0'), '-01 00:00:00') AS TIMESTAMP),11))), 'MM'))), > -3),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP(CONCAT(ADD_MONTHS(CAST(CONCAT(CAST(YEAR(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 AS STRING), '-', > LPAD(CAST(((CAST(CEIL(MONTH(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 / 3) AS INT) - 1) * 3) + 1 AS STRING), > 2, '0'), '-01 00:00:00') AS TIMESTAMP), > 1),SUBSTRING(CAST(CONCAT(CAST(YEAR(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 AS STRING), '-', > LPAD(CAST(((CAST(CEIL(MONTH(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))), > -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 > 00:00:00.0'), 'MM'))),11 / 3) AS INT) - 1) * 3) + 1 AS STRING), > 2, '0'), '-01 00:00:00') AS TIMESTAMP),11))), 'MM'))),11)); > When I did a jstack on the HiveServer2, it appears the be stuck/running in > the HiveParser/antlr. > "e62658bd-5ea9-43c4-898f-3048d913f192 HiveServer2-Handler-Pool: Thread-96" > #96 prio=5 os_prio=0 tid=0x7fb78c366000 nid=0x4476 runnable > [0x7fb77d7bb000] >java.lang.Thread.State: RUNNABLE > at > org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser$DFA36.specialStateTransition(HiveParser_IdentifiersParser.java:31502) > at org.antlr.runtime.DFA.predict(DFA.java:80) > at > org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.atomExpression(HiveParser_IdentifiersParser.java:6746) > at >