[ https://issues.apache.org/jira/browse/SPARK-8628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Santiago M. Mola updated SPARK-8628:
------------------------------------
Description:
SPARK-5009 introduced the following code in AbstractSparkSQLParser:
{code}
def parse(input: String): LogicalPlan = {
  // Initialize the Keywords.
  lexical.initialize(reservedWords)
  phrase(start)(new lexical.Scanner(input)) match {
    case Success(plan, _) => plan
    case failureOrError => sys.error(failureOrError.toString)
  }
}
{code}
The corresponding initialize method in SqlLexical is not thread-safe:
{code}
/* This is a work around to support the lazy setting */
def initialize(keywords: Seq[String]): Unit = {
  reserved.clear()
  reserved ++= keywords
}
{code}
I'm hitting this when parsing multiple SQL queries concurrently. Every call to
parse re-runs initialize, which clears the shared reserved keyword collection
before refilling it. Any other query being parsed in that window sees an empty
or partially filled collection, so its keywords are recognized as identifiers
and it fails to parse.
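
To make the interleaving concrete, here is a minimal, self-contained sketch. RacyLexical, SafeLexical, RaceSketch and the classify method are hypothetical stand-ins written for illustration; they are not Spark classes and do not use the parser-combinator library. The sketch replays the bad ordering on a single thread so the outcome is deterministic, and then shows one possible way to avoid it (building the keyword set once and never mutating it). This is an assumption about a reasonable fix, not necessarily the change that should go into Spark.

```scala
import scala.collection.mutable

// Hypothetical stand-in for SqlLexical: keywords live in shared mutable state.
class RacyLexical {
  val reserved = mutable.HashSet[String]()

  // Same shape as the initialize method quoted above.
  def initialize(keywords: Seq[String]): Unit = {
    reserved.clear()
    // A concurrent caller of classify() that runs here sees an empty set.
    reserved ++= keywords
  }

  def classify(word: String): String =
    if (reserved.contains(word.toUpperCase)) "keyword" else "identifier"
}

// One possible fix (an assumption, for illustration): an immutable keyword
// set that is built once at construction and never cleared.
class SafeLexical(keywords: Seq[String]) {
  private val reserved: Set[String] = keywords.map(_.toUpperCase).toSet

  def classify(word: String): String =
    if (reserved.contains(word.toUpperCase)) "keyword" else "identifier"
}

object RaceSketch {
  val keywords = Seq("SELECT", "FROM", "WHERE")

  // Replays the problematic interleaving on one thread: "thread A" has run
  // reserved.clear() but not yet reserved ++= keywords, and "thread B"
  // classifies a token in that window.
  def classifyDuringReinit(): String = {
    val lexical = new RacyLexical
    lexical.initialize(keywords)
    lexical.reserved.clear()   // first half of thread A's initialize()
    lexical.classify("SELECT") // thread B lexes here
  }

  // With the immutable variant there is no window in which the set is empty.
  def classifySafely(): String =
    new SafeLexical(keywords).classify("select")
}
```

In the racy variant the token SELECT is classified as an identifier during the window, which matches the parse failures described above. The immutable variant avoids shared mutation entirely, so no locking is needed on the lookup path.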
was:
SPARK-5009 introduced the following code:
def parse(input: String): LogicalPlan = {
// Initialize the Keywords.
lexical.initialize(reservedWords)
phrase(start)(new lexical.Scanner(input)) match {
case Success(plan, _) => plan
case failureOrError => sys.error(failureOrError.toString)
}
}
The corresponding initialize method in SqlLexical is not thread-safe:
/* This is a work around to support the lazy setting */
def initialize(keywords: Seq[String]): Unit = {
reserved.clear()
reserved ++= keywords
}
I'm hitting this when parsing multiple SQL queries concurrently. When one query
parsing starts, it empties the reserved keyword list, then a race-condition
occurs and other queries fail to parse because they recognize keywords as
identifiers.
> Race condition in AbstractSparkSQLParser.parse
> ----------------------------------------------
>
> Key: SPARK-8628
> URL: https://issues.apache.org/jira/browse/SPARK-8628
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.3.0, 1.3.1, 1.4.0
> Reporter: Santiago M. Mola
> Priority: Critical
> Labels: regression