[
https://issues.apache.org/jira/browse/SPARK-15486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Luca Bruno updated SPARK-15486:
-------------------------------
Description:
We're using Spark commit db75ccb (not sure whether that's unreleased 2.0.0 or
2.1.0).
We don't use Hive because we have a custom filesystem hierarchy, and we like to
use dots in table names. For this reason we use backticks when registering
temporary tables.
We have noticed that dropTempTable doesn't work as expected when the table name
is wrapped in backticks.
{code}
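from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext()
sqlc = SQLContext(sc)

# Register a one-row DataFrame under a backtick-quoted name containing dots.
data = sc.parallelize([{"col1": "val"}])
df = sqlc.createDataFrame(data)
df.registerTempTable("`a.b.c`")
print(sqlc.sql("select * from `a.b.c`").collect())

# Expected: the table is gone, so the second query should fail.
# Actual: the drop is a silent no-op and the row is printed again.
sqlc.dropTempTable("`a.b.c`")
print(sqlc.sql("select * from `a.b.c`").collect())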
{code}
The code above prints the DataFrame twice. We expected the second collect to
fail because the table should no longer exist; instead, dropTempTable fails
silently.
Removing the backticks from registerTempTable or dropTempTable is not an
option, because we'd get an invalid syntax exception.
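One way a silent failure like this could arise, offered only as a guess and not confirmed against the Spark source: registration stores the table under the parsed name (backticks stripped), while the drop looks up the raw string, backticks included. The toy model below illustrates that mismatch in plain Python. ToyCatalog and all of its methods are hypothetical names invented for this sketch; none of them are Spark APIs.

```python
# Toy model only: this is NOT Spark's catalog code. Every name here is
# hypothetical and exists purely to illustrate a possible key mismatch.
class ToyCatalog:
    def __init__(self):
        self.tables = {}

    @staticmethod
    def parse_identifier(name):
        # Strip one pair of enclosing backticks, as a SQL parser would.
        if len(name) >= 2 and name.startswith("`") and name.endswith("`"):
            return name[1:-1]
        return name

    def register(self, name, data):
        # Assumed behavior: the table is stored under the parsed name.
        self.tables[self.parse_identifier(name)] = data

    def drop_raw(self, name):
        # Assumed buggy behavior: the raw string is used as the key, and
        # pop() with a default hides the fact that nothing was removed.
        self.tables.pop(name, None)

    def drop_parsed(self, name):
        # Consistent behavior: parse the name the same way register() does.
        self.tables.pop(self.parse_identifier(name), None)


catalog = ToyCatalog()
catalog.register("`a.b.c`", [{"col1": "val"}])

catalog.drop_raw("`a.b.c`")
still_there = "a.b.c" in catalog.tables      # the raw-key drop removed nothing

catalog.drop_parsed("`a.b.c`")
gone = "a.b.c" not in catalog.tables         # the parsed-key drop removed it

print(still_there, gone)
```

If the real cause is similar, parsing the name in the drop path the same way the registration path does would make the two calls agree.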
> dropTempTable does not work with backticks
> ------------------------------------------
>
> Key: SPARK-15486
> URL: https://issues.apache.org/jira/browse/SPARK-15486
> Project: Spark
> Issue Type: Bug
> Affects Versions: 2.0.0
> Reporter: Luca Bruno