[ 
https://issues.apache.org/jira/browse/SPARK-15486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luca Bruno updated SPARK-15486:
-------------------------------
    Description: 
We're using Spark commit db75ccb (not sure whether that's unreleased 2.0.0 or 2.1.0).

We don't use Hive as we have a custom filesystem hierarchy and we like to use 
dots in table names. For this reason we use backticks when registering 
temporary tables.

We have noticed that dropTempTable doesn't work as expected when using 
backticks.

{code}
from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext()
sqlc = SQLContext(sc)

# Register a one-row DataFrame under a dotted name, quoted with backticks.
data = sc.parallelize([{"col1": "val"}])
df = sqlc.createDataFrame(data)
df.registerTempTable("`a.b.c`")

print(sqlc.sql("select * from `a.b.c`").collect())
# Expected to remove the table, but the query below still succeeds.
sqlc.dropTempTable("`a.b.c`")
print(sqlc.sql("select * from `a.b.c`").collect())
{code}

The code above prints the DataFrame twice. We expect the second collect to 
fail because the table should no longer exist; instead, dropTempTable fails 
silently and the table stays registered.

Removing backticks from registerTempTable or dropTempTable is not an option 
because we'd get an invalid syntax exception.
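
The symptom is consistent with the drop path and the query path treating the 
quoted name differently, although the actual cause has not been confirmed. 
The following is a toy model in plain Python, not Spark's implementation: 
ToyCatalog, strip_backticks and every method name are hypothetical, and the 
assumption is that registration and lookup normalize the backticks while the 
drop uses the raw string.

```python
# Toy model only: NOT Spark's code. All names here are hypothetical.


def strip_backticks(name):
    """Return the identifier without one pair of surrounding backticks."""
    if len(name) >= 2 and name.startswith("`") and name.endswith("`"):
        return name[1:-1]
    return name


class ToyCatalog(object):
    """A registry of temporary tables keyed by table name."""

    def __init__(self):
        self._tables = {}

    def register(self, name, data):
        # Assumption: registration normalizes the quoted identifier.
        self._tables[strip_backticks(name)] = data

    def exists(self, name):
        # Assumption: queries normalize too, so `a.b.c` resolves.
        return strip_backticks(name) in self._tables

    def drop_raw(self, name):
        # Assumed buggy path: the raw string (backticks included) never
        # matches a stored key, and the default passed to pop() hides the miss.
        self._tables.pop(name, None)

    def drop_normalized(self, name):
        # Drop path that normalizes the name the same way as register().
        self._tables.pop(strip_backticks(name), None)


catalog = ToyCatalog()
catalog.register("`a.b.c`", [{"col1": "val"}])

catalog.drop_raw("`a.b.c`")
still_there = catalog.exists("`a.b.c`")        # table survives the drop

catalog.drop_normalized("`a.b.c`")
gone_after_fix = not catalog.exists("`a.b.c`")  # table is removed
```

In this model the silent failure comes from two choices together: 
inconsistent normalization of the name, and a drop that ignores a missing key 
rather than raising.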


> dropTempTable does not work with backticks
> ------------------------------------------
>
>                 Key: SPARK-15486
>                 URL: https://issues.apache.org/jira/browse/SPARK-15486
>             Project: Spark
>          Issue Type: Bug
>    Affects Versions: 2.0.0
>            Reporter: Luca Bruno


