GitHub user liancheng commented on the pull request:
https://github.com/apache/spark/pull/5349#issuecomment-89318424
I didn't write a test for this because PySpark doesn't yet have Hive testing
infra comparable to `TestHive` on the Scala side. I opened this PR to make the
code snippets in #5348 more complete, but then realized we can always call
`sqlCtx.sql("REFRESH TABLE my_table")` from Python, so this PR doesn't seem
that necessary.
Tested this locally with the following snippet:
```python
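import shutil
import tempfile

# Assumes `sqlCtx` already exists, e.g. in the PySpark shell.
# mkdtemp() + rmtree() yields a unique path that does not exist yet,
# which saveAsTextFile() requires.
jsonFile = tempfile.mkdtemp()
shutil.rmtree(jsonFile)

# Write a two-column JSON dataset and register a table over it.
sqlCtx.createDataFrame([("a", "b")]) \
    .toJSON() \
    .saveAsTextFile(jsonFile)

sqlCtx.sql(
    "CREATE TABLE jt " +
    "USING org.apache.spark.sql.json " +
    "OPTIONS (path '%s')" % jsonFile)

sqlCtx.sql("SELECT * FROM jt").show()
# _1 _2
# a b

# Replace the data on disk with a three-column dataset.
shutil.rmtree(jsonFile)
sqlCtx.createDataFrame([("a", "b", "c")]) \
    .toJSON() \
    .saveAsTextFile(jsonFile)

# Without a refresh, the table still reports the old two-column schema.
sqlCtx.sql("SELECT * FROM jt").show()
# _1 _2
# a b

# After the refresh, the new schema is picked up.
sqlCtx.refreshTable("jt")
sqlCtx.sql("SELECT * FROM jt").show()
# _1 _2 _3
# a b c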
```
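For comparison, the SQL route mentioned above could be wrapped roughly like
this. It is only a sketch: `refresh_table_via_sql` is a hypothetical helper
name, not something in PySpark, and it assumes the context object exposes a
`sql()` method and that the table name is a trusted identifier.

```python
def refresh_table_via_sql(sql_ctx, table_name):
    """Issue REFRESH TABLE through the context's sql() method.

    Hypothetical helper for illustration only. `table_name` is
    interpolated as-is, so it must be a trusted identifier.
    """
    return sql_ctx.sql("REFRESH TABLE %s" % table_name)
```

With the snippet above, `refresh_table_via_sql(sqlCtx, "jt")` would stand in
for the `sqlCtx.refreshTable("jt")` call.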