[GitHub] spark pull request: [SPARK-6696] [SQL] Adds HiveContext.refreshTab...

liancheng Fri, 03 Apr 2015 08:20:06 -0700

Github user liancheng commented on the pull request:

    https://github.com/apache/spark/pull/5349#issuecomment-89318424
  
    I didn't write a test for this because PySpark doesn't yet have Hive 
testing infra comparable to `TestHive` on the Scala side. I opened this PR to 
make the code snippets in #5348 more complete, but then realized we can always 
use `sqlCtx.sql("REFRESH TABLE my_table")` in Python, so this PR doesn't seem 
that necessary.
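
For reference, a minimal sketch of the SQL route mentioned above. The helper 
name `refresh_table_via_sql` is hypothetical (not part of PySpark), and it 
assumes `sql_ctx` is a context whose `sql()` method accepts `REFRESH TABLE`:

```python
def refresh_table_via_sql(sql_ctx, table_name):
    """Refresh a table's cached metadata by issuing plain SQL.

    `sql_ctx` is assumed to expose a `sql(statement)` method, as
    `sqlCtx` does in this comment; nothing else about it is relied on.
    """
    # Build the statement the same way the snippet builds its DDL.
    return sql_ctx.sql("REFRESH TABLE %s" % table_name)
```

Calling `refresh_table_via_sql(sqlCtx, "my_table")` is meant to have the same 
effect as `sqlCtx.refreshTable("my_table")`.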
    
    Tested this locally with the following snippet:
    
    ```python
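    import shutil
    import tempfile
    
    # mkdtemp() creates the directory; remove it so that
    # saveAsTextFile() can create the path itself.
    jsonFile = tempfile.mkdtemp()
    shutil.rmtree(jsonFile)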
    sqlCtx.createDataFrame([("a", "b")])\
          .toJSON()\
          .saveAsTextFile(jsonFile)
    sqlCtx.sql(
        "CREATE TABLE jt " +
        "USING org.apache.spark.sql.json " +
        "OPTIONS (path '%s')" % jsonFile)
    sqlCtx.sql("SELECT * FROM jt").show()
    # _1 _2
    # a  b
    
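    # Replace the data with rows that have a third column.
    shutil.rmtree(jsonFile)
    sqlCtx.createDataFrame([("a", "b", "c")])\
          .toJSON()\
          .saveAsTextFile(jsonFile)
    # The table still reports the stale two-column schema.
    sqlCtx.sql("SELECT * FROM jt").show()
    # _1 _2
    # a  b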
    
    # Refresh the table so the new schema is picked up.
    sqlCtx.refreshTable("jt")
    sqlCtx.sql("SELECT * FROM jt").show()
    # _1 _2 _3
    # a  b  c
    ```


