Hi Todd,

What you could do is run some SparkSQL commands immediately after the Thrift server starts up. Or does Tableau have some init SQL commands you could run?
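As a rough sketch of the first option, a small Scala client could connect to the Thrift server over JDBC and run a list of statements once the server is up. The object and method names here (ThriftInit, splitStatements, runStatements) are made up for illustration, the URL assumes the server is listening on the default localhost:10000, and the Hive JDBC driver needs to be on the classpath. Whether tables created from this connection are visible to Tableau's own connection depends on the Spark version and server settings, so treat this as something to try rather than a guaranteed fix.

```scala
import java.sql.{Connection, DriverManager, Statement}

// Hypothetical helper, not part of Spark: runs init SQL against the Thrift server.
object ThriftInit {

  // Naive split on ';'. Assumes no semicolons appear inside string literals.
  def splitStatements(script: String): List[String] =
    script.split(";").map(_.trim).filter(_.nonEmpty).toList

  // Assumed defaults: server on localhost:10000, no authentication configured.
  def runStatements(url: String,
                    statements: Seq[String],
                    user: String = "",
                    password: String = ""): Unit = {
    // HiveServer2 JDBC driver, which the Spark Thrift server is compatible with.
    Class.forName("org.apache.hive.jdbc.HiveDriver")
    val conn: Connection = DriverManager.getConnection(url, user, password)
    try {
      val stmt: Statement = conn.createStatement()
      try statements.foreach(sql => stmt.execute(sql))
      finally stmt.close()
    } finally conn.close()
  }
}
```

Usage would look something like ThriftInit.runStatements("jdbc:hive2://localhost:10000", ThriftInit.splitStatements(script)), where script holds your SQL statements separated by semicolons.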
You can actually load data using SQL, such as:

    create temporary table people using org.apache.spark.sql.json options (path 'examples/src/main/resources/people.json')
    cache table people
    create temporary table users using org.apache.spark.sql.parquet options (path 'examples/src/main/resources/users.parquet')
    cache table users

From: Todd Nist
Date: Tuesday, February 10, 2015 at 3:03 PM
To: "user@spark.apache.org"
Subject: SparkSQL + Tableau Connector

Hi,

I'm trying to understand what the Tableau connector for SparkSQL is able to access, and how. My understanding is that it needs to connect to the Thrift server, and I am not sure whether it exposes Parquet, JSON, or SchemaRDDs, or only schemas defined in the Hive metastore.

For example, I do the following from the spark-shell, which generates a SchemaRDD from a CSV file and saves it as a JSON file as well as a Parquet file:

    import org.apache.spark.sql.SQLContext
    import com.databricks.spark.csv._

    val sqlContext = new SQLContext(sc)
    val test = sqlContext.csvFile("/data/test.csv")
    test.toJSON.saveAsTextFile("/data/out")
    test.saveAsParquetFile("/data/out")

When I connect from Tableau, the only thing I see is the "default" schema and nothing in the tables section.

So my questions are:

1. Can the connector fetch or query SchemaRDDs saved to Parquet or JSON files?
2. Do I need to do something to expose these via Hive / the metastore other than creating a table in Hive?
3. Does the Thrift server need to be configured to expose these in some fashion? This is somewhat related to question 2.

TIA for the assistance.

-Todd