Hi,
I'm working on integrating Apache Drill with Apache Spark using Drill's JDBC
driver. I'm trying to run a simple SELECT * against a Drill table through
spark.sqlContext.load with the JDBC data source. I'm running the following
in the Spark shell:
> ./bin/spark-shell \
    --driver-class-path /home/ubuntu/dir/spark/jars/jackson-databind-2.6.5.jar \
    --packages org.apache.drill.exec:drill-jdbc-all:1.10.0
scala> val options = Map[String, String](
         "driver"  -> "org.apache.drill.jdbc.Driver",
         "url"     -> "jdbc:drill:drillbit=localhost:31010",
         "dbtable" -> "(SELECT * FROM dfs.root.`output.parquet`) AS Customers")
scala> val df = spark.sqlContext.load("jdbc",options)
scala> df.schema
res0: org.apache.spark.sql.types.StructType =
StructType(StructField(CustomerID,IntegerType,true),
StructField(First_name,StringType,true),
StructField(Last_name,StringType,true),
StructField(Email,StringType,true), StructField(Gender,StringType,true),
StructField(Country,StringType,true))
This returns the correct DataFrame schema, but when I run:
scala> df.show
I get the following error:
java.sql.SQLException: Failed to create prepared statement: PARSE ERROR:
Encountered "\"" at line 1, column 23.
Was expecting one of:
"STREAM" ...
"DISTINCT" ...
"ALL" ...
"*" ...
"+" ...
"-" ...
<UNSIGNED_INTEGER_LITERAL> ...
__MORE_DRILL_GRAMMAR__ ...
SQL Query SELECT * FROM (SELECT
"CustomerID","First_name","Last_name","Email","Gender","Country" FROM
(SELECT * FROM dfs.root.`output.parquet`) AS Customers ) LIMIT 0
The quote it encountered, at column 23, is the one that opens "CustomerID"
in the generated query.
I also ran the following query directly in the Drill shell:
SELECT "CustomerID" from dfs.root.`output.parquet`;
It fails with the same 'Encountered "\""' error.
Is there any way to stop Spark from wrapping my query in the outer "SELECT
"CustomerID","First_name","Last_name","Email","Gender","Country" FROM"
clause before it is pushed down to Apache Drill through the JDBC driver?
Or is there another workaround, such as making Spark stop using double
quotes around the column names?
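One direction I am considering, as an untested sketch: Drill quotes
identifiers with backticks by default, and Spark appears to let a custom
JdbcDialect override how column names are quoted. The object name
DrillDialect is my own, and I am assuming that registering it before loading
the DataFrame is enough for Spark to pick it up for jdbc:drill: URLs.

```scala
import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects}

// Hypothetical dialect (the name is mine, not part of Spark or Drill):
// quote column names with backticks instead of double quotes.
object DrillDialect extends JdbcDialect {
  // Only claim URLs that point at Drill's JDBC driver.
  override def canHandle(url: String): Boolean =
    url.startsWith("jdbc:drill:")

  // Spark's default implementation wraps the name in double quotes,
  // which is what Drill's parser rejects in the error above.
  override def quoteIdentifier(colName: String): String =
    s"`$colName`"
}

// Assumption: this must run before the DataFrame is loaded.
JdbcDialects.registerDialect(DrillDialect)
```

If this is the right hook, the generated query should select
`CustomerID`, `First_name`, and so on, rather than the double-quoted names.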
Thanks,
Luqman