Hi all,
I'm the maintainer of the OrientDB JDBC driver.

We are trying to load data into Spark from an OrientDB database using the JDBC driver.

I'm facing some issues:


To gather metadata, Spark performs a "test query" of this form: select
* from TABLE_NAME where 1=0
For this case, I wrote a workaround inside the driver that removes
where 1=0 and replaces it with LIMIT 1.
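
A minimal sketch of that kind of rewrite, assuming the test query always ends with the where 1=0 clause. The class and method names here are hypothetical and are not taken from the actual driver code:

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not the real OrientDB JDBC driver code.
final class MetadataQueryRewriter {

    // Matches a trailing "WHERE 1=0" in any letter case, with optional
    // whitespace around "=" and at the end of the statement.
    private static final Pattern WHERE_FALSE =
        Pattern.compile("\\s+WHERE\\s+1\\s*=\\s*0\\s*$", Pattern.CASE_INSENSITIVE);

    // Replaces the always-false predicate with LIMIT 1.
    // Queries that do not end with the predicate are returned unchanged.
    static String rewrite(String sql) {
        return WHERE_FALSE.matcher(sql).replaceFirst(" LIMIT 1");
    }

    public static void main(String[] args) {
        System.out.println(rewrite("select * from Item where 1=0"));
    }
}
```

Note that LIMIT 1 returns a row where the original predicate returns none, so this only makes sense if the caller reads nothing but the result set metadata.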

After that, it performs a query with each field name wrapped in
double quotes:

SELECT "stringKey", "intKey" FROM Item

In OrientDB's SQL dialect, double quotes delimit a string literal, so
for each record of the result set the query returns the literals
stringKey and intKey as values:

row 1: stringKey:stringKey, intKey:intKey
row 2: stringKey:stringKey, intKey:intKey
row 3: stringKey:stringKey, intKey:intKey
....


Is there a way to configure SQLContext to avoid double-quoting
field names?

I'm using Java with Spark 1.6.2:

Map<String, String> options = new HashMap<String, String>() {{
  put("url", "jdbc:orient:plocal:./target/databases/sparkTest");
  put("dbtable", "Item");
}};

SQLContext sqlCtx = new SQLContext(ctx);

DataFrame jdbcDF = sqlCtx.read().format("jdbc").options(options).load();


I found that someone has the same problem with SAS JDBC.
As a workaround I will implement a query cleaner inside the driver,
but an option to configure the quoting character would be better.
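
To make the idea concrete, here is a rough sketch of what such a query cleaner could look like. It assumes the quoted names only need to be stripped in the projection list (between SELECT and FROM) and that field names are plain alphanumeric identifiers. The class name is hypothetical, and this is not the driver's actual implementation:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical query cleaner for illustration only.
final class IdentifierUnquoter {

    // Splits a statement into "SELECT ", the projection list, and " FROM ...".
    private static final Pattern PROJECTION = Pattern.compile(
        "^(\\s*SELECT\\s+)(.*?)(\\s+FROM\\s.*)$",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // A double-quoted plain identifier such as "stringKey".
    private static final Pattern QUOTED_NAME =
        Pattern.compile("\"([A-Za-z_][A-Za-z0-9_]*)\"");

    // Removes double quotes around field names in the projection list only,
    // so string literals after FROM (e.g. in a WHERE clause) are untouched.
    static String unquote(String sql) {
        Matcher m = PROJECTION.matcher(sql);
        if (!m.matches()) {
            return sql; // not a SELECT ... FROM ... statement, leave as-is
        }
        String projection = QUOTED_NAME.matcher(m.group(2)).replaceAll("$1");
        return m.group(1) + projection + m.group(3);
    }

    public static void main(String[] args) {
        System.out.println(unquote("SELECT \"stringKey\", \"intKey\" FROM Item"));
    }
}
```

Limiting the rewrite to the projection list is a deliberate choice: a blanket removal of double quotes would also alter genuine string values elsewhere in the statement.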

Regards,
RF

-- 
Roberto Franchini
"The impossible is inevitable"
https://github.com/robfrank/ https://twitter.com/robfrankie
hangout:ro.franchini skype:ro.franchini
