[ https://issues.apache.org/jira/browse/SPARK-17614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15510525#comment-15510525 ]
Paul Wu commented on SPARK-17614:
---------------------------------

Thanks. I tried to register my custom dialect as follows, but execution never reaches the getTableExistsQuery() method. Could anyone help?

import org.apache.spark.sql.jdbc.JdbcDialect;

public class NRSCassandraDialect extends JdbcDialect {

    @Override
    public boolean canHandle(String url) {
        System.out.println("came here.." + url.startsWith("jdbc:cassandra"));
        return url.startsWith("jdbc:cassandra");
    }

    @Override
    public String getTableExistsQuery(String table) {
        System.out.println("query?");
        return "SELECT * from " + table + " LIMIT 1";
    }
}

--------------------------------------------------------------

import java.io.Serializable;
import java.util.Properties;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.jdbc.JdbcDialects;

public class CassJDBC implements Serializable {

    private static final org.apache.log4j.Logger LOGGER =
            org.apache.log4j.Logger.getLogger(CassJDBC.class);

    private static final String _CONNECTION_URL =
            "jdbc:cassandra://ulpd326.****.com/test?loadbalancing=DCAwareRoundRobinPolicy(%22datacenter1%22)";
    private static final String _USERNAME = "";
    private static final String _PWD = "";

    private static final SparkSession sparkSession = SparkSession.builder()
            .config("spark.sql.warehouse.dir", "file:///home/zw251y/tmp")
            .master("local[*]")
            .appName("Spark2JdbcDs")
            .getOrCreate();

    public static void main(String[] args) {
        // Register the custom dialect before the first JDBC read.
        JdbcDialects.registerDialect(new NRSCassandraDialect());

        final Properties connectionProperties = new Properties();
        final String dbTable = "sql_demo";

        Dataset<Row> jdbcDF = sparkSession.read()
                .jdbc(_CONNECTION_URL, dbTable, connectionProperties);
        jdbcDF.show();
    }
}

--------------------

Error message:

came here..true
parameters = "datacenter1"
Exception in thread "main" java.sql.SQLTransientException: com.datastax.driver.core.exceptions.SyntaxError: line 1:29 no viable alternative at input '1' (SELECT * FROM sql_demo WHERE [1]...)
    at com.github.adejanovski.cassandra.jdbc.CassandraPreparedStatement.<init>(CassandraPreparedStatement.java:108)
    at com.github.adejanovski.cassandra.jdbc.CassandraConnection.prepareStatement(CassandraConnection.java:371)
    at com.github.adejanovski.cassandra.jdbc.CassandraConnection.prepareStatement(CassandraConnection.java:348)
    at com.github.adejanovski.cassandra.jdbc.CassandraConnection.prepareStatement(CassandraConnection.java:48)

> sparkSession.read() .jdbc(***) use the sql syntax "where 1=0" that Cassandra does not support
> ---------------------------------------------------------------------------------------------
>
>                 Key: SPARK-17614
>                 URL: https://issues.apache.org/jira/browse/SPARK-17614
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.0.0
>         Environment: Any Spark Runtime
>            Reporter: Paul Wu
>            Priority: Minor
>              Labels: cassandra-jdbc, sql
>
> I have the code like the following with Cassandra JDBC (https://github.com/adejanovski/cassandra-jdbc-wrapper):
>     final String dbTable= "sql_demo";
>     Dataset<Row> jdbcDF
>             = sparkSession.read()
>             .jdbc(CASSANDRA_CONNECTION_URL, dbTable, connectionProperties);
>     List<Row> rows = jdbcDF.collectAsList();
> It threw the error:
> Exception in thread "main" java.sql.SQLTransientException: com.datastax.driver.core.exceptions.SyntaxError: line 1:29 no viable alternative at input '1' (SELECT * FROM sql_demo WHERE [1]...)
>     at com.github.adejanovski.cassandra.jdbc.CassandraPreparedStatement.<init>(CassandraPreparedStatement.java:108)
>     at com.github.adejanovski.cassandra.jdbc.CassandraConnection.prepareStatement(CassandraConnection.java:371)
>     at com.github.adejanovski.cassandra.jdbc.CassandraConnection.prepareStatement(CassandraConnection.java:348)
>     at com.github.adejanovski.cassandra.jdbc.CassandraConnection.prepareStatement(CassandraConnection.java:48)
> The reason is that the Spark jdbc code uses the sql syntax "where 1=0" somewhere (to get the schema?), but Cassandra does not support this syntax. Not sure how this issue can be resolved...this is because CQL is not standard sql.
> The following log shows more information:
> 16/09/20 13:16:35 INFO CassandraConnection 138: Datacenter: %s; Host: %s; Rack: %s
> 16/09/20 13:16:35 TRACE CassandraPreparedStatement 98: CQL: SELECT * FROM sql_demo WHERE 1=0
> 16/09/20 13:16:35 TRACE RequestHandler 71: [19400322] com.datastax.driver.core.Statement$1@41ccb3b9
> 16/09/20 13:16:35 TRACE RequestHandler 272: [19400322-1] Starting
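
One possible reading of the output above, offered as a guess rather than a confirmed diagnosis: canHandle() prints, so the dialect is being matched, but the statement that fails is the schema probe (SELECT * FROM sql_demo WHERE 1=0), which does not seem to go through getTableExistsQuery() on the read path. The sketch below is a simplified, self-contained Java model of that situation. The names Dialect, tableExistsQuery, schemaQuery, lookup, readProbe and the localhost URL are hypothetical and chosen for illustration only; none of this is Spark's actual code.

```java
import java.util.Arrays;
import java.util.List;

// Simplified model of URL-based dialect dispatch. All names here are
// hypothetical stand-ins, not Spark's JdbcDialect classes.
final class DialectSketch {

    /** Hypothetical dialect with separate hooks for the two probe queries. */
    interface Dialect {
        boolean canHandle(String url);

        // Probe used to check that a table exists.
        default String tableExistsQuery(String table) {
            return "SELECT * FROM " + table + " WHERE 1=0";
        }

        // Probe used to discover a table's columns before reading it.
        default String schemaQuery(String table) {
            return "SELECT * FROM " + table + " WHERE 1=0";
        }
    }

    /** Overrides only the table-exists hook, like the dialect in the comment. */
    static final class TableExistsOnly implements Dialect {
        @Override
        public boolean canHandle(String url) {
            return url.startsWith("jdbc:cassandra");
        }

        @Override
        public String tableExistsQuery(String table) {
            return "SELECT * FROM " + table + " LIMIT 1";
        }
    }

    /** Also overrides the schema hook, which is what the modeled read path uses. */
    static final class SchemaAware implements Dialect {
        @Override
        public boolean canHandle(String url) {
            return url.startsWith("jdbc:cassandra");
        }

        @Override
        public String tableExistsQuery(String table) {
            return "SELECT * FROM " + table + " LIMIT 1";
        }

        @Override
        public String schemaQuery(String table) {
            return "SELECT * FROM " + table + " LIMIT 1";
        }
    }

    /** Returns the first dialect that accepts the URL, or a catch-all default. */
    static Dialect lookup(List<Dialect> dialects, String url) {
        for (Dialect d : dialects) {
            if (d.canHandle(url)) {
                return d;
            }
        }
        return anyUrl -> true; // default dialect keeps both default queries
    }

    /** Models the read path: only the schema hook is consulted. */
    static String readProbe(List<Dialect> dialects, String url, String table) {
        return lookup(dialects, url).schemaQuery(table);
    }

    public static void main(String[] args) {
        String url = "jdbc:cassandra://localhost/test"; // placeholder URL
        // The dialect is matched, yet the probe still contains WHERE 1=0.
        System.out.println(readProbe(Arrays.<Dialect>asList(new TableExistsOnly()), url, "sql_demo"));
        // Overriding the schema hook changes the probe that the read path issues.
        System.out.println(readProbe(Arrays.<Dialect>asList(new SchemaAware()), url, "sql_demo"));
    }
}
```

In this model the first call prints the WHERE 1=0 probe and the second prints the LIMIT 1 probe. Later Spark releases (2.1.0 onward, to my knowledge) add a getSchemaQuery(table) hook to JdbcDialect; overriding that hook, rather than getTableExistsQuery(), would be the analogous change in a real dialect.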