Hi list,

I have a Cassandra table with two fields: id bigint, kafka text.
My goal is to read only the kafka field (which contains JSON) and infer its schema. I have this skeleton code (not working):

sc.stop

import org.apache.spark._
import com.datastax.spark._
import org.apache.spark.sql.functions.get_json_object
import org.apache.spark.sql.functions.to_json
import org.apache.spark.sql.functions.from_json
import org.apache.spark.sql.types._

val conf = new SparkConf(true)
  .set("spark.cassandra.connection.host", "127.0.0.1")
  .set("spark.cassandra.auth.username", "cassandra")
  .set("spark.cassandra.auth.password", "cassandra")

val sc = new SparkContext(conf)
val sqlContext = new org.apache.spark.sql.SQLContext(sc)

val df = sqlContext.sql("SELECT kafka from table1")
df.printSchema()

I think I have at least two problems: the keyspace is missing, the table is not recognized, and it is certainly not going to infer the schema from the text field.

I have a working solution for JSON files, but I can't "translate" it to Cassandra:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("Spark SQL basic example").getOrCreate()
import spark.implicits._

val redf = spark.read.json("/usr/local/spark/examples/cqlsh_r.json")
redf.printSchema
redf.count
redf.show

redf.createOrReplaceTempView("clicks")
val clicksDF = spark.sql("SELECT * FROM clicks")
clicksDF.show()

My Spark version is 2.2.1 and my Cassandra version is 3.11.1.

Thanks in advance
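P.S. Here is a rough sketch of what I imagine the Cassandra version could look like, in case it helps frame the question. I have not verified it, and it assumes the table lives in a keyspace called "ks1" (a placeholder, not its real name) and that the spark-cassandra-connector package is on the classpath. The idea is to load table1 through the connector's DataFrame source, keep only the kafka column as a Dataset[String], and hand it to spark.read.json so Spark infers the schema the same way it does for a file:

import org.apache.spark.sql.SparkSession

// Session configured for the local Cassandra node (same credentials as above).
val spark = SparkSession.builder()
  .appName("Infer schema from kafka column")
  .config("spark.cassandra.connection.host", "127.0.0.1")
  .config("spark.cassandra.auth.username", "cassandra")
  .config("spark.cassandra.auth.password", "cassandra")
  .getOrCreate()

import spark.implicits._

// Load the table through the spark-cassandra-connector DataFrame source
// ("ks1" is a placeholder keyspace name) and keep only the kafka column.
val kafkaDF = spark.read
  .format("org.apache.spark.sql.cassandra")
  .options(Map("keyspace" -> "ks1", "table" -> "table1"))
  .load()
  .select("kafka")

// Treat the column as a Dataset of JSON strings and let Spark infer the
// schema, exactly as it does for a JSON file.
val redf = spark.read.json(kafkaDF.as[String])

redf.printSchema
redf.createOrReplaceTempView("clicks")
spark.sql("SELECT * FROM clicks").show()

Does that look like the right direction, or is there a better way to get the schema out of the text column?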