Hi list,

val dfs = spark
  .read
  .format("org.apache.spark.sql.cassandra")
  .options(Map(
    "cluster" -> "helloCassandra",
    "spark.cassandra.connection.host" -> "127.0.0.1",
    "spark.input.fetch.size_in_rows" -> "100000",
    "spark.cassandra.input.consistency.level" -> "ONE",
    "table" -> "cliks",
    "pushdown" -> "true",
    "keyspace" -> "test",
    "header" -> "false",
    "inferSchema" -> "true"))
  .load()
  .select("kafka")
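A hedged sketch of one way to get the schema out as JSON: Spark's StructType (what printSchema walks) can serialize itself via its json and prettyJson methods. The local DataFrame below is a stand-in for the Cassandra-backed dfs above, just so the snippet is self-contained; the column names follow the thread.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("schema-to-json")
  .getOrCreate()
import spark.implicits._

// Stand-in for the Cassandra-backed DataFrame from the snippet above.
val dfs = Seq((1L, """{"a":1}""")).toDF("id", "kafka")

// StructType.json gives a compact JSON string; prettyJson is indented.
println(dfs.schema.json)
println(dfs.schema.prettyJson)
```

The same JSON string can be fed back through DataType.fromJson to rebuild the schema later.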
dfs.printSchema()

Any way to put this schema in JSON?

Thanks in advance

On 22-01-2018 17:51, Sathish Kumaran Vairavelu wrote:
> You have to register a Cassandra table in Spark as a dataframe:
>
> https://github.com/datastax/spark-cassandra-connector/blob/master/doc/14_data_frames.md
>
> Thanks
> Sathish
>
> On Mon, Jan 22, 2018 at 7:43 AM Conconscious <conconsci...@gmail.com
> <mailto:conconsci...@gmail.com>> wrote:
>
> Hi list,
>
> I have a Cassandra table with two fields: id bigint, kafka text.
>
> My goal is to read only the kafka field (which is JSON) and infer its
> schema.
>
> I have this skeleton code (not working):
>
>     sc.stop
>     import org.apache.spark._
>     import com.datastax.spark._
>     import org.apache.spark.sql.functions.get_json_object
>     import org.apache.spark.sql.functions.to_json
>     import org.apache.spark.sql.functions.from_json
>     import org.apache.spark.sql.types._
>
>     val conf = new SparkConf(true)
>       .set("spark.cassandra.connection.host", "127.0.0.1")
>       .set("spark.cassandra.auth.username", "cassandra")
>       .set("spark.cassandra.auth.password", "cassandra")
>     val sc = new SparkContext(conf)
>
>     val sqlContext = new org.apache.spark.sql.SQLContext(sc)
>     val df = sqlContext.sql("SELECT kafka from table1")
>     df.printSchema()
>
> I think I have at least two problems: the keyspace is missing, the
> table is not recognized, and it is certainly not going to infer the
> schema from the text field.
> I have a working solution for JSON files, but I can't "translate" this
> to Cassandra:
>
>     import org.apache.spark.sql.SparkSession
>     val spark = SparkSession.builder().appName("Spark SQL basic example").getOrCreate()
>     import spark.implicits._
>     val redf = spark.read.json("/usr/local/spark/examples/cqlsh_r.json")
>     redf.printSchema
>     redf.count
>     redf.show
>     redf.createOrReplaceTempView("clicks")
>     val clicksDF = spark.sql("SELECT * FROM clicks")
>     clicksDF.show()
>
> My Spark version is 2.2.1 and the Cassandra version is 3.11.1
>
> Thanks in advance
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
> <mailto:user-unsubscr...@spark.apache.org>
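A hedged sketch of how the two halves of the thread might be combined (assuming Spark 2.2, where spark.read.json accepts a Dataset[String]): once the kafka column is pulled out as strings, spark.read.json can infer the JSON schema, and from_json can then parse the column in place. The local DataFrame stands in for the Cassandra read from the top of the thread; column names follow the thread.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.from_json

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("infer-kafka-json")
  .getOrCreate()
import spark.implicits._

// Stand-in for: spark.read.format("org.apache.spark.sql.cassandra")
//   .options(Map("keyspace" -> "test", "table" -> "cliks")).load()
// The JSON payloads here are made-up examples.
val dfs = Seq(
  (1L, """{"user":"u1","clicks":3}"""),
  (2L, """{"user":"u2","clicks":5}""")
).toDF("id", "kafka")

// Infer the schema of the JSON payload from the string column itself...
val inferred = spark.read.json(dfs.select("kafka").as[String]).schema

// ...then parse the column with that schema and flatten the struct.
val parsed = dfs
  .withColumn("parsed", from_json($"kafka", inferred))
  .select("id", "parsed.*")

parsed.printSchema()
```

Note that this infers the schema from whatever rows are scanned, so a second pass over the full table (or a sampled subset) may be needed if the JSON documents are not uniform.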