Re: Not able to overwrite cassandra table using Spark

2018-06-27 Thread Siva Samraj
You can try this; it will work:

    // save() returns Unit, so there is no point binding the result to a val
    merchantdf.write
      .format("org.apache.spark.sql.cassandra")
      .mode(SaveMode.Overwrite)
      .option("confirm.truncate", true)
      .options(Map("table" -> "tablename", "keyspace" -> "keyspace"))
      .save()

On Wed 27 Jun,
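
For context, a minimal sketch of how the session and merchantdf might be set up (the connection host, source table, and keyspace are assumptions; confirm.truncate is needed because SaveMode.Overwrite truncates the Cassandra table before writing):

    import org.apache.spark.sql.{SaveMode, SparkSession}

    val spark = SparkSession.builder()
      .appName("CassandraOverwrite")
      .config("spark.cassandra.connection.host", "127.0.0.1") // assumed host
      .getOrCreate()

    // Assumed source; any DataFrame with a matching schema works here.
    val merchantdf = spark.read
      .format("org.apache.spark.sql.cassandra")
      .options(Map("table" -> "source_table", "keyspace" -> "keyspace"))
      .load()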

Re: Spark Streaming

2018-11-26 Thread Siva Samraj
> …ect statement. If I'm not mistaken, it is known as a bit costly, since each call would produce a new Dataset. Defining a schema and using "from_json" will eliminate all the calls of withColumn and the extra calls of "get_json_object".
>
> - Jungtaek
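
The quoted advice can be sketched as follows: parse the JSON once with a predeclared schema instead of chaining get_json_object/withColumn per field. The topic, broker, and field names are illustrative assumptions, and the spark-sql-kafka package is assumed to be on the classpath:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.{col, from_json}
    import org.apache.spark.sql.types.{DoubleType, StringType, StructType}

    val spark = SparkSession.builder().appName("FromJsonSketch").getOrCreate()

    // Hypothetical schema for the incoming JSON payload.
    val schema = new StructType()
      .add("id", StringType)
      .add("amount", DoubleType)

    val raw = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092") // assumption
      .option("subscribe", "events")                       // assumption
      .load()

    // One from_json call replaces many get_json_object/withColumn calls.
    val parsed = raw
      .select(from_json(col("value").cast("string"), schema).as("data"))
      .select("data.*")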

Spark Streaming

2018-11-26 Thread Siva Samraj
Hello All, I am using Spark version 2.3 and I am trying to write a Spark Streaming join. It is a basic join, but it is taking a long time to join the stream data. I am not sure whether any configuration needs to be set on Spark. Code:

    import org.apache.spark.sql.SparkSession
    import
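
A common cause of slow stream-stream joins on Spark 2.3 is unbounded state: without watermarks on both sides and an event-time bound in the join condition, Spark must keep all past rows. A minimal runnable sketch using the built-in rate source (column names, rates, and time bounds are assumptions):

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.expr

    val spark = SparkSession.builder().appName("StreamJoinSketch").getOrCreate()

    // Two illustrative streams; the rate source emits (timestamp, value) rows.
    val left = spark.readStream.format("rate").option("rowsPerSecond", "5").load()
      .withColumnRenamed("timestamp", "leftTime")
      .withColumnRenamed("value", "leftId")
      .withWatermark("leftTime", "10 minutes")

    val right = spark.readStream.format("rate").option("rowsPerSecond", "5").load()
      .withColumnRenamed("timestamp", "rightTime")
      .withColumnRenamed("value", "rightId")
      .withWatermark("rightTime", "10 minutes")

    // The event-time bound lets Spark discard state older than the watermark.
    val joined = left.join(
      right,
      expr("leftId = rightId AND rightTime BETWEEN leftTime AND leftTime + interval 5 minutes"))

    joined.writeStream.format("console").start().awaitTermination()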

Re: Spark structured streaming sinks output late

2020-03-28 Thread Siva Samraj
Yes, I am also facing the same issue. Did you figure it out?

On Tue, 9 Jul 2019, 7:25 pm Kamalanathan Venkatesan, <kamalanatha...@in.ey.com> wrote:

> Hello,
>
> I have the below Spark structured streaming code, and I was expecting the results to be printed on the console every 10 seconds. But, I
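
Two things commonly produce this symptom. First, printing every 10 seconds requires an explicit processing-time trigger. Second, if the query is a watermarked windowed aggregation in append output mode, each window is emitted only after the watermark passes its end, so output lags by roughly the watermark duration by design. A minimal trigger sketch (the source and rate are assumptions):

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.streaming.Trigger

    val spark = SparkSession.builder().appName("TriggerSketch").getOrCreate()

    val stream = spark.readStream.format("rate").option("rowsPerSecond", "1").load()

    // Without an explicit trigger, Spark runs micro-batches as fast as it can;
    // Trigger.ProcessingTime fires one every 10 seconds instead.
    stream.writeStream
      .format("console")
      .outputMode("append")
      .trigger(Trigger.ProcessingTime("10 seconds"))
      .start()
      .awaitTermination()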

Spark Streaming Code

2020-03-28 Thread Siva Samraj
Hi Team, I need help with the windowing & watermark concepts. This code is not working as expected.

    package com.jiomoney.streaming

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._
    import org.apache.spark.sql.streaming.ProcessingTime

    object SlingStreaming {
      def
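
A minimal windowing-plus-watermark sketch in the same spirit (the window and watermark durations are assumptions; the rate source stands in for the real input):

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.{col, window}
    import org.apache.spark.sql.streaming.Trigger

    val spark = SparkSession.builder().appName("WindowSketch").getOrCreate()

    val events = spark.readStream.format("rate").option("rowsPerSecond", "10").load()

    // Rows more than 1 minute behind the max event time seen are dropped.
    val counts = events
      .withWatermark("timestamp", "1 minute")
      .groupBy(window(col("timestamp"), "30 seconds"))
      .count()

    // In append mode a window is emitted only once the watermark passes its
    // end, so results appear roughly one watermark-duration late.
    counts.writeStream
      .format("console")
      .outputMode("append")
      .trigger(Trigger.ProcessingTime("10 seconds"))
      .start()
      .awaitTermination()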

Spark Streaming ElasticSearch

2020-10-05 Thread Siva Samraj
Hi Team, I have a Spark streaming job which reads from Kafka and writes into Elasticsearch via HTTP requests. I want to validate each request from Kafka, change the payload as per the business need, and write it into Elasticsearch. I have used an ES HTTP request to push the data into Elasticsearch. Can
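
One way to validate and push each record over HTTP is a ForeachWriter sink; a rough sketch using only the JDK's java.net (the brokers, topic, validation rule, and Elasticsearch endpoint are all assumptions):

    import java.net.{HttpURLConnection, URL}
    import org.apache.spark.sql.{ForeachWriter, SparkSession}

    val spark = SparkSession.builder().appName("EsHttpSink").getOrCreate()
    import spark.implicits._

    val payloads = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092") // assumption
      .option("subscribe", "input")                        // assumption
      .load()
      .selectExpr("CAST(value AS STRING) AS json")
      .as[String]

    // Spark calls open/process/close per partition of each micro-batch.
    val esWriter = new ForeachWriter[String] {
      def open(partitionId: Long, epochId: Long): Boolean = true
      def process(payload: String): Unit = {
        if (payload.nonEmpty) { // hypothetical validation rule
          val conn = new URL("http://localhost:9200/myindex/_doc") // assumed endpoint
            .openConnection().asInstanceOf[HttpURLConnection]
          conn.setRequestMethod("POST")
          conn.setRequestProperty("Content-Type", "application/json")
          conn.setDoOutput(true)
          conn.getOutputStream.write(payload.getBytes("UTF-8"))
          conn.getResponseCode // real code should check and handle failures
          conn.disconnect()
        }
      }
      def close(errorOrNull: Throwable): Unit = ()
    }

    payloads.writeStream.foreach(esWriter).start().awaitTermination()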

Re: Spark Streaming ElasticSearch

2020-10-05 Thread Siva Samraj
Hi Jainshasha, I need to read each row from the DataFrame and make some changes to it before inserting it into ES. Thanks, Siva

On Mon, Oct 5, 2020 at 8:06 PM jainshasha wrote:

> Hi Siva
>
> To emit data into ES using a Spark structured streaming job, you need to use the Elasticsearch jar which has
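
Row-level changes can be expressed as ordinary DataFrame transformations before the sink. A rough sketch, assuming the elasticsearch-spark (es-hadoop) connector jar, which provides an "es" structured-streaming sink, is on the classpath; the index name, options, and the transformation itself are assumptions:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.{col, upper}

    val spark = SparkSession.builder().appName("EsTransformSketch").getOrCreate()

    val input = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092") // assumption
      .option("subscribe", "input")                        // assumption
      .load()
      .selectExpr("CAST(value AS STRING) AS payload")

    // Hypothetical per-row change applied before indexing.
    val transformed = input.withColumn("payload", upper(col("payload")))

    transformed.writeStream
      .format("es")                                        // es-hadoop sink
      .option("checkpointLocation", "/tmp/es-checkpoint")  // assumption
      .option("es.nodes", "localhost")                     // assumption
      .start("myindex")                                    // assumed index name
      .awaitTermination()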

Offset Management in Spark

2020-09-30 Thread Siva Samraj
Hi all, I am using Spark Structured Streaming (version 2.3.2). I need to read from a Kafka cluster and write into Kerberized Kafka. Here I want to use Kafka as offset checkpointing after the record is written into Kerberized Kafka. Questions: 1. Can we use Kafka for checkpointing to manage offset
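
For reference, the Kafka source in Structured Streaming does not commit offsets back to Kafka; offsets are tracked in the query's checkpoint directory, and the query resumes from there on restart. A minimal Kafka-to-Kafka sketch (brokers, topics, and the checkpoint path are assumptions; Kerberos client settings, passed as "kafka."-prefixed options, are omitted):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("OffsetSketch").getOrCreate()

    val in = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "source:9092") // assumption
      .option("subscribe", "input")                     // assumption
      .option("startingOffsets", "earliest")            // used only on the first run
      .load()

    // Offsets are recorded in the checkpoint directory with each micro-batch;
    // on restart the query resumes from there, not from Kafka consumer offsets.
    in.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")
      .writeStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "kerberized:9093") // assumption
      .option("topic", "output")                            // assumption
      .option("checkpointLocation", "/tmp/offsets-ckpt")    // assumption
      .start()
      .awaitTermination()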

Spark - ElasticSearch Integration

2021-11-22 Thread Siva Samraj
Hi All, I want to write a Spark Streaming job from Kafka to Elasticsearch. Here I want to detect the schema dynamically while reading it from Kafka. Can you help me do that? I know this can be done in Spark batch processing via the below line. val schema =
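
Streaming reads require a schema up front, so one common workaround is to infer it once with a batch read of the same topic and then apply it to the stream with from_json. A sketch (brokers and topic are assumptions):

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.{col, from_json}

    val spark = SparkSession.builder().appName("SchemaInferSketch").getOrCreate()
    import spark.implicits._

    // Batch read of the topic, used only to infer the JSON schema.
    val sample = spark.read
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092") // assumption
      .option("subscribe", "events")                       // assumption
      .load()
      .selectExpr("CAST(value AS STRING)")
      .as[String]

    val schema = spark.read.json(sample).schema // inferred, not hand-written

    // The streaming read then parses with the inferred schema.
    val stream = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092")
      .option("subscribe", "events")
      .load()
      .select(from_json(col("value").cast("string"), schema).as("data"))
      .select("data.*")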

Re: add an auto_increment column

2022-02-06 Thread Siva Samraj
monotonically_increasing_id() will give similar functionality.

On Mon, 7 Feb, 2022, 6:57 am, wrote:

> For a DataFrame object, how do I add a column that is auto-increment like MySQL's behavior?
>
> Thank you.
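
One caveat worth noting: monotonically_increasing_id() is unique and increasing but not consecutive (values jump between partitions), so it is not an exact match for MySQL's AUTO_INCREMENT. A gap-free alternative is row_number over a window, at the cost of pulling all rows through one partition. A small sketch:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.expressions.Window
    import org.apache.spark.sql.functions.{lit, monotonically_increasing_id, row_number}

    val spark = SparkSession.builder().appName("IdSketch").getOrCreate()
    import spark.implicits._

    val df = Seq("a", "b", "c").toDF("value")

    // Unique and increasing, but with gaps across partitions.
    val withId = df.withColumn("id", monotonically_increasing_id())

    // Gap-free 1, 2, 3, ... numbering; the global window forces all rows
    // into a single partition, so use it only on modest data sizes.
    val consecutive = df.withColumn("id", row_number().over(Window.orderBy(lit(1))))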