Re: Happy Diwali everyone!!!

2018-11-07 Thread Dilip Biswal
Thank you Sean. Happy Diwali!! -- Dilip ----- Original message ----- From: Xiao Li; To: "user@spark.apache.org", user; Subject: Happy Diwali everyone!!!; Date: Wed, Nov 7, 2018 3:10 PM. Happy Diwali everyone!!! Xiao Li

Happy Diwali everyone!!!

2018-11-07 Thread Xiao Li
Happy Diwali everyone!!! Xiao Li

subscribe

2018-11-07 Thread Vein Kong
subscribe

[Spark-Core] Long scheduling delays (1+ hour)

2018-11-07 Thread bsikander
We are facing an issue with very long scheduling delays in Spark (up to 1+ hours). We are using Spark standalone. The data is being pulled from Kafka. Any help would be much appreciated. I have attached the screenshots.

Re: [Spark-Core] Long scheduling delays (1+ hour)

2018-11-07 Thread Biplob Biswas
Hi, This has to do with your batch duration and processing time. As a rule, the processing time of each batch should be lower than the batch duration. As I can see from your screenshots, your batch duration is 10 seconds but your processing time is mostly more than a minute; this adds up and you
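To make the "this adds up" point concrete, here is a minimal sketch of how scheduling delay accumulates when each batch takes longer to process than the batch interval. It is a toy model, not Spark code: the function name `scheduling_delays` is hypothetical, it assumes batches are processed strictly one at a time, and the 60-second processing time is an illustrative stand-in for "more than a minute".

```python
def scheduling_delays(batch_interval_s, processing_time_s, num_batches):
    """Return the scheduling delay (in seconds) seen by each batch in turn.

    Toy model (assumption): batch i becomes ready at i * batch_interval_s,
    but can only start once the previous batch has finished processing.
    """
    delays = []
    free_at = 0  # time at which the single processing slot becomes free
    for i in range(num_batches):
        ready_at = i * batch_interval_s
        start_at = max(ready_at, free_at)
        delays.append(start_at - ready_at)  # time spent waiting in the queue
        free_at = start_at + processing_time_s
    return delays


# 10-second batches, 60 seconds of processing per batch (illustrative values).
delays = scheduling_delays(batch_interval_s=10, processing_time_s=60, num_batches=73)
print(delays[1], delays[-1])
```

Under these assumptions every batch waits 50 seconds longer than the one before it, so the 73rd batch, which became ready only 12 minutes after the start, has already been queued for 3600 seconds (one hour).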

Re: How to increase the parallelism of Spark Streaming application?

2018-11-07 Thread vincent gromakowski
On the other hand, increasing parallelism through Kafka partitions avoids the shuffle that Spark would otherwise need in order to repartition. On Wed, Nov 7, 2018 at 09:51, Michael Shtelma wrote: > If you configure too many Kafka partitions, you can run into memory issues. > This will increase memory requirements for spark job a

Re: How to increase the parallelism of Spark Streaming application?

2018-11-07 Thread Michael Shtelma
If you configure too many Kafka partitions, you can run into memory issues. This will increase the memory requirements of the Spark job a lot. Best, Michael On Wed, Nov 7, 2018 at 8:28 AM JF Chen wrote: > I have a Spark Streaming application which reads data from Kafka and saves > the

DB2 Sequence - Error while invoking

2018-11-07 Thread ☼ R Nair
Hi all, We are trying to call the DB2 Sequence through Spark and assign that value to one of the columns (PK) in the table. We are getting the issue below: SEQ: CITI_VENDOR_UNITED_LIST_TARGET_SEQ Table: CITI_VENDOR_UNITED_LIST_TARGET DB: CITIVENDORS Host: CIT_XX Port: 42194 Schema: MINE DB2 SQL

Re: [Spark-Core] Long scheduling delays (1+ hour)

2018-11-07 Thread bsikander
Actually, our job runs fine for 17-18 hours and this behavior just suddenly starts happening after that. We found the following ticket which is exactly what is happening in our Kafka cluster also. WARN Failed to send SSL Close message (org.apache.kafka.common.network.SslTransportLayer) You

How does shuffle operation work in Spark?

2018-11-07 Thread Joe
Hello, I'm looking for a detailed description of the shuffle operation in Spark, something that would explain what the criteria are for assigning blocks to nodes, how many go where, what happens when there are memory constraints, etc. If anyone knows of such a document I'd appreciate a link

Re: How to increase the parallelism of Spark Streaming application?

2018-11-07 Thread Shahbaz
Hi, - Do you have adequate CPU cores allocated to handle the increased partitions? Generally, having Kafka partitions >= total CPU cores (number of executor instances * cores per executor) gives increased task parallelism for the reader phase. - However if
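The core-count rule of thumb above can be sketched as a small calculation. This is only an illustration: the function names are hypothetical, and it assumes one read task per Kafka partition, with each task occupying one core.

```python
def total_cores(executor_instances, cores_per_executor):
    """Total CPU cores = number of executor instances * cores per executor."""
    return executor_instances * cores_per_executor


def concurrent_read_tasks(kafka_partitions, executor_instances, cores_per_executor):
    """Read tasks that can run at once, assuming one task per Kafka partition."""
    return min(kafka_partitions, total_cores(executor_instances, cores_per_executor))


def idle_cores(kafka_partitions, executor_instances, cores_per_executor):
    """Cores with nothing to read when there are fewer partitions than cores."""
    cores = total_cores(executor_instances, cores_per_executor)
    return max(0, cores - kafka_partitions)


# Illustrative values: 4 executors with 4 cores each, i.e. 16 cores in total.
few = concurrent_read_tasks(kafka_partitions=8, executor_instances=4, cores_per_executor=4)
many = concurrent_read_tasks(kafka_partitions=32, executor_instances=4, cores_per_executor=4)
print(few, many, idle_cores(8, 4, 4))
```

With 8 partitions only 8 of the 16 cores take part in the read phase; once the partition count reaches or exceeds 16, every core has a read task to run.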