Re: How to get logging right for Spark applications in the YARN ecosystem

2019-08-01 Thread Srinath C
Hi Raman, perhaps use a rolling file appender in log4j to compress the rotated log files? Regards.
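A minimal sketch of what that could look like, assuming log4j 1.2 with the apache-log4j-extras jar on the executor classpath (the stock log4j 1.2 RollingFileAppender does not compress rolled files, and the extras appender is configured via XML rather than a .properties file). The file name pattern is illustrative; ${spark.yarn.app.container.log.dir} is the system property Spark sets inside YARN containers:

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
    <log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/">
      <appender name="rolling" class="org.apache.log4j.rolling.RollingFileAppender">
        <rollingPolicy class="org.apache.log4j.rolling.TimeBasedRollingPolicy">
          <!-- The .gz suffix tells the policy to gzip each rotated file. -->
          <param name="FileNamePattern"
                 value="${spark.yarn.app.container.log.dir}/spark.%d{yyyy-MM-dd}.log.gz"/>
        </rollingPolicy>
        <layout class="org.apache.log4j.PatternLayout">
          <param name="ConversionPattern" value="%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n"/>
        </layout>
      </appender>
      <root>
        <priority value="INFO"/>
        <appender-ref ref="rolling"/>
      </root>
    </log4j:configuration>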

Re: spark stream kafka wait for all data process done

2019-08-01 Thread 刘 勇
Hi, you can set spark.streaming.backpressure.enabled=true. If your tasks can't keep up with the incoming data, this setting lets Spark throttle the rate at which records are read from Kafka. You can also increase your streaming batch interval.
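For a DStreams job, a minimal sketch of wiring that up (Spark 2.x; the app name, rate cap, and batch interval are illustrative values, not recommendations):

    // Backpressure lets Spark adapt the Kafka ingestion rate to what
    // recent batches actually managed to process.
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf()
      .setAppName("kafka-backpressure-demo")
      .set("spark.streaming.backpressure.enabled", "true")
      // Optional hard cap per Kafka partition per second, as a safety
      // net while the backpressure rate estimator warms up.
      .set("spark.streaming.kafka.maxRatePerPartition", "500")

    // A longer batch interval gives slow per-batch work more headroom.
    val ssc = new StreamingContext(conf, Seconds(30))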

spark stream kafka wait for all data process done

2019-08-01 Thread zenglong chen
How can I make the Kafka stream wait until the current batch's tasks are done before receiving the next batch? I want to process 5000 records at a time with pandas, and that may take too long to process.

Announcing Delta Lake 0.3.0

2019-08-01 Thread Tathagata Das
Hello everyone, We are excited to announce the availability of Delta Lake 0.3.0, which introduces new programmatic APIs for manipulating and managing data in Delta Lake tables. Here are the main features: - Scala/Java APIs for DML commands - You can now modify data in Delta Lake
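For readers skimming the digest, a short sketch of what the new Scala DML API looks like (the table path and column names are illustrative; `spark` is assumed to be an active SparkSession with delta-core 0.3.0 on the classpath):

    import io.delta.tables._
    import org.apache.spark.sql.functions._

    // Bind to an existing Delta table by path.
    val deltaTable = DeltaTable.forPath(spark, "/tmp/delta/events")

    // Delete rows matching a predicate.
    deltaTable.delete(col("date") < "2017-01-01")

    // Update rows in place: fix a misspelled event type.
    deltaTable.updateExpr(
      "eventType = 'clck'",
      Map("eventType" -> "'click'"))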

How to get logging right for Spark applications in the YARN ecosystem

2019-08-01 Thread raman gugnani
Hi, I am looking for the right solution for logging the logs produced by the executors. In most places I have seen logging configured via log4j properties, but nowhere have I seen a solution where the logs are compressed. Is there any way I can compress the logs so that those logs
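One common way to get a custom log4j config onto the executors in YARN is to ship it with the job; a sketch, where the config file name, application class, and jar are illustrative:

    spark-submit \
      --master yarn \
      --files log4j-executor.xml \
      --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j-executor.xml" \
      --class com.example.MyApp myapp.jar

The --files option localizes the file into each container's working directory, which is on the executor classpath, so the bare file name resolves. (See the appender sketch under Srinath's reply above for a configuration that also compresses rotated files.)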