Re: [Spark Streaming] Disk not being cleaned up during runtime after RDD being processed

2015-03-30 Thread Nathan Marin
, size) .set("spark.executor.logs.rolling.size.maxBytes", "1024") .set("spark.executor.logs.rolling.maxRetainedFiles", "3") Also see what's really getting filled on disk. Thanks Best Regards On Sat, Mar 28, 2015 at 8:18 PM, Nathan Marin nathan.ma...@teads.tv wrote: Hi, I’ve been trying to use
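
The reply is recommending Spark's executor log rolling settings. A minimal SparkConf sketch of what the full snippet likely looks like; the leading ", size)" fragment suggests the rolling strategy is set to "size", which is an assumption here, and the app name is a placeholder:

    import org.apache.spark.SparkConf

    // Executor log rolling configuration from the reply (Spark 1.3-era property names).
    // The "strategy" -> "size" line is an assumption; only ", size)" survives in the preview.
    val conf = new SparkConf()
      .setAppName("streaming-app")                                // hypothetical app name
      .set("spark.executor.logs.rolling.strategy", "size")        // assumed from the truncated fragment
      .set("spark.executor.logs.rolling.size.maxBytes", "1024")   // from the reply: roll logs at 1024 bytes
      .set("spark.executor.logs.rolling.maxRetainedFiles", "3")   // from the reply: keep at most 3 rolled files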

Re: Spark Streaming/Flume display all events

2015-03-30 Thread Nathan Marin
Hi, DStream.print() only prints the first 10 elements of each batch in the stream. You can call DStream.print(x) to print the first x elements, but if you don’t know the exact count you can call DStream.foreachRDD and apply a function to display the content of every RDD. For example:
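
A minimal Scala sketch of the foreachRDD approach described above, assuming a DStream[String] named "lines" (both the name and the element type are placeholders):

    import org.apache.spark.streaming.dstream.DStream

    // Print every element of every batch, not just the first 10 that print() shows.
    // "lines" stands in for whatever DStream the Flume receiver produces.
    def printAll(lines: DStream[String]): Unit = {
      lines.foreachRDD { rdd =>
        // collect() pulls the whole RDD to the driver: fine for debugging, risky for large batches
        rdd.collect().foreach(println)
      }
    }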

[Spark Streaming] Disk not being cleaned up during runtime after RDD being processed

2015-03-28 Thread Nathan Marin
Hi, I’ve been trying to use Spark Streaming for my real-time analysis application, using the Kafka Stream API on a YARN cluster of 6 executors with 4 dedicated cores and 8192 MB of dedicated RAM. The thing is, my application should run 24/7, but the disk usage is leaking. This
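
For context, a minimal sketch of the kind of setup described in the post: a receiver-based Kafka stream on YARN with 6 executors, 4 cores and 8192 MB of RAM (assumed per executor). The ZooKeeper quorum, consumer group, topic map, batch interval and processing step are all hypothetical placeholders.

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka.KafkaUtils

    object RealtimeAnalysisApp {
      def main(args: Array[String]): Unit = {
        // Resources matching the post: 6 executors, 4 cores, 8192 MB (assumed per executor), on YARN.
        val conf = new SparkConf()
          .setAppName("realtime-analysis")          // hypothetical name
          .set("spark.executor.instances", "6")
          .set("spark.executor.cores", "4")
          .set("spark.executor.memory", "8192m")

        val ssc = new StreamingContext(conf, Seconds(10))   // assumed batch interval

        // Receiver-based Kafka stream (the "Kafka Stream API" mentioned in the post);
        // quorum, consumer group and topic are placeholders.
        val stream = KafkaUtils.createStream(
          ssc, "zkhost:2181", "analysis-group", Map("events" -> 1))

        stream.count().print()   // placeholder for the real per-batch processing

        ssc.start()
        ssc.awaitTermination()
      }
    }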

[Spark Streaming] Disk not being cleaned up during runtime after RDD being processed

2015-03-26 Thread Nathan Marin
Hi, I’ve been trying to use Spark Streaming for my real-time analysis application, using the Kafka Stream API on a YARN cluster of 6 executors with 4 dedicated cores and 8192 MB of dedicated RAM. The thing is, my application should run 24/7, but the disk usage is leaking. This