Is there any news on this issue? I was using a local folder on Linux
for checkpointing, "file:///opt/sparkfolders/checkpoints". I think that
being able to use the ReliableKafkaReceiver in a 24x7 system, without
having to worry about the disk filling up, is a reasonable expectation.
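
For reference, this is roughly how the job is set up (a minimal sketch;
the app name and batch interval below are placeholders, not the real
values):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    // Placeholder app name and batch interval.
    val conf = new SparkConf().setAppName("checkpoint-growth-repro")
    val ssc = new StreamingContext(conf, Seconds(10))

    // Checkpoint to a local folder; the receivedData subfolder of this
    // directory is the one that keeps growing.
    ssc.checkpoint("file:///opt/sparkfolders/checkpoints")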

Regards,

Luis

2014-11-21 15:17 GMT+00:00 Luis Ángel Vicente Sánchez <
langel.gro...@gmail.com>:

> I have seen the same behaviour while testing the latest Spark 1.2.0
> snapshot.
>
> I'm trying the ReliableKafkaReceiver and it works quite well, but the
> checkpoints folder keeps growing in size. The receivedMetaData folder
> stays almost constant in size, but the receivedData folder keeps
> growing even with spark.cleaner.ttl set to 300 seconds.
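
For reference, this is roughly how the receiver is set up (a minimal
sketch; the ZooKeeper quorum, group id, topic, and batch interval are
placeholders, and the write-ahead log flag is, as far as I understand,
what makes KafkaUtils.createStream use the ReliableKafkaReceiver in the
1.2.0 snapshot):

    import org.apache.spark.SparkConf
    import org.apache.spark.storage.StorageLevel
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka.KafkaUtils

    val conf = new SparkConf()
      .setAppName("reliable-kafka-test")  // placeholder name
      .set("spark.cleaner.ttl", "300")    // forget metadata older than 300s
      .set("spark.streaming.receiver.writeAheadLog.enable", "true")

    val ssc = new StreamingContext(conf, Seconds(10))
    ssc.checkpoint("file:///opt/sparkfolders/checkpoints")

    // Placeholder ZooKeeper quorum, consumer group, and topic map
    // (topic -> number of receiver threads).
    val stream = KafkaUtils.createStream(
      ssc, "zkhost:2181", "test-group", Map("test-topic" -> 1),
      StorageLevel.MEMORY_AND_DISK_SER)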
>
> Regards,
>
> Luis
>
> 2014-09-23 22:47 GMT+01:00 RodrigoB <rodrigo.boav...@aspect.com>:
>
>> Just a follow-up.
>>
>> To make sure the RDDs really weren't being cleaned up, I replayed the
>> app both on the remote Windows laptop and on the Linux machine, while
>> watching the RDD folders in HDFS at the same time.
>>
>> This confirmed the observed behavior: running on the laptop, the
>> number of RDD folders kept increasing; running on Linux, only two RDD
>> folders were ever present, and they were continuously recycled.
>>
>> Metadata checkpoints were being cleaned in both scenarios.
>>
>> Thanks,
>> Rod