Thanks Akhil for mentioning this low-level consumer (https://github.com/dibbhatt/kafka-spark-consumer). Yes, it has a better fault-tolerance mechanism than any existing Kafka consumer available. It has no data loss on receiver failure and has the ability to replay or restart itself in case of failure. You can definitely give it a try.
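For example, here is a minimal sketch of wiring it up in a streaming job. The class and package names (ReceiverLauncher, MessageAndMetadata) and the property keys are based on the project's README and may differ between versions, so please verify against the code you pull; the topic/ZooKeeper values are placeholders.

import java.util.Properties;

import org.apache.spark.SparkConf;
import org.apache.spark.storage.StorageLevel;
import org.apache.spark.streaming.Duration;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

// Class/package names as per the project README; check your version.
import consumer.kafka.MessageAndMetadata;
import consumer.kafka.ReceiverLauncher;

public class LowLevelConsumerSketch {
  public static void main(String[] args) {
    SparkConf conf = new SparkConf().setAppName("kafka-low-level-consumer");
    JavaStreamingContext jsc =
        new JavaStreamingContext(conf, new Duration(60000));

    // ZooKeeper / topic details for the consumer (placeholder values;
    // the full set of required keys is listed in the project README).
    Properties props = new Properties();
    props.put("zookeeper.hosts", "zkhost1,zkhost2");
    props.put("zookeeper.port", "2181");
    props.put("kafka.topic", "mytopic");
    props.put("kafka.consumer.id", "my-consumer-id");

    // The launcher distributes Kafka partitions across the receivers and
    // commits processed offsets back to ZooKeeper, so a failed receiver
    // can replay from the last committed offset.
    int numberOfReceivers = 3;
    JavaDStream<MessageAndMetadata> stream =
        ReceiverLauncher.launch(jsc, props, numberOfReceivers,
            StorageLevel.MEMORY_AND_DISK_SER());

    stream.print();  // replace with the real processing / HBase sink

    jsc.start();
    jsc.awaitTermination();
  }
}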
Dibyendu

On Thu, Feb 5, 2015 at 1:04 AM, Akhil Das <ak...@sigmoidanalytics.com> wrote:

> AFAIK, from Spark 1.2.0 you can have WAL (Write Ahead Logs) for fault
> tolerance, which means it can handle receiver/driver failures. You can
> also look at the low-level Kafka consumer
> <https://github.com/dibbhatt/kafka-spark-consumer>, which has a better
> fault-tolerance mechanism for receiver failures. This low-level consumer
> pushes the offset of the message being read into ZooKeeper for fault
> tolerance. In your case I think mostly the "in-flight data" would be
> lost if you aren't using any of the fault-tolerance mechanisms.
>
> Thanks
> Best Regards
>
> On Wed, Feb 4, 2015 at 5:24 PM, Mukesh Jha <me.mukesh....@gmail.com>
> wrote:
>
>> Hello Sparkans,
>>
>> I'm running a Spark Streaming app which reads data from a Kafka topic,
>> does some processing, and then persists the results in HBase.
>>
>> I am using Spark 1.2.0 running on a YARN cluster with 3 executors
>> (2 GB, 8 cores each). I've enabled checkpointing, and I am also
>> rate-limiting my kafkaReceivers so that the number of items read is not
>> more than 10 records per sec.
>> The kafkaReceiver I'm using is *not* ReliableKafkaReceiver.
>>
>> This app was running fine for ~3 days, then there was an increased
>> load on the HBase server because of another process querying HBase
>> tables. This led to an increase in the batch processing time of the
>> Spark batches (a 1-min batch took 10 min to process, where it
>> previously finished in 20 sec), which in turn led to the shutdown of
>> the Spark application; PFA the executor logs.
>>
>> From the logs I'm getting the exceptions *[1]* & *[2]* below; it looks
>> like there were some outstanding jobs that didn't get processed, or
>> the job couldn't find the input data. It also seems that the shutdown
>> hook gets invoked but cannot process the in-flight block.
>>
>> I have a couple of queries on this:
>> 1) Does this mean that these jobs failed and the *in-flight data* is
>> lost?
>> 2) Does the Spark job *buffer Kafka* input data while a job is in the
>> processing state for 10 mins, and on shutdown is that too lost? (I do
>> not see any OOM error in the logs.)
>> 3) Can we have *explicit commits* enabled in the kafkaReceiver so that
>> the offsets get committed only when the RDD(s) are successfully
>> processed?
>>
>> Also I'd like to know if there is a *graceful way to shut down a Spark
>> app running on YARN*. Currently I'm killing the YARN app to stop it,
>> which leads to loss of that job's history, whereas with a graceful
>> shutdown the application stops and succeeds and thus preserves the
>> logs & history.
>>
>> *[1]* 15/02/02 19:30:11 ERROR client.TransportResponseHandler: Still
>> have 1 requests outstanding when connection from
>> hbase28.usdc2.cloud.com/10.193.150.221:43189 is closed
>> *[2]* java.lang.Exception: Could not compute split, block
>> input-2-1422901498800 not found
>> *[3]*
>> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException):
>> No lease on /tmp/spark/realtime-failover/msg_2378481654720966.avro (inode
>> 879488): File does not exist. Holder DFSClient_NONMAPREDUCE_-148264920_63
>> does not have any open files.
>>
>> --
>> Thanks & Regards,
>>
>> *Mukesh Jha <me.mukesh....@gmail.com>*
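P.S. On the WAL, rate-limiting, and graceful-shutdown points above, here is a minimal driver-side sketch. The config keys are the documented Spark 1.2 ones; the paths, the rate value, and the shutdown-hook pattern for stopping gracefully (instead of killing the YARN app) are illustrative, not the only way to do it.

import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Duration;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

public class GracefulShutdownSketch {
  public static void main(String[] args) {
    SparkConf conf = new SparkConf()
        .setAppName("kafka-hbase-streaming")
        // Spark 1.2: write received blocks to a write-ahead log so
        // receiver/driver failures do not lose in-flight data
        // (requires checkpointing, enabled below).
        .set("spark.streaming.receiver.writeAheadLog.enable", "true")
        // Cap each receiver at N records/sec (the rate limiting
        // described above).
        .set("spark.streaming.receiver.maxRate", "10");

    final JavaStreamingContext jsc =
        new JavaStreamingContext(conf, new Duration(60000));
    jsc.checkpoint("hdfs:///tmp/spark/checkpoints");  // placeholder path

    // Stand-in for the real Kafka input stream + HBase sink.
    jsc.socketTextStream("localhost", 9999).print();

    // Instead of `yarn application -kill`: stop from inside the driver
    // with stopGracefully=true so queued batches finish and the YARN
    // app ends in SUCCEEDED state, preserving logs and history.
    Runtime.getRuntime().addShutdownHook(new Thread() {
      @Override
      public void run() {
        jsc.stop(true, true);  // stopSparkContext, stopGracefully
      }
    });

    jsc.start();
    jsc.awaitTermination();
  }
}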