I was seeing this issue when my batch interval was large -- around 5 minutes. With a smaller batch interval, I don't get this exception. Can someone explain why this might be happening?
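For what it's worth, the EOFError in the quoted traceback is raised by PySpark's read_int (pyspark/serializers.py), which reads a 4-byte big-endian frame length off the daemon/worker socket and treats an empty read as "the other side closed the connection" -- which is plausible after a long idle gap between batches. Below is a minimal, self-contained sketch of that check (this is my own simplified version, using io.BytesIO to stand in for the socket, not the actual pyspark source):

```python
import io
import struct

def read_int(stream):
    # Read a 4-byte big-endian int, as PySpark's framed protocol does.
    # An empty/short read means the peer closed the stream, which
    # PySpark surfaces as EOFError -- the error in the traceback below.
    data = stream.read(4)
    if len(data) < 4:
        raise EOFError
    return struct.unpack("!i", data)[0]

# A healthy stream yields the framed value:
assert read_int(io.BytesIO(struct.pack("!i", 42))) == 42

# A closed (empty) stream raises EOFError, as in the traceback:
try:
    read_int(io.BytesIO(b""))
except EOFError:
    print("EOFError on closed stream")
```

So a long batch interval giving some component time to drop the idle connection would be consistent with what you're seeing, though I can't say for sure that's the cause here.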
Vadim

On Tue, Apr 28, 2015 at 4:26 PM, Vadim Bichutskiy
<vadim.bichuts...@gmail.com> wrote:

> I am using Spark Streaming to monitor an S3 bucket. Everything appears to
> be fine. But every batch interval I get the following:
>
> 15/04/28 16:12:36 WARN HttpMethodReleaseInputStream: Attempting to
> release HttpMethod in finalize() as its response data stream has gone out
> of scope. This attempt will not always succeed and cannot be relied upon!
> Please ensure response data streams are always fully consumed or closed to
> avoid HTTP connection starvation.
>
> 15/04/28 16:12:36 WARN HttpMethodReleaseInputStream: Successfully
> released HttpMethod in finalize(). You were lucky this time... Please
> ensure response data streams are always fully consumed or closed.
>
> Traceback (most recent call last):
>   File "/Users/vb/spark-1.3.0-bin-hadoop2.4/python/pyspark/daemon.py",
> line 162, in manager
>     code = worker(sock)
>   File "/Users/vb/spark-1.3.0-bin-hadoop2.4/python/pyspark/daemon.py",
> line 60, in worker
>     worker_main(infile, outfile)
>   File "/Users/vb/spark-1.3.0-bin-hadoop2.4/python/pyspark/worker.py",
> line 126, in main
>     if read_int(infile) == SpecialLengths.END_OF_STREAM:
>   File "/Users/vb/spark-1.3.0-bin-hadoop2.4/python/pyspark/serializers.py",
> line 528, in read_int
>     raise EOFError
> EOFError
>
> Does anyone know the cause of this and how to fix it?
>
> Thanks,
>
> Vadim