Re: [Spark Streaming] Spark Streaming dropping last lines

2016-02-10 Thread Nipun Arora
Hi All, I apologize for reposting. I wonder if anyone can explain this behavior, and what would be the best way to resolve it without introducing something like Kafka in the middle? I basically have a Logstash instance, and would like to stream the output of Logstash to Spark Streaming without

Re: [Spark Streaming] Spark Streaming dropping last lines

2016-02-10 Thread Dean Wampler
Here's a wild guess; it might be the fact that your first command uses tail -f, so it doesn't close the input file handle when it hits the end of the available bytes, while your second use of nc does this. If so, the last few lines might be stuck in a buffer waiting to be forwarded. If so, Spark
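
To make the buffering guess above concrete, here is a minimal sketch in Python. It is not the code of tail, nc, or Spark; LineBuffer is a hypothetical name used only for illustration. It assumes a reader that forwards only newline-terminated lines and holds any trailing bytes until more data arrives or the stream is closed.

```python
from typing import List


class LineBuffer:
    """Hypothetical line-oriented reader: releases only complete lines."""

    def __init__(self) -> None:
        # Bytes received so far that do not yet end in a newline.
        self._pending = b""

    def feed(self, chunk: bytes) -> List[str]:
        """Add raw bytes and return every line completed by this chunk."""
        self._pending += chunk
        # Everything before the last newline is complete; the rest stays buffered.
        *complete, self._pending = self._pending.split(b"\n")
        return [line.decode("utf-8") for line in complete]

    def close(self) -> List[str]:
        """Simulate end-of-stream: flush whatever is still buffered."""
        rest, self._pending = self._pending, b""
        return [rest.decode("utf-8")] if rest else []
```

With this model, feeding b"line1\nline2\nline3" yields only line1 and line2; line3 appears only after a later newline or after close(). A producer that never closes its end of the stream (as tail -f does not) would leave such a trailing fragment unseen by the consumer. Whether this is what actually happens in the reported setup is an assumption, not something the thread confirms.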

[Spark Streaming] Spark Streaming dropping last lines

2016-02-08 Thread Nipun Arora
I have a Spark Streaming service in which I process data and detect anomalies on the basis of a model generated offline. I feed data into this service from a log file, which is streamed using the following command: tail -f | nc -lk. Here the Spark Streaming service is taking data from