Hi All,
I apologize for reposting, but can anyone explain this behavior, and what
would be the best way to resolve it without introducing something like
Kafka in the middle? I basically have a Logstash instance and would like
to stream the output of Logstash to Spark Streaming without an extra
broker in between.
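For what it's worth, one broker-free option is to have Logstash push events straight onto a TCP socket that the Spark side reads from. A rough sketch using Logstash's stock tcp output plugin (the host and port below are placeholders, not from the original setup):

```
output {
  tcp {
    host => "127.0.0.1"   # placeholder: machine where the Spark receiver runs
    port => 9999          # placeholder: port the receiver listens on / connects to
    mode => "client"      # Logstash connects out; use "server" to listen instead
  }
}
```

Whether "client" or "server" mode fits depends on which side you want to own the listening socket.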
Here's a wild guess: it might be the fact that your first command uses tail
-f, so it doesn't close the input file handle when it hits the end of the
available bytes, while your second use of nc does. If so, the last few
lines might be stuck in a buffer waiting to be forwarded, and Spark won't
see them until that buffer is flushed or the handle is closed.
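That buffering effect is easy to reproduce outside of tail/nc/Spark. The sketch below (not your pipeline, just an illustration) shows that when a process writes to a pipe rather than a terminal, output is block-buffered by default, so a reader on the other end doesn't see a "written" line until the writer flushes or exits:

```python
import subprocess
import sys
import textwrap
import time

# Child process: writes a line without flushing, sleeps, then flushes.
# Because its stdout is a pipe (like `tail -f | nc`), the line sits in
# the block buffer during the sleep instead of reaching the reader.
child_src = textwrap.dedent("""
    import sys, time
    sys.stdout.write("buffered line\\n")  # stays in the block buffer
    time.sleep(0.5)
    sys.stdout.flush()                    # only now does the reader see it
""")

proc = subprocess.Popen([sys.executable, "-c", child_src],
                        stdout=subprocess.PIPE, text=True)

t0 = time.monotonic()
line = proc.stdout.readline()   # blocks until the child flushes
elapsed = time.monotonic() - t0
proc.wait()
print(f"got {line!r} after {elapsed:.2f}s")
```

The readline() call blocks for roughly the length of the child's sleep, even though the line was written immediately, which is the same symptom as the "missing" last lines in the tail -f case.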
I have a Spark Streaming service where I am processing and detecting
anomalies based on an offline-generated model. I feed data into this
service from a log file, which is streamed using the following command:

tail -f | nc -lk

Here the Spark Streaming service is taking data from the socket that nc is
listening on.
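For anyone debugging this end of the pipe: Spark's socket receiver essentially just connects to a TCP port, reads the byte stream, and splits records on newlines. A minimal stand-in in plain Python (ephemeral port and sample lines are made up here) is handy for checking what the nc side is actually emitting, without involving Spark at all:

```python
import socket
import threading

received = []
srv = socket.create_server(("127.0.0.1", 0))  # ephemeral port, demo only
port = srv.getsockname()[1]

def fake_log_source():
    # Stand-in for `tail -f | nc -lk`: accept one client, send a few lines.
    conn, _ = srv.accept()
    with conn:
        conn.sendall(b"line one\nline two\nline three\n")

threading.Thread(target=fake_log_source, daemon=True).start()

# Client side: roughly what a socket-based text receiver does --
# connect, read the byte stream, split records on '\n'.
with socket.create_connection(("127.0.0.1", port)) as sock:
    buf = b""
    while True:
        chunk = sock.recv(4096)
        if not chunk:            # sender closed the connection
            break
        buf += chunk
        while b"\n" in buf:
            line, buf = buf.split(b"\n", 1)
            received.append(line.decode())

srv.close()
print(received)
```

If lines show up here but not in Spark, the problem is on the Spark side; if they don't show up here either, they are still stuck in the tail/nc buffer as guessed above.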