Hi all,
I have coded a custom receiver that receives Kafka messages. Each of these
Kafka messages contains FTP server credentials. The receiver opens the
message and uses the FTP credentials in it to connect to the FTP server. It
then streams a huge text file (3.3 GB) from that server. Finally, this stream
is read line by line using a buffered reader, and each line is pushed to Spark
Streaming via the receiver's "store" method. The Spark Streaming process
receives all these lines and stores them in HDFS.
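A minimal sketch of the read loop described above, for illustration only and not the actual receiver code. The class name LineForwarder and the `store` callback are hypothetical: the callback stands in for the receiver's "store" method, and a StringReader stands in for the FTP stream, so the sketch runs without Spark, Kafka, or an FTP server.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class LineForwarder {
    // Reads the source line by line and hands each line to `store`.
    // `store` is a placeholder for the receiver's store(String) call.
    // Returns the number of lines forwarded.
    public static long forwardLines(Reader source, Consumer<String> store) throws IOException {
        long count = 0;
        // try-with-resources closes the reader even if store throws
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                store.accept(line);
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the FTP stream: a small in-memory text source.
        List<String> received = new ArrayList<>();
        long n = forwardLines(new StringReader("a\nb\nc\n"), received::add);
        System.out.println(n + " lines forwarded: " + received);
    }
}
```

In this shape only one line is held in memory at a time on the reading side; whatever sits behind the callback decides how much gets buffered downstream.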
With this process I can ingest small files (50 MB), but I can't ingest this
3.3 GB file: I get a YARN exception with SIGTERM (15) in the Spark Streaming
process. I also tried reading that 3.3 GB file directly (without the custom
receiver) in Spark Streaming using ssc.textFileStream, and everything works
fine: the file ends up in HDFS.
Please let me know what I might have to do to get this working with the
receiver. I know there are better ways to ingest the file, but we need to use
Spark Streaming in our case.
Thanks.
