Re: Issues with reading gz files with Spark Streaming

2016-10-24 Thread Steve Loughran
On 22 Oct 2016, at 20:58, Nkechi Achara > wrote: I do not use rename, and the files are written to, and then moved to a directory on HDFS in gz format. in that case there's nothing obvious to mee. try logging at trace/debug the class:

Re: Issues with reading gz files with Spark Streaming

2016-10-22 Thread Nkechi Achara
I do not use rename, and the files are written to, and then moved to a directory on HDFS in gz format. On 22 October 2016 at 15:14, Steve Loughran wrote: > > > On 21 Oct 2016, at 15:53, Nkechi Achara wrote: > > > > Hi, > > > > I am using Spark

Re: Issues with reading gz files with Spark Streaming

2016-10-22 Thread Steve Loughran
> On 21 Oct 2016, at 15:53, Nkechi Achara wrote: > > Hi, > > I am using Spark 1.5.0 to read gz files with textFileStream, but when new > files are dropped in the specified directory. I know this is only the case > with gz files as when i extract the file into the

Issues with reading gz files with Spark Streaming

2016-10-21 Thread Nkechi Achara
Hi, I am using Spark 1.5.0 to read gz files with textFileStream, but when new files are dropped in the specified directory. I know this is only the case with gz files as when i extract the file into the directory specified the files are read on the next window and processed. My code is here: