Assuming you are running Linux, an easy option would be just to use the
Linux tail command to extract the last line (or last couple of lines) of
a file and save them to a different file/directory, before feeding it to
Spark. It shouldn't be hard to write a shell script that executes tail
on all the files in the directory.
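A minimal sketch of such a script, assuming the input files live in a local directory named `input_dir` and the extracted lines go to `last_lines` (both names are hypothetical; the sample data is only for illustration):

```shell
#!/bin/sh
# Demo setup: create a couple of sample input files (hypothetical data).
mkdir -p input_dir
printf 'header\nmiddle\nfinal-one\n' > input_dir/one.txt
printf 'only-line\n' > input_dir/two.txt

# Extract the last line of every file in input_dir/ into a matching
# file under last_lines/, which Spark can then read instead of the
# full files.
mkdir -p last_lines
for f in input_dir/*; do
  [ -f "$f" ] || continue                        # skip subdirectories
  tail -n 1 "$f" > "last_lines/$(basename "$f")"
done
```

You could then point spark.read.text (or sc.textFile) at last_lines/ instead of the original directory.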
Hi users,
Does anyone here have experience writing Spark code that reads just the
last line of each text file in a directory, S3 bucket, etc.?
I am looking for a solution that doesn’t require reading the whole file. I
basically wonder whether you can create a DataFrame/RDD using file seek.
Not s
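To illustrate the file-seek idea for local/POSIX paths (S3 would need ranged GET requests instead), here is a hypothetical last_line helper, a sketch rather than a full solution, that reads only the final bytes of a file:

```python
import os

def last_line(path, block_size=4096):
    """Return the last line of a file by seeking near the end,
    so the whole file is never read. Assumes the last line fits
    in block_size bytes (a simplification for this sketch)."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        size = f.tell()
        if size == 0:
            return ""
        # Jump back at most block_size bytes from the end and read
        # only that tail; everything before it is skipped entirely.
        f.seek(-min(size, block_size), os.SEEK_END)
        tail = f.read()
    return tail.splitlines()[-1].decode("utf-8")
```

You could then distribute a driver-collected list of paths with sc.parallelize(paths).map(last_line), provided every executor can open those paths (i.e. a shared filesystem).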