Have a look at Spark Streaming. You can make use of ssc.fileStream, for example:

val avroStream = ssc.fileStream[AvroKey[GenericRecord], NullWritable, AvroKeyInputFormat[GenericRecord]](inputDirectory)

You can also specify a filter function (Path => Boolean) as the second
argument to control which files in the folder are loaded.
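
To make that concrete, here is a minimal sketch of the three-argument form of fileStream with a filter. The object name, the acceptAvro helper, the ".avro" naming rule, the 60-second batch interval and the input path are all placeholders of mine, not something from your setup, and it assumes the avro-mapred classes are on the classpath and the master is supplied by spark-submit.

```scala
import org.apache.avro.generic.GenericRecord
import org.apache.avro.mapred.AvroKey
import org.apache.avro.mapreduce.AvroKeyInputFormat
import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.NullWritable
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object AvroFolderStream {

  // Hypothetical rule: accept only visible files whose name ends in ".avro".
  // Replace this with whatever decides which files you want to load.
  def acceptAvro(path: Path): Boolean = {
    val name = path.getName
    name.endsWith(".avro") && !name.startsWith(".")
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("AvroFolderStream")
    // Assumed batch interval; pick one that matches how often files arrive.
    val ssc = new StreamingContext(conf, Seconds(60))

    // Placeholder directory, not a real path from the original question.
    val inputDirectory = "hdfs:///path/to/incoming"

    // Second argument is the filter; newFilesOnly = true skips files that
    // were already in the directory before the stream started.
    val avroStream = ssc.fileStream[AvroKey[GenericRecord], NullWritable,
      AvroKeyInputFormat[GenericRecord]](inputDirectory, acceptAvro _, newFilesOnly = true)

    // Convert each record to a String right away, since GenericRecord
    // instances are not java.io.Serializable.
    avroStream.map { case (key, _) => key.datum().toString }.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Since the filter is handed the Path of each candidate file, it is also one place where you can inspect file names. The stream only reads the files, so nothing in the source folder is moved or renamed.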

Best Regards

On Wed, Aug 19, 2015 at 10:46 PM, Masf <masfwo...@gmail.com> wrote:

> Hi.
> I'd like to read Avro files using this library
> https://github.com/databricks/spark-avro
> I need to load several files from a folder, not all files. Is there some
> functionality to filter the files to load?
> And... Is it possible to know the names of the files loaded from a folder?
> My problem is that I have a folder where an external process is inserting
> files every X minutes and I need to process these files once, and I can't
> move, rename or copy the source files.
> Thanks
> --
> Regards
> Miguel Ángel
