I'm trying the same thing now. I guess you need to read the file as byte arrays somehow to make it work. What read function did you use? The mapper is not hard to write but the byte array stuff gives me a headache.
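[Editor's note: a minimal plain-Java sketch (not Flink API) of the "read the file as byte arrays" step being asked about here. It assumes a simple, hypothetical length-prefixed framing for the dump file: each record is a 4-byte big-endian length followed by that many payload bytes. A real Avro container file would instead be read with Avro's DataFileStream, which handles framing itself.]

```java
import java.io.*;
import java.util.*;

public class ByteRecordReader {
    // Read length-prefixed byte records until the stream is exhausted.
    public static List<byte[]> readRecords(InputStream in) throws IOException {
        DataInputStream din = new DataInputStream(in);
        List<byte[]> records = new ArrayList<>();
        while (din.available() > 0) {
            byte[] rec = new byte[din.readInt()]; // 4-byte length prefix
            din.readFully(rec);                   // then the record payload
            records.add(rec);
        }
        return records;
    }

    // Write one record in the same framing, for round-trip testing.
    public static void writeRecord(DataOutputStream out, byte[] rec) throws IOException {
        out.writeInt(rec.length);
        out.write(rec);
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        writeRecord(out, "event-1".getBytes());
        writeRecord(out, "event-2".getBytes());
        List<byte[]> back = readRecords(new ByteArrayInputStream(buf.toByteArray()));
        System.out.println(new String(back.get(0))); // event-1
        System.out.println(back.size());             // 2
    }
}
```

Each recovered `byte[]` can then be handed to the existing deserializer, which is exactly the shape a Kafka-oriented DeserializationSchema expects.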
cheers Martin

On Fri, Feb 12, 2016 at 9:12 PM, Nick Dimiduk <ndimi...@apache.org> wrote:
> Hi Martin,
>
> I have the same use case. I wanted to be able to load from dumps of data in
> the same format as is on the Kafka queue. I created a new application main;
> call it the "job" instead of the "flow". I refactored my code a bit for
> building the flow so all of that can be reused via a factory method. I then
> implemented a MapFunction that simply calls my existing deserializer.
> Create a new DataStream from the flat file and tack on the MapFunction
> step. The resulting DataStream is then type-compatible with the Kafka
> consumer that starts the "flow" application, so I pass it into the factory
> method. Tweak the ParameterTool options for the "job" application, et
> voilà!
>
> Sorry I don't have example code for you; this would be a good example to
> contribute back to the community's example library though.
>
> Good luck!
> -n
>
> On Fri, Feb 12, 2016 at 2:25 AM, Martin Neumann <mneum...@sics.se> wrote:
>
>> It's not only about testing; I will also need to run things against
>> different datasets. I want to reuse as much of the code as possible to
>> load the same data from a file instead of Kafka.
>>
>> Is there a simple way of loading the data from a file using the same
>> conversion classes that I would use to transform it when I read it from
>> Kafka, or do I have to write a new Avro deserializer (InputFormat)?
>>
>> On Fri, Feb 12, 2016 at 2:06 AM, Gyula Fóra <gyula.f...@gmail.com> wrote:
>>
>>> Hey,
>>>
>>> A very simple thing you could do is to set up a simple Kafka producer in
>>> a Java program that will feed the data into a topic. This also has the
>>> additional benefit that you are actually testing against Kafka.
>>>
>>> Cheers,
>>> Gyula
>>>
>>> On Fri, Feb 12, 2016 at 0:20, Martin Neumann <mneum...@sics.se> wrote:
>>>
>>>> Hej,
>>>>
>>>> I have a stream program reading data from Kafka where the data is in
>>>> Avro.
>>>> I have my own DeserializationSchema to deal with it.
>>>>
>>>> For testing reasons I want to read a dump from HDFS instead. Is there
>>>> a way to use the same DeserializationSchema to read from an Avro file
>>>> stored on HDFS?
>>>>
>>>> cheers Martin
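[Editor's note: a hypothetical plain-Java sketch of the "job vs flow" refactor Nick describes above. `Stream<T>` stands in for Flink's `DataStream<T>`, and `buildFlow()` is the assumed factory method shared by the Kafka "flow" main and the file-reading "job" main; all names here are illustrative, not the poster's actual code.]

```java
import java.util.*;
import java.util.stream.*;

public class JobVsFlowSketch {
    // Stand-in for the event type the existing DeserializationSchema produces.
    static final class Event {
        final String key;
        Event(String key) { this.key = key; }
    }

    // Stand-in for the existing deserializer: raw Kafka bytes -> Event.
    static Event deserialize(byte[] raw) {
        return new Event(new String(raw));
    }

    // The reusable "flow": everything downstream of the source lives here,
    // so both application mains can share it via this factory method.
    static List<String> buildFlow(Stream<Event> source) {
        return source.map(e -> "processed:" + e.key).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // The "job" main: file dump -> byte records -> map(deserialize) -> flow.
        // (The "flow" main would instead pass in the Kafka-consumer stream.)
        List<byte[]> dump = Arrays.asList("a".getBytes(), "b".getBytes());
        List<String> result = buildFlow(dump.stream().map(JobVsFlowSketch::deserialize));
        System.out.println(result); // [processed:a, processed:b]
    }
}
```

In actual Flink code the same shape holds: the file source plus a MapFunction wrapping the deserializer yields a DataStream of the same type the Kafka consumer produces, so one factory method serves both entry points.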