My input file contains newline-delimited JSON records, one per line. The
records on the Kafka topic are the same JSON blobs, encoded as UTF-8 bytes.
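
Concretely, I read the file with readTextFile and re-encode each line
before handing it to the schema. A rough, untested sketch (MyEvent and
MyEventSchema are placeholders for your own event type and the
DeserializationSchema your Kafka consumer already uses):

    import java.nio.charset.StandardCharsets;
    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.streaming.api.datastream.DataStream;

    // Re-encode each text line to the UTF-8 bytes that would have arrived
    // from Kafka, then run the existing schema on those bytes.
    DataStream<MyEvent> events = env
        .readTextFile("hdfs:///path/to/dump.json") // placeholder path
        .map(new MapFunction<String, MyEvent>() {
            // placeholder for the schema the Kafka consumer already uses
            private final MyEventSchema schema = new MyEventSchema();

            @Override
            public MyEvent map(String line) throws Exception {
                return schema.deserialize(line.getBytes(StandardCharsets.UTF_8));
            }
        });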

On Fri, Feb 12, 2016 at 1:41 PM, Martin Neumann <mneum...@sics.se> wrote:

> I'm trying the same thing now.
>
> I guess you need to read the file as byte arrays somehow to make it work.
> What read function did you use? The mapper is not hard to write but the
> byte array stuff gives me a headache.
>
> cheers Martin
>
> On Fri, Feb 12, 2016 at 9:12 PM, Nick Dimiduk <ndimi...@apache.org> wrote:
>
>> Hi Martin,
>>
>> I have the same use case: I wanted to be able to load dumps of data in
>> the same format as is on the Kafka queue. I created a new application
>> main; call it the "job" as opposed to the "flow". I refactored my
>> flow-building code a bit so that all of it can be reused via a factory
>> method. I then implemented a MapFunction that simply calls my existing
>> deserializer, created a new DataStream from the flat file, and tacked on
>> the MapFunction step. The resulting DataStream is type-compatible with
>> the Kafka consumer that starts the "flow" application, so I pass it into
>> the factory method. Tweak the ParameterTool options for the "job"
>> application, et voilà!
>>
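>> To sketch the shape of it (rough and untested; every name here is
>> invented for illustration, and the exact Kafka consumer class depends on
>> your connector version):
>>
>>     // factory method shared by both entry points; builds the topology
>>     public static void buildFlow(DataStream<MyEvent> events) {
>>         events.print(); // stand-in for the real flow
>>     }
>>
>>     // "flow" main: source is the Kafka consumer
>>     DataStream<MyEvent> fromKafka = env.addSource(
>>         new FlinkKafkaConsumer08<>(topic, new MyEventSchema(), kafkaProps));
>>     buildFlow(fromKafka);
>>
>>     // "job" main: same flow, fed from a flat file instead
>>     DataStream<MyEvent> fromFile = env
>>         .readTextFile(params.get("input"))
>>         .map(new DeserializeLine()); // MapFunction invoking MyEventSchema
>>     buildFlow(fromFile);
>>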
>> Sorry I don't have real example code for you beyond that sketch; this
>> would be a good example to contribute back to the community's example
>> library though.
>>
>> Good luck!
>> -n
>>
>> On Fri, Feb 12, 2016 at 2:25 AM, Martin Neumann <mneum...@sics.se> wrote:
>>
>>> It's not only about testing; I will also need to run things against
>>> different datasets. I want to reuse as much of the code as possible to
>>> load the same data from a file instead of Kafka.
>>>
>>> Is there a simple way of loading the data from a file using the same
>>> conversion classes that I would use to transform the records when I read
>>> them from Kafka, or do I have to write a new Avro deserializer
>>> (InputFormat)?
>>>
>>> On Fri, Feb 12, 2016 at 2:06 AM, Gyula Fóra <gyula.f...@gmail.com>
>>> wrote:
>>>
>>>> Hey,
>>>>
>>>> A very simple thing you could do is set up a simple Kafka producer in
>>>> a Java program that feeds the data into a topic. This has the added
>>>> benefit that you are actually testing against Kafka.
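>>>>
>>>> A minimal, untested sketch using the plain Kafka producer API (broker,
>>>> topic, and file names are placeholders; this assumes one record per
>>>> line, for an Avro container file you would pull records out with Avro's
>>>> DataFileReader instead):
>>>>
>>>>     import java.io.BufferedReader;
>>>>     import java.nio.charset.StandardCharsets;
>>>>     import java.nio.file.Files;
>>>>     import java.nio.file.Paths;
>>>>     import java.util.Properties;
>>>>     import org.apache.kafka.clients.producer.KafkaProducer;
>>>>     import org.apache.kafka.clients.producer.ProducerRecord;
>>>>
>>>>     Properties props = new Properties();
>>>>     props.put("bootstrap.servers", "localhost:9092"); // placeholder
>>>>     props.put("key.serializer",
>>>>         "org.apache.kafka.common.serialization.ByteArraySerializer");
>>>>     props.put("value.serializer",
>>>>         "org.apache.kafka.common.serialization.ByteArraySerializer");
>>>>
>>>>     // stream the dump into the topic, one record per line
>>>>     try (KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props);
>>>>          BufferedReader reader =
>>>>              Files.newBufferedReader(Paths.get("dump.json"))) { // placeholder
>>>>         String line;
>>>>         while ((line = reader.readLine()) != null) {
>>>>             producer.send(new ProducerRecord<>("my-topic", // placeholder
>>>>                 line.getBytes(StandardCharsets.UTF_8)));
>>>>         }
>>>>     }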
>>>>
>>>> Cheers,
>>>> Gyula
>>>>
>>>> On Fri, Feb 12, 2016 at 12:20 AM, Martin Neumann <mneum...@sics.se>
>>>> wrote:
>>>>
>>>>> Hej,
>>>>>
>>>>> I have a stream program reading data from Kafka where the data is in
>>>>> Avro, and I have my own DeserializationSchema to deal with it.
>>>>>
>>>>> For testing purposes I want to read a dump from HDFS instead. Is there
>>>>> a way to use the same DeserializationSchema to read from an Avro file
>>>>> stored on HDFS?
>>>>>
>>>>> cheers Martin
>>>>>
>>>>
>>>
>>
>
