Pulling API Endpoints into Kafka Topics in Avro

2017-03-28 Thread Steve Champagne
I'm in the process of creating an ingest workflow that will pull a number of API endpoints into Kafka topics on an hourly basis. I'd like to convert them from JSON to Avro when I bring them in. I have, however, run into a few problems that I haven't been able to figure out and haven't turned anything

Re: Pulling API Endpoints into Kafka Topics in Avro

2017-03-28 Thread Steve Champagne
", > "name": "agentMetrics", > "fields": [ > { > "name": "connectedEngagements", > "type": "long" > }, > { >

Merging Small Files

2017-08-06 Thread Steve Champagne
Hello, I'm pulling data from API endpoints every five minutes and putting it into HDFS. This, however, is giving me quite a few small files: 288 files per day times however many endpoints I am reading. My current approach for handling them is to load the small files into some sort of staging
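As a quick check of the file count mentioned in this message, here is a minimal Python sketch of the arithmetic. It assumes one file is written per endpoint per poll; the function name files_per_day and the endpoint count of 10 are hypothetical and only there for illustration.

```python
MINUTES_PER_DAY = 24 * 60  # 1440 minutes in a day


def files_per_day(interval_minutes: int, endpoints: int = 1) -> int:
    """Files written per day, assuming one file per endpoint per poll."""
    polls_per_day = MINUTES_PER_DAY // interval_minutes
    return polls_per_day * endpoints


# One endpoint polled every five minutes, as described in the message.
print(files_per_day(5))       # 288

# Hypothetical example: ten endpoints on the same schedule.
print(files_per_day(5, 10))   # 2880
```

The count grows linearly with the number of endpoints, which is why the small files pile up quickly.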

HDFS Environments

2017-08-17 Thread Steve Champagne
Hello, How would I handle environment separation in HDFS? My initial thought was to use a directory structure like /data///, but I'm running into problems with reading the files back out of HDFS (for example, merging small files into larger files). The ListHDFS processor doesn't allow

Wrapping a JSON string

2018-08-27 Thread Steve Champagne
Hello, I'm ingesting some JSON data that I'd like to wrap in a json_string field as a string type. I tried using a JsonPathReader with a dynamic property 'json_string' and a value of $, but I seem to be getting back a string version of the JSON:

Re: Wrapping a JSON string

2018-09-04 Thread Steve Champagne
D": "${payload}" > > } > > } > > ]