I'm in the process of creating an ingest workflow that will pull a number of
API endpoints into Kafka topics on an hourly basis. I'd like to convert the
payloads from JSON to Avro as I bring them in. I have, however, run into a
few problems that I haven't been able to figure out, and my searches haven't
turned anything up. The schema looks like this:

> {
>   "type": "record",
>   "name": "agentMetrics",
>   "fields": [
>     {
>       "name": "connectedEngagements",
>       "type": "long"
>     },
>     {
>       ...
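
Not from the thread, but as a quick way to sanity-check the conversion
outside the pipeline: a minimal sketch using the fastavro Python library to
serialize one JSON payload against the fragment above. The cut-down
single-field schema and the sample value are assumptions; the real schema
clearly has more fields.

    import io
    import json

    import fastavro

    # Assumed cut-down version of the agentMetrics schema quoted above.
    schema = fastavro.parse_schema({
        "type": "record",
        "name": "agentMetrics",
        "fields": [
            {"name": "connectedEngagements", "type": "long"},
        ],
    })

    # Hypothetical payload as it might arrive from one of the endpoints.
    record = json.loads('{"connectedEngagements": 42}')

    # fastavro raises if the record doesn't match the schema, so this
    # doubles as a schema validator for sample payloads.
    buf = io.BytesIO()
    fastavro.writer(buf, schema, [record])
    print(len(buf.getvalue()), "bytes of Avro written")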
Hello,
I'm pulling data from API endpoints every five minutes and putting it into
HDFS. This, however, is giving me quite a few small files: 288 files per
day, times however many endpoints I am reading. My current approach for
handling them is to load the small files into some sort of staging area and
periodically merge them into larger files.
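
The message doesn't say which tool does the merging, so the following is
only one common pattern: a scheduled Spark compaction job that reads a
day's worth of small files and rewrites them as a handful of larger ones.
The paths, output file count, and JSON format here are all assumptions.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("compact-small-files").getOrCreate()

    # Hypothetical staging directory holding one day's ~288 small files
    # for a single endpoint.
    staging = "hdfs:///data/staging/endpoint_a/2019-06-01"

    df = spark.read.json(staging)

    # coalesce(4) collapses the many small inputs into 4 output files;
    # pick the count so each file lands near the HDFS block size.
    (df.coalesce(4)
       .write.mode("overwrite")
       .json("hdfs:///data/merged/endpoint_a/2019-06-01"))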
Hello,
How would I handle environment separation in HDFS? My initial thought was
to use a directory structure like /data/<environment>/<source>/<date>,
but I'm running into problems with reading the files back out of HDFS (for
example, when merging small files into larger files). The ListHDFS
processor, for instance, doesn't allow incoming connections, so I can't
set the directory it lists dynamically per environment.
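
To make the layout concrete, here is a small sketch of the
environment-first path scheme described above; the segment names and date
format are assumptions, not something from the thread.

    from datetime import date

    def hdfs_path(environment: str, source: str, day: date) -> str:
        # Environment comes first so dev/test/prod never share a subtree,
        # and a single flow can be pointed at one environment's root.
        return f"/data/{environment}/{source}/{day:%Y-%m-%d}"

    print(hdfs_path("dev", "agent_metrics", date(2019, 6, 1)))
    # -> /data/dev/agent_metrics/2019-06-01
    print(hdfs_path("prod", "agent_metrics", date(2019, 6, 1)))
    # -> /data/prod/agent_metrics/2019-06-01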
Hello,
I'm ingesting some JSON data that I'd like to wrap in a json_string field
as a string type. I tried using a JsonPathReader with a dynamic property
'json_string' and a value of $, but I seem to be getting back a string
version of the JSON.
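
For reference, the target shape can be produced by re-serializing the
payload yourself. A minimal sketch in plain Python rather than a record
reader; the sample payload is made up, and only the json_string field name
comes from the question.

    import json

    # Hypothetical raw payload as it arrives from the endpoint.
    raw = '{"connectedEngagements": 42}'

    # Wrap the original document, verbatim, in a single string-typed field.
    wrapped = {"json_string": raw}

    print(json.dumps(wrapped))
    # {"json_string": "{\"connectedEngagements\": 42}"}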
D": "${payload}"
>
> }
>
> }
>
> ]
>
>
>
>
>
>
>
>