Hello, I think this likely requires a custom processor or a custom script with ExecuteScript.
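For example, an ExecuteScript body in Jython might look roughly like this (an untested sketch: "payload" is just a placeholder for whatever your BLOB column is called, and you may need to point the processor's Module Directory at the Avro jars):

# ExecuteScript body (Jython). Assumes each incoming flow file is an Avro
# data file holding a single row record (hence the upstream split), and that
# the BLOB column landed in a "bytes" field named "payload" (placeholder name).
import jarray
from org.apache.nifi.processor.io import StreamCallback
from org.apache.avro.file import DataFileStream
from org.apache.avro.generic import GenericDatumReader

class ExtractAvroField(StreamCallback):
    def process(self, inputStream, outputStream):
        reader = DataFileStream(inputStream, GenericDatumReader())
        try:
            record = reader.next()        # the one row in this flow file
            blob = record.get("payload")  # java.nio.ByteBuffer for a "bytes" field
            data = jarray.zeros(blob.remaining(), 'b')
            blob.get(data)                # copy the buffer into a byte[]
            outputStream.write(data)      # the blob bytes become the new content
        finally:
            reader.close()

flowFile = session.get()
if flowFile is not None:
    flowFile = session.write(flowFile, ExtractAvroField())
    session.transfer(flowFile, REL_SUCCESS)

The upstream split is what makes the single reader.next() safe here, since the script only looks at the first record in each flow file.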
Coming out of the database processor, you are going to have two levels of Avro... The outer Avro represents the rows from your database, so you'll have Avro records where one field in each record is itself another Avro object. You would likely need to split all the outer records to one per flow file (not great for performance), then for each flow file use the custom processor/script to read the value of the field where the Avro blob is and overwrite the flow file content with that value, then send all of these to MergeRecord.

-Bryan

On Mon, Sep 14, 2020 at 2:29 PM Jason Iannone <[email protected]> wrote:

> Anyone have thoughts on this? Essentially we have binary Avro stored as a
> BLOB in Oracle, and I want to extract it via NiFi and read and write out
> the contents.
>
> Thanks,
> Jason
>
> On Mon, Aug 17, 2020 at 10:04 AM Jason Iannone <[email protected]> wrote:
>
>> Hi all,
>>
>> I have a scenario where an Avro binary is being stored as a BLOB in an
>> RDBMS. What's the recommended approach for querying this in bulk,
>> extracting this specific field, and batching it to HDFS?
>>
>> 1. GenerateTableFetch OR QueryDatabaseTableRecord
>> 2. Extract Avro column and assemble output <-- How?
>> 3. MergeRecord
>> 4. PutHDFS
>>
>> Additional clarification is that ultimately I want to keep the Avro
>> exactly as it is (content-wise), store it in HDFS, with an external Hive
>> table on top.
>>
>> Thanks,
>> Jason
