Hello,

I think this likely requires a custom processor, or a custom script with
ExecuteScript.

Coming out of the database processor, you are going to have two levels of
Avro...

The outer Avro represents the rows from your database, so you'll have
Avro records where one field in each record (the BLOB column) is itself
another Avro object.

You would likely need to split the outer records to one per flow file
(not great for performance), then for each flow file use the custom
processor/script to read the value of the field holding the Avro blob
and overwrite the flow file content with that value, then send all of
these to a MergeRecord. A rough sketch of the script step is below.
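For the ExecuteScript step, here is a minimal Jython sketch, assuming each
flow file contains exactly one outer record after the split and that the
BLOB column is named "PAYLOAD" (both assumptions, adjust to your schema).
Depending on your setup you may also need to point the processor's Module
Directory at the Avro jars if they aren't visible to the script engine:

from org.apache.nifi.processor.io import StreamCallback
from org.apache.avro.file import DataFileStream
from org.apache.avro.generic import GenericDatumReader
from jarray import zeros

class ExtractBlob(StreamCallback):
    # Read the outer Avro datafile, pull the raw bytes out of the
    # (hypothetical) PAYLOAD field, and write them as the new content.
    def process(self, inputStream, outputStream):
        reader = DataFileStream(inputStream, GenericDatumReader())
        try:
            record = reader.next()  # one outer record per flow file after the split
            blob = record.get("PAYLOAD")  # Avro "bytes" arrives as a java.nio.ByteBuffer
            buf = zeros(blob.remaining(), 'b')
            blob.get(buf)
            outputStream.write(buf)
        finally:
            reader.close()

flowFile = session.get()
if flowFile is not None:
    flowFile = session.write(flowFile, ExtractBlob())
    session.transfer(flowFile, REL_SUCCESS)

Since the content coming out of this is the inner Avro exactly as it was
stored in the BLOB, the downstream MergeRecord should see the original
schema.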

-Bryan


On Mon, Sep 14, 2020 at 2:29 PM Jason Iannone <[email protected]> wrote:

> Anyone have thoughts on this? Essentially we have binary Avro stored as a
> BLOB in Oracle, and I want to extract it via NiFi and read and write out
> the contents.
>
> Thanks,
> Jason
>
> On Mon, Aug 17, 2020 at 10:04 AM Jason Iannone <[email protected]> wrote:
>
>> Hi all,
>>
>> I have a scenario where an Avro binary is being stored as a BLOB in an
>> RDBMS. What's the recommended approach for querying this in bulk,
>> extracting this specific field, and batching it to HDFS?
>>
>>    1. GenerateTableFetch OR QueryDatabaseTableRecord
>>    2. Extract Avro column and assemble output <-- How?
>>    3. MergeRecord
>>    4. PutHDFS
>>
>> To clarify: ultimately I want to keep the Avro exactly as it is
>> (content-wise), store it in HDFS, and put an external Hive table on top.
>>
>> Thanks,
>> Jason
>>
>
