Bill,

Can you share a bit more detail about your database setup? What kind
of database is it (e.g., MySQL, Oracle, Postgres), and what does your
table look like?  Are you looking to do this once, periodically, or
incrementally as new rows are added? If incrementally, is there a
column whose value always increases (like a primary key / ID column
or a timestamp)?

In general you'll want to set up a DBCPConnectionPool controller
service, which gives processors connections to the database.  Then you
could use ExecuteSQL, QueryDatabaseTable, or perhaps another
SQL-related processor to fetch the data. Those processors need a
reference to the DBCPConnectionPool you set up; they can then be
configured to execute a SQL statement (in the case of ExecuteSQL) or
to incrementally fetch "new" rows from a specified table (with
QueryDatabaseTable).  These processors output the rows as an
Avro-formatted file. To manipulate the contents you'd often convert
the file to JSON using the ConvertAvroToJSON processor, and since you
usually want to deal with one row/record at a time, you can then use
SplitJson (alternatively, after the SQL processor you can use
SplitAvro and then ConvertAvroToJSON). Depending on how many fields
are in each row (I'm going to assume 3 for Turtle), you can use
EvaluateJsonPath to get each field/column value into an attribute,
then ReplaceText to set the values in Turtle format.
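
To make the last step concrete, here is a rough sketch in plain Python
of what the EvaluateJsonPath + ReplaceText combination would be doing
for one row. The column names ("id", "property", "value") and the
http://example.org/ base IRI are made up for illustration, since I
don't know your schema yet, and the sketch assumes the id and property
values are already safe to drop into an IRI.

```python
import json


def escape_literal(value):
    """Escape characters that are special inside a Turtle string literal."""
    return (str(value)
            .replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
            .replace("\r", "\\r"))


def row_to_turtle(row, base="http://example.org/"):
    """Render one row as a single Turtle triple.

    Assumes three hypothetical columns: 'id', 'property' and 'value'.
    """
    subject = "<%sresource/%s>" % (base, row["id"])
    predicate = "<%s%s>" % (base, row["property"])
    obj = '"%s"' % escape_literal(row["value"])
    return "%s %s %s ." % (subject, predicate, obj)


# One record, as it might look after ConvertAvroToJSON and SplitJson.
record = json.loads('{"id": 42, "property": "name", "value": "Ada Lovelace"}')
print(row_to_turtle(record))
```

In the flow itself, the three values would live in flow file
attributes and ReplaceText would fill in the same kind of template.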

There is also an ExecuteStreamCommand processor where you could shell
out to your Python script, or if it is a pure Python script (no
C-extension modules), you could paste it into an ExecuteScript
processor (using "python" as the engine, which is actually Jython).
I'm not sure any of these approaches will give you better performance,
except that you could perform some of these operations concurrently
(or in parallel if using a NiFi cluster).
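
If you go the ExecuteStreamCommand route, the script mostly needs to
read the flow file content from stdin and write the result to stdout.
Below is a minimal sketch of that shape, assuming the incoming content
is the JSON produced by ConvertAvroToJSON (an array of rows, or a
single object) and using the same made-up column names ("id",
"property", "value") and base IRI; swap in whatever your existing
script does.

```python
import json
import sys


def convert(in_stream, out_stream, base="http://example.org/"):
    """Read JSON rows from in_stream, write one Turtle triple per row.

    Returns the number of rows written. Column names are hypothetical.
    """
    text = in_stream.read()
    if not text.strip():
        return 0  # empty input: nothing to convert
    rows = json.loads(text)
    if isinstance(rows, dict):
        rows = [rows]  # a single record may arrive as a bare object
    for row in rows:
        # Minimal escaping for a Turtle string literal.
        value = str(row["value"]).replace("\\", "\\\\").replace('"', '\\"')
        out_stream.write('<%sresource/%s> <%s%s> "%s" .\n'
                         % (base, row["id"], base, row["property"], value))
    return len(rows)


if __name__ == "__main__":
    convert(sys.stdin, sys.stdout)
```

Keeping the conversion in a function that takes streams makes it easy
to test outside NiFi before wiring it into the flow.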

Regards,
Matt

On Thu, Apr 13, 2017 at 11:55 AM, Bill Duncan <[email protected]> wrote:
> Sorry if this is a duplicate message ...
>
> I am quite interested in the Nifi software and I've been watching the
> videos. However, I can't seem to connect to a database and extract records.
> My main goal is to be able to take records from a database and convert
> them into RDF (Turtle). I already do this using Python, but I was hoping
> that Nifi could speed up the translation process. But, before I start
> translating records, I'm just trying to connect and extract.
>
> Any help would be much appreciated!
>
> Thanks,
> Bill
>
