True, but my concern is that you might see performance issues with a new
connection each time, especially if the same value(s) come in many times in
a row (i.e. choosing the same connection config). Having a small cache
might afford you some speedups.
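
As a rough illustration of the small cache mentioned above, here is a
minimal sketch of a bounded, least-recently-used cache for closeable
resources. The class name LruResourceCache, the key type, and the factory
function are all hypothetical and not part of NiFi or Apache DBCP. In a
custom controller service the key might be the JDBC URL plus username and
the value a small per-database pool (in DBCP 2, BasicDataSource implements
AutoCloseable), but that wiring is an assumption and is not shown here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/**
 * Bounded LRU cache for closeable resources (for example, small connection pools).
 * Hypothetical helper for illustration only; not a NiFi or DBCP class.
 */
class LruResourceCache<K, V extends AutoCloseable> {
    private final Map<K, V> entries;
    private final Function<K, V> factory;

    LruResourceCache(final int maxEntries, final Function<K, V> factory) {
        this.factory = factory;
        // accessOrder = true keeps entries ordered from least to most recently used.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                if (size() <= maxEntries) {
                    return false;
                }
                // Over capacity: release the least recently used resource, then drop it.
                closeQuietly(eldest.getValue());
                return true;
            }
        };
    }

    /** Returns the cached resource for this key, creating it on a miss. */
    synchronized V get(K key) {
        V value = entries.get(key);
        if (value == null) {
            value = factory.apply(key);
            entries.put(key, value);
        }
        return value;
    }

    synchronized int size() {
        return entries.size();
    }

    /** Closes and removes everything, e.g. when the service is disabled. */
    synchronized void closeAll() {
        for (V value : entries.values()) {
            closeQuietly(value);
        }
        entries.clear();
    }

    private static void closeQuietly(AutoCloseable resource) {
        try {
            resource.close();
        } catch (Exception e) {
            // A real implementation would log this; the sketch ignores it.
        }
    }
}
```

With a capacity of 2, requesting keys "a", "b", "a", "c" in that order
would evict and close the resource for "b", since "a" was touched more
recently. One thing this sketch does not handle is evicting a pool whose
connections are still borrowed by a running processor, which would need
extra care in a real service.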

Regards,
Matt

On Sun, Mar 10, 2024 at 9:17 AM Eduardo Fontes <eduardo.fon...@gmail.com>
wrote:

> Hi Matt,
>
> I don't think I need a pool or a cache, since each DB connection will be
> used once per object (table/view). So I don't think it will be a problem
> to create a DB connection, read the object, and destroy the connection for
> each object.
>
> I'll try to implement this using the DBCPService controller service
> interface.
>
> Thanks for your consideration.
>
> Eduardo Fontes
>
> On Tue, Mar 5, 2024 at 11:10 PM Matt Burgess <mattyb...@apache.org> wrote:
>
> > Eduardo,
> >
> > It doesn't sound like DBCPConnectionPoolLookup will work for you because
> > of all the different connection strings. I don't know if there's a good
> > reason why we couldn't create the BasicDataSource when getConnection() is
> > called, passing in a Map of FlowFile attributes (that's how the Lookup
> > version works). One issue I do see is with "churn" if we're recreating
> > the data source each time. At that point it's not pooling connections. I
> > suppose you could have an internal cache of data sources but it would
> > have to be bounded and/or configurable and have a least-recently-used
> > (LRU) eviction strategy.
> >
> > DBCPService is the name of the controller service interface that the
> > database processors use, but that's a misnomer since the API doesn't
> > mention pooling specifically. Instead you could have an implementation
> > that uses a cache vs a pooling approach. But Apache DBCP does handle a
> > lot of the management (validation, eviction, idle timeouts, etc.) so
> > unless there's no way to avoid the potential memory/performance issues
> > (like having 50+ controller services in a PG) you could try to wrangle
> > smaller pools per data source and cache those if that's ok for your use
> > case.
> >
> > My two cents,
> > Matt
> >
> > On Tue, Mar 5, 2024 at 7:25 PM Eduardo Fontes <eduardo.fon...@gmail.com>
> > wrote:
> >
> > > Hi Everybody!
> > >
> > > I'm thinking about making a generic ingestor with Apache NiFi, but I
> > > ran into some difficulties because of the database connection pool
> > > controller. It doesn't accept flowfile parameters for its properties,
> > > especially connection string, username and password (for security
> > > reasons, some sensitive parameter name instead of the password itself).
> > >
> > > This is important because, as a generic ingestor, I might have hundreds
> > > of different connection strings, and I had a lot of problems when I
> > > tried to put 50 DBCP controllers in a Process Group.
> > >
> > > I wouldn't like to create a flow for each ingestion, but one flow for
> > > each database vendor.
> > >
> > > Does anyone have any suggestions on how I can achieve this? Would it be
> > > easy to create a parameterized DBCP controller (one that I could write
> > > myself)?
> > >
> > > Best regards.
> > >
> > > Eduardo Fontes
> > > Data Eng / System Analyst Sr.
> > >
> >
>