Prometheus Reporting Task Erroneous Thread Count Data

2023-09-01 Thread Agbozo, Robert
Greetings!

I believe I came across a bug that may have slipped through the cracks. Benjamin 
Charron originally reported this issue in the summer of 2020 in ticket NIFI-7616. 
The suggestion was to replace the status.getActiveThreadCount() call with 
processorStatus.getActiveThreadCount() so that the component’s thread count is 
reported accurately instead of the process group’s thread count. At the moment 
the call remains unchanged in PrometheusMetricsUtil.java at both lines 173 and 
175. NIFI-7616 was closed as a duplicate of NIFI-7796, but it appears Benjamin 
Charron’s diff was never merged in.
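
For reference, the proposed fix amounts to a one-line swap inside the 
per-processor loop. A rough sketch of the idea (the gauge name and labels below 
are illustrative, not the exact ones defined in PrometheusMetricsUtil.java):

    import org.apache.nifi.controller.status.ProcessGroupStatus;
    import org.apache.nifi.controller.status.ProcessorStatus;

    import io.prometheus.client.CollectorRegistry;
    import io.prometheus.client.Gauge;

    public class ActiveThreadGaugeSketch {

        private static final CollectorRegistry REGISTRY = new CollectorRegistry();

        // Hypothetical gauge for illustration; not the metric actually defined in PrometheusMetricsUtil.java
        private static final Gauge ACTIVE_THREADS = Gauge.build()
                .name("nifi_amount_threads_active_sketch")
                .help("Active threads per component (sketch)")
                .labelNames("component_id", "component_name")
                .register(REGISTRY);

        public static void collect(final ProcessGroupStatus status) {
            for (final ProcessorStatus processorStatus : status.getProcessorStatus()) {
                // What is reported today: the enclosing process group's active thread count
                // ACTIVE_THREADS.labels(...).set(status.getActiveThreadCount());

                // What NIFI-7616 suggests: the component's own active thread count
                ACTIVE_THREADS.labels(processorStatus.getId(), processorStatus.getName())
                        .set(processorStatus.getActiveThreadCount());
            }
        }
    }

With the component’s own status object the gauge tracks that processor’s 
concurrent tasks rather than the aggregate for its process group.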

I originally observed this erroneous behavior with Prometheus when a component 
configured with a low number of concurrent tasks was reporting a high active 
thread count for hours. After combing through PrometheusMetricsUtil.java and 
isolating that specific component into its own process group with local input 
and output ports, the current number of active threads was then reported to 
Prometheus.

Robert Agbozo
Senior Security Operations Engineer
Sony Interactive Entertainment



[RESULT][VOTE] Release Apache NiFi MiNiFi C++ 0.15.0

2023-09-01 Thread Martin Zink
Apache NiFi Community,

I am pleased to announce that the 0.15.0 release of Apache NiFi MiNiFi C++
passes with
  3 +1 (binding) votes
  3 +1 (non-binding) votes
  0 0 votes
  0 -1 votes

Thanks to all who helped make this release possible.

Here is the PMC vote thread:
https://lists.apache.org/thread/nrlqtnysvrhvkmk7ozdwfh3tklm1mkb0


Re: Inquiry on Key Skills for an Apache NiFi Developer

2023-09-01 Thread Henry Sowell
A lot of it’s going to depend on your organizational needs, use cases, and the 
specific job requirements (e.g., a developer building components such as custom 
processors and extensions, a flow developer, an admin, etc.). 

Here’s a starting point:

Apache NiFi Developer:

NiFi-specific skills:

- Technical proficiency with Apache NiFi's core components and architecture.
- Experience in designing, building, and optimizing data flows using NiFi 
processors.
- Integration skills with various data sources like databases, message queues, 
web services, etc.
- Error handling and management within data flows.
- NiFi performance tuning based on resource utilization and data volume.
- Use of scripting within NiFi, with languages like Python or Groovy.

Core Technical Skills:

- Strong Java development skills, especially for creating custom processors or 
extensions.
- Understanding of data structures, schemas, and data modeling techniques.
- Proficiency with databases, SQL and NoSQL, and writing/optimizing queries.
- Familiarity with integrating APIs and web services (RESTful, SOAP).

Apache NiFi Admin:

NiFi-specific skills:

- Proficiency in setting up and configuring NiFi clusters.
- Skills in managing and monitoring NiFi nodes to ensure high availability.
- Implementing backup strategies and recovery mechanisms for NiFi.
- Security implementation, including SSL, authentication, and authorization.
- Applying regular updates and patches to NiFi instances.
- Monitoring system resources, log files, and optimizing configurations.

Core Technical Skills:

- Deep understanding of system administration, especially on Linux platforms.
- Networking knowledge, including network protocols, firewall configurations, 
and secure communication.
- Security skills, understanding encryption, public key infrastructure (PKI), 
and SSL/TLS configurations.
- Scripting proficiency, using languages like Bash or Python for automation and 
troubleshooting.

Depending on your needs, you’ll want to customize this and treat a lot of it as 
“nice-to-haves.”

Henry

On Sep 1, 2023, at 09:53, Frank Mansilla  wrote:

Dear NiFi Community,

We are currently looking to hire an Apache NiFi developer, and we would
like to gather your insights on the essential skills and knowledge a
candidate should possess to be considered. What specific aspects of NiFi do
you deem crucial in a developer? Your feedback will greatly assist us in
our search.

Thank you for your collaboration.

Sincerely,

Frank Mansilla
Neox
Arg


Inquiry on Key Skills for an Apache NiFi Developer

2023-09-01 Thread Frank Mansilla
Dear NiFi Community,

We are currently looking to hire an Apache NiFi developer, and we would
like to gather your insights on the essential skills and knowledge a
candidate should possess to be considered. What specific aspects of NiFi do
you deem crucial in a developer? Your feedback will greatly assist us in
our search.

Thank you for your collaboration.

Sincerely,

Frank Mansilla
Neox
Arg


Re: [VOTE] Release Apache NiFi MiNiFi C++ 0.15.0

2023-09-01 Thread Pierre Villard
+1 (binding)

Built on u20, tested a couple of simple flows.

Thanks!

On Thu, Aug 31, 2023 at 19:35, Arpad Boda  wrote:

> +1 (binding)
>
> Verified signature, hashes.
> Built on debian and mac.
>
> Executed all tests successfully, verified c2 functionality, designed
> multiple flows, verified those.
>
> Thanks,
> Arpad
>
>
> On Wed, Aug 30, 2023 at 7:11 PM Gábor Gyimesi  wrote:
>
> > +1 (non-binding)
> >
> > Went through the verification process using the helper guide.
> >
> > Compiled all but the JNI extension successfully on Ubuntu 22.04 with
> > GCC 11, ran all unit and integration tests, did not find any issues.
> >
> > Compiled on Windows using MSVC and Ninja using Visual Studio 2019.
> > Used the following command: win_build_vs.bat build /NINJA /P /K /S /A
> > /SFTP /PDH /SPLUNK /GCP /ELASTIC /Z /PR /ENCRYPT_CONFIG /MQTT /OPC
> > /PYTHON_SCRIPTING
> > I had a compilation issue on Windows with the SFTP extension: linking
> > SFTPLoader.cpp.obj failed with unresolved Curl symbols. Seems to be an
> > issue of the static linkage of Curl, which is worth investigating, but
> > I don't think it's a blocking issue. After removing SFTP from the
> > compilation list the project compiled successfully.
> >
> > Ran two flows on both Windows (using the compiled binaries) and Linux
> > (using the provided convenience binaries) successfully:
> > TailFile -> LogAttribute
> > GenerateFlowFile -> UpdateAttribute -> MergeContent -> CompressContent
> > -> PutS3Object
> >
> > Note: Updated the
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=139627733
> > wiki page with the new OpenSSL build requirements on Windows.
> >
> > Thanks,
> > Gábor
> >
> > On Tue, 29 Aug 2023 at 23:05, Marton Szasz  wrote:
> > >
> > > +1 (binding)
> > >
> > > Verified everything according to the release helper guide.
> > >
> > > On linux, bootstrap.sh installs all the required dependencies for
> > > compiling with GCC.
> > > - Ubuntu 22.04 / GCC: works fine
> > > Clang required additional packages: clang libc++-dev libc++abi-dev
> > > - Ubuntu 22.04 / Clang + libc++: didn't compile, but this is not a
> > > showstopper IMO. We can fix it later and prepare the next release a
> > > bit sooner.
> > > - Ubuntu 22.04 / Clang + libstdc++: works fine
> > >
> > > Arch Linux / any compiler: linker issues related to curl. I wouldn't
> > > tank the release for this.
> > >
> > > Windows steps:
> > > 1. Used Visual Studio Community 2019 (VS2022 support is under review,
> > > not yet included)
> > > 2. Installed scoop (in powershell):> irm get.scoop.sh | iex
> > > 3. Installed the latest cmake (for build), python (for scripting
> > > support), sccache (for build caching, like ccache) and wixtoolset (for
> > > installer generation) with scoop:> scoop install cmake python sccache
> > > wixtoolset
> > > 4. Source checked out at C:\a\m (to avoid long path issues)
> > > 5. Built in "x64 Native Tools Command Prompt for VS2019" with the
> > > following command:> win_build_vs.bat ..\bld /64 /P /K /S /A /SFTP /PDH
> > > /SPLUNK /GCP /ELASTIC /Z /PR /ENCRYPT_CONFIG /MQTT /OPC
> > > /PYTHON_SCRIPTING /D /NONFREEUCRT /SCCACHE
> > > 6. Installed the resulting MSI, and copied cwel_config.yml from the
> > > repo, but modified it to send the logs with PutTCP and PutUDP (2
> > > separate tests) to a netcat listening on a linux box. It worked well,
> > > the logs arrived right away on the other box. Also tried the new saved
> > > log file support.
> > >
> > > My reaction to Ferenc's issues:
> > > - I agree that we should make 64bit the default in the future.
> > > - I also ran into the cpack issue in the past, but we have a note
> > > about it in the README, which is good enough for now IMO.
> > > - I prefer not starting the service right after installation, before I
> > > even have the chance to add my flow to config.yml, but C2 users may
> > > have different preferences.
> > >
> > > Thanks,
> > > Márton
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Tue, Aug 29, 2023 at 3:20 PM Ferenc Gerlits 
> > wrote:
> > > >
> > > > +1 (non-binding)
> > > >
> > > > Verified hashes and signature on the source tarball, checked git
> > > > commit hash and tag.
> > > > Built on Windows 10 with 64-bit VS 2019, installed the msi package
> and
> > > > ran a simple CWEL -> LogAttribute flow.
> > > >
> > > > I ran into some issues during the build, but none of them are
> > showstoppers:
> > > >  - the release helper guide should make it clear that
> win_build_vs.bat
> > > > defaults to 32-bit and you have to
> > > >  add /64 to the command line if you want a 64-bit build (should
> we
> > > > make 64-bit the default?);
> > > >  - win_build_vs.bat fails if the build directory path contains
> spaces;
> > > >  - the cpack command in win_build_vs.bat found chocolatey on my
> > > > computer instead of CMake's cpack;
> > > >  - the installer does not start the service (I don't know if it used
> > > > to, but I think it should).
> > > >
> > > > Thank you,
> > > > Ferenc
> >
>


Re: PGVector and Database Driver

2023-09-01 Thread u...@moosheimer.com

I have found the problem.
To have a "clean" installation, I have the following directory structure

/opt/nifi/nifi-1.23.2
/opt/nifi/current (symbolic link to the current version)
/opt/nifi/driver
/opt/nifi/extensions

I defined the ../extensions and ../driver directories separately, so that 
when I upgrade NiFi, I only have to adjust the directories in the config.


Then in the DBCPConnectionPool I specify where the driver is under 
"Database Driver Location(s)" -> "/opt/nifi/driver/postgresql-42.6.0.jar".

This works fine, but PGVector doesn't like it.

If I copy the driver to /opt/nifi/current/lib and leave "Database Driver 
Location(s)" empty, everything works fine.

Also addVectorType(con) is not needed.

Interesting. I don't really understand why, but I accept that some 
things are closed to me :)


Maybe my explanation will help if other developers have similar problems.
And maybe there should be a "don't do" chapter in the documentation, 
pointing out that the JDBC driver should be placed in the ../lib directory.
If someone can tell me why the behavior is like this, I would be happy 
to learn something.


Thanks again for your help.

Regards,
Uwe

On 01.09.23 05:59, Matt Burgess wrote:

Maybe this [1]? Perhaps you have to call unwrap() yourself in this
case. IIRC you don't have access to the DataSource but you can check
it directly on the connection.

Regards,
Matt

[1] 
https://stackoverflow.com/questions/36986653/cast-java-sql-connection-to-pgconnection
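
In case it helps, a minimal sketch of what that unwrap-it-yourself approach 
could look like on the pooled connection (assuming the pool exposes the native 
PostgreSQL connection through the standard JDBC Wrapper interface; whether it 
does is exactly the open question here):

    import java.sql.Connection;
    import java.sql.SQLException;

    import org.postgresql.PGConnection;

    import com.pgvector.PGvector;

    public class VectorTypeRegistration {
        // Sketch only: "con" is the java.sql.Connection handed out by DBCPConnectionPool
        public static void register(final Connection con) throws SQLException {
            if (con.isWrapperFor(PGConnection.class)) {
                // Unwrap to the native PostgreSQL connection and register the "vector" type,
                // which is roughly what PGvector.addVectorType() does under the hood
                final PGConnection pgConnection = con.unwrap(PGConnection.class);
                pgConnection.addDataType("vector", PGvector.class);
            } else {
                throw new SQLException("Cannot unwrap to org.postgresql.PGConnection");
            }
        }
    }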

On Thu, Aug 31, 2023 at 8:15 PM u...@moosheimer.com  wrote:

Mark & Matt,

Thanks for the quick help. I really appreciate it.

PGvector.addVectorType(con) returns the following:
*java.sql.SQLException: Cannot unwrap to org.postgresql.PGConnection*

Could this be a connection pool issue?

Interestingly, I didn't call addVectorType() at all in my test java code
and it still works?!
I'll have to check again ... maybe I'm not seeing it correctly anymore.
It is already 2:05 a.m. here.


Regards,
Uwe



On 31.08.23 18:53, Matt Burgess wrote:

This means the JDBC driver you're using does not support the use of
the two-argument setObject() call when the object is a PGVector. Did
you register the Vector type by calling:

PGvector.addVectorType(conn);

The documentation [1] says that the two-argument setObject() should
work if you have registered the Vector type.

Regards,
Matt

[1]https://github.com/pgvector/pgvector-java
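
For anyone following along, the pattern from that README is roughly the 
following (a sketch assuming a plain PostgreSQL connection; the table and 
column names are made up for the example):

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    import com.pgvector.PGvector;

    public class PgVectorInsertSketch {
        public static void insertEmbedding(final Connection conn, final float[] embedding) throws SQLException {
            // Register the vector type once per connection so the driver can map PGvector
            PGvector.addVectorType(conn);
            try (PreparedStatement stmt = conn.prepareStatement("INSERT INTO items (embedding) VALUES (?)")) {
                // Two-argument setObject() works once the type has been registered above
                stmt.setObject(1, new PGvector(embedding));
                stmt.executeUpdate();
            }
        }
    }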

On Thu, Aug 31, 2023 at 12:01 PM Mark Payne  wrote:

Hey Uwe,

The DBCPConnectionPool returns a java.sql.Connection. From that you’d create a 
Statement. So I’m a little confused when you say that you’ve got it working in 
Pure JDBC but not with NiFi, as the class returned IS pure JDBC. Perhaps you 
can share a code snippet of what you’re doing in the “Pure JDBC” route that is 
working versus what you’re doing in the NiFi processor that’s not working?

Thanks
-Mark



On Aug 31, 2023, at 10:58 AM, u...@moosheimer.com  wrote:

Hi,

I am currently writing a processor to write OpenAI embeddings to Postgres.
I am using DBCPConnectionPool for this.
I use Maven to integrate PGVector (https://github.com/pgvector/pgvector).

With pure JDBC this works fine. With the database classes from NiFi I get the 
error:
*Cannot infer the SQL type to use for an instance of com.pgvector.PGvector. Use 
setObject() with an explicit Types value to specify the type to use.*

I use -> setObject(5, new PGvector(embeddingArray)).
embeddingArray is defined as: float[] embeddingArray

Of course I know why I get the error from NiFi and not from the JDBC driver, 
but unfortunately this knowledge does not help me.

Can anyone tell me what SQLType I need to specify for this?
I have searched the internet and the NiFi sources on GitHub for several hours 
now and have found nothing.

One option would be to use native JDBC and ignore the ConnectionPool. But that 
would be a very bad style in my opinion.
Perhaps there is a better solution?

Any help, especially from Matt B., is appreciated as I'm at a loss.
Thanks guys.
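
For reference, the explicit-Types variant that the error message points at 
would look roughly like the sketch below; Types.OTHER is the usual choice for 
PGobject subclasses such as PGvector, though I have not verified it against 
this exact setup, and the statement and parameter index simply mirror the call 
above:

    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.sql.Types;

    import com.pgvector.PGvector;

    public class ExplicitTypeSketch {
        public static void bindEmbedding(final PreparedStatement stmt, final float[] embeddingArray) throws SQLException {
            // Three-argument setObject() with an explicit SQL type, as the error message suggests
            stmt.setObject(5, new PGvector(embeddingArray), Types.OTHER);
        }
    }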