To clarify: most SSL certificates are signed by certificate authorities
that are already in Java's default cacerts truststore. Alternatively, you
could use another public truststore (Firefox's, for example) that contains
the root certificate authorities for most of the websites you want to
reach. You do not need to add each website's certificate manually. If the
question is only about websites whose certificates are invalid or
self-signed, then you may need to follow some of the other suggestions.
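
As a quick sanity check, here is a minimal sketch of how you could confirm
what the JVM trusts by default. It only uses the standard JSSE API; the
class name DefaultTrustStoreCheck and the helper defaultTrustManager() are
made up for illustration and are not part of NiFi or OkHttp.

```java
import java.security.KeyStore;
import java.security.cert.X509Certificate;
import javax.net.ssl.TrustManager;
import javax.net.ssl.TrustManagerFactory;
import javax.net.ssl.X509TrustManager;

// Hypothetical helper class, for illustration only.
public class DefaultTrustStoreCheck {

    // Builds the trust manager backed by the JVM's default truststore
    // (normally the cacerts file, unless javax.net.ssl.trustStore is set).
    static X509TrustManager defaultTrustManager() throws Exception {
        TrustManagerFactory factory =
                TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        // Passing a null KeyStore tells the factory to use the default truststore.
        factory.init((KeyStore) null);
        for (TrustManager tm : factory.getTrustManagers()) {
            if (tm instanceof X509TrustManager) {
                return (X509TrustManager) tm;
            }
        }
        throw new IllegalStateException("No X509TrustManager available");
    }

    public static void main(String[] args) throws Exception {
        X509Certificate[] issuers = defaultTrustManager().getAcceptedIssuers();
        System.out.println("Trusted CA certificates: " + issuers.length);
        for (X509Certificate issuer : issuers) {
            // Print each trusted certificate's subject name.
            System.out.println(issuer.getSubjectX500Principal().getName());
        }
    }
}
```

If the CA that signed a site's certificate shows up in that list, the site
should validate without any per-site truststore changes, assuming the
certificate is otherwise valid (not expired, hostname matches).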

On Fri, Mar 4, 2022 at 11:06 AM Juan Pablo Gardella <
[email protected]> wrote:

> You can set up an SSLManager to ignore all errors
> <https://stackoverflow.com/questions/4663147/is-there-a-java-setting-for-disabling-certificate-validation>,
> but it is not safe. Another option may be to put a reverse proxy in the
> middle
> <https://www.digitalocean.com/community/questions/does-haproxy-supports-backend-on-https-for-reverse-proxy>
> plus some manipulation to extract the URL. For example, you call
> myproxy?connectoTo=URL and the proxy handles that. These are some quick
> ideas for a POC.
>
>
>
> On Fri, Mar 4, 2022 at 12:35 PM Jean-Sebastien Vachon <
> [email protected]> wrote:
>
>> Thanks David for the information.
>>
>> My main issue is that we are doing massive web scraping (over 400k
>> websites and growing) and I cannot just add each certificate manually.
>> I can probably automate most of it but I wanted to see what options were
>> available to me.
>>
>> Thanks again. I will look into this.
>>
>>
>> *Jean-Sébastien Vachon *
>> Co-Founder & Architect
>>
>>
>> *Brizo Data, Inc. www.brizodata.com*
>> ------------------------------
>> *From:* David Handermann <[email protected]>
>> *Sent:* Friday, March 4, 2022 9:16 AM
>> *To:* [email protected] <[email protected]>
>> *Subject:* Re: InvokeHTTP vs invalid SSL certificates
>>
>> Thanks for raising this question.  The InvokeHTTP processor relies on the
>> OkHttp client library, which implements standard TLS handshaking and
>> hostname verification as described in their documentation:
>>
>> https://square.github.io/okhttp/features/https/
>>
>> There are many things that could make a certificate invalid for a
>> specific connection.  If the remote certificate is self-signed, it is
>> possible to configure a NiFi SSL Context Service with a trust store that
>> includes the self-signed certificate.
>>
>> If the remote certificate is expired, the remote server must be updated
>> with a new certificate.  If the remote certificate does not include a DNS
>> Subject Alternative Name (SAN) matching the domain name that InvokeHTTP
>> uses for the connection, the best solution is for the remote server to be
>> updated with a new certificate containing a matching SAN.
>>
>> It is possible to configure OkHttp with a custom hostname verifier or
>> trust manager that ignores some of these attributes, but this would require
>> custom code that overrides the default behavior of InvokeHTTP.  There have
>> been some requests in the past for NiFi to implement support for a custom
>> hostname verifier, but this approach breaks one of the fundamental aspects
>> of TLS communication security.
>>
>> With that background, the potential solution depends on why InvokeHTTP
>> considers the certificate invalid.
>>
>> Regards,
>> David Handermann
>>
>> On Fri, Mar 4, 2022 at 6:59 AM Jean-Sebastien Vachon <
>> [email protected]> wrote:
>>
>> Hi all,
>>
>> What is the best way to deal with invalid SSL certificates when trying to
>> open a URL using InvokeHTTP?
>>
>>
>> Thanks
>>
>>
>>
>>
