Re: InvokeHTTP vs invalid SSL certificates

2022-03-04 Thread Jorge Machado
Just import the certificate into the trust store. 


> On 4. Mar 2022, at 13:59, Jean-Sebastien Vachon wrote:
> 
> Hi all,
> 
> What is the best way to deal with invalid SSL certificates when trying to 
> open a URL using InvokeHTTP?
> 
> 
> Thanks
> 
> Jean-Sébastien Vachon
> Co-Founder & Architect
> Brizo Data, Inc.
> www.brizodata.com 
> 


Re: InvokeHTTP vs invalid SSL certificates

2022-03-04 Thread Nathan Gough
If it wasn't already clear, the majority of SSL certificates should be
signed by certificate authorities that are already in Java's default cacerts
truststore. Alternatively, you could use another public truststore
(Firefox's, for example) that contains the root certificate authorities for
most of the websites you would want to reach. Either way, you would not need
to add each website's certificate manually. If the question is only about
websites whose certificates are invalid or self-signed, then you may need to
follow some of the other suggestions.

On Fri, Mar 4, 2022 at 11:06 AM Juan Pablo Gardella <
gardellajuanpa...@gmail.com> wrote:

> You can set up an SSLManager to ignore all errors, but it is not safe.
> Another option may be to put a reverse proxy in the middle, plus some
> manipulation to extract the URL: for example, myproxy?connectoTo=URL, and
> the proxy handles that. These are some quick ideas for a POC.
>


Re: InvokeHTTP vs invalid SSL certificates

2022-03-04 Thread Juan Pablo Gardella
You can set up an SSLManager to ignore all errors, but it is not safe.
Another option may be to put a reverse proxy in the middle, plus some
manipulation to extract the URL: for example, myproxy?connectoTo=URL, and
the proxy handles that. These are some quick ideas for a POC.
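
A rough sketch of the URL-extraction step in the reverse-proxy idea, written in Python. The parameter name connectoTo is copied from the example above as written; the host name myproxy, the function name, and the scheme check are assumptions made for illustration. The sketch only recovers the target URL and does not perform the upstream request or any TLS handling.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical query parameter name, copied from the example in this message.
TARGET_PARAM = "connectoTo"


def extract_target_url(proxy_request_url: str) -> str:
    """Return the target URL carried in the proxy request's query string."""
    query = urlsplit(proxy_request_url).query
    values = parse_qs(query).get(TARGET_PARAM, [])
    if len(values) != 1:
        raise ValueError(f"expected exactly one {TARGET_PARAM} parameter")
    target = values[0]
    # Only forward plain web URLs; any other scheme is rejected.
    if urlsplit(target).scheme not in ("http", "https"):
        raise ValueError(f"unsupported target scheme in {target!r}")
    return target
```

InvokeHTTP would then be pointed at the proxy, with the real target URL percent-encoded in the query string, so the relaxed certificate handling stays confined to the proxy.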



On Fri, Mar 4, 2022 at 12:35 PM Jean-Sebastien Vachon <
jsvac...@brizodata.com> wrote:

> Thanks David for the information.
>
> My main issue is that we are doing massive web scraping (over 400k
> websites and growing) and I cannot just add each certificate manually.
> I can probably automate most of it, but I wanted to see what options were
> available to me.
>
> Thanks again. I will look into this.
>


Re: InvokeHTTP vs invalid SSL certificates

2022-03-04 Thread Jean-Sebastien Vachon
Thanks David for the information.

My main issue is that we are doing massive web scraping (over 400k websites and 
growing) and I cannot just add each certificate manually.
I can probably automate most of it, but I wanted to see what options were 
available to me.

Thanks again. I will look into this.

Jean-Sébastien Vachon
Co-Founder & Architect
Brizo Data, Inc.
www.brizodata.com

From: David Handermann 
Sent: Friday, March 4, 2022 9:16 AM
To: users@nifi.apache.org 
Subject: Re: InvokeHTTP vs invalid SSL certificates

Thanks for raising this question.  The InvokeHTTP processor relies on the 
OkHttp client library, which implements standard TLS handshaking and hostname 
verification as described in their documentation:

https://square.github.io/okhttp/features/https/

There are many things that could make a certificate invalid for a specific 
connection.  If the remote certificate is self-signed, it is possible to 
configure a NiFi SSL Context Service with a trust store that includes the 
self-signed certificate.

If the remote certificate is expired, the remote server must be updated with a 
new certificate.  If the remote certificate does not include a DNS Subject 
Alternative Name (SAN) matching the domain name that InvokeHTTP uses for the 
connection, the best solution is for the remote server to be updated with a new 
certificate containing a matching SAN.

It is possible to configure OkHttp with a custom hostname verifier or trust 
manager that ignores some of these attributes, but this would require custom 
code that overrides the default behavior of InvokeHTTP.  There have been some 
requests in the past for NiFi to implement support for a custom hostname 
verifier, but this approach breaks one of the fundamental aspects of TLS 
communication security.

With that background, the potential solution depends on why InvokeHTTP 
considers the certificate invalid.

Regards,
David Handermann

On Fri, Mar 4, 2022 at 6:59 AM Jean-Sebastien Vachon 
<jsvac...@brizodata.com> wrote:
Hi all,

What is the best way to deal with invalid SSL certificates when trying to open 
a URL using InvokeHTTP?


Thanks

Jean-Sébastien Vachon
Co-Founder & Architect
Brizo Data, Inc.
www.brizodata.com


Re: InvokeHTTP vs invalid SSL certificates

2022-03-04 Thread David Handermann
Thanks for raising this question.  The InvokeHTTP processor relies on the
OkHttp client library, which implements standard TLS handshaking and
hostname verification as described in their documentation:

https://square.github.io/okhttp/features/https/

There are many things that could make a certificate invalid for a specific
connection.  If the remote certificate is self-signed, it is possible to
configure a NiFi SSL Context Service with a trust store that includes the
self-signed certificate.

If the remote certificate is expired, the remote server must be updated
with a new certificate.  If the remote certificate does not include a DNS
Subject Alternative Name (SAN) matching the domain name that InvokeHTTP
uses for the connection, the best solution is for the remote server to be
updated with a new certificate containing a matching SAN.

It is possible to configure OkHttp with a custom hostname verifier or trust
manager that ignores some of these attributes, but this would require
custom code that overrides the default behavior of InvokeHTTP.  There have
been some requests in the past for NiFi to implement support for a custom
hostname verifier, but this approach breaks one of the fundamental aspects
of TLS communication security.

With that background, the potential solution depends on why InvokeHTTP
considers the certificate invalid.
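
To illustrate sorting hosts by failure reason, here is a minimal sketch in Python that could be run outside NiFi to pre-screen a list of hosts. It relies on Python's ssl module and OpenSSL verification codes, not on OkHttp or InvokeHTTP, so the Java side may report the same problems differently. The function names and category labels are assumptions for illustration.

```python
import socket
import ssl

# OpenSSL verification codes mapped to the cases described above.
FAILURE_CATEGORIES = {
    10: "expired",            # X509_V_ERR_CERT_HAS_EXPIRED
    18: "self-signed",        # X509_V_ERR_DEPTH_ZERO_SELF_SIGNED_CERT
    19: "self-signed",        # X509_V_ERR_SELF_SIGNED_CERT_IN_CHAIN
    20: "untrusted-issuer",   # X509_V_ERR_UNABLE_TO_GET_ISSUER_CERT_LOCALLY
    62: "hostname-mismatch",  # X509_V_ERR_HOSTNAME_MISMATCH
}


def categorize(verify_code: int) -> str:
    """Map an OpenSSL verification code to a coarse failure category."""
    return FAILURE_CATEGORIES.get(verify_code, "other")


def diagnose(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Attempt a verified TLS handshake and report why verification failed."""
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return "valid"
    except ssl.SSLCertVerificationError as error:
        return categorize(error.verify_code)
```

Connection errors unrelated to certificate verification are left to propagate, so a caller can tell unreachable hosts apart from hosts with certificate problems.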

Regards,
David Handermann

On Fri, Mar 4, 2022 at 6:59 AM Jean-Sebastien Vachon wrote:

> Hi all,
>
> What is the best way to deal with invalid SSL certificates when trying to
> open a URL using InvokeHTTP?
>
>
> Thanks
>
> Jean-Sébastien Vachon
> Co-Founder & Architect
> Brizo Data, Inc.
> www.brizodata.com
>


InvokeHTTP vs invalid SSL certificates

2022-03-04 Thread Jean-Sebastien Vachon
Hi all,

What is the best way to deal with invalid SSL certificates when trying to open 
a URL using InvokeHTTP?


Thanks

Jean-Sébastien Vachon
Co-Founder & Architect
Brizo Data, Inc.
www.brizodata.com


Re: InvokeHTTP performance - max out at 50 post per sec.

2022-03-04 Thread Jens M. Kofoed
Hi Isha

Thanks for your reply.
I have tried to play with the scheduling, and it doesn't help much, so I
will use concurrent tasks. I had hoped there were other things that
could be tweaked to get it to run faster.

Kind regards
Jens M. Kofoed


On Fri, Mar 4, 2022 at 10:16, Isha Lamboo <
isha.lam...@virtualsciences.nl> wrote:

> Hi Jens,
>
> The behaviour you describe doesn’t seem abnormal if the round-trip time
> for a request is around 20 ms in your test setup. A single InvokeHTTP would
> process requests one by one, and that gives 1000 ms / 20 ms = 50 requests
> per second. Increasing concurrent tasks to send requests in parallel is
> exactly the right solution here, and your test results show as much.
>
> If the round-trip time is much smaller than 20 ms, you should be able to
> tune the scheduling to get more throughput per concurrent task. I would
> test increasing the run duration to let the processor take on more tasks
> without yielding in between. This will impact the rest of your flows if
> you set it too high, though.
>
> Regards,
>
> Isha


RE: InvokeHTTP performance - max out at 50 post per sec.

2022-03-04 Thread Isha Lamboo
Hi Jens,

The behaviour you describe doesn’t seem abnormal if the round-trip time for a 
request is around 20 ms in your test setup. A single InvokeHTTP would process 
requests one by one, and that gives 1000 ms / 20 ms = 50 requests per second. 
Increasing concurrent tasks to send requests in parallel is exactly the right 
solution here, and your test results show as much.
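
To make the arithmetic concrete, here is a small Python sketch of the same estimate. It assumes each concurrent task sends one request at a time and that throughput scales linearly with the number of tasks, which matches the test results reported in this thread; scheduling overhead is ignored. The function names are made up for illustration.

```python
import math


def max_requests_per_second(round_trip_ms: float, concurrent_tasks: int = 1) -> float:
    """Upper bound on throughput when every task waits for each response."""
    if round_trip_ms <= 0 or concurrent_tasks < 1:
        raise ValueError("round_trip_ms must be > 0 and concurrent_tasks >= 1")
    return concurrent_tasks * 1000.0 / round_trip_ms


def tasks_needed(target_per_second: float, round_trip_ms: float) -> int:
    """Smallest number of concurrent tasks that can reach the target rate."""
    return math.ceil(target_per_second * round_trip_ms / 1000.0)
```

With a 20 ms round trip, one task tops out at 50 requests per second and four tasks at 200, so the 100 flowfiles per second generated in the test flow would need at least two concurrent tasks.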

If the round-trip time is much smaller than 20 ms, you should be able to tune 
the scheduling to get more throughput per concurrent task. I would test 
increasing the run duration to let the processor take on more tasks without 
yielding in between. This will impact the rest of your flows if you set it too 
high, though.

Regards,

Isha

From: Jens M. Kofoed
Sent: Friday, March 4, 2022 09:35
To: users@nifi.apache.org
Subject: InvokeHTTP performance - max out at 50 post per sec.

Hi

I have created a small test with InvokeHTTP, and it seems that InvokeHTTP
can only handle 50 flowfiles per second. If I configure InvokeHTTP with
x Concurrent Tasks, the output increases x times.
I created a very small flow:
GenerateFlowFile
  File Size: 100kb
  Batch size: 100
  Run Schedule: 1 sec
->
InvokeHTTP:
  HTTP Method: POST

The receiving host is another NiFi node with
HandleHttpRequest/HandleHttpResponse, and it has no issue handling 4x
InvokeHTTP.

So is there something in InvokeHTTP that can be tweaked to handle more than
50 files per second?

Kind regards
Jens M. Kofoed




InvokeHTTP performance - max out at 50 post per sec.

2022-03-04 Thread Jens M. Kofoed
Hi

I have created a small test with InvokeHTTP, and it seems that InvokeHTTP
can only handle 50 flowfiles per second. If I configure InvokeHTTP with
x Concurrent Tasks, the output increases x times.
I created a very small flow:
GenerateFlowFile
  File Size: 100kb
  Batch size: 100
  Run Schedule: 1 sec
->
InvokeHTTP:
  HTTP Method: POST

The receiving host is another NiFi node with
HandleHttpRequest/HandleHttpResponse, and it has no issue handling 4x
InvokeHTTP.

So is there something in InvokeHTTP that can be tweaked to handle more than
50 files per second?

Kind regards
Jens M. Kofoed