Re: Spark History UI Error WARN HttpParser: Header is too large >8192

2018-10-10 Thread Sandeep Moré
You can try creating a new rewrite rule that drops that problematic header
if you don't need all those groups.

Best,
Sandeep
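To make the idea concrete, here is a generic, hypothetical sketch of a proxy-side filter that drops named request headers before forwarding. It is plain Python, not Knox rewrite-rule syntax, and the `X-Forwarded-Groups` header name is made up for illustration:

```python
# Hypothetical illustration (not Knox rewrite-rule syntax): drop named
# request headers before forwarding a proxied request upstream.
def strip_headers(headers, drop=("X-Forwarded-Groups",)):
    """Return the header list without the named entries (case-insensitive)."""
    blocked = {name.lower() for name in drop}
    return [(k, v) for k, v in headers if k.lower() not in blocked]

incoming = [("Host", "sparkhistory:18081"),
            ("X-Forwarded-Groups", "g1,g2,g3"),
            ("Accept", "text/html")]
print(strip_headers(incoming))  # the group header is dropped
```

In Knox itself this would be expressed through the gateway's rewrite/filter configuration rather than application code.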

On Wed, Oct 10, 2018 at 7:01 PM Theyaa Matti  wrote:

> Hi David, Kevin,
>  I have reached out to the Spark users community to see if there are
> alternatives or any work going on to fix this issue. One other thing I am
> trying to test out is limiting the request headers sent from Knox to the
> backend services, like the Spark History UI.
>
>  The main reason this issue appears on our end is the large number of
> groups the user belongs to, which pushes the request header beyond 8192
> bytes. Those groups are not needed by the backend services, and I am looking
> to see if there is a way for Knox to exclude them when forwarding the
> request to the backend.
>
> Best,
>
> Theyaa.
>
>
> On Wed, Oct 10, 2018 at 2:26 PM Kevin Risden  wrote:
>
>> Moving this back to the user list since it was sent without the list:
>>
>> Question: Is this error coming from the Spark history server or the Knox
>> server?
>>
>> Answer: It is coming from the Spark History UI. Here is what I see in the
>> log for the Spark History UI:
>>
>> 18/10/10 12:30:29 WARN HttpParser: Header is too large >8192
>>
>> 18/10/10 12:30:29 WARN HttpParser: badMessage: 413 for
>> HttpChannelOverHttp@2756e900{r=1,c=false,a=IDLE,uri=/?doAs=}
>>
>>
>> With the above information there is probably nothing that can be done
>> from Knox. The bad news is that there is no configuration in Spark for this
>> yet. See the following:
>> * https://issues.apache.org/jira/browse/SPARK-15090
>> *
>> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/ui/JettyUtils.scala#L337
>> * You should see something like:
>> * connector.setRequestHeaderSize(requestHeaderSize);
>> * connector.setResponseHeaderSize(responseHeaderSize);
>>
>> Without a configuration to change, the Spark history UI is going to
>> default to the 8192 header size. A related change was done for most
>> projects to make that header size configurable.
>>
>> Kevin Risden
>>
>>
>> On Tue, Oct 9, 2018 at 8:45 PM Theyaa Matti 
>> wrote:
>>
>>> Hi David,
>>>  Thank you for the quick response. I do have that property set
>>> with the recommended value in Knox, but the issue still persists, and only
>>> with the Spark History UI. The issue appears only when I enable SSL for
>>> Knox: if I turn SSL off in Knox the issue disappears, and if I enable it,
>>> it shows up right away.
>>>
>>>  Are there any equivalent properties for the Spark History UI? I
>>> suspect the issue is with the Spark History UI and not with Knox, since
>>> all the other UIs work fine with Knox.
>>>
>>> Best,
>>>
>>> Theyaa.
>>>
>>>
>>>
>>> On Tue, Oct 9, 2018 at 7:11 PM David Villarreal <
>>> dvillarr...@hortonworks.com> wrote:
>>>
 Hi Theyaa,

 Change the size of the gateway.httpserver.requestHeaderBuffer property. I
 think the default is 8192 (8 KB); change it to 16384 and see if that helps.

 For the second problem, "Request is a replay (34)": this message is often
 seen when the clock on one of the servers is off. Make sure you run NTP on
 all servers and that they are all in sync. If everything is in sync, you can
 work around this issue by turning off the krb5 replay cache with the
 following JVM parameter:
 -Dsun.security.krb5.rcache=none

 dav
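If the replay check is firing in the Spark History Server's JVM (an assumption; it could also be another service in the chain), the flag above would typically be passed via spark-env.sh. A sketch:

```shell
# spark-env.sh sketch (assumption: the replay cache being hit is the one in
# the Spark History Server's JVM). SPARK_HISTORY_OPTS adds JVM options to
# the history server daemon.
export SPARK_HISTORY_OPTS="$SPARK_HISTORY_OPTS -Dsun.security.krb5.rcache=none"
```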


 On 10/9/18, 9:01 AM, "Theyaa Matti"  wrote:

 Hi,
    I am getting the error message "WARN HttpParser: Header is too large
 >8192" when trying to access the Spark History UI through Knox. Any ideas,
 please?

 Also, when trying to load the executors page, I get: GSS initiate failed
 [Caused by GSSException: Failure unspecified at GSS-API level
 (Mechanism level: Request is a replay (34))]

 when Knox is requesting executorspage-template.html.

 I appreciate any help here.
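As a rough back-of-the-envelope sketch (with made-up numbers, not Knox's actual header layout), this is why a large group list can push a request header past Jetty's 8192-byte default:

```python
# Illustrative only: estimate how a request header carrying a comma-separated
# group list grows with the number of groups the user belongs to.
def estimated_header_size(num_groups, avg_group_name_len=30, base=500):
    """Base header bytes plus one name-plus-separator per group."""
    return base + num_groups * (avg_group_name_len + 1)

# With ~30-character group names, a few hundred groups already exceed 8192.
print(estimated_header_size(250) > 8192)
```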







Re: WebHDFS performance issue in Knox

2018-10-10 Thread Sandeep Moré
Awesome, I have seen GCM perform terribly in the past.
Great work!
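For reference, a sketch of how this might look in Knox's gateway-site.xml, using the ssl.exclude.ciphers property mentioned elsewhere in this thread. The exact cipher-suite names and whether the value is a plain list or regex patterns are assumptions, so check the Knox docs for your version:

```xml
<!-- gateway-site.xml sketch (assumed value syntax): exclude GCM suites -->
<property>
    <name>ssl.exclude.ciphers</name>
    <value>TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384</value>
</property>
```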

On Wed, Oct 10, 2018 at 3:48 PM Kevin Risden  wrote:

> I tried disabling GCM ciphers based on the following information:
> * https://www.wowza.com/docs/how-to-improve-ssl-performance-with-java-8
> *
> https://stackoverflow.com/questions/25992131/slow-aes-gcm-encryption-and-decryption-with-java-8u20
>
> The results for the read were:
> * knox ssl no GCM - 1,073,741,824  125MB/s   in 8.7s
> * knox ssl - 1,073,741,824 54.3MB/s   in 20s
>
> This is a little more than a 2x speedup. The links above also suggest that
> there should be further performance improvements with JDK 9+.
>
> For the write-side slowdown, I found an issue with how Knox is handling
> the streaming data on writes only. I am looking into fixing this to get the
> write performance for HDFS improved.
>
> Kevin Risden
>
>
> On Wed, Oct 10, 2018 at 1:20 PM David Villarreal <
> dvillarr...@hortonworks.com> wrote:
>
>> I believe curl has an option to choose which cipher to use. You may also
>> be able to force it at the server JVM level using
>> /jre/lib/security/java.security.
>>
>>
>>
>>
>>
>> *From: *Sandeep Moré 
>> *Reply-To: *"user@knox.apache.org" 
>> *Date: *Tuesday, October 9, 2018 at 6:39 PM
>> *To: *"user@knox.apache.org" 
>> *Subject: *Re: WebHDFS performance issue in Knox
>>
>>
>>
>> I think this would be a good test, worth a try. I am not sure how we can
>> force a certain cipher to be used; perhaps some combination of
>> ssl.include.ciphers and ssl.exclude.ciphers.
>>
>>
>>
>> Best,
>>
>> Sandeep
>>
>>
>>
>>
>>
>> On Tue, Oct 9, 2018 at 5:29 PM David Villarreal <
>> dvillarr...@hortonworks.com> wrote:
>>
>> Hi Kevin,
>>
>>
>>
>> In my humble opinion, this has to do with the CPU cost of encryption in
>> general, based on which cipher is being used. Couldn't the same kind of
>> improvements (as the HDFS encryption improvements) be made here, say for
>> the AES cipher suites? If the main bottleneck here is CPU, couldn't you
>> enhance encryption through hardware acceleration and see better
>> performance numbers?
>>
>>
>>
>> https://calomel.org/aesni_ssl_performance.html
>>
>>
>>
>> Try forcing a less secure cipher to be used in your environment.  Do you
>> then see better numbers?
>>
>>
>>
>> dav
>>
>>
>>
>>
>>
>>
>> *From: *Kevin Risden 
>> *Reply-To: *"user@knox.apache.org" 
>> *Date: *Tuesday, October 9, 2018 at 1:05 PM
>> *To: *"user@knox.apache.org" 
>> *Subject: *Re: WebHDFS performance issue in Knox
>>
>>
>>
>> @David - Not sure what you mean since this is SSL/TLS and not related to
>> RPC encryption like the two JIRAs that you linked.
>>
>> @Guang - NP just took some time to sit down and look at it.
>>
>>
>>
>> Some preliminary investigation shows this may be the JDK implementation
>> of TLS/SSL that is slowing down the read path. I need to dig into it
>> further but found a few references showing that Java slowness for TLS/SSL
>> affects Jetty.
>>
>>-
>>https://nbsoftsolutions.com/blog/the-cost-of-tls-in-java-and-solutions
>>-
>>https://nbsoftsolutions.com/blog/dropwizard-1-3-upcoming-tls-improvements
>>- https://webtide.com/conscrypting-native-ssl-for-jetty/
>>
>> Locally testing off a Jetty 9.4 branch (for KNOX-1516), I was able to
>> enable Conscrypt (
>> https://www.eclipse.org/jetty/documentation/9.4.x/configuring-ssl.html#conscrypt).
>> With that I was able to get read performance on par with non-SSL and
>> native WebHDFS. The write side of the equation still has some performance
>> differences that need to be looked at further.
>>
>>
>> Kevin Risden
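For a standalone Jetty 9.4 distribution (an assumption; Knox embeds Jetty, which is configured differently, and Kevin's test was against a Knox branch), the Jetty SSL documentation linked in the message above describes enabling the Conscrypt provider through Jetty's module system. A sketch:

```shell
# Sketch for a standalone Jetty 9.4 install (assumption: module name and
# paths per the Jetty 9.4 docs; Knox's embedded Jetty is configured
# differently). Enables the Conscrypt-backed TLS provider.
cd $JETTY_BASE
java -jar $JETTY_HOME/start.jar --add-to-start=conscrypt
```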
>>
>>
>>
>>
>>
>> On Tue, Oct 9, 2018 at 2:01 PM Guang Yang  wrote:
>>
>> Thanks, Kevin, for conducting this experiment! This is exactly what I saw
>> before. It doesn't look right that the download speed is 10x slower when
>> SSL is enabled.
>>
>>
>>
>> On Tue, Oct 9, 2018 at 10:40 AM David Villarreal <
>> dvillarr...@hortonworks.com> wrote:
>>
>> I bring this up because HDFS encryption saw an increase in performance.
>>
>> https://issues.apache.org/jira/browse/HDFS-6606
>>
>>
>>
>> https://issues.apache.org/jira/browse/HADOOP-10768
>>
>>
>>
>> Maybe Knox can make some enhancements in this area?
>>
>>
>>
>> *From: *David Villarreal 
>> *Date: *Tuesday, October 9, 2018 at 10:34 AM
>> *To: *"user@knox.apache.org" 
>> *Subject: *Re: WebHDFS performance issue in Knox
>>
>>
>>
>> Hi Kevin,
>>
>> Now increase your CPU processing power and show me the numbers.
>>
>>
>>
>> Do we support AES-NI optimization, with the extended CPU instruction set
>> for AES hardware acceleration? That requires a libcrypto.so library that
>> supports hardware acceleration, such as OpenSSL 1.0.1e. (Many OS versions
>> have an older version of the library that does not support AES-NI.)
>>
>>
>>
>>
>>
>>
>> *From: *Kevin Risden 
>> *Reply-To: *"user@knox.apache.org" 
>> *Date: *Tuesday, October 9, 2018 at 10:26 AM
>> *To: *"user@knox.apache.org" 
>> *Subject: *Re: WebHDFS 

Re: WebHDFS performance issue in Knox

2018-10-10 Thread David Villarreal
Interesting. Nice work. A 2x improvement is great!





Re: WebHDFS performance issue in Knox

2018-10-10 Thread Kevin Risden
For reference, I summarized this thread and put the results
here: KNOX-1221

The write-specific performance improvement is here: KNOX-1521

Kevin Risden
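The read numbers reported earlier in the thread (54.3 MB/s with the GCM ciphers enabled, 125 MB/s with them excluded) work out as follows:

```python
# Throughput comparison from the wget timings reported in this thread.
rates_mb_s = {"knox ssl": 54.3, "knox ssl no GCM": 125.0}
speedup = rates_mb_s["knox ssl no GCM"] / rates_mb_s["knox ssl"]
print(round(speedup, 2))  # a little more than 2x
```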




Re: WebHDFS performance issue in Knox

2018-10-10 Thread David Villarreal
I believe curl has an option to choose which cipher to use. You may also be
able to force it at the server JVM level using /jre/lib/security/java.security.



From: Kevin Risden <kris...@apache.org>
Reply-To: "user@knox.apache.org" <user@knox.apache.org>
Date: Tuesday, October 9, 2018 at 10:26 AM
To: "user@knox.apache.org" <user@knox.apache.org>
Subject: Re: WebHDFS performance issue in Knox

Writes look to have performance impact as well:

  *   directly to webhdfs - ~2.6 seconds
  *   knox no ssl - ~29 seconds
  *   knox ssl - ~49.6 seconds
Kevin Risden


On Tue, Oct 9, 2018 at 12:39 PM Kevin Risden <kris...@apache.org> wrote:
If I run two downloads concurrently:

1,073,741,824 46.1MB/s   in 22s
1,073,741,824 51.3MB/s   in 22s

So it isn't a limitation of the Knox gateway's total bandwidth, but somehow a
per-connection limitation.

Kevin Risden


On Tue, Oct 9, 2018 at 12:24 PM Kevin Risden <kris...@apache.org> wrote:
So I was able to reproduce a slowdown with SSL, using a pseudo-distributed
HDFS setup on a single node with Knox running on the same node. This was set
up in VirtualBox on my laptop.

Rough timings with wget for a 1GB random file:

  *   directly to webhdfs - 1,073,741,824  252MB/s   in 3.8s
  *   knox no ssl - 1,073,741,824  264MB/s   in 3.6s
  *   knox ssl - 1,073,741,824 54.3MB/s   in 20s
There is a significant decrease with Knox SSL for some reason.

Kevin Risden


On