[ 
https://issues.apache.org/jira/browse/KNOX-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16652353#comment-16652353
 ] 

Kevin Risden commented on KNOX-1524:
------------------------------------

Some interesting new results with Hive 4.0.0-SNAPSHOT at commit 
d7be4b9f26345439c472969461d3d2c81f7e5057.

HIVE-20621 didn't seem to have an effect on performance (positive or negative).

HIVE-17194 looks like it caused a performance degradation, at least for the 
test case I was running.

Timings for 2 million rows:
 * HDFS native - ~2.2 seconds
 * Hive binary - ~11.0 seconds
 * Hive HTTP - ~19.1 seconds
 * Hive HTTP without HS2 compression - ~13.1 seconds
 * Hive HTTP with Knox - ~24.8 seconds
 * Hive HTTP with Knox without HS2 compression - ~19.3 seconds
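
To make the relative overheads easier to read, here is a minimal sketch that derives slowdown factors and deltas from the timings listed above. The numbers are copied from the list; the helper names (slowdown, delta) are only illustrative and not part of any Hive or Knox tooling.

```python
# Timings in seconds for 2 million rows, copied from the list above.
timings = {
    "HDFS native": 2.2,
    "Hive binary": 11.0,
    "Hive HTTP": 19.1,
    "Hive HTTP without HS2 compression": 13.1,
    "Hive HTTP with Knox": 24.8,
    "Hive HTTP with Knox without HS2 compression": 19.3,
}


def slowdown(label, baseline="HDFS native"):
    """Return how many times slower `label` is than `baseline`."""
    return timings[label] / timings[baseline]


def delta(label, baseline):
    """Return the extra seconds `label` takes compared to `baseline`."""
    return timings[label] - timings[baseline]


# Slowdown of every access path relative to reading straight from HDFS.
for label, seconds in timings.items():
    print(f"{label}: {seconds:.1f}s ({slowdown(label):.1f}x HDFS native)")

# Extra time attributable to Knox, with and without HS2 compression.
knox_with_compression = delta("Hive HTTP with Knox", "Hive HTTP")
knox_without_compression = delta(
    "Hive HTTP with Knox without HS2 compression",
    "Hive HTTP without HS2 compression",
)

# Extra time attributable to HS2 compression, without and with Knox.
compression_direct = delta("Hive HTTP", "Hive HTTP without HS2 compression")
compression_via_knox = delta(
    "Hive HTTP with Knox", "Hive HTTP with Knox without HS2 compression"
)
```

From these numbers, Knox adds roughly 5.7 to 6.2 seconds on top of HiveServer2 HTTP, and HS2 compression costs roughly 5.5 to 6.0 seconds whether or not Knox is in the path.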

I used "--hiveconf hive.server2.thrift.http.compression.enabled=false" to 
disable compression for HiveServer2. 

Disabling compression brings the HiveServer2 HTTP and binary modes closer in 
performance to each other. Knox also enables compression by default, so I am 
curious whether Knox compression is causing the remaining bottleneck. 

> Hive "select *" performance evaluation
> --------------------------------------
>
>                 Key: KNOX-1524
>                 URL: https://issues.apache.org/jira/browse/KNOX-1524
>             Project: Apache Knox
>          Issue Type: Task
>            Reporter: Kevin Risden
>            Assignee: Kevin Risden
>            Priority: Major
>             Fix For: 1.2.0
>
>
> While looking at WebHDFS performance in KNOX-1221, I decided to look a bit 
> more into performance for common use cases. Hive performance is another area 
> that could use some research.
> Use "select * ... limit" to get a comparison of raw return speed from 
> HiveServer2. This should show how fast results can be streamed through 
> HiveServer2 and Knox. Compare the results to "hdfs dfs -text" since this will 
> render the data directly from HDFS. This should give comparisons for the 
> difference in overhead between HDFS, HiveServer2 binary, HiveServer2 HTTP, 
> and HiveServer2 HTTP with Knox.
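
The comparison described above could be scripted along these lines. This is a rough sketch, not the harness used for the numbers in this issue: it assumes beeline is the client for the HiveServer2 cases, and the table name, HDFS path, and JDBC URLs passed in are placeholders to be supplied by whoever runs it.

```python
import subprocess
import time


def time_command(cmd):
    """Run cmd, discard its stdout, and return elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, stdout=subprocess.DEVNULL, check=True)
    return time.perf_counter() - start


def build_commands(table, hdfs_path, limit, jdbc_urls):
    """Build one command per access path.

    jdbc_urls maps a label (for example "Hive binary") to a JDBC URL.
    All arguments are placeholders chosen by the caller.
    """
    # Baseline: render the data directly from HDFS.
    commands = {"HDFS native": ["hdfs", "dfs", "-text", hdfs_path]}
    query = f"select * from {table} limit {limit}"
    for label, url in jdbc_urls.items():
        # Assumption: beeline is used as the HiveServer2 client.
        commands[label] = ["beeline", "-u", url, "-e", query]
    return commands


def run_all(commands):
    """Time every command once and return {label: seconds}."""
    return {label: time_command(cmd) for label, cmd in commands.items()}
```

A caller would pass one URL each for binary mode, HTTP mode, and HTTP mode through Knox, then call run_all on the result. Running each case several times and discarding the first run would give steadier numbers than a single measurement.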



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
