please give me the permission to update the wiki of hive on spark

2017-01-02 Thread Zhang, Liyun
Hi

  I would like to update the 
wiki<https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started>
 of Hive on Spark because of HIVE-8373. My 
Confluence<https://cwiki.apache.org/confluence/signup.action> username is 
kellyzly; please grant me the privilege to update the wiki.


Best Regards
Kelly Zhang/Zhang,Liyun



RE: Welcome Rui Li to Hive PMC

2017-05-25 Thread Zhang, Liyun
Congratulations Rui!!

-Original Message-
From: Chao Sun [mailto:sunc...@apache.org] 
Sent: Thursday, May 25, 2017 12:41 PM
To: d...@hive.apache.org
Cc: user@hive.apache.org
Subject: Re: Welcome Rui Li to Hive PMC

Congratulations Rui!!

On Wed, May 24, 2017 at 9:19 PM, Xuefu Zhang  wrote:

> Hi all,
>
> It's an honor to announce that the Apache Hive PMC has recently voted to 
> invite Rui Li as a new Hive PMC member. Rui is a long-time Hive 
> contributor and committer, and has made significant contributions to 
> Hive, especially in Hive on Spark. Please join me in congratulating him 
> and looking forward to the bigger role that he will play in the Apache Hive project.
>
> Thanks,
> Xuefu
>


RE: Jimmy Xiang now a Hive PMC member

2017-05-25 Thread Zhang, Liyun
Congratulations Jimmy!!

-Original Message-
From: Chao Sun [mailto:sunc...@apache.org] 
Sent: Thursday, May 25, 2017 12:41 PM
To: d...@hive.apache.org
Cc: user@hive.apache.org
Subject: Re: Jimmy Xiang now a Hive PMC member

Congratulations Jimmy!!

On Wed, May 24, 2017 at 9:16 PM, Xuefu Zhang  wrote:

> Hi all,
>
> It's an honor to announce that the Apache Hive PMC has recently voted to 
> invite Jimmy Xiang as a new Hive PMC member. Please join me in 
> congratulating him and looking forward to the bigger role that he will 
> play in the Apache Hive project.
>
> Thanks,
> Xuefu
>


RE: How to measure the execution time of query on Hive on Tez

2017-10-12 Thread Zhang, Liyun
Hi all:
   The picture attached to my last mail may not have been shown, so I describe 
my question again here. I saw the following runtime statistics when running a 
query.

The "Run DAG" time is 318s, but it is not the sum of the DURATION of all 
VERTICES:
(59549+4069+3055+3055+1004+1006+132736+34248+11077+1003+439+140896+35260+8070)/1000=435s
Nor is it the sum of CPU_TIME. There are several indicators: "Run DAG", 
"DURATION", and "CPU_TIME". Which indicator should I use when measuring 
performance? Sometimes I find a significant improvement in the sum of 
CPU_TIME while there is no significant improvement in "Run DAG". Is this 
normal? I would appreciate any feedback!



2017-10-12T16:29:39,262  INFO [main] SessionState: ----------------------------------------
2017-10-12T16:29:39,262  INFO [main] SessionState: OPERATION                       DURATION
2017-10-12T16:29:39,262  INFO [main] SessionState: ----------------------------------------
2017-10-12T16:29:39,263  INFO [main] SessionState: Compile Query                      3.72s
2017-10-12T16:29:39,263  INFO [main] SessionState: Prepare Plan                       0.60s
2017-10-12T16:29:39,263  INFO [main] SessionState: Submit Plan                        0.61s
2017-10-12T16:29:39,263  INFO [main] SessionState: Start DAG                          0.52s
2017-10-12T16:29:39,263  INFO [main] SessionState: Run DAG                          318.54s
2017-10-12T16:29:39,263  INFO [main] SessionState: ----------------------------------------
2017-10-12T16:29:39,263  INFO [main] SessionState:
2017-10-12T16:29:39,289  INFO [cea2258c-aa47-46a1-af5b-39860a6edbb3 main] counters.Limits: Counter limits initialized with parameters:  GROUP_NAME_MAX=256, MAX_GROUPS=500, COUNTER_NAME_MAX=64, MAX_COUNTERS=2000
2017-10-12T16:29:39,294  INFO [main] SessionState: Task Execution Summary
2017-10-12T16:29:39,294  INFO [main] SessionState: ------------------------------------------------------------------------------------
2017-10-12T16:29:39,294  INFO [main] SessionState: VERTICES      DURATION(ms)  CPU_TIME(ms)  GC_TIME(ms)  INPUT_RECORDS  OUTPUT_RECORDS
2017-10-12T16:29:39,294  INFO [main] SessionState: ------------------------------------------------------------------------------------
2017-10-12T16:29:39,298  INFO [main] SessionState: Map 1             59549.00     1,355,520       28,565    550,076,554   1,602,119,842
2017-10-12T16:29:39,300  INFO [main] SessionState: Map 12             4069.00        15,670          522         73,049             732
2017-10-12T16:29:39,300  INFO [main] SessionState: Map 13             3055.00        14,030          567            212             212
2017-10-12T16:29:39,301  INFO [main] SessionState: Map 14             3055.00        13,820          606            212             212
2017-10-12T16:29:39,303  INFO [main] SessionState: Reducer 10         1004.00        13,450          265              4               4
2017-10-12T16:29:39,305  INFO [main] SessionState: Reducer 11         1006.00         4,290           71            216             212
2017-10-12T16:29:39,307  INFO [main] SessionState: Reducer 2        132736.00     2,362,160       83,029    537,120,745     107,740,258
2017-10-12T16:29:39,308  INFO [main] SessionState: Reducer 3         34248.00       643,350       20,661    107,740,470             203
2017-10-12T16:29:39,310  INFO [main] SessionState: Reducer 4         11077.00        77,020        1,496            203              31
2017-10-12T16:29:39,311  INFO [main] SessionState: Reducer 5          1003.00        40,030          824             10              10
2017-10-12T16:29:39,312  INFO [main] SessionState: Reducer 6           439.00           590            0             10               0
2017-10-12T16:29:39,314  INFO [main] SessionState: Reducer 7        140896.00     1,925,760       52,784    537,120,745     107,740,258
2017-10-12T16:29:39,316  INFO [main] SessionState: Reducer 8         35260.00       590,200       22,331    107,740,470              76
2017-10-12T16:29:39,318  INFO [main] SessionState: Reducer 9          8070.00        24,630          249             76               4
2017-10-12T16:29:39,318  INFO [main] SessionState: ------------------------------------------------------------------------------------
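
The arithmetic in the question can be cross-checked with a small script. This is only a sketch: the figures are copied from the Task Execution Summary above, and it rests on two assumptions that the thread does not confirm, namely that vertices may run concurrently (so their DURATION values need not add up to the wall-clock "Run DAG" time) and that CPU_TIME is accumulated over all tasks of a vertex. The names `VERTICES` and `cpu_per_wall_second` are made up for illustration.

```python
# Figures copied from the Task Execution Summary in this mail.
# Each entry maps a vertex name to (DURATION in ms, CPU_TIME in ms).
RUN_DAG_SECONDS = 318.54

VERTICES = {
    "Map 1": (59549, 1355520),
    "Map 12": (4069, 15670),
    "Map 13": (3055, 14030),
    "Map 14": (3055, 13820),
    "Reducer 10": (1004, 13450),
    "Reducer 11": (1006, 4290),
    "Reducer 2": (132736, 2362160),
    "Reducer 3": (34248, 643350),
    "Reducer 4": (11077, 77020),
    "Reducer 5": (1003, 40030),
    "Reducer 6": (439, 590),
    "Reducer 7": (140896, 1925760),
    "Reducer 8": (35260, 590200),
    "Reducer 9": (8070, 24630),
}

# Sum of per-vertex durations, converted from ms to seconds.
total_duration_s = sum(duration for duration, _ in VERTICES.values()) / 1000

# Sum of per-vertex CPU time, converted from ms to seconds.
total_cpu_s = sum(cpu for _, cpu in VERTICES.values()) / 1000

# Rough indicator (assumption): CPU seconds consumed per wall-clock second.
cpu_per_wall_second = total_cpu_s / RUN_DAG_SECONDS

print(f"Run DAG (wall clock):    {RUN_DAG_SECONDS:.2f}s")
print(f"Sum of vertex DURATION:  {total_duration_s:.2f}s")
print(f"Sum of vertex CPU_TIME:  {total_cpu_s:.2f}s")
print(f"CPU seconds per wall-clock second: {cpu_per_wall_second:.1f}")
```

Under those assumptions, "Run DAG" describes end-to-end latency, while the CPU_TIME sum describes the total work done, so one can improve without the other changing much.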

From: Zhang, Liyun [mailto:liyun.zh...@intel.com]
Sent: Thursday, October 12, 2017 4:40 PM
To: d...@hive.apache.org
Subject: How to measure the execution time of query on Hive on Tez

Hi all:
  Does anyone know how to view the detailed execution time of every map/reduce 
task in Hive on Tez?
I took a screenshot of the result:
Run DAG is  324.s . But this 

Anyone knows the problem I found in VectorizedLogicBench.IfExprLongColumnLongColumnBench?

2017-11-16 Thread Zhang, Liyun
Hi all:
I am now using the Hive micro benchmarks (HIVE-10189) to test the performance 
improvement of AVX2 and AVX512.
When I run 
VectorizedLogicBench.IfExprLongColumnLongColumnBench<https://github.com/apache/hive/blob/master/itests/hive-jmh/src/main/java/org/apache/hive/benchmark/vectorization/VectorizedLogicBench.java#L115>,
 I get the following results.
When enabling AVX512:
o.a.h.b.v.VectorizedLogicBench.IfExprLongColumnLongColumnBench.bench
 avgt   10  1621602.652 ± 583775.700  us/op
When enabling AVX2:
o.a.h.b.v.VectorizedLogicBench.IfExprLongColumnLongColumnBench.bench
 avgt   10  1817855.876 ± 49289.868  us/op

You can see that the error for IfExprLongColumnLongColumnBench.bench is very 
large with AVX512: the error is 583775 while the average is 1621602. This 
suggests that the measured values in the test are widely scattered. @Teddy, as 
you are more familiar with the code, do you know why the measurements are so 
scattered? If they are scattered, does this mean the test is not stable?
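
To make the size of the spread concrete, here is a minimal sketch that computes the relative error of each result and checks whether the two reported intervals overlap. It uses only the scores and errors quoted above and assumes that the "±" value is the half-width of an interval around the average score (JMH's default error column is, to my understanding, a confidence-interval half-width). `JmhResult` and `intervals_overlap` are hypothetical helper names, not part of JMH or Hive.

```python
from dataclasses import dataclass


@dataclass
class JmhResult:
    """One benchmark line: average score and its reported +/- error (us/op)."""
    label: str
    score: float
    error: float

    @property
    def relative_error(self) -> float:
        # Error as a fraction of the average score.
        return self.error / self.score

    @property
    def interval(self) -> tuple:
        # Assumption: the +/- value is the half-width of the interval.
        return (self.score - self.error, self.score + self.error)


def intervals_overlap(a: JmhResult, b: JmhResult) -> bool:
    """Return True if the two score intervals share at least one point."""
    a_low, a_high = a.interval
    b_low, b_high = b.interval
    return a_low <= b_high and b_low <= a_high


# Values copied from the two results in this mail.
avx512 = JmhResult("AVX512", 1621602.652, 583775.700)
avx2 = JmhResult("AVX2", 1817855.876, 49289.868)

for result in (avx512, avx2):
    low, high = result.interval
    print(f"{result.label}: relative error {result.relative_error:.1%}, "
          f"interval [{low:.0f}, {high:.0f}] us/op")

print("Intervals overlap:", intervals_overlap(avx512, avx2))
```

If the intervals overlap, as they do for these numbers, the difference between the two averages may not be meaningful on its own; more iterations or forks would be one way to narrow the AVX512 interval before comparing.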



I would appreciate any feedback!
Best Regards
Kelly Zhang/Zhang,Liyun