[jira] [Created] (HIVE-22294) ConditionalWork cannot be cast to MapredWork When both skew.join and auto.convert is on.

2019-10-04 Thread Qiang.Kang (Jira)
Qiang.Kang created HIVE-22294:
-

 Summary: ConditionalWork cannot be cast to MapredWork  When both 
skew.join and auto.convert is on.  
 Key: HIVE-22294
 URL: https://issues.apache.org/jira/browse/HIVE-22294
 Project: Hive
  Issue Type: Bug
  Components: Physical Optimizer
Affects Versions: 2.3.4, 3.1.1, 2.3.0
Reporter: Qiang.Kang


Our Hive version is 1.2.1 with several patches merged in (including the patches 
from https://issues.apache.org/jira/browse/HIVE-14557 and 
https://issues.apache.org/jira/browse/HIVE-16155).

 

My SQL query looks like this:

```
set hive.auto.convert.join = true;
set hive.optimize.skewjoin = true;

SELECT a.*
FROM a
JOIN b
  ON a.id = b.id AND a.uid = b.uid
LEFT JOIN c
  ON b.id = c.id AND b.uid = c.uid;
```

The query fails with the following error:

FAILED: ClassCastException org.apache.hadoop.hive.ql.plan.ConditionalWork cannot be cast to org.apache.hadoop.hive.ql.plan.MapredWork

 

The root cause is that a conditional task (*MapJoin*) appears in the task list 
of another conditional task (*SkewJoin*). Here is the code snippet that throws 
the exception:

`org.apache.hadoop.hive.ql.optimizer.physical.MapJoinResolver`:

```java
public Object dispatch(Node nd, Stack<Node> stack, Object... nodeOutputs)
    throws SemanticException {
  Task<? extends Serializable> currTask = (Task<? extends Serializable>) nd;
  // not map reduce task or not conditional task, just skip
  if (currTask.isMapRedTask()) {
    if (currTask instanceof ConditionalTask) {
      // get the list of task
      List<Task<? extends Serializable>> taskList =
          ((ConditionalTask) currTask).getListTasks();
      for (Task<? extends Serializable> tsk : taskList) {
        if (tsk.isMapRedTask()) {
          // ATTENTION: tsk may itself be a ConditionalTask!
          this.processCurrentTask(tsk, ((ConditionalTask) currTask));
        }
      }
    } else {
      this.processCurrentTask(currTask, null);
    }
  }
  return null;
}

private void processCurrentTask(Task<? extends Serializable> currTask,
    ConditionalTask conditionalTask) throws SemanticException {
  // get current mapred work and its local work
  // WRONG: the cast fails when currTask is a ConditionalTask (its work is ConditionalWork)
  MapredWork mapredWork = (MapredWork) currTask.getWork();
  MapredLocalWork localwork = mapredWork.getMapWork().getMapRedLocalWork();
  // ... (rest of the method not shown)
```
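To make the failure mode concrete, here is a minimal, self-contained sketch. It does not use Hive classes: `Work`, `MapredWork`, `ConditionalWork`, `Task`, and `ConditionalTask` below are simplified stand-ins, the task names are made up, and the recursive guard is only one possible way to avoid the bad cast, not a proposed patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified stand-ins for Hive's plan classes; NOT the real Hive API.
class NestedConditionalSketch {
  static class Work {}
  static class MapredWork extends Work {}
  static class ConditionalWork extends Work {}

  static class Task {
    final String name;
    final Work work;
    Task(String name, Work work) { this.name = name; this.work = work; }
    Work getWork() { return work; }
  }

  static class ConditionalTask extends Task {
    final List<Task> listTasks;
    ConditionalTask(String name, Task... tasks) {
      super(name, new ConditionalWork());
      this.listTasks = Arrays.asList(tasks);
    }
    List<Task> getListTasks() { return listTasks; }
  }

  // Mirrors the unguarded cast: fails if the task carries ConditionalWork.
  static MapredWork unguarded(Task task) {
    return (MapredWork) task.getWork();
  }

  // Hypothetical guard: descend into nested conditional tasks and collect
  // only the leaf tasks whose work really is MapredWork.
  static void collectMapredTasks(Task task, List<String> out) {
    if (task instanceof ConditionalTask) {
      for (Task child : ((ConditionalTask) task).getListTasks()) {
        collectMapredTasks(child, out);
      }
    } else if (task.getWork() instanceof MapredWork) {
      out.add(task.name);
    }
  }

  // A skew-join conditional task whose task list contains a map-join
  // conditional task (illustrative names only).
  static ConditionalTask buildSkewJoinPlan() {
    ConditionalTask mapJoin = new ConditionalTask("map-join",
        new Task("map-join-local", new MapredWork()),
        new Task("map-join-backup", new MapredWork()));
    return new ConditionalTask("skew-join",
        mapJoin,
        new Task("skew-join-followup", new MapredWork()));
  }

  public static void main(String[] args) {
    ConditionalTask plan = buildSkewJoinPlan();
    try {
      unguarded(plan.getListTasks().get(0)); // nested ConditionalTask
    } catch (ClassCastException e) {
      System.out.println("unguarded cast failed: " + e.getMessage());
    }
    List<String> names = new ArrayList<>();
    collectMapredTasks(plan, names);
    System.out.println("map-reduce leaf tasks: " + names);
  }
}
```

The point of the sketch is only that a one-level loop over `getListTasks()` is not enough once conditional tasks can nest; whether recursion or a simple `instanceof` skip is the right behaviour for the resolver is a separate question.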

 

Here are some details of the query plan under each setting:

*-  set hive.auto.convert.join = true; set hive.optimize.skewjoin=false;*

```
Stage-1 is a root stage [a join b]
 Stage-12 [map join] depends on stages: Stage-1 , consists of Stage-13, Stage-2
 Stage-13 has a backup stage: Stage-2
 Stage-11 depends on stages: Stage-13
 Stage-8 depends on stages: Stage-2, Stage-11 , consists of Stage-5, Stage-4, Stage-6
 Stage-5
 Stage-0 depends on stages: Stage-5, Stage-4, Stage-7
 Stage-14 depends on stages: Stage-0
 Stage-3 depends on stages: Stage-14
 Stage-4
 Stage-6
 Stage-7 depends on stages: Stage-6
 Stage-2
```

*-  set hive.auto.convert.join = false; set hive.optimize.skewjoin=true;*

```
STAGE DEPENDENCIES:
 Stage-1 is a root stage
 Stage-12 depends on stages: Stage-1 , consists of Stage-13, Stage-2
 Stage-13 [skew Join map local task]
 Stage-11 depends on stages: Stage-13
 Stage-2 depends on stages: Stage-11
 Stage-8 depends on stages: Stage-2 , consists of Stage-5, Stage-4, Stage-6
 Stage-5
 Stage-0 depends on stages: Stage-5, Stage-4, Stage-7
 Stage-14 depends on stages: Stage-0
 Stage-3 depends on stages: Stage-14
 Stage-4
 Stage-6
 Stage-7 depends on stages: Stage-6
```


--
This message was sent by Atlassian Jira
(v8.3.4#803005)


ApacheCon Europe 2019 talks which are relevant to Apache Hive

2019-10-04 Thread myrle

Dear Apache Hive committers,

In a little over two weeks' time, ApacheCon Europe is taking place in 
Berlin. Join us from October 22 to 24 for an exciting program and a lovely 
get-together of the Apache community.


We are also planning a hackathon.  If your project is interested in 
participating, please enter Hive here: 
https://cwiki.apache.org/confluence/display/COMDEV/Hackathon


The following talks should be especially relevant for you:

 * https://aceu19.apachecon.com/session/apache-hivemall-meets-pyspark-scalable-machine-learning-hive-spark-and-python
 * https://aceu19.apachecon.com/session/open-source-big-data-tools-accelerating-physics-research-cern
 * https://aceu19.apachecon.com/session/apache-beam-running-big-data-pipelines-python-and-go-spark
 * https://aceu19.apachecon.com/session/patterns-and-anti-patterns-running-apache-bigdata-projects-kubernetes
 * https://aceu19.apachecon.com/session/ui-dev-big-data-world-using-open-source
 * https://aceu19.apachecon.com/session/maintaining-java-library-light-new-java-release-train

Furthermore there will be a whole conference track on community topics: 
Learn how to motivate users to contribute patches, how the board of 
directors works, how to navigate the Incubator and much more: ApacheCon 
Europe 2019 Community track 


Tickets are available here  – 
for Apache Committers we offer discounted tickets.  Prices will be going 
up on October 7th, so book soon.


Please also help spread the word and make ApacheCon Europe 2019 a success!

We’re looking forward to welcoming you at #ACEU19!

Best,

Your ApacheCon team



[jira] [Created] (HIVE-22293) Additional test cases

2019-10-04 Thread Sumin Byeon (Jira)
Sumin Byeon created HIVE-22293:
--

 Summary: Additional test cases
 Key: HIVE-22293
 URL: https://issues.apache.org/jira/browse/HIVE-22293
 Project: Hive
  Issue Type: Improvement
Reporter: Sumin Byeon


I've noticed that some classes, such as {{DecimalColumnVector}}, 
{{VoidColumnVector}}, and {{TimestampColumnVector}}, have no test coverage 
or only insufficient coverage. Would it be okay if I write some test code and 
submit patches or pull requests?





Re: Review Request 71561: HIVE-22250

2019-10-04 Thread Krisztian Kasa

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71561/
---

(Updated Oct. 4, 2019, 12:44 p.m.)


Review request for hive, Jesús Camacho Rodríguez, Zoltan Haindrich, and Vineet 
Garg.


Bugs: HIVE-22250
https://issues.apache.org/jira/browse/HIVE-22250


Repository: hive-git


Description
---

Describe function does not provide description for rank functions
=
The `DESCRIBE FUNCTION` command gets the description of a function from the 
`value` field of its `@Description` annotation. If a UDF is annotated with 
`@WindowFunctionDescription` instead, Hive prints 
```
There is no documentation for function 
```
even if the description is present in the `@WindowFunctionDescription` 
annotation.

This patch moves the `@WindowFunctionDescription.Description` field to a 
separate annotation and provides the `@Description` annotation if both 
annotations are missing.
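The lookup problem described above can be reproduced with plain Java reflection. The sketch below is only an illustration under stated assumptions: the two annotation types are simplified stand-ins declared locally (not Hive's real ones), the UDF classes and description strings are made up, and `describeWithFallback` shows one conceivable way to find a nested description. It is not what this patch does, since the patch moves the field to a separate annotation.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Local stand-ins for the annotations discussed above; NOT Hive's classes.
class DescribeFunctionSketch {
  @Retention(RetentionPolicy.RUNTIME)
  @interface Description {
    String name() default "";
    String value() default "";
  }

  @Retention(RetentionPolicy.RUNTIME)
  @interface WindowFunctionDescription {
    Description description();
  }

  // Hypothetical UDFs with illustrative description text.
  @Description(name = "plain_fn", value = "_FUNC_(x) - illustrative plain UDF")
  static class PlainUdf {}

  @WindowFunctionDescription(
      description = @Description(name = "window_fn",
                                 value = "_FUNC_(x) - illustrative window UDF"))
  static class WindowUdf {}

  static final String MISSING = "There is no documentation for function";

  // Reads only the top-level @Description, so a description nested inside
  // @WindowFunctionDescription is never seen.
  static String describeDirectOnly(Class<?> udf) {
    Description d = udf.getAnnotation(Description.class);
    return d != null ? d.value() : MISSING;
  }

  // Assumed alternative: fall back to the nested description when present.
  static String describeWithFallback(Class<?> udf) {
    Description d = udf.getAnnotation(Description.class);
    if (d == null) {
      WindowFunctionDescription w =
          udf.getAnnotation(WindowFunctionDescription.class);
      if (w != null) {
        d = w.description();
      }
    }
    return d != null ? d.value() : MISSING;
  }

  public static void main(String[] args) {
    System.out.println(describeDirectOnly(PlainUdf.class));
    System.out.println(describeDirectOnly(WindowUdf.class));
    System.out.println(describeWithFallback(WindowUdf.class));
  }
}
```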


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/Registry.java fc2a0e1970 
  ql/src/java/org/apache/hadoop/hive/ql/exec/WindowFunctionDescription.java 511d9641c3 
  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToBoolean.java a7f4bf1fcc 
  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToByte.java 8f4ec3b1ef 
  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToDouble.java 7a0145243d 
  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToFloat.java 451b45fbbc 
  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToInteger.java 4fe9c323cc 
  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToLong.java b31eeb08a0 
  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToShort.java 315789c1c1 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFBloomFilter.java a8bcc972bb 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCumeDist.java 70541fe565 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFDenseRank.java 30bfd2bb8c 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFFirstValue.java b8b7d8e6da 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFLag.java e0edbb42af 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFLastValue.java dadec3b793 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFLead.java e678278b8b 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFNTile.java 8b2812d5bc 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentRank.java 1a7c94431b 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentileCont.java e7e4fda6ea 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentileDisc.java d7c295cb11 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFRank.java a28def73a1 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFRowNumber.java 41a3e582ec 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAssertTrueOOM.java c5c73835af 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBucketNumber.java 472cc85047 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFEpochMilli.java d8e822ae97 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFInBloomFilter.java 733fe63e80 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSurrogateKey.java 1372b60724 
  ql/src/test/queries/clientpositive/desc_function.q d055d9ca03 
  ql/src/test/results/clientpositive/desc_function.q.out 1f804bba60 
  ql/src/test/results/clientpositive/udaf_percentile_cont.q.out a2dae4a06e 
  ql/src/test/results/clientpositive/udaf_percentile_disc.q.out 3ef0cf9874 
  ql/src/test/results/clientpositive/udf_bigint.q.out 5a7430e120 
  ql/src/test/results/clientpositive/udf_boolean.q.out 8d66d5c23d 
  ql/src/test/results/clientpositive/udf_double.q.out f34efcae8d 
  ql/src/test/results/clientpositive/udf_float.q.out d15132928d 
  ql/src/test/results/clientpositive/udf_int.q.out c954e58dcf 
  ql/src/test/results/clientpositive/udf_smallint.q.out 01d468215a 
  ql/src/test/results/clientpositive/udf_tinyint.q.out 50373c7783 


Diff: https://reviews.apache.org/r/71561/diff/4/

Changes: https://reviews.apache.org/r/71561/diff/3-4/


Testing
---

Added test cases to `desc_function.q`:
```
DESCRIBE FUNCTION dense_rank;
DESCRIBE FUNCTION EXTENDED dense_rank;
```


Thanks,

Krisztian Kasa