[jira] [Created] (HIVE-18112) show create for view having special char in where clause is not showing properly

2017-11-20 Thread Naresh P R (JIRA)
Naresh P R created HIVE-18112:
-

 Summary: show create for view having special char in where clause 
is not showing properly
 Key: HIVE-18112
 URL: https://issues.apache.org/jira/browse/HIVE-18112
 Project: Hive
  Issue Type: Bug
Reporter: Naresh P R
Priority: Minor
 Fix For: 2.3.2


e.g., 
CREATE VIEW `v2` AS select `evil_byte1`.`a` from `default`.`EVIL_BYTE1` where 
`evil_byte1`.`a` = 'abcÖdefÖgh';
Output:
==
0: jdbc:hive2://172.26.122.227:1> show create table v2;
+------------------------------------------------------------------------------------------------------------------+
| createtab_stmt                                                                                                   |
+------------------------------------------------------------------------------------------------------------------+
| CREATE VIEW `v2` AS select `evil_byte1`.`a` from `default`.`EVIL_BYTE1` where `evil_byte1`.`a` = 'abc�def�gh'    |
+------------------------------------------------------------------------------------------------------------------+

Only the show create output contains invalid characters; the actual source table 
content is displayed properly in the console.
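An illustrative sketch (not Hive code; the charset names are assumptions) of how replacement characters like the ones above typically arise, namely UTF-8 bytes decoded with the wrong charset somewhere between the metastore and the client:

```python
# Sketch: how U+FFFD replacement characters can appear when UTF-8 bytes
# are decoded with the wrong charset. Illustrative only; which layer is
# at fault in Hive is exactly what this bug is about.
original = "abcÖdefÖgh"
raw = original.encode("utf-8")        # b'abc\xc3\x96def\xc3\x96gh'

# Decoding with a single-byte charset and replacement on error turns each
# undecodable byte into U+FFFD (here, two per 'Ö'):
corrupted = raw.decode("ascii", errors="replace")
print(corrupted)
```

The source table displays fine because the read path decodes the stored bytes consistently; only the DDL round-trip shown above loses the encoding.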



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: Review Request 63972: [HIVE-18037] Migrate Slider LLAP package to YARN Service framework for Hadoop 3.x

2017-11-20 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63972/#review191563
---




llap-server/bin/llapDaemon.sh
Line 116 (original), 116 (patched)


what is this change for?



llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapOptionsProcessor.java
Line 65 (original), 65 (patched)


hmm... is it possible to keep old names as backward compat for scripts? or 
accept both names



llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapSliderUtils.java
Line 47 (original), 46 (patched)


is this still needed?



llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapSliderUtils.java
Lines 176 (patched)


is it a good idea to ignore all exceptions? the old code used to ignore 
UnknownApp.. only



llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapSliderUtils.java
Lines 182 (patched)


should this be configurable? or at least a constant



llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapStatusServiceDriver.java
Line 352 (original)


where did the timeout logic go? I saw the code above that seems to fail 
immediately when the app is not running, but no timeout logic



llap-server/src/main/resources/package.py
Line 184 (original)


hmm... that doesn't do anything anymore? I think at least the scripts would 
still be needed, right?



llap-server/src/main/resources/templates.py
Lines 43 (patched)


how does it know what LLAP_DAEMON_OPTS is, and other stuff like HEAPSIZE? 
it doesn't seem to be mentioned elsewhere in the patch and doesn't seem to 
follow the convention (e.g. component name is LLAP without DAEMON). Just 
checking; it used to have a fancy name like site.global. ...


- Sergey Shelukhin


On Nov. 21, 2017, 1:37 a.m., Gour Saha wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63972/
> ---
> 
> (Updated Nov. 21, 2017, 1:37 a.m.)
> 
> 
> Review request for hive and Sergey Shelukhin.
> 
> 
> Bugs: HIVE-18037
> https://issues.apache.org/jira/browse/HIVE-18037
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> First phase of migration of slider based llap app-package to YARN Services in 
> Hadoop 3.x. There will be follow up changes to migrate status, log links, 
> diagnostics and completely eliminate Slider dependency.
> 
> 
> Diffs
> -
> 
>   bin/ext/llap.sh 0462d26 
>   binary-package-licenses/README ef127e3 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java bd25bc7 
>   jdbc/pom.xml 8710a8b 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/LlapServiceInstance.java
>  30b1810 
>   llap-server/bin/llapDaemon.sh 4945473 
>   llap-server/changes_for_non_slider_install.txt ec20fe1 
>   llap-server/pom.xml 176110d 
>   
> llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapOptionsProcessor.java
>  d01598c 
>   llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapServiceDriver.java 
> 5090be2 
>   llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapSliderUtils.java 
> a0af554 
>   
> llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapStatusServiceDriver.java
>  296a851 
>   llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/QueryInfo.java 
> d2e9396 
>   llap-server/src/main/resources/llap.py 26756ce 
>   llap-server/src/main/resources/package.py 21c34e9 
>   llap-server/src/main/resources/params.py 8972ba1 
>   llap-server/src/main/resources/templates.py 3d747a2 
>   packaging/src/main/assembly/bin.xml 84686ee 
> 
> 
> Diff: https://reviews.apache.org/r/63972/diff/1/
> 
> 
> Testing
> ---
> 
> Package created and successfully deployed in a Hadoop 3.0 cluster, using the 
> cmd-line shell script and programmatically via Java APIs.
> 
> 
> File Attachments
> 
> 
> HIVE-18037.001.patch
>   
> https://reviews.apache.org/media/uploaded/files/2017/11/21/e0844c04-be9b-4334-80b0-bae05e9ed885__HIVE-18037.001.patch
> 
> 
> Thanks,
> 
> Gour Saha
> 
>



[jira] [Created] (HIVE-18111) Fix temp path for Spark DPP sink

2017-11-20 Thread Rui Li (JIRA)
Rui Li created HIVE-18111:
-

 Summary: Fix temp path for Spark DPP sink
 Key: HIVE-18111
 URL: https://issues.apache.org/jira/browse/HIVE-18111
 Project: Hive
  Issue Type: Bug
Reporter: Rui Li
Assignee: Rui Li








Re: Review Request 63864: HIVE-18072 WM - fix various bugs based on cluster testing - part 2

2017-11-20 Thread j . prasanth . j

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63864/#review191558
---


Ship it!




Ship It!

- Prasanth_J


On Nov. 17, 2017, 7:55 p.m., Sergey Shelukhin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63864/
> ---
> 
> (Updated Nov. 17, 2017, 7:55 p.m.)
> 
> 
> Review request for hive and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> see jira
> 
> 
> Diffs
> -
> 
>   
> llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskCommunicator.java
>  a02a414a76 
>   
> llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java
>  9dc521e39e 
>   
> llap-tez/src/test/org/apache/hadoop/hive/llap/tezplugins/TestLlapTaskSchedulerService.java
>  51d2e08393 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/UserPoolMapping.java 
> 851245c154 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/WorkloadManager.java 
> 3990f95334 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestWorkloadManager.java 
> 5ba6639e0c 
> 
> 
> Diff: https://reviews.apache.org/r/63864/diff/4/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>



[jira] [Created] (HIVE-18110) OrcInputFormat.combineOrCreateETLStrategy() doesn't combine if there are deltas

2017-11-20 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-18110:
-

 Summary: OrcInputFormat.combineOrCreateETLStrategy() doesn't 
combine if there are deltas
 Key: HIVE-18110
 URL: https://issues.apache.org/jira/browse/HIVE-18110
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 3.0.0
Reporter: Eugene Koifman


this method has 
{noformat}
if (!deltas.isEmpty() || combinedCtx == null) {
  //why is this checking for deltas.isEmpty() - is this Acid 1.0 remnant?
  return new ETLSplitStrategy(
      context, fs, dir, files, readerTypes, isOriginal, deltas, covered, ugi,
      allowSyntheticFileIds, isDefaultFs);
} else
{noformat}

What is the purpose of checking deltas? (With Acid 2.0, deltas here means 
delete deltas.)
Is this some remnant of Acid 1.0?
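A minimal Python sketch of the branch quoted above (function name and return values are hypothetical; the real logic lives in OrcInputFormat.combineOrCreateETLStrategy()), showing that any non-empty delta list forces a fresh ETL strategy and so prevents combining:

```python
# Hypothetical simplification of the quoted Java branch: a new
# ETLSplitStrategy is returned outright whenever delete deltas exist,
# so adjacent splits are never combined in that case.
def choose_strategy(deltas, combined_ctx):
    if deltas or combined_ctx is None:
        return "new-ETLSplitStrategy"      # never combined
    return "combine-with-existing"

assert choose_strategy(["delete_delta_0000001"], object()) == "new-ETLSplitStrategy"
assert choose_strategy([], object()) == "combine-with-existing"
```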





Review Request 63972: [HIVE-18037] Migrate Slider LLAP package to YARN Service framework for Hadoop 3.x

2017-11-20 Thread Gour Saha

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63972/
---

Review request for hive and Sergey Shelukhin.


Bugs: HIVE-18037
https://issues.apache.org/jira/browse/HIVE-18037


Repository: hive-git


Description
---

First phase of migration of slider based llap app-package to YARN Services in 
Hadoop 3.x. There will be follow up changes to migrate status, log links, 
diagnostics and completely eliminate Slider dependency.


Diffs
-

  bin/ext/llap.sh 0462d26 
  binary-package-licenses/README ef127e3 
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java bd25bc7 
  jdbc/pom.xml 8710a8b 
  
llap-client/src/java/org/apache/hadoop/hive/llap/registry/LlapServiceInstance.java
 30b1810 
  llap-server/bin/llapDaemon.sh 4945473 
  llap-server/changes_for_non_slider_install.txt ec20fe1 
  llap-server/pom.xml 176110d 
  
llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapOptionsProcessor.java 
d01598c 
  llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapServiceDriver.java 
5090be2 
  llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapSliderUtils.java 
a0af554 
  
llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapStatusServiceDriver.java
 296a851 
  llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/QueryInfo.java 
d2e9396 
  llap-server/src/main/resources/llap.py 26756ce 
  llap-server/src/main/resources/package.py 21c34e9 
  llap-server/src/main/resources/params.py 8972ba1 
  llap-server/src/main/resources/templates.py 3d747a2 
  packaging/src/main/assembly/bin.xml 84686ee 


Diff: https://reviews.apache.org/r/63972/diff/1/


Testing
---

Package created and successfully deployed in a Hadoop 3.0 cluster, using the 
cmd-line shell script and programmatically via Java APIs.


File Attachments


HIVE-18037.001.patch
  
https://reviews.apache.org/media/uploaded/files/2017/11/21/e0844c04-be9b-4334-80b0-bae05e9ed885__HIVE-18037.001.patch


Thanks,

Gour Saha



[jira] [Created] (HIVE-18109) Don't reserve pool and default as keywords

2017-11-20 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-18109:
---

 Summary: Don't reserve pool and default as keywords
 Key: HIVE-18109
 URL: https://issues.apache.org/jira/browse/HIVE-18109
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ashutosh Chauhan
Assignee: Sergey Shelukhin


HIVE-17902 broke this





[jira] [Created] (HIVE-18108) in case basic stats are missing; rowcount estimation depends on the select columns size

2017-11-20 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18108:
---

 Summary: in case basic stats are missing; rowcount estimation 
depends on the select columns size
 Key: HIVE-18108
 URL: https://issues.apache.org/jira/browse/HIVE-18108
 Project: Hive
  Issue Type: Sub-task
Reporter: Zoltan Haindrich


In case basic stats are not available (especially rowcount):

{code}
set hive.stats.autogather=false;
create table t (a integer, b string);

insert into t values (1,'asd1');
insert into t values (2,'asd2');
insert into t values (3,'asd3');
insert into t values (4,'asd4');
insert into t values (5,'asd5');

explain select a,count(1) from t group by a;
-- estimated to read 8 rows from table t
explain select b,count(1) from t group by b;
-- estimated: 1 rows
explain select a,b,count(1) from t group by a,b;
-- estimated: 1 rows
{code}

The estimate should not depend on the actually selected column set.
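The pattern above is consistent with a size-based fallback. Here is a sketch of one assumed heuristic (not the actual Hive StatsUtils code; widths and sizes are illustrative), where the row count is derived from total data size divided by the width of only the selected columns, so selecting a wide string column drives the estimate down:

```python
# Assumed heuristic: rows ~= total bytes on disk / width of selected columns.
# The wider the selected columns, the fewer rows are estimated.
def estimate_rows(total_bytes, selected_col_widths):
    row_width = sum(selected_col_widths)
    return max(1, total_bytes // row_width)

INT_W, STRING_W = 4, 100   # illustrative average column widths
total = 40                 # illustrative bytes on disk for 5 tiny rows

print(estimate_rows(total, [INT_W]))            # select a: many rows
print(estimate_rows(total, [STRING_W]))         # select b: 1 row
print(estimate_rows(total, [INT_W, STRING_W]))  # select a, b: 1 row
```

This mirrors the explain output in the report: grouping by the integer column yields a larger estimate than grouping by anything involving the string column.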





[jira] [Created] (HIVE-18107) CBO Multi Table Insert Query with JOIN operator and GROUPING SETS throws SemanticException Invalid table alias or column reference 'GROUPING__ID'

2017-11-20 Thread Sergey Zadoroshnyak (JIRA)
Sergey Zadoroshnyak created HIVE-18107:
--

 Summary: CBO Multi Table Insert Query with JOIN operator and 
GROUPING SETS  throws SemanticException  Invalid table alias or column 
reference 'GROUPING__ID'
 Key: HIVE-18107
 URL: https://issues.apache.org/jira/browse/HIVE-18107
 Project: Hive
  Issue Type: Bug
  Components: CBO
Affects Versions: 2.3.0
Reporter: Sergey Zadoroshnyak
Assignee: Jesus Camacho Rodriguez
 Fix For: 3.0.0


hive 2.3.0

set hive.execution.engine=tez;
set hive.multigroupby.singlereducer=false;
*set hive.cbo.enable=true;*

Multi Table Insert Query. *Template:*

FROM (SELECT * FROM tableA) AS alias_a JOIN (SELECT * FROM tableB) AS  alias_b 
ON (alias_a.column_1 = alias_b.column_1 AND alias_a.column_2 = alias_b.column_2)
  
  INSERT OVERWRITE TABLE tableC PARTITION
(
  partition1='first_fragment'
)
  SELECT 
GROUPING__ID,
alias_a.column4,
alias_a.column5,
alias_a.column6,
alias_a.column7,
  count(1) AS rownum
  WHERE alias_b.column_3 = 1
  GROUP BY 
alias_a.column4,
alias_a.column5,
alias_a.column6,
alias_a.column7
  GROUPING SETS 
( 
(alias_a.column4),
(alias_a.column4,alias_a.column5), 
(alias_a.column4,alias_a.column5,alias_a.column6,alias_a.column7)
)
 
  INSERT OVERWRITE TABLE tableC PARTITION
(
  partition1='second_fragment'
)
  SELECT 
GROUPING__ID,
alias_a.column4,
alias_a.column5,
alias_a.column6,
alias_a.column7,
  count(1) AS rownum
  WHERE alias_b.column_3 = 2
  GROUP BY 
alias_a.column4,
alias_a.column5,
alias_a.column6,
alias_a.column7
  GROUPING SETS 
( 
(alias_a.column4),
(alias_a.column4,alias_a.column5), 
(alias_a.column4,alias_a.column5,alias_a.column6,alias_a.column7)
)
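For context, GROUPING__ID is essentially a bitmask over the GROUP BY columns, with a bit set for each column that is aggregated away in the current grouping set. A minimal sketch (the bit ordering is an assumption, and it has differed across Hive versions):

```python
# Sketch: GROUPING__ID as a bitmask over GROUP BY columns. Bit i is set
# when group_cols[i] is NOT present in the active grouping set.
# (Assumed ordering; Hive's actual bit order has changed between releases.)
def grouping_id(group_cols, grouping_set):
    gid = 0
    for i, col in enumerate(group_cols):
        if col not in grouping_set:
            gid |= 1 << i
    return gid

cols = ["column4", "column5", "column6", "column7"]
print(grouping_id(cols, {"column4"}))   # only column4 kept
print(grouping_id(cols, set(cols)))     # all columns kept -> 0
```

The bug is not in this computation itself but in the CBO path failing to resolve the GROUPING__ID virtual column when the multi-insert query is planned.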

16:39:17,822 ERROR CalcitePlanner:423 - CBO failed, skipping CBO. 
org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:537 Invalid table 
alias or column reference 'GROUPING__ID': (possible column names are:..
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:11600)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:11548)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genSelectLogicalPlan(CalcitePlanner.java:3706)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3999)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1315)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1261)
at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:113)
at 
org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:997)
at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:149)
at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:106)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1069)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:1085)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:364)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:286)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:511)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1316)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1294)
at 
org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:204)
at 
org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:290)
at 
org.apache.hive.service.cli.operation.Operation.run(Operation.java:320)
at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:530)
at 

[jira] [Created] (HIVE-18106) analyze table fails on parquet table

2017-11-20 Thread Rajesh Balamohan (JIRA)
Rajesh Balamohan created HIVE-18106:
---

 Summary: analyze table fails on parquet table
 Key: HIVE-18106
 URL: https://issues.apache.org/jira/browse/HIVE-18106
 Project: Hive
  Issue Type: Bug
Reporter: Rajesh Balamohan



{noformat}
hive> analyze table item compute statistics for columns;

Caused by: org.apache.parquet.io.ParquetDecodingException: Can not read value 
at 0 in block -1 in file hdfs://...
at 
org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:223)
at 
org.apache.parquet.hadoop.ParquetRecordReader.nextKeyValue(ParquetRecordReader.java:212)
at 
org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.(ParquetRecordReaderWrapper.java:98)
at 
org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.(ParquetRecordReaderWrapper.java:60)
at 
org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:87)
at 
org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:415)
... 27 more
Caused by: java.lang.UnsupportedOperationException: 
org.apache.parquet.column.values.dictionary.PlainValuesDictionary$PlainIntegerDictionary
at 
org.apache.parquet.column.Dictionary.decodeToBinary(Dictionary.java:44)
at 
org.apache.hadoop.hive.ql.io.parquet.convert.ETypeConverter$BinaryConverter.setDictionary(ETypeConverter.java:283)
at 
org.apache.parquet.column.impl.ColumnReaderImpl.(ColumnReaderImpl.java:346)
at 
org.apache.parquet.column.impl.ColumnReadStoreImpl.newMemColumnReader(ColumnReadStoreImpl.java:82)
at 
org.apache.parquet.column.impl.ColumnReadStoreImpl.getColumnReader(ColumnReadStoreImpl.java:77)
at 
org.apache.parquet.io.RecordReaderImplementation.(RecordReaderImplementation.java:270)
at 
org.apache.parquet.io.MessageColumnIO$1.visit(MessageColumnIO.java:140)
at 
org.apache.parquet.io.MessageColumnIO$1.visit(MessageColumnIO.java:106)
at 
org.apache.parquet.filter2.compat.FilterCompat$NoOpFilter.accept(FilterCompat.java:154)
at 
org.apache.parquet.io.MessageColumnIO.getRecordReader(MessageColumnIO.java:106)
at 
org.apache.parquet.hadoop.InternalParquetRecordReader.checkRead(InternalParquetRecordReader.java:136)
at 
org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:194)
... 32 more

hive> select count(*) from item;
30
{noformat}






[jira] [Created] (HIVE-18105) Aggregation of an empty set doesn't pass constants to the UDAF

2017-11-20 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18105:
---

 Summary: Aggregation of an empty set doesn't pass constants to the 
UDAF
 Key: HIVE-18105
 URL: https://issues.apache.org/jira/browse/HIVE-18105
 Project: Hive
  Issue Type: Sub-task
Reporter: Zoltan Haindrich


The GroupByOperator's logic for firstRow passes {{null}} for all parameters; see
[here|https://github.com/apache/hive/blob/39d46e8af5a3794f7395060b890f94ddc84516e7/ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java#L1116].

This could break {{compute_stats}} operations, because that UDAF takes a 
constant argument.
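A minimal sketch of the described failure mode (all names here are hypothetical; only the null-for-every-parameter behavior comes from the report):

```python
# Sketch: on an empty input set, a hypothetical group-by's "first row"
# path calls the UDAF once with every parameter null, so even a constant
# argument arrives as None.
def aggregate_empty_set(udaf, num_params):
    return udaf(*([None] * num_params))

def compute_stats_like(col_value, num_bitvectors):
    # Would normally rely on num_bitvectors being a constant integer.
    if num_bitvectors is None:
        raise TypeError("constant argument lost")
    return {}

try:
    aggregate_empty_set(compute_stats_like, 2)
    raised = False
except TypeError:
    raised = True
print(raised)
```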


