Re: Review Request 50982: HIVE-14345:Beeline result table has erroneous characters

2016-08-12 Thread Miklos Csanady

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50982/
---

(Updated Aug. 13, 2016, 5:34 a.m.)


Review request for hive, Peter Vary and Sergio Pena.


Bugs: HIVE-14345
https://issues.apache.org/jira/browse/HIVE-14345


Repository: hive-git


Description
---

Fixed output table formatting header and footer lines.


Diffs (updated)
-

  beeline/src/java/org/apache/hive/beeline/TableOutputFormat.java 2753568 
  beeline/src/test/org/apache/hive/beeline/TestTableOutputFormat.java 
PRE-CREATION 

Diff: https://reviews.apache.org/r/50982/diff/


Testing
---

See the attached unit test class.
After building with the patch, the bug is eliminated.


Thanks,

Miklos Csanady



[jira] [Created] (HIVE-14535) add micromanaged tables to Hive (metastore keeps track of the files)

2016-08-12 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-14535:
---

 Summary: add micromanaged tables to Hive (metastore keeps track of 
the files)
 Key: HIVE-14535
 URL: https://issues.apache.org/jira/browse/HIVE-14535
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin


Design doc: 
https://docs.google.com/document/d/1b3t1RywfyRb73-cdvkEzJUyOiekWwkMHdiQ-42zCllY

Feel free to comment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-14534) modify tables in tests in HIVE-14479 to use transactional_properties=default

2016-08-12 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-14534:
-

 Summary: modify tables in tests in HIVE-14479 to use 
transactional_properties=default
 Key: HIVE-14534
 URL: https://issues.apache.org/jira/browse/HIVE-14534
 Project: Hive
  Issue Type: Improvement
  Components: Transactions
Affects Versions: 2.2.0
Reporter: Eugene Koifman
Priority: Minor


only need to do this for 2.2





Re: Review Request 50982: HIVE-14345:Beeline result table has erroneous characters

2016-08-12 Thread Peter Vary

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50982/#review145674
---



One little nit. Otherwise LGTM


beeline/src/test/org/apache/hive/beeline/TestTableOutputFormat.java (lines 50 - 
51)


nit: should be one line


- Peter Vary


On Aug. 12, 2016, 8:36 p.m., Miklos Csanady wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/50982/
> ---
> 
> (Updated Aug. 12, 2016, 8:36 p.m.)
> 
> 
> Review request for hive, Peter Vary and Sergio Pena.
> 
> 
> Bugs: HIVE-14345
> https://issues.apache.org/jira/browse/HIVE-14345
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Fixed output table formatting header and footer lines.
> 
> 
> Diffs
> -
> 
>   beeline/src/java/org/apache/hive/beeline/TableOutputFormat.java 2753568 
>   beeline/src/test/org/apache/hive/beeline/TestTableOutputFormat.java 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/50982/diff/
> 
> 
> Testing
> ---
> 
> See the attached unit test class.
> After building with the patch, the bug is eliminated.
> 
> 
> Thanks,
> 
> Miklos Csanady
> 
>



[jira] [Created] (HIVE-14533) improve performance of enforceMaxLength in HiveCharWritable/HiveVarcharWritable

2016-08-12 Thread Thomas Friedrich (JIRA)
Thomas Friedrich created HIVE-14533:
---

 Summary: improve performance of enforceMaxLength in 
HiveCharWritable/HiveVarcharWritable
 Key: HIVE-14533
 URL: https://issues.apache.org/jira/browse/HIVE-14533
 Project: Hive
  Issue Type: Improvement
  Components: Serializers/Deserializers
Affects Versions: 2.1.0, 1.2.1
Reporter: Thomas Friedrich
Assignee: Thomas Friedrich
Priority: Minor


The enforceMaxLength method in HiveVarcharWritable calls 
set(getHiveVarchar(), maxLength); and in HiveCharWritable set(getHiveChar(), 
maxLength); no matter how long the string is. The calls to getHiveVarchar() and 
getHiveChar() decode the string every time the method is called 
(Text.toString() calls Text.decode). This can be very expensive and is 
unnecessary if the string is shorter than maxLength for HiveVarcharWritable or 
different than maxLength for HiveCharWritable.
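
A minimal sketch of the shortcut described above, written against plain JDK types so that it runs without Hadoop on the classpath. The class and method names (MaxLengthSketch, enforceMaxLength) and the decode counter are hypothetical and only illustrate the varchar case; this is not the Hive implementation. It relies on the fact that n bytes of UTF-8 never decode to more than n characters, so a value whose byte length is already at most maxLength can skip decoding entirely.

```java
import java.nio.charset.StandardCharsets;

public class MaxLengthSketch {
    // Counts how often the expensive decode path runs (illustration only).
    static int decodeCount = 0;

    // Hypothetical stand-in for the varchar case: truncate to maxLength code points.
    static byte[] enforceMaxLength(byte[] utf8, int maxLength) {
        // Fast path: n UTF-8 bytes hold at most n code points,
        // so a short enough byte array cannot exceed maxLength.
        if (utf8.length <= maxLength) {
            return utf8;
        }
        // Slow path: decode, then truncate on a code point boundary if needed.
        decodeCount++;
        String s = new String(utf8, StandardCharsets.UTF_8);
        if (s.codePointCount(0, s.length()) <= maxLength) {
            return utf8;
        }
        int end = s.offsetByCodePoints(0, maxLength);
        return s.substring(0, end).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] shortValue = "hive".getBytes(StandardCharsets.UTF_8);
        byte[] longValue = "hive varchar value".getBytes(StandardCharsets.UTF_8);
        System.out.println(new String(enforceMaxLength(shortValue, 10), StandardCharsets.UTF_8));
        System.out.println(new String(enforceMaxLength(longValue, 4), StandardCharsets.UTF_8));
        System.out.println("decodes: " + decodeCount);
    }
}
```

Only the second call in main reaches the decode path; the first returns after comparing two integers.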





[jira] [Created] (HIVE-14532) Enable qtests from IDE

2016-08-12 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-14532:
---

 Summary: Enable qtests from IDE
 Key: HIVE-14532
 URL: https://issues.apache.org/jira/browse/HIVE-14532
 Project: Hive
  Issue Type: Sub-task
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich
Priority: Minor


With HIVE-1 applied, I've played around with executing qtests from 
Eclipse. After the patch seemed OK, I checked it with:

{code}
git clean -dfx
mvn package install eclipse:eclipse -Pitests -DskipTests
mvn -q test -Pitests -Dtest=TestNegativeCliDriver -Dqfile=combine2.q
{code}
I think the last step is not required, but I bootstrapped and checked my 
project integrity this way.

After this I was able to execute {{TestCliDriver}} from Eclipse using 
{{-Dqfile=combine.q}}. Other qfiles may or may not work, but they have at 
least some chance of being usable.

To my great surprise, {{alter_concatenate_indexed_table.q}} also 
passed. It contains relative file references, and I suspected that it would 
have issues with that.

Note: I have the DataNucleus plugin installed, and I use it when I need to.






[jira] [Created] (HIVE-14531) Support deep nested struct for INSERT OVER DIRECTORY

2016-08-12 Thread Chao Sun (JIRA)
Chao Sun created HIVE-14531:
---

 Summary: Support deep nested struct for INSERT OVER DIRECTORY
 Key: HIVE-14531
 URL: https://issues.apache.org/jira/browse/HIVE-14531
 Project: Hive
  Issue Type: Bug
  Components: Query Processor, Serializers/Deserializers
Affects Versions: 2.2.0
Reporter: Chao Sun


Currently if we do something similar to:
{code}
INSERT OVERWRITE DIRECTORY  SELECT * FROM 

{code}

Then Hive may fail with an error message like this:
{code}
Error: Error while compiling statement: FAILED: SemanticException 
org.apache.hadoop.hive.serde2.SerDeException: Number of levels of nesting 
supported for LazySimpleSerde is 7 Unable to work with level 8. Use 
hive.serialization.extend.nesting.levels serde property for tables using 
LazySimpleSerde. (state=42000,code=4)
{code}

It seems there's no way to set serde properties in this case. 





Re: Review Request 50982: HIVE-14345:Beeline result table has erroneous characters

2016-08-12 Thread Miklos Csanady

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50982/
---

(Updated Aug. 12, 2016, 8:36 p.m.)


Review request for hive, Peter Vary and Sergio Pena.


Changes
---

Removed author tag. Hopefully all others are fixed.


Bugs: HIVE-14345
https://issues.apache.org/jira/browse/HIVE-14345


Repository: hive-git


Description
---

Fixed output table formatting header and footer lines.


Diffs (updated)
-

  beeline/src/java/org/apache/hive/beeline/TableOutputFormat.java 2753568 
  beeline/src/test/org/apache/hive/beeline/TestTableOutputFormat.java 
PRE-CREATION 

Diff: https://reviews.apache.org/r/50982/diff/


Testing
---

See the attached unit test class.
After building with the patch, the bug is eliminated.


Thanks,

Miklos Csanady



Re: Review Request 50982: HIVE-14345:Beeline result table has erroneous characters

2016-08-12 Thread Miklos Csanady


> On Aug. 12, 2016, 4:54 p.m., Vihang Karajgaonkar wrote:
> > beeline/src/test/org/apache/hive/beeline/TestTableOutputFormat.java, line 
> > 100
> > 
> >
> > Is it possible to avoid hardcoding this string? If in the future 
> > beelineOpts.maxColumnWidth default values are changed, this test might 
> > fail. I think it will be more robust to determine this string 
> > programmatically using the values in mockResultData and value of 
> > maxColumnWidth.

I have switched to shorter strings in the test, so it is now independent of 
maxColumnWidth.


- Miklos


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50982/#review145631
---


On Aug. 12, 2016, 8:16 p.m., Miklos Csanady wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/50982/
> ---
> 
> (Updated Aug. 12, 2016, 8:16 p.m.)
> 
> 
> Review request for hive, Peter Vary and Sergio Pena.
> 
> 
> Bugs: HIVE-14345
> https://issues.apache.org/jira/browse/HIVE-14345
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Fixed output table formatting header and footer lines.
> 
> 
> Diffs
> -
> 
>   beeline/src/java/org/apache/hive/beeline/TableOutputFormat.java 2753568 
>   beeline/src/test/org/apache/hive/beeline/TestTableOutputFormat.java 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/50982/diff/
> 
> 
> Testing
> ---
> 
> See the attached unit test class.
> After building with the patch, the bug is eliminated.
> 
> 
> Thanks,
> 
> Miklos Csanady
> 
>



Re: Review Request 50982: HIVE-14345:Beeline result table has erroneous characters

2016-08-12 Thread Miklos Csanady

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50982/
---

(Updated Aug. 12, 2016, 8:15 p.m.)


Review request for hive, Peter Vary and Sergio Pena.


Changes
---

New patch file is uploaded.


Bugs: HIVE-14345
https://issues.apache.org/jira/browse/HIVE-14345


Repository: hive-git


Description
---

Fixed output table formatting header and footer lines.


Diffs (updated)
-

  beeline/src/java/org/apache/hive/beeline/TableOutputFormat.java 2753568 
  beeline/src/test/org/apache/hive/beeline/TestTableOutputFormat.java 
PRE-CREATION 

Diff: https://reviews.apache.org/r/50982/diff/


Testing
---

See the attached unit test class.
After building with the patch, the bug is eliminated.


File Attachments


corrected version
  
https://reviews.apache.org/media/uploaded/files/2016/08/12/aed736a0-95f1-4f5b-9bda-4db767911065__HIVE-14345.patch


Thanks,

Miklos Csanady



Re: Review Request 50982: HIVE-14345:Beeline result table has erroneous characters

2016-08-12 Thread Miklos Csanady

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50982/
---

(Updated Aug. 12, 2016, 8:16 p.m.)


Review request for hive, Peter Vary and Sergio Pena.


Bugs: HIVE-14345
https://issues.apache.org/jira/browse/HIVE-14345


Repository: hive-git


Description
---

Fixed output table formatting header and footer lines.


Diffs
-

  beeline/src/java/org/apache/hive/beeline/TableOutputFormat.java 2753568 
  beeline/src/test/org/apache/hive/beeline/TestTableOutputFormat.java 
PRE-CREATION 

Diff: https://reviews.apache.org/r/50982/diff/


Testing
---

See the attached unit test class.
After building with the patch, the bug is eliminated.


Thanks,

Miklos Csanady



Re: Review Request 51006: CBO: Return path - Fix for converting GroupBy operator with no map side group by

2016-08-12 Thread Vineet Garg

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51006/
---

(Updated Aug. 12, 2016, 6:18 p.m.)


Review request for hive and Ashutosh Chauhan.


Bugs: HIVE-14396
https://issues.apache.org/jira/browse/HIVE-14396


Repository: hive-git


Description
---

This patch fixes the following issues:
1. UDAF info collection was looking for wrong expression for a UDAF's parameter 
from HiveAggregate.
2. Converting HiveAggregate to GroupBy operator was creating wrong expressions 
for UDAF's arguments based on underlying Reduce operator.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveCalciteUtil.java 
774fc59 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/HiveGBOpConvUtil.java
 25fe059 
  ql/src/test/queries/clientpositive/count.q 41ffaf2 
  ql/src/test/queries/clientpositive/groupby_ppr_multi_distinct.q 74bd2fd 
  ql/src/test/results/clientpositive/count.q.out c950c5b 
  ql/src/test/results/clientpositive/groupby_ppr_multi_distinct.q.out 33d1ed0 
  ql/src/test/results/clientpositive/spark/count.q.out b1ad662 
  ql/src/test/results/clientpositive/spark/groupby_ppr_multi_distinct.q.out 
5251241 
  ql/src/test/results/clientpositive/tez/count.q.out 9fc2c75 

Diff: https://reviews.apache.org/r/51006/diff/


Testing
---

Added new tests and Pre-commit testing


Thanks,

Vineet Garg



Review Request 51046: Support explain analyze in Hive

2016-08-12 Thread pengcheng xiong

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51046/
---

Review request for hive and Ashutosh Chauhan.


Repository: hive-git


Description
---

HIVE-14362


Diffs
-

  common/src/java/org/apache/hadoop/hive/common/StatsSetupConst.java 559fffc 
  ql/src/java/org/apache/hadoop/hive/ql/Context.java 3785b1e 
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java 183ed82 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java a183b9b 
  ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java 43231af 
  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java a59b781 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExplainTask.java ad48091 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java b0c3d3f 
  ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java 47b5793 
  ql/src/java/org/apache/hadoop/hive/ql/exec/JoinOperator.java 08cc4b4 
  ql/src/java/org/apache/hadoop/hive/ql/exec/LimitOperator.java 9676d70 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ListSinkOperator.java 9bf363c 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 546919b 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java eaf4792 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java ba71a1e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java e1f7bd9 
  ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java 6afe957 
  ql/src/java/org/apache/hadoop/hive/ql/exec/UDTFOperator.java a75b52a 
  ql/src/java/org/apache/hadoop/hive/ql/hooks/ATSHook.java 742edc8 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java 5ee54b9 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/AnnotateRunTimeStatsOptimizer.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/PhysicalOptimizer.java 
49706b1 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ExplainConfiguration.java 
PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/parse/ExplainSQRewriteSemanticAnalyzer.java
 8d7fd92 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ExplainSemanticAnalyzer.java 
75753b0 
  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g c411f5e 
  ql/src/java/org/apache/hadoop/hive/ql/parse/MapReduceCompiler.java 5b08ed2 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 6758741 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SubQueryDiagnostic.java 57f9432 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java 114fa2f 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java 66a8322 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 08278de 
  ql/src/java/org/apache/hadoop/hive/ql/plan/AbstractOperatorDesc.java adec5c7 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ExplainWork.java a213c83 
  ql/src/java/org/apache/hadoop/hive/ql/plan/OperatorDesc.java 16be499 
  ql/src/java/org/apache/hadoop/hive/ql/plan/Statistics.java 029043f 
  
ql/src/test/org/apache/hadoop/hive/ql/parse/TestUpdateDeleteSemanticAnalyzer.java
 ae1747d 
  ql/src/test/queries/clientpositive/explainanalyze_0.q PRE-CREATION 
  ql/src/test/queries/clientpositive/explainanalyze_1.q PRE-CREATION 
  ql/src/test/queries/clientpositive/explainanalyze_2.q PRE-CREATION 
  ql/src/test/queries/clientpositive/explainanalyze_3.q PRE-CREATION 
  ql/src/test/queries/clientpositive/explainanalyze_4.q PRE-CREATION 
  ql/src/test/results/clientpositive/tez/explainanalyze_0.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/explainanalyze_1.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/explainanalyze_2.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/explainanalyze_3.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/explainanalyze_4.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/51046/diff/


Testing
---


Thanks,

pengcheng xiong



Re: Review Request 50982: HIVE-14345:Beeline result table has erroneous characters

2016-08-12 Thread Vihang Karajgaonkar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50982/#review145631
---


Fix it, then Ship it!




Overall, the fix looks good. Some comments below.


beeline/src/java/org/apache/hive/beeline/TableOutputFormat.java (line 63)


nit, use the formatter to correct the spaces



beeline/src/test/org/apache/hive/beeline/TestTableOutputFormat.java (line 35)


I don't think the author comment is required. If you use Eclipse, I found 
importing the coding style "eclipse-styles.xml" from the dev-support directory 
helpful for formatting the code according to the coding conventions. This will 
also remove all the trailing spaces (seen in red below).



beeline/src/test/org/apache/hive/beeline/TestTableOutputFormat.java (line 100)


Is it possible to avoid hardcoding this string? If in the future 
beelineOpts.maxColumnWidth default values are changed, this test might fail. I 
think it will be more robust to determine this string programmatically using the 
values in mockResultData and value of maxColumnWidth.
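
One way to follow this suggestion, sketched with plain JDK code only. The helper expectedRow, the "| cell |" layout, and the truncate-then-pad rule are assumptions made for illustration; they are not taken from TableOutputFormat, whose real layout the test would have to mirror.

```java
import java.util.Arrays;
import java.util.List;

public class ExpectedRowSketch {
    // Hypothetical helper: derive the expected row from the mock data and
    // the configured width instead of hardcoding the rendered string.
    static String expectedRow(List<String> cells, int maxColumnWidth) {
        StringBuilder row = new StringBuilder("|");
        for (String cell : cells) {
            // Assumed rule: cut cells that are too long, pad the rest with spaces.
            String shown = cell.length() > maxColumnWidth
                    ? cell.substring(0, maxColumnWidth)
                    : cell;
            row.append(' ').append(shown);
            for (int i = shown.length(); i < maxColumnWidth; i++) {
                row.append(' ');
            }
            row.append(" |");
        }
        return row.toString();
    }

    public static void main(String[] args) {
        List<String> mockResultData = Arrays.asList("key", "a rather long value");
        System.out.println(expectedRow(mockResultData, 8));
    }
}
```

A test built this way keeps passing when the default width changes, because the expectation is recomputed from the same value the formatter reads.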


- Vihang Karajgaonkar


On Aug. 12, 2016, 3:16 p.m., Miklos Csanady wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/50982/
> ---
> 
> (Updated Aug. 12, 2016, 3:16 p.m.)
> 
> 
> Review request for hive, Peter Vary and Sergio Pena.
> 
> 
> Bugs: HIVE-14345
> https://issues.apache.org/jira/browse/HIVE-14345
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Fixed output table formatting header and footer lines.
> 
> 
> Diffs
> -
> 
>   beeline/src/java/org/apache/hive/beeline/TableOutputFormat.java 2753568 
>   beeline/src/test/org/apache/hive/beeline/TestTableOutputFormat.java 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/50982/diff/
> 
> 
> Testing
> ---
> 
> See the attached unit test class.
> After building with the patch, the bug is eliminated.
> 
> 
> File Attachments
> 
> 
> corrected version
>   
> https://reviews.apache.org/media/uploaded/files/2016/08/12/aed736a0-95f1-4f5b-9bda-4db767911065__HIVE-14345.patch
> 
> 
> Thanks,
> 
> Miklos Csanady
> 
>



Re: Review Request 51006: CBO: Return path - Fix for converting GroupBy operator with no map side group by

2016-08-12 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51006/#review145627
---




ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/HiveGBOpConvUtil.java
 (line 727)


Can you add a comment for this?



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/HiveGBOpConvUtil.java
 


This comment is useful. Instead of deleting it, it's better to 
update it.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/HiveGBOpConvUtil.java
 (line 1083)


Can you add a comment for when this condition is going to be true?


- Ashutosh Chauhan


On Aug. 11, 2016, 8:55 p.m., Vineet Garg wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51006/
> ---
> 
> (Updated Aug. 11, 2016, 8:55 p.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Bugs: HIVE-14396
> https://issues.apache.org/jira/browse/HIVE-14396
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> This patch fixes the following issues:
> 1. UDAF info collection was looking for wrong expression for a UDAF's 
> parameter from HiveAggregate.
> 2. Converting HiveAggregate to GroupBy operator was creating wrong 
> expressions for UDAF's arguments based on underlying Reduce operator.
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveCalciteUtil.java 
> 774fc59 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/HiveGBOpConvUtil.java
>  25fe059 
>   ql/src/test/queries/clientpositive/count.q 41ffaf2 
>   ql/src/test/queries/clientpositive/groupby_ppr_multi_distinct.q 74bd2fd 
>   ql/src/test/results/clientpositive/count.q.out c950c5b 
>   ql/src/test/results/clientpositive/groupby_ppr_multi_distinct.q.out 33d1ed0 
>   ql/src/test/results/clientpositive/spark/count.q.out b1ad662 
>   ql/src/test/results/clientpositive/spark/groupby_ppr_multi_distinct.q.out 
> 5251241 
>   ql/src/test/results/clientpositive/tez/count.q.out 9fc2c75 
> 
> Diff: https://reviews.apache.org/r/51006/diff/
> 
> 
> Testing
> ---
> 
> Added new tests and Pre-commit testing
> 
> 
> Thanks,
> 
> Vineet Garg
> 
>



Re: Review Request 50982: HIVE-14345:Beeline result table has erroneous characters

2016-08-12 Thread Miklos Csanady

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50982/
---

(Updated Aug. 12, 2016, 3:16 p.m.)


Review request for hive, Peter Vary and Sergio Pena.


Bugs: HIVE-14345
https://issues.apache.org/jira/browse/HIVE-14345


Repository: hive-git


Description
---

Fixed output table formatting header and footer lines.


Diffs
-

  beeline/src/java/org/apache/hive/beeline/TableOutputFormat.java 2753568 
  beeline/src/test/org/apache/hive/beeline/TestTableOutputFormat.java 
PRE-CREATION 

Diff: https://reviews.apache.org/r/50982/diff/


Testing
---

See the attached unit test class.
After building with the patch, the bug is eliminated.


File Attachments (updated)


corrected version
  
https://reviews.apache.org/media/uploaded/files/2016/08/12/aed736a0-95f1-4f5b-9bda-4db767911065__HIVE-14345.patch


Thanks,

Miklos Csanady



Review Request 51037: HIVE-14373 Add integration tests for hive on S3

2016-08-12 Thread Illya Yalovyy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51037/
---

Review request for hive, Ashutosh Chauhan, Abdullah Yousufi, and Sergio Pena.


Repository: hive-git


Description
---

Alternative implementation of an integration test framework. It can be used to 
test Hive against S3.


Diffs
-

  data/conf/s3/hive-site.xml PRE-CREATION 
  data/conf/s3/log4j2.properties PRE-CREATION 
  itests/pom.xml 426ba04 
  itests/qtest-s3/README PRE-CREATION 
  itests/qtest-s3/pom.xml PRE-CREATION 
  itests/qtest-s3/src/test/data/scripts/s3_test_cleanup.q PRE-CREATION 
  itests/qtest-s3/src/test/data/scripts/s3_test_init.q PRE-CREATION 
  itests/qtest-s3/src/test/expected-results/s3positive/insert_into_s3.q.out 
PRE-CREATION 
  itests/qtest-s3/src/test/java/org/apache/hadoop/hive/cli/Masker.java 
PRE-CREATION 
  
itests/qtest-s3/src/test/java/org/apache/hadoop/hive/cli/QTestUtilBuilder.java 
PRE-CREATION 
  itests/qtest-s3/src/test/java/org/apache/hadoop/hive/cli/S3TestUtil.java 
PRE-CREATION 
  itests/qtest-s3/src/test/java/org/apache/hadoop/hive/cli/TestMasker.java 
PRE-CREATION 
  itests/qtest-s3/src/test/queries/s3positive/insert_into_s3.q PRE-CREATION 
  itests/qtest-s3/src/test/resources/testconfiguration.properties PRE-CREATION 
  itests/qtest-s3/src/test/templates/TestCliDriverS3.vm PRE-CREATION 

Diff: https://reviews.apache.org/r/51037/diff/


Testing
---

It can be built and executed with the latest hive master branch.


Thanks,

Illya Yalovyy



Re: Review Request 50906: HIVE-14444 Upgrade qtest execution framework to junit4 - migrate most of them

2016-08-12 Thread Peter Vary


> On Aug. 9, 2016, 12:14 a.m., Peter Vary wrote:
> > Hi Zoltan,
> > 
> > Thanks for the patch, I can see, that you were working on it even on the 
> > weekend.
> > 
> > Please help me to understand the components a little more, so I could help 
> > with the review.
> > As I can see there are 3 levels of the classes for every given test:
> > - Configuration
> > - Adapter
> > - Driver
> > 
> > I have tried to identify the functionality of the given elements, and come 
> > up with the following:
> > - Configuration - The queries to run, the configuration of the clusters, 
> > and the initial data
> > - Adapter - The actual methods for implementing the test, like class, 
> > method level setup, and test execution
> > - Driver - These contain very little code, and they look very similar, so 
> > there is a lot of code duplication. Would it not be a good idea to merge it 
> > with the Adapter class? Also it is a little strange that the Configuration 
> > has to have a reference to the Adapter. If you decide to merge the Adapter 
> > and the Driver, then the reference is not needed anymore.
> > 
> > Thanks,
> > Peter
> 
> Zoltan Haindrich wrote:
> Hello,
> 
> Yeah, I wanted to take advantage of the "empty queue" on the ptest 
> executor ;)
> By the way, I think that all Hive precommit jobs which end up on ubuntu-3 
> will fail with some weird JDK issue...
> 
> I think you are getting it right...those classes which have "Driver" in 
> their names are the successors of the old vm files: I don't want to touch 
> them until we have all of them on board.
> There is some redundancy even between the Driver classes...CliDriver and 
> some others are very similar - it will be easy to drop some of them.
> Merge with the adapter would possibly remove the common parent - and that 
> would possibly break the factory adapterfactory.
> The positive side of the current design is that all configurations are in 
> one place...even the core executor selection is in CliConfigs - so you have 
> to look at just one place if you have to modify it.
> 
> About more refactoring work: reviewboard can pick up changes in renamed 
> files (which is great), but if I add more refactors to this patch, it will 
> look like "20 files removed", "30 files added", which is not really review 
> friendly ;) It has already lost track of the changes to PerfCliDriver and 
> QTestGenTask.
> 
> I would like to continue this with a cleanup refactor after AccumuloCli 
> and BeelineCli are on board. 
> 
> regards,
> Zoltan
> 
> Peter Vary wrote:
> Hi Zoltan,
> 
> If I were in BP, I would offer you a beer to discuss this over :).
> Unfortunately this is not an option now, so we have to do it the hard 
> way.
> 
> What final design do you have in mind? I think we should discuss 
> these changes in the light of that, and should not focus on partial 
> solutions.
> For example - correct me if I'm wrong - the Adapter model is most useful 
> if there is an existing interface we have to adhere to. So the final design 
> does not require an adapter, since the interfaces are used only by the tests, 
> and we could change them if needed.
> 
> I think we should plan for the following changes, and keep everything 
> else as simple as possible:
> 1. Adding new queries - this happens very often (maybe too often in my 
> opinion, but let’s not discuss it here :) )
> 2. Changing how to handle the specific test case results 
> (ordering/filtering/regexp) - QTestUtil, HBaseQTestUtil, QFileClient for 
> BeeLine
> 3. Adding a new test, to test newly integrated components - as was the 
> case for BeeLine/Spark/Tez
> 
> Only in the 3rd case should we touch the Driver, and the Adapter, but 
> then we should change both of them. For me it means that they are tightly 
> coupled, and might be a good idea to merge them.
> 
> What do you think?
> 
> Thanks,
> Peter
> 
> Zoltan Haindrich wrote:
> I'm afraid I don't fully understand your concerns, but I'll try to answer 
> what I can:
> 
> I think these classes are nowhere near final now...I'm not even sure what the 
> end result will look like...
> 
> * in the near future I think some JUnit rule classes may help clean up 
> both the drivers and the deeper qtest-related features as well
> * the future of this adapter-named class is unknown...it may or may not be 
> removed later. Its name is adapter because there was an undeclared interface 
> which was used in all clidrivers, and I wanted to package those things into 
> junit4 rules + add checks to see if any of them named the method a bit 
> differently; I can rename it to AbstractCliDriver if it causes confusion.
> * 1,2 is I think outside of the scope of this ticket.
> * (3) I think that having some common way to configure these tests is an 
> advantage...so I think you shouldn't change anything beyond 

Re: Review Request 50982: HIVE-14345:Beeline result table has erroneous characters

2016-08-12 Thread Miklos Csanady

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50982/
---

(Updated Aug. 12, 2016, 8:45 a.m.)


Review request for hive, Peter Vary and Sergio Pena.


Changes
---

uploaded new patch file, set branch and bugs label


Bugs: HIVE-14345
https://issues.apache.org/jira/browse/HIVE-14345


Repository: hive-git


Description
---

Fixed output table formatting header and footer lines.


Diffs
-

  beeline/src/java/org/apache/hive/beeline/TableOutputFormat.java 2753568 
  beeline/src/test/org/apache/hive/beeline/TestTableOutputFormat.java 
PRE-CREATION 

Diff: https://reviews.apache.org/r/50982/diff/


Testing
---

See the attached unit test class.
After building with the patch, the bug is eliminated.


Thanks,

Miklos Csanady



[jira] [Created] (HIVE-14528) After enabling Hive Parquet Vectorization, many queries in TPCx-BB(BigBench) failed with NullPointerException and IllegalArgumentException

2016-08-12 Thread KaiXu (JIRA)
KaiXu created HIVE-14528:


 Summary: After enabling Hive Parquet Vectorization, many queries 
in TPCx-BB(BigBench)  failed with NullPointerException and 
IllegalArgumentException
 Key: HIVE-14528
 URL: https://issues.apache.org/jira/browse/HIVE-14528
 Project: Hive
  Issue Type: Bug
  Components: API, File Formats
Affects Versions: 2.1.0
 Environment: Apache Hadoop2.6.0
Apache Hive2.1.0
JDK1.8.0_73
TPCx-BB 1.0.1
Reporter: KaiXu


We use TPCx-BB (BigBench) to evaluate the performance of Hive Parquet 
Vectorization in our local cluster (E5-2699 v3, 256G, 72 vcores, 1 master node + 
5 worker nodes). During our performance test with Parquet Vectorization enabled, 
we found that many queries failed with one of two errors:
a. Error: java.lang.NullPointerException@ VectorizedParquetInputFormat.java:188
For queries: q02, q03, q04, q06, q08, q11, q14, q15, q18, q19, q21, q23
b. java.io.IOException: java.io.IOException: 
java.lang.IllegalArgumentException: 8 > 4@ HiveIOExceptionHandlerChain.java:121
For queries: q07, q09, q13, q17, q24
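
For error b, the message "8 > 4" has the form that java.util.Arrays.copyOfRange uses when its from index is greater than its to index, and the second trace below does end in that method. A small standalone sketch that reproduces the same message; the class name, helper name, and array size are made up, and it says nothing about which Hive code computes the indices:

```java
import java.util.Arrays;

public class CopyOfRangeSketch {
    // Returns the exception message produced when from > to, or null if none.
    static String messageFor(int from, int to) {
        try {
            Arrays.copyOfRange(new byte[16], from, to);
            return null; // no exception: the range was valid
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(messageFor(8, 4));
    }
}
```

This suggests looking for a start offset that ends up past the end offset when the vectorized reader copies a byte range.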

Error: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.io.parquet.VectorizedParquetInputFormat$VectorizedParquetRecordReader.close(VectorizedParquetInputFormat.java:188)
at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doClose(CombineHiveRecordReader.java:74)
at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.close(HiveContextAwareRecordReader.java:106)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.close(HadoopShimsSecure.java:172)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.close(MapTask.java:210)
at org.apache.hadoop.mapred.MapTask.closeQuietly(MapTask.java:1972)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:465)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)


Error: java.io.IOException: java.io.IOException: java.lang.IllegalArgumentException: 8 > 4
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:230)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:140)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.io.IOException: java.lang.IllegalArgumentException: 8 > 4
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:357)
at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:106)
at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:42)
at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:118)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:228)
... 11 more
Caused by: java.lang.IllegalArgumentException: 8 > 4
at java.util.Arrays.copyOfRange(Arrays.java:3519)
at org.apache.hadoop.hive.ql.io.parquet.VectorizedParquetInputFormat$VectorizedParquetRecordReader.assignVector(VectorizedParquetInputFormat.java:313)
at 

[jira] [Created] (HIVE-14529) Union All query returns incorrect results.

2016-08-12 Thread wenhe li (JIRA)
wenhe li created HIVE-14529:
---

 Summary: Union All query returns incorrect results.
 Key: HIVE-14529
 URL: https://issues.apache.org/jira/browse/HIVE-14529
 Project: Hive
  Issue Type: Bug
  Components: Query Planning
Affects Versions: 2.1.0
 Environment: Hadoop 2.6
Hive 2.1
Reporter: wenhe li



create table dw_tmp.l_test1 (id bigint,val string,trans_date string) row format 
delimited fields terminated by ' ' ;

create table dw_tmp.l_test2 (id bigint,val string,trans_date string) row format 
delimited fields terminated by ' ' ;  


select * from dw_tmp.l_test1;

1   table_1  2016-08-11


select * from dw_tmp.l_test2;

2   table_2  2016-08-11


-- right like this

select 
id,
'table_1' ,
trans_date
from dw_tmp.l_test1
union all
select 
id,
val,
trans_date
from dw_tmp.l_test2 ;

1   table_1 2016-08-11
2   table_2 2016-08-11

-- incorrect

select 
id,
999,
'table_1' ,
trans_date
from dw_tmp.l_test1
union all
select 
id,
999,
val,
trans_date
from dw_tmp.l_test2 ;

1   999 table_1 2016-08-11
2   999 table_1 2016-08-11 <-- here is wrong

-- incorrect

select 
id,
999,
666,
'table_1' ,
trans_date
from dw_tmp.l_test1
union all
select 
id,
999,
666,
val,
trans_date
from dw_tmp.l_test2 ;

1   999 666 table_1 2016-08-11
2   999 666 table_1 2016-08-11 <-- here is wrong

-- right

select 
id,
999,
'table_1' ,
trans_date,
'2016-11-11'
from dw_tmp.l_test1
union all
select 
id,
999,
val,
trans_date,
trans_date
from dw_tmp.l_test2 ;

1   999 table_1 2016-08-11  2016-11-11
2   999 table_2 2016-08-11  2016-08-11






[jira] [Created] (HIVE-14522) CBO: Calcite Operator To Hive Operator(Calcite Return Path): Fix test failure for auto_join_filters

2016-08-12 Thread Vineet Garg (JIRA)
Vineet Garg created HIVE-14522:
--

 Summary: CBO: Calcite Operator To Hive Operator(Calcite Return 
Path): Fix test failure for auto_join_filters
 Key: HIVE-14522
 URL: https://issues.apache.org/jira/browse/HIVE-14522
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Vineet Garg
Assignee: Vineet Garg


{code}
CREATE TABLE smb_input1(key int, value int) CLUSTERED BY (key) SORTED BY (key) 
INTO 2 BUCKETS; 
CREATE TABLE smb_input2(key int, value int) CLUSTERED BY (value) SORTED BY 
(value) INTO 2 BUCKETS; 
LOAD DATA LOCAL INPATH '../../data/files/in1.txt' into table smb_input1;
LOAD DATA LOCAL INPATH '../../data/files/in2.txt' into table smb_input1;
LOAD DATA LOCAL INPATH '../../data/files/in1.txt' into table smb_input2;
LOAD DATA LOCAL INPATH '../../data/files/in2.txt' into table smb_input2;

SET hive.optimize.bucketmapjoin = true;
SET hive.optimize.bucketmapjoin.sortedmerge = true;
SET hive.input.format = org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;

SET hive.outerjoin.supports.filters = false;
{code}

{code}
SELECT sum(hash(a.key,a.value,b.key,b.value))
FROM myinput1 a LEFT OUTER JOIN myinput1 b
  ON a.key > 40 AND a.value > 50 AND a.key = a.value
 AND b.key > 40 AND b.value > 50 AND b.key = b.value;
{code}

{code}
Expected result: 3078400
Actual result: 4937935
{code}





[jira] [Created] (HIVE-14526) HadoopMetrics2Reporter logs way, way too much

2016-08-12 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-14526:
---

 Summary: HadoopMetrics2Reporter logs way, way too much
 Key: HIVE-14526
 URL: https://issues.apache.org/jira/browse/HIVE-14526
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin


{noformat}
# grep -c HadoopMetrics2Reporter hiveserver2.log.2016-08-11
547524076
# grep -c . hiveserver2.log.2016-08-11
548430185
# ll hiveserver2.log.2016-08-11
-rw-r--r-- 1 hive hadoop 204695432463 Aug 11 23:59 hiveserver2.log.2016-08-11
{noformat}
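
To put those counts in proportion, here is a small sketch of the arithmetic. 
LogShare, percent and gib are hypothetical names, and it assumes that 
"grep -c ." (which counts non-empty lines) is a fair stand-in for the total 
number of lines in the file:

```java
// Hypothetical helper; it only does arithmetic on the numbers quoted above.
class LogShare {

    // Share of matching lines among all counted lines, in percent.
    static double percent(long matching, long total) {
        return 100.0 * matching / total;
    }

    // Converts a byte count to GiB (1 GiB = 1024^3 bytes).
    static double gib(long bytes) {
        return bytes / (1024.0 * 1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        long reporterLines = 547524076L; // grep -c HadoopMetrics2Reporter
        long allLines = 548430185L;      // grep -c . (non-empty lines)
        long fileBytes = 204695432463L;  // size shown by ll

        System.out.printf("reporter share: %.2f%%%n", percent(reporterLines, allLines));
        System.out.printf("file size: %.1f GiB%n", gib(fileBytes));
    }
}
```

On these numbers, HadoopMetrics2Reporter accounts for roughly 99.8% of the 
non-empty lines in a single day's log of about 190 GiB.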





[jira] [Created] (HIVE-14525) beeline still writing log data to stdout as of version 2.1.0

2016-08-12 Thread stephen sprague (JIRA)
stephen sprague created HIVE-14525:
--

 Summary: beeline still writing log data to stdout as of version 
2.1.0
 Key: HIVE-14525
 URL: https://issues.apache.org/jira/browse/HIVE-14525
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Affects Versions: 2.1.0
Reporter: stephen sprague


simple test. note that i'm looking to get a tsv file back.

{code}
$ beeline -u dwrdevnn1 --showHeader=false --outputformat=tsv2 <<SQL >stdout 2>stderr
> select count(*)
> from default.dual;
> SQL
{code}

instead i get this in stdout:

{code}
$ cat stdout
0: jdbc:hive2://dwrdevnn1.sv2.trulia.com:1000> select count(*)
. . . . . . . . . . . . . . . . . . . . . . .> from default.dual;
0
0: jdbc:hive2://dwrdevnn1.sv2.trulia.com:1000> 
{code}

i should only get one row, which is the *result* of the query (which is 0), not 
the overly loggy lines you see above. that stuff goes to stderr, my friends.

also, i refer to this ticket b/c the last comment suggested so. it's close but 
not exactly the same.
https://issues.apache.org/jira/browse/HIVE-14183






[jira] [Created] (HIVE-14527) Schema evolution tests are not running in TestCliDriver

2016-08-12 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-14527:


 Summary: Schema evolution tests are not running in TestCliDriver
 Key: HIVE-14527
 URL: https://issues.apache.org/jira/browse/HIVE-14527
 Project: Hive
  Issue Type: Sub-task
  Components: Test
Affects Versions: 2.2.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


HIVE-14376 broke something, causing the schema evolution tests to be excluded 
from the TestCliDriver test suite.





[jira] [Created] (HIVE-14530) Union All query returns incorrect results.

2016-08-12 Thread wenhe li (JIRA)
wenhe li created HIVE-14530:
---

 Summary: Union All query returns incorrect results.
 Key: HIVE-14530
 URL: https://issues.apache.org/jira/browse/HIVE-14530
 Project: Hive
  Issue Type: Bug
  Components: Query Planning
Affects Versions: 2.1.0
 Environment: Hadoop 2.6
Hive 2.1
Reporter: wenhe li



create table dw_tmp.l_test1 (id bigint,val string,trans_date string) row format 
delimited fields terminated by ' ' ;

create table dw_tmp.l_test2 (id bigint,val string,trans_date string) row format 
delimited fields terminated by ' ' ;  


select * from dw_tmp.l_test1;

1   table_1  2016-08-11


select * from dw_tmp.l_test2;

2   table_2  2016-08-11


-- right like this

select 
id,
'table_1' ,
trans_date
from dw_tmp.l_test1
union all
select 
id,
val,
trans_date
from dw_tmp.l_test2 ;

1   table_1 2016-08-11
2   table_2 2016-08-11

-- incorrect

select 
id,
999,
'table_1' ,
trans_date
from dw_tmp.l_test1
union all
select 
id,
999,
val,
trans_date
from dw_tmp.l_test2 ;

1   999 table_1 2016-08-11
2   999 table_1 2016-08-11 <-- here is wrong

-- incorrect

select 
id,
999,
666,
'table_1' ,
trans_date
from dw_tmp.l_test1
union all
select 
id,
999,
666,
val,
trans_date
from dw_tmp.l_test2 ;

1   999 666 table_1 2016-08-11
2   999 666 table_1 2016-08-11 <-- here is wrong

-- right

select 
id,
999,
'table_1' ,
trans_date,
'2016-11-11'
from dw_tmp.l_test1
union all
select 
id,
999,
val,
trans_date,
trans_date
from dw_tmp.l_test2 ;

1   999 table_1 2016-08-11  2016-11-11
2   999 table_2 2016-08-11  2016-08-11



