date:20150707

Re: Review Request 35950: HIVE-11131: Get row information on DataWritableWriter once for better writing performance

2015-07-07 Thread Sergio Pena


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35950/
---

(Updated July 7, 2015, 4:25 p.m.)


Review request for hive, Ryan Blue, cheng xu, and Dong Chen.


Changes
---

Address feedback changes.


Bugs: HIVE-11131
https://issues.apache.org/jira/browse/HIVE-11131


Repository: hive-git


Description
---

Implemented data type writers that are created before the first Hive row is 
written to Parquet. Each writer holds the object inspector and schema 
information for a specific data type, and calls the specific add() method 
used by Parquet for that data type.
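
As a rough illustration of the idea in this description (resolve the per-type writer once, then reuse it for every row), here is a minimal Java sketch. The class, interface, and method names are hypothetical and are not taken from DataWritableWriter; the strings it builds only stand in for the add calls a real writer would make.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: these names do not come from Hive or Parquet.
final class RowWriterSketch {
    /** Stand-in for a per-type writer that already knows which add call to use. */
    interface FieldWriter {
        String write(Object value);
    }

    private final List<FieldWriter> writers = new ArrayList<>();

    /** Resolves one writer per column up front, before any row is written. */
    RowWriterSketch(List<String> columnTypes) {
        for (String type : columnTypes) {
            switch (type) {
                case "int":
                    writers.add(v -> "addInteger(" + v + ")");
                    break;
                case "string":
                    writers.add(v -> "addBinary(" + v + ")");
                    break;
                default:
                    throw new IllegalArgumentException("unsupported type: " + type);
            }
        }
    }

    /** Writes a row using the prepared writers; no type lookup happens here. */
    String writeRow(List<Object> row) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < writers.size(); i++) {
            out.append(writers.get(i).write(row.get(i))).append(';');
        }
        return out.toString();
    }
}
```

The type switch runs once per column in the constructor, so the per-row loop only dispatches to writers that are already resolved.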


Diffs (updated)
-

  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java 
c195c3ec3ddae19bf255fc2c9633f8bf4390f428 

Diff: https://reviews.apache.org/r/35950/diff/


Testing
---

Tests from TestDataWritableWriter run OK.

I ran other tests with micro-benchmarks and got better results with 
this new implementation:

Using repeated rows across the file, this is the throughput increase using 1 
million records:

bigint  boolean double  float   int     string
7.598   7.491   7.488   7.588   7.53    0.270   (before)
10.137  11.511  10.155  10.297  10.242  0.286   (after)

Using random rows across the file, this is the throughput increase using 1 
million records:

bigint  boolean double  float   int string
5.268   7.723   4.107   4.173   4.729   0.20   (before)
6.236   10.466  5.944   4.749   5.234   0.22   (after)


Thanks,

Sergio Pena

Re: Review Request 36156: HIVE-11053: Add more tests for HIVE-10844[Spark Branch]

2015-07-07 Thread lun gao


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36156/
---

(Updated July 7, 2015, 3:49 p.m.)


Review request for hive and chengxiang li.


Changes
---

Delete some meaningless lines.


Bugs: HIVE-11053
https://issues.apache.org/jira/browse/HIVE-11053


Repository: hive-git


Description
---

Add some test cases for self-union, self-join, CWE, and repeated sub-queries to 
verify the combining of equivalent works introduced in HIVE-10844.


Diffs (updated)
-

  ql/src/test/queries/clientpositive/dynamic_rdd_cache.q PRE-CREATION 
  ql/src/test/results/clientpositive/spark/dynamic_rdd_cache.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/36156/diff/


Testing
---


Thanks,

lun gao

Hive-0.14 - Build # 1005 - Still Failing

2015-07-07 Thread Apache Jenkins Server

Changes for Build #986

Changes for Build #987

Changes for Build #988

Changes for Build #989

Changes for Build #990

Changes for Build #991

Changes for Build #992

Changes for Build #993

Changes for Build #994

Changes for Build #995

Changes for Build #996

Changes for Build #997

Changes for Build #998

Changes for Build #999

Changes for Build #1000

Changes for Build #1001

Changes for Build #1002

Changes for Build #1003

Changes for Build #1004

Changes for Build #1005



No tests ran.

The Apache Jenkins build system has built Hive-0.14 (build #1005)

Status: Still Failing

Check console output at https://builds.apache.org/job/Hive-0.14/1005/ to view 
the results.

[jira] [Created] (HIVE-11191) Beeline-cli: support hive.cli.errors.ignore in new CLI

2015-07-07 Thread Ferdinand Xu (JIRA)

Ferdinand Xu created HIVE-11191:
---

 Summary: Beeline-cli: support hive.cli.errors.ignore in new CLI
 Key: HIVE-11191
 URL: https://issues.apache.org/jira/browse/HIVE-11191
 Project: Hive
  Issue Type: Sub-task
  Components: CLI
Reporter: Ferdinand Xu
Assignee: Ferdinand Xu


The old CLI uses hive.cli.errors.ignore from the Hive configuration to 
force a script to keep executing when errors occur. Beeline has a similar 
option called force. We need to support the previous configuration using 
the Beeline functionality. More details about the force option are available at 
https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Re: is HiveQA broken?

2015-07-07 Thread Szehon Ho

I am looking into the issue now. It looks like it is looking for an AMI image
that is no longer there to spawn the slaves. I will try to see what is going
on.

Thanks
Szehon

On Tue, Jul 7, 2015 at 11:38 AM, Szehon Ho sze...@cloudera.com wrote:

 Yea you are right, somehow the server is not running the tests, hence no
 logs.  I will try to look at this today when I get a chance.

 Sergio can you also take a look if you have cycles?

 Thanks
 Szehon

 On Tue, Jul 7, 2015 at 11:09 AM, Sergey Shelukhin ser...@hortonworks.com
 wrote:

 It looks like it’s downloading test logs from e.g.

 http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4510

 but there are no logs there. It also appears to happen pretty fast, so
 testTailLog does nothing? It either returns prematurely or fails quickly
 w/o logs.

 On 15/7/6, 19:03, Sergey Shelukhin ser...@hortonworks.com wrote:

 Looks like last HiveQA failed again in the same way, not sure if it was
 before or after restart, what time zone is it in?
 May that be related to recent Apache infra build changes?
 
 On 15/7/6, 17:24, Szehon Ho sze...@cloudera.com wrote:
 
 I am not sure what is going on.  Let me restart the Ptest server.
 
 On Mon, Jul 6, 2015 at 1:02 PM, Sergey Shelukhin 
 ser...@hortonworks.com
 wrote:
 
  Looks like all the runs are failing with:
 
  Exception in thread "main" java.lang.RuntimeException: 404 Not Found
  at org.apache.hive.ptest.api.client.PTestClient.downloadTestResults(PTestClient.java:181)
  at org.apache.hive.ptest.api.client.PTestClient.testStart(PTestClient.java:129)
  at org.apache.hive.ptest.api.client.PTestClient.main(PTestClient.java:312)
  + ret=1
  + cd target/
  + [[ -f test-results.tar.gz ]]
  + exit 1
  + rm -f /tmp/tmp.FayzMqy1oP
  Build step 'Execute shell' marked build as failure
  Archiving artifacts
  Recording test results
  Finished: FAILURE

Re: Review Request 36069: HIVE-11141 : Improve RuleRegExp when the Expression node stack gets huge

2015-07-07 Thread Hari Sankar Sivarama Subramaniyan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36069/
---

(Updated July 7, 2015, 6:12 p.m.)


Review request for hive, John Pullokkaran and Mostafa Mokhtar.


Repository: hive-git


Description
---

Improve RuleRegExp when the Expression node stack gets huge


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/lib/RuleRegExp.java ddc96c2 
  ql/src/test/org/apache/hadoop/hive/ql/lib/TestRuleRegExp.java PRE-CREATION 

Diff: https://reviews.apache.org/r/36069/diff/


Testing
---

Local testing.


Thanks,

Hari Sankar Sivarama Subramaniyan

Re: Review Request 36069: HIVE-11141 : Improve RuleRegExp when the Expression node stack gets huge

2015-07-07 Thread Hari Sankar Sivarama Subramaniyan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36069/#review90748
---



ql/src/java/org/apache/hadoop/hive/ql/lib/RuleRegExp.java (line 52)
https://reviews.apache.org/r/36069/#comment143877

I don't clearly understand what you suggested, but I can assure you that the 
implementation here won't take much time in the worst case.


- Hari Sankar Sivarama Subramaniyan


On July 2, 2015, 1:10 a.m., Hari Sankar Sivarama Subramaniyan wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/36069/
 ---
 
 (Updated July 2, 2015, 1:10 a.m.)
 
 
 Review request for hive, John Pullokkaran and Mostafa Mokhtar.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Improve RuleRegExp when the Expression node stack gets huge
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/lib/RuleRegExp.java ddc96c2 
   ql/src/test/org/apache/hadoop/hive/ql/lib/TestRuleRegExp.java PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/36069/diff/
 
 
 Testing
 ---
 
 Local testing.
 
 
 Thanks,
 
 Hari Sankar Sivarama Subramaniyan

Re: Review Request 36069: HIVE-11141 : Improve RuleRegExp when the Expression node stack gets huge

2015-07-07 Thread Hari Sankar Sivarama Subramaniyan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36069/
---

(Updated July 7, 2015, 7:54 p.m.)


Review request for hive, John Pullokkaran and Mostafa Mokhtar.


Repository: hive-git


Description
---

Improve RuleRegExp when the Expression node stack gets huge


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/lib/RuleRegExp.java ddc96c2 
  ql/src/test/org/apache/hadoop/hive/ql/lib/TestRuleRegExp.java PRE-CREATION 

Diff: https://reviews.apache.org/r/36069/diff/


Testing
---

Local testing.


Thanks,

Hari Sankar Sivarama Subramaniyan

Re: is HiveQA broken?

2015-07-07 Thread Sergey Shelukhin

It looks like it’s downloading test logs from e.g.
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4510

but there are no logs there. It also appears to happen pretty fast, so
testTailLog does nothing? It either returns prematurely or fails quickly
w/o logs.

On 15/7/6, 19:03, Sergey Shelukhin ser...@hortonworks.com wrote:

Looks like last HiveQA failed again in the same way, not sure if it was
before or after restart, what time zone is it in?
May that be related to recent Apache infra build changes?

On 15/7/6, 17:24, Szehon Ho sze...@cloudera.com wrote:

I am not sure what is going on.  Let me restart the Ptest server.

On Mon, Jul 6, 2015 at 1:02 PM, Sergey Shelukhin ser...@hortonworks.com
wrote:

 Looks like all the runs are failing with:

 Exception in thread "main" java.lang.RuntimeException: 404 Not Found
 at org.apache.hive.ptest.api.client.PTestClient.downloadTestResults(PTestClient.java:181)
 at org.apache.hive.ptest.api.client.PTestClient.testStart(PTestClient.java:129)
 at org.apache.hive.ptest.api.client.PTestClient.main(PTestClient.java:312)
 + ret=1
 + cd target/
 + [[ -f test-results.tar.gz ]]
 + exit 1
 + rm -f /tmp/tmp.FayzMqy1oP
 Build step 'Execute shell' marked build as failure
 Archiving artifacts
 Recording test results
 Finished: FAILURE

Re: is HiveQA broken?

2015-07-07 Thread Szehon Ho

Yea you are right, somehow the server is not running the tests, hence no
logs.  I will try to look at this today when I get a chance.

Sergio can you also take a look if you have cycles?

Thanks
Szehon

On Tue, Jul 7, 2015 at 11:09 AM, Sergey Shelukhin ser...@hortonworks.com
wrote:

 It looks like it’s downloading test logs from e.g.
 http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4510

 but there are no logs there. It also appears to happen pretty fast, so
 testTailLog does nothing? It either returns prematurely or fails quickly
 w/o logs.

 On 15/7/6, 19:03, Sergey Shelukhin ser...@hortonworks.com wrote:

 Looks like last HiveQA failed again in the same way, not sure if it was
 before or after restart, what time zone is it in?
 May that be related to recent Apache infra build changes?
 
 On 15/7/6, 17:24, Szehon Ho sze...@cloudera.com wrote:
 
 I am not sure what is going on.  Let me restart the Ptest server.
 
 On Mon, Jul 6, 2015 at 1:02 PM, Sergey Shelukhin ser...@hortonworks.com
 
 wrote:
 
  Looks like all the runs are failing with:
 
  Exception in thread "main" java.lang.RuntimeException: 404 Not Found
  at org.apache.hive.ptest.api.client.PTestClient.downloadTestResults(PTestClient.java:181)
  at org.apache.hive.ptest.api.client.PTestClient.testStart(PTestClient.java:129)
  at org.apache.hive.ptest.api.client.PTestClient.main(PTestClient.java:312)
  + ret=1
  + cd target/
  + [[ -f test-results.tar.gz ]]
  + exit 1
  + rm -f /tmp/tmp.FayzMqy1oP
  Build step 'Execute shell' marked build as failure
  Archiving artifacts
  Recording test results
  Finished: FAILURE

Re: Review Request 36069: HIVE-11141 : Improve RuleRegExp when the Expression node stack gets huge

2015-07-07 Thread John Pullokkaran



 On July 7, 2015, 6:12 p.m., Hari Sankar Sivarama Subramaniyan wrote:
  ql/src/java/org/apache/hadoop/hive/ql/lib/RuleRegExp.java, line 52
  https://reviews.apache.org/r/36069/diff/2/?file=997802#file997802line52
 
  I don't clearly understand what you suggested, but I can assure you that the 
  implementation here won't take much time in the worst case.

Currently, for each pattern char we traverse the whole string. Instead, 
create a static hash set of wildcard chars, then for each char in the rule 
string do a lookup to see whether it exists in the wildcard set.
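
A minimal sketch of that suggestion, for illustration only. The class name and the exact set of wildcard characters are assumptions; the set that RuleRegExp actually needs may differ.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of the static wildcard-set lookup suggested above.
final class WildcardLookupSketch {
    // Assumed regex metacharacters; adjust to whatever the rule syntax treats as wild.
    private static final Set<Character> WILD_CHARS = new HashSet<>();
    static {
        for (char c : "[]^$*+?|().\\".toCharArray()) {
            WILD_CHARS.add(c);
        }
    }

    /** Makes a single pass over the rule string with one set lookup per char. */
    static boolean hasWildcard(String rule) {
        for (int i = 0; i < rule.length(); i++) {
            if (WILD_CHARS.contains(rule.charAt(i))) {
                return true;
            }
        }
        return false;
    }
}
```

With this shape the rule string is scanned once, instead of once per wildcard character.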


- John


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36069/#review90748
---


On July 7, 2015, 6:12 p.m., Hari Sankar Sivarama Subramaniyan wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/36069/
 ---
 
 (Updated July 7, 2015, 6:12 p.m.)
 
 
 Review request for hive, John Pullokkaran and Mostafa Mokhtar.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Improve RuleRegExp when the Expression node stack gets huge
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/lib/RuleRegExp.java ddc96c2 
   ql/src/test/org/apache/hadoop/hive/ql/lib/TestRuleRegExp.java PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/36069/diff/
 
 
 Testing
 ---
 
 Local testing.
 
 
 Thanks,
 
 Hari Sankar Sivarama Subramaniyan

[jira] [Created] (HIVE-11196) Utilities.getPartitionDesc() Should try to reuse TableDesc object

2015-07-07 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

Hari Sankar Sivarama Subramaniyan created HIVE-11196:


 Summary: Utilities.getPartitionDesc() Should try to reuse 
TableDesc object 
 Key: HIVE-11196
 URL: https://issues.apache.org/jira/browse/HIVE-11196
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan


Currently, Utilities.getPartitionDesc() creates a new PartitionDesc object, 
which in turn creates a new TableDesc object via 
Utilities.getTableDesc(part.getTable()) on every call. This value needs to be 
reused so that we can avoid the expense of creating a new descriptor object 
wherever possible.
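
One possible shape for this kind of reuse is a small per-table cache, sketched below. This is only an assumption about how the reuse could look, not the change proposed for Utilities; TableDescSketch and DescriptorCacheSketch are hypothetical stand-ins.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: TableDescSketch stands in for an expensive descriptor object.
final class DescriptorCacheSketch {
    static final class TableDescSketch {
        final String tableName;

        TableDescSketch(String tableName) {
            this.tableName = tableName;
        }
    }

    private final Map<String, TableDescSketch> cache = new HashMap<>();
    private int created = 0;

    /** Returns the cached descriptor for the table, building it only on first use. */
    TableDescSketch getTableDesc(String tableName) {
        return cache.computeIfAbsent(tableName, name -> {
            created++;
            return new TableDescSketch(name);
        });
    }

    /** Number of descriptors actually constructed so far. */
    int createdCount() {
        return created;
    }
}
```

Repeated calls for partitions of the same table then share one descriptor instead of building a new one each time.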




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Created] (HIVE-11195) Make auto_sortmerge_join_16.q result sequence more stable

2015-07-07 Thread Pengcheng Xiong (JIRA)

Pengcheng Xiong created HIVE-11195:
--

 Summary: Make auto_sortmerge_join_16.q result sequence more stable
 Key: HIVE-11195
 URL: https://issues.apache.org/jira/browse/HIVE-11195
 Project: Hive
  Issue Type: Improvement
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
Priority: Trivial


adding -- SORT_QUERY_RESULTS



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Created] (HIVE-11193) ConstantPropagateProcCtx should use a Set instead of a List to hold operators to be deleted

2015-07-07 Thread Wei Zheng (JIRA)

Wei Zheng created HIVE-11193:


 Summary: ConstantPropagateProcCtx should use a Set instead of a 
List to hold operators to be deleted
 Key: HIVE-11193
 URL: https://issues.apache.org/jira/browse/HIVE-11193
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Reporter: Wei Zheng
Assignee: Wei Zheng


During Constant Propagation optimization, sometimes a node ends up being added 
to opToDelete list more than once.

Later in ConstantPropagate transform, we try to delete that operator multiple 
times, which will cause SemanticException since the node has already been 
removed in an earlier pass.

The data structure for storing opToDelete is List. We should use Set to avoid 
the problem.
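
To make the failure mode concrete, here is a small, self-contained Java sketch. The names are hypothetical, operators are modeled as plain strings, and IllegalStateException stands in for the SemanticException mentioned above.

```java
import java.util.Set;

// Hypothetical sketch: operators are plain strings, not Hive Operator objects.
final class OpsToDeleteSketch {
    /** Removes each queued operator from the plan; fails if one is already gone. */
    static void deleteAll(Iterable<String> opsToDelete, Set<String> plan) {
        for (String op : opsToDelete) {
            if (!plan.remove(op)) {
                throw new IllegalStateException("operator already removed: " + op);
            }
        }
    }
}
```

If opsToDelete is a List holding the same operator twice, the second removal throws. If it is a Set, the duplicate never gets queued, so each operator is removed exactly once.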



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Re: Review Request 36241: HIVE-10927 : Add number of HMS connection metrics

2015-07-07 Thread Szehon Ho


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36241/
---

(Updated July 7, 2015, 9:28 p.m.)


Review request for hive and Jimmy Xiang.


Changes
---

Address the review comments.


Bugs: HIVE-10927
https://issues.apache.org/jira/browse/HIVE-10927


Repository: hive-git


Description
---

Adds the following new metrics to HMS; they can be renamed if necessary.

open_connections;  //#HMS clients connecting to HMS

active_jdo_transactions;  //#active JDO transactions
rollbacked_jdo_transactions;  //#failed JDO transactions
committed_jdo_transactions;  //#successful JDO transactions
opened_jdo_transactions;  //#attempted JDO transactions

Also to HS2:
open_connections;  //#HMS clients connecting to HMS


This also fixes some minor issues:
1.  For the metrics JSON file reporter, the file system was not chosen correctly.  
This fixes that and also makes the local file system the default.
2.  The metrics were being closed in the wrong place in the metastore; this 
fixes it.


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/common/JvmPauseMonitor.java ec5ac4a 
  common/src/java/org/apache/hadoop/hive/common/metrics/LegacyMetrics.java 
e811339 
  common/src/java/org/apache/hadoop/hive/common/metrics/common/Metrics.java 
27b69cc 
  
common/src/java/org/apache/hadoop/hive/common/metrics/common/MetricsConstant.java
 PRE-CREATION 
  
common/src/java/org/apache/hadoop/hive/common/metrics/common/MetricsVariable.java
 PRE-CREATION 
  
common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/CodahaleMetrics.java
 ae353d0 
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 6d0cf15 
  
common/src/test/org/apache/hadoop/hive/common/metrics/metrics2/TestCodahaleMetrics.java
 954b388 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestMetaStoreMetrics.java
 25f34d1 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
0bcd053 
  metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 4273c0b 
  service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 
dfb7faa 

Diff: https://reviews.apache.org/r/36241/diff/


Testing
---

Added some unit tests, and tested manually for HS2, which cannot be unit-tested.


Thanks,

Szehon Ho

Review Request 36280: HIVE-11196

2015-07-07 Thread Hari Sankar Sivarama Subramaniyan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36280/
---

Review request for hive and John Pullokkaran.


Repository: hive-git


Description
---

Utilities.getPartitionDesc() should try to reuse TableDesc object


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java afecb1e 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Partition.java 08ff2e9 

Diff: https://reviews.apache.org/r/36280/diff/


Testing
---


Thanks,

Hari Sankar Sivarama Subramaniyan

Re: is HiveQA broken?

2015-07-07 Thread Szehon Ho

Sergio has uploaded a new image, we are looking at what happened.  Thanks,
Sergio.

Sergey, it looks like the latest commit (HIVE-11013) doesn't compile as it
was done without tests, can you take a look?

[ERROR] Failed to execute goal
org.apache.maven.plugins:maven-compiler-plugin:2.3.2:compile
(default-compile) on project hive-exec: Compilation failure
[ERROR] 
/data/hive-ptest/working/apache-github-source-source/ql/src/java/org/apache/hadoop/hive/ql/plan/MergeJoinWork.java:[149,38]
error: method getDummyOps() is already defined in class MergeJoinWork


Then we can resume the pre commit tests.

Thanks
Szehon

On Tue, Jul 7, 2015 at 12:13 PM, Szehon Ho sze...@cloudera.com wrote:

 I am looking now into the issue, looks like it is looking for an AMI image
 that is no longer there to spawn the slaves, will try to see what is going
 on.

 Thanks
 Szehon

 On Tue, Jul 7, 2015 at 11:38 AM, Szehon Ho sze...@cloudera.com wrote:

 Yea you are right, somehow the server is not running the tests, hence no
 logs.  I will try to look at this today when I get a chance.

 Sergio can you also take a look if you have cycles?

 Thanks
 Szehon

 On Tue, Jul 7, 2015 at 11:09 AM, Sergey Shelukhin ser...@hortonworks.com
  wrote:

 It looks like it’s downloading test logs from e.g.

 http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4510

 but there are no logs there. It also appears to happen pretty fast, so
 testTailLog does nothing? It either returns prematurely or fails quickly
 w/o logs.

 On 15/7/6, 19:03, Sergey Shelukhin ser...@hortonworks.com wrote:

 Looks like last HiveQA failed again in the same way, not sure if it was
 before or after restart, what time zone is it in?
 May that be related to recent Apache infra build changes?
 
 On 15/7/6, 17:24, Szehon Ho sze...@cloudera.com wrote:
 
 I am not sure what is going on.  Let me restart the Ptest server.
 
 On Mon, Jul 6, 2015 at 1:02 PM, Sergey Shelukhin 
 ser...@hortonworks.com
 wrote:
 
  Looks like all the runs are failing with:
 
  Exception in thread "main" java.lang.RuntimeException: 404 Not Found
  at org.apache.hive.ptest.api.client.PTestClient.downloadTestResults(PTestClient.java:181)
  at org.apache.hive.ptest.api.client.PTestClient.testStart(PTestClient.java:129)
  at org.apache.hive.ptest.api.client.PTestClient.main(PTestClient.java:312)
  + ret=1
  + cd target/
  + [[ -f test-results.tar.gz ]]
  + exit 1
  + rm -f /tmp/tmp.FayzMqy1oP
  Build step 'Execute shell' marked build as failure
  Archiving artifacts
  Recording test results
  Finished: FAILURE

Review Request 36284: HIVE-11197

2015-07-07 Thread Ashutosh Chauhan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36284/
---

Review request for hive and Jesús Camacho Rodríguez.


Bugs: HIVE-11197
https://issues.apache.org/jira/browse/HIVE-11197


Repository: hive-git


Description
---

While extracting join conditions follow Hive rules for type conversion instead 
of Calcite


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveCalciteUtil.java 
024097e 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveRelOptUtil.java 
9ebb24f 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveJoinToMultiJoinRule.java
 c5e0e11 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdSelectivity.java
 960ec40 

Diff: https://reviews.apache.org/r/36284/diff/


Testing
---

Existing tests.


Thanks,

Ashutosh Chauhan

Re: Review Request 35950: HIVE-11131: Get row information on DataWritableWriter once for better writing performance

2015-07-07 Thread cheng xu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35950/#review90838
---

Ship it!


Ship It!

- cheng xu


On July 8, 2015, 12:25 a.m., Sergio Pena wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/35950/
 ---
 
 (Updated July 8, 2015, 12:25 a.m.)
 
 
 Review request for hive, Ryan Blue, cheng xu, and Dong Chen.
 
 
 Bugs: HIVE-11131
 https://issues.apache.org/jira/browse/HIVE-11131
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Implemented data type writers that will be created before the first Hive row 
 is written to Parquet. These writers contain information about object 
 inspectors and schema of a specific data type, and calls the specific 
 add() method used by Parquet for each data type.
 
 
 Diffs
 -
 
   
 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java
  c195c3ec3ddae19bf255fc2c9633f8bf4390f428 
 
 Diff: https://reviews.apache.org/r/35950/diff/
 
 
 Testing
 ---
 
 Tests from TestDataWritableWriter run OK.
 
 I ran other tests with micro-benchmarks and got better results with 
 this new implementation:
 
 Using repeated rows across the file, this is the throughput increase using 1 
 million records:
 
 bigint  boolean double  float   int     string
 7.598   7.491   7.488   7.588   7.53    0.270   (before)
 10.137  11.511  10.155  10.297  10.242  0.286   (after)
 
 Using random rows across the file, this is the throughput increase using 1 
 million records:
 
 bigint  boolean double  float   int     string
 5.268   7.723   4.107   4.173   4.729   0.20    (before)
 6.236   10.466  5.944   4.749   5.234   0.22    (after)
 
 
 Thanks,
 
 Sergio Pena

Re: Review Request 34666: HIVE-9152 - Dynamic Partition Pruning [Spark Branch]

2015-07-07 Thread Chao Sun



 On June 30, 2015, 8:55 p.m., Xuefu Zhang wrote:
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkPartitionPruningSinkOperator.java,
   line 92
  https://reviews.apache.org/r/34666/diff/1/?file=971715#file971715line92
 
  Can we still get conflicts in the file name?
 
 Chao Sun wrote:
 It shouldn't - I think work ID and Random#nextInt() should both be 
 unique, right?
 
 Xuefu Zhang wrote:
 Random.nextInt() doesn't give uniqueness. If targetWorkID/sourceWorkID 
 gives you uniqueness, then you don't need a random number, right? If 
 targetWorkID/sourceWorkID doesn't give uniqueness, then adding a random 
 number doesn't help much.

Yes, targetWorkID/sourceWorkID should be unique, but a single work can have 
multiple tasks, and without the random number their results may overwrite 
each other. We also did the same thing for the hash table sink in Spark, and 
we haven't seen any issue with that.
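
For readers following the thread, here is a rough Java sketch of the naming scheme under discussion. The "dpp-" prefix and the name layout are assumptions made for illustration; they are not the format used by SparkPartitionPruningSinkOperator.

```java
import java.util.Random;

// Hypothetical sketch: the name layout is an assumption, not the operator's format.
final class PruningFileNameSketch {
    /** Builds an output name from the two work IDs plus a random suffix. */
    static String fileName(int sourceWorkId, int targetWorkId, Random random) {
        return "dpp-" + sourceWorkId + "-" + targetWorkId + "-" + random.nextInt();
    }
}
```

Two tasks from the same work share the ID part of the name, so only the suffix tells them apart. As noted earlier in the review, a random suffix makes a collision unlikely but does not guarantee uniqueness.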


- Chao


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34666/#review89826
---


On July 3, 2015, 10:45 p.m., Chao Sun wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/34666/
 ---
 
 (Updated July 3, 2015, 10:45 p.m.)
 
 
 Review request for hive, chengxiang li and Xuefu Zhang.
 
 
 Bugs: HIVE-9152
 https://issues.apache.org/jira/browse/HIVE-9152
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Tez implemented dynamic partition pruning in HIVE-7826. This is a nice 
 optimization and we should implement the same in HOS.
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 43c53fc 
   itests/src/test/resources/testconfiguration.properties 2a5f7e3 
   ql/if/queryplan.thrift c8dfa35 
   ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java 91e8a02 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java 
 21398d8 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkDynamicPartitionPruner.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java 
 e6c845c 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorSparkPartitionPruningSinkOperator.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java 
 1de7e40 
   ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 9d5730d 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
  8546d21 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java ea5efe5 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/RemoveDynamicPruningBySize.java
  4803959 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 
 5f731d7 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkPartitionPruningSinkDesc.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkProcContext.java 
 447f104 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 
 e27ce0d 
   
 ql/src/java/org/apache/hadoop/hive/ql/parse/spark/OptimizeSparkProcContext.java
  f7586a4 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 
 19aae70 
   
 ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkPartitionPruningSinkOperator.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SplitOpTreeForDPP.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java 05a5841 
   ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java aa291b9 
   ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 
 363e49e 
   ql/src/test/queries/clientpositive/spark_dynamic_partition_pruning.q 
 PRE-CREATION 
   ql/src/test/queries/clientpositive/spark_dynamic_partition_pruning_2.q 
 PRE-CREATION 
   ql/src/test/results/clientpositive/spark/bucket2.q.out 89c3b4c 
   ql/src/test/results/clientpositive/spark/bucket3.q.out 2fc4855 
   ql/src/test/results/clientpositive/spark/bucket4.q.out 44e0f9f 
   ql/src/test/results/clientpositive/spark/column_access_stats.q.out 3e16f61 
   ql/src/test/results/clientpositive/spark/limit_partition_metadataonly.q.out 
 e95d2ab 
   ql/src/test/results/clientpositive/spark/list_bucket_dml_2.q.java1.7.out 
 e38ccf8 
   ql/src/test/results/clientpositive/spark/optimize_nullscan.q.out 881f41a 
   ql/src/test/results/clientpositive/spark/pcr.q.out 4c22f0b 
   ql/src/test/results/clientpositive/spark/sample3.q.out 2fe6b0d 
   ql/src/test/results/clientpositive/spark/sample9.q.out c9823f7 
   ql/src/test/results/clientpositive/spark/smb_mapjoin_11.q.out c3f996f 
   
 ql/src/test/results/clientpositive/spark/spark_dynamic_partition_pruning.q.out
  PRE-CREATION

[jira] [Created] (HIVE-11197) While extracting join conditions follow Hive rules for type conversion instead of Calcite

2015-07-07 Thread Ashutosh Chauhan (JIRA)

Ashutosh Chauhan created HIVE-11197:
---

 Summary: While extracting join conditions follow Hive rules for 
type conversion instead of Calcite
 Key: HIVE-11197
 URL: https://issues.apache.org/jira/browse/HIVE-11197
 Project: Hive
  Issue Type: Bug
  Components: CBO
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan


Calcite's strict type system throws an exception in these cases, which are 
legal in Hive.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Re: Review Request 36253: HIVE-11171

2015-07-07 Thread Jesús Camacho Rodríguez


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36253/#review90686
---

Ship it!


Ship It!

- Jesús Camacho Rodríguez


On July 7, 2015, 10:09 a.m., Ashutosh Chauhan wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/36253/
 ---
 
 (Updated July 7, 2015, 10:09 a.m.)
 
 
 Review request for hive, Jesús Camacho Rodríguez and John Pullokkaran.
 
 
 Bugs: HIVE-11171
 https://issues.apache.org/jira/browse/HIVE-11171
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Join reordering algorithm might introduce projects between joins
 
 
 Diffs
 -
 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveJoinCommuteRule.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java 7fd8c85 
   ql/src/test/results/clientpositive/auto_join12.q.out e97d7e6 
   ql/src/test/results/clientpositive/auto_join5.q.out 69b7aab 
   ql/src/test/results/clientpositive/constantPropagateForSubQuery.q.out 
 40d2dd4 
   ql/src/test/results/clientpositive/correlationoptimizer15.q.out d5f45da 
   ql/src/test/results/clientpositive/correlationoptimizer6.q.out 85e447c 
   ql/src/test/results/clientpositive/join12.q.out df340a8 
   ql/src/test/results/clientpositive/join5.q.out f83ff73 
   ql/src/test/results/clientpositive/join_merge_multi_expressions.q.out 
 b73643e 
   ql/src/test/results/clientpositive/join_merging.q.out a3afbec 
   ql/src/test/results/clientpositive/join_nulls.q.out 46e0170 
   ql/src/test/results/clientpositive/limit_pushdown.q.out 6ace047 
   ql/src/test/results/clientpositive/lineage3.q.out 5c392fa 
   ql/src/test/results/clientpositive/louter_join_ppr.q.out c46792b 
   ql/src/test/results/clientpositive/optional_outer.q.out 8616644 
   ql/src/test/results/clientpositive/outer_join_ppr.q.java1.7.out b6ca16f 
   ql/src/test/results/clientpositive/ppd_gby_join.q.out 579c827 
   ql/src/test/results/clientpositive/ppd_join.q.out ae5fb27 
   ql/src/test/results/clientpositive/ppd_join2.q.out 88624ea 
   ql/src/test/results/clientpositive/ppd_join3.q.out 6c5c0da 
   ql/src/test/results/clientpositive/ppd_outer_join4.q.out 18e0154 
   ql/src/test/results/clientpositive/ppd_random.q.out 5e23a1c 
   ql/src/test/results/clientpositive/rcfile_null_value.q.out a1e6f4f 
   ql/src/test/results/clientpositive/router_join_ppr.q.out 70d7542 
   ql/src/test/results/clientpositive/skewjoin.q.out 1b56d5f 
   ql/src/test/results/clientpositive/spark/auto_join12.q.out 1bddcfc 
   ql/src/test/results/clientpositive/spark/auto_join5.q.out 08e9ae6 
   ql/src/test/results/clientpositive/spark/join12.q.out b62334d 
   ql/src/test/results/clientpositive/spark/join5.q.out 1914ad0 
   ql/src/test/results/clientpositive/spark/join_merge_multi_expressions.q.out 
 a18d82e 
   ql/src/test/results/clientpositive/spark/join_merging.q.out ce26543 
   ql/src/test/results/clientpositive/spark/limit_pushdown.q.out 7101beb 
   ql/src/test/results/clientpositive/spark/louter_join_ppr.q.out 190485f 
   ql/src/test/results/clientpositive/spark/outer_join_ppr.q.java1.7.out 
 0614cb9 
   ql/src/test/results/clientpositive/spark/ppd_gby_join.q.out 31b25b3 
   ql/src/test/results/clientpositive/spark/ppd_join.q.out b97431c 
   ql/src/test/results/clientpositive/spark/ppd_join2.q.out cf81423 
   ql/src/test/results/clientpositive/spark/ppd_join3.q.out d2343c4 
   ql/src/test/results/clientpositive/spark/ppd_outer_join4.q.out 2abca78 
   ql/src/test/results/clientpositive/spark/router_join_ppr.q.out 12f1abb 
   ql/src/test/results/clientpositive/spark/skewjoin.q.out ec74786 
   ql/src/test/results/clientpositive/tez/explainuser_1.q.out 9f93574 
   ql/src/test/results/clientpositive/tez/limit_pushdown.q.out 2a41aae 
   ql/src/test/results/clientpositive/tez/mrr.q.out d42f9b0 
   ql/src/test/results/clientpositive/tez/skewjoin.q.out ec368f9 
   ql/src/test/results/clientpositive/tez/tez_union.q.out 4012b90 
 
 Diff: https://reviews.apache.org/r/36253/diff/
 
 
 Testing
 ---
 
 Existing tests
 
 
 Thanks,
 
 Ashutosh Chauhan

Review Request 36253: HIVE-11171

2015-07-07 Thread Ashutosh Chauhan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36253/
---

Review request for hive, Jesús Camacho Rodríguez and John Pullokkaran.


Bugs: HIVE-11171
https://issues.apache.org/jira/browse/HIVE-11171


Repository: hive-git


Description
---

Join reordering algorithm might introduce projects between joins


Diffs
-

  
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveJoinCommuteRule.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java 7fd8c85 
  ql/src/test/results/clientpositive/auto_join12.q.out e97d7e6 
  ql/src/test/results/clientpositive/auto_join5.q.out 69b7aab 
  ql/src/test/results/clientpositive/constantPropagateForSubQuery.q.out 40d2dd4 
  ql/src/test/results/clientpositive/correlationoptimizer15.q.out d5f45da 
  ql/src/test/results/clientpositive/correlationoptimizer6.q.out 85e447c 
  ql/src/test/results/clientpositive/join12.q.out df340a8 
  ql/src/test/results/clientpositive/join5.q.out f83ff73 
  ql/src/test/results/clientpositive/join_merge_multi_expressions.q.out b73643e 
  ql/src/test/results/clientpositive/join_merging.q.out a3afbec 
  ql/src/test/results/clientpositive/join_nulls.q.out 46e0170 
  ql/src/test/results/clientpositive/limit_pushdown.q.out 6ace047 
  ql/src/test/results/clientpositive/lineage3.q.out 5c392fa 
  ql/src/test/results/clientpositive/louter_join_ppr.q.out c46792b 
  ql/src/test/results/clientpositive/optional_outer.q.out 8616644 
  ql/src/test/results/clientpositive/outer_join_ppr.q.java1.7.out b6ca16f 
  ql/src/test/results/clientpositive/ppd_gby_join.q.out 579c827 
  ql/src/test/results/clientpositive/ppd_join.q.out ae5fb27 
  ql/src/test/results/clientpositive/ppd_join2.q.out 88624ea 
  ql/src/test/results/clientpositive/ppd_join3.q.out 6c5c0da 
  ql/src/test/results/clientpositive/ppd_outer_join4.q.out 18e0154 
  ql/src/test/results/clientpositive/ppd_random.q.out 5e23a1c 
  ql/src/test/results/clientpositive/rcfile_null_value.q.out a1e6f4f 
  ql/src/test/results/clientpositive/router_join_ppr.q.out 70d7542 
  ql/src/test/results/clientpositive/skewjoin.q.out 1b56d5f 
  ql/src/test/results/clientpositive/spark/auto_join12.q.out 1bddcfc 
  ql/src/test/results/clientpositive/spark/auto_join5.q.out 08e9ae6 
  ql/src/test/results/clientpositive/spark/join12.q.out b62334d 
  ql/src/test/results/clientpositive/spark/join5.q.out 1914ad0 
  ql/src/test/results/clientpositive/spark/join_merge_multi_expressions.q.out a18d82e 
  ql/src/test/results/clientpositive/spark/join_merging.q.out ce26543 
  ql/src/test/results/clientpositive/spark/limit_pushdown.q.out 7101beb 
  ql/src/test/results/clientpositive/spark/louter_join_ppr.q.out 190485f 
  ql/src/test/results/clientpositive/spark/outer_join_ppr.q.java1.7.out 0614cb9 
  ql/src/test/results/clientpositive/spark/ppd_gby_join.q.out 31b25b3 
  ql/src/test/results/clientpositive/spark/ppd_join.q.out b97431c 
  ql/src/test/results/clientpositive/spark/ppd_join2.q.out cf81423 
  ql/src/test/results/clientpositive/spark/ppd_join3.q.out d2343c4 
  ql/src/test/results/clientpositive/spark/ppd_outer_join4.q.out 2abca78 
  ql/src/test/results/clientpositive/spark/router_join_ppr.q.out 12f1abb 
  ql/src/test/results/clientpositive/spark/skewjoin.q.out ec74786 
  ql/src/test/results/clientpositive/tez/explainuser_1.q.out 9f93574 
  ql/src/test/results/clientpositive/tez/limit_pushdown.q.out 2a41aae 
  ql/src/test/results/clientpositive/tez/mrr.q.out d42f9b0 
  ql/src/test/results/clientpositive/tez/skewjoin.q.out ec368f9 
  ql/src/test/results/clientpositive/tez/tez_union.q.out 4012b90 

Diff: https://reviews.apache.org/r/36253/diff/


Testing
---

Existing tests


Thanks,

Ashutosh Chauhan

[jira] [Created] (HIVE-11192) Wrong results for query with WHERE ... NOT IN when table has null values

2015-07-07 Thread Furcy Pin (JIRA)

Furcy Pin created HIVE-11192:


         Summary: Wrong results for query with WHERE ... NOT IN when table has null values
             Key: HIVE-11192
             URL: https://issues.apache.org/jira/browse/HIVE-11192
         Project: Hive
      Issue Type: Bug
Affects Versions: 1.1.0, 1.2.1
     Environment: Hive on MR
        Reporter: Furcy Pin


I tested this on a cdh5.4.2 cluster and locally on the release-1.2.1 branch.


```sql
DROP TABLE IF EXISTS test1 ;
DROP TABLE IF EXISTS test2 ;

CREATE TABLE test1 (col1 STRING) ;
INSERT INTO TABLE test1 VALUES (1), (2), (3), (4) ;

CREATE TABLE test2 (col1 STRING) ;
INSERT INTO TABLE test2 VALUES (1), (4), (NULL) ;

SELECT 
COUNT(1)
FROM test1 T1
WHERE T1.col1 NOT IN (SELECT col1 FROM test2)
;

SELECT 
COUNT(1)
FROM test1 T1
WHERE T1.col1 NOT IN (SELECT col1 FROM test2 WHERE col1 IS NOT NULL)
;
```

The first query returns 0 and the second returns 2.
Obviously, the expected answer is always 2.
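
For comparison, the two queries can be replayed outside Hive. The sketch below assumes SQLite (through Python's standard sqlite3 module) as a stand-in engine and TEXT columns in place of STRING; it says nothing about Hive's own planner. Under SQL three-valued logic, `x NOT IN (...)` is never TRUE when the list contains a NULL, so an engine that follows that rule returns 0 for the first query and 2 for the second.

```python
import sqlite3


def not_in_counts():
    """Replay the two COUNT queries from the report on an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Same data as the report; TEXT stands in for Hive's STRING type.
    cur.execute("CREATE TABLE test1 (col1 TEXT)")
    cur.executemany("INSERT INTO test1 VALUES (?)", [("1",), ("2",), ("3",), ("4",)])
    cur.execute("CREATE TABLE test2 (col1 TEXT)")
    cur.executemany("INSERT INTO test2 VALUES (?)", [("1",), ("4",), (None,)])

    # Subquery result contains a NULL: no row satisfies NOT IN.
    with_null = cur.execute(
        "SELECT COUNT(1) FROM test1 T1 "
        "WHERE T1.col1 NOT IN (SELECT col1 FROM test2)"
    ).fetchone()[0]

    # NULL filtered out of the subquery: rows '2' and '3' qualify.
    without_null = cur.execute(
        "SELECT COUNT(1) FROM test1 T1 "
        "WHERE T1.col1 NOT IN (SELECT col1 FROM test2 WHERE col1 IS NOT NULL)"
    ).fetchone()[0]

    conn.close()
    return with_null, without_null


print(not_in_counts())  # (0, 2)
```

If the intent is "rows of test1 with no matching row in test2", filtering NULLs in the subquery (as in the second query) or using NOT EXISTS expresses that directly.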





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Re: Review Request 36156: HIVE-11053: Add more tests for HIVE-10844[Spark Branch]

2015-07-07 Thread lun gao


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36156/
---

(Updated July 8, 2015, 2:13 a.m.)


Review request for hive and chengxiang li.


Changes
---

Add the output file generated by TestCliDriver and modify the configuration file.


Bugs: HIVE-11053
https://issues.apache.org/jira/browse/HIVE-11053


Repository: hive-git


Description
---

Add some test cases for self-union, self-join, CWE, and repeated sub-queries to 
verify the combining of equivalent works introduced in HIVE-10844.


Diffs (updated)
-

  itests/src/test/resources/testconfiguration.properties 4f2de12 
  ql/src/test/queries/clientpositive/dynamic_rdd_cache.q PRE-CREATION 
  ql/src/test/results/clientpositive/dynamic_rdd_cache.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/spark/dynamic_rdd_cache.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/36156/diff/


Testing
---


Thanks,

lun gao

Re: Review Request 36241: HIVE-10927 : Add number of HMS connection metrics

2015-07-07 Thread Jimmy Xiang


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36241/#review90856
---

Ship it!



- Jimmy Xiang


On July 7, 2015, 9:28 p.m., Szehon Ho wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/36241/
 ---
 
 (Updated July 7, 2015, 9:28 p.m.)
 
 
 Review request for hive and Jimmy Xiang.
 
 
 Bugs: HIVE-10927
 https://issues.apache.org/jira/browse/HIVE-10927
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Adds the following new metrics to HMS; they can be renamed if necessary.
 
 open_connections;  //#HMS clients connecting to HMS
 
 active_jdo_transactions;  //#active JDO transactions
 rollbacked_jdo_transactions;  //#failed JDO transactions
 committed_jdo_transactions;  //#successful JDO transactions
 opened_jdo_transactions;  //#attempted JDO transactions
 
 Also to HS2:
 open_connections;  //#HMS clients connecting to HMS
 
 
 This also fixes some minor issues:
 1.  For the metrics JSON file reporter, the file system was not chosen 
 correctly. This fixes that and makes the local file system the default.
 2.  The metrics were being closed in the wrong place in the metastore; this 
 fixes that.
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/common/JvmPauseMonitor.java ec5ac4a 
   common/src/java/org/apache/hadoop/hive/common/metrics/LegacyMetrics.java 
 e811339 
   common/src/java/org/apache/hadoop/hive/common/metrics/common/Metrics.java 
 27b69cc 
   
 common/src/java/org/apache/hadoop/hive/common/metrics/common/MetricsConstant.java
  PRE-CREATION 
   
 common/src/java/org/apache/hadoop/hive/common/metrics/common/MetricsVariable.java
  PRE-CREATION 
   
 common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/CodahaleMetrics.java
  ae353d0 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 6d0cf15 
   
 common/src/test/org/apache/hadoop/hive/common/metrics/metrics2/TestCodahaleMetrics.java
  954b388 
   
 itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestMetaStoreMetrics.java
  25f34d1 
   metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
 0bcd053 
   metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
 4273c0b 
   service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 
 dfb7faa 
 
 Diff: https://reviews.apache.org/r/36241/diff/
 
 
 Testing
 ---
 
 Added some unit tests, and tested HS2 manually, since it cannot be 
 unit-tested.
 
 
 Thanks,
 
 Szehon Ho

[jira] [Created] (HIVE-11198) Fix load data query file format check for partitioned tables

2015-07-07 Thread Prasanth Jayachandran (JIRA)

Prasanth Jayachandran created HIVE-11198:


         Summary: Fix load data query file format check for partitioned tables
             Key: HIVE-11198
             URL: https://issues.apache.org/jira/browse/HIVE-11198
         Project: Hive
      Issue Type: Bug
Affects Versions: 2.0.0
        Reporter: Prasanth Jayachandran
        Assignee: Prasanth Jayachandran
     Attachments: HIVE-11198.patch

HIVE-8 added a file format check for the ORC format. The check throws an 
exception when a non-ORC file is loaded into an ORC managed table, but it does 
not work for partitioned tables. Partitioned tables are allowed to have some 
partitions with a different file format. See this discussion for more details:
https://issues.apache.org/jira/browse/HIVE-8?focusedCommentId=14617271&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14617271




Re: Review Request 36156: HIVE-11053: Add more tests for HIVE-10844[Spark Branch]

2015-07-07 Thread lun gao


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36156/
---

(Updated July 8, 2015, 3:05 a.m.)


Review request for hive and chengxiang li.


Changes
---

Delete the unnecessary trailing space.


Bugs: HIVE-11053
https://issues.apache.org/jira/browse/HIVE-11053


Repository: hive-git


Description
---

Add some test cases for self-union, self-join, CWE, and repeated sub-queries to 
verify the combining of equivalent works introduced in HIVE-10844.


Diffs (updated)
-

  itests/src/test/resources/testconfiguration.properties 4f2de12 
  ql/src/test/queries/clientpositive/dynamic_rdd_cache.q PRE-CREATION 
  ql/src/test/results/clientpositive/dynamic_rdd_cache.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/spark/dynamic_rdd_cache.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/36156/diff/


Testing
---


Thanks,

lun gao

Re: Review Request 36156: HIVE-11053: Add more tests for HIVE-10844[Spark Branch]

2015-07-07 Thread cheng xu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36156/#review90854
---



ql/src/test/queries/clientpositive/dynamic_rdd_cache.q (line 6)
https://reviews.apache.org/r/36156/#comment144012

Minor issue: Please remove the trailing space in the qfile. Thank you!


- cheng xu


On July 8, 2015, 10:13 a.m., lun gao wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/36156/
 ---
 
 (Updated July 8, 2015, 10:13 a.m.)
 
 
 Review request for hive and chengxiang li.
 
 
 Bugs: HIVE-11053
 https://issues.apache.org/jira/browse/HIVE-11053
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Add some test cases for self-union, self-join, CWE, and repeated sub-queries 
 to verify the combining of equivalent works introduced in HIVE-10844.
 
 
 Diffs
 -
 
   itests/src/test/resources/testconfiguration.properties 4f2de12 
   ql/src/test/queries/clientpositive/dynamic_rdd_cache.q PRE-CREATION 
   ql/src/test/results/clientpositive/dynamic_rdd_cache.q.out PRE-CREATION 
   ql/src/test/results/clientpositive/spark/dynamic_rdd_cache.q.out 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/36156/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 lun gao

Re: Review Request 36156: HIVE-11053: Add more tests for HIVE-10844[Spark Branch]

2015-07-07 Thread chengxiang li


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36156/#review90859
---

Ship it!



- chengxiang li


On July 8, 2015, 3:05 a.m., lun gao wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/36156/
 ---
 
 (Updated July 8, 2015, 3:05 a.m.)
 
 
 Review request for hive and chengxiang li.
 
 
 Bugs: HIVE-11053
 https://issues.apache.org/jira/browse/HIVE-11053
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Add some test cases for self-union, self-join, CWE, and repeated sub-queries 
 to verify the combining of equivalent works introduced in HIVE-10844.
 
 
 Diffs
 -
 
   itests/src/test/resources/testconfiguration.properties 4f2de12 
   ql/src/test/queries/clientpositive/dynamic_rdd_cache.q PRE-CREATION 
   ql/src/test/results/clientpositive/dynamic_rdd_cache.q.out PRE-CREATION 
   ql/src/test/results/clientpositive/spark/dynamic_rdd_cache.q.out 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/36156/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 lun gao

[jira] [Created] (HIVE-11199) LLAP: merge master into branch

2015-07-07 Thread Sergey Shelukhin (JIRA)

Sergey Shelukhin created HIVE-11199:
---

         Summary: LLAP: merge master into branch
             Key: HIVE-11199
             URL: https://issues.apache.org/jira/browse/HIVE-11199
         Project: Hive
      Issue Type: Sub-task
        Reporter: Sergey Shelukhin
        Assignee: Sergey Shelukhin
         Fix For: llap







Review Request 36259: rearrange code

2015-07-07 Thread Hyunjin Kim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36259/
---

Review request for hive and James Taylor.


Repository: hive-git


Description
---

Delete unnecessary space.


Diffs
-

  cli/src/test/org/apache/hadoop/hive/cli/TestRCFileCat.java 11ceb310ec0916002a1e206a767d2cc28330334b 

Diff: https://reviews.apache.org/r/36259/diff/


Testing
---


Thanks,

Hyunjin Kim

Re: Review Request 34666: HIVE-9152 - Dynamic Partition Pruning [Spark Branch]

2015-07-07 Thread Chao Sun



 On July 6, 2015, 10:26 p.m., Xuefu Zhang wrote:
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorSparkPartitionPruningSinkOperator.java,
   line 77
  https://reviews.apache.org/r/34666/diff/2/?file=999023#file999023line77
 
  I guess I don't know enough to comment on this, but looking at 
  VectorReduceSinkOperator and VectorAppMasterEventOperator I can see some 
  prominent differences:
  
  1. first batch detection and processing there
  2. VectorizedSerde logic here
  
  Probably a live review will help.

Chatted with Xuefu offline and we've cleared up some doubts here.


- Chao


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34666/#review90593
---


On July 3, 2015, 10:45 p.m., Chao Sun wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/34666/
 ---
 
 (Updated July 3, 2015, 10:45 p.m.)
 
 
 Review request for hive, chengxiang li and Xuefu Zhang.
 
 
 Bugs: HIVE-9152
 https://issues.apache.org/jira/browse/HIVE-9152
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Tez implemented dynamic partition pruning in HIVE-7826. This is a nice 
 optimization and we should implement the same in HOS.
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 43c53fc 
   itests/src/test/resources/testconfiguration.properties 2a5f7e3 
   ql/if/queryplan.thrift c8dfa35 
   ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java 91e8a02 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java 
 21398d8 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkDynamicPartitionPruner.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java 
 e6c845c 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorSparkPartitionPruningSinkOperator.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java 
 1de7e40 
   ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 9d5730d 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
  8546d21 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java ea5efe5 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/RemoveDynamicPruningBySize.java
  4803959 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 
 5f731d7 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkPartitionPruningSinkDesc.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkProcContext.java 
 447f104 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 
 e27ce0d 
   
 ql/src/java/org/apache/hadoop/hive/ql/parse/spark/OptimizeSparkProcContext.java
  f7586a4 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 
 19aae70 
   
 ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkPartitionPruningSinkOperator.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SplitOpTreeForDPP.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java 05a5841 
   ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java aa291b9 
   ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 
 363e49e 
   ql/src/test/queries/clientpositive/spark_dynamic_partition_pruning.q 
 PRE-CREATION 
   ql/src/test/queries/clientpositive/spark_dynamic_partition_pruning_2.q 
 PRE-CREATION 
   ql/src/test/results/clientpositive/spark/bucket2.q.out 89c3b4c 
   ql/src/test/results/clientpositive/spark/bucket3.q.out 2fc4855 
   ql/src/test/results/clientpositive/spark/bucket4.q.out 44e0f9f 
   ql/src/test/results/clientpositive/spark/column_access_stats.q.out 3e16f61 
   ql/src/test/results/clientpositive/spark/limit_partition_metadataonly.q.out 
 e95d2ab 
   ql/src/test/results/clientpositive/spark/list_bucket_dml_2.q.java1.7.out 
 e38ccf8 
   ql/src/test/results/clientpositive/spark/optimize_nullscan.q.out 881f41a 
   ql/src/test/results/clientpositive/spark/pcr.q.out 4c22f0b 
   ql/src/test/results/clientpositive/spark/sample3.q.out 2fe6b0d 
   ql/src/test/results/clientpositive/spark/sample9.q.out c9823f7 
   ql/src/test/results/clientpositive/spark/smb_mapjoin_11.q.out c3f996f 
   
 ql/src/test/results/clientpositive/spark/spark_dynamic_partition_pruning.q.out
  PRE-CREATION 
   
 ql/src/test/results/clientpositive/spark/spark_dynamic_partition_pruning_2.q.out
  PRE-CREATION 
   ql/src/test/results/clientpositive/spark/temp_table.q.out 16d663d 
   ql/src/test/results/clientpositive/spark/udf_example_add.q.out 7916679 
   ql/src/test/results/clientpositive/spark/udf_in_file.q.out c769d1f

Re: Review Request 34666: HIVE-9152 - Dynamic Partition Pruning [Spark Branch]

2015-07-07 Thread Chao Sun



 On May 27, 2015, 6:52 p.m., Xuefu Zhang wrote:
  ql/if/queryplan.thrift, line 60
  https://reviews.apache.org/r/34666/diff/1/?file=971689#file971689line60
 
  I'm not sure if it matters, but it's probably better if we add it as 
  the last.
 
 Xuefu Zhang wrote:
 It still needs to be moved to the end, as others also pointed out.

OK, fixed. Sorry I forgot last time.


 On May 27, 2015, 6:52 p.m., Xuefu Zhang wrote:
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java, line 
  177
  https://reviews.apache.org/r/34666/diff/1/?file=971700#file971700line177
 
  Any chance that an op might be visited multiple times?
 
 Chao Sun wrote:
 It shouldn't - it's a tree traversal and every operator should only be 
 added once.
 
 Xuefu Zhang wrote:
 Actually there could be a diamond shape in the operator graph, such as 
 that formed by demux and mux operators. Join operator is another example. We 
 should use graph traversal instead of tree traversal.

Yes you're right. However, the only usage for this is in SplitOpTreeForDPP, in 
which we pass a set as parameter. So it should only contain unique operators.
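
To make the diamond case concrete, here is a minimal sketch of the two traversal styles. It is not the SparkUtilities code; the operator names and the `children` map are hypothetical. It shows that a tree-style walk reaches a shared operator once per path, that collecting into a set hides the duplicates in the result, and that a visited check avoids the repeated walk altogether.

```python
def collect_tree_style(root, children):
    """Walk every path from root; an operator reachable by two paths is listed twice."""
    visited = [root]
    for child in children.get(root, []):
        visited.extend(collect_tree_style(child, children))
    return visited


def collect_graph_style(root, children, seen):
    """Skip operators already in `seen`, so each one is visited exactly once."""
    if root in seen:
        return
    seen.add(root)
    for child in children.get(root, []):
        collect_graph_style(child, children, seen)


# Hypothetical diamond: DEMUX feeds two branches that meet again at MUX.
DIAMOND = {
    "DEMUX": ["GBY_1", "GBY_2"],
    "GBY_1": ["MUX"],
    "GBY_2": ["MUX"],
    "MUX": [],
}

tree_result = collect_tree_style("DEMUX", DIAMOND)
print(tree_result.count("MUX"))   # 2: MUX is reached through both branches
print(len(set(tree_result)))      # 4: a set removes the duplicate from the result

seen = set()
collect_graph_style("DEMUX", DIAMOND, seen)
print(len(seen))                  # 4: each operator visited once
```

Passing a set keeps the result unique either way; the visited check additionally avoids re-walking everything below the shared operator.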


- Chao


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34666/#review85230
---


On July 3, 2015, 10:45 p.m., Chao Sun wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/34666/
 ---
 
 (Updated July 3, 2015, 10:45 p.m.)
 
 
 Review request for hive, chengxiang li and Xuefu Zhang.
 
 
 Bugs: HIVE-9152
 https://issues.apache.org/jira/browse/HIVE-9152
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Tez implemented dynamic partition pruning in HIVE-7826. This is a nice 
 optimization and we should implement the same in HOS.
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 43c53fc 
   itests/src/test/resources/testconfiguration.properties 2a5f7e3 
   ql/if/queryplan.thrift c8dfa35 
   ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java 91e8a02 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java 
 21398d8 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkDynamicPartitionPruner.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java 
 e6c845c 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorSparkPartitionPruningSinkOperator.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java 
 1de7e40 
   ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 9d5730d 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
  8546d21 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java ea5efe5 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/RemoveDynamicPruningBySize.java
  4803959 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 
 5f731d7 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkPartitionPruningSinkDesc.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkProcContext.java 
 447f104 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 
 e27ce0d 
   
 ql/src/java/org/apache/hadoop/hive/ql/parse/spark/OptimizeSparkProcContext.java
  f7586a4 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 
 19aae70 
   
 ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkPartitionPruningSinkOperator.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SplitOpTreeForDPP.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java 05a5841 
   ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java aa291b9 
   ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 
 363e49e 
   ql/src/test/queries/clientpositive/spark_dynamic_partition_pruning.q 
 PRE-CREATION 
   ql/src/test/queries/clientpositive/spark_dynamic_partition_pruning_2.q 
 PRE-CREATION 
   ql/src/test/results/clientpositive/spark/bucket2.q.out 89c3b4c 
   ql/src/test/results/clientpositive/spark/bucket3.q.out 2fc4855 
   ql/src/test/results/clientpositive/spark/bucket4.q.out 44e0f9f 
   ql/src/test/results/clientpositive/spark/column_access_stats.q.out 3e16f61 
   ql/src/test/results/clientpositive/spark/limit_partition_metadataonly.q.out 
 e95d2ab 
   ql/src/test/results/clientpositive/spark/list_bucket_dml_2.q.java1.7.out 
 e38ccf8 
   ql/src/test/results/clientpositive/spark/optimize_nullscan.q.out 881f41a 
   ql/src/test/results/clientpositive/spark/pcr.q.out 4c22f0b 
   ql/src/test/results/clientpositive/spark/sample3.q.out 2fe6b0d 
   ql/src/test/results/clientpositive/spark/sample9.q.out c9823f7

Re: Review Request 34666: HIVE-9152 - Dynamic Partition Pruning [Spark Branch]

2015-07-07 Thread Chao Sun



 On July 2, 2015, 6:36 a.m., chengxiang li wrote:
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/SparkRemoveDynamicPruningBySize.java,
   line 59
  https://reviews.apache.org/r/34666/diff/1/?file=971706#file971706line59
 
  The statistics should be quite inaccurate after filter and group by, as 
  they are computed from estimates at compile time. I think threshold 
  verification on inaccurate data is unacceptable, as it means the 
  threshold may not work at all.
  We could check this threshold in SparkPartitionPruningSinkOperator at 
  runtime.
 
 Chao Sun wrote:
 Switching to runtime would be very different - here we want to check this 
 threshold, and avoid generating the pruning task if possible.
 How inaccurate would the stats be? I'm fine if it's always more 
 conservative.
 
 chengxiang li wrote:
 Take FilterOperator for example: in the worst case, it may simply halve the 
 input rows as its statistic (you can find the rule for FilterOperator in 
 FilterStatsRule). So the bad news is that the estimated statistics are not 
 always conservative, which means the threshold sometimes does not work as 
 expected. You may create follow-up work for this if it changes a lot.

OK, that makes sense. I think we can address this issue in the follow-up JIRA.
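
For the follow-up, a small sketch of the concern. It is not Hive code: the function names, the row counts, and the threshold are all hypothetical, and the halving rule is only the worst case described above. It shows that if the filter actually passes most of its input, the compile-time estimate undercounts and the threshold check lets the pruning branch through.

```python
def estimated_filter_rows(input_rows):
    """Compile-time guess assumed here: the filter keeps half of its input."""
    return input_rows // 2


def within_threshold(rows, max_rows):
    """Hypothetical size check: the pruning branch is kept only if rows <= max_rows."""
    return rows <= max_rows


input_rows = 1000
max_rows = 600                                  # hypothetical threshold
estimate = estimated_filter_rows(input_rows)    # 500
actual = 900                                    # hypothetical: filter passes 90% at runtime

print(within_threshold(estimate, max_rows))     # True: pruning kept at compile time
print(within_threshold(actual, max_rows))       # False: the real output exceeds the threshold
```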


- Chao


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34666/#review90197
---


On July 3, 2015, 10:45 p.m., Chao Sun wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/34666/
 ---
 
 (Updated July 3, 2015, 10:45 p.m.)
 
 
 Review request for hive, chengxiang li and Xuefu Zhang.
 
 
 Bugs: HIVE-9152
 https://issues.apache.org/jira/browse/HIVE-9152
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Tez implemented dynamic partition pruning in HIVE-7826. This is a nice 
 optimization and we should implement the same in HOS.
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 43c53fc 
   itests/src/test/resources/testconfiguration.properties 2a5f7e3 
   ql/if/queryplan.thrift c8dfa35 
   ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java 91e8a02 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java 
 21398d8 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkDynamicPartitionPruner.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java 
 e6c845c 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorSparkPartitionPruningSinkOperator.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java 
 1de7e40 
   ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 9d5730d 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
  8546d21 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java ea5efe5 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/RemoveDynamicPruningBySize.java
  4803959 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 
 5f731d7 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkPartitionPruningSinkDesc.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkProcContext.java 
 447f104 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 
 e27ce0d 
   
 ql/src/java/org/apache/hadoop/hive/ql/parse/spark/OptimizeSparkProcContext.java
  f7586a4 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 
 19aae70 
   
 ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkPartitionPruningSinkOperator.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SplitOpTreeForDPP.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java 05a5841 
   ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java aa291b9 
   ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 
 363e49e 
   ql/src/test/queries/clientpositive/spark_dynamic_partition_pruning.q 
 PRE-CREATION 
   ql/src/test/queries/clientpositive/spark_dynamic_partition_pruning_2.q 
 PRE-CREATION 
   ql/src/test/results/clientpositive/spark/bucket2.q.out 89c3b4c 
   ql/src/test/results/clientpositive/spark/bucket3.q.out 2fc4855 
   ql/src/test/results/clientpositive/spark/bucket4.q.out 44e0f9f 
   ql/src/test/results/clientpositive/spark/column_access_stats.q.out 3e16f61 
   ql/src/test/results/clientpositive/spark/limit_partition_metadataonly.q.out 
 e95d2ab 
   ql/src/test/results/clientpositive/spark/list_bucket_dml_2.q.java1.7.out 
 e38ccf8 
   ql/src/test/results/clientpositive/spark/optimize_nullscan.q.out 881f41a

Re: Review Request 34666: HIVE-9152 - Dynamic Partition Pruning [Spark Branch]

2015-07-07 Thread Chao Sun



 On June 30, 2015, 8:55 p.m., Xuefu Zhang wrote:
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkPartitionPruningSinkOperator.java,
   line 92
  https://reviews.apache.org/r/34666/diff/1/?file=971715#file971715line92
 
  Can we still get conflicts in the file name?
 
 Chao Sun wrote:
 It shouldn't - I think work ID and Random#nextInt() should both be 
 unique, right?
 
 Xuefu Zhang wrote:
 Random.nextInt() doesn't give uniqueness. If targetWorkID/sourceWorkID 
 gives you uniqueness, then you don't need a random number, right? If 
 targetWorkID/sourceWorkID doesn't give uniqueness, then adding a random 
 number doesn't help much.
 
 Chao Sun wrote:
 Yes, targetWorkID/sourceWorkID should be unique, but there could be 
 multiple tasks from a single work, and if we don't have the random number, 
 their results may overwrite each other. We also did the same thing for the 
 hash table sink in Spark, and we haven't seen any issue with that.

targetWorkID/sourceWorkID are unique. We need random number because we could 
have multiple tasks for a particular work, in which case they may overwrite 
each other's file.
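
A minimal sketch of the trade-off, not the operator's actual naming code: the name format, the `bound` parameter, and `task_id` are hypothetical. A random suffix makes collisions between tasks of the same work unlikely, not impossible, while a suffix that is unique per task rules them out. The range is shrunk here so the effect is visible; with a full 32-bit range a collision among a handful of tasks is very unlikely.

```python
import random


def random_suffix_name(target_work_id, source_work_id, rng, bound):
    """File name with a random suffix drawn from [0, bound)."""
    return f"{target_work_id}-{source_work_id}-{rng.randrange(bound)}"


def task_suffix_name(target_work_id, source_work_id, task_id):
    """File name with a suffix that is unique per task by construction."""
    return f"{target_work_id}-{source_work_id}-{task_id}"


def count_collisions(names):
    """Number of names that would overwrite an earlier file."""
    return len(names) - len(set(names))


rng = random.Random()

# 11 tasks of the same work drawing from only 10 suffixes must collide at least once.
random_names = [random_suffix_name(7, 3, rng, bound=10) for _ in range(11)]
print(count_collisions(random_names) >= 1)   # True

# A per-task identifier never collides within one work.
task_names = [task_suffix_name(7, 3, task_id) for task_id in range(11)]
print(count_collisions(task_names))          # 0
```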


- Chao


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34666/#review89826
---


On July 3, 2015, 10:45 p.m., Chao Sun wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/34666/
 ---
 
 (Updated July 3, 2015, 10:45 p.m.)
 
 
 Review request for hive, chengxiang li and Xuefu Zhang.
 
 
 Bugs: HIVE-9152
 https://issues.apache.org/jira/browse/HIVE-9152
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Tez implemented dynamic partition pruning in HIVE-7826. This is a nice 
 optimization and we should implement the same in Hive on Spark (HoS).
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 43c53fc 
   itests/src/test/resources/testconfiguration.properties 2a5f7e3 
   ql/if/queryplan.thrift c8dfa35 
   ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java 91e8a02 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java 
 21398d8 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkDynamicPartitionPruner.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java 
 e6c845c 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorSparkPartitionPruningSinkOperator.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java 
 1de7e40 
   ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 9d5730d 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
  8546d21 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java ea5efe5 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/RemoveDynamicPruningBySize.java
  4803959 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 
 5f731d7 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkPartitionPruningSinkDesc.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkProcContext.java 
 447f104 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 
 e27ce0d 
   
 ql/src/java/org/apache/hadoop/hive/ql/parse/spark/OptimizeSparkProcContext.java
  f7586a4 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 
 19aae70 
   
 ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkPartitionPruningSinkOperator.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SplitOpTreeForDPP.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java 05a5841 
   ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java aa291b9 
   ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java 
 363e49e 
   ql/src/test/queries/clientpositive/spark_dynamic_partition_pruning.q 
 PRE-CREATION 
   ql/src/test/queries/clientpositive/spark_dynamic_partition_pruning_2.q 
 PRE-CREATION 
   ql/src/test/results/clientpositive/spark/bucket2.q.out 89c3b4c 
   ql/src/test/results/clientpositive/spark/bucket3.q.out 2fc4855 
   ql/src/test/results/clientpositive/spark/bucket4.q.out 44e0f9f 
   ql/src/test/results/clientpositive/spark/column_access_stats.q.out 3e16f61 
   ql/src/test/results/clientpositive/spark/limit_partition_metadataonly.q.out 
 e95d2ab 
   ql/src/test/results/clientpositive/spark/list_bucket_dml_2.q.java1.7.out 
 e38ccf8 
   ql/src/test/results/clientpositive/spark/optimize_nullscan.q.out 881f41a 
   ql/src/test/results/clientpositive/spark/pcr.q.out 4c22f0b 
   ql/src/test/results/clientpositive/spark/sample3.q.out 2fe6b0d 
   ql/src/test/results/clientpositive/spark/sample9.q.out c9823f7

Re: Review Request 36156: HIVE-11053: Add more tests for HIVE-10844[Spark Branch]

2015-07-07 Thread cheng xu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36156/#review90860
---

Ship it!


- cheng xu


On July 8, 2015, 11:05 a.m., lun gao wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/36156/
 ---
 
 (Updated July 8, 2015, 11:05 a.m.)
 
 
 Review request for hive and chengxiang li.
 
 
 Bugs: HIVE-11053
 https://issues.apache.org/jira/browse/HIVE-11053
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Add some test cases for self-union, self-join, CWE, and repeated sub-queries 
 to verify the job of combining equivalent works in HIVE-10844.
 
 
 Diffs
 -
 
   itests/src/test/resources/testconfiguration.properties 4f2de12 
   ql/src/test/queries/clientpositive/dynamic_rdd_cache.q PRE-CREATION 
   ql/src/test/results/clientpositive/dynamic_rdd_cache.q.out PRE-CREATION 
   ql/src/test/results/clientpositive/spark/dynamic_rdd_cache.q.out 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/36156/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 lun gao

RE: is HiveQA broken?

2015-07-07 Thread Xu, Cheng A

HiveQA is coming back to normal. Thanks guys!

-Original Message-
From: Szehon Ho [mailto:sze...@cloudera.com] 
Sent: Wednesday, July 08, 2015 7:08 AM
To: dev@hive.apache.org
Cc: Sergio Pena
Subject: Re: is HiveQA broken?

Sergio has uploaded a new image, we are looking at what happened.  Thanks, 
Sergio.

Sergey, it looks like the latest commit (HIVE-11013) doesn't compile as it was 
done without tests, can you take a look?

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.3.2:compile (default-compile) on project hive-exec: Compilation failure
[ERROR] /data/hive-ptest/working/apache-github-source-source/ql/src/java/org/apache/hadoop/hive/ql/plan/MergeJoinWork.java:[149,38] error: method getDummyOps() is already defined in class MergeJoinWork

Then we can resume the pre commit tests.

Thanks
Szehon

On Tue, Jul 7, 2015 at 12:13 PM, Szehon Ho sze...@cloudera.com wrote:

 I am looking into the issue now. It looks like it is looking for an AMI 
 image that is no longer there to spawn the slaves. I will try to see 
 what is going on.

 Thanks
 Szehon

 On Tue, Jul 7, 2015 at 11:38 AM, Szehon Ho sze...@cloudera.com wrote:

 Yea you are right, somehow the server is not running the tests, hence 
 no logs.  I will try to look at this today when I get a chance.

 Sergio can you also take a look if you have cycles?

 Thanks
 Szehon

 On Tue, Jul 7, 2015 at 11:09 AM, Sergey Shelukhin 
 ser...@hortonworks.com
  wrote:

 It looks like it’s downloading test logs from e.g.

 http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4510

 but there are no logs there. It also appears to happen pretty fast, 
 so testTailLog does nothing? It either returns prematurely or fails 
 quickly w/o logs.

 On 15/7/6, 19:03, Sergey Shelukhin ser...@hortonworks.com wrote:

 Looks like last HiveQA failed again in the same way, not sure if it 
 was before or after restart, what time zone is it in?
 May that be related to recent Apache infra build changes?

 On 15/7/6, 17:24, Szehon Ho sze...@cloudera.com wrote:

 I am not sure what is going on.  Let me restart the Ptest server.

 On Mon, Jul 6, 2015 at 1:02 PM, Sergey Shelukhin 
 ser...@hortonworks.com
 wrote:

  Looks like all the runs are failing with:

  Exception in thread main java.lang.RuntimeException: 404 Not Found
  at org.apache.hive.ptest.api.client.PTestClient.downloadTestResults(PTestClient.java:181)
  at org.apache.hive.ptest.api.client.PTestClient.testStart(PTestClient.java:129)
  at org.apache.hive.ptest.api.client.PTestClient.main(PTestClient.java:312)
  + ret=1
  + cd target/
  + [[ -f test-results.tar.gz ]]
  + exit 1
  + rm -f /tmp/tmp.FayzMqy1oP
  Build step 'Execute shell' marked build as failure
  Archiving artifacts
  Recording test results
  Finished: FAILURE

[jira] [Created] (HIVE-11194) Exchange partition on external tables should fail with error message when target folder already exists

2015-07-07 Thread Aihua Xu (JIRA)

Aihua Xu created HIVE-11194:
---

 Summary: Exchange partition on external tables should fail with 
error message when target folder already exists
 Key: HIVE-11194
 URL: https://issues.apache.org/jira/browse/HIVE-11194
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 2.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu


Steps to repro:
{noformat}
Create /data/a1/pkey=1 directory with some data in it.
Create /data/a2/pkey=1 directory with some data in it.

create external table a1 (value string) partitioned by (pkey int) location 
'/data/a1';
create external table a2 (value string) partitioned by (pkey int) location 
'/data/a2';
alter table a2 add partition (pkey=1);
alter table a1 exchange partition (pkey=1) with table a2;
select * from a1 should now fail.
{noformat}

pkey=1 is not a partition of a1 but the folder exists. We should give an error 
message for that.
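
A minimal sketch of the kind of check being asked for, under stated assumptions: it uses java.nio.file on local paths rather than the Hadoop FileSystem API that Hive actually goes through, and the class ExchangePartitionCheck and method movePartition are hypothetical names, not Hive code. It only illustrates refusing the move with a clear message when the target partition folder already exists.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

public class ExchangePartitionCheck {

    // Moves a partition directory, refusing to proceed if the target already exists.
    public static void movePartition(Path source, Path target) throws IOException {
        if (Files.exists(target)) {
            throw new FileAlreadyExistsException(target.toString(), null,
                "target partition folder already exists");
        }
        Files.createDirectories(target.getParent());
        Files.move(source, target);
    }

    public static void main(String[] args) throws IOException {
        // Recreate the layout from the repro steps under a temporary root.
        Path root = Files.createTempDirectory("exchange");
        Path src = Files.createDirectories(root.resolve("a2").resolve("pkey=1"));
        Path dst = Files.createDirectories(root.resolve("a1").resolve("pkey=1"));
        try {
            movePartition(src, dst);
        } catch (FileAlreadyExistsException e) {
            System.out.println("refused: " + e.getMessage());
        }
    }
}
```

Failing before any data is moved leaves the source partition untouched, so the user can clean up the stray folder and retry.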



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Created] (HIVE-11200) LLAP: Cache BuddyAllocator throws NPE

2015-07-07 Thread Gopal V (JIRA)

Gopal V created HIVE-11200:
--

 Summary: LLAP: Cache BuddyAllocator throws NPE
 Key: HIVE-11200
 URL: https://issues.apache.org/jira/browse/HIVE-11200
 Project: Hive
  Issue Type: Sub-task
Affects Versions: llap
 Environment: large perf cluster - with 64Gb cache sizes
Reporter: Gopal V
Assignee: Sergey Shelukhin
Priority: Minor
 Fix For: llap


Built off da1e0cf21aeff0a9501c5e220a6f66ba61f6da94 merge point

{code}
Caused by: java.lang.NullPointerException
at org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.allocateWithSplit(BuddyAllocator.java:331)
at org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.allocateWithExpand(BuddyAllocator.java:399)
at org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.access$300(BuddyAllocator.java:228)
at org.apache.hadoop.hive.llap.cache.BuddyAllocator.allocateMultiple(BuddyAllocator.java:156)
at org.apache.hadoop.hive.ql.io.orc.InStream.readEncodedStream(InStream.java:761)
at org.apache.hadoop.hive.ql.io.orc.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:462)
at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:342)
at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:59)
at org.apache.hadoop.hive.common.CallableWithNdc.call(CallableWithNdc.java:37)
... 4 more
2015-07-08 01:17:42,798 [TezTaskRunner_attempt_1435700346116_1212_4_05_80_0(attempt_1435700346116_1212_4_05_80_0)] ERROR org.apache.hadoop.hive.ql.exec.tez.TezProcessor: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.io.IOException: java.lang.NullPointerException
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
