[jira] [Created] (HIVE-18242) VectorizedRowBatch cast exception when analyzing partitioned table

2017-12-06 Thread Rui Li (JIRA)
Rui Li created HIVE-18242:
-

 Summary: VectorizedRowBatch cast exception when analyzing 
partitioned table
 Key: HIVE-18242
 URL: https://issues.apache.org/jira/browse/HIVE-18242
 Project: Hive
  Issue Type: Bug
Reporter: Rui Li


Happens when I run the following (vectorization enabled):
{code}
ANALYZE TABLE srcpart PARTITION(ds, hr) COMPUTE STATISTICS;
{code}
The stack trace is:
{noformat}
Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch cannot be cast to org.apache.hadoop.io.Text
    at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableStringObjectInspector.copyObject(WritableStringObjectInspector.java:36)
    at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.copyToStandardObject(ObjectInspectorUtils.java:425)
    at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.partialCopyToStandardObject(ObjectInspectorUtils.java:314)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.gatherStats(TableScanOperator.java:191)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:138)
    at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.setupPartitionContextVars(VectorMapOperator.java:682)
    at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.cleanUpInputFileChangedOp(VectorMapOperator.java:607)
    at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1187)
    at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:784)
{noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HIVE-18241) Query with LEFT SEMI JOIN producing wrong result

2017-12-06 Thread Vineet Garg (JIRA)
Vineet Garg created HIVE-18241:
--

 Summary: Query with LEFT SEMI JOIN producing wrong result
 Key: HIVE-18241
 URL: https://issues.apache.org/jira/browse/HIVE-18241
 Project: Hive
  Issue Type: Bug
Reporter: Vineet Garg
Assignee: Vineet Garg


The following query produces a wrong result:
{code:sql}
select key, value
from src outr
left semi join (
  select a.key, b.value
  from src a
  join (select distinct value from src) b on a.value > b.value
  group by a.key, b.value
) inr on outr.key = inr.key and outr.value = inr.value;
{code}

The expected result is an empty set, but the query outputs a bunch of rows.

The schema for the {{src}} table can be found in {{data/scripts/q_test_init.sql}}.
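
Why an empty set is expected can be checked with a small in-memory simulation. The sketch below is only an illustration, not Hive code: the class `SemiJoinSketch`, its helper names, and the sample rows are hypothetical, and it assumes that every key in {{src}} maps to exactly one value (as in rows like key `1`, value `val_1`). Under that assumption, a matching row would need `a.value > b.value` and `a.value = b.value` at the same time, so no outer row can match.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map.Entry;
import java.util.Set;

class SemiJoinSketch {
  /** Builds one (key, value) row; entries compare by key and value. */
  static Entry<String, String> row(String key, String value) {
    return new SimpleImmutableEntry<>(key, value);
  }

  /** Hypothetical stand-in for src; assumes each key maps to exactly one value. */
  static List<Entry<String, String>> sampleSrc() {
    return Arrays.asList(
        row("0", "val_0"), row("0", "val_0"), row("1", "val_1"), row("2", "val_2"));
  }

  /** Evaluates the reported query naively over an in-memory copy of src. */
  static List<Entry<String, String>> leftSemiJoin(List<Entry<String, String>> src) {
    // b: select distinct value from src
    Set<String> distinctValues = new HashSet<>();
    for (Entry<String, String> r : src) {
      distinctValues.add(r.getValue());
    }
    // inr: distinct (a.key, b.value) pairs where a.value > b.value
    Set<Entry<String, String>> inr = new HashSet<>();
    for (Entry<String, String> a : src) {
      for (String bValue : distinctValues) {
        if (a.getValue().compareTo(bValue) > 0) {
          inr.add(row(a.getKey(), bValue));
        }
      }
    }
    // left semi join: keep an outer row if inr has the same key and value
    List<Entry<String, String>> result = new ArrayList<>();
    for (Entry<String, String> outr : src) {
      if (inr.contains(outr)) {
        result.add(outr);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(leftSemiJoin(sampleSrc()).size()); // prints 0
  }
}
```

If a key could map to several values, the result would no longer have to be empty, so the expectation depends on the data in {{src}}.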





Re: Review Request 64402: HIVE-18240 support getClientInfo/setClientInfo in JDBC

2017-12-06 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64402/#review193093
---




service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java
Lines 305 (patched)


this should be removed


- Sergey Shelukhin


On Dec. 7, 2017, 3:24 a.m., Sergey Shelukhin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64402/
> ---
> 
> (Updated Dec. 7, 2017, 3:24 a.m.)
> 
> 
> Review request for hive and Thejas Nair.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> see jira
> 
> 
> Diffs
> -
> 
>   itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcDriver2.java 
> 87595ee415 
>   jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java edf93859fe 
>   jdbc/src/java/org/apache/hive/jdbc/HiveDatabaseMetaData.java 7f21bd3842 
>   service-rpc/if/TCLIService.thrift a1f293bdc2 
>   service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
> 7fbcd13b63 
>   service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 
> fc9e6b2a91 
> 
> 
> Diff: https://reviews.apache.org/r/64402/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>



Review Request 64402: HIVE-18240 support getClientInfo/setClientInfo in JDBC

2017-12-06 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64402/
---

Review request for hive and Thejas Nair.


Repository: hive-git


Description
---

see jira


Diffs
-

  itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcDriver2.java 
87595ee415 
  jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java edf93859fe 
  jdbc/src/java/org/apache/hive/jdbc/HiveDatabaseMetaData.java 7f21bd3842 
  service-rpc/if/TCLIService.thrift a1f293bdc2 
  service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
7fbcd13b63 
  service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 
fc9e6b2a91 


Diff: https://reviews.apache.org/r/64402/diff/1/


Testing
---


Thanks,

Sergey Shelukhin



[jira] [Created] (HIVE-18240) support getClientInfo/setClientInfo in JDBC

2017-12-06 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-18240:
---

 Summary: support getClientInfo/setClientInfo in JDBC
 Key: HIVE-18240
 URL: https://issues.apache.org/jira/browse/HIVE-18240
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin








Rebasing standalone-metastore branch and database install/upgrade tests

2017-12-06 Thread Alan Gates
As HIVE-17980 has been committed to master, I will be rebasing the
standalone-metastore branch up to the latest in master.  This will again be
a force push.

In this rebase I am also including a newer version of HIVE-17983, which
includes integration tests that use docker images to test the database
installation and upgrade scripts against all four databases beyond Derby.
These were quite useful in finding and fixing issues for Postgres,
SqlServer, and Oracle.  As the tests take a few minutes to run and in the
case of Oracle require the user to download a JDBC jar, they are off by
default.  Instructions on how to run them have been added in a DEV-README
file as part of the patch.

Alan.


Re: Review Request 64282: HIVE-18173: Improve plans for correlated subqueries with non-equi predicate

2017-12-06 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64282/#review193067
---




ql/src/test/results/clientpositive/llap/subquery_in.q.out
Lines 1029-1037 (original)


still wrong result set.


- Ashutosh Chauhan


On Dec. 6, 2017, 11:58 p.m., Vineet Garg wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64282/
> ---
> 
> (Updated Dec. 6, 2017, 11:58 p.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Bugs: HIVE-18173
> https://issues.apache.org/jira/browse/HIVE-18173
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Improve plans for correlated subqueries with non-equi predicate
> 
> 
> Diffs
> -
> 
>   itests/src/test/resources/testconfiguration.properties cca1055fde 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java
>  d1fe49c875 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java 76c82e2606 
>   ql/src/test/queries/clientpositive/subquery_corr.q PRE-CREATION 
>   ql/src/test/results/clientpositive/llap/subquery_corr.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/llap/subquery_exists.q.out dfe424046e 
>   ql/src/test/results/clientpositive/llap/subquery_in.q.out 5dcdfdd15f 
>   ql/src/test/results/clientpositive/llap/subquery_in_having.q.out 0ffbaaea34 
>   ql/src/test/results/clientpositive/llap/subquery_notin.q.out 5da12584f0 
>   ql/src/test/results/clientpositive/spark/subquery_exists.q.out fb13fb73e9 
>   ql/src/test/results/clientpositive/spark/subquery_in.q.out e19240b7ca 
>   ql/src/test/results/clientpositive/spark/subquery_multi.q.out a4282df08a 
>   ql/src/test/results/clientpositive/spark/subquery_notin.q.out 0d12d0db60 
>   ql/src/test/results/clientpositive/subquery_exists.q.out b6b31aaf47 
>   ql/src/test/results/clientpositive/subquery_notexists.q.out a6175f8fec 
> 
> 
> Diff: https://reviews.apache.org/r/64282/diff/6/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Vineet Garg
> 
>



Re: Review Request 64282: HIVE-18173: Improve plans for correlated subqueries with non-equi predicate

2017-12-06 Thread Vineet Garg

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64282/
---

(Updated Dec. 6, 2017, 11:58 p.m.)


Review request for hive and Ashutosh Chauhan.


Changes
---

Rebased patch


Bugs: HIVE-18173
https://issues.apache.org/jira/browse/HIVE-18173


Repository: hive-git


Description
---

Improve plans for correlated subqueries with non-equi predicate


Diffs (updated)
-

  itests/src/test/resources/testconfiguration.properties cca1055fde 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java
 d1fe49c875 
  ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java 76c82e2606 
  ql/src/test/queries/clientpositive/subquery_corr.q PRE-CREATION 
  ql/src/test/results/clientpositive/llap/subquery_corr.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/llap/subquery_exists.q.out dfe424046e 
  ql/src/test/results/clientpositive/llap/subquery_in.q.out 5dcdfdd15f 
  ql/src/test/results/clientpositive/llap/subquery_in_having.q.out 0ffbaaea34 
  ql/src/test/results/clientpositive/llap/subquery_notin.q.out 5da12584f0 
  ql/src/test/results/clientpositive/spark/subquery_exists.q.out fb13fb73e9 
  ql/src/test/results/clientpositive/spark/subquery_in.q.out e19240b7ca 
  ql/src/test/results/clientpositive/spark/subquery_multi.q.out a4282df08a 
  ql/src/test/results/clientpositive/spark/subquery_notin.q.out 0d12d0db60 
  ql/src/test/results/clientpositive/subquery_exists.q.out b6b31aaf47 
  ql/src/test/results/clientpositive/subquery_notexists.q.out a6175f8fec 


Diff: https://reviews.apache.org/r/64282/diff/6/

Changes: https://reviews.apache.org/r/64282/diff/5-6/


Testing
---


Thanks,

Vineet Garg



Re: Review Request 64193: HIVE-18054: Make Lineage work with concurrent queries on a Session

2017-12-06 Thread Andrew Sherman via Review Board


> On Dec. 2, 2017, 12:22 a.m., Sahil Takiar wrote:
> > Since we touch the `LoadSemanticAnalyzer` could we add a q-test (could be 
> > added to one of the existing `lineage*.q` files) for `LOAD` statements. 
> > Same for import / export statements (as far as I can tell there are no 
> > existing ones, correct me if I am wrong).
> > 
> > If you have time, it would be great to run some of the lineage tests for 
> > HoS too, but since that's a bit orthogonal to this JIRA, it can be done in a 
> > follow up JIRA.
> 
> Andrew Sherman wrote:
> I will add some more tests...

Update: I looked into this some more. It turns out that Lineage info is printed 
by default by the PostExecutePrinter,
so any .q tests should show their lineage in the output.
HoS tests do print some lineage, so I think that part is covered.
But currently no lineage is printed for LOAD/IMPORT/EXPORT.
The MoveTask does call getLineageState().setLineage() to set lineage into the 
LineageState, but it does not call getLineageState().mapDirToOp().
Possibly this is because LOAD statements don't have an 
org.apache.hadoop.hive.ql.exec.Operator.
IMPORT statements behave similarly.
IMO the whole Lineage stuff is not clearly specified. 
Do you think it is worth doing follow-up work to try to document (and test) it 
better?
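
For context, the problem this review request targets (lineage kept per session being cleared when one of several concurrent queries completes) can be sketched roughly as follows. This is a simplified illustration, not the patch: `LineageStateStub` and the method names are hypothetical stand-ins, and real concurrency is replaced by a fixed interleaving so the outcome is deterministic.

```java
import java.util.ArrayList;
import java.util.List;

class LineageScopeSketch {
  /** Hypothetical stand-in for a lineage holder: just a list of edges. */
  static class LineageStateStub {
    final List<String> edges = new ArrayList<>();

    void clear() {
      edges.clear();
    }
  }

  /** Session scope: both queries share one holder, so one completion clears everything. */
  static int edgesLeftWithSessionScope() {
    LineageStateStub shared = new LineageStateStub();
    shared.edges.add("query1: t1 -> out1"); // query 1 records lineage
    shared.edges.add("query2: t2 -> out2"); // query 2 records lineage
    shared.clear();                          // query 1 completes and clears the holder
    return shared.edges.size();              // query 2's lineage is lost
  }

  /** Query scope: each query owns its holder, so clearing one leaves the other intact. */
  static int edgesLeftWithQueryScope() {
    LineageStateStub query1 = new LineageStateStub();
    LineageStateStub query2 = new LineageStateStub();
    query1.edges.add("query1: t1 -> out1");
    query2.edges.add("query2: t2 -> out2");
    query1.clear();                          // query 1 completes
    return query2.edges.size();              // query 2 still has its lineage
  }

  public static void main(String[] args) {
    System.out.println(edgesLeftWithSessionScope()); // prints 0
    System.out.println(edgesLeftWithQueryScope());   // prints 1
  }
}
```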


- Andrew


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64193/#review192601
---


On Nov. 30, 2017, 1:22 a.m., Andrew Sherman wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64193/
> ---
> 
> (Updated Nov. 30, 2017, 1:22 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> A Hive Session can contain multiple concurrent sql Operations.
> Lineage is currently tracked in SessionState and is cleared when a query
> completes. This results in Lineage for other running queries being lost.
> 
> To fix this, move LineageState from SessionState to QueryState.
> In MoveTask/MoveWork use the LineageState from the MoveTask's QueryState
> rather than trying to use it from MoveWork.
> Add a test which runs multiple jdbc queries in a thread pool
> against the same connection and show that Vertices are not lost from Lineage.
> As part of this test, add ReadableHook, an ExecuteWithHookContext that stores
> HookContexts in memory and makes them available for reading.
> Make LineageLogger methods static so they can be used elsewhere.
> 
> Sometimes a running query (originating in a Driver) will instantiate
> another Driver to run or compile another query. Because these Drivers
> shared a Session, the child Driver would accumulate Lineage information
> along with that of the parent Driver. For consistency a LineageState is
> passed to these child Drivers and stored in the new Driver's QueryState.
> 
> 
> Diffs
> -
> 
>   
> itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniHS2.java 
> f5ed735c1ec14dfee338e56020fa2629b168389d 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java 
> af9f193dc94e2e05caa88d965a34f4483c9d7069 
>   ql/src/java/org/apache/hadoop/hive/ql/QueryState.java 
> 7d5aa8b179e536e25c41a8946e667f8dd5669e0f 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 
> e7af5e004fb560b574b82f6d1b60517511802f37 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 
> e2f8c1f8012ad25114e279747e821b291c7f4ca6 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java 
> 1f0487f4f72ab18bcf876f45ad5758d83a7f001b 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadPartitions.java
>  262225fc202d4627652acfd77350e44b0284b3da 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadTable.java
>  bb1f4e50509e57a9d0b9e6793c1fc08baa4d2981 
>   ql/src/java/org/apache/hadoop/hive/ql/hooks/HookContext.java 
> 7b617309f6b0d8a7ce0dea80ab1f790c2651b147 
>   ql/src/java/org/apache/hadoop/hive/ql/hooks/LineageLogger.java 
> 2f764f8a29a9d41a7db013a949ffe3a8a9417d32 
>   ql/src/java/org/apache/hadoop/hive/ql/hooks/ReadableHook.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/index/AggregateIndexHandler.java 
> 68709b4d3baf15d78e60e948ccdef3df84f28cec 
>   ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java 
> 1e577da82343a1b7361467fb662661f9c6642ec0 
>   ql/src/java/org/apache/hadoop/hive/ql/index/TableBasedIndexHandler.java 
> 29886ae7f97f8dae7116f4fc9a2417ab8f9dac0a 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java 
> 7b067a0d45e33bc3347c43b050af933c296a9227 
>   
> ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 
> 504b0623142a6fa6cdb45a26b49f146e12ec2d7a 
>   

[GitHub] hive pull request #272: HIVE-17980 Moved HiveMetaStoreClient plus a few rema...

2017-12-06 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/hive/pull/272


---


Re: Review Request 64324: HIVE-18153 refactor reopen and file management in TezTask

2017-12-06 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64324/
---

(Updated Dec. 6, 2017, 8:50 p.m.)


Review request for hive, Prasanth_J and Siddharth Seth.


Repository: hive-git


Description
---

see jira


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 88a75edd35 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java 5c338b89c9 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPool.java 3bcf657ac4 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java 
8417ebb7d5 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolSession.java 
b3ccd24fd6 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionState.java 
dd879fc5e8 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezTask.java 8795cfcee1 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/WorkloadManager.java 
dbdbbf25db 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/monitoring/TezJobMonitor.java 
9726af1506 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFGetSplits.java 
4148a8aa3a 
  ql/src/test/org/apache/hadoop/hive/ql/exec/tez/SampleTezSessionState.java 
52484540ff 
  ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestTezSessionPool.java 
829ea8cecc 
  ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestTezTask.java 47aa936845 
  ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestWorkloadManager.java 
c58e4507f2 


Diff: https://reviews.apache.org/r/64324/diff/3/

Changes: https://reviews.apache.org/r/64324/diff/2-3/


Testing
---


Thanks,

Sergey Shelukhin



Re: Review Request 64282: HIVE-18173: Improve plans for correlated subqueries with non-equi predicate

2017-12-06 Thread Vineet Garg

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64282/
---

(Updated Dec. 6, 2017, 8:37 p.m.)


Review request for hive and Ashutosh Chauhan.


Changes
---

This change addresses review comments


Bugs: HIVE-18173
https://issues.apache.org/jira/browse/HIVE-18173


Repository: hive-git


Description
---

Improve plans for correlated subqueries with non-equi predicate


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 75b77072c6 
  itests/hive-unit/pom.xml 3a435a8a52 
  
itests/hive-unit/src/test/java/org/apache/hive/jdbc/AbstractJdbcTriggersTest.java
 62ee66f717 
  
itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestTriggersMoveWorkloadManager.java
 74ca958ea8 
  itests/src/test/resources/testconfiguration.properties cca1055fde 
  metastore/scripts/upgrade/mysql/046-HIVE-17566.mysql.sql 02288cbe42 
  metastore/scripts/upgrade/mysql/hive-schema-3.0.0.mysql.sql 915af8bf4b 
  ql/src/java/org/apache/hadoop/hive/ql/Context.java 6d48783d48 
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java 4d52d748f1 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/AmPluginNode.java eb6442180b 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/tez/KillMoveTriggerActionHandler.java
 b16f1c30a0 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/KillTriggerActionHandler.java 
50d234deaa 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionState.java 
dd879fc5e8 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezTask.java 8795cfcee1 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TriggerValidatorRunnable.java 
6414f05fe0 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/WmEvent.java 33341ad4a9 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/WmTezSession.java e78ef44c11 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/WorkloadManager.java 
dbdbbf25db 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/WorkloadManagerFederation.java 
9d56204240 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/monitoring/PrintSummary.java 
8414c73e2b 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/monitoring/TezJobMonitor.java 
9726af1506 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 
0ad68166ae 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastStringToBoolean.java
 7a44035337 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FuncStringToLong.java
 5c0a7fae56 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorExpression.java
 8c2894b482 
  
ql/src/java/org/apache/hadoop/hive/ql/hooks/PostExecWMEventsSummaryPrinter.java 
83cca8903b 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java
 d1fe49c875 
  ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java 76c82e2606 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 07742e0485 
  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToBoolean.java 57949d90aa 
  ql/src/java/org/apache/hadoop/hive/ql/wm/Trigger.java 4adad7a1b6 
  ql/src/java/org/apache/hadoop/hive/ql/wm/TriggerActionHandler.java 7995a8f639 
  ql/src/java/org/apache/hadoop/hive/ql/wm/TriggerContext.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/wm/WmContext.java 7a7ef507e5 
  ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestWorkloadManager.java 
c58e4507f2 
  
ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorMathFunctions.java
 e89f2e5a02 
  
ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorTypeCasts.java
 6952b4598f 
  ql/src/test/queries/clientpositive/groupby_position.q 446b99d3cb 
  ql/src/test/queries/clientpositive/subquery_corr.q PRE-CREATION 
  ql/src/test/queries/clientpositive/subquery_in.q 7d4ece9dca 
  ql/src/test/queries/clientpositive/udf_to_boolean.q 1a50d055d5 
  ql/src/test/queries/clientpositive/vector_udf_string_to_boolean.q eeb5ab8819 
  ql/src/test/results/clientpositive/groupby_position.q.out 7351a06f9c 
  
ql/src/test/results/clientpositive/llap/insert_values_orig_table_use_metadata.q.out
 143742b3be 
  ql/src/test/results/clientpositive/llap/subquery_corr.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/llap/subquery_exists.q.out dfe424046e 
  ql/src/test/results/clientpositive/llap/subquery_in.q.out 5dcdfdd15f 
  ql/src/test/results/clientpositive/llap/subquery_in_having.q.out 0ffbaaea34 
  ql/src/test/results/clientpositive/llap/subquery_notin.q.out 5da12584f0 
  ql/src/test/results/clientpositive/llap/vectorized_casts.q.out 84b4d9454d 
  ql/src/test/results/clientpositive/spark/groupby_position.q.out bcc512be09 
  ql/src/test/results/clientpositive/spark/spark_explainuser_1.q.out 6a4bea1bd4 
  ql/src/test/results/clientpositive/spark/subquery_exists.q.out fb13fb73e9 
  ql/src/test/results/clientpositive/spark/subquery_in.q.out e19240b7ca 
  

Re: Review Request 64358: HIVE-18003 add explicit jdbc connection string args for mappings

2017-12-06 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64358/
---

(Updated Dec. 6, 2017, 8:03 p.m.)


Review request for hive and Prasanth_J.


Repository: hive-git


Description
---

see jira


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 75b77072c6 
  jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java edf93859fe 
  jdbc/src/java/org/apache/hive/jdbc/Utils.java 855de881e9 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezTask.java 8795cfcee1 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/UserPoolMapping.java 
33ee8f791f 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/WorkloadManager.java 
dbdbbf25db 
  ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestWorkloadManager.java 
c58e4507f2 


Diff: https://reviews.apache.org/r/64358/diff/3/

Changes: https://reviews.apache.org/r/64358/diff/2-3/


Testing
---


Thanks,

Sergey Shelukhin



Re: checkstyle changes

2017-12-06 Thread Eugene Koifman
It currently complains about no space between ; and // as in “…);//foo”

And also about indentation when a single method call is split into multiple 
lines.
It insists on 4 chars in this case, though we use 2 in (all?) other cases.

Could this be dialed down as well?
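
To make the two complaints concrete, here is a small sketch of the patterns in question. The class and method names are made up for illustration; which form checkstyle flags is taken from the description above, not from running the tool.

```java
class CheckstyleSketch {
  static int sum(int first, int second, int third) {
    return first + second + third;
  }

  /** Style reported as flagged: no space before the comment, 2-char continuation. */
  static int flaggedStyle() {
    int total = sum(1, 2, 3);//foo
    total += sum(10,
      20, 30);
    return total;
  }

  /** Style checkstyle currently insists on: space before the comment, 4-char continuation. */
  static int acceptedStyle() {
    int total = sum(1, 2, 3); // foo
    total += sum(10,
        20, 30);
    return total;
  }

  public static void main(String[] args) {
    // Both variants behave identically; only the formatting differs.
    System.out.println(flaggedStyle() == acceptedStyle()); // prints true
  }
}
```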
 

On 12/5/17, 7:26 AM, "Peter Vary"  wrote:

+1 for the changes

> On Dec 5, 2017, at 1:02 PM, Zoltan Haindrich  wrote:
> 
> Hello,
> 
> I've filed a ticket to make the checkstyle warnings less noisy
> (https://issues.apache.org/jira/browse/HIVE-18222)
> 
> * set maxlinelength to 140
>   I think everyone is working with big-enough displays to handle this :)
>   There are many methods which have complicated names / arguments / etc;
>   breaking the lines more frequently hurts readability...
> * disabled some restrictions like: declaration via get/set methods
>   for protected/package fields are not mandatory
> 
> If you don't feel comfortable with these changes, please share your point
> of view.
> 
> cheers,
> Zoltan
> 
> 





Re: Review Request 64282: HIVE-18173: Improve plans for correlated subqueries with non-equi predicate

2017-12-06 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64282/#review193025
---




ql/src/test/queries/clientpositive/subquery_in.q
Line 305 (original), 305-313 (patched)


Can you please add these queries in a new q file for MiniLlapDriver only? 
The current q.out is very large, and it's hard to understand the impact of this patch on 
existing tests because of the large diff size.



ql/src/test/results/clientpositive/llap/subquery_in.q.out
Lines 1029-1034 (original)


This query is now returning an empty result set. Looks incorrect.


- Ashutosh Chauhan


On Dec. 6, 2017, 6:30 a.m., Vineet Garg wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64282/
> ---
> 
> (Updated Dec. 6, 2017, 6:30 a.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Bugs: HIVE-18173
> https://issues.apache.org/jira/browse/HIVE-18173
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Improve plans for correlated subqueries with non-equi predicate
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java
>  d1fe49c875 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java 76c82e2606 
>   ql/src/test/queries/clientpositive/subquery_in.q 7d4ece9dca 
>   ql/src/test/results/clientpositive/llap/subquery_exists.q.out dfe424046e 
>   ql/src/test/results/clientpositive/llap/subquery_in.q.out 5dcdfdd15f 
>   ql/src/test/results/clientpositive/llap/subquery_in_having.q.out 0ffbaaea34 
>   ql/src/test/results/clientpositive/llap/subquery_notin.q.out 5da12584f0 
>   ql/src/test/results/clientpositive/spark/spark_explainuser_1.q.out 
> 6a4bea1bd4 
>   ql/src/test/results/clientpositive/spark/subquery_exists.q.out fb13fb73e9 
>   ql/src/test/results/clientpositive/spark/subquery_in.q.out e19240b7ca 
>   ql/src/test/results/clientpositive/spark/subquery_multi.q.out a4282df08a 
>   ql/src/test/results/clientpositive/spark/subquery_notin.q.out 0d12d0db60 
>   ql/src/test/results/clientpositive/spark/subquery_scalar.q.out d8b1c92526 
>   ql/src/test/results/clientpositive/spark/subquery_select.q.out 6feb852965 
>   ql/src/test/results/clientpositive/spark/subquery_views.q.out 9a1c25fffd 
>   ql/src/test/results/clientpositive/subquery_exists.q.out b6b31aaf47 
>   ql/src/test/results/clientpositive/subquery_notexists.q.out a6175f8fec 
> 
> 
> Diff: https://reviews.apache.org/r/64282/diff/4/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Vineet Garg
> 
>



[jira] [Created] (HIVE-18239) ACID: Make DeleteEventRegistry pluggable

2017-12-06 Thread Gopal V (JIRA)
Gopal V created HIVE-18239:
--

 Summary: ACID: Make DeleteEventRegistry pluggable
 Key: HIVE-18239
 URL: https://issues.apache.org/jira/browse/HIVE-18239
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Reporter: Gopal V


Turn the hash & lookup components of the ColumnizedDeleteEventRegistry into a 
pluggable interface.

This opens up FPGA/GPU acceleration opportunities for the Hive ACID 2.x implementation.
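
As a rough idea of what "pluggable" could mean here, the sketch below separates a lookup contract from its default hash-based implementation. Everything in it is hypothetical: `DeleteEventLookup`, `HashSetLookup`, and the (transactionId, bucket, rowId) key are illustrative names and do not reflect the existing Hive classes or the eventual design.

```java
import java.util.HashSet;
import java.util.Set;

class PluggableRegistrySketch {
  /** Hypothetical lookup contract; an accelerated backend would implement this too. */
  interface DeleteEventLookup {
    void recordDelete(long transactionId, int bucket, long rowId);

    boolean isDeleted(long transactionId, int bucket, long rowId);
  }

  /** Default CPU implementation backed by a hash set of composite keys. */
  static class HashSetLookup implements DeleteEventLookup {
    private final Set<String> deleted = new HashSet<>();

    private static String key(long transactionId, int bucket, long rowId) {
      return transactionId + ":" + bucket + ":" + rowId;
    }

    @Override
    public void recordDelete(long transactionId, int bucket, long rowId) {
      deleted.add(key(transactionId, bucket, rowId));
    }

    @Override
    public boolean isDeleted(long transactionId, int bucket, long rowId) {
      return deleted.contains(key(transactionId, bucket, rowId));
    }
  }

  /** Caller code depends only on the interface, so implementations can be swapped. */
  static boolean demo(DeleteEventLookup lookup) {
    lookup.recordDelete(5L, 0, 42L);
    return lookup.isDeleted(5L, 0, 42L) && !lookup.isDeleted(5L, 0, 43L);
  }

  public static void main(String[] args) {
    System.out.println(demo(new HashSetLookup())); // prints true
  }
}
```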





[jira] [Created] (HIVE-18238) Driver execution should not have configuration altering sideeffects

2017-12-06 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18238:
---

 Summary: Driver execution should not have configuration altering 
sideeffects 
 Key: HIVE-18238
 URL: https://issues.apache.org/jira/browse/HIVE-18238
 Project: Hive
  Issue Type: Sub-task
Reporter: Zoltan Haindrich


{{Driver}} executes sql statements which use "hiveconf" settings;
but the {{Driver}} itself may *not* change the configuration...





[jira] [Created] (HIVE-18237) missing results for insert_only table after DP insert

2017-12-06 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18237:
---

 Summary: missing results for insert_only table after DP insert
 Key: HIVE-18237
 URL: https://issues.apache.org/jira/browse/HIVE-18237
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


{code}
set hive.stats.column.autogather=false;

set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.max.dynamic.partitions.pernode=200;
set hive.exec.max.dynamic.partitions=200;
set hive.support.concurrency=true;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;

create table i0 (p int,v int);
insert into i0 values
(0,0),
(2,2),
(3,3);

create table p0 (v int) partitioned by (p int) stored as orc
  tblproperties ("transactional"="true", "transactional_properties"="insert_only");

explain insert overwrite table p0 partition (p) select * from i0 where v < 3;
insert overwrite table p0 partition (p) select * from i0 where v < 3;
select count(*) from p0 where v!=1;
{code}

The table p0 should contain {{2}} rows at this point; but the result is {{0}}.

* seems to be specific to insert_only tables
* the existing data appears if an {{insert into}} is executed.
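
The expected count of {{2}} can be re-derived by hand from the statements above. The following sketch does that arithmetic in plain Java; the class name is hypothetical and nothing here touches Hive. It assumes the usual positional mapping of the select list onto the target columns, which makes no difference for this data because p equals v on every row.

```java
class DpInsertSketch {
  // Rows of i0 as {p, v}, copied from the insert statement in the report.
  static final int[][] I0 = {{0, 0}, {2, 2}, {3, 3}};

  /** Counts the rows that should be visible to: select count(*) from p0 where v != 1. */
  static int expectedCount() {
    int count = 0;
    for (int[] row : I0) {
      int i0p = row[0];
      int i0v = row[1];
      if (i0v < 3) {      // select * from i0 where v < 3
        int p0v = i0p;    // first selected column lands in p0.v (positional mapping)
        if (p0v != 1) {   // where v != 1 on p0
          count++;
        }
      }
    }
    return count;
  }

  public static void main(String[] args) {
    System.out.println(expectedCount()); // prints 2
  }
}
```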






[jira] [Created] (HIVE-18236) Hive vectorized execution returned wrong result

2017-12-06 Thread Hengyu Dai (JIRA)
Hengyu Dai created HIVE-18236:
-

 Summary: Hive vectorized execution returned wrong result
 Key: HIVE-18236
 URL: https://issues.apache.org/jira/browse/HIVE-18236
 Project: Hive
  Issue Type: Bug
  Components: Hive, Physical Optimizer
Affects Versions: 2.1.1
Reporter: Hengyu Dai


Vectorized execution returned a wrong result for a simple query.
In the following query, the id column of table foo is unique and not null, so the query should 
return 0 for bar, but it currently returns the size of foo.

{code:sql}
select
dt,
sum(case when id ='' or id is null then 1 else 0 end) as bar
from foo
where dt=20171205
group by dt
{code}
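
For reference, the intended semantics of the CASE expression can be written out in plain Java. This is only a sketch with made-up sample ids, and it assumes, as the report implies, that no id in the partition is empty or null.

```java
import java.util.Arrays;
import java.util.List;

class CaseSumSketch {
  /** Mirrors: sum(case when id = '' or id is null then 1 else 0 end). */
  static int bar(List<String> ids) {
    int sum = 0;
    for (String id : ids) {
      // The branch is taken only for an empty string or a null id.
      sum += (id == null || id.isEmpty()) ? 1 : 0;
    }
    return sum;
  }

  public static void main(String[] args) {
    List<String> ids = Arrays.asList("a1", "b2", "c3"); // hypothetical non-empty ids
    System.out.println(bar(ids)); // prints 0
  }
}
```

A result equal to the row count would mean the condition evaluated to true for every row, which does not match these semantics.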






[jira] [Created] (HIVE-18235) Columnstats gather fails for insert_only table

2017-12-06 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18235:
---

 Summary: Columnstats gather fails for insert_only table
 Key: HIVE-18235
 URL: https://issues.apache.org/jira/browse/HIVE-18235
 Project: Hive
  Issue Type: Sub-task
  Components: Statistics
Reporter: Zoltan Haindrich



test: dp_counter_mm.q

at:
{code}
insert overwrite table src2 partition (value) select * from src where key < 100;
{code}

produces:
{code}
2017-12-06T02:39:54,447 DEBUG [d709e6e0-7573-4c79-bb38-b043a88a8dde main] metrics.PerfLogger: 
2017-12-06T02:39:54,447 DEBUG [d709e6e0-7573-4c79-bb38-b043a88a8dde main] metadata.Hive: NoSuchObjectException(message:Partition for which stats is gathered doesn't exist.)
    at org.apache.hadoop.hive.metastore.ObjectStore.updatePartitionColumnStatistics(ObjectStore.java:7644)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97)
    at com.sun.proxy.$Proxy52.updatePartitionColumnStatistics(Unknown Source)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.updatePartitonColStats(HiveMetaStore.java:5340)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.set_aggr_stats_for(HiveMetaStore.java:6853)
    at sun.reflect.GeneratedMethodAccessor81.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
    at com.sun.proxy.$Proxy54.set_aggr_stats_for(Unknown Source)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.setPartitionColumnStatistics(HiveMetaStoreClient.java:1748)
    at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.setPartitionColumnStatistics(SessionHiveMetaStoreClient.java:374)
    at sun.reflect.GeneratedMethodAccessor80.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:211)
    at com.sun.proxy.$Proxy55.setPartitionColumnStatistics(Unknown Source)
    at org.apache.hadoop.hive.ql.metadata.Hive.setPartitionColumnStatistics(Hive.java:4215)
    at org.apache.hadoop.hive.ql.stats.ColStatsProcessor.persistColumnStats(ColStatsProcessor.java:180)
    at org.apache.hadoop.hive.ql.stats.ColStatsProcessor.process(ColStatsProcessor.java:84)
    at org.apache.hadoop.hive.ql.exec.StatsTask.execute(StatsTask.java:108)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2230)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1882)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1613)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1358)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1346)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)





[jira] [Created] (HIVE-18234) Hive MergeFileTask doesn't work correctly

2017-12-06 Thread Hengyu Dai (JIRA)
Hengyu Dai created HIVE-18234:
-

 Summary: Hive MergeFileTask doesn't work correctly
 Key: HIVE-18234
 URL: https://issues.apache.org/jira/browse/HIVE-18234
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 2.1.1
Reporter: Hengyu Dai


To decide whether to merge small files, Hive reads the properties 
hive.merge.mapfiles, hive.merge.mapredfiles, hive.merge.size.per.task, and 
hive.merge.smallfiles.avgsize. If a merge is needed, Hive generates a 
MergeFileTask/MapWork to merge the files, and the property values are 
eventually stored in MapWork#maxSplitSize, MapWork#minSplitSize, 
MapWork#minSplitSizePerNode, and MapWork#minSplitSizePerRack.

But Hive does not use these settings when it submits the map task to Hadoop; 
that is, the corresponding Hadoop settings "mapred.max.split.size", 
"mapred.min.split.size.per.node", and "mapred.min.split.size.per.rack" are not 
set from the Hive values. So those Hive settings do not take effect for 
MergeFileTask.

Steps to reproduce:
The following SQL still produces many small files (less than 20MB):
{code:sql}
set hive.merge.mapredfiles=true;
set hive.merge.mapfiles=true;
set hive.merge.smallfiles.avgsize=5;
set hive.merge.size.per.task=10;
insert overwrite table foo partition(dt='20171203')
select * from bar;
{code}

To fix this problem, I think we should pass these properties to Hadoop in 
MergeFileTask. The following code works for me:

{code:java}
  // in MergeFileTask#execute()
  job.setInputFormat(work.getInputformatClass());
  job.setOutputFormat(HiveOutputFormatImpl.class);
  job.setMapperClass(MergeFileMapper.class);
  job.setMapOutputKeyClass(NullWritable.class);
  job.setMapOutputValueClass(NullWritable.class);
  job.setOutputKeyClass(NullWritable.class);
  job.setOutputValueClass(NullWritable.class);
  job.setNumReduceTasks(0);
  // set these properties
  job.setLong("mapred.max.split.size", work.getMaxSplitSize());
  job.setLong("mapred.min.split.size.per.rack", work.getMinSplitSizePerRack());
  job.setLong("mapred.min.split.size.per.node", work.getMinSplitSizePerNode());
{code}







[jira] [Created] (HIVE-18233) Adding SSPI support in the Hive JDBC driver

2017-12-06 Thread Rohit Rai Malhotra (JIRA)
Rohit Rai Malhotra created HIVE-18233:
-

 Summary: Adding SSPI support in the Hive JDBC driver
 Key: HIVE-18233
 URL: https://issues.apache.org/jira/browse/HIVE-18233
 Project: Hive
  Issue Type: New Feature
  Components: Hive, JDBC
Reporter: Rohit Rai Malhotra


Please add SSPI support (the Windows Kerberos authentication handler) to the 
Hive JDBC driver. An open-source implementation exists in the PostgreSQL JDBC 
driver here:
https://github.com/pgjdbc/pgjdbc/tree/master/pgjdbc/src/main/java/org/postgresql/sspi





Does anyone know how to deal with special case in DefaultRuleDispatcher?

2017-12-06 Thread Zhang, Liyun
Hi all:
  I am working on HIVE-17486 and have run into a problem with 
DefaultRuleDispatcher, which applies rules to the operator tree. It first 
computes the minimum cost (minCost) of a rule for a node (such as an operator 
in the operator tree). But how should it handle the case where two rules that 
apply to a node have the same minCost? As far as I can tell, the current code 
applies only one rule to the node. (You may say that 
DefaultRuleDispatcher#rules is a HashMap, designed to map one key to one 
value, but in my code I have changed rules to a multimap, which maps one key 
to multiple values.)

{code}

@Override
public Object dispatch(Node nd, Stack<Node> ndStack, Object... nodeOutputs)
    throws SemanticException {

  // find the firing rule
  // find the rule from the stack specified
  Rule rule = null;
  int minCost = Integer.MAX_VALUE;
  for (Rule r : procRules.keySet()) {
    int cost = r.cost(ndStack);
    // In the current code, only 1 rule will be applied
    if ((cost >= 0) && (cost <= minCost)) {
      minCost = cost;
      rule = r;
    }
  }

  NodeProcessor proc;

  if (rule == null) {
    proc = defaultProc;
  } else {
    proc = procRules.get(rule);
  }

  // Do nothing in case proc is null
  if (proc != null) {
    // Call the process function
    return proc.process(nd, ndStack, procCtx, nodeOutputs);
  } else {
    return null;
  }
}

{code}
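
One possible way to handle the tie, sketched outside of Hive's classes: in the loop above, the `<=` comparison means that when several rules share the minimum cost, the last one visited wins, and a HashMap's key set has no guaranteed iteration order. The sketch below collects every rule that ties for the minimum non-negative cost instead of keeping only one. The class name MinCostTieSketch, the helper cheapestRules, and the rule names are hypothetical and exist only for illustration; a real change would work on Rule and NodeProcessor and would still need to decide how to combine the results of several processors.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToIntFunction;

// Hypothetical stand-alone sketch; it does not use any Hive classes.
class MinCostTieSketch {

  /**
   * Returns all rules whose cost equals the minimum non-negative cost,
   * in the iteration order of the input. A negative cost is treated as
   * "rule does not match", mirroring the (cost >= 0) check in dispatch().
   */
  static <R> List<R> cheapestRules(Iterable<R> rules, ToIntFunction<R> costOf) {
    List<R> best = new ArrayList<>();
    int minCost = Integer.MAX_VALUE;
    for (R r : rules) {
      int cost = costOf.applyAsInt(r);
      if (cost < 0) {
        continue; // rule does not match this node
      }
      if (cost < minCost) {
        // strictly cheaper rule found: discard the previous candidates
        minCost = cost;
        best.clear();
      }
      if (cost == minCost) {
        best.add(r); // keep every rule that ties for the minimum
      }
    }
    return best;
  }

  public static void main(String[] args) {
    // Stand-in for "rule -> cost on the current node stack".
    Map<String, Integer> costs = new LinkedHashMap<>();
    costs.put("ruleA", 3);
    costs.put("ruleB", 1);
    costs.put("ruleC", -1);
    costs.put("ruleD", 1);
    System.out.println(cheapestRules(costs.keySet(), r -> costs.get(r)));
    // prints [ruleB, ruleD]
  }
}
```

With a multimap from rule to processors, the dispatcher could then look up and run the processors of every returned rule; an empty result would correspond to falling back to defaultProc.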





I would appreciate any suggestions.










Best Regards
Kelly Zhang/Zhang,Liyun