[jira] [Commented] (HIVE-11394) Enhance EXPLAIN display for vectorization

2016-10-15 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15579314#comment-15579314
 ] 

Matt McCline commented on HIVE-11394:
-

See HIVE-14981 for fix.

> Enhance EXPLAIN display for vectorization
> -
>
> Key: HIVE-11394
> URL: https://issues.apache.org/jira/browse/HIVE-11394
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-11394.01.patch, HIVE-11394.02.patch, 
> HIVE-11394.03.patch, HIVE-11394.04.patch, HIVE-11394.05.patch, 
> HIVE-11394.06.patch, HIVE-11394.07.patch, HIVE-11394.08.patch, 
> HIVE-11394.09.patch, HIVE-11394.091.patch, HIVE-11394.092.patch, 
> HIVE-11394.093.patch
>
>
> Add detail to the EXPLAIN output showing why Map and Reduce work is not 
> vectorized.
> The new syntax is: EXPLAIN VECTORIZATION \[ONLY\] 
> \[SUMMARY|OPERATOR|EXPRESSION|DETAIL\]
> The ONLY option suppresses most non-vectorization elements.
> SUMMARY shows vectorization information for the PLAN (whether vectorization 
> is enabled) and a summary of Map and Reduce work.
> OPERATOR shows vectorization information for operators, e.g. Filter 
> Vectorization.  It includes all of the SUMMARY information, too.
> EXPRESSION shows vectorization information for expressions, e.g. 
> predicateExpression.  It includes all of the SUMMARY and OPERATOR 
> information, too.
> DETAIL shows vectorization information at the most detailed level.  It 
> includes all of the SUMMARY, OPERATOR, and EXPRESSION information, too.
> The defaults for the optional clauses are non-ONLY and SUMMARY.
> ---
> Here are some examples:
> EXPLAIN VECTORIZATION example:
> (Note the PLAN VECTORIZATION, Map Vectorization, Reduce Vectorization 
> sections)
> Since SUMMARY is the default, this is the same as the output of EXPLAIN 
> VECTORIZATION SUMMARY.
> Under Reducer 3’s "Reduce Vectorization:" you’ll see
> notVectorizedReason: Aggregation Function UDF avg parameter expression for 
> GROUPBY operator: Data type struct of 
> Column\[VALUE._col2\] not supported
> For Reducer 2’s "Reduce Vectorization:" you’ll see "groupByVectorOutput:": 
> "false", which says the node has a GROUP BY with an AVG or some other 
> aggregator that outputs a non-PRIMITIVE type (e.g. STRUCT), so all 
> downstream operators run in row mode, i.e. without vector output.
> If "usesVectorUDFAdaptor:": "false" were true, it would say that at least 
> one vectorized expression uses VectorUDFAdaptor.
> And "allNative:": "false" will be true when all operators are native.  
> Today, GROUP BY and FILE SINK are not native.  MAP JOIN and REDUCE SINK are 
> conditionally native.  FILTER and SELECT are native.
> {code}
> PLAN VECTORIZATION:
>   enabled: true
>   enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
> Tez
> ...
>   Edges:
> Reducer 2 <- Map 1 (SIMPLE_EDGE)
> Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
> ...
>   Vertices:
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: alltypesorc
>   Statistics: Num rows: 12288 Data size: 36696 Basic stats: 
> COMPLETE Column stats: COMPLETE
>   Select Operator
> expressions: cint (type: int)
> outputColumnNames: cint
> Statistics: Num rows: 12288 Data size: 36696 Basic stats: 
> COMPLETE Column stats: COMPLETE
> Group By Operator
>   keys: cint (type: int)
>   mode: hash
>   outputColumnNames: _col0
>   Statistics: Num rows: 5775 Data size: 17248 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Reduce Output Operator
> key expressions: _col0 (type: int)
> sort order: +
> Map-reduce partition columns: _col0 (type: int)
> Statistics: Num rows: 5775 Data size: 17248 Basic 
> stats: COMPLETE Column stats: COMPLETE
> Execution mode: vectorized, llap
> LLAP IO: all inputs
> Map Vectorization:
> enabled: true
> enabledConditionsMet: 
> hive.vectorized.use.vectorized.input.format IS true
> groupByVectorOutput: true
> inputFileFormats: 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
> allNative: false
> usesVectorUDFAdaptor: false
> 

[jira] [Updated] (HIVE-14799) Query operation are not thread safe during its cancellation

2016-10-15 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-14799:
---
Attachment: HIVE-14799-branch-2.1.patch

Attach the patch for 2.1.

> Query operation are not thread safe during its cancellation
> ---
>
> Key: HIVE-14799
> URL: https://issues.apache.org/jira/browse/HIVE-14799
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-14799-branch-2.1.patch, HIVE-14799.1.patch, 
> HIVE-14799.2.patch, HIVE-14799.3.patch, HIVE-14799.4.patch, 
> HIVE-14799.5.patch, HIVE-14799.5.patch, HIVE-14799.6.patch, 
> HIVE-14799.6.patch, HIVE-14799.7.patch, HIVE-14799.patch
>
>
> When a query is cancelled either via Beeline (Ctrl-C) or the API call 
> TCLIService.Client.CancelOperation, SQLOperation.cancel is invoked in a 
> thread other than the one running the query, to close/destroy its 
> encapsulated Driver object. Neither SQLOperation nor Driver is thread-safe, 
> which can sometimes result in runtime exceptions such as NPE. The errors from 
> the running query are also not handled properly, likely leaving some 
> resources (files, locks, etc.) uncleaned after the query terminates.
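The fix needs to serialize cancellation against the query thread. A minimal sketch of that idea, guarding a shared driver object with a plain lock (class and method names here are illustrative, not Hive's actual SQLOperation/Driver code):

```java
import java.util.concurrent.locks.ReentrantLock;

// Sketch: make close/cancel mutually exclusive with the query thread so a
// cancel arriving from another thread cannot race the running operation.
class Operation {
  private final ReentrantLock stateLock = new ReentrantLock();
  private Object driver;                 // stands in for the encapsulated Driver
  private volatile boolean cancelled = false;

  void run() {
    stateLock.lock();
    try {
      if (cancelled) return;             // cancel() won the race; do nothing
      driver = new Object();             // "execute" the query
    } finally {
      stateLock.unlock();
    }
  }

  void cancel() {
    stateLock.lock();
    try {
      cancelled = true;
      driver = null;                     // close/destroy without racing run()
    } finally {
      stateLock.unlock();
    }
  }

  boolean isCancelled() { return cancelled; }
}
```

With this structure, cancel() can no longer observe the driver in a half-initialized state, at the cost of briefly blocking one thread behind the other.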



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14981) Eliminate unnecessary MapJoin restriction in HIVE-11394

2016-10-15 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15579292#comment-15579292
 ] 

Matt McCline commented on HIVE-14981:
-

It is ok.  It has a Native Vector MapJoin parent.  Normally, a regular Vector 
MapJoin creates a new output batch, so the column numbers are lower.  But the 
Native parent takes the original input batch and adds scratch columns to it.  
Thanks for noticing, but it is not a problem.
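As a rough illustration of why the column number jumps (an assumed numbering scheme, not the actual VectorizedRowBatch code): scratch columns are appended after whatever columns the batch already carries, so reusing a wide input batch yields higher scratch-column indices than building a fresh, narrow output batch.

```java
// Illustrative only: scratch columns get the next free 0-based slot after
// the columns already present in the batch being reused.
class BatchColumns {
  static int firstScratchColumn(int existingColumnCount) {
    return existingColumnCount;
  }
}
```

A fresh output batch carrying 3 projected columns would place its first scratch column at index 3; a reused 25-column input batch would place it at 25, matching the golden-file change discussed below.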

> Eliminate unnecessary MapJoin restriction in HIVE-11394
> ---
>
> Key: HIVE-14981
> URL: https://issues.apache.org/jira/browse/HIVE-14981
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.2.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-14981.01.patch
>
>
> The No Empty Key restriction for Native Vector MapJoin added with HIVE-11394 
> is unnecessary.  It caused the LLAP orc_llap.q test to time out on Hive QA 
> because the regular VectorMapJoinOperator is too slow.





[jira] [Commented] (HIVE-14927) Remove code duplication from tests in TestLdapAtnProviderWithMiniDS

2016-10-15 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15579245#comment-15579245
 ] 

Chaoyu Tang commented on HIVE-14927:


The test did not run successfully. Could you reattach the patch to kick off 
another run?

> Remove code duplication from tests in TestLdapAtnProviderWithMiniDS
> ---
>
> Key: HIVE-14927
> URL: https://issues.apache.org/jira/browse/HIVE-14927
> Project: Hive
>  Issue Type: Improvement
>  Components: Test
>Reporter: Illya Yalovyy
>Assignee: Illya Yalovyy
> Attachments: HIVE-14927.1.patch, HIVE-14927.2.patch
>
>
> * Extract inner class User and implement a proper builder for it.
> * Extract all common code to LdapAuthenticationTestCase class 
>   ** setting up the test case
>** executing test case
>** result validation
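The User extraction described above can be sketched with a conventional builder (field names are hypothetical; the real inner class in the test may carry different attributes):

```java
// Sketch of extracting an immutable User with a proper builder, as proposed.
// The id/password fields are illustrative assumptions, not the actual class.
final class User {
  private final String id;
  private final String password;

  private User(Builder b) {
    this.id = b.id;
    this.password = b.password;
  }

  String getId() { return id; }
  String getPassword() { return password; }

  static final class Builder {
    private String id;
    private String password;

    Builder setId(String id) { this.id = id; return this; }
    Builder setPassword(String password) { this.password = password; return this; }
    User build() { return new User(this); }
  }
}
```

Each test case then constructs its users fluently instead of repeating setup code, which is the duplication this JIRA targets.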





[jira] [Commented] (HIVE-14958) Improve the 'TestClass' did not produce a TEST-*.xml file message to include list of all qfiles in a batch, batch id

2016-10-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15579172#comment-15579172
 ] 

Hive QA commented on HIVE-14958:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12833622/HIVE-14958.02.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10534 tests 
executed
*Failed tests:*
{noformat}
TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_globallimit]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[order_null]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[union_fast_stats]
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz[0]
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[0]
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[1]
org.apache.hive.jdbc.authorization.TestJdbcWithSQLAuthorization.testBlackListedUdfUsage
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1589/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1589/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-1589/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12833622 - PreCommit-HIVE-Build

> Improve the 'TestClass' did not produce a TEST-*.xml file message to include 
> list of all qfiles in a batch, batch id
> 
>
> Key: HIVE-14958
> URL: https://issues.apache.org/jira/browse/HIVE-14958
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14958.01.patch, HIVE-14958.02.patch
>
>
> Should make it easier to hunt down the logs.





[jira] [Commented] (HIVE-14799) Query operation are not thread safe during its cancellation

2016-10-15 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15579154#comment-15579154
 ] 

Lefty Leverenz commented on HIVE-14799:
---

Oops, sorry about the nudge then.

> Query operation are not thread safe during its cancellation
> ---
>
> Key: HIVE-14799
> URL: https://issues.apache.org/jira/browse/HIVE-14799
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-14799.1.patch, HIVE-14799.2.patch, 
> HIVE-14799.3.patch, HIVE-14799.4.patch, HIVE-14799.5.patch, 
> HIVE-14799.5.patch, HIVE-14799.6.patch, HIVE-14799.6.patch, 
> HIVE-14799.7.patch, HIVE-14799.patch
>
>
> When a query is cancelled either via Beeline (Ctrl-C) or the API call 
> TCLIService.Client.CancelOperation, SQLOperation.cancel is invoked in a 
> thread other than the one running the query, to close/destroy its 
> encapsulated Driver object. Neither SQLOperation nor Driver is thread-safe, 
> which can sometimes result in runtime exceptions such as NPE. The errors from 
> the running query are also not handled properly, likely leaving some 
> resources (files, locks, etc.) uncleaned after the query terminates.





[jira] [Commented] (HIVE-14966) JDBC: Make cookie-auth work in HTTP mode

2016-10-15 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15579148#comment-15579148
 ] 

Lefty Leverenz commented on HIVE-14966:
---

Cool.  Thanks Gopal.

> JDBC: Make cookie-auth work in HTTP mode
> 
>
> Key: HIVE-14966
> URL: https://issues.apache.org/jira/browse/HIVE-14966
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 1.2.1, 2.1.0
>Reporter: Gopal V
>Assignee: Gopal V
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-14966.1.patch, HIVE-14966.2.patch
>
>
> HiveServer2 cookie-auth is non-functional and forces authentication to be 
> repeated for the status check loop, the row fetch loop, and the get logs 
> loop.
> The repeated auth in the fetch loop is a performance issue, but it is also 
> causing occasional DoS responses from the remote auth backend if it is not 
> using a local /etc/passwd.
> Once made functional, the HTTP-Cookie auth will behave similarly to the 
> binary protocol, authenticating exactly once per JDBC session and causing no 
> further load on the authentication backend irrespective of how many rows are 
> returned from the JDBC request.
> This is due to the fact that the cookies are not sent out with flags 
> matching the SSL usage.
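The flag mismatch in that last paragraph is the kind of thing fixed by deriving the cookie's Secure flag from whether SSL is actually in use rather than from a separate knob. A hedged sketch using the standard java.net.HttpCookie API (illustrative only; not HiveServer2's actual cookie code, and the cookie name is made up):

```java
import java.net.HttpCookie;

// Sketch: tie the cookie's Secure flag to the transport actually in use so
// the client will replay the cookie instead of silently dropping it.
class AuthCookies {
  static HttpCookie makeAuthCookie(String name, String value, boolean sslInUse) {
    HttpCookie c = new HttpCookie(name, value);
    c.setSecure(sslInUse);   // Secure only when the connection really is SSL
    c.setHttpOnly(true);     // keep the auth cookie away from scripts
    return c;
  }
}
```

A Secure cookie sent over plain HTTP is never replayed by the client, which would reproduce exactly the repeated-auth behavior described above.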





[jira] [Commented] (HIVE-14799) Query operation are not thread safe during its cancellation

2016-10-15 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15579146#comment-15579146
 ] 

Chaoyu Tang commented on HIVE-14799:


[~leftylev] Thanks for that. I am working on the backport of the patch to 
2.1.1, and will update the status/fix version when the JIRA is resolved.

> Query operation are not thread safe during its cancellation
> ---
>
> Key: HIVE-14799
> URL: https://issues.apache.org/jira/browse/HIVE-14799
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-14799.1.patch, HIVE-14799.2.patch, 
> HIVE-14799.3.patch, HIVE-14799.4.patch, HIVE-14799.5.patch, 
> HIVE-14799.5.patch, HIVE-14799.6.patch, HIVE-14799.6.patch, 
> HIVE-14799.7.patch, HIVE-14799.patch
>
>
> When a query is cancelled either via Beeline (Ctrl-C) or the API call 
> TCLIService.Client.CancelOperation, SQLOperation.cancel is invoked in a 
> thread other than the one running the query, to close/destroy its 
> encapsulated Driver object. Neither SQLOperation nor Driver is thread-safe, 
> which can sometimes result in runtime exceptions such as NPE. The errors from 
> the running query are also not handled properly, likely leaving some 
> resources (files, locks, etc.) uncleaned after the query terminates.





[jira] [Commented] (HIVE-14966) JDBC: Make cookie-auth work in HTTP mode

2016-10-15 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15579111#comment-15579111
 ] 

Gopal V commented on HIVE-14966:


Yes, [~leftylev] - with this patch the configuration disappears & leaves no 
ability for a user to misconfigure this.

> JDBC: Make cookie-auth work in HTTP mode
> 
>
> Key: HIVE-14966
> URL: https://issues.apache.org/jira/browse/HIVE-14966
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 1.2.1, 2.1.0
>Reporter: Gopal V
>Assignee: Gopal V
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-14966.1.patch, HIVE-14966.2.patch
>
>
> HiveServer2 cookie-auth is non-functional and forces authentication to be 
> repeated for the status check loop, the row fetch loop, and the get logs 
> loop.
> The repeated auth in the fetch loop is a performance issue, but it is also 
> causing occasional DoS responses from the remote auth backend if it is not 
> using a local /etc/passwd.
> Once made functional, the HTTP-Cookie auth will behave similarly to the 
> binary protocol, authenticating exactly once per JDBC session and causing no 
> further load on the authentication backend irrespective of how many rows are 
> returned from the JDBC request.
> This is due to the fact that the cookies are not sent out with flags 
> matching the SSL usage.





[jira] [Commented] (HIVE-14799) Query operation are not thread safe during its cancellation

2016-10-15 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15579106#comment-15579106
 ] 

Lefty Leverenz commented on HIVE-14799:
---

[~ctang.ma] since you've committed this to master, please update the status and 
fix version.  Thanks.

Commit 1901e3a6ab97c150905c04c591b33b2c640e4b87.

> Query operation are not thread safe during its cancellation
> ---
>
> Key: HIVE-14799
> URL: https://issues.apache.org/jira/browse/HIVE-14799
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-14799.1.patch, HIVE-14799.2.patch, 
> HIVE-14799.3.patch, HIVE-14799.4.patch, HIVE-14799.5.patch, 
> HIVE-14799.5.patch, HIVE-14799.6.patch, HIVE-14799.6.patch, 
> HIVE-14799.7.patch, HIVE-14799.patch
>
>
> When a query is cancelled either via Beeline (Ctrl-C) or the API call 
> TCLIService.Client.CancelOperation, SQLOperation.cancel is invoked in a 
> thread other than the one running the query, to close/destroy its 
> encapsulated Driver object. Neither SQLOperation nor Driver is thread-safe, 
> which can sometimes result in runtime exceptions such as NPE. The errors from 
> the running query are also not handled properly, likely leaving some 
> resources (files, locks, etc.) uncleaned after the query terminates.





[jira] [Commented] (HIVE-14966) JDBC: Make cookie-auth work in HTTP mode

2016-10-15 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15579088#comment-15579088
 ] 

Lefty Leverenz commented on HIVE-14966:
---

Does this need to be documented in the wiki?  If so, where?

* [Setting Up HiveServer2 -- Running in HTTP Mode | 
https://cwiki.apache.org/confluence/display/Hive/Setting+Up+HiveServer2#SettingUpHiveServer2-RunninginHTTPMode]
* [HiveServer2 Clients -- Supporting Cookie Replay in HTTP Mode | 
https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-SupportingCookieReplayinHTTPMode]

Adding a TODOC2.2 label because (at least) the wiki needs to be updated for the 
deprecation of *hive.server2.thrift.http.cookie.is.secure*.

* [Configuration Properties -- hive.server2.thrift.http.cookie.is.secure | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.server2.thrift.http.cookie.is.secure]

> JDBC: Make cookie-auth work in HTTP mode
> 
>
> Key: HIVE-14966
> URL: https://issues.apache.org/jira/browse/HIVE-14966
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 1.2.1, 2.1.0
>Reporter: Gopal V
>Assignee: Gopal V
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-14966.1.patch, HIVE-14966.2.patch
>
>
> HiveServer2 cookie-auth is non-functional and forces authentication to be 
> repeated for the status check loop, the row fetch loop, and the get logs 
> loop.
> The repeated auth in the fetch loop is a performance issue, but it is also 
> causing occasional DoS responses from the remote auth backend if it is not 
> using a local /etc/passwd.
> Once made functional, the HTTP-Cookie auth will behave similarly to the 
> binary protocol, authenticating exactly once per JDBC session and causing no 
> further load on the authentication backend irrespective of how many rows are 
> returned from the JDBC request.
> This is due to the fact that the cookies are not sent out with flags 
> matching the SSL usage.





[jira] [Updated] (HIVE-14966) JDBC: Make cookie-auth work in HTTP mode

2016-10-15 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-14966:
--
Labels: TODOC2.2  (was: )

> JDBC: Make cookie-auth work in HTTP mode
> 
>
> Key: HIVE-14966
> URL: https://issues.apache.org/jira/browse/HIVE-14966
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 1.2.1, 2.1.0
>Reporter: Gopal V
>Assignee: Gopal V
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-14966.1.patch, HIVE-14966.2.patch
>
>
> HiveServer2 cookie-auth is non-functional and forces authentication to be 
> repeated for the status check loop, the row fetch loop, and the get logs 
> loop.
> The repeated auth in the fetch loop is a performance issue, but it is also 
> causing occasional DoS responses from the remote auth backend if it is not 
> using a local /etc/passwd.
> Once made functional, the HTTP-Cookie auth will behave similarly to the 
> binary protocol, authenticating exactly once per JDBC session and causing no 
> further load on the authentication backend irrespective of how many rows are 
> returned from the JDBC request.
> This is due to the fact that the cookies are not sent out with flags 
> matching the SSL usage.





[jira] [Updated] (HIVE-14980) Minor compaction when triggered simultaniously on the same table/partition deletes data

2016-10-15 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-14980:
---
Component/s: Transactions

> Minor compaction when triggered simultaniously on the same table/partition 
> deletes data
> ---
>
> Key: HIVE-14980
> URL: https://issues.apache.org/jira/browse/HIVE-14980
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Transactions
>Affects Versions: 2.1.0
>Reporter: Mahipal Jupalli
>Assignee: Mahipal Jupalli
>Priority: Critical
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> I have two tables (TABLEA, TABLEB). If I manually trigger compaction after 
> each INSERT into TABLEB from TABLEA, compactions are triggered asynchronously 
> on a random metastore and step on each other, which causes data to be 
> deleted.
> Example: 
> TABLEA has 10k rows. 
> insert into mj.tableb select * from mj.tablea;
> alter table mj.tableb compact 'MINOR';
> insert into mj.tableb select * from mj.tablea;
> alter table mj.tableb compact 'MINOR';
> Once all the compactions are complete, I should ideally see 20k rows in 
> TABLEB, but I see only 10k rows: only the rows INSERTED before the last 
> compaction persist; the older rows are deleted (I believe the old delta 
> files are deleted). 
> To further confirm the bug, if I do only one compaction after the two 
> inserts, I see 20k rows in TABLEB.
> Proposed fix:
> I have identified the bug in the code; it requires an additional check in 
> the org.apache.hadoop.hive.ql.txn.compactor.Worker class for any active 
> compactions on the table/partition. I will share the details of the fix once 
> I have tested it.





[jira] [Commented] (HIVE-14981) Eliminate unnecessary MapJoin restriction in HIVE-11394

2016-10-15 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15579041#comment-15579041
 ] 

Gopal V commented on HIVE-14981:


[~mmccline]: LGTM, +1 tests pending.

I don't think the following golden file change is a problem, but it would be 
good to clarify on this JIRA how the 3 turns into 25.

{code}
 Group By Operator
   aggregations: count(1)
   Group By Vectorization:
-  aggregators: 
VectorUDAFCount(ConstantVectorExpression(val 1) -> 3:long) -> bigint
+  aggregators: 
VectorUDAFCount(ConstantVectorExpression(val 1) -> 25:long) -> bigint
{code}

Is that the column id # in lieu of a constant key like 0?

> Eliminate unnecessary MapJoin restriction in HIVE-11394
> ---
>
> Key: HIVE-14981
> URL: https://issues.apache.org/jira/browse/HIVE-14981
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.2.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-14981.01.patch
>
>
> The No Empty Key restriction for Native Vector MapJoin added with HIVE-11394 
> is unnecessary.  It caused the LLAP orc_llap.q test to time out on Hive QA 
> because the regular VectorMapJoinOperator is too slow.





[jira] [Updated] (HIVE-14981) Eliminate unnecessary MapJoin restriction in HIVE-11394

2016-10-15 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-14981:

Attachment: HIVE-14981.01.patch

> Eliminate unnecessary MapJoin restriction in HIVE-11394
> ---
>
> Key: HIVE-14981
> URL: https://issues.apache.org/jira/browse/HIVE-14981
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.2.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-14981.01.patch
>
>
> The No Empty Key restriction for Native Vector MapJoin added with HIVE-11394 
> is unnecessary.  It caused the LLAP orc_llap.q test to time out on Hive QA 
> because the regular VectorMapJoinOperator is too slow.





[jira] [Updated] (HIVE-14981) Eliminate unnecessary MapJoin restriction in HIVE-11394

2016-10-15 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-14981:

Status: Patch Available  (was: Open)

> Eliminate unnecessary MapJoin restriction in HIVE-11394
> ---
>
> Key: HIVE-14981
> URL: https://issues.apache.org/jira/browse/HIVE-14981
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.2.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-14981.01.patch
>
>
> The No Empty Key restriction for Native Vector MapJoin added with HIVE-11394 
> is unnecessary.  It caused the LLAP orc_llap.q test to time out on Hive QA 
> because the regular VectorMapJoinOperator is too slow.





[jira] [Comment Edited] (HIVE-14980) Minor compaction when triggered simultaniously on the same table/partition deletes data

2016-10-15 Thread Mahipal Jupalli (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15578923#comment-15578923
 ] 

Mahipal Jupalli edited comment on HIVE-14980 at 10/15/16 11:28 PM:
---

Hi,

My idea is to replicate the same checks from the 
org.apache.hadoop.hive.ql.txn.compactor.Initiator.java to the 
org.apache.hadoop.hive.ql.txn.compactor.Worker.java logic.

{code:title=org.apache.hadoop.hive.ql.txn.compactor.Initiator.java|borderStyle=solid}
  // Figure out if there are any currently running compactions on the same
  // table or partition.
  private boolean lookForCurrentCompactions(ShowCompactResponse compactions,
                                            CompactionInfo ci) {
    if (compactions.getCompacts() != null) {
      for (ShowCompactResponseElement e : compactions.getCompacts()) {
        if ((e.getState().equals(TxnStore.WORKING_RESPONSE) ||
             e.getState().equals(TxnStore.INITIATED_RESPONSE)) &&
            e.getDbname().equals(ci.dbname) &&
            e.getTablename().equals(ci.tableName) &&
            (e.getPartitionname() == null && ci.partName == null ||
             e.getPartitionname().equals(ci.partName))) {
          return true;
        }
      }
    }
    return false;
  }

  public void run() {
    //...
    if (lookForCurrentCompactions(currentCompactions, ci)) {
      LOG.debug("Found currently initiated or working compaction for " +
          ci.getFullPartitionName() + " so we will not initiate another compaction");
      continue;
    }
    //...
  }
{code}

{code:title=org.apache.hadoop.hive.ql.txn.compactor.Worker.java|borderStyle=solid}
  public void run() {
    //...
    // This chicanery is to get around the fact that the table needs to be
    // final in order to go into the doAs below.
    final Table t = t1;

    ShowCompactResponse currentCompactions =
        txnHandler.showCompact(new ShowCompactRequest());
    if (lookForCurrentCompactions(currentCompactions, ci)) {
      LOG.debug("Found currently initiated or working compaction for " +
          ci.getFullPartitionName() + " so we will not initiate another compaction");
      continue;
    }

    // Find the partition we will be working with, if there is one.
    Partition p = null;
    //...
  }

  // Figure out if there are any currently running compactions on the same
  // table or partition.
  private boolean lookForCurrentCompactions(ShowCompactResponse compactions,
                                            CompactionInfo ci) {
    if (compactions.getCompacts() != null) {
      for (ShowCompactResponseElement e : compactions.getCompacts()) {
        if ((e.getState().equals(TxnStore.WORKING_RESPONSE) ||
             e.getState().equals(TxnStore.INITIATED_RESPONSE)) &&
            e.getDbname().equals(ci.dbname) &&
            e.getTablename().equals(ci.tableName) &&
            (e.getPartitionname() == null && ci.partName == null ||
             e.getPartitionname().equals(ci.partName))) {
          return true;
        }
      }
    }
    return false;
  }
{code}

Please let me know if this is the correct approach.


[jira] [Commented] (HIVE-14980) Minor compaction when triggered simultaniously on the same table/partition deletes data

2016-10-15 Thread Mahipal Jupalli (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15578923#comment-15578923
 ] 

Mahipal Jupalli commented on HIVE-14980:


Hi,

My idea is to replicate the same checks from the Initiator to the Worker logic.

{code:title=org.apache.hadoop.hive.ql.txn.compactor.Initiator.java|borderStyle=solid}
// Figure out if there are any currently running compactions on the same table or partition.
private boolean lookForCurrentCompactions(ShowCompactResponse compactions,
    CompactionInfo ci) {
  if (compactions.getCompacts() != null) {
    for (ShowCompactResponseElement e : compactions.getCompacts()) {
      if ((e.getState().equals(TxnStore.WORKING_RESPONSE) ||
           e.getState().equals(TxnStore.INITIATED_RESPONSE)) &&
          e.getDbname().equals(ci.dbname) &&
          e.getTablename().equals(ci.tableName) &&
          (e.getPartitionname() == null && ci.partName == null ||
           e.getPartitionname().equals(ci.partName))) {
        return true;
      }
    }
  }
  return false;
}

public void run() {
  //...
  if (lookForCurrentCompactions(currentCompactions, ci)) {
    LOG.debug("Found currently initiated or working compaction for " +
        ci.getFullPartitionName() + " so we will not initiate another compaction");
    continue;
  }
  //...
}
{code}

{code:title=org.apache.hadoop.hive.ql.txn.compactor.Worker.java|borderStyle=solid}
public void run() {
  //...
  // This chicanery is to get around the fact that the table needs to be final
  // in order to go into the doAs below.
  final Table t = t1;

  ShowCompactResponse currentCompactions = txnHandler.showCompact(new ShowCompactRequest());
  if (lookForCurrentCompactions(currentCompactions, ci)) {
    LOG.debug("Found currently initiated or working compaction for " +
        ci.getFullPartitionName() + " so we will not initiate another compaction");
    continue;
  }

  // Find the partition we will be working with, if there is one.
  Partition p = null;
  //...
}

// Figure out if there are any currently running compactions on the same table or partition.
private boolean lookForCurrentCompactions(ShowCompactResponse compactions,
    CompactionInfo ci) {
  if (compactions.getCompacts() != null) {
    for (ShowCompactResponseElement e : compactions.getCompacts()) {
      if ((e.getState().equals(TxnStore.WORKING_RESPONSE) ||
           e.getState().equals(TxnStore.INITIATED_RESPONSE)) &&
          e.getDbname().equals(ci.dbname) &&
          e.getTablename().equals(ci.tableName) &&
          (e.getPartitionname() == null && ci.partName == null ||
           e.getPartitionname().equals(ci.partName))) {
        return true;
      }
    }
  }
  return false;
}
//...
{code}

Please let me know if this is the correct approach.
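For illustration, the guard logic above can be exercised outside Hive with simplified stand-in types. These records are hypothetical substitutes for the real metastore classes, and a null-safety guard on the partition name is an addition of this sketch, not part of the original snippet:

```java
import java.util.Arrays;
import java.util.List;

// Simplified stand-ins for the metastore types; field names mirror the
// snippet above, but this is NOT the actual Hive API.
class CompactionGuardDemo {
    static final String WORKING = "working";
    static final String INITIATED = "initiated";

    record Compact(String state, String db, String table, String part) {}
    record Target(String db, String table, String part) {}

    // Same shape as lookForCurrentCompactions: true if a compaction on the
    // same table/partition is already initiated or working. The extra
    // e.part() != null check (an addition) avoids an NPE when only one side
    // is partitioned.
    static boolean alreadyCompacting(List<Compact> compacts, Target ci) {
        for (Compact e : compacts) {
            if ((e.state().equals(WORKING) || e.state().equals(INITIATED))
                    && e.db().equals(ci.db())
                    && e.table().equals(ci.table())
                    && (e.part() == null && ci.part() == null
                        || e.part() != null && e.part().equals(ci.part()))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Compact> running =
            Arrays.asList(new Compact(WORKING, "mj", "tableb", null));
        // A second minor compaction on mj.tableb would be skipped...
        System.out.println(alreadyCompacting(running, new Target("mj", "tableb", null)));
        // ...while a different table is unaffected.
        System.out.println(alreadyCompacting(running, new Target("mj", "tablea", null)));
    }
}
```

With a check like this in both Initiator and Worker, the second of two overlapping minor compactions would bail out instead of rewriting the same deltas.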

> Minor compaction when triggered simultaneously on the same table/partition 
> deletes data
> ---
>
> Key: HIVE-14980
> URL: https://issues.apache.org/jira/browse/HIVE-14980
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.1.0
>Reporter: Mahipal Jupalli
>Assignee: Mahipal Jupalli
>Priority: Critical
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> I have two tables (TABLEA, TABLEB). If I manually trigger compaction after 
> each INSERT into TABLEB from TABLEA, compactions are triggered on random 
> metastore asynchronously and are stepping on each other which is causing the 
> data to be deleted.
> Example here: 
> TABLEA - has 10k rows. 
> insert into mj.tableb select * from mj.tablea;
> alter table mj.tableb compact 'MINOR';
> insert into mj.tableb select * from mj.tablea;
> alter table mj.tableb compact 'MINOR';
> Once all the compactions are complete, I should ideally see 20k rows in 
> TABLEB. But I see only 10k rows (Only the rows INSERTED before the last 
> compaction persist, the old rows are deleted. I believe the old delta files 
> are deleted). 
> To further confirm the bug, if I do only one compaction after two inserts, I 
> see 20k rows in TABLEB.
> Proposed Fix:
> I have identified the bug in the code, it requires an additional check in the 
> org.apache.hadoop.hive.ql.txn.compactor.Worker class to check for any active 
> compactions on the table/partition. I will share the details of the fix once 
> I test it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14958) Improve the 'TestClass' did not produce a TEST-*.xml file message to include list of all qfiles in a batch, batch id

2016-10-15 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14958:
--
Attachment: HIVE-14958.02.patch

Small modification to include a message saying "likely timed out", and a tab.

> Improve the 'TestClass' did not produce a TEST-*.xml file message to include 
> list of all qfiles in a batch, batch id
> 
>
> Key: HIVE-14958
> URL: https://issues.apache.org/jira/browse/HIVE-14958
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14958.01.patch, HIVE-14958.02.patch
>
>
> Should make it easier to hunt down the logs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14958) Improve the 'TestClass' did not produce a TEST-*.xml file message to include list of all qfiles in a batch, batch id

2016-10-15 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15578912#comment-15578912
 ] 

Siddharth Seth commented on HIVE-14958:
---

e.g. output
{code}
TestMinimrCliDriver - did not produce a TEST-*.xml file (batchId=99)
[infer_bucket_sort_merge.q,infer_bucket_sort_num_buckets.q,infer_bucket_sort_reducers_power_two.q]
{code}
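The message layout shown above could be assembled along these lines; the helper below is purely illustrative and not the actual ptest code:

```java
import java.util.List;

// Illustrative only: builds a "did not produce a TEST-*.xml file" message
// that includes the batch id and the qfiles in the batch, matching the
// example output above.
class BatchMessageDemo {
    static String noXmlMessage(String testClass, int batchId, List<String> qfiles) {
        return testClass + " - did not produce a TEST-*.xml file (batchId="
            + batchId + ")\n\t[" + String.join(",", qfiles) + "]";
    }

    public static void main(String[] args) {
        System.out.println(noXmlMessage("TestMinimrCliDriver", 99,
            List.of("infer_bucket_sort_merge.q",
                    "infer_bucket_sort_num_buckets.q",
                    "infer_bucket_sort_reducers_power_two.q")));
    }
}
```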

cc [~prasanth_j], [~spena] for review.

> Improve the 'TestClass' did not produce a TEST-*.xml file message to include 
> list of all qfiles in a batch, batch id
> 
>
> Key: HIVE-14958
> URL: https://issues.apache.org/jira/browse/HIVE-14958
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14958.01.patch
>
>
> Should make it easier to hunt down the logs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14958) Improve the 'TestClass' did not produce a TEST-*.xml file message to include list of all qfiles in a batch, batch id

2016-10-15 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14958:
--
Attachment: HIVE-14958.01.patch

> Improve the 'TestClass' did not produce a TEST-*.xml file message to include 
> list of all qfiles in a batch, batch id
> 
>
> Key: HIVE-14958
> URL: https://issues.apache.org/jira/browse/HIVE-14958
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14958.01.patch
>
>
> Should make it easier to hunt down the logs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14958) Improve the 'TestClass' did not produce a TEST-*.xml file message to include list of all qfiles in a batch, batch id

2016-10-15 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14958:
--
Status: Patch Available  (was: Open)

> Improve the 'TestClass' did not produce a TEST-*.xml file message to include 
> list of all qfiles in a batch, batch id
> 
>
> Key: HIVE-14958
> URL: https://issues.apache.org/jira/browse/HIVE-14958
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14958.01.patch
>
>
> Should make it easier to hunt down the logs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14980) Minor compaction when triggered simultaneously on the same table/partition deletes data

2016-10-15 Thread Mahipal Jupalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahipal Jupalli updated HIVE-14980:
---
Description: 
I have two tables (TABLEA, TABLEB). If I manually trigger compaction after each 
INSERT into TABLEB from TABLEA, compactions are triggered on random metastore 
asynchronously and are stepping on each other which is causing the data to be 
deleted.

Example here: 
TABLEA - has 10k rows. 

insert into mj.tableb select * from mj.tablea;
alter table mj.tableb compact 'MINOR';
insert into mj.tableb select * from mj.tablea;
alter table mj.tableb compact 'MINOR';

Once all the compactions are complete, I should ideally see 20k rows in TABLEB. 
But I see only 10k rows (Only the rows INSERTED before the last compaction 
persist, the old rows are deleted. I believe the old delta files are deleted). 

To further confirm the bug, if I do only one compaction after two inserts, I 
see 20k rows in TABLEB.

Proposed Fix:
I have identified the bug in the code, it requires an additional check in the 
org.apache.hadoop.hive.ql.txn.compactor.Worker class to check for any active 
compactions on the table/partition. I will share the details of the fix once I 
test it.

  was:
I have two tables (TABLEA, TABLEB). If I manually trigger compaction after each 
INSERT into TABLEB from TABLEA, compactions are triggered on random metastore 
asynchronously and are stepping on each other which is causing the data to be 
deleted.

Example here: 
TABLEA - has 10k rows. 

insert into mj.tableb select * from mj.tablea;
alter table mj.tableb compact 'MINOR';
insert into mj.tableb select * from mj.tablea;
alter table mj.tableb compact 'MINOR';

Once all the compactions are complete, I should ideally see 20k rows in the 
table. But I see only 10k rows (Only the rows INSERTED before the last 
compaction persist, the old rows are deleted. I believe the old delta files are 
deleted). 

To further confirm the bug, if I do only one compaction after two inserts, I 
see 20k rows in TABLEB.

Proposed Fix:
I have identified the bug in the code, it requires an additional check in the 
org.apache.hadoop.hive.ql.txn.compactor.Worker class to check for any active 
compactions on the table/partition. I will share the details of the fix once I 
test it.


> Minor compaction when triggered simultaneously on the same table/partition 
> deletes data
> ---
>
> Key: HIVE-14980
> URL: https://issues.apache.org/jira/browse/HIVE-14980
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.1.0
>Reporter: Mahipal Jupalli
>Assignee: Mahipal Jupalli
>Priority: Critical
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> I have two tables (TABLEA, TABLEB). If I manually trigger compaction after 
> each INSERT into TABLEB from TABLEA, compactions are triggered on random 
> metastore asynchronously and are stepping on each other which is causing the 
> data to be deleted.
> Example here: 
> TABLEA - has 10k rows. 
> insert into mj.tableb select * from mj.tablea;
> alter table mj.tableb compact 'MINOR';
> insert into mj.tableb select * from mj.tablea;
> alter table mj.tableb compact 'MINOR';
> Once all the compactions are complete, I should ideally see 20k rows in 
> TABLEB. But I see only 10k rows (Only the rows INSERTED before the last 
> compaction persist, the old rows are deleted. I believe the old delta files 
> are deleted). 
> To further confirm the bug, if I do only one compaction after two inserts, I 
> see 20k rows in TABLEB.
> Proposed Fix:
> I have identified the bug in the code, it requires an additional check in the 
> org.apache.hadoop.hive.ql.txn.compactor.Worker class to check for any active 
> compactions on the table/partition. I will 'share the details of the fix once 
> I test it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14980) Minor compaction when triggered simultaneously on the same table/partition deletes data

2016-10-15 Thread Mahipal Jupalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahipal Jupalli updated HIVE-14980:
---
Labels:   (was: newbie patch-pending)

> Minor compaction when triggered simultaneously on the same table/partition 
> deletes data
> ---
>
> Key: HIVE-14980
> URL: https://issues.apache.org/jira/browse/HIVE-14980
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.1.0
>Reporter: Mahipal Jupalli
>Assignee: Mahipal Jupalli
>Priority: Critical
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> I have two tables (TABLEA, TABLEB). If I manually trigger compaction after 
> each INSERT into TABLEB from TABLEA, compactions are triggered on random 
> metastore asynchronously and are stepping on each other which is causing the 
> data to be deleted.
> Example here: 
> TABLEA - has 10k rows. 
> insert into mj.tableb select * from mj.tablea;
> alter table mj.tableb compact 'MINOR';
> insert into mj.tableb select * from mj.tablea;
> alter table mj.tableb compact 'MINOR';
> Once all the compactions are complete, I should ideally see 20k rows in the 
> table. But I see only 10k rows (Only the rows INSERTED before the last 
> compaction persist, the old rows are deleted. I believe the old delta files 
> are deleted). 
> To further confirm the bug, if I do only one compaction after two inserts, I 
> see 20k rows in TABLEB.
> Proposed Fix:
> I have identified the bug in the code, it requires an additional check in the 
> org.apache.hadoop.hive.ql.txn.compactor.Worker class to check for any active 
> compactions on the table/partition. I will share the details of the fix once 
> I test it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13557) Make interval keyword optional while specifying DAY in interval arithmetic

2016-10-15 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15578800#comment-15578800
 ] 

Zoltan Haindrich commented on HIVE-13557:
-

I've posted the first version.

Instead of trying to force the existing interval evaluation logic to handle 
these cases too, I've added interval evaluation logic inside a UDF. I think it 
would be good to use the same thing to evaluate the old {{intervalLiteral}}'s 
outputs - however, that would require removing intervalLiteral from the 
'constant' group. I think the only feature that would be missing after that is 
the ability to specify skewed columns by intervalLiterals.

Using a UDF means the interval argument can also be a column reference - if I 
interpreted SQL:2011 correctly, that's desired by the standard.

Note: currently the parentheses are mandatory, but if we merge 
{{intervalExpression}} with {{intervalLiteral}} they won't be needed anymore. I 
haven't found any reference saying these parentheses must be mandatory - it's a 
large document... I lose my mind every time I open it ;)

> Make interval keyword optional while specifying DAY in interval arithmetic
> --
>
> Key: HIVE-13557
> URL: https://issues.apache.org/jira/browse/HIVE-13557
> Project: Hive
>  Issue Type: Sub-task
>  Components: Types
>Reporter: Ashutosh Chauhan
>Assignee: Zoltan Haindrich
> Attachments: HIVE-13557.1.patch
>
>
> Currently we support expressions like: {code}
> WHERE SOLD_DATE BETWEEN ((DATE('2000-01-31'))  - INTERVAL '30' DAY) AND 
> DATE('2000-01-31')
> {code}
> We should support:
> {code}
> WHERE SOLD_DATE BETWEEN ((DATE('2000-01-31')) + (-30) DAY) AND 
> DATE('2000-01-31')
> {code}
>   



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13557) Make interval keyword optional while specifying DAY in interval arithmetic

2016-10-15 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-13557:

Attachment: HIVE-13557.1.patch

> Make interval keyword optional while specifying DAY in interval arithmetic
> --
>
> Key: HIVE-13557
> URL: https://issues.apache.org/jira/browse/HIVE-13557
> Project: Hive
>  Issue Type: Sub-task
>  Components: Types
>Reporter: Ashutosh Chauhan
>Assignee: Zoltan Haindrich
> Attachments: HIVE-13557.1.patch
>
>
> Currently we support expressions like: {code}
> WHERE SOLD_DATE BETWEEN ((DATE('2000-01-31'))  - INTERVAL '30' DAY) AND 
> DATE('2000-01-31')
> {code}
> We should support:
> {code}
> WHERE SOLD_DATE BETWEEN ((DATE('2000-01-31')) + (-30) DAY) AND 
> DATE('2000-01-31')
> {code}
>   



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14957) HiveSortLimitPullUpConstantsRule misses branches when parent operator is Union

2016-10-15 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15578590#comment-15578590
 ] 

Pengcheng Xiong commented on HIVE-14957:


also cc'ing [~ashutoshc]

> HiveSortLimitPullUpConstantsRule misses branches when parent operator is Union
> --
>
> Key: HIVE-14957
> URL: https://issues.apache.org/jira/browse/HIVE-14957
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14957.01.patch, HIVE-14957.02.patch
>
>
> {code}
> call.transformTo(parent.copy(parent.getTraitSet(), 
> ImmutableList.of(relBuilder.build())));
> {code}
> When the parent is a union operator with 2 inputs, parent.copy will 
> only copy the input that has the SortLimit and ignore the other branches.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14957) HiveSortLimitPullUpConstantsRule misses branches when parent operator is Union

2016-10-15 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15578579#comment-15578579
 ] 

Pengcheng Xiong commented on HIVE-14957:


[~jcamachorodriguez], yes, I agree that a test case would absolutely be better. 
I had already considered this when I found the issue. My concern is that this 
bug is exposed only when we do a project-remove (see my patch for the 
intersect-merge rule). It is not reproducible on current master given our 
current sequence of optimization rules. Thus, I would prefer to just backport 
the code change without q tests; for the future release, HIVE-12765 has test 
cases. How does that sound to you?

> HiveSortLimitPullUpConstantsRule misses branches when parent operator is Union
> --
>
> Key: HIVE-14957
> URL: https://issues.apache.org/jira/browse/HIVE-14957
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14957.01.patch, HIVE-14957.02.patch
>
>
> {code}
> call.transformTo(parent.copy(parent.getTraitSet(), 
> ImmutableList.of(relBuilder.build())));
> {code}
> When the parent is a union operator with 2 inputs, parent.copy will 
> only copy the input that has the SortLimit and ignore the other branches.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12765) Support Intersect (distinct/all) Except (distinct/all) Minus (distinct/all)

2016-10-15 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15578570#comment-15578570
 ] 

Pengcheng Xiong commented on HIVE-12765:


[~ashutoshc], the test case failures are OK to me; I only need to update 1-2 
golden files. Could you please take a look? Thanks.

> Support Intersect (distinct/all) Except (distinct/all) Minus (distinct/all)
> ---
>
> Key: HIVE-12765
> URL: https://issues.apache.org/jira/browse/HIVE-12765
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12765.01.patch, HIVE-12765.02.patch, 
> HIVE-12765.03.patch, HIVE-12765.04.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11394) Enhance EXPLAIN display for vectorization

2016-10-15 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15578544#comment-15578544
 ] 

Matt McCline commented on HIVE-11394:
-

I misinterpreted your message as saying the entire problem was fixed.

I added EXPLAIN VECTORIZATION EXPRESSION to the Q file.

{code}
  Map Join Vectorization:
  className: VectorMapJoinOperator
  native: false
  nativeConditionsMet: 
hive.vectorized.execution.mapjoin.native.enabled IS true, hive.execution.engine 
tez IN [tez, spark] IS true, One MapJoin Condition IS true, No nullsafe IS 
true, Supports Key Types IS true, When Fast Hash Table, then requires no Hybrid 
Hash Join IS true, Small table vectorizes IS true
  nativeConditionsNotMet: Not empty key IS false
{code}

I think my change causes an expensive query to stop using the native Vector 
MapJoin.  I suspect the "Not empty key" condition is now too strict.  So the 
query isn't hanging, but it is now taking too long.
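The met/not-met bookkeeping in that EXPLAIN output can be mimicked generically: evaluate each named condition and split the names into the two lists. This is illustrative only, not the Hive vectorizer's actual code:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative: partition named vectorization conditions into
// nativeConditionsMet / nativeConditionsNotMet, as the EXPLAIN output does.
class NativeConditionsDemo {
    static Map<String, List<String>> partition(Map<String, Boolean> conditions) {
        Map<String, List<String>> out = new LinkedHashMap<>();
        out.put("nativeConditionsMet", new ArrayList<>());
        out.put("nativeConditionsNotMet", new ArrayList<>());
        conditions.forEach((name, ok) ->
            out.get(ok ? "nativeConditionsMet" : "nativeConditionsNotMet")
               .add(name + " IS " + ok));
        return out;
    }

    public static void main(String[] args) {
        Map<String, Boolean> c = new LinkedHashMap<>();
        c.put("hive.vectorized.execution.mapjoin.native.enabled", true);
        c.put("One MapJoin Condition", true);
        c.put("Not empty key", false); // the condition suspected of being too strict
        Map<String, List<String>> r = partition(c);
        System.out.println("nativeConditionsMet: " + r.get("nativeConditionsMet"));
        System.out.println("nativeConditionsNotMet: " + r.get("nativeConditionsNotMet"));
    }
}
```

A single entry landing in nativeConditionsNotMet is enough to fall back to the non-native VectorMapJoinOperator, which matches the output quoted above.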

> Enhance EXPLAIN display for vectorization
> -
>
> Key: HIVE-11394
> URL: https://issues.apache.org/jira/browse/HIVE-11394
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-11394.01.patch, HIVE-11394.02.patch, 
> HIVE-11394.03.patch, HIVE-11394.04.patch, HIVE-11394.05.patch, 
> HIVE-11394.06.patch, HIVE-11394.07.patch, HIVE-11394.08.patch, 
> HIVE-11394.09.patch, HIVE-11394.091.patch, HIVE-11394.092.patch, 
> HIVE-11394.093.patch
>
>
> Add detail to the EXPLAIN output showing why a Map and Reduce work is not 
> vectorized.
> New syntax is: EXPLAIN VECTORIZATION \[ONLY\] 
> \[SUMMARY|OPERATOR|EXPRESSION|DETAIL\]
> The ONLY option suppresses most non-vectorization elements.
> SUMMARY shows vectorization information for the PLAN (is vectorization 
> enabled) and a summary of Map and Reduce work.
> OPERATOR shows vectorization information for operators.  E.g. Filter 
> Vectorization.  It includes all information of SUMMARY, too.
> EXPRESSION shows vectorization information for expressions.  E.g. 
> predicateExpression.  It includes all information of SUMMARY and OPERATOR, 
> too.
> DETAIL shows very detailed vectorization information.
> It includes all information of SUMMARY, OPERATOR, and EXPRESSION too.
> The optional clause defaults are not ONLY and SUMMARY.
> ---
> Here are some examples:
> EXPLAIN VECTORIZATION example:
> (Note the PLAN VECTORIZATION, Map Vectorization, Reduce Vectorization 
> sections)
> Since SUMMARY is the default, it is the output of EXPLAIN VECTORIZATION 
> SUMMARY.
> Under Reducer 3’s "Reduce Vectorization:" you’ll see
> notVectorizedReason: Aggregation Function UDF avg parameter expression for 
> GROUPBY operator: Data type struct of 
> Column\[VALUE._col2\] not supported
> For Reducer 2’s "Reduce Vectorization:" you’ll see "groupByVectorOutput:": 
> "false" which says a node has a GROUP BY with an AVG or some other aggregator 
> that outputs a non-PRIMITIVE type (e.g. STRUCT) and all downstream operators 
> are row-mode.  I.e. not vector output.
> If "usesVectorUDFAdaptor:": "false" were true, it would mean that at 
> least one vectorized expression is using VectorUDFAdaptor.
> And "allNative:": "false" will instead be true when all operators are native.  
> Today, GROUP BY and FILE SINK are not native.  MAP JOIN and REDUCE SINK are 
> conditionally native.  FILTER and SELECT are native.
> {code}
> PLAN VECTORIZATION:
>   enabled: true
>   enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
> Tez
> ...
>   Edges:
> Reducer 2 <- Map 1 (SIMPLE_EDGE)
> Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
> ...
>   Vertices:
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: alltypesorc
>   Statistics: Num rows: 12288 Data size: 36696 Basic stats: 
> COMPLETE Column stats: COMPLETE
>   Select Operator
> expressions: cint (type: int)
> outputColumnNames: cint
> Statistics: Num rows: 12288 Data size: 36696 Basic stats: 
> COMPLETE Column stats: COMPLETE
> Group By Operator
>   keys: cint (type: int)
>   mode: hash
>   outputColumnNames: _col0
>   Statistics: Num rows: 5775 Data size: 17248 

[jira] [Commented] (HIVE-11957) SHOW TRANSACTIONS should show queryID/agent id of the creator

2016-10-15 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15578503#comment-15578503
 ] 

Eugene Koifman commented on HIVE-11957:
---

+1 patch 7

> SHOW TRANSACTIONS should show queryID/agent id of the creator
> -
>
> Key: HIVE-11957
> URL: https://issues.apache.org/jira/browse/HIVE-11957
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
> Attachments: HIVE-11957.1.patch, HIVE-11957.2.patch, 
> HIVE-11957.3.patch, HIVE-11957.4.patch, HIVE-11957.5.patch, 
> HIVE-11957.6.patch, HIVE-11957.7.patch
>
>
> this would be very useful for debugging
> should also include heartbeat/create timestamps
> would be nice to support some filtering/sorting options, like sort by create 
> time, agent id. filter by table, database, etc



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12765) Support Intersect (distinct/all) Except (distinct/all) Minus (distinct/all)

2016-10-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15577742#comment-15577742
 ] 

Hive QA commented on HIVE-12765:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12833487/HIVE-12765.04.patch

{color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 10540 tests 
executed
*Failed tests:*
{noformat}
TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_globallimit]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_limit]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[order_null]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[union_fast_stats]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_1]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_top_level]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_null_projection]
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz[0]
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[0]
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[1]
org.apache.hive.jdbc.authorization.TestJdbcWithSQLAuthorization.testBlackListedUdfUsage
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1584/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1584/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-1584/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12833487 - PreCommit-HIVE-Build

> Support Intersect (distinct/all) Except (distinct/all) Minus (distinct/all)
> ---
>
> Key: HIVE-12765
> URL: https://issues.apache.org/jira/browse/HIVE-12765
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12765.01.patch, HIVE-12765.02.patch, 
> HIVE-12765.03.patch, HIVE-12765.04.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14932) handle bucketing for MM tables

2016-10-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15577647#comment-15577647
 ] 

Hive QA commented on HIVE-14932:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12833483/HIVE-14932.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1583/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1583/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-1583/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2016-10-15 08:13:35.594
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-1583/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2016-10-15 08:13:35.596
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   8f886f2..c71ef4f  master -> origin/master
+ git reset --hard HEAD
HEAD is now at 8f886f2 HIVE-14942: HS2 UI: Canceled queries show up in "Open 
Queries" (Tao Li via Mohit Sabharwal)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 2 commits, and can be fast-forwarded.
  (use "git pull" to update your local branch)
+ git reset --hard origin/master
HEAD is now at c71ef4f HIVE-14966: JDBC: Make cookie-auth work in HTTP mode 
(Gopal V reviewed by Tao Li, Vaibhav Gumashta)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2016-10-15 08:13:37.098
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java:1525
error: ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java: patch does 
not apply
error: ql/src/test/queries/clientpositive/mm_all.q: No such file or directory
error: ql/src/test/queries/clientpositive/mm_current.q: No such file or 
directory
error: ql/src/test/results/clientpositive/llap/mm_all.q.out: No such file or 
directory
error: ql/src/test/results/clientpositive/llap/mm_current.q.out: No such file 
or directory
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12833483 - PreCommit-HIVE-Build

> handle bucketing for MM tables
> --
>
> Key: HIVE-14932
> URL: https://issues.apache.org/jira/browse/HIVE-14932
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: hive-14535
>
> Attachments: HIVE-14932.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11957) SHOW TRANSACTIONS should show queryID/agent id of the creator

2016-10-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15577642#comment-15577642
 ] 

Hive QA commented on HIVE-11957:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12833479/HIVE-11957.7.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10564 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_globallimit]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[order_null]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[union_fast_stats]
org.apache.hadoop.hive.metastore.TestMetaStoreMetrics.testConnections
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz[0]
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[0]
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[1]
org.apache.hive.jdbc.authorization.TestJdbcWithSQLAuthorization.testBlackListedUdfUsage
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1582/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1582/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-1582/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}


ATTACHMENT ID: 12833479 - PreCommit-HIVE-Build

> SHOW TRANSACTIONS should show queryID/agent id of the creator
> -
>
> Key: HIVE-11957
> URL: https://issues.apache.org/jira/browse/HIVE-11957
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
> Attachments: HIVE-11957.1.patch, HIVE-11957.2.patch, 
> HIVE-11957.3.patch, HIVE-11957.4.patch, HIVE-11957.5.patch, 
> HIVE-11957.6.patch, HIVE-11957.7.patch
>
>
> this would be very useful for debugging
> should also include heartbeat/create timestamps
> would be nice to support some filtering/sorting options, like sort by create 
> time, agent id. filter by table, database, etc





[jira] [Updated] (HIVE-14966) JDBC: Make cookie-auth work in HTTP mode

2016-10-15 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-14966:

Affects Version/s: (was: 2.2.0)
   (was: 1.3.0)
   1.2.1
   2.1.0

> JDBC: Make cookie-auth work in HTTP mode
> 
>
> Key: HIVE-14966
> URL: https://issues.apache.org/jira/browse/HIVE-14966
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 1.2.1, 2.1.0
>Reporter: Gopal V
>Assignee: Gopal V
> Fix For: 2.2.0
>
> Attachments: HIVE-14966.1.patch, HIVE-14966.2.patch
>
>
> HiveServer2 cookie-auth is non-functional and forces authentication to be 
> repeated for the status check loop, row fetch loop and the get logs loop.
> The repeated auth in the fetch-loop is a performance issue, but is also 
> causing occasional DoS responses from the remote auth-backend if this is not 
> using local /etc/passwd.
> The HTTP-Cookie auth once made functional will behave similarly to the binary 
> protocol, authenticating exactly once per JDBC session and not causing 
> further load on the authentication backend irrespective of how many rows are 
> returned from the JDBC request.
> This is due to the fact that the cookies are not sent out with matching flags 
> for SSL usage.
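The once-per-session behavior described above is ordinary HTTP cookie replay. As a stdlib-only sketch with java.net.CookieManager (the cookie name, host, and path here are made up for illustration; HiveServer2's actual auth cookie and endpoint differ):

```java
import java.net.CookieManager;
import java.net.HttpCookie;
import java.net.URI;
import java.util.List;
import java.util.Map;

class CookieReuseDemo {
    public static void main(String[] args) throws Exception {
        CookieManager manager = new CookieManager();
        // Hypothetical HiveServer2 HTTP endpoint, for illustration only.
        URI server = new URI("http://hs2.example.com:10001/cliservice");

        // The first authenticated response sets the session cookie once.
        manager.put(server,
            Map.of("Set-Cookie", List.of("hive.server2.auth=token123")));

        // Later requests routed through the same manager replay the stored
        // cookie, so the auth backend is not consulted again for the status
        // check, row fetch, and get-logs loops.
        List<HttpCookie> stored = manager.getCookieStore().getCookies();
        System.out.println(stored);
    }
}
```

The key point is that the cookie must be stored and replayed with flags (e.g. Secure) matching how the connection actually uses SSL, which is exactly the mismatch the description calls out.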





[jira] [Updated] (HIVE-14966) JDBC: Make cookie-auth work in HTTP mode

2016-10-15 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-14966:

   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Committed to master. Thanks [~gopalv] and [~taoli-hwx]

> JDBC: Make cookie-auth work in HTTP mode
> 
>
> Key: HIVE-14966
> URL: https://issues.apache.org/jira/browse/HIVE-14966
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 1.2.1, 2.1.0
>Reporter: Gopal V
>Assignee: Gopal V
> Fix For: 2.2.0
>
> Attachments: HIVE-14966.1.patch, HIVE-14966.2.patch
>
>
> HiveServer2 cookie-auth is non-functional and forces authentication to be 
> repeated for the status check loop, row fetch loop and the get logs loop.
> The repeated auth in the fetch-loop is a performance issue, but is also 
> causing occasional DoS responses from the remote auth-backend if this is not 
> using local /etc/passwd.
> The HTTP-Cookie auth once made functional will behave similarly to the binary 
> protocol, authenticating exactly once per JDBC session and not causing 
> further load on the authentication backend irrespective of how many rows are 
> returned from the JDBC request.
> This is due to the fact that the cookies are not sent out with matching flags 
> for SSL usage.





[jira] [Commented] (HIVE-14957) HiveSortLimitPullUpConstantsRule misses branches when parent operator is Union

2016-10-15 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15577605#comment-15577605
 ] 

Jesus Camacho Rodriguez commented on HIVE-14957:


Thanks [~pxiong], patch LGTM, that should fix the issue. I think we should add 
a test case that reproduces the issue to verify that it works properly.

Btw, this should probably be backported at least to 2.1.0.

> HiveSortLimitPullUpConstantsRule misses branches when parent operator is Union
> --
>
> Key: HIVE-14957
> URL: https://issues.apache.org/jira/browse/HIVE-14957
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14957.01.patch, HIVE-14957.02.patch
>
>
> {code}
> call.transformTo(parent.copy(parent.getTraitSet(), ImmutableList.of(relBuilder.build())));
> {code}
> When the parent is a union operator with 2 inputs, parent.copy will only copy 
> the branch that has the SortLimit and drop the other branches.
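To see why a singleton input list is the problem, here is a toy Java model of the copy pattern (Node and copy are illustrative stand-ins, not Calcite's RelNode API): passing only the rewritten branch silently drops the sibling, while rebuilding the full input list preserves it.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for an operator tree node; copy() replaces ALL inputs
// with the given list, mirroring RelNode.copy(traitSet, inputs).
class Node {
    final String name;
    final List<Node> inputs;
    Node(String name, List<Node> inputs) {
        this.name = name;
        this.inputs = inputs;
    }
    Node copy(List<Node> newInputs) {
        return new Node(name, newInputs);
    }
}

class UnionCopyDemo {
    public static void main(String[] args) {
        Node left = new Node("sortLimitBranch", List.of());
        Node right = new Node("otherBranch", List.of());
        Node union = new Node("union", List.of(left, right));

        // Buggy pattern: copy with only the rewritten branch.
        Node bad = union.copy(List.of(new Node("rewritten", List.of())));
        System.out.println("bad copy inputs = " + bad.inputs.size());   // 1

        // Fixed pattern: rebuild the full input list, substituting in place.
        List<Node> all = new ArrayList<>(union.inputs);
        all.set(0, new Node("rewritten", List.of()));
        Node good = union.copy(all);
        System.out.println("good copy inputs = " + good.inputs.size()); // 2
    }
}
```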





[jira] [Commented] (HIVE-14864) Distcp is not called from MoveTask when src is a directory

2016-10-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15577556#comment-15577556
 ] 

Hive QA commented on HIVE-14864:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12833472/HIVE-14864.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10564 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_globallimit]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[order_null]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[union_fast_stats]
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver[hbase_bulk]
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz[0]
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[0]
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[1]
org.apache.hive.jdbc.authorization.TestJdbcWithSQLAuthorization.testBlackListedUdfUsage
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1581/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1581/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-1581/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}


ATTACHMENT ID: 12833472 - PreCommit-HIVE-Build

> Distcp is not called from MoveTask when src is a directory
> --
>
> Key: HIVE-14864
> URL: https://issues.apache.org/jira/browse/HIVE-14864
> Project: Hive
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Assignee: Sahil Takiar
> Attachments: HIVE-14864.1.patch, HIVE-14864.2.patch, HIVE-14864.patch
>
>
> In FileUtils.java the following code does not get executed even when src 
> directory size is greater than HIVE_EXEC_COPYFILE_MAXSIZE because 
> srcFS.getFileStatus(src).getLen() returns 0 when src is a directory. We 
> should use srcFS.getContentSummary(src).getLength() instead.
> {noformat}
> /* Run distcp if source file/dir is too big */
> if (srcFS.getUri().getScheme().equals("hdfs") &&
>     srcFS.getFileStatus(src).getLen() >
>         conf.getLongVar(HiveConf.ConfVars.HIVE_EXEC_COPYFILE_MAXSIZE)) {
>   LOG.info("Source is " + srcFS.getFileStatus(src).getLen() + " bytes. (MAX: "
>       + conf.getLongVar(HiveConf.ConfVars.HIVE_EXEC_COPYFILE_MAXSIZE) + ")");
>   LOG.info("Launch distributed copy (distcp) job.");
>   HiveConfUtil.updateJobCredentialProviders(conf);
>   copied = shims.runDistCp(src, dst, conf);
>   if (copied && deleteSource) {
>     srcFS.delete(src, true);
>   }
> }
> {noformat}
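The description proposes replacing srcFS.getFileStatus(src).getLen() with srcFS.getContentSummary(src).getLength(). As a stdlib-only Java sketch of why that matters (no Hadoop dependency; DirSize and contentLength are illustrative names): a directory's own length is filesystem metadata rather than content size, so the MAXSIZE check must compare against a recursive sum of file lengths, which is what getContentSummary reports.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

class DirSize {
    // Recursively sum regular-file lengths, mirroring what Hadoop's
    // FileSystem.getContentSummary(src).getLength() reports for a directory.
    static long contentLength(File f) {
        if (f.isFile()) {
            return f.length();
        }
        long total = 0;
        File[] children = f.listFiles();
        if (children != null) {
            for (File child : children) {
                total += contentLength(child);
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("distcp-demo").toFile();
        Files.write(new File(dir, "part-00000").toPath(), new byte[1024]);

        // The recursive sum is the size the MAXSIZE check should compare.
        System.out.println("content length = " + contentLength(dir)); // 1024
    }
}
```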





[jira] [Commented] (HIVE-5867) JDBC driver and beeline should support executing an initial SQL script

2016-10-15 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15577538#comment-15577538
 ] 

Lefty Leverenz commented on HIVE-5867:
--

Thanks for the documentation, [~JonnyR], looks good.  I added version 
information and links to this JIRA issue.

* [HiveServer2 Clients -- JDBC -- Connection URL Format | https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-ConnectionURLFormat]
* [HiveServer2 Clients -- JDBC -- Connection URL for Remote or Embedded Mode | https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-ConnectionURLforRemoteorEmbeddedMode]

Removed the TODOC2.2 label.

> JDBC driver and beeline should support executing an initial SQL script
> --
>
> Key: HIVE-5867
> URL: https://issues.apache.org/jira/browse/HIVE-5867
> Project: Hive
>  Issue Type: Improvement
>  Components: Clients, JDBC
>Reporter: Prasad Mujumdar
>Assignee: Jianguo Tian
> Fix For: 2.2.0
>
> Attachments: HIVE-5867.1.patch, HIVE-5867.2.patch, HIVE-5867.3 .patch
>
>
> HiveCLI supports the .hiverc script that is executed at the start of the 
> session. This is helpful for things like registering UDFs, session specific 
> configs etc.
> This functionality is missing for beeline and JDBC clients. It would be 
> useful for the JDBC driver to support an init script with SQL statements that's 
> automatically executed after connection. The script path can be specified via 
> JDBC connection URL. For example 
> {noformat}
> jdbc:hive2://localhost:1/default;initScript=/home/user1/scripts/init.sql
> {noformat}
> This can be added to Beeline's command line option like "-i 
> /home/user1/scripts/init.sql"
> To help transition from HiveCLI to Beeline, we can keep the default init 
> script as $HOME/.hiverc
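Assuming the proposed initScript session variable is carried in the URL exactly as in the example above, a minimal sketch of pulling it out (sessionVar is a hypothetical helper, not Hive's actual URL parser, and the host/port are illustrative):

```java
class InitScriptParam {
    // Split the JDBC URL on ';' and look up one session variable by key,
    // e.g. initScript in jdbc:hive2://host:port/db;initScript=/path/init.sql
    static String sessionVar(String url, String key) {
        for (String part : url.split(";")) {
            int eq = part.indexOf('=');
            if (eq > 0 && part.substring(0, eq).equals(key)) {
                return part.substring(eq + 1);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String url = "jdbc:hive2://localhost:10000/default"
                + ";initScript=/home/user1/scripts/init.sql";
        System.out.println(sessionVar(url, "initScript"));
        // prints /home/user1/scripts/init.sql
    }
}
```

The driver would then read that file after the connection is established and run each statement, the same role .hiverc plays for HiveCLI.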





[jira] [Updated] (HIVE-5867) JDBC driver and beeline should support executing an initial SQL script

2016-10-15 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-5867:
-
Labels:   (was: TODOC2.2)

> JDBC driver and beeline should support executing an initial SQL script
> --
>
> Key: HIVE-5867
> URL: https://issues.apache.org/jira/browse/HIVE-5867
> Project: Hive
>  Issue Type: Improvement
>  Components: Clients, JDBC
>Reporter: Prasad Mujumdar
>Assignee: Jianguo Tian
> Fix For: 2.2.0
>
> Attachments: HIVE-5867.1.patch, HIVE-5867.2.patch, HIVE-5867.3 .patch
>
>
> HiveCLI supports the .hiverc script that is executed at the start of the 
> session. This is helpful for things like registering UDFs, session specific 
> configs etc.
> This functionality is missing for beeline and JDBC clients. It would be 
> useful for the JDBC driver to support an init script with SQL statements that's 
> automatically executed after connection. The script path can be specified via 
> JDBC connection URL. For example 
> {noformat}
> jdbc:hive2://localhost:1/default;initScript=/home/user1/scripts/init.sql
> {noformat}
> This can be added to Beeline's command line option like "-i 
> /home/user1/scripts/init.sql"
> To help transition from HiveCLI to Beeline, we can keep the default init 
> script as $HOME/.hiverc





[jira] [Commented] (HIVE-14966) JDBC: Make cookie-auth work in HTTP mode

2016-10-15 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15577501#comment-15577501
 ] 

Gopal V commented on HIVE-14966:


Failed tests have been failing for a while and are unrelated to this patch.

Filed bugs for the flaky tests:

HIVE-14973
HIVE-14974
HIVE-14975
HIVE-14976
HIVE-14977
HIVE-14978

> JDBC: Make cookie-auth work in HTTP mode
> 
>
> Key: HIVE-14966
> URL: https://issues.apache.org/jira/browse/HIVE-14966
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 1.3.0, 2.2.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-14966.1.patch, HIVE-14966.2.patch
>
>
> HiveServer2 cookie-auth is non-functional and forces authentication to be 
> repeated for the status check loop, row fetch loop and the get logs loop.
> The repeated auth in the fetch-loop is a performance issue, but is also 
> causing occasional DoS responses from the remote auth-backend if this is not 
> using local /etc/passwd.
> The HTTP-Cookie auth once made functional will behave similarly to the binary 
> protocol, authenticating exactly once per JDBC session and not causing 
> further load on the authentication backend irrespective of how many rows are 
> returned from the JDBC request.
> This is due to the fact that the cookies are not sent out with matching flags 
> for SSL usage.





[jira] [Updated] (HIVE-14975) Flaky Test: TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz

2016-10-15 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-14975:
---
Description: 
{code}
2016-10-14T22:51:32,947  INFO [main] beeline.TestBeelineArgParsing: Add /home/hiveptest/104.155.175.228-hiveptest-0/maven/postgresql/postgresql/9.1-901.jdbc4/postgresql-9.1-901.jdbc4.jar for the driver class org.postgresql.Driver
Fail to add local jar due to the exception:java.util.zip.ZipException: error in opening zip file
error in opening zip file
{code}

  was:
{code}
2016-10-14T22:51:33,072  INFO [main] beeline.TestBeelineArgParsing: Add /home/hiveptest/104.155.175.228-hiveptest-0/apache-github-source-source/beeline/target/test-classes/DummyDriver.jar for the driver class DummyDriver
Fail to add local jar due to the exception:java.util.zip.ZipException: error in opening zip file
error in opening zip file
Fail to scan drivers due to the exception:java.util.zip.ZipException: error in opening zip file
error in opening zip file
{code}


> Flaky Test: TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz
> --
>
> Key: HIVE-14975
> URL: https://issues.apache.org/jira/browse/HIVE-14975
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 2.2.0
>Reporter: Gopal V
>
> {code}
> 2016-10-14T22:51:32,947  INFO [main] beeline.TestBeelineArgParsing: Add /home/hiveptest/104.155.175.228-hiveptest-0/maven/postgresql/postgresql/9.1-901.jdbc4/postgresql-9.1-901.jdbc4.jar for the driver class org.postgresql.Driver
> Fail to add local jar due to the exception:java.util.zip.ZipException: error in opening zip file
> error in opening zip file
> {code}





[jira] [Commented] (HIVE-14966) JDBC: Make cookie-auth work in HTTP mode

2016-10-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15577483#comment-15577483
 ] 

Hive QA commented on HIVE-14966:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12833463/HIVE-14966.2.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10564 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_globallimit]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[order_null]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[union_fast_stats]
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz[0]
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[0]
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[1]
org.apache.hive.jdbc.authorization.TestJdbcWithSQLAuthorization.testBlackListedUdfUsage
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1580/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1580/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-1580/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}


ATTACHMENT ID: 12833463 - PreCommit-HIVE-Build

> JDBC: Make cookie-auth work in HTTP mode
> 
>
> Key: HIVE-14966
> URL: https://issues.apache.org/jira/browse/HIVE-14966
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 1.3.0, 2.2.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-14966.1.patch, HIVE-14966.2.patch
>
>
> HiveServer2 cookie-auth is non-functional and forces authentication to be 
> repeated for the status check loop, row fetch loop and the get logs loop.
> The repeated auth in the fetch-loop is a performance issue, but is also 
> causing occasional DoS responses from the remote auth-backend if this is not 
> using local /etc/passwd.
> The HTTP-Cookie auth once made functional will behave similarly to the binary 
> protocol, authenticating exactly once per JDBC session and not causing 
> further load on the authentication backend irrespective of how many rows are 
> returned from the JDBC request.
> This is due to the fact that the cookies are not sent out with matching flags 
> for SSL usage.


